Repository: xcLee001/SonicVale
Branch: master
Commit: 85c0838f1478
Files: 114
Total size: 573.3 KB
Directory structure:
gitextract_bvsr8afo/
├── LICENSE
├── README.md
├── SonicVale/
│ ├── .gitignore
│ ├── README.md
│ ├── app/
│ │ ├── __init__.py
│ │ ├── core/
│ │ │ ├── __init__.py
│ │ │ ├── audio_engin.py
│ │ │ ├── config.py
│ │ │ ├── enums.py
│ │ │ ├── llm_engine.py
│ │ │ ├── prompts.py
│ │ │ ├── response.py
│ │ │ ├── subtitle/
│ │ │ │ ├── ASRData.py
│ │ │ │ ├── BaseASR.py
│ │ │ │ ├── BcutASR.py
│ │ │ │ ├── JianYingASR.py
│ │ │ │ ├── KuaiShouASR.py
│ │ │ │ ├── WhisperASR.py
│ │ │ │ ├── __init__.py
│ │ │ │ └── subtitle_engine.py
│ │ │ ├── text_correct_engine.py
│ │ │ ├── tts_engine.py
│ │ │ ├── tts_runtime.py
│ │ │ └── ws_manager.py
│ │ ├── db/
│ │ │ └── database.py
│ │ ├── dto/
│ │ │ ├── chapter_dto.py
│ │ │ ├── emotion_dto.py
│ │ │ ├── line_dto.py
│ │ │ ├── llm_provider_dto.py
│ │ │ ├── multi_emotion_voice_dto.py
│ │ │ ├── project_dto.py
│ │ │ ├── prompt_dto.py
│ │ │ ├── role_dto.py
│ │ │ ├── strength_dto.py
│ │ │ ├── tts_provider_dto.py
│ │ │ └── voice_dto.py
│ │ ├── entity/
│ │ │ ├── chapter_entity.py
│ │ │ ├── emotion_entity.py
│ │ │ ├── line_entity.py
│ │ │ ├── llm_provider_entity.py
│ │ │ ├── multi_emotion_voice_entity.py
│ │ │ ├── project_entity.py
│ │ │ ├── prompt_entity.py
│ │ │ ├── role_entity.py
│ │ │ ├── strength_entity.py
│ │ │ ├── tts_provider_entity.py
│ │ │ └── voice_entity.py
│ │ ├── main.py
│ │ ├── models/
│ │ │ └── po.py
│ │ ├── repositories/
│ │ │ ├── chapter_repository.py
│ │ │ ├── emotion_repository.py
│ │ │ ├── line_repository.py
│ │ │ ├── llm_provider_repository.py
│ │ │ ├── multi_emotion_voice_repository.py
│ │ │ ├── project_repository.py
│ │ │ ├── prompt_repository.py
│ │ │ ├── role_repository.py
│ │ │ ├── strength_repository.py
│ │ │ ├── tts_provider_repository.py
│ │ │ └── voice_repository.py
│ │ ├── routers/
│ │ │ ├── chapter_router.py
│ │ │ ├── emotion_router.py
│ │ │ ├── line_router.py
│ │ │ ├── llm_provider_router.py
│ │ │ ├── multi_emotion_voice_router.py
│ │ │ ├── project_router.py
│ │ │ ├── prompt_router.py
│ │ │ ├── role_router.py
│ │ │ ├── strength_router.py
│ │ │ ├── tts_provider_router.py
│ │ │ └── voice_router.py
│ │ └── services/
│ │ ├── chapter_service.py
│ │ ├── emotion_service.py
│ │ ├── line_service.py
│ │ ├── llm_provider_service.py
│ │ ├── multi_emotion_voice_service.py
│ │ ├── project_service.py
│ │ ├── prompt_service.py
│ │ ├── role_service.py
│ │ ├── strength_service.py
│ │ ├── tts_provider_service.py
│ │ └── voice_service.py
│ └── requirements.txt
└── sonicvale-front/
├── .gitignore
├── .vscode/
│ └── extensions.json
├── README.md
├── electron/
│ ├── logger.js
│ ├── main.js
│ └── preload.js
├── index.html
├── package.json
├── resource/
│ └── license.txt
├── src/
│ ├── App.vue
│ ├── api/
│ │ ├── chapter.js
│ │ ├── config.js
│ │ ├── enums.js
│ │ ├── line.js
│ │ ├── multiEmotionVoice.js
│ │ ├── project.js
│ │ ├── prompt.js
│ │ ├── provider.js
│ │ ├── role.js
│ │ └── voice.js
│ ├── components/
│ │ └── WaveCellPro.vue
│ ├── main.js
│ ├── pages/
│ │ ├── ConfigCenter.vue
│ │ ├── ProjectDubbingDetail.vue
│ │ ├── ProjectList.vue
│ │ ├── PromptManager.vue
│ │ └── VoiceManager.vue
│ ├── router/
│ │ └── index.js
│ ├── style.css
│ └── utils/
│ └── utf8-or-gbk.js
└── vite.config.js
================================================
FILE CONTENTS
================================================
================================================
FILE: LICENSE
================================================
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright (C) 2007 Free Software Foundation, Inc.
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.
A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.
The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.
An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
Copyright (C)
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published
by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see .
Also add information on how to contact you by electronic and paper mail.
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
.
================================================
FILE: README.md
================================================
# 音谷 - AI 多角色多情绪配音平台
> 一个开源的多角色、多情绪 AI 配音生成平台,支持小说、剧本、视频等内容的自动配音与导出。
---
## 📝 详细使用文档
[音谷 - AI 多角色多情绪配音平台使用教程](https://sw4s2hg7k5y.feishu.cn/wiki/WjbUw1t7JiWIa7k2pFXcxqSbnde?from=from_copylink)
## 📖 软件简介
- **软件名称**:音谷 - AI 多角色多情绪配音平台
- **定位**:为小说、剧本、视频等内容提供多角色、多情绪的 AI 语音合成与配音服务
- **主要功能**:
- 小说 / 剧本文本导入
- 多角色角色库管理
- 情绪音色选择与绑定
- 台词自动拆分与配音生成
- 批量任务管理与导出
- 支持自定义 LLM 接口选择与调用
- 基于Index-TTS-2.0的多情绪TTS服务
- 支持精准的音频编辑功能,可以自定义删除音频片段或者添加静音片段。
- 支持自定义提示词,适配个性化拆分需求
## 🛠 技术栈
- **前端**:Electron + Vue + Element Plus
- **后端**:FastAPI / Python
- **AI 接口**:兼容 OpenAI API 协议的大模型
- **TTS 服务**:IndexTTs-2 + Cloud Native Build 平台(免费 H20 显卡支持)/ 本地部署整合包
## 二次开发说明
本软件依据 **AGPL-3.0** 开源许可协议发布。基于本项目进行二次开发时,开发者须遵守以下规范:
### 1. 署名要求
必须在衍生软件的用户界面及代码文档中清晰标注:
> "本软件基于开源项目《音谷》二次开发"
并附上原项目仓库链接。
### 2. 商业使用限制
未获得书面商业授权前,任何基于本项目的衍生作品不得用于商业用途或提供商业服务。
## 🚀 快速开始
### 1️⃣ 克隆项目
```bash
git clone https://github.com/xcLee001/SonicVale.git
cd SonicVale
```
### 2️⃣ 启动后端
首先,需要下载ffmpeg.exe到app/core/ffmpeg/ffmpeg.exe
可以去官网[ffmpeg](https://www.gyan.dev/ffmpeg/builds/packages/ffmpeg-8.0-full_build.7z)
。也可以使用[此镜像](https://www.alipan.com/s/ey5QRqW3Jji)
然后复制到app/core/ffmpeg/目录下
安装依赖和启动服务
```bash
cd SonicVale
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8200
```
```
app/
├── core/ # 全局配置、tts引擎、llm引擎、ffmpeg封装、字幕生成、websocket、异步队列
├── db/ # 数据库连接和Base
├── models/ # ORM模型
├── dto/ # 数据传输对象(请求/响应验证)
├── entity/ # 实体类(结合 ORM 与业务层)
├── repositories/ # 数据库封装
├── services/ # 核心业务逻辑
├── routers/ # FastAPI路由接口
└── main.py # FastAPI启动入口
```
### 3️⃣ 启动前端
```bash
cd sonicvale-front
npm install # 安装依赖
npm run start # 启动前端包括electron
```
## Coffe
如果您觉得我的项目对您有所帮助,欢迎您的赞助。您的支持将使我有更多的动力继续维护和改进这个项目。
您可以通过扫描下面的二维码来请我喝杯咖啡:
## 🎥 效果演示
👉 [点击查看 B 站演示效果视频](https://www.bilibili.com/video/BV1tSpTz6EBH/)
## 📷 截图
LLM 配置界面

TTS 配置界面

音色管理界面

项目创建界面

章节创建界面

章节内容导入

台词自动拆分

角色绑定,多章节共享角色音色

台词编辑,高度自定义

- 在台词编辑区,用户可手动修改:
- 台词文本
- 角色归属
- 情绪类型
- 情绪轻度
- 修改后自动保存并更新。
配音生成

生成后音频可编辑

## 📬 联系方式
如果在使用过程中遇到 **Bug** 或者有 **功能建议**,请通过 [GitHub Issues](https://github.com/xcLee001/SonicVale/issues) 提交,这样可以帮助我们更好地跟踪与解决问题。
如果你希望加入用户交流社区,欢迎加入我们的 QQ 群:
- 💬 QQ交流群:1060711739(1群已满)、575715633(2群) (验证信息请填写 “音谷配音”)
## 📜 协议
本项目采用 [GNU Affero General Public License v3.0 (AGPL-3.0)](./LICENSE) 开源协议。
您可以自由地使用、复制、修改、合并、发布和分发本软件及其副本,但必须遵守以下条款:
- 您必须在分发的软件中包含原始许可声明和版权声明。
- 若您修改并发布本软件,或通过网络提供服务(如 SaaS、Web 应用),您必须同时公开修改后的源代码。
- 您不得附加任何与 AGPL-3.0 条款冲突的限制。
## ⚠️ 免责声明
本项目仅供学习与研究使用。
用户不得利用本项目从事任何违法违规行为,包括但不限于:
- 克隆或模仿未经授权的声音;
- 侵犯他人声音权、肖像权、著作权、名誉权;
- 其他可能违反法律法规的行为。
开发者不对用户使用本项目所产生的任何后果负责,所有风险与责任由用户自行承担。
使用本项目即表示您已阅读并同意本免责声明。
---
## ⚠️ Disclaimer
This project is intended for research and educational purposes only.
Users are strictly prohibited from using this project for any unlawful activities, including but not limited to:
- Cloning or imitating voices without authorization;
- Infringing upon the rights of others (voice rights, portrait rights, copyrights, reputation rights, etc.);
- Any other activities in violation of applicable laws and regulations.
The developer shall not be held liable for any consequences arising from the use of this project.
All risks and responsibilities lie solely with the user.
By using this project, you acknowledge that you have read and agreed to this disclaimer.
================================================
FILE: SonicVale/.gitignore
================================================
# python cache
__pycache__/
*.pyc
*.pyo
*.pyd
# JetBrains IDE
.idea/
# venv
.venv/
env/
venv/
# 打包输出
dist
build
*.spec
*.exe
# logs
*.log
# others
.DS_Store
================================================
FILE: SonicVale/README.md
================================================
```
app/
├── core/ # 全局配置、tts引擎、llm引擎、ffmpeg封装、字幕生成、websocket、异步队列
├── db/ # 数据库连接和Base
├── models/ # ORM模型
├── dto/ # 数据传输对象(请求/响应验证)
├── entity/ # 实体类(结合 ORM 与业务层)
├── repositories/ # 数据库封装
├── services/ # 核心业务逻辑
├── routers/ # FastAPI路由接口
└── main.py # FastAPI启动入口
```
================================================
FILE: SonicVale/app/__init__.py
================================================
================================================
FILE: SonicVale/app/core/__init__.py
================================================
================================================
FILE: SonicVale/app/core/audio_engin.py
================================================
import os
import subprocess
import tempfile
import soundfile as sf
import numpy as np
from app.core.config import getFfmpegPath
class AudioProcessor:
def __init__(self, audio_path: str, keep_format=True, default_sr=44100, default_ch=2):
self.audio_path = audio_path
self.keep_format = keep_format
self.default_sr = default_sr
self.default_ch = default_ch
info = sf.info(audio_path)
self.sr = info.samplerate if keep_format else default_sr
self.ch = info.channels if keep_format else default_ch
self.duration = info.duration
self.ffmpeg_path = getFfmpegPath()
self.temp_path = self._create_tmp_file()
def _create_tmp_file(self):
os.makedirs(os.path.dirname(self.audio_path) or ".", exist_ok=True)
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav",
dir=os.path.dirname(self.audio_path) or ".")
return tmp.name
def _run_ffmpeg(self, cmd):
subprocess.run(
cmd, check=True,
creationflags=subprocess.CREATE_NO_WINDOW if os.name == "nt" else 0
)
def _normalize(self, path):
"""防止音量削波"""
data, sr = sf.read(path, dtype="float32", always_2d=True)
peak = float(np.max(np.abs(data)))
if peak > 1.0:
data = data / peak
sf.write(path, data, sr, format="WAV", subtype="PCM_16")
# ---------------------- 模块功能 ---------------------- #
def cut(self, start_ms: int, end_ms: int):
"""删除音频区间 [start_ms, end_ms]"""
start_sec = start_ms / 1000
end_sec = end_ms / 1000
cmd = [
self.ffmpeg_path, "-y", "-i", self.audio_path,
"-filter_complex",
f"[0:a]atrim=0:{start_sec},asetpts=PTS-STARTPTS[first];"
f"[0:a]atrim={end_sec},asetpts=PTS-STARTPTS[second];"
f"[first][second]concat=n=2:v=0:a=1[out]",
"-map", "[out]",
"-ar", str(self.sr),
"-ac", str(self.ch),
"-c:a", "pcm_s16le",
self.temp_path
]
self._run_ffmpeg(cmd)
os.replace(self.temp_path, self.audio_path)
def insert_silence(self, insert_ms: int, duration_sec: float):
"""在指定时间点插入静音"""
insert_sec = insert_ms / 1000
cmd = [
self.ffmpeg_path, "-y",
"-i", self.audio_path,
"-f", "lavfi", "-t", str(duration_sec),
"-i", f"anullsrc=channel_layout={'stereo' if self.ch == 2 else 'mono'}:sample_rate={self.sr}",
"-filter_complex",
f"[0:a]atrim=0:{insert_sec},asetpts=PTS-STARTPTS[first];"
f"[0:a]atrim={insert_sec},asetpts=PTS-STARTPTS[second];"
f"[first][1:a][second]concat=n=3:v=0:a=1[out]",
"-map", "[out]",
"-ar", str(self.sr),
"-ac", str(self.ch),
"-c:a", "pcm_s16le",
self.temp_path
]
self._run_ffmpeg(cmd)
os.replace(self.temp_path, self.audio_path)
def append_silence(self, duration_sec: float):
"""
在音频末尾添加或裁剪静音段:
- duration_sec > 0: 在末尾添加指定秒数静音
- duration_sec < 0: 从末尾裁剪指定秒数的内容
"""
if duration_sec == 0:
return # 无需处理
# ---------- 情况1:添加静音 ----------
if duration_sec > 0:
cmd = [
self.ffmpeg_path, "-y",
"-i", self.audio_path,
"-f", "lavfi", "-t", str(duration_sec),
"-i", f"anullsrc=channel_layout={'stereo' if self.ch == 2 else 'mono'}:sample_rate={self.sr}",
"-filter_complex",
"[0:a][1:a]concat=n=2:v=0:a=1[out]",
"-map", "[out]",
"-ar", str(self.sr),
"-ac", str(self.ch),
"-c:a", "pcm_s16le",
self.temp_path
]
# ---------- 情况2:裁剪末尾 ----------
else:
cut_dur = self.duration + duration_sec # 因为 duration_sec 为负
if cut_dur < 0:
cut_dur = 0 # 防止全裁掉出错
cmd = [
self.ffmpeg_path, "-y",
"-i", self.audio_path,
"-filter_complex",
f"[0:a]atrim=0:{cut_dur},asetpts=PTS-STARTPTS[out]",
"-map", "[out]",
"-ar", str(self.sr),
"-ac", str(self.ch),
"-c:a", "pcm_s16le",
self.temp_path
]
# 执行 ffmpeg 命令
self._run_ffmpeg(cmd)
os.replace(self.temp_path, self.audio_path)
# 更新音频时长(防止后续操作出错)
info = sf.info(self.audio_path)
self.duration = info.duration
def change_speed(self, speed: float):
"""变速处理 (0.5~2.0倍)"""
speed = float(np.clip(speed, 0.5, 2.0))
cmd = [
self.ffmpeg_path, "-y", "-i", self.audio_path,
"-af", f"atempo={speed}",
"-ar", str(self.sr),
"-ac", str(self.ch),
"-c:a", "pcm_s16le",
self.temp_path
]
self._run_ffmpeg(cmd)
os.replace(self.temp_path, self.audio_path)
def change_volume(self, volume: float):
"""音量调整"""
volume = max(0.0, float(volume))
cmd = [
self.ffmpeg_path, "-y", "-i", self.audio_path,
"-af", f"volume={volume}",
"-ar", str(self.sr),
"-ac", str(self.ch),
"-c:a", "pcm_s16le",
self.temp_path
]
self._run_ffmpeg(cmd)
os.replace(self.temp_path, self.audio_path)
def export(self, out_path: str):
"""导出音频到目标路径(带软限幅)"""
self._normalize(self.audio_path)
os.replace(self.audio_path, out_path)
return out_path
================================================
FILE: SonicVale/app/core/config.py
================================================
import os
import os, sys
from pathlib import Path
# 得到默认配置文件
def getConfigPath():
# 用户 目录下SonicVale目录
user_dir = os.path.join(os.path.expanduser("~"), "SonicVale")
# 如果目录不存在,创建它
if not os.path.exists(user_dir):
os.makedirs(user_dir, exist_ok=True)
# 返回 config.json 路径(目录已保证存在)
return user_dir
def getFfmpegPath():
BASE_DIR = getattr(sys, "_MEIPASS", Path(os.path.abspath(".")))
FFMPEG_PATH = os.path.join(BASE_DIR, "core", "ffmpeg", "ffmpeg.exe")
return FFMPEG_PATH
================================================
FILE: SonicVale/app/core/enums.py
================================================
from enum import Enum
class TaskEnum(str, Enum):
DUBBING = "台词拆分"
================================================
FILE: SonicVale/app/core/llm_engine.py
================================================
# app/core/llm_engine.py
import json
import logging
# app/core/llm_engine.py
import re
import time
import random
from openai import OpenAI
from numba.cuda import stream
from app.core.prompts import get_auto_fix_json_prompt
class LLMEngine:
def __init__(self, api_key: str, base_url: str, model_name: str, custom_params: str):
"""
api_key: LLM API Key
base_url: OpenAI-compatible API URL(例如企业版/自建 LLM)
model_name: 模型名称
custom_params: 自定义参数(JSON字符串)
"""
self.api_key = api_key
self.base_url = base_url.rstrip("/") # 去掉末尾斜杠
self.model_name = model_name
# custom_params从string转为dict
custom_params = json.loads(custom_params)
if not isinstance(custom_params, dict):
raise ValueError("无效的 custom_params")
self.custom_params = custom_params
# 使用新版 OpenAI 客户端
self.client = OpenAI(
api_key=api_key,
base_url=self.base_url
)
def _extract_result_tag(self, text: str) -> str:
"""提取 标签内容"""
match = re.search(r"(.*?)", text, re.DOTALL)
if not match:
raise ValueError("Response does not contain ... tag")
return match.group(1).strip()
def generate_text_test(self, prompt: str) -> str:
"""
测试:生成结果并返回(非流式)
"""
response = self.client.chat.completions.create(
model=self.model_name,
messages=[{"role": "user", "content": prompt}],
timeout=3000,
**self.custom_params
)
return response.choices[0].message.content
def generate_text(self, prompt: str, retries: int = 3, delay: float = 1.0) -> str:
"""
流式生成:边生成边输出
"""
for attempt in range(retries):
try:
# 开启流式
# stream = self.client.chat.completions.create(
# model=self.model_name,
# messages=[{"role": "user", "content": prompt}],
# stream=True,
# timeout=3000,
# **self.custom_params
# )
# 关闭流式,直接获取完整响应
response = self.client.chat.completions.create(
model=self.model_name,
messages=[{"role": "user", "content": prompt}],
stream=False, # 关键:设置为 False
timeout=3000,
**self.custom_params
)
# 直接获取完整文本
full_text = response.choices[0].message.content
return full_text
except Exception as e:
if attempt < retries - 1:
sleep_time = delay * (2 ** attempt) + random.random()
time.sleep(sleep_time)
else:
raise e
def save_load_json(self, json_str: str):
"""解析JSON,支持自动提取标签内容"""
# 先尝试提取 标签内容
try:
json_str = self._extract_result_tag(json_str)
except ValueError:
# 没有 标签,直接使用原文本
pass
# 尝试加载json
try:
return json.loads(json_str)
except json.JSONDecodeError:
# JSON解析失败,尝试让LLM修复
prompt = get_auto_fix_json_prompt(json_str)
res = self.generate_text(prompt)
# 递归调用,修复后的结果也可能包含 标签
return self.save_load_json(res)
def generate_smart_text(self, prompt: str) -> str:
"""
智能文本生成(流式)
"""
stream = self.client.chat.completions.create(
model=self.model_name,
messages=[{"role": "user", "content": prompt}],
stream=True,
timeout=3000
)
# 拼接 delta.content
full_text = ""
for chunk in stream:
if chunk.choices and len(chunk.choices) > 0:
delta = chunk.choices[0].delta
content = delta.content if hasattr(delta, 'content') else None
if content:
# print(content, end="", flush=True)
full_text += content
logging.debug("流式生成完成")
return full_text
================================================
FILE: SonicVale/app/core/prompts.py
================================================
# 根据小说内容生成
import textwrap
def get_context2lines_prompt(possible_characters, novel_content,possible_emotions,possible_strengths) -> str:
prompt = f"""
你的任务是将给定小说内容划分为角色台词和旁白,并输出包含标签的结构化JSON结果。
划分规则:
台词识别:
识别所有角色说话的内容,包括带引号、破折号、叹号等常见台词标记的文本。
如果角色在给定角色列表中,使用该角色名;
如果角色未在列表中出现,根据上下文合理归纳角色名。
重要规则:相邻台词之间如果角色相同,可以适当合并,但是一段内容最多不超过150字。如果单段内容超过150字,请将内容拆分为多条。
旁白识别:
对叙述性、心理描写、环境描写、动作描写等非台词内容统一标记为“旁白:”。
重要规则:相邻台词之间如果都为旁白内容,可以适当合并,但是一段内容最多不超过150字。如果单段内容超过150字,请将内容拆分为多条。
情绪以及情绪强弱识别:
根据上下文场景,识别出每条台词所对应的情绪以及情绪强度。情绪和情绪强度的内容必须来自情绪列表possible_emotions和情绪强度列表possible_strengths。
旁白的情绪和情绪强度统一为一样的,统一为‘平静’情绪,强度为‘中等’。
特殊情况处理:
多角色对话连续出现时,每条台词对应正确角色。
混合旁白和台词的段落可拆分为旁白和台词两条记录。
避免重复、遗漏台词或旁白。
输出格式:
输出严格遵循包含标签的JSON数组形式
示例:
[
{"role_name": "张三", "text_content": "你到底在干什么!", "emotion_name": "生气", "strength_name": "强烈"},
{"role_name": "旁白", "text_content": "此时,张三愤怒站着", "emotion_name": "平静", "strength_name": "中等"},
{"role_name": "李四", "text_content": "这可不管我的事儿", "emotion_name": "害怕", "strength_name": "微弱"}
]
注意事项:
保持文本顺序与逻辑一致。
不要改写原文台词或旁白内容。
所有划分结果必须完整输出在 标签内。
输入内容:
可能包含的角色列表:
{possible_characters}
可能包含的情绪列表:
{possible_emotions}
可能包含的情绪强弱列表:
{possible_strengths}
小说原文:
{novel_content}
"""
return textwrap.dedent(prompt)
def get_prompt_str():
prompt = """
你的任务是将给定小说内容划分为角色和内容,并输出为结构化JSON结果。
台词识别规则:
1. 必须完整保留原文内容,不得遗漏、删改或省略任何字句。
2. 提取角色对话内容喝旁白。识别所有内容,包括带引号(“”)、破折号(——)、感叹号(!)、冒号(:)等常见台词标记的文本,其余均为旁白内容。
3. 若角色在已知角色列表中,则直接使用该角色名;若不在列表中,则根据上下文合理判断角色身份。
4. 相邻台词如属同一角色,可合并为一条,但单条台词长度不得超过150字。
5. 若单条台词超过150字,需按语义完整性拆分为多条,每条不超过150字,并确保原文内容不缺失。
旁白识别规则:
1. 所有非台词的叙述性内容(包括心理活动、环境描写、动作描写、场景过渡等)均标记为“旁白”。
2. 必须保留原文的所有文字内容,不得遗漏、删改或省略任何字句。
3. 相邻的旁白内容可合并为一条,但单条长度不得超过150字。
4. 若单条旁白超过150字,需按语义完整性拆分为多条,每条不超过150字,确保原文内容完整呈现。
情绪与情绪强度识别规则:
1. 根据上下文语境、语气及场景变化,为每条台词识别情绪和情绪强度。
2. 情绪与强度必须严格从提供的情绪列表(possible_emotions)与强度列表(possible_strengths)中选择。
3. “旁白”内容的情绪与强度统一为:情绪“平静”,强度“中等”。
4. 情绪识别不得影响或改写原文内容,仅用于标注。
特殊情况处理:
1. 多角色连续对话时,确保每条台词对应正确角色,避免角色错配。
2. 当段落中混合出现旁白与台词时,应拆分为独立记录:旁白一条、台词一条。
3. 输出结果不得出现遗漏、重复、合并错误或原文缺失的情况。
4. 拆分、合并及情绪标注仅为结构化目的,须保证原文内容100%完整保留。
输出格式:
严格输出为 json数组。
示例:
小说原文:
一名靠前的灰衣少年似乎与石台上的少年颇为熟悉,他听得大伙的窃窃私语,不由得得意一笑,压低声音道:“牧哥可是被选拔出来参加过“灵路”的人,我们整个北灵境中,可就牧哥一人有名额,你们应该也知道参加“灵路”的都是些什么变态吧?当年我们这北灵境可是因为此事沸腾了好一阵的,从那里出来的人,最后基本全部都是被“五大院”给预定了的。”
输出:
[
{"role_name": "旁白", "text_content": "一名靠前的灰衣少年似乎与石台上的少年颇为熟悉,他听得大伙的窃窃私语,不由得得意一笑,压低声音道", "emotion_name": "平静", "strength_name": "中等"},
{"role_name": "灰衣少年", "text_content": "牧哥可是被选拔出来参加过“灵路”的人,我们整个北灵境中,可就牧哥一人有名额,你们应该也知道参加“灵路”的都是些什么变态吧?当年我们这北灵境可是因为此事沸腾了好一阵的,从那里出来的人,最后基本全部都是被“五大院”给预定了的。", "emotion_name": "高兴", "strength_name": "中等"}
]
输入内容:
可能包含的角色列表:
{possible_characters}
可能包含的情绪列表:
{possible_emotions}
可能包含的情绪强弱列表:
{possible_strengths}
小说原文:
{novel_content}
"""
return textwrap.dedent(prompt)
def get_auto_fix_json_prompt(json_str: str) -> str:
prompt = f"""
你将收到一段可能出错的 JSON 字符串(它可能是 LLM 生成的结果),其中可能存在以下问题:
多余或缺失的逗号
缺少引号或多余引号
键值格式错误
JSON 外含无关说明文字
非法转义符
你的任务是:
仅输出一个严格合法、可被 json.loads 解析的 JSON。
保持原有数据结构和内容不变(除非必须修正格式)。
不要在 JSON 外输出任何解释、额外文字或注释。
输出必须完整输出在 标签内。
输入内容:
{json_str}
w
"""
return textwrap.dedent(prompt)
def get_add_smart_role_and_voice(original_text: str, role_name, voice_names):
prompt = f"""
你是“角色音色匹配助手”。你的任务是:根据小说原文中的角色表现,为每个在中出现的角色匹配最符合其语气与性格的音色。
原文内容:
{original_text}
角色列表信息:
{role_name}
音色列表信息:
{voice_names}
匹配规则(必须严格遵守):
1. 仅根据【原文内容】判断哪些角色实际出现;未在原文中出现的角色一律忽略,不输出。
2. 对于每个实际出现的角色,根据原文中体现的性格特征、语气风格、情绪倾向、年龄感等信息,推断该角色适合的音色类型。
3. 再根据音色库中每个音色的名称或描述,为角色挑选最匹配的音色。
4. 若某角色最匹配的音色与其他角色重复使用是不允许的(音色数量可能不足)。
5. 若确实存在无法匹配的角色(例如原文完全无语气风格线索),则该角色不输出。
6. 不得臆造原文中不存在的角色特征或音色特征。
7. 最终输出必须是一个标准 JSON 数组,且数组中的每个对象必须包含:
- "role_name": 角色名
- "voice_name": 匹配的音色名
输出格式要求:
- 严格输出 JSON 数组。
- 不得输出任何解释说明、自然语言、注释或多余内容。
示例输出(格式示例):
[
{{ "role_name": "灰衣少年", "voice_name": "小王" }},
{{ "role_name": "白衣少年", "voice_name": "小正" }}
]
"""
return textwrap.dedent(prompt)
def get_subtitle_correction_prompt(original_text: str, subtitle_lines: list) -> str:
"""
生成字幕矫正的prompt
original_text: 原始正确文本
subtitle_lines: ASR识别的字幕行列表,格式为 [{"index": 1, "text": "..."}]
"""
subtitle_json = "\n".join([f' {{"index": {item["index"]}, "text": "{item["text"]}"}}' for item in subtitle_lines])
prompt = f"""
你是一个专业的字幕校对助手。你的任务是根据原文内容,修正ASR自动识别产生的字幕错误。
## 任务说明
ASR(自动语音识别)生成的字幕可能存在以下问题:
1. 同音字错误(如"他"与"她"、"的"与"得")
2. 近音字错误
3. 词语分割错误
4. 标点符号错误或缺失
你需要参考原文,将每条字幕修正为正确的文本。
## 重要规则
1. 严格保持字幕条目数量不变(输入多少条,输出多少条)
2. 尽量保持每条字幕的长度相近,不要大幅改变字幕的切分位置
3. 仅修正错误,不要改写原意或增删内容
4. 如果某条字幕已经正确,原样保留
5. 输出格式必须是JSON数组
## 原文内容
{original_text}
## 待矫正的字幕
[
{subtitle_json}
]
## 输出格式
严格输出JSON数组,每个元素包含index和corrected_text字段:
[
{{"index": 1, "corrected_text": "修正后的文本"}},
{{"index": 2, "corrected_text": "修正后的文本"}}
]
请开始矫正:
"""
return textwrap.dedent(prompt)
================================================
FILE: SonicVale/app/core/response.py
================================================
# app/core/response.py
from pydantic.generics import GenericModel
from typing import Generic, TypeVar, Optional
T = TypeVar("T")
class Res(GenericModel, Generic[T]):
code: int = 200
message: str = "success"
data: Optional[T] = None
================================================
FILE: SonicVale/app/core/subtitle/ASRData.py
================================================
import json
import logging
import re
from typing import List
from pathlib import Path
class ASRDataSeg:
def __init__(self, text, start_time, end_time):
self.text = text
self.start_time = start_time
self.end_time = end_time
def to_srt_ts(self) -> str:
"""Convert to SRT timestamp format"""
return f"{self._ms_to_srt_time(self.start_time)} --> {self._ms_to_srt_time(self.end_time)}"
def to_lrc_ts(self) -> str:
"""Convert to LRC timestamp format"""
return f"[{self._ms_to_lrc_time(self.start_time)}]"
def to_ass_ts(self) -> tuple[str, str]:
"""Convert to ASS timestamp format"""
return self._ms_to_ass_ts(self.start_time), self._ms_to_ass_ts(self.end_time)
def _ms_to_lrc_time(self, ms) -> str:
seconds = ms / 1000
minutes, seconds = divmod(seconds, 60)
return f"{int(minutes):02}:{seconds:.2f}"
@staticmethod
def _ms_to_srt_time(ms) -> str:
"""Convert milliseconds to SRT time format (HH:MM:SS,mmm)"""
total_seconds, milliseconds = divmod(ms, 1000)
minutes, seconds = divmod(total_seconds, 60)
hours, minutes = divmod(minutes, 60)
return f"{int(hours):02}:{int(minutes):02}:{int(seconds):02},{int(milliseconds):03}"
@staticmethod
def _ms_to_ass_ts(ms) -> str:
"""Convert milliseconds to ASS timestamp format (H:MM:SS.cc)"""
total_seconds, milliseconds = divmod(ms, 1000)
minutes, seconds = divmod(total_seconds, 60)
hours, minutes = divmod(minutes, 60)
# ASS格式使用厘秒(1/100秒)而不是毫秒
centiseconds = int(milliseconds / 10)
return f"{int(hours):01}:{int(minutes):02}:{int(seconds):02}.{centiseconds:02}"
@property
def transcript(self) -> str:
"""Return segment text"""
return self.text
def __str__(self) -> str:
return f"ASRDataSeg({self.text}, {self.start_time}, {self.end_time})"
class ASRData:
def __init__(self, segments: List[ASRDataSeg]):
self.segments = segments
def __iter__(self):
return iter(self.segments)
def __len__(self) -> int:
return len(self.segments)
def has_data(self) -> bool:
"""Check if there are any utterances"""
return len(self.segments) > 0
def is_word_timestamp(self) -> bool:
"""
判断是否是字级时间戳
规则:
1. 对于英文,每个segment应该只包含一个单词
2. 对于中文,每个segment应该只包含一个汉字
3. 允许20%的误差率
"""
if not self.segments:
return False
valid_segments = 0
total_segments = len(self.segments)
for seg in self.segments:
text = seg.text.strip()
# 检查是否只包含一个英文单词或一个汉字
if (len(text.split()) == 1 and text.isascii()) or len(text.strip()) <= 2:
valid_segments += 1
logging.info("valid_segments: %s, total_segments: %s", valid_segments, total_segments)
return (valid_segments / total_segments) >= 0.8
def save(self, save_path: str, ass_style: str = None, layout: str = "原文在上") -> None:
"""Save the ASRData to a file"""
# 根据文件后缀名选择保存格式
Path(save_path).parent.mkdir(parents=True, exist_ok=True)
if save_path.endswith('.srt'):
self.to_srt(save_path=save_path)
elif save_path.endswith('.txt'):
with open(save_path, 'w', encoding='utf-8') as f:
f.write(self.to_txt())
elif save_path.endswith('.json'):
with open(save_path, 'w', encoding='utf-8') as f:
json.dump(self.to_json(), f, ensure_ascii=False)
elif save_path.endswith('.ass'):
self.to_ass(save_path=save_path, style_str=ass_style, layout=layout)
else:
raise ValueError(f"Unsupported file extension: {save_path}")
def to_txt(self) -> str:
"""Convert to plain text subtitle format (without timestamps)"""
return "\n".join(seg.transcript for seg in self.segments)
def to_srt(self, save_path=None) -> str:
"""Convert to SRT subtitle format"""
srt_text = "\n".join(
f"{n}\n{seg.to_srt_ts()}\n{seg.transcript}\n"
for n, seg in enumerate(self.segments, 1))
if save_path:
with open(save_path, 'w', encoding='utf-8') as f:
f.write(srt_text)
return srt_text
def to_lrc(self, save_path=None) -> str:
"""Convert to LRC subtitle format"""
lrc_text = "\n".join(
f"{seg.to_lrc_ts()}{seg.transcript}" for seg in self.segments
)
if save_path:
with open(save_path, 'w', encoding='utf-8') as f:
f.write(lrc_text)
return lrc_text
def to_json(self) -> dict:
result_json = {}
for i, segment in enumerate(self.segments, 1):
# 检查是否有换行符
if "\n" in segment.text:
original_subtitle, translated_subtitle = segment.text.split("\n")
else:
original_subtitle, translated_subtitle = segment.text, ""
result_json[str(i)] = {
"start_time": segment.start_time,
"end_time": segment.end_time,
"original_subtitle": original_subtitle,
"translated_subtitle": translated_subtitle
}
return result_json
def to_ass(self, style_str: str = None, layout: str = "原文在上", save_path: str = None) -> str:
"""转换为ASS字幕格式
Args:
style_str: ASS样式字符串,为空则使用默认样式
layout: 字幕布局,可选值["译文在上", "原文在上", "仅原文", "仅译文"]
Returns:
ASS格式字幕内容
"""
# 默认样式
if not style_str:
style_str = (
"[V4+ Styles]\n"
"Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,OutlineColour,BackColour,"
"Bold,Italic,Underline,StrikeOut,ScaleX,ScaleY,Spacing,Angle,BorderStyle,Outline,Shadow,"
"Alignment,MarginL,MarginR,MarginV,Encoding\n"
"Style: Default,微软雅黑,66,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,-1,0,0,0,100,100,"
"0,0,1,2,0,2,10,10,10,1\n"
"Style: Translate,微软雅黑,40,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,-1,0,0,0,100,100,"
"0,0,1,2,0,2,10,10,10,1"
)
# 构建ASS文件头
ass_content = (
"[Script Info]\n"
"; Script generated by VideoCaptioner\n"
"; https://github.com/weifeng2333\n"
"ScriptType: v4.00+\n"
"PlayResX: 1280\n"
"PlayResY: 720\n\n"
f"{style_str}\n\n"
"[Events]\n"
"Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text\n"
)
# 根据布局生成对话内容
for seg in self.segments:
start_time = seg.to_ass_ts()[0]
end_time = seg.to_ass_ts()[1]
dialogue_template = 'Dialogue: 0,{},{},{},,0,0,0,,{}\n'
# 检查是否有换行符分隔的原文和译文
if "\n" in seg.text:
original, translate = seg.text.split("\n")
if layout == "译文在上" and translate:
ass_content += dialogue_template.format(start_time, end_time, "Secondary", original)
ass_content += dialogue_template.format(start_time, end_time, "Default", translate)
elif layout == "原文在上" and translate:
ass_content += dialogue_template.format(start_time, end_time, "Secondary", translate)
ass_content += dialogue_template.format(start_time, end_time, "Default", original)
elif layout == "仅原文":
ass_content += dialogue_template.format(start_time, end_time, "Default", original)
elif layout == "仅译文" and translate:
ass_content += dialogue_template.format(start_time, end_time, "Default", translate)
else:
original = seg.text
ass_content += dialogue_template.format(start_time, end_time, "Default", original)
# 根据布局生成对话行
if save_path:
with open(save_path, 'w', encoding='utf-8') as f:
f.write(ass_content)
return ass_content
def merge_segments(self, start_index: int, end_index: int, merged_text: str = None):
"""合并从 start_index 到 end_index 的段(包含)。"""
if start_index < 0 or end_index >= len(self.segments) or start_index > end_index:
raise IndexError("无效的段索引。")
merged_start_time = self.segments[start_index].start_time
merged_end_time = self.segments[end_index].end_time
if merged_text is None:
merged_text = ''.join(seg.text for seg in self.segments[start_index:end_index+1])
merged_seg = ASRDataSeg(merged_text, merged_start_time, merged_end_time)
# 替换 segments[start_index:end_index+1] 为 merged_seg
self.segments[start_index:end_index+1] = [merged_seg]
def merge_with_next_segment(self, index: int) -> None:
"""合并指定索引的段与下一个段。"""
if index < 0 or index >= len(self.segments) - 1:
raise IndexError("索引超出范围或没有下一个段可合并。")
current_seg = self.segments[index]
next_seg = self.segments[index + 1]
# 合并文本
merged_text = f"{current_seg.text} {next_seg.text}"
merged_start_time = current_seg.start_time
merged_end_time = next_seg.end_time
merged_seg = ASRDataSeg(merged_text, merged_start_time, merged_end_time)
# 替换当前段为合并后的段
self.segments[index] = merged_seg
# 删除下一个段
del self.segments[index + 1]
def __str__(self):
return self.to_txt()
def from_subtitle_file(file_path: str) -> 'ASRData':
"""从文件路径加载ASRData实例
Args:
file_path: 字幕文件路径,支持.srt、.vtt、.ass、.json格式
Returns:
ASRData: 解析后的ASRData实例
Raises:
ValueError: 不支持的文件格式或文件读取错误
"""
file_path = Path(file_path)
if not file_path.exists():
raise FileNotFoundError(f"文件不存在: {file_path}")
try:
content = file_path.read_text(encoding='utf-8')
except UnicodeDecodeError:
content = file_path.read_text(encoding='gbk')
suffix = file_path.suffix.lower()
if suffix == '.srt':
return from_srt(content)
elif suffix == '.vtt':
if '' in content: # YouTube VTT格式包含字级时间戳
return from_youtube_vtt(content)
return from_vtt(content)
elif suffix == '.ass':
return from_ass(content)
elif suffix == '.json':
return from_json(json.loads(content))
else:
raise ValueError(f"不支持的文件格式: {suffix}")
def from_json(json_data: dict) -> 'ASRData':
"""从JSON数据创建ASRData实例"""
segments = []
for i in sorted(json_data.keys(), key=int):
segment_data = json_data[i]
text = segment_data['original_subtitle']
if segment_data['translated_subtitle']:
text += '\n' + segment_data['translated_subtitle']
segment = ASRDataSeg(
text=text,
start_time=segment_data['start_time'],
end_time=segment_data['end_time']
)
segments.append(segment)
return ASRData(segments)
def from_srt(srt_str: str) -> 'ASRData':
"""
从SRT格式的字符串创建ASRData实例。
:param srt_str: 包含SRT格式字幕的字符串。
:return: 解析后的ASRData实例。
"""
segments = []
srt_time_pattern = re.compile(
r'(\d{2}):(\d{2}):(\d{1,2})[.,](\d{3})\s-->\s(\d{2}):(\d{2}):(\d{1,2})[.,](\d{3})'
)
for block in re.split(r'\n\s*\n', srt_str.strip()):
lines = block.splitlines()
if len(lines) < 3:
continue
match = srt_time_pattern.match(lines[1])
if not match:
continue
time_parts = list(map(int, match.groups()))
start_time = sum([
time_parts[0] * 3600000,
time_parts[1] * 60000,
time_parts[2] * 1000,
time_parts[3]
])
end_time = sum([
time_parts[4] * 3600000,
time_parts[5] * 60000,
time_parts[6] * 1000,
time_parts[7]
])
text = '\n'.join(lines[2:]).strip()
segments.append(ASRDataSeg(text, start_time, end_time))
return ASRData(segments)
def from_vtt(vtt_str: str) -> 'ASRData':
"""
从YouTube VTT格式的字符串创建ASRData实例。
:param vtt_str: YouTube VTT格式的字幕字符串
:return: ASRData实例
"""
segments = []
# 跳过头部元数据
content = vtt_str.split('\n\n')[2:]
current_text = ""
current_start = 0
current_end = 0
for block in content:
lines = block.strip().split('\n')
if not lines:
continue
# 解析时间戳行
timestamp_line = lines[0]
if '-->' not in timestamp_line:
continue
# 提取开始和结束时间
times = timestamp_line.split(' --> ')[0]
hours, minutes, seconds = times.split(':')
seconds, milliseconds = seconds.split('.')
start_time = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000 + int(milliseconds)
times = timestamp_line.split(' --> ')[1].split()[0]
hours, minutes, seconds = times.split(':')
seconds, milliseconds = seconds.split('.')
end_time = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000 + int(milliseconds)
# 提取并清文本内容
if len(lines) > 1:
text_line = lines[1]
# 移除时间戳和样式标记
cleaned_text = re.sub(r'<\d{2}:\d{2}:\d{2}\.\d{3}>', '', text_line)
cleaned_text = re.sub(r'?c>', '', cleaned_text)
cleaned_text = cleaned_text.strip()
if cleaned_text and cleaned_text != " ":
segments.append(ASRDataSeg(cleaned_text, start_time, end_time))
return ASRData(segments)
def from_youtube_vtt(vtt_str: str) -> 'ASRData':
"""
从YouTube VTT格式的字符串创建ASRData实例,提取字级时间戳。
:param vtt_str: 包含VTT格式字幕的字符串
:return: 解析后的ASRData实例
"""
def parse_timestamp(ts: str) -> int:
"""将时间戳字符串转换为毫秒"""
h, m, s = ts.split(':')
return int(float(h) * 3600000 + float(m) * 60000 + float(s) * 1000)
def split_timestamped_text(text: str) -> List[ASRDataSeg]:
"""分离带时间戳的文本为单词段"""
# 匹配 <时间戳>文本 的模式
pattern = re.compile(r'<(\d{2}:\d{2}:\d{2}\.\d{3})>([^<]*)')
matches = list(pattern.finditer(text))
word_segments = []
for i in range(len(matches) - 1):
current_match = matches[i]
next_match = matches[i + 1]
start_time = parse_timestamp(current_match.group(1))
end_time = parse_timestamp(next_match.group(1))
word = current_match.group(2).strip()
if word: # 只有当文本不为空时才创建segment
word_segments.append(ASRDataSeg(word, start_time, end_time))
return word_segments
segments = []
# 跳过WEBVTT头部
blocks = re.split(r'\n\n+', vtt_str.strip())
# 时间戳匹配模式
timestamp_pattern = re.compile(
r'(\d{2}):(\d{2}):(\d{2}\.\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}\.\d{3})'
)
for block in blocks:
lines = block.strip().split('\n')
if not lines:
continue
# 匹配时间戳行
match = timestamp_pattern.match(lines[0])
if not match:
continue
# 计算块的开始和结束时间
block_start_time = (
int(match.group(1)) * 3600000 +
int(match.group(2)) * 60000 +
float(match.group(3)) * 1000
)
block_end_time = (
int(match.group(4)) * 3600000 +
int(match.group(5)) * 60000 +
float(match.group(6)) * 1000
)
# 获取文本内容
text = '\n'.join(lines)
timestamp_row = re.search(r'\n(.*?.*?.*)', block)
if timestamp_row:
text = re.sub(r'|', '', timestamp_row.group(1))
block_start_time_string = f"{match.group(1)}:{match.group(2)}:{match.group(3)}"
block_end_time_string = f"{match.group(4)}:{match.group(5)}:{match.group(6)}"
text = f"<{block_start_time_string}>{text}<{block_end_time_string}>"
# 分离每个带时间戳的单词
word_segments = split_timestamped_text(text)
segments.extend(word_segments)
return ASRData(segments)
def from_ass(ass_str: str) -> 'ASRData':
"""
从ASS格式的字符串创建ASRData实例。
:param ass_str: 包含ASS格式字幕的字符串
:return: ASRData实例
"""
segments = []
# ASS时间戳格式: H:MM:SS.cc
ass_time_pattern = re.compile(r'Dialogue: \d+,(\d+:\d{2}:\d{2}\.\d{2}),(\d+:\d{2}:\d{2}\.\d{2}),(.*?),.*?,\d+,\d+,\d+,.*?,(.*?)$')
def parse_ass_time(time_str: str) -> int:
"""将ASS时间戳转换为毫秒"""
hours, minutes, seconds = time_str.split(':')
seconds, centiseconds = seconds.split('.')
return (int(hours) * 3600000 +
int(minutes) * 60000 +
int(seconds) * 1000 +
int(centiseconds) * 10) # 厘秒转毫秒
# 按行处理ASS文件
for line in ass_str.splitlines():
if line.startswith('Dialogue:'):
match = ass_time_pattern.match(line)
if match:
start_time = parse_ass_time(match.group(1))
end_time = parse_ass_time(match.group(2))
text = match.group(4)
# 清理ASS格式标记
text = re.sub(r'\{[^}]*\}', '', text) # 移除样式标记 {xxx}
text = text.replace('\\N', '\n') # 处理换行符
text = text.strip()
if text: # 只有当文本不为空时才创建segment
segments.append(ASRDataSeg(text, start_time, end_time))
return ASRData(segments)
if __name__ == '__main__':
ass_style_str = """[V4+ Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,OutlineColour,BackColour,Bold,Italic,Underline,StrikeOut,ScaleX,ScaleY,Spacing,Angle,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,Encoding
Style: Default,微软雅黑,62,&H0017f1be,&H000000FF,&H00000000,&H00000000,-1,0,0,0,100,100,1.0,0,1,0.8,0,2,10,10,10,1
Style: Secondary,微软雅黑,40,&H00ffffff,&H000000FF,&H00000000,&H00000000,-1,0,0,0,100,100,0.0,0,1,0.0,0,2,10,10,10,1"""
# 测试
from pathlib import Path
# vtt_file_path = r"E:\GithubProject\VideoCaptioner\app\work_dir\Setting the record straight\subtitle\original_subtitle.en.vtt"
# vtt_file_path = r"E:\GithubProject\VideoCaptioner\work_dir\Wake up babe a dangerous new open-source AI model is here\subtitle\original.en.vtt"
# asr_data = from_youtube_vtt(Path(vtt_file_path).read_text(encoding="utf-8"))
srt_file_path = r"E:\GithubProject\VideoCaptioner\app\work_dir\低视力音乐助人者_mp4\result_subtitle.srt"
asr_data = from_srt(Path(srt_file_path).read_text(encoding="utf-8"))
logging.info("%s", asr_data.to_ass(style_str=ass_style_str, save_path=srt_file_path.replace(".srt", ".ass")))
# pass
# asr_data = ASRData(seg)
# Uncomment to test different formats:
# print(asr_data.to_srt(save_path=vtt_file_path.replace(".vtt", ".srt")))
# print(asr_data.to_lrc())
# print(asr_data.to_txt())
# print(asr_data.to_json())
# print(asr_data.to_json())
================================================
FILE: SonicVale/app/core/subtitle/BaseASR.py
================================================
import json
import logging
import os
import zlib
import tempfile
import threading
from .ASRData import ASRDataSeg, ASRData
class BaseASR:
SUPPORTED_SOUND_FORMAT = ["flac", "m4a", "mp3", "wav"]
CACHE_FILE = os.path.join(tempfile.gettempdir(), "bk_asr", "asr_cache.json")
_lock = threading.Lock()
def __init__(self, audio_path: [str, bytes], use_cache: bool = False):
self.audio_path = audio_path
self.file_binary = None
self.crc32_hex = None
self.use_cache = use_cache
self._set_data()
self.cache = self._load_cache()
def _load_cache(self):
if not self.use_cache:
return {}
os.makedirs(os.path.dirname(self.CACHE_FILE), exist_ok=True)
with self._lock:
if os.path.exists(self.CACHE_FILE):
try:
with open(self.CACHE_FILE, 'r', encoding='utf-8') as f:
cache = json.load(f)
if isinstance(cache, dict):
return cache
except (json.JSONDecodeError, IOError):
return {}
return {}
def _save_cache(self):
if not self.use_cache:
return
with self._lock:
try:
with open(self.CACHE_FILE, 'w', encoding='utf-8') as f:
json.dump(self.cache, f, ensure_ascii=False, indent=2)
if os.path.exists(self.CACHE_FILE) and os.path.getsize(self.CACHE_FILE) > 10 * 1024 * 1024:
os.remove(self.CACHE_FILE)
except IOError as e:
logging.error(f"Failed to save cache: {e}")
def _set_data(self):
if isinstance(self.audio_path, bytes):
self.file_binary = self.audio_path
else:
ext = self.audio_path.split(".")[-1].lower()
assert ext in self.SUPPORTED_SOUND_FORMAT, f"Unsupported sound format: {ext}"
assert os.path.exists(self.audio_path), f"File not found: {self.audio_path}"
with open(self.audio_path, "rb") as f:
self.file_binary = f.read()
crc32_value = zlib.crc32(self.file_binary) & 0xFFFFFFFF
self.crc32_hex = format(crc32_value, '08x')
def _get_key(self):
return f"{self.__class__.__name__}-{self.crc32_hex}"
def run(self):
k = self._get_key()
if k in self.cache and self.use_cache:
resp_data = self.cache[k]
else:
resp_data = self._run()
# Cache the result
self.cache[k] = resp_data
self._save_cache()
segments = self._make_segments(resp_data)
return ASRData(segments)
def _make_segments(self, resp_data: dict) -> list[ASRDataSeg]:
raise NotImplementedError("_make_segments method must be implemented in subclass")
def _run(self) -> dict:
""" Run the ASR service and return the response data. """
raise NotImplementedError("_run method must be implemented in subclass")
================================================
FILE: SonicVale/app/core/subtitle/BcutASR.py
================================================
import json
import logging
import time
from os import PathLike
from typing import Optional
import requests
from .ASRData import ASRData, ASRDataSeg
from .BaseASR import BaseASR
__version__ = "0.0.3"
API_BASE_URL = "https://member.bilibili.com/x/bcut/rubick-interface"
# 申请上传
API_REQ_UPLOAD = API_BASE_URL + "/resource/create"
# 提交上传
API_COMMIT_UPLOAD = API_BASE_URL + "/resource/create/complete"
# 创建任务
API_CREATE_TASK = API_BASE_URL + "/task"
# 查询结果
API_QUERY_RESULT = API_BASE_URL + "/task/result"
class BcutASR(BaseASR):
"""必剪 语音识别接口"""
headers = {
'User-Agent': 'Bilibili/1.0.0 (https://www.bilibili.com)',
'Content-Type': 'application/json'
}
def __init__(self, audio_path: [str, bytes], use_cache: bool = False):
super().__init__(audio_path, use_cache=use_cache)
self.session = requests.Session()
self.task_id = None
self.__etags = []
self.__in_boss_key: Optional[str, None] = None
self.__resource_id: Optional[str, None] = None
self.__upload_id: Optional[str, None] = None
self.__upload_urls: Optional[list[str]] = []
self.__per_size: Optional[int, None] = None
self.__clips: Optional[int, None] = None
self.__etags: Optional[list[str]] = []
self.__download_url: Optional[str, None] = None
self.task_id: Optional[str, None] = None
def upload(self) -> None:
"""申请上传"""
if not self.file_binary:
raise ValueError("none set data")
payload = json.dumps({
"type": 2,
"name": "audio.mp3",
"size": len(self.file_binary),
"ResourceFileType": "mp3",
"model_id": "8",
})
resp = requests.post(
API_REQ_UPLOAD,
data=payload,
headers=self.headers
)
resp.raise_for_status()
resp = resp.json()
resp_data = resp["data"]
self.__in_boss_key = resp_data["in_boss_key"]
self.__resource_id = resp_data["resource_id"]
self.__upload_id = resp_data["upload_id"]
self.__upload_urls = resp_data["upload_urls"]
self.__per_size = resp_data["per_size"]
self.__clips = len(resp_data["upload_urls"])
logging.info(
f"申请上传成功, 总计大小{resp_data['size'] // 1024}KB, {self.__clips}分片, 分片大小{resp_data['per_size'] // 1024}KB: {self.__in_boss_key}"
)
self.__upload_part()
self.__commit_upload()
def __upload_part(self) -> None:
"""上传音频数据"""
for clip in range(self.__clips):
start_range = clip * self.__per_size
end_range = (clip + 1) * self.__per_size
logging.info(f"开始上传分片{clip}: {start_range}-{end_range}")
resp = requests.put(
self.__upload_urls[clip],
data=self.file_binary[start_range:end_range],
headers=self.headers
)
resp.raise_for_status()
etag = resp.headers.get("Etag")
self.__etags.append(etag)
logging.info(f"分片{clip}上传成功: {etag}")
def __commit_upload(self) -> None:
"""提交上传数据"""
data = json.dumps({
"InBossKey": self.__in_boss_key,
"ResourceId": self.__resource_id,
"Etags": ",".join(self.__etags),
"UploadId": self.__upload_id,
"model_id": "8",
})
resp = requests.post(
API_COMMIT_UPLOAD,
data=data,
headers=self.headers
)
resp.raise_for_status()
resp = resp.json()
self.__download_url = resp["data"]["download_url"]
logging.info(f"提交成功")
def create_task(self) -> str:
"""开始创建转换任务"""
resp = requests.post(
API_CREATE_TASK, json={"resource": self.__download_url, "model_id": "8"}, headers=self.headers
)
resp.raise_for_status()
resp = resp.json()
self.task_id = resp["data"]["task_id"]
logging.info(f"任务已创建: {self.task_id}")
return self.task_id
def result(self, task_id: Optional[str] = None):
"""查询转换结果"""
resp = requests.get(API_QUERY_RESULT, params={"model_id": 7, "task_id": task_id or self.task_id}, headers=self.headers)
resp.raise_for_status()
resp = resp.json()
return resp["data"]
def _run(self):
self.upload()
self.create_task()
# 轮询检查任务状态
for _ in range(500):
task_resp = self.result()
if task_resp["state"] == 4:
break
time.sleep(1)
logging.info(f"转换成功")
return json.loads(task_resp["result"])
def _make_segments(self, resp_data: dict) -> list[ASRDataSeg]:
return [ASRDataSeg(u['transcript'], u['start_time'], u['end_time']) for u in resp_data['utterances']]
if __name__ == '__main__':
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
# Example usage
audio_file = r"test.mp3"
asr = BcutASR(audio_file)
asr_data = asr.run()
logging.info("%s", asr_data)
================================================
FILE: SonicVale/app/core/subtitle/JianYingASR.py
================================================
import datetime
import hashlib
import hmac
import json
import os
import time
import uuid
from typing import Dict, Tuple, Union
import requests
from .ASRData import ASRDataSeg
from .BaseASR import BaseASR
# from ASRData import ASRDataSeg
# from BaseASR import BaseASR
class JianYingASR(BaseASR):
def __init__(self, audio_path: Union[str, bytes], use_cache: bool = False, need_word_time_stamp: bool = False,
start_time: float = 0, end_time: float = 6000):
super().__init__(audio_path, use_cache)
self.audio_path = audio_path
self.end_time = end_time
self.start_time = start_time
# AWS credentials
self.session_token = None
self.secret_key = None
self.access_key = None
# Upload details
self.store_uri = None
self.auth = None
self.upload_id = None
self.session_key = None
self.upload_hosts = None
self.need_word_time_stamp = need_word_time_stamp
self.tdid = "3943278516897751" if datetime.datetime.now().year != 2024 else f"{uuid.getnode():012d}"
def submit(self) -> str:
"""Submit the task"""
url = "https://lv-pc-api-sinfonlinec.ulikecam.com/lv/v1/audio_subtitle/submit"
payload = {
"adjust_endtime": 200,
"audio": self.store_uri,
"caption_type": 2,
"client_request_id": "45faf98c-160f-4fae-a649-6d89b0fe35be",
"max_lines": 1,
"songs_info": [{"end_time": self.end_time, "id": "", "start_time": self.start_time}],
"words_per_line": 16
}
sign, device_time = self._generate_sign_parameters(url='/lv/v1/audio_subtitle/submit', pf='4', appvr='6.6.0',
tdid=self.tdid)
headers = self._build_headers(device_time, sign)
response = requests.post(url, json=payload, headers=headers)
resp_data = response.json()
if resp_data.get('ret') != '0':
error_msg = f"API Error: {resp_data.get('errmsg', 'Unknown error')} (ret: {resp_data.get('ret')})"
raise ValueError(error_msg)
query_id = resp_data['data']['id']
return query_id
def upload(self):
"""Upload the file"""
self._upload_sign()
self._upload_auth()
self._upload_file()
self._upload_check()
uri = self._upload_commit()
return uri
def query(self, query_id: str):
"""Query the task"""
url = "https://lv-pc-api-sinfonlinec.ulikecam.com/lv/v1/audio_subtitle/query"
payload = {
"id": query_id,
"pack_options": {"need_attribute": True}
}
sign, device_time = self._generate_sign_parameters(url='/lv/v1/audio_subtitle/query', pf='4', appvr='6.6.0',
tdid=self.tdid)
headers = self._build_headers(device_time, sign)
response = requests.post(url, json=payload, headers=headers)
resp_data = response.json()
if resp_data.get('ret') != '0':
error_msg = f"API Error: {resp_data.get('errmsg', 'Unknown error')} (ret: {resp_data.get('ret')})"
raise ValueError(error_msg)
return resp_data
def _run(self, callback=None):
# logging.info("正在上传文件...")
if callback:
callback(20, "正在上传...")
self.upload()
if callback:
callback(50, "提交任务...")
query_id = self.submit()
if callback:
callback(60, "获取结果...")
resp_data = self.query(query_id)
if callback:
callback(100, "转录完成")
return resp_data
def _make_segments(self, resp_data: dict) -> list[ASRDataSeg]:
if self.need_word_time_stamp:
return [ASRDataSeg(w['text'].strip(), w['start_time'], w['end_time']) for u in
resp_data['data']['utterances'] for w in u['words']]
else:
return [ASRDataSeg(u['text'], u['start_time'], u['end_time']) for u in resp_data['data']['utterances']]
def _get_key(self):
return f"{self.__class__.__name__}-{self.crc32_hex}-{self.need_word_time_stamp}"
def _generate_sign_parameters(self, url: str, pf: str = '4', appvr: str = '6.6.0', tdid='') -> \
Tuple[str, str]:
"""Generate signature and timestamp via an HTTP request"""
current_time = str(int(time.time()))
data = {
'url': url,
'current_time': current_time,
'pf': pf,
'appvr': appvr,
'tdid': self.tdid
}
# Replace with your actual endpoint URL
get_sign_url = 'https://asrtools-update.bkfeng.top/sign'
try:
response = requests.post(get_sign_url, json=data)
response.raise_for_status()
response_data = response.json()
sign = response_data.get('sign')
if not sign:
raise ValueError("No 'sign' in response")
except requests.exceptions.RequestException as e:
raise SystemExit(f"HTTP Request failed: {e}")
except ValueError as ve:
raise SystemExit(f"Invalid response: {ve}")
return sign.lower(), current_time
def _build_headers(self, device_time: str, sign: str) -> Dict[str, str]:
"""Build headers for requests"""
return {
'User-Agent': "Cronet/TTNetVersion:d4572e53 2024-06-12 QuicVersion:4bf243e0 2023-04-17",
'appvr': "6.6.0",
'device-time': str(device_time),
'pf': "4",
'sign': sign,
'sign-ver': "1",
'tdid': self.tdid,
}
def _uplosd_headers(self):
headers = {
'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 Thea/1.0.1",
'Authorization': self.auth,
'Content-CRC32': self.crc32_hex,
}
return headers
def _upload_sign(self):
"""Get upload sign"""
url = "https://lv-pc-api-sinfonlinec.ulikecam.com/lv/v1/upload_sign"
payload = json.dumps({"biz": "pc-recognition"})
sign, device_time = self._generate_sign_parameters(url='/lv/v1/upload_sign', pf='4', appvr='6.6.0',
tdid=self.tdid)
headers = self._build_headers(device_time, sign)
response = requests.post(url, data=payload, headers=headers)
response.raise_for_status()
login_data = response.json()
self.access_key = login_data['data']['access_key_id']
self.secret_key = login_data['data']['secret_access_key']
self.session_token = login_data['data']['session_token']
return self.access_key, self.secret_key, self.session_token
def _upload_auth(self):
"""Get upload authorization"""
if isinstance(self.audio_path, bytes):
file_size = len(self.audio_path)
else:
file_size = os.path.getsize(self.audio_path)
request_parameters = f'Action=ApplyUploadInner&FileSize={file_size}&FileType=object&IsInner=1&SpaceName=lv-mac-recognition&Version=2020-11-19&s=5y0udbjapi'
t = datetime.datetime.utcnow()
amz_date = t.strftime('%Y%m%dT%H%M%SZ')
datestamp = t.strftime('%Y%m%d')
headers = {
"x-amz-date": amz_date,
"x-amz-security-token": self.session_token
}
signature = aws_signature(self.secret_key, request_parameters, headers, region="cn", service="vod")
authorization = f"AWS4-HMAC-SHA256 Credential={self.access_key}/{datestamp}/cn/vod/aws4_request, SignedHeaders=x-amz-date;x-amz-security-token, Signature={signature}"
headers["authorization"] = authorization
response = requests.get(f"https://vod.bytedanceapi.com/?{request_parameters}", headers=headers)
store_infos = response.json()
self.store_uri = store_infos['Result']['UploadAddress']['StoreInfos'][0]['StoreUri']
self.auth = store_infos['Result']['UploadAddress']['StoreInfos'][0]['Auth']
self.upload_id = store_infos['Result']['UploadAddress']['StoreInfos'][0]['UploadID']
self.session_key = store_infos['Result']['UploadAddress']['SessionKey']
self.upload_hosts = store_infos['Result']['UploadAddress']['UploadHosts'][0]
self.store_uri = store_infos['Result']['UploadAddress']['StoreInfos'][0]['StoreUri']
return store_infos
def _upload_file(self):
"""Upload the file"""
url = f"https://{self.upload_hosts}/{self.store_uri}?partNumber=1&uploadID={self.upload_id}"
headers = self._uplosd_headers()
response = requests.put(url, data=self.file_binary, headers=headers)
resp_data = response.json()
assert resp_data['success'] == 0, f"File upload failed: {response.text}"
return resp_data
def _upload_check(self):
"""Check upload result"""
url = f"https://{self.upload_hosts}/{self.store_uri}?uploadID={self.upload_id}"
payload = f"1:{self.crc32_hex}"
headers = self._uplosd_headers()
response = requests.post(url, data=payload, headers=headers)
resp_data = response.json()
return resp_data
def _upload_commit(self):
"""Commit the uploaded file"""
url = f"https://{self.upload_hosts}/{self.store_uri}?uploadID={self.upload_id}&partNumber=1&x-amz-security-token={self.session_token}"
headers = self._uplosd_headers()
response = requests.put(url, data=self.file_binary, headers=headers)
return self.store_uri
def sign(key: bytes, msg: str) -> bytes:
"""使用HMAC-SHA256生成签名"""
return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()
def get_signature_key(secret_key: str, date_stamp: str, region_name: str, service_name: str) -> bytes:
"""生成用于AWS签名的密钥"""
k_date = sign(('AWS4' + secret_key).encode('utf-8'), date_stamp)
k_region = sign(k_date, region_name)
k_service = sign(k_region, service_name)
k_signing = sign(k_service, 'aws4_request')
return k_signing
def aws_signature(secret_key: str, request_parameters: str, headers: Dict[str, str],
method: str = "GET", payload: str = '', region: str = "cn", service: str = "vod") -> str:
"""生成AWS签名"""
canonical_uri = '/'
canonical_querystring = request_parameters
canonical_headers = '\n'.join([f"{key}:{value}" for key, value in headers.items()]) + '\n'
signed_headers = ';'.join(headers.keys())
payload_hash = hashlib.sha256(payload.encode('utf-8')).hexdigest()
canonical_request = f"{method}\n{canonical_uri}\n{canonical_querystring}\n{canonical_headers}\n{signed_headers}\n{payload_hash}"
amzdate = headers["x-amz-date"]
datestamp = amzdate.split('T')[0]
algorithm = 'AWS4-HMAC-SHA256'
credential_scope = f"{datestamp}/{region}/{service}/aws4_request"
string_to_sign = f"{algorithm}\n{amzdate}\n{credential_scope}\n{hashlib.sha256(canonical_request.encode('utf-8')).hexdigest()}"
signing_key = get_signature_key(secret_key, datestamp, region, service)
signature = hmac.new(signing_key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()
return signature
================================================
FILE: SonicVale/app/core/subtitle/KuaiShouASR.py
================================================
import requests
from .ASRData import ASRDataSeg
from .BaseASR import BaseASR
class KuaiShouASR(BaseASR):
def __init__(self, audio_path: [str, bytes], use_cache: bool = False):
super().__init__(audio_path, use_cache)
def _run(self) -> dict:
return self._submit()
def _make_segments(self, resp_data: dict) -> list[ASRDataSeg]:
return [ASRDataSeg(u['text'], u['start_time'], u['end_time']) for u in resp_data['data']['text']]
def _submit(self) -> dict:
payload = {
"typeId": "1"
}
files = [('file', ('test.mp3', self.file_binary, 'audio/mpeg'))]
result = requests.post("https://ai.kuaishou.com/api/effects/subtitle_generate", data=payload, files=files)
return result.json()
================================================
FILE: SonicVale/app/core/subtitle/WhisperASR.py
================================================
import os
from openai import OpenAI
from .ASRData import ASRDataSeg
from .BaseASR import BaseASR
class WhisperASR(BaseASR):
def __init__(self, audio_path: [str, bytes], model: str = MODEL, use_cache: bool = False):
super().__init__(audio_path, use_cache)
self.base_url = os.getenv('OPENAI_BASE_URL')
self.api_key = os.getenv('OPENAI_API_KEY')
if not self.base_url or not self.api_key:
raise ValueError("环境变量 OPENAI_BASE_URL 和 OPENAI_API_KEY 必须设置")
self.model = model
self.client = OpenAI(base_url=self.base_url, api_key=self.api_key)
def _run(self) -> dict:
return self._submit()
def _make_segments(self, resp_data: dict) -> list[ASRDataSeg]:
return [ASRDataSeg(u['text'], u['start'], u['end']) for u in resp_data['segments']]
def _get_key(self) -> str:
return f"{self.__class__.__name__}-{self.model}-{self.crc32_hex}-{self.model}"
def _submit(self) -> dict:
completion = self.client.audio.transcriptions.create(
model=self.model,
temperature=0,
response_format="verbose_json",
file=("test.mp3", self.file_binary, "audio/mp3"),
prompt="",
language="zh"
)
return completion.to_dict()
================================================
FILE: SonicVale/app/core/subtitle/__init__.py
================================================
================================================
FILE: SonicVale/app/core/subtitle/subtitle_engine.py
================================================
from app.core.subtitle.BcutASR import BcutASR
from app.core.subtitle.JianYingASR import JianYingASR
from app.core.prompts import get_subtitle_correction_prompt
from app.core.llm_engine import LLMEngine
def generate_subtitle(audio_file,save_path):
# asr = JianYingASR(audio_file)
asr = BcutASR(audio_file)
result = asr.run()
result.to_srt(save_path)
return result
# 字幕矫正
import re
import difflib
import shutil
import logging
from pypinyin import lazy_pinyin
# -------------------- 基础工具 --------------------
def is_same_char(c1: str, c2: str) -> bool:
"""字面相同或拼音相同(处理同音字)"""
if c1 == c2:
return True
return lazy_pinyin(c1) == lazy_pinyin(c2)
def correct_text_with_pinyin(original: str, recognized: str) -> str:
"""全局文本纠正:把 recognized 纠正到 original"""
sm = difflib.SequenceMatcher(None, recognized, original, autojunk=False)
out = []
for tag, i1, i2, j1, j2 in sm.get_opcodes():
if tag == "equal":
out.append(recognized[i1:i2])
else:
r = recognized[i1:i2]
o = original[j1:j2]
seg = []
L = max(len(r), len(o))
for k in range(L):
c1 = r[k] if k < len(r) else ""
c2 = o[k] if k < len(o) else ""
if c1 and c2 and is_same_char(c1, c2):
seg.append(c2)
elif c2:
seg.append(c2)
out.append("".join(seg))
return "".join(out)
# -------------------- SRT 读写 --------------------
SRT_BLOCK = re.compile(
r"(\d+)\s+([\d:,]+ --> [\d:,]+)\s+([\s\S]*?)(?=\n\n|\Z)", re.MULTILINE
)
def read_srt(path: str):
with open(path, "r", encoding="utf-8") as f:
content = f.read()
blocks = SRT_BLOCK.findall(content)
entries = []
for idx, ts, txt in blocks:
text_raw = txt.strip("\r\n")
entries.append((int(idx), ts, text_raw))
return entries
def write_srt(path: str, entries):
with open(path, "w", encoding="utf-8") as f:
for idx, ts, text in entries:
f.write(f"{idx}\n{ts}\n{text}\n\n")
# -------------------- 对齐切分 --------------------
def flatten_for_align(text: str) -> str:
return text.replace("\r", "").replace("\n", "")
def segment_corrected_by_recognized_boundaries(recognized_full: str,
corrected_full: str,
line_lengths: list[int]):
boundaries = [0]
acc = 0
for L in line_lengths:
acc += L
boundaries.append(acc)
sm = difflib.SequenceMatcher(None, recognized_full, corrected_full, autojunk=False)
ops = sm.get_opcodes()
out_lines, buf = [], []
r_pos, next_bi = 0, 1
next_boundary = boundaries[next_bi] if next_bi < len(boundaries) else len(recognized_full)
def flush_line():
nonlocal buf, out_lines, next_bi, next_boundary
out_lines.append("".join(buf))
buf = []
next_bi += 1
next_boundary = boundaries[next_bi] if next_bi < len(boundaries) else boundaries[-1]
for tag, i1, i2, j1, j2 in ops:
if tag in ("equal", "replace"):
while r_pos < i2:
take = min(i2 - r_pos, next_boundary - r_pos)
recog_len_total = i2 - i1
corr_len_total = j2 - j1
start_ratio = (r_pos - i1) / max(1, recog_len_total)
end_ratio = (r_pos + take - i1) / max(1, recog_len_total)
cj_start = j1 + round(start_ratio * corr_len_total)
cj_end = j1 + round(end_ratio * corr_len_total)
if cj_start < cj_end:
buf.append(corrected_full[cj_start:cj_end])
r_pos += take
if r_pos == next_boundary:
flush_line()
elif tag == "delete":
while r_pos < i2:
take = min(i2 - r_pos, next_boundary - r_pos)
r_pos += take
if r_pos == next_boundary:
flush_line()
elif tag == "insert":
buf.append(corrected_full[j1:j2])
if buf:
while len(out_lines) < len(line_lengths) - 1:
out_lines.append("")
out_lines.append("".join(buf))
if len(out_lines) < len(line_lengths):
out_lines += [""] * (len(line_lengths) - len(out_lines))
elif len(out_lines) > len(line_lengths):
extra = "".join(out_lines[len(line_lengths)-1:])
out_lines = out_lines[:len(line_lengths)-1] + [extra]
cleaned = []
for line in out_lines:
# line = re.sub(r"\s+", "", line)
# line = re.sub(r'^(…{1,2}|\.{3,}|[,。!?;:、”])+', '', line)
# line = re.sub(r'(…{1,2}|\.{3,}|[,。!?;:、“])+$', '', line)
# 同时匹配中英文符号
line = re.sub(r"\s+", "", line)
line = re.sub(r'^(…{1,2}|\.{3,}|[,,。.!!??;;::、”“"“])+', '', line)
line = re.sub(r'(…{1,2}|\.{3,}|[,,。.!!??;;::、”“"“])+$', '', line)
cleaned.append(line)
return cleaned
# -------------------- 外部调用 --------------------
def correct_srt_file(original_text: str, srt_path: str,
overwrite: bool = True, backup: bool = False,
out_path: str = None):
"""
original_text: 原始完整文本(直接传字符串)
srt_path: 输入字幕文件路径
overwrite: 是否覆盖原文件(默认 True)
backup: 覆盖时是否先生成 .bak 文件(默认 True)
out_path: 如果不覆盖,可以指定输出文件路径
"""
original_full = original_text.replace("\r", "").replace("\n", "").strip()
entries = read_srt(srt_path)
recognized_lines = [flatten_for_align(txt) for _, _, txt in entries]
recognized_full = "".join(recognized_lines)
corrected_full = correct_text_with_pinyin(original_full, recognized_full)
line_lengths = [len(s) for s in recognized_lines]
corrected_lines = segment_corrected_by_recognized_boundaries(
recognized_full, corrected_full, line_lengths
)
corrected_entries = []
for (idx, ts, _), line_text in zip(entries, corrected_lines):
corrected_entries.append((idx, ts, line_text))
# 目标路径
if overwrite:
if backup:
shutil.copy(srt_path, srt_path + ".bak")
logging.info("已生成备份文件:%s.bak", srt_path)
target_path = srt_path
else:
target_path = out_path or (srt_path + ".corrected.srt")
write_srt(target_path, corrected_entries)
logging.info("已生成 %s (逐行对齐修正完成)", target_path)
if __name__ == '__main__':
generate_subtitle("C:\\Users\\lxc18\\SonicVale\\1\\1\\audio\\id_2.wav","C:\\Users\\lxc18\\SonicVale\\1\\1\\audio\\id_1.srt")
# -------------------- LLM 字幕矫正 --------------------
def correct_srt_file_with_llm(
original_text: str,
srt_path: str,
llm_engine: LLMEngine,
batch_size: int = 20,
overwrite: bool = True,
backup: bool = False,
out_path: str = None
):
"""
使用LLM进行字幕矫正,分批处理
original_text: 原始完整文本
srt_path: 输入字幕文件路径
llm_engine: LLM引擎实例
batch_size: 每批处理的字幕条数(默认20条)
overwrite: 是否覆盖原文件
backup: 覆盖时是否生成.bak文件
out_path: 如果不覆盖,可以指定输出文件路径
"""
original_full = original_text.replace("\r", "").replace("\n", "").strip()
entries = read_srt(srt_path)
if not entries:
logging.warning("字幕文件为空:%s", srt_path)
return
# 分批处理
corrected_entries = []
total_batches = (len(entries) + batch_size - 1) // batch_size
for batch_idx in range(total_batches):
start_idx = batch_idx * batch_size
end_idx = min(start_idx + batch_size, len(entries))
batch_entries = entries[start_idx:end_idx]
logging.info("处理字幕批次 %d/%d (第%d-%d条)",
batch_idx + 1, total_batches, start_idx + 1, end_idx)
# 准备当前批次的字幕数据
subtitle_lines = [
{"index": idx, "text": txt.replace("\n", " ").replace('"', '\\"')}
for idx, ts, txt in batch_entries
]
# 调用LLM进行矫正
prompt = get_subtitle_correction_prompt(original_full, subtitle_lines)
try:
response = llm_engine.generate_text(prompt)
corrected_batch = llm_engine.save_load_json(response)
# 构建索引映射
corrected_map = {item["index"]: item["corrected_text"] for item in corrected_batch}
# 处理当前批次的结果
for idx, ts, original_txt in batch_entries:
corrected_text = corrected_map.get(idx, original_txt)
# 清理文本
corrected_text = clean_subtitle_text(corrected_text)
corrected_entries.append((idx, ts, corrected_text))
except Exception as e:
logging.error("批次 %d 矫正失败,使用原始文本: %s", batch_idx + 1, str(e))
# 失败时保留原始文本
for idx, ts, txt in batch_entries:
corrected_entries.append((idx, ts, txt))
# 确定目标路径
if overwrite:
if backup:
shutil.copy(srt_path, srt_path + ".bak")
logging.info("已生成备份文件:%s.bak", srt_path)
target_path = srt_path
else:
target_path = out_path or (srt_path + ".corrected.srt")
write_srt(target_path, corrected_entries)
logging.info("已生成 %s (LLM字幕矫正完成)", target_path)
def clean_subtitle_text(text: str) -> str:
"""清理字幕文本"""
# 去除空白字符
text = re.sub(r"\s+", "", text)
# 清理首尾标点
text = re.sub(r'^(…{1,2}|\.{3,}|[,,。.!!??;;::、"“”""])+', '', text)
text = re.sub(r'(…{1,2}|\.{3,}|[,,。.!!??;;::、"“”""])+$', '', text)
return text
================================================
FILE: SonicVale/app/core/text_correct_engine.py
================================================
import re
import json
import difflib
import logging
from typing import List, Dict, Tuple, Optional
class TextCorrectorFinal:
# 默认配置参数
DEFAULT_BASE_THRESHOLD = 0.65 # 基础相似度阈值
DEFAULT_BASE_WINDOW = 30 # 基础搜索窗口
DEFAULT_EXTENDED_WINDOW = 80 # 扩展搜索窗口(匹配失败时使用)
def __init__(self, base_threshold: float = None, base_window: int = None):
"""初始化文本校正器
Args:
base_threshold: 基础相似度阈值,默认0.65
base_window: 基础搜索窗口大小,默认30
"""
self.base_threshold = base_threshold or self.DEFAULT_BASE_THRESHOLD
self.base_window = base_window or self.DEFAULT_BASE_WINDOW
self.extended_window = self.DEFAULT_EXTENDED_WINDOW
def clean_text(self, text: str) -> str:
"""清理文本用于最终输出,移除换行符和全角空格,保留引号。"""
text = re.sub(r'[\n\r\u3000]', '', text)
text = re.sub(r'\s+', ' ', text)
return text.strip()
def clean_for_compare(self, text: str) -> str:
"""清理文本用于相似度比较,移除换行符、全角空格和引号。"""
text = re.sub(r'[\n\r\u3000]', '', text)
text = re.sub(r'["""「」『』]', '', text) # 仅在比较时移除引号
text = re.sub(r'\s+', ' ', text)
return text.strip()
def get_adaptive_threshold(self, sentence: str) -> float:
"""根据句子长度自适应调整相似度阈值。
短句子容易误匹配,需要更高阈值;长句子可以适当降低阈值。
"""
length = len(self.clean_for_compare(sentence))
if length <= 5:
# 非常短的句子,需要很高的阈值防止误匹配
return max(self.base_threshold, 0.85)
elif length <= 10:
# 短句子
return max(self.base_threshold, 0.75)
elif length <= 20:
# 中等长度
return self.base_threshold
else:
# 长句子可以适当降低阈值
return max(self.base_threshold - 0.05, 0.55)
def _looks_like_abbreviation(self, sentence_with_dot: str) -> bool:
"""
判断当前这一个 '.' 更像是缩写的一部分,而不是句子结束。
sentence_with_dot: 当前已经累积的句子(包含这个点)
"""
s = sentence_with_dot.rstrip()
# 找到以 . 结尾的最后一个 token(字母/数字/点)
m = re.search(r'([A-Za-z0-9\.]+)\.$', s)
if not m:
return False
token = m.group(1) # 不包含最后这个点,但可能包含内部的 .
# 1) 小写长度很短的缩写,例如 Mr. Dr. etc.
# 这里简单认为:1~4 个字母,首字母大写
if re.fullmatch(r'[A-Za-z]{1,4}', token) and token[0].isupper():
return True
# 2) 多点缩写:U.S.A / F.B.I 这种(至少 3 个字母、2 个点)
# U.S.A -> token 为 'U.S.A'
if re.fullmatch(r'[A-Za-z](?:\.[A-Za-z]){2,}', token):
return True
# 3) 你如果有特殊缩写,可以在这里硬编码
# if token in {"etc", "e.g", "i.e"}:
# return True
return False
def split_sentences(self, text: str) -> List[str]:
"""按照标点符号进行细粒度分句,同时尽量保护英文缩写和数字。
保留换行作为候选分句符。如果产生了“只有标点/引号”的句子,则直接丢弃。
"""
# 规范化换行:把 \r\n 和 \r 统一为 \n
text = text.replace('\r\n', '\n').replace('\r', '\n')
# 替换全角空格为普通空格,保留换行
text = text.replace('\u3000', ' ').strip()
# 分割:中文标点、特殊点号、或换行
sentences = re.split(r'([。!?!?:;]|(? Tuple[Optional[int], float]:
"""在原文句子列表中找到与AI句子最匹配的单个句子。
Args:
ai_sentence: AI生成的句子
original_sentences: 原文句子列表
start_index: 搜索起始位置
use_extended: 是否使用扩展搜索窗口
Returns:
(匹配索引, 相似度) 或 (None, 最高相似度)
"""
# 预处理 - 使用专门的比较清理方法
processed_ai_sentence = self.clean_for_compare(ai_sentence)
if not processed_ai_sentence:
return None, 0
# 根据句子长度获取自适应阈值
threshold = self.get_adaptive_threshold(ai_sentence)
# 选择搜索窗口大小
search_window = self.extended_window if use_extended else self.base_window
best_match_index = None
best_similarity = 0
# 向前搜索
end_index = min(start_index + search_window, len(original_sentences))
for i in range(start_index, end_index):
original_sentence = original_sentences[i]
processed_original_sentence = self.clean_for_compare(original_sentence)
if not processed_original_sentence:
continue
matcher = difflib.SequenceMatcher(None, processed_ai_sentence, processed_original_sentence)
similarity = matcher.ratio()
if similarity > best_similarity:
best_similarity = similarity
best_match_index = i
# 如果没找到匹配且未使用扩展窗口,尝试向后搜索(处理乱序情况)
if best_similarity < threshold and not use_extended and start_index > 0:
backward_start = max(0, start_index - 10)
for i in range(backward_start, start_index):
original_sentence = original_sentences[i]
processed_original_sentence = self.clean_for_compare(original_sentence)
if not processed_original_sentence:
continue
matcher = difflib.SequenceMatcher(None, processed_ai_sentence, processed_original_sentence)
similarity = matcher.ratio()
if similarity > best_similarity:
best_similarity = similarity
best_match_index = i
if best_similarity < threshold:
return None, best_similarity
return best_match_index, best_similarity
def correct_ai_text(self, original_text: str, ai_data: List[Dict]) -> List[Dict]:
"""使用分句匹配 + difflib 的方式校正AI文本。
改进的算法:
1. 记录每个校正后item对应的原文句子索引范围
2. 基于实际索引位置插入遗漏句子
3. 支持自适应搜索窗口
"""
original_sentences = self.split_sentences(original_text)
# 存储校正后的数据,以及每个item对应的原文索引范围
corrected_data = [] # List of (item_dict, matched_indices_list)
used_original_indices = set()
current_original_index = 0
for ai_item in ai_data:
ai_text = ai_item.get('text_content', '')
ai_sentences = self.split_sentences(ai_text)
corrected_sentences_for_item = []
matched_indices_for_item = [] # 记录这个item匹配到的所有原文索引
logging.info("处理角色: %s (AI原文: '%s')", ai_item.get('role_name', '未知'), ai_text[:50] if ai_text else '')
for ai_sentence in ai_sentences:
# 首先尝试基础窗口搜索
match_index, similarity = self.find_best_sentence_match(
ai_sentence, original_sentences, current_original_index, use_extended=False
)
# 如果基础窗口没找到,尝试扩展窗口
if match_index is None:
match_index, similarity = self.find_best_sentence_match(
ai_sentence, original_sentences, current_original_index, use_extended=True
)
if match_index is not None:
original_match = original_sentences[match_index]
corrected_sentences_for_item.append(original_match)
matched_indices_for_item.append(match_index)
used_original_indices.add(match_index)
current_original_index = match_index + 1
logging.info("匹配成功 (相似度: %.2f): AI='%s' -> 原文='%s'", similarity, ai_sentence, original_match)
else:
corrected_sentences_for_item.append(ai_sentence)
logging.warning("匹配失败 (最高相似度: %.2f),保留AI原句: '%s'", similarity, ai_sentence)
# 最终清理 - 保留原始格式(包括引号)
corrected_text = self.clean_text(" ".join(corrected_sentences_for_item))
if corrected_text:
corrected_item = ai_item.copy()
corrected_item['text_content'] = corrected_text
corrected_data.append((corrected_item, matched_indices_for_item))
# 处理遗漏的原文句子 - 改进的插入逻辑
missing_indices = set(range(len(original_sentences))) - used_original_indices
if not missing_indices:
# 没有遗漏,直接返回校正数据
return [item for item, _ in corrected_data]
logging.info("发现 %d 个遗漏句子,正在插入...", len(missing_indices))
# 构建原文索引到校正item的映射
# index_to_item_map: {原文索引: (corrected_item, item在corrected_data中的位置)}
index_to_item = {}
for item_idx, (item, matched_indices) in enumerate(corrected_data):
for orig_idx in matched_indices:
index_to_item[orig_idx] = (item, item_idx)
# 按原文顺序构建最终结果
final_data = []
inserted_items = set() # 记录已插入的item索引,避免重复插入
for orig_idx in range(len(original_sentences)):
if orig_idx in missing_indices:
# 插入遗漏的句子
missing_sentence = self.clean_text(original_sentences[orig_idx])
if missing_sentence:
logging.info("插入遗漏句子 (位置%d): '%s'", orig_idx, missing_sentence)
final_data.append({
'role_name': '旁白',
'text_content': missing_sentence,
'emotion_name': '',
'strength_name': ''
})
elif orig_idx in index_to_item:
item, item_idx = index_to_item[orig_idx]
# 只在第一次遇到这个item的匹配索引时插入
if item_idx not in inserted_items:
final_data.append(item)
inserted_items.add(item_idx)
# 处理可能没有匹配到任何原文索引但仍需要保留的item(纯AI生成内容)
for item_idx, (item, matched_indices) in enumerate(corrected_data):
if item_idx not in inserted_items:
final_data.append(item)
logging.warning("Item未匹配到原文,追加到末尾: %s", item.get('text_content', '')[:30])
return final_data
def read_files():
"""读取原文和AI输出文件"""
try:
with open('原文3.txt', 'r', encoding='utf-8') as f:
original_text = f.read()
with open('AI输出的包含错误的文本3.json', 'r', encoding='utf-8') as f:
ai_data = json.load(f)
return original_text, ai_data
except FileNotFoundError as e:
logging.error("文件读取错误: %s", e)
return None, None
except json.JSONDecodeError as e:
logging.error("JSON解析错误: %s", e)
return None, None
def save_corrected_data(corrected_data: List[Dict]):
"""保存校正后的数据"""
try:
with open('校正后的文本_final.json', 'w', encoding='utf-8') as f:
json.dump(corrected_data, f, ensure_ascii=False, indent=4)
logging.info("校正结果已保存到: 校正后的文本_final.json")
except Exception as e:
logging.error("保存文件时出错: %s", e)
def main():
original_text, ai_data = read_files()
if original_text is None or ai_data is None:
return
logging.info("文件读取成功!开始校正...")
corrector = TextCorrectorFinal()
corrected_data = corrector.correct_ai_text(original_text, ai_data)
save_corrected_data(corrected_data)
logging.info("校正完成!")
if __name__ == "__main__":
main()
================================================
FILE: SonicVale/app/core/tts_engine.py
================================================
import requests
from typing import Optional, List
import os
import logging
class TTSEngine:
def __init__(self, base_url: str):
"""
初始化 TTS 引擎
:param base_url: TTS 服务的基础 URL,如 http://127.0.0.1:8000
"""
self.base_url = base_url.rstrip("/")
def synthesize(
self,
text: str,
filename: str,
emo_text: Optional[str] = None,
emo_vector: Optional[List[float]] = None,
save_path: Optional[str] = None
) -> bytes:
"""
调用 /v2/synthesize 接口进行语音合成
:param text: 要合成的文本
:param filename: 参考音频文件名(服务端已存在)
:param emo_text: 情绪文本(可选)
:param emo_vector: 8维情绪向量(可选,优先级高于 emo_text)
:param save_path: 如果指定,将保存生成的音频文件到本地
:return: 音频二进制数据
"""
url = f"{self.base_url}/v2/synthesize"
payload = {"text": text, "audio_path": filename}
if emo_vector is not None:
payload["emo_vector"] = emo_vector
elif emo_text:
payload["emo_text"] = emo_text
try:
resp = requests.post(url, json=payload, timeout=120)
if resp.status_code != 200:
# 尝试解析错误信息
try:
error_data = resp.json()
error_msg = error_data.get('detail') or error_data.get('message') or error_data.get('msg') or resp.text
except:
error_msg = resp.text
raise Exception(f"TTS服务返回错误({resp.status_code}): {error_msg}")
audio_bytes = resp.content
# 检查返回的内容是否为有效音频
if len(audio_bytes) < 100:
raise Exception(f"TTS服务返回的音频数据无效,大小: {len(audio_bytes)} 字节")
if save_path:
with open(save_path, "wb") as f:
f.write(audio_bytes)
return audio_bytes
except requests.exceptions.ConnectionError:
raise Exception(f"TTS服务连接失败,请检查TTS服务是否已启动 ({self.base_url})")
except requests.exceptions.Timeout:
raise Exception(f"TTS服务请求超时,请检查TTS服务是否正常运行")
except requests.exceptions.RequestException as e:
raise Exception(f"TTS服务请求异常: {str(e)}")
def get_models(self) -> dict:
"""
调用 /v1/models 获取模型列表
:return: 模型信息
"""
url = f"{self.base_url}/v1/models"
resp = requests.get(url)
resp.raise_for_status()
return resp.json()
def check_audio_exists(self, filename: str) -> bool:
"""
调用 /v1/check/audio 检查参考音频是否存在
:param filename: 原始文件名
:return: True or False
"""
url = f"{self.base_url}/v1/check/audio"
params = {"file_name": filename}
resp = requests.get(url, params=params)
resp.raise_for_status()
return resp.json().get("exists", False)
def upload_audio(self, file_path: str,full_path=None) -> dict:
"""
调用 /v1/upload_audio 上传音频
:param file_path: 本地音频文件路径
:param full_path: 用于唯一标识的全路径(可选,如果不传则使用 file_path)
:return: 服务端响应 JSON
"""
if not os.path.isfile(file_path):
return {"code": 400, "msg": f"文件不存在: {file_path}"}
url = f"{self.base_url}/v1/upload_audio"
try:
with open(file_path, "rb") as f:
files = {
"audio": (os.path.basename(file_path), f, "audio/wav")
}
# 如果需要额外传 fullpath 参数
data = {}
if full_path:
data["full_path"] = full_path
resp = requests.post(url, files=files, data=data, timeout=30)
resp.raise_for_status()
return resp.json()
except requests.exceptions.RequestException as e:
return {"code": 500, "msg": f"请求失败: {str(e)}"}
except Exception as e:
return {"code": 500, "msg": f"上传异常: {str(e)}"}
if __name__ == "__main__":
# 示例使用
engine = TTSEngine("https://eihh5fmon4-8200.cnb.run/")
# 1. 上传音频
upload_res = engine.upload_audio("C:\\Users\\lxc18\\Music\\多情绪\\吴泽\\解说\\中等.wav",full_path="C:\\Users\\lxc18\\Music\\多情绪\\吴泽\\解说\\中等.wav")
# print("上传结果:", upload_res)
# 2. 检查音频是否存在
exists = engine.check_audio_exists("C:\\Users\\lxc18\\Music\\多情绪\\吴泽\\解说\\中等.wav")
logging.info("音频存在: %s", exists)
# 3. 获取模型列表
models = engine.get_models()
logging.info("模型信息: %s", models)
# 4. 合成语音
if exists:
audio = engine.synthesize("萧炎,斗之力,三段!级别:低级!", "C:\\Users\\lxc18\\Music\\多情绪\\吴泽\\解说\\中等.wav",emo_text="愤怒", save_path="output.wav")
logging.info("语音已保存到 output.wav, 大小 %s 字节", len(audio))
================================================
FILE: SonicVale/app/core/tts_runtime.py
================================================
# app/tts_worker.py
import asyncio
from fastapi import FastAPI
from markdown_it.rules_block import reference
from app.core.ws_manager import manager
from app.db.database import SessionLocal
from app.routers.chapter_router import get_voice_service, get_emotion_service, get_strength_service
from app.routers.multi_emotion_voice_router import get_multi_emotion_voice_service
from app.routers.role_router import get_line_service, get_role_service, get_project_service
TTS_TIMEOUT_SECONDS = 1200 # 可调
def emotion_text_to_vector(emotion: str, intensity: str) -> list[float]:
"""
将情绪(文本) + 强度(文本) 转换成 8维向量
8维分别对应: [高兴, 生气, 伤心, 害怕, 厌恶, 低落, 惊喜, 平静]
基础情绪为 one-hot,复合情绪为多维加权混合
:param emotion: 情绪名称
:param intensity: "微弱" / "稍弱" / "中等" / "较强" / "强烈"
:return: 长度为8的向量
"""
# 8维基础情绪索引: 高兴=0, 生气=1, 伤心=2, 害怕=3, 厌恶=4, 低落=5, 惊喜=6, 平静=7
BASE_EMOTIONS = ["高兴", "生气", "伤心", "害怕", "厌恶", "低落", "惊喜", "平静"]
# 复合情绪 → 基础情绪权重(各维度满强度,由 intensity 统一缩放)
COMPOSITE_MAP = {
"嘲讽": {"高兴": 0.5, "厌恶": 1.0}, # 讽刺语气
"悲愤": {"伤心": 1.0, "生气": 1.0}, # 悲愤交加
}
INTENSITY_MAP = {
"微弱": 0.2,
"稍弱": 0.4,
"中等": 0.6,
"较强": 0.8,
"强烈": 1.0
}
scale = INTENSITY_MAP.get(intensity, 0.5)
vec = [0.0] * 8
if emotion in BASE_EMOTIONS:
# 基础情绪: one-hot
vec[BASE_EMOTIONS.index(emotion)] = scale
elif emotion in COMPOSITE_MAP:
# 复合情绪: 多维加权混合
for base_name, weight in COMPOSITE_MAP[emotion].items():
vec[BASE_EMOTIONS.index(base_name)] = round(scale * weight, 4)
# 未知情绪返回全零向量(静默降级)
return vec
async def tts_worker(app: FastAPI):
q = app.state.tts_queue
ex = app.state.tts_executor
while True:
project_id, dto = await q.get()
db = SessionLocal()
try:
line_service = get_line_service(db)
role_service = get_role_service(db)
voice_service = get_voice_service(db)
multi_emotion_service = get_multi_emotion_voice_service(db)
project_service = get_project_service(db)
emotion_service = get_emotion_service(db)
strength_service = get_strength_service(db)
# line_service.update_line(dto.id, {"status": "processing"})
await manager.broadcast({
"event": "line_update",
"line_id": dto.id,
"status": "processing",
"progress": q.qsize() + 1, # +1 包含当前正在处理的任务
"meta": f"角色 {dto.role_id} 开始生成"
})
role = role_service.get_role(dto.role_id)
voice = voice_service.get_voice(role.default_voice_id)
reference_path = voice.reference_path
# if voice.is_multi_emotion == 1:
# # 使用多音色
# multi_emotion = multi_emotion_service.get_multi_emotion_voice_by_voice_id_emotion_id_strength_id(voice.id, dto.emotion_id, dto.strength_id)
# if multi_emotion is not None:
# reference_path = multi_emotion.reference_path
# 9.13
emotion = emotion_service.get_emotion(dto.emotion_id)
strength = strength_service.get_strength(dto.strength_id)
# 拼接
# emo_text = f"{strength.name}的{emotion.name} "
# if emotion.name is "解说":
# emo_text = None
emo_text = None
emo_vector = emotion_text_to_vector(emotion.name, strength.name)
project = project_service.get_project(project_id)
loop = asyncio.get_running_loop()
await asyncio.wait_for(
loop.run_in_executor(
ex,
line_service.generate_audio,
reference_path,
project.tts_provider_id,
dto.text_content,
emo_text,
emo_vector,
dto.audio_path
),
timeout=TTS_TIMEOUT_SECONDS
)
line_service.update_line(dto.id, {"status": "done"})
await manager.broadcast({
"event": "line_update",
"line_id": dto.id,
"status": "done",
"progress": q.qsize(),
"meta": "生成完成",
"audio_path": dto.audio_path
})
# 发送给前端,队列中剩余的数量
await manager.broadcast({
"event": "tts_queue_rest",
"queue_rest": q.qsize(),
"project_id": project_id
})
except Exception as e:
try:
line_service.update_line(dto.id, {"status": "failed"})
except Exception:
pass
await manager.broadcast({
"event": "line_update",
"line_id": dto.id,
"status": "failed",
"progress": q.qsize(),
"meta": f"失败: {e}"
})
finally:
db.close()
q.task_done()
================================================
FILE: SonicVale/app/core/ws_manager.py
================================================
# ws_manager.py
from fastapi import WebSocket
from typing import List
class WSManager:
def __init__(self):
self.conns: List[WebSocket] = []
async def connect(self, ws: WebSocket):
await ws.accept()
self.conns.append(ws)
def disconnect(self, ws: WebSocket):
if ws in self.conns:
self.conns.remove(ws)
async def broadcast(self, data: dict):
dead = []
for ws in self.conns:
try:
await ws.send_json(data)
except:
dead.append(ws)
for d in dead:
self.disconnect(d)
manager = WSManager()
================================================
FILE: SonicVale/app/db/database.py
================================================
from typing import Any, Generator
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, declarative_base, Session
from app.core.config import *
config_path = getConfigPath()
# SQLite 数据库文件,存储在用户目录下的 SonicVale
SQLALCHEMY_DATABASE_URL = f"sqlite:///{os.path.join(config_path, 'app_test.db')}"
# echo=True 会打印执行的 SQL 语句,调试用
engine = create_engine(
SQLALCHEMY_DATABASE_URL, connect_args={"check_same_thread": False}, echo=False
)
# SessionLocal 用于依赖注入
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
# Base 类,所有 ORM 模型继承它
Base = declarative_base()
# 依赖函数
def get_db() -> Generator[Session, Any, None]:
db = SessionLocal()
try:
yield db
finally:
db.close()
================================================
FILE: SonicVale/app/dto/chapter_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class ChapterCreateDTO(BaseModel):
title: str
project_id: int
order_index: Optional[int] = None
id: Optional[int] = None
text_content : Optional[str] = None
class ChapterResponseDTO(BaseModel):
title: str
project_id: int
order_index: Optional[int] = None
id: Optional[int] = None
text_content: Optional[str] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/emotion_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class EmotionCreateDTO(BaseModel):
name: str
id: Optional[int] = None
description: Optional[str] = None
is_active: Optional[int] = 1
class EmotionResponseDTO(BaseModel):
name: str
id: Optional[int] = None
description: Optional[str] = None
is_active: Optional[int] = 1
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/line_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class LineInitDTO(BaseModel):
role_name: Optional[str] = None
text_content: str
emotion_name: Optional[str] = None
strength_name: Optional[str] = None
class LineOrderDTO(BaseModel):
id: int
line_order: int
class LineAudioProcessDTO(BaseModel):
# 默认是1
speed: Optional[float] = 1.0
# 默认是1
volume: Optional[float] = 1.0
start_ms: Optional[int] = None
end_ms: Optional[int] = None
# 静止时间
silence_sec: Optional[float] = 0.0
current_ms: Optional[int] = None
class LineCreateDTO(BaseModel):
chapter_id: int
role_id:Optional[int] = None
voice_id : Optional[int] = None
line_order: Optional[int] = None
id: Optional[int] = None
text_content: Optional[str] = None
emotion_id: Optional[int] = None
strength_id: Optional[int] = None
audio_path : Optional[str] = None
status : Optional[str] = None
is_done : Optional[int] = 0
subtitle_path : Optional[str] = None
class LineResponseDTO(BaseModel):
chapter_id: int
role_id:Optional[int] = None
voice_id : Optional[int] = None
line_order: Optional[int] = None
id: Optional[int] = None
text_content: Optional[str] = None
emotion_id: Optional[int] = None
strength_id: Optional[int] = None
audio_path : Optional[str] = None
status : Optional[str] = None
is_done: Optional[int] = 0
subtitle_path : Optional[str] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/llm_provider_dto.py
================================================
from dataclasses import Field
from datetime import datetime
from pydantic import BaseModel
from typing import Optional, Dict, Any
from pydantic import BaseModel, Field as PydField
class LLMProviderCreateDTO(BaseModel):
"""业务实体:LLM"""
name: str
id: Optional[int] = None
api_base_url : Optional[str] = None
api_key: Optional[str] = None
model_list: Optional[str] = None
status : Optional[int] = None
# ✅ 默认自定义参数
custom_params: Optional[str] = None
class LLMProviderResponseDTO(BaseModel):
"""业务实体:LLM"""
name: str
id: Optional[int] = None
api_base_url : Optional[str] = None
api_key: Optional[str] = None
model_list: Optional[str] = None
status : Optional[int] = None
updated_at: Optional[datetime] = None
created_at: Optional[datetime] = None
# ✅ 默认自定义参数
custom_params: Optional[str] = None
================================================
FILE: SonicVale/app/dto/multi_emotion_voice_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class MultiEmotionVoiceCreateDTO(BaseModel):
emotion_id: int
voice_id: int
strength_id: int
id: Optional[int] = None
reference_path: Optional[str] = None
class MultiEmotionVoiceResponseDTO(BaseModel):
emotion_id: int
voice_id: int
strength_id: int
id: Optional[int] = None
reference_path: Optional[str] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/project_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class ProjectCreateDTO(BaseModel):
name: str
description: Optional[str] = None
llm_provider_id: Optional[int] = None
llm_model: Optional[str] = None
tts_provider_id: Optional[int] = None
prompt_id: Optional[int] = None
# 精准填充
is_precise_fill: Optional[int] = None
# 项目路径
project_root_path : Optional[str] = None
class ProjectResponseDTO(BaseModel):
id: int
name: str
description: Optional[str] = None
llm_provider_id: Optional[int] = None
llm_model: Optional[str] = None
tts_provider_id: Optional[int] = None
prompt_id: Optional[int] = None
# 精准填充
is_precise_fill : Optional[int] = None
# 项目路径
project_root_path : Optional[str] = None
created_at: datetime
updated_at: datetime
class ProjectImportDTO(BaseModel):
id : int
content: str
================================================
FILE: SonicVale/app/dto/prompt_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class PromptCreateDTO(BaseModel):
"""业务实体:提示词"""
name: str
task: str
description: Optional[str] = None
content: Optional[str] = None
id: Optional[int] = None
class PromptResponseDTO(BaseModel):
"""业务实体:提示词"""
name: str
task: str
description: Optional[str] = None
content: Optional[str] = None
id: Optional[int] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/role_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class RoleCreateDTO(BaseModel):
name: str
project_id: int
id: Optional[int] = None
default_voice_id: Optional[int] = None
class RoleResponseDTO(BaseModel):
name: str
project_id: int
id: Optional[int] = None
default_voice_id: Optional[int] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/strength_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class StrengthCreateDTO(BaseModel):
name: str
id: Optional[int] = None
description: Optional[str] = None
is_active: Optional[int] = 1
class StrengthResponseDTO(BaseModel):
name: str
id: Optional[int] = None
description: Optional[str] = None
is_active: Optional[int] = 1
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/tts_provider_dto.py
================================================
from datetime import datetime
from pydantic import BaseModel
from typing import Optional
class TTSProviderCreateDTO(BaseModel):
name: Optional[str] = None
id: Optional[int] = None
api_base_url: Optional[str] = None
api_key: Optional[str] = None
status: Optional[int] = None
class TTSProviderResponseDTO(BaseModel):
"""业务实体:tts_provider"""
name: str
id: Optional[int] = None
api_base_url : Optional[str] = None
api_key: Optional[str] = None
status : Optional[int] = None
updated_at: Optional[datetime] = None
created_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/dto/voice_dto.py
================================================
from datetime import datetime
from typing import Optional, List
from pydantic import BaseModel, Field, AliasChoices
class VoiceCreateDTO(BaseModel):
name: str
tts_provider_id: int
id: Optional[int] = None
reference_path: Optional[str] = None
description: Optional[str] = None
is_multi_emotion: Optional[int] = 0
class VoiceResponseDTO(BaseModel):
name: str
tts_provider_id: int
id: Optional[int] = None
reference_path : Optional[str] = None
description : Optional[str] = None
is_multi_emotion : Optional[int] = 0
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
class VoiceExportDTO(BaseModel):
"""导出音色库请求DTO"""
tts_provider_id: int
export_path: str
ids: Optional[List[int]] = Field(default=None, validation_alias=AliasChoices("ids", "voice_ids"))
class VoiceImportDTO(BaseModel):
"""导入音色库请求DTO"""
tts_provider_id: int
zip_path: str
target_dir: str
class VoiceImportResultDTO(BaseModel):
"""导入音色库结果DTO"""
success_count: int
skipped_count: int
skipped_names: List[str]
class VoiceAudioProcessDTO(BaseModel):
"""音色参考音频处理DTO"""
audio_path: str
speed: Optional[float] = 1.0
volume: Optional[float] = 1.0
start_ms: Optional[int] = None
end_ms: Optional[int] = None
silence_sec: Optional[float] = 0.0
current_ms: Optional[int] = None
class VoiceCopyDTO(BaseModel):
"""复制音色请求DTO"""
source_voice_id: int
new_name: str
target_dir: Optional[str] = None # 为空则使用原音色同目录
================================================
FILE: SonicVale/app/entity/chapter_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class ChapterEntity:
"""业务实体:章节"""
title: str
project_id: int
order_index: Optional[int] = None
id: Optional[int] = None
text_content : Optional[str] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/emotion_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class EmotionEntity:
"""业务实体:情绪枚举"""
name: str
id: Optional[int] = None
description: Optional[str] = None
is_active: Optional[int] = 1
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/line_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class LineEntity:
"""业务实体:台词"""
chapter_id: int
id: Optional[int] = None
role_id : Optional[ int] = None
voice_id : Optional[int] = None
line_order : Optional[int] = None
text_content : Optional[str] = None
emotion_id : Optional[int] = None
strength_id : Optional[int] = None
audio_path : Optional[str] = None
subtitle_path : Optional[str] = None
status : Optional[str] = None
# 是否完成
is_done : Optional[int] = 0
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/llm_provider_entity.py
================================================
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, Dict, Any
@dataclass
class LLMProviderEntity:
"""业务实体:LLM"""
name: str
id: Optional[int] = None
api_base_url : Optional[str] = None
api_key: Optional[str] = None
model_list : Optional[str] = None
status : Optional[int] = None
updated_at: Optional[datetime] = None
created_at: Optional[datetime] = None
# ✅ 自定义参数字段(默认值与数据库一致)
custom_params: Optional[str] = None
================================================
FILE: SonicVale/app/entity/multi_emotion_voice_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
# class MultiEmotionVoicePO(Base):
# __tablename__ = "multi_emotion"
# id = Column(Integer, primary_key=True, autoincrement=True, index=True)
# emotion_id = Column(Integer, nullable=False)
# voice_id = Column(Integer, nullable=False)
# strength_id = Column(Integer, nullable=True)
# reference_path = Column(String(255), nullable=True)
# created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
# updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),
# nullable=False)
@dataclass
class MultiEmotionVoiceEntity:
"""业务实体:多情感音色"""
emotion_id: int
voice_id: int
strength_id: int
id: Optional[int] = None
reference_path: Optional[str] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/project_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class ProjectEntity:
"""业务实体:项目"""
name: str
id: Optional[int] = None
description: Optional[str] = None
llm_provider_id: Optional[int] = None
llm_model: Optional[str] = None
tts_provider_id: Optional[int] = None
prompt_id: Optional[int] = None # 提示词
# 精准填充
is_precise_fill: Optional[int] = None
# 项目保存地址
project_root_path: Optional[str] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/prompt_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
# class PromptPO(Base):
# __tablename__ = "prompt"
# id = Column(Integer, primary_key=True, index=True, autoincrement=True)
# name = Column(String(255), nullable=False)
# description = Column(Text, nullable=True)
# content = Column(Text, nullable=True)
# created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
# updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),nullable=False)
@dataclass
class PromptEntity:
"""业务实体:提示词"""
name: str
task: str
description: Optional[str] = None
content: Optional[str] = None
id: Optional[int] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/role_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class RoleEntity:
"""业务实体:角色"""
name: str
project_id: int
id: Optional[int] = None
default_voice_id : Optional[int] = None
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/strength_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class StrengthEntity:
"""业务实体:情绪强弱枚举"""
name: str
id: Optional[int] = None
description: Optional[str] = None
is_active: Optional[int] = 1
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/tts_provider_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class TTSProviderEntity:
"""业务实体:TTS"""
name: str
id: Optional[int] = None
api_base_url : Optional[str] = None
api_key: Optional[str] = None
status : Optional[int] = None
updated_at: Optional[datetime] = None
created_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/entity/voice_entity.py
================================================
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
@dataclass
class VoiceEntity:
"""业务实体:音色"""
name: str
tts_provider_id: int
id: Optional[int] = None
reference_path : Optional[str] = None
description : Optional[str] = None
is_multi_emotion : Optional[int] = 0
created_at: Optional[datetime] = None
updated_at: Optional[datetime] = None
================================================
FILE: SonicVale/app/main.py
================================================
# app/main.py
import asyncio
import logging
from concurrent.futures import ThreadPoolExecutor
import uvicorn
from fastapi import FastAPI, Depends
from sqlalchemy.orm import Session
from starlette.middleware.cors import CORSMiddleware
from app.core.config import getConfigPath
from app.core.prompts import get_prompt_str
from app.core.tts_runtime import tts_worker
from app.core.ws_manager import manager
from app.db.database import Base, engine, SessionLocal, get_db
from app.entity.emotion_entity import EmotionEntity
from app.entity.strength_entity import StrengthEntity
from app.models.po import *
from app.repositories.llm_provider_repository import LLMProviderRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.routers import project_router, chapter_router, role_router, voice_router, llm_provider_router, \
tts_provider_router, line_router, emotion_router, strength_router, multi_emotion_voice_router, prompt_router
from app.routers.chapter_router import get_strength_service, get_prompt_service, get_project_service
from app.routers.emotion_router import get_emotion_service
from app.routers.llm_provider_router import get_llm_service
from app.services.llm_provider_service import LLMProviderService
from app.services.tts_provider_service import TTSProviderService
import os
import sys
root_path = os.getcwd()
sys.path.append(root_path)
# =========================
# 日志配置(同时输出到控制台和文件)
# =========================
log_file_path = os.path.join(getConfigPath(), "app.log")
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s',
handlers=[
logging.StreamHandler(), # 控制台输出
logging.FileHandler(log_file_path, encoding='utf-8') # 文件输出
]
)
logging.info(f"日志文件路径: {log_file_path}")
# =========================
# FastAPI 实例
# =========================
app = FastAPI(
title="音墟 (YinXu) - AI多角色小说配音",
description="桌面端小说多角色配音系统,支持 TTS、GPT 提取角色、台词管理及字幕生成",
version="1.0.0",
)
# 跨域
# 允许的前端地址
origins = [
"http://localhost:5173", # Vue 开发服务器
"http://127.0.0.1:5173" # 有些浏览器可能会用这个
]
app.add_middleware(
CORSMiddleware,
allow_origins=origins, # 允许的源
allow_credentials=True,
allow_methods=["*"], # 允许所有方法(GET, POST, DELETE...)
allow_headers=["*"], # 允许所有请求头
)
# =========================
# 数据库初始化(创建表)
# =========================
# 启动时创建表
# @app.on_event("startup")
# def startup():
# Base.metadata.create_all(bind=engine)
WORKERS = 1
QUEUE_CAPACITY = 0
from sqlalchemy import text
def add_prompt_id_column():
with engine.connect() as conn:
# 检查 project 表是否已有 prompt_id
result = conn.execute(text("PRAGMA table_info(projects)"))
columns = [row[1] for row in result.fetchall()]
if "prompt_id" not in columns:
conn.execute(text("ALTER TABLE projects ADD COLUMN prompt_id INTEGER"))
conn.commit()
# 添加line表中is_done字段
def add_is_done_column():
with engine.connect() as conn:
result = conn.execute(text("PRAGMA table_info(lines)"))
columns = [row[1] for row in result.fetchall()]
if "is_done" not in columns:
# ✅ 添加列并设置默认值 0
conn.execute(text("ALTER TABLE lines ADD COLUMN is_done INTEGER DEFAULT 0"))
conn.commit()
# 添加LLM自定义参数字段
def add_custom_params_column():
with engine.begin() as conn: # ✅ 用 begin() 自动提交事务
result = conn.execute(text("PRAGMA table_info(llm_provider)"))
columns = [row[1] for row in result.fetchall()]
if "custom_params" not in columns:
# ✅ 添加列
conn.execute(text("ALTER TABLE llm_provider ADD COLUMN custom_params TEXT"))
# ✅ 可选:为历史数据填入默认 JSON(推荐)
import json
default_json = json.dumps({
"response_format": {"type": "json_object"},
"temperature": 0.7,
"top_p": 0.9
}, ensure_ascii=False)
conn.execute(
text("UPDATE llm_provider SET custom_params = :val"),
{"val": default_json}
)
logging.info("已添加 custom_params 列并写入默认值。")
else:
logging.info("custom_params 列已存在,跳过。")
# 添加精准填充字段】
def add_is_precise_fill_column():
with engine.begin() as conn: # ✅ 用 begin() 自动提交事务
result = conn.execute(text("PRAGMA table_info(projects)"))
columns = [row[1] for row in result.fetchall()]
if "is_precise_fill" not in columns:
# ✅ 添加列
conn.execute(text("ALTER TABLE projects ADD COLUMN is_precise_fill INTEGER DEFAULT 0"))
conn.commit()
# 添加项目保存路径字段(project_path)
def add_project_root_path_column():
with engine.begin() as conn: # ✅ 用 begin() 自动提交事务
result = conn.execute(text("PRAGMA table_info(projects)"))
columns = [row[1] for row in result.fetchall()]
if "project_root_path" not in columns:
# ✅ 添加列
conn.execute(text("ALTER TABLE projects ADD COLUMN project_root_path TEXT"))
conn.commit()
def get_tts_service(db: Session = Depends(get_db)) -> TTSProviderService:
return TTSProviderService(TTSProviderRepository(db))
@app.on_event("startup")
async def startup_event():
# 1) 建表
try:
Base.metadata.create_all(bind=engine)
except Exception as e:
logging.exception("❌ 数据库建表失败: %s", e)
# 更改数据库表字段
add_prompt_id_column()
# v1.0.6添加字段 is_done
add_is_done_column()
# v1.0.7 添加字段 custom_params
add_custom_params_column()
# v1.0.7 添加项目的字段 is_precise_fill
add_is_precise_fill_column()
# v1.0.7 添加项目的字段 project_root_path
add_project_root_path_column()
# 2) 初始化共享运行时
try:
app.state.tts_queue = asyncio.Queue(maxsize=QUEUE_CAPACITY)
app.state.tts_executor = ThreadPoolExecutor(max_workers=WORKERS)
except Exception as e:
logging.exception("❌ 初始化队列/线程池失败: %s", e)
# 3) 启动后台 worker
try:
app.state.tts_workers = [
asyncio.create_task(tts_worker(app)) for _ in range(WORKERS)
]
except Exception as e:
logging.exception("❌ 启动 worker 失败: %s", e)
# 4) 初始化默认数据
db = SessionLocal()
try:
try:
tts_service = get_tts_service(db)
tts_service.create_default_tts_provider()
except Exception as e:
logging.warning("⚠️ 默认 TTS provider 初始化失败: %s", e)
try:
emotion_service = get_emotion_service(db)
for name in [
# 8种基础情绪
"高兴", "生气", "伤心", "害怕", "厌恶", "低落", "惊喜", "平静",
# 2种独特复合情绪
"嘲讽", "悲愤",
]:
try:
emotion_service.create_emotion(EmotionEntity(name=name))
except Exception as e:
logging.debug("情绪 %s 已存在或创建失败: %s", name, e)
except Exception as e:
logging.warning("⚠️ 情绪初始化失败: %s", e)
try:
strength_service = get_strength_service(db)
for name in ["微弱","稍弱","中等","较强","强烈"]:
try:
strength_service.create_strength(StrengthEntity(name=name))
except Exception as e:
logging.debug("强度 %s 已存在或创建失败: %s", name, e)
except Exception as e:
logging.warning("⚠️ 强度初始化失败: %s", e)
# 创建默认提示词
try:
prompt_service = get_prompt_service(db)
if not prompt_service.get_all_prompts():
logging.info("创建默认提示词")
prompt_service.create_default_prompt()
else:
default_prompt = prompt_service.get_prompt_by_name("默认拆分台词提示词")
if not default_prompt:
prompt_service.create_default_prompt()
else:
#修改默认提示词
default_prompt_content = get_prompt_str()
default_prompt.content = default_prompt_content
prompt_service.update_prompt(default_prompt.id, default_prompt.__dict__)
except Exception as e:
logging.warning("⚠️ 默认提示词创建失败: %s", e)
# 兼容之前版本,已有的项目的project_root_path 为 getConfigPath()
try:
project_service = get_project_service(db)
for project in project_service.get_all_projects():
if not project.project_root_path:
project.project_root_path = getConfigPath()
project_service.update_project(project.id, project.__dict__)
logging.info("项目 %s 默认项目路径已修改为 %s", project.name, project.project_root_path)
# todo:修改所有的保存路径,然后前端请求添加保存路径(利用electron读取文件夹路径)
except Exception as e:
logging.warning("⚠️ 项目默认项目路径初始化失败: %s", e)
except Exception as e:
logging.exception("❌ 默认数据初始化异常: %s", e)
finally:
db.close()
@app.on_event("shutdown")
async def shutdown_event():
# 优雅退出
for t in getattr(app.state, "tts_workers", []):
t.cancel()
ex = getattr(app.state, "tts_executor", None)
if ex:
ex.shutdown(wait=False, cancel_futures=True)
# =========================
# 注册路由
# =========================
app.include_router(project_router.router)
app.include_router(chapter_router.router)
app.include_router(role_router.router)
app.include_router(voice_router.router)
app.include_router(llm_provider_router.router)
app.include_router(tts_provider_router.router)
app.include_router(line_router.router)
app.include_router(emotion_router.router)
app.include_router(strength_router.router)
app.include_router(multi_emotion_voice_router.router)
app.include_router(prompt_router.router)
# =========================
# 健康检查接口
# =========================
@app.get("/")
def read_root():
return {"msg": "音墟 (YinXu) 后端服务运行中!"}
# =========================
# 小测试接口:插入并查询 ProjectPO
# =========================
@app.get("/test-db")
def test_db():
session: Session = SessionLocal()
try:
# 使用时间戳生成唯一名称,避免 UNIQUE 冲突
name = f"测试项目_{int(datetime.now().timestamp())}"
test_project = ProjectPO(name=name, description="测试用项目")
session.add(test_project)
session.commit()
session.refresh(test_project)
return {
"msg": "插入成功",
"id": test_project.id,
"name": test_project.name,
"created_at": test_project.created_at,
"updated_at": test_project.updated_at
}
except Exception as e:
session.rollback()
return {"error": str(e)}
finally:
session.close()
import json
from fastapi import WebSocket, WebSocketDisconnect
@app.websocket("/ws")
async def ws_endpoint(ws: WebSocket):
await manager.connect(ws)
logging.info("WebSocket 客户端已连接")
try:
while True:
msg_text = await ws.receive_text()
try:
data = json.loads(msg_text)
except json.JSONDecodeError:
data = {}
# 👇 心跳处理:收到 ping 立即回复 pong
if data.get("type") == "ping":
logging.debug("receive ping")
await ws.send_text(json.dumps({"type": "pong"}))
continue
# 这里可以扩展处理订阅/其他消息
except WebSocketDisconnect:
logging.info("WebSocket 客户端主动断开")
manager.disconnect(ws)
except Exception as e:
logging.warning(f"WebSocket 连接异常: {e}")
manager.disconnect(ws)
if __name__ == "__main__":
# uvicorn.run(app, host="127.0.0.1", port=8200)
# 使用自定义 logger,避免 uvicorn 自动配置失败
# logging.basicConfig(level=logging.INFO)
uvicorn.run("app.main:app", host="127.0.0.1", port=8200, log_config=None)
================================================
FILE: SonicVale/app/models/po.py
================================================
from sqlalchemy import Column, Integer, Integer, String, Text, Enum, ForeignKey, DateTime, JSON, Index
from datetime import datetime, timezone
from app.db.database import Base
# ------------------------------
# 1. 项目表 projects
# ------------------------------
class ProjectPO(Base):
__tablename__ = "projects"
id = Column(Integer, primary_key=True, autoincrement=True,index=True)
name = Column(String(255), nullable=False, unique=True, index=True)
description = Column(Text, nullable=True)
llm_provider_id = Column(Integer, nullable=True) # LLM提供商
llm_model = Column(String(255), nullable=True) # 指定模型
tts_provider_id = Column(Integer, nullable=True) # TTS提供商
prompt_id = Column(Integer, nullable=True) # 关联的prompt
# 是否开启精准填充
is_precise_fill = Column(Integer, default=0, nullable=False)
# 项目根地址
project_root_path = Column(String(255), nullable=True)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc), nullable=False)
# ------------------------------
# 2. 项目的全局角色表 roles
# ------------------------------
class RolePO(Base):
__tablename__ = "roles"
id = Column(Integer, primary_key=True, autoincrement=True,index=True)
project_id = Column(Integer, nullable=False)
name = Column(String(100), nullable=False)
default_voice_id = Column(Integer, ForeignKey("voices.id"), nullable=True)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc), nullable=False)
# ------------------------------
# 3. 音色表 voices
# ------------------------------
class VoicePO(Base):
__tablename__ = "voices"
id = Column(Integer, primary_key=True, autoincrement=True, index=True)
tts_provider_id = Column(Integer, nullable=True)
name = Column(String(100), nullable=False)
reference_path = Column(String(255), nullable=True)
description = Column(Text, nullable=True)
# 是否包含多情绪
is_multi_emotion = Column(Integer, default=0, nullable=False)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),
nullable=False)
# 多情绪表
class MultiEmotionVoicePO(Base):
__tablename__ = "multi_emotion"
id = Column(Integer, primary_key=True, autoincrement=True, index=True)
voice_id = Column(Integer, nullable=False)
emotion_id = Column(Integer, nullable=False)
strength_id = Column(Integer, nullable=True)
reference_path = Column(String(255), nullable=True)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),
nullable=False)
# ------------------------------
# 4. 章节表 chapters
# ------------------------------
class ChapterPO(Base):
__tablename__ = "chapters"
id = Column(Integer, primary_key=True, autoincrement=True,index=True)
project_id = Column(Integer, nullable=False)
title = Column(String(255), nullable=False)
order_index = Column(Integer, nullable=True)
text_content = Column(Text, nullable=True) # SQLite 没有 LongText,用 Text 替代
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),
nullable=False)
# ------------------------------
# 5. 台词表 lines
# ------------------------------
# 情绪枚举表
class EmotionPO(Base):
__tablename__ = "emotions"
id = Column(Integer, primary_key=True, autoincrement=True, index=True)
name = Column(String(100), nullable=False)
description = Column(Text, nullable=True)
is_active = Column(Integer, default=1, nullable=False)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now())
# 情绪强弱枚举表
class StrengthPO(Base):
__tablename__ = "strengths"
id = Column(Integer, primary_key=True, autoincrement=True, index=True)
name = Column(String(100), nullable=False)
description = Column(Text, nullable=True)
is_active = Column(Integer, default=1, nullable=False)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now())
class LinePO(Base):
__tablename__ = "lines"
id = Column(Integer, primary_key=True, index=True, autoincrement=True)
# 外键
chapter_id = Column(Integer, nullable=False, index=True)
role_id = Column(Integer, nullable=True)
voice_id = Column(Integer, nullable=True)
# 核心信息
line_order = Column(Integer, nullable=True, index=True)
text_content = Column(Text, nullable=True)
# 情绪 和 强弱
emotion_id = Column(Integer, nullable=True)
strength_id = Column(Integer, nullable=True)
# 9.1 新增
# 输出资源
audio_path = Column(String(500), nullable=True)
subtitle_path = Column(String(500), nullable=True)
# 间隔停留时间(秒)
# wait_time = Column(Integer, default=0, nullable=True)
# 状态
status = Column(
Enum("pending", "processing", "done", "failed", name="line_status"),
default="pending",
nullable=False
)
# 是否完成
is_done = Column(Integer, default=0, nullable=False)
# 时间戳
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc), nullable=False)
__table_args__ = (
Index("idx_chapter_order", "chapter_id", "line_order"),
)
# -------------------------
# LLMProviderPO
# -------------------------
class LLMProviderPO(Base):
__tablename__ = "llm_provider"
id = Column(Integer, primary_key=True, index=True, autoincrement=True)
name = Column(String(255), nullable=False, unique=True) # 提供商名称
api_base_url = Column(String(500), nullable=False)
api_key = Column(String(500), nullable=True) # 可加密存储
model_list = Column(JSON, nullable=True) # 支持的模型列表
status = Column(Integer, default=1, nullable=False) # 启用/禁用
# ✅ 自定义参数(默认包含 response_format、temperature、top_p)
custom_params = Column(
Text,
nullable=False,
default=lambda: {
"response_format": {"type": "json_object"},
"temperature": 0.7,
"top_p": 0.9
}
)
# 时间戳
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),
nullable=False)
# -------------------------
# TTSProviderPO
# -------------------------
class TTSProviderPO(Base):
__tablename__ = "tts_provider"
id = Column(Integer, primary_key=True, index=True, autoincrement=True)
name = Column(String(255), nullable=False, unique=True)
api_base_url = Column(String(500), nullable=False)
api_key = Column(String(500), nullable=True)
# voice_list = Column(JSON, nullable=True)
status = Column(Integer, default=1, nullable=False)
# 时间戳
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),
nullable=False)
class PromptPO(Base):
__tablename__ = "prompts"
id = Column(Integer, primary_key=True, index=True, autoincrement=True)
name = Column(String(255), nullable=False)
task = Column(String(255), nullable=False)
description = Column(Text, nullable=True)
content = Column(Text, nullable=True)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),nullable=False)
# -------------------------
# ProjectSettings
# -------------------------
# class ProjectSettings(Base):
# __tablename__ = "project_settings"
#
# id = Column(Integer, primary_key=True, index=True, autoincrement=True)
# project_id = Column(Integer, nullable=False) # 所属项目
# llm_provider_id = Column(Integer, nullable=True) # LLM提供商
# llm_model = Column(String(255), nullable=True) # 指定模型
# tts_provider_id = Column(Integer, nullable=True) # TTS提供商
#
# # 时间戳
# created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), nullable=False)
# updated_at = Column(DateTime, default=lambda: datetime.now(timezone.utc), onupdate=lambda: datetime.now(timezone.utc),
# nullable=False)
================================================
FILE: SonicVale/app/repositories/chapter_repository.py
================================================
from typing import Optional, Sequence
from sqlalchemy import select
from sqlalchemy.orm import Session
from app.models.po import ChapterPO
class ChapterRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, chapter_id: int) -> Optional[ChapterPO]:
"""根据 ID 查询项目"""
return self.db.get(ChapterPO, chapter_id)
def get_all(self, project_id: int) -> Sequence[ChapterPO]:
"""获取指定项目下的所有章节"""
stmt = select(ChapterPO).where(ChapterPO.project_id == project_id)
return self.db.execute(stmt).scalars().all()
def create(self, chapter_data: ChapterPO) -> ChapterPO:
"""新建项目"""
self.db.add(chapter_data)
self.db.commit()
self.db.refresh(chapter_data)
return chapter_data
def update(self, chapter_id: int, chapter_data: dict) -> Optional[ChapterPO]:
"""更新项目"""
chapter = self.get_by_id(chapter_id)
if not chapter:
return None
for key, value in chapter_data.items():
if value is not None: # 只更新不为空的字段
setattr(chapter, key, value)
self.db.commit()
self.db.refresh(chapter)
return chapter
def delete(self, chapter_id: int) -> bool:
"""删除章节"""
project = self.get_by_id(chapter_id)
if not project:
return False
self.db.delete(project)
self.db.commit()
return True
# def delete_all_by_project_id(self, project_id: int) -> bool:
# """删除指定项目下的所有章节"""
# pos = self.get_all(project_id)
# for po in pos:
# self.db.delete(po)
# self.db.commit()
# return True
def get_by_name(self, name: str, project_id: int) -> Optional[ChapterPO]:
"""根据项目ID和章节名称查找章节"""
stmt = (
select(ChapterPO)
.where(ChapterPO.title == name)
.where(ChapterPO.project_id == project_id)
)
return self.db.execute(stmt).scalar_one_or_none()
def search(self, keyword: str) -> Sequence[ChapterPO]:
"""模糊搜索"""
stmt = select(ChapterPO).where(ChapterPO.title.ilike(f"%{keyword}%"))
return self.db.execute(stmt).scalars().all()
================================================
FILE: SonicVale/app/repositories/emotion_repository.py
================================================
from typing import Optional, Sequence
from sqlalchemy.orm import Session
from app.models.po import EmotionPO
class EmotionRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, id: int) -> Optional[EmotionPO]:
"""通过id获取情绪"""
return self.db.query(EmotionPO).filter(EmotionPO.id == id).first()
def get_by_name(self, name: str) -> Optional[EmotionPO]:
"""通过名称获取情绪"""
return self.db.query(EmotionPO).filter(EmotionPO.name == name).first()
def get_all(self) -> list[type[EmotionPO]]:
"""获取所有情绪"""
return self.db.query(EmotionPO).all()
def create(self, emotion: EmotionPO) -> EmotionPO:
"""创建情绪"""
self.db.add(emotion)
self.db.commit()
self.db.refresh(emotion)
return emotion
def update(self, id: int, data: dict) -> Optional[EmotionPO]:
"""更新情绪"""
emotion = self.get_by_id(id)
if not emotion:
return None
for key, value in data.items():
if value is not None:
setattr(emotion, key, value)
self.db.commit()
self.db.refresh(emotion)
return emotion
def delete(self, id: int) -> bool:
"""删除情绪"""
emotion = self.get_by_id(id)
if not emotion:
return False
self.db.delete(emotion)
self.db.commit()
return True
================================================
FILE: SonicVale/app/repositories/line_repository.py
================================================
from typing import Optional, List
from sqlalchemy import Sequence, select, update
from sqlalchemy.orm import Session
from app.dto.line_dto import LineOrderDTO
from app.models.po import LinePO
class LineRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, id: int) -> Optional[LinePO]:
"""根据 ID 查询单行台词"""
return self.db.get(LinePO, id)
def get_all(self, chapter_id: int) -> Sequence[LinePO]:
"""获取章节下所有单行台词,按 line_order 排序"""
stmt = (
select(LinePO)
.where(LinePO.chapter_id == chapter_id)
.order_by(LinePO.line_order.asc()) # 升序
)
return self.db.execute(stmt).scalars().all()
def create(self, data: LinePO) -> LinePO:
"""新增单行台词"""
self.db.add(data)
self.db.commit()
self.db.refresh(data)
return data
def update(self, line_id: int, line_data: dict) -> Optional[LinePO]:
"""更新单行台词信息"""
line = self.get_by_id(line_id)
if not line:
return None
for key, value in line_data.items():
if value is not None: # 只更新不为空的字段
setattr(line, key, value)
self.db.commit()
self.db.refresh(line)
return line
def delete(self, line_id: int) -> bool:
"""删除台词"""
line = self.get_by_id(line_id)
if not line:
return False
self.db.delete(line)
self.db.commit()
return True
def delete_all_by_chapter_id(self, chapter_id: int) -> bool:
"""删除章节下的所有台词"""
lines = self.get_all(chapter_id)
for line in lines:
self.db.delete(line)
self.db.commit()
return True
def get_lines_by_role_id(self, role_id: int):
return self.db.execute(select(LinePO).where(LinePO.role_id == role_id)).scalars().all()
def batch_update_line_order(self, line_orders:List[LineOrderDTO])-> int:
"""批量更新台词的顺序"""
if not line_orders:
return 0
from sqlalchemy import bindparam
stmt = (
update(LinePO)
.where(LinePO.id == bindparam("id"))
.values(line_order=bindparam("line_order"))
)
params = [{"id": it.id, "line_order": it.line_order} for it in line_orders]
res = self.db.execute(stmt, params) # executemany
self.db.commit()
return res.rowcount if res.rowcount not in (None, -1) else len(params)
================================================
FILE: SonicVale/app/repositories/llm_provider_repository.py
================================================
from typing import List, Optional, Sequence, Any
from sqlalchemy.orm import Session
from sqlalchemy import select, Row, RowMapping
from app.models.po import LLMProviderPO
class LLMProviderRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, llm_provider_id: int) -> Optional[LLMProviderPO]:
"""根据 ID 查询LLM供应商"""
return self.db.get(LLMProviderPO, llm_provider_id)
def get_all(self) -> Sequence[LLMProviderPO]:
"""获取所有LLM供应商"""
return self.db.execute(select(LLMProviderPO)).scalars().all()
def create(self, llm_provider_data: LLMProviderPO) -> LLMProviderPO:
"""新建LLM供应商"""
self.db.add(llm_provider_data)
self.db.commit()
self.db.refresh(llm_provider_data)
return llm_provider_data
def update(self, llm_provider_id: int, llm_provider_data: dict) -> Optional[LLMProviderPO]:
"""更新LLM供应商"""
llm_provider = self.get_by_id(llm_provider_id)
if not llm_provider:
return None
for key, value in llm_provider_data.items():
if value is not None: # 只更新不为空的字段
setattr(llm_provider, key, value)
self.db.commit()
self.db.refresh(llm_provider)
return llm_provider
def delete(self, llm_provider_id: int) -> bool:
"""删除LLM供应商"""
llm_provider = self.get_by_id(llm_provider_id)
if not llm_provider:
return False
self.db.delete(llm_provider)
self.db.commit()
return True
def get_by_name(self, name: str) -> Optional[LLMProviderPO]:
"""根据名称查找LLM供应商"""
stmt = select(LLMProviderPO).where(LLMProviderPO.name == name)
return self.db.execute(stmt).scalar_one_or_none()
def search(self, keyword: str) -> Sequence[LLMProviderPO]:
"""模糊搜索"""
stmt = select(LLMProviderPO).where(LLMProviderPO.name.ilike(f"%{keyword}%"))
return self.db.execute(stmt).scalars().all()
================================================
FILE: SonicVale/app/repositories/multi_emotion_voice_repository.py
================================================
from typing import Optional, Sequence, Any
from sqlalchemy.orm import Session, Query
from app.models.po import MultiEmotionVoicePO
class MultiEmotionVoiceRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, id: int) -> Optional[MultiEmotionVoicePO]:
"""通过id获取多情绪音色"""
return self.db.query(MultiEmotionVoicePO).filter(MultiEmotionVoicePO.id == id).first()
# 根据voice_id,emotion_id,strength_id获取多情绪音色
def get_by_voice_id_emotion_id_strength_id(self, voice_id: int, emotion_id: int, strength_id: int) -> type[MultiEmotionVoicePO] | None:
"""根据voice_id,emotion_id,strength_id获取多情绪音色"""
return self.db.query(MultiEmotionVoicePO).filter(MultiEmotionVoicePO.voice_id == voice_id,
MultiEmotionVoicePO.emotion_id == emotion_id,
MultiEmotionVoicePO.strength_id == strength_id).one_or_none()
# 根据voice_id获取多情绪音色
def get_by_voice_id(self, voice_id: int) -> Sequence[type[MultiEmotionVoicePO]]:
"""根据voice_id获取多情绪音色"""
return self.db.query(MultiEmotionVoicePO).filter(MultiEmotionVoicePO.voice_id == voice_id).all()
def get_all(self) -> list[type[MultiEmotionVoicePO]]:
"""获取所有多情绪音色"""
return self.db.query(MultiEmotionVoicePO).all()
def create(self, multi_emotion_voice: MultiEmotionVoicePO) -> MultiEmotionVoicePO:
"""创建多情绪音色"""
self.db.add(multi_emotion_voice)
self.db.commit()
self.db.refresh(multi_emotion_voice)
return multi_emotion_voice
def update(self, id: int, data: dict) -> Optional[MultiEmotionVoicePO]:
"""更新多情绪音色"""
multi_emotion_voice = self.get_by_id(id)
if not multi_emotion_voice:
return None
for key, value in data.items():
if value is not None:
setattr(multi_emotion_voice, key, value)
self.db.commit()
self.db.refresh(multi_emotion_voice)
return multi_emotion_voice
def delete(self, id: int) -> bool:
"""删除多情绪音色"""
multi_emotion_voice = self.get_by_id(id)
if not multi_emotion_voice:
return False
self.db.delete(multi_emotion_voice)
self.db.commit()
return True
def delete_multi_emotion_voice_by_voice_id(self, voice_id):
"""通过音色id删除所有的多音色"""
multi_voices = self.get_by_voice_id(voice_id)
for multi_voice in multi_voices:
self.db.delete(multi_voice)
self.db.commit()
return True
================================================
FILE: SonicVale/app/repositories/project_repository.py
================================================
from typing import List, Optional, Sequence, Any
from sqlalchemy.orm import Session
from sqlalchemy import select, Row, RowMapping
from app.models.po import ProjectPO
class ProjectRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, project_id: int) -> Optional[ProjectPO]:
"""根据 ID 查询项目"""
return self.db.get(ProjectPO, project_id)
def get_all(self) -> Sequence[ProjectPO]:
"""获取所有项目"""
return self.db.execute(select(ProjectPO)).scalars().all()
def create(self, project_data: ProjectPO) -> ProjectPO:
"""新建项目"""
self.db.add(project_data)
self.db.commit()
self.db.refresh(project_data)
return project_data
def update(self, project_id: int, project_data: dict) -> Optional[ProjectPO]:
"""更新项目"""
project = self.get_by_id(project_id)
if not project:
return None
for key, value in project_data.items():
setattr(project, key, value)
self.db.commit()
self.db.refresh(project)
return project
def delete(self, project_id: int) -> bool:
"""删除项目"""
project = self.get_by_id(project_id)
if not project:
return False
self.db.delete(project)
self.db.commit()
return True
def get_by_name(self, name: str) -> Optional[ProjectPO]:
"""根据名称查找项目"""
stmt = select(ProjectPO).where(ProjectPO.name == name)
return self.db.execute(stmt).scalar_one_or_none()
def search(self, keyword: str) -> Sequence[ProjectPO]:
"""模糊搜索"""
stmt = select(ProjectPO).where(ProjectPO.name.ilike(f"%{keyword}%"))
return self.db.execute(stmt).scalars().all()
================================================
FILE: SonicVale/app/repositories/prompt_repository.py
================================================
from typing import List, Optional, Sequence, Any
from sqlalchemy.orm import Session
from sqlalchemy import select, Row, RowMapping
from app.models.po import PromptPO
class PromptRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, prompt_id: int) -> Optional[PromptPO]:
"""根据 ID 查询提示词"""
return self.db.get(PromptPO, prompt_id)
def get_all(self) -> Sequence[PromptPO]:
"""获取所有提示词"""
return self.db.execute(select(PromptPO)).scalars().all()
def create(self, prompt_data: PromptPO) -> PromptPO:
"""新建提示词"""
self.db.add(prompt_data)
self.db.commit()
self.db.refresh(prompt_data)
return prompt_data
def update(self, prompt_id: int, prompt_data: dict) -> Optional[PromptPO]:
"""更新提示词"""
prompt = self.get_by_id(prompt_id)
if not prompt:
return None
for key, value in prompt_data.items():
if value is not None: # 只更新不为空的字段
setattr(prompt, key, value)
self.db.commit()
self.db.refresh(prompt)
return prompt
def delete(self, prompt_id: int) -> bool:
"""删除提示词"""
prompt = self.get_by_id(prompt_id)
if not prompt:
return False
self.db.delete(prompt)
self.db.commit()
return True
def get_by_name(self, name: str) -> Optional[PromptPO]:
"""根据名称查找提示词"""
stmt = select(PromptPO).where(PromptPO.name == name)
return self.db.execute(stmt).scalar_one_or_none()
# 根据任务查询,返回多个提示词
def get_by_task(self, task: str) -> Sequence[PromptPO]:
stmt = select(PromptPO).where(PromptPO.task == task)
return self.db.execute(stmt).scalars().all()
def search(self, keyword: str) -> Sequence[PromptPO]:
"""模糊搜索"""
stmt = select(PromptPO).where(PromptPO.name.ilike(f"%{keyword}%"))
return self.db.execute(stmt).scalars().all()
================================================
FILE: SonicVale/app/repositories/role_repository.py
================================================
from typing import Optional
from sqlalchemy import Sequence, select
from sqlalchemy.orm import Session
from app.models.po import RolePO
class RoleRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, id: int) -> Optional[RolePO]:
"""根据 ID 查询角色"""
return self.db.get(RolePO, id)
def get_all(self,project_id: int) -> Sequence[RolePO]:
"""获取项目下所有角色"""
return self.db.execute(select(RolePO).where(RolePO.project_id == project_id)).scalars().all()
def create(self, data: RolePO) -> RolePO:
"""新增角色"""
self.db.add(data)
self.db.commit()
self.db.refresh(data)
return data
def update(self, role_id: int, role_data: dict) -> Optional[RolePO]:
"""更新角色信息"""
role = self.get_by_id(role_id)
if not role:
return None
for key, value in role_data.items():
if value is not None: # 只更新不为空的字段
setattr(role, key, value)
self.db.commit()
self.db.refresh(role)
return role
def delete(self, role_id: int) -> bool:
"""删除项目"""
role = self.get_by_id(role_id)
if not role:
return False
self.db.delete(role)
self.db.commit()
return True
def get_by_name(self, name: str,project_id: int) -> Optional[RolePO]:
"""根据名称查找项目下的角色信息"""
return self.db.execute(select(RolePO).where(RolePO.name == name,RolePO.project_id == project_id)).scalars().first()
================================================
FILE: SonicVale/app/repositories/strength_repository.py
================================================
from typing import Optional, Sequence
from sqlalchemy.orm import Session
from app.models.po import StrengthPO
class StrengthRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, id: int) -> Optional[StrengthPO]:
"""通过id获取情绪强弱"""
return self.db.query(StrengthPO).filter(StrengthPO.id == id).first()
def get_by_name(self, name: str) -> Optional[StrengthPO]:
"""通过名称获取情绪强弱"""
return self.db.query(StrengthPO).filter(StrengthPO.name == name).first()
def get_all(self) -> list[type[StrengthPO]]:
"""获取所有情绪强弱"""
return self.db.query(StrengthPO).all()
def create(self, strength: StrengthPO) -> StrengthPO:
"""创建情绪强弱"""
self.db.add(strength)
self.db.commit()
self.db.refresh(strength)
return strength
def update(self, id: int, data: dict) -> Optional[StrengthPO]:
"""更新情绪强弱"""
strength = self.get_by_id(id)
if not strength:
return None
for key, value in data.items():
if value is not None:
setattr(strength, key, value)
self.db.commit()
self.db.refresh(strength)
return strength
def delete(self, id: int) -> bool:
"""删除情绪强弱"""
strength = self.get_by_id(id)
if not strength:
return False
self.db.delete(strength)
self.db.commit()
return True
================================================
FILE: SonicVale/app/repositories/tts_provider_repository.py
================================================
from typing import Optional
from sqlalchemy import Sequence, select
from sqlalchemy.orm import Session
from app.models.po import TTSProviderPO
class TTSProviderRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, id: int) -> Optional[TTSProviderPO]:
"""根据 ID 查询tts供应商"""
return self.db.get(TTSProviderPO, id)
def get_all(self) -> Sequence[TTSProviderPO]:
"""获取tts下所有tts供应商"""
return self.db.execute(select(TTSProviderPO)).scalars().all()
def create(self, data: TTSProviderPO) -> TTSProviderPO:
"""新增tts供应商"""
self.db.add(data)
self.db.commit()
self.db.refresh(data)
return data
def update(self, tts_provider_id: int, voice_data: dict) -> Optional[TTSProviderPO]:
"""更新tts供应商信息"""
voice = self.get_by_id(tts_provider_id)
if not voice:
return None
for key, value in voice_data.items():
if value is not None: # 只更新不为空的字段
setattr(voice, key, value)
self.db.commit()
self.db.refresh(voice)
return voice
# def delete(self, voice_id: int) -> bool:
# """删除项目"""
# voice = self.get_by_id(voice_id)
# if not voice:
# return False
# self.db.delete(voice)
# self.db.commit()
# return True
#
#
def get_by_name(self, name: str) -> Optional[TTSProviderPO]:
"""根据名称查找项目下的tts供应商信息"""
return self.db.execute(select(TTSProviderPO).where(TTSProviderPO.name == name)).scalars().first()
================================================
FILE: SonicVale/app/repositories/voice_repository.py
================================================
from typing import Optional
from sqlalchemy import Sequence, select
from sqlalchemy.orm import Session
from app.models.po import VoicePO
class VoiceRepository:
def __init__(self, db: Session):
self.db = db
def get_by_id(self, id: int) -> Optional[VoicePO]:
"""根据 ID 查询音色"""
return self.db.get(VoicePO, id)
def get_all(self,tts_id: int) -> Sequence[VoicePO]:
"""获取tts下所有音色"""
return self.db.execute(select(VoicePO).where(VoicePO.tts_provider_id == tts_id)).scalars().all()
def get_by_ids(self, tts_id: int, ids: list[int]) -> Sequence[VoicePO]:
"""根据ids获取tts下的音色"""
if not ids:
return []
return self.db.execute(
select(VoicePO).where(VoicePO.tts_provider_id == tts_id, VoicePO.id.in_(ids))
).scalars().all()
def create(self, data: VoicePO) -> VoicePO:
"""新增音色"""
self.db.add(data)
self.db.commit()
self.db.refresh(data)
return data
def update(self, voice_id: int, voice_data: dict) -> Optional[VoicePO]:
"""更新音色信息"""
voice = self.get_by_id(voice_id)
if not voice:
return None
for key, value in voice_data.items():
setattr(voice, key, value)
self.db.commit()
self.db.refresh(voice)
return voice
def delete(self, voice_id: int) -> bool:
"""删除项目"""
voice = self.get_by_id(voice_id)
if not voice:
return False
self.db.delete(voice)
self.db.commit()
return True
def get_by_name(self, name: str,tts_id: int) -> Optional[VoicePO]:
"""根据名称查找项目下的音色信息"""
return self.db.execute(select(VoicePO).where(VoicePO.name == name,VoicePO.tts_provider_id == tts_id)).scalars().first()
================================================
FILE: SonicVale/app/routers/chapter_router.py
================================================
# 初始化 router
import asyncio
import io
import json
import logging
import os
import traceback
from typing import List
from sqlalchemy.orm import Session
from fastapi import APIRouter, Depends, HTTPException, Form
from app.core.response import Res
from app.core.text_correct_engine import TextCorrectorFinal
from app.core.ws_manager import manager
from app.db.database import get_db, SessionLocal
from app.dto.chapter_dto import ChapterResponseDTO, ChapterCreateDTO
from app.dto.line_dto import LineInitDTO, LineCreateDTO, LineResponseDTO
from app.entity.chapter_entity import ChapterEntity
from app.repositories.chapter_repository import ChapterRepository
from app.repositories.emotion_repository import EmotionRepository
from app.repositories.line_repository import LineRepository
from app.repositories.llm_provider_repository import LLMProviderRepository
from app.repositories.multi_emotion_voice_repository import MultiEmotionVoiceRepository
from app.repositories.project_repository import ProjectRepository
from app.repositories.prompt_repository import PromptRepository
from app.repositories.role_repository import RoleRepository
from app.repositories.strength_repository import StrengthRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.repositories.voice_repository import VoiceRepository
from app.services.chapter_service import ChapterService
from app.services.emotion_service import EmotionService
from app.services.line_service import LineService
from app.services.multi_emotion_voice_service import MultiEmotionVoiceService
from app.services.project_service import ProjectService
from app.services.prompt_service import PromptService
from app.services.role_service import RoleService
from app.services.strength_service import StrengthService
from app.services.voice_service import VoiceService
router = APIRouter(prefix="/chapters", tags=["Chapters"])
# 依赖注入(实际项目可用 DI 容器)
def get_chapter_service(db: Session = Depends(get_db)) -> ChapterService:
repository = ChapterRepository(db) # ✅ 传入 db
return ChapterService(repository)
def get_line_service(db: Session = Depends(get_db)) -> LineService:
repository = LineRepository(db)
role_repository = RoleRepository(db)
tts_provider_repository = TTSProviderRepository(db)
llm_provider_repository = LLMProviderRepository(db)
return LineService(repository, role_repository, tts_provider_repository, llm_provider_repository)
def get_project_service(db: Session = Depends(get_db)) -> ProjectService:
repository = ProjectRepository(db)
return ProjectService(repository)
def get_voice_service(db: Session = Depends(get_db)) -> VoiceService:
repository = VoiceRepository(db)
multi_emotion_voice_repository = MultiEmotionVoiceRepository(db)
return VoiceService(repository,multi_emotion_voice_repository)
def get_role_service(db: Session = Depends(get_db)) -> RoleService:
repository = RoleRepository(db)
return RoleService(repository)
def get_emotion_service(db: Session = Depends(get_db)) -> EmotionService:
repository = EmotionRepository(db)
return EmotionService(repository)
def get_strength_service(db: Session = Depends(get_db)) -> StrengthService:
repository = StrengthRepository(db)
return StrengthService(repository)
def get_multi_emotion_voice_service(db: Session = Depends(get_db)) -> MultiEmotionVoiceService:
repository = MultiEmotionVoiceRepository(db)
return MultiEmotionVoiceService(repository)
def get_prompt_service(db: Session = Depends(get_db)) -> PromptService:
repository = PromptRepository(db)
return PromptService(repository)
@router.post("", response_model=Res[ChapterResponseDTO],
summary="创建章节",
description="根据项目ID创建章节,章节名称在同一项目下不可重复" )
async def create_chapter(dto: ChapterCreateDTO, chapter_service: ChapterService = Depends(get_chapter_service),
project_service: ProjectService = Depends(get_project_service)):
"""创建章节"""
try:
# DTO → Entity
entity = ChapterEntity(**dto.__dict__)
# 判断project_id是否存在
project = project_service.get_project(dto.project_id)
if project is None:
return Res(data=None, code=400, message=f"项目 '{dto.project_id}' 不存在")
# 调用 Service 创建项目(返回 True/False)
entityRes = chapter_service.create_chapter(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = ChapterResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"章节 '{entity.title}' 已存在")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
@router.get("/{chapter_id}", response_model=Res[ChapterResponseDTO],
summary="查询章节",
description="根据章节id查询章节信息")
async def get_chapter(chapter_id: int, chapter_service: ChapterService = Depends(get_chapter_service)):
entity = chapter_service.get_chapter(chapter_id)
if entity:
res = ChapterResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="项目不存在")
@router.get("/project/{project_id}", response_model=Res[List[ChapterResponseDTO]],
summary="查询项目下的所有章节",
description="根据项目id查询项目下的所有章节信息")
async def get_all_chapters(project_id: int, chapter_service: ChapterService = Depends(get_chapter_service)):
entities = chapter_service.get_all_chapters(project_id)
if entities:
res = [ChapterResponseDTO(**e.__dict__) for e in entities]
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=[], code=404, message="项目不存在章节")
# 修改,传入的参数是id
@router.put("/{chapter_id}", response_model=Res[ChapterCreateDTO],
summary="修改章节",
description="根据章节id修改章节信息,并且不能修改项目id")
async def update_chapter(chapter_id: int, dto: ChapterCreateDTO, chapter_service: ChapterService = Depends(get_chapter_service)):
chapter = chapter_service.get_chapter(chapter_id)
if chapter is None:
return Res(data=None, code=404, message="章节不存在")
res = chapter_service.update_chapter(chapter_id, dto.dict(exclude_unset=True))
if res:
return Res(data=dto, code=200, message="修改成功")
else:
return Res(data=None, code=400, message="修改失败")
# 根据id,删除
@router.delete("/{chapter_id}", response_model=Res,
summary="删除章节",
description="根据章节id删除章节信息,并且级联删除")
async def delete_chapter(chapter_id: int, chapter_service: ChapterService = Depends(get_chapter_service)):
success = chapter_service.delete_chapter(chapter_id)
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或章节不存在")
# 根据内容进行解析得到json,初次解析,然后可编辑角色昵称以及内容,以及可以合并上下或者增加。(json都是多条,角色+台词)
@router.get(
"/get-lines/{project_id}/{chapter_id}",
response_model=Res[str],
summary="根据内容进行解析得到json",
description="根据内容进行解析得到json"
)
async def get_lines(
project_id: int,
chapter_id: int,
chapter_service: ChapterService = Depends(get_chapter_service),
line_service: LineService = Depends(get_line_service),
role_service: RoleService = Depends(get_role_service),
emotion_service: EmotionService = Depends(get_emotion_service),
strength_service: StrengthService = Depends(get_strength_service),
prompt_service: PromptService = Depends(get_prompt_service),
project_service: ProjectService = Depends(get_project_service)
):
# 判断章节内容是否存在
chapter = chapter_service.get_chapter(chapter_id)
if chapter.text_content is None:
return Res(data=None, code=400, message="章节内容不存在")
try:
contents = chapter_service.split_text(chapter_id, 1500)
logging.info("内容划分为 %s 段", len(contents))
except Exception as e:
logging.error(f"章节拆分失败: {e}\n{traceback.format_exc()}")
return Res(data=None, code=500, message="章节拆分失败")
all_line_data = []
try:
roles = role_service.get_all_roles(project_id)
roles = set(role.name for role in roles)
emotions = emotion_service.get_all_emotions()
strengths = strength_service.get_all_strengths()
emotion_names = [emotion.name for emotion in emotions]
strength_names = [strength.name for strength in strengths]
emotions_dict = {emotion.name: emotion.id for emotion in emotions}
strengths_dict = {strength.name: strength.id for strength in strengths}
except Exception as e:
logging.error(f"初始化角色/情绪/强度失败: {e}\n{traceback.format_exc()}")
return Res(data=None, code=500, message="初始化角色/情绪/强度失败")
project = project_service.get_project(project_id)
# 精准填充
is_precise_fill = project.is_precise_fill
# 判断tts,llm,model是否存在
if project.tts_provider_id is None or project.llm_provider_id is None or project.llm_model is None:
return Res(data=None, code=500, message="tts/llm/model不存在")
prompt = prompt_service.get_prompt(project.prompt_id) if project else None
if prompt is None:
return Res(data=None, code=500, message="提示词不存在")
for idx, content in enumerate(contents):
logging.info(f"解析第 {idx + 1}/{len(contents)} 段...")
try:
roles_list = list(roles)
result = chapter_service.para_content(
prompt.content, chapter_id, content,
roles_list, emotion_names, strength_names,is_precise_fill
)
if not result["success"]:
return Res(
data=None,
code=500,
message=result["message"]
)
# 提取lines_data中的角色
lines_data = result["data"]
for line_data in lines_data:
roles.add(line_data.role_name)
all_line_data.extend(lines_data)
except Exception as e:
logging.error(
f"解析第 {idx + 1} 段失败: {e}\n{traceback.format_exc()}"
)
return Res(data=None, code=500, message=f"解析失败:第 {idx + 1} 段处理出错,错误信息:{e}")
try:
audio_path = os.path.join(project.project_root_path,str(project_id),str(chapter_id),"audio")
os.makedirs(audio_path, exist_ok=True)
line_service.update_init_lines(
all_line_data, project_id, chapter_id, emotions_dict, strengths_dict,audio_path
)
except Exception as e:
logging.error(f"写入数据库失败: {e}\n{traceback.format_exc()}")
return Res(data=None, code=500, message="写入数据库失败")
return Res(data=None, code=200, message="解析成功")
# 导出LLM prompt指令
@router.get("/export-llm-prompt/{project_id}/{chapter_id}",response_model=Res[str],summary="导出LLM prompt指令",description="导出LLM prompt指令")
async def export_llm_prompt(project_id:int,chapter_id: int, chapter_service: ChapterService = Depends(get_chapter_service),
project_service = Depends(get_project_service),
prompt_service: PromptService = Depends(get_prompt_service),
role_service: RoleService = Depends(get_role_service),
emotion_service: EmotionService = Depends(get_emotion_service),
strength_service: StrengthService = Depends(get_strength_service)):
try:
roles = role_service.get_all_roles(project_id)
roles = [role.name for role in roles]
emotions = emotion_service.get_all_emotions()
strengths = strength_service.get_all_strengths()
emotion_names = [emotion.name for emotion in emotions]
strength_names = [strength.name for strength in strengths]
except Exception as e:
return Res(data=None, code=500, message="初始化角色/情绪/强度失败")
project = project_service.get_project(project_id)
prompt = prompt_service.get_prompt(project.prompt_id) if project else None
chapter = chapter_service.get_chapter(chapter_id)
content = chapter.text_content
res = chapter_service.fill_prompt(prompt.content, roles, emotion_names, strength_names, content)
# record
return Res(data=res, code=200, message="导出成功")
# 解析第三方的json
@router.post("/import-lines/{project_id}/{chapter_id}",response_model=Res[str],summary="导入第三方json",description="导入第三方json")
async def import_lines(project_id: int,chapter_id: int,data:str=Form( ...),line_service: LineService = Depends(get_line_service),
emotion_service: EmotionService = Depends(get_emotion_service),
strength_service: StrengthService = Depends(get_strength_service),
project_service: ProjectService = Depends(get_project_service),
chapter_service: ChapterService = Depends(get_chapter_service)):
# 解析data
lines_data = json.loads(data)
# 转化成List[LineInitDTO]
emotions = emotion_service.get_all_emotions()
strengths = strength_service.get_all_strengths()
emotions_dict = {emotion.name: emotion.id for emotion in emotions}
strengths_dict = {strength.name: strength.id for strength in strengths}
# 精准填充
project = project_service.get_project(project_id)
is_precise_fill = project.is_precise_fill
if is_precise_fill == 1:
# 获取章节内容
content = chapter_service.get_chapter(chapter_id).text_content
if not content:
return Res(data=None, code=500, message="章节内容为空")
corrector = TextCorrectorFinal()
lines_data = corrector.correct_ai_text(content, lines_data)
lines_data = [LineInitDTO(**line) for line in lines_data]
audio_path = os.path.join(project.project_root_path,str(project_id),str(chapter_id),"audio")
os.makedirs(audio_path, exist_ok=True)
line_service.update_init_lines(lines_data, project_id, chapter_id, emotions_dict, strengths_dict,audio_path)
return Res(data=None, code=200, message="导入成功")
# @router.post("/save-init-lines/{project_id}/{chapter_id}",response_model=Res[str],summary="保存初始化调整后的解析内容",description="保存初始化调整后的解析内容")
# async def update_init_lines(project_id: int,chapter_id: int,lines: List[LineInitDTO], chapter_service: ChapterService = Depends(get_chapter_service)):
# chapter_service.update_init_lines(lines,project_id,chapter_id)
# return Res(data=None, code=200, message="保存成功")
# 绑定音色就是采用的修改角色信息
# 获取章节下所有台词
# 传入台词实体,然后生成音频
# @router.post("/generate-audio/{project_id}/{chapter_id}",response_model=Res[str],summary="生成音频",description="生成音频")
# async def generate_audio(project_id: int,chapter_id: int,
# dto: LineCreateDTO, chapter_service: ChapterService = Depends(get_chapter_service),
# voice_service: VoiceService = Depends(get_voice_service),
# role_service: RoleService = Depends(get_role_service),
# project_service: ProjectService = Depends(get_project_service)):
# """生成音频"""
# # 获取角色绑定的音色的reference_path
# role = role_service.get_role(dto.role_id)
# voice = voice_service.get_voice(role.default_voice_id)
# project = project_service.get_project(project_id)
# save_path = dto.audio_path
# res = chapter_service.generate_audio(voice.reference_path,project.tts_provider_id,dto.text_content,save_path=save_path)
# return Res(data=None, code=200, message="生成成功")
# 合并结果并导出
# @router.get("/export-audio/{project_id}/{chapter_id}",response_model=Res[str],summary="合并结果并导出",description="合并结果并导出")
# async def export_audio(project_id: int,chapter_id: int, chapter_service: ChapterService = Depends(get_chapter_service))
# res = chapter_service.export_audio(project_id,chapter_id)
# 添加智能匹配角色和音色的功能
@router.post("/add-smart-role-and-voice/{project_id}/{chapter_id}",response_model=Res[List],summary="添加智能匹配角色和音色的功能",description="添加智能匹配角色和音色的功能")
async def add_smart_role_and_voice(project_id: int,chapter_id: int,
chapter_service: ChapterService = Depends(get_chapter_service),
project_service: ProjectService = Depends(get_project_service),
voice_service: VoiceService = Depends(get_voice_service),
role_service: RoleService = Depends(get_role_service)):
# 获取项目信息
project = project_service.get_project(project_id)
# 首先获取项目下所有角色
roles = role_service.get_all_roles(project_id)
# 将所有角色未绑定音色的角色提取出来
roles_no_voice = [role for role in roles if role.default_voice_id is None]
# 只要角色name
role_names = [role.name for role in roles_no_voice]
# 获取所有音色
voices = voice_service.get_all_voices(project.tts_provider_id)
# 只要音色的名字和描述
voice_names = [
{
"name": voice.name,
"description": voice.description
}
for voice in voices
]
# 获取原文内容
content = chapter_service.get_chapter(chapter_id).text_content
res,data = chapter_service.add_smart_role_and_voice(project,content,role_names,voice_names)
# 将data中的每一个元素转化为RoleBindVoiceDTO
# data = [RoleBindVoiceDTO(**item) for item in data]
if res:
return Res(data=data, code=200, message="智能匹配成功")
else:
return Res(data=None, code=500, message="智能匹配失败")
================================================
FILE: SonicVale/app/routers/emotion_router.py
================================================
from typing import List
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session
from app.core.response import Res
from app.db.database import get_db
from app.dto.emotion_dto import EmotionResponseDTO, EmotionCreateDTO
from app.entity.emotion_entity import EmotionEntity
from app.repositories.line_repository import LineRepository
from app.repositories.project_repository import ProjectRepository
from app.repositories.emotion_repository import EmotionRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.services.line_service import LineService
from app.services.project_service import ProjectService
from app.services.emotion_service import EmotionService
router = APIRouter(prefix="/emotions", tags=["Emotions"])
# 依赖注入(实际项目可用 DI 容器)
def get_emotion_service(db: Session = Depends(get_db)) -> EmotionService:
repository = EmotionRepository(db)
return EmotionService(repository)
@router.post("", response_model=Res[EmotionResponseDTO],
summary="创建情绪枚举",
description="根据项目ID创建情绪枚举,情绪枚举名称在同一项目下不可重复" )
def create_emotion(dto: EmotionCreateDTO, emotion_service: EmotionService = Depends(get_emotion_service)):
"""创建情绪枚举"""
try:
# DTO → Entity
entity = EmotionEntity(**dto.__dict__)
# 调用 Service 创建项目(返回 True/False)
entityRes = emotion_service.create_emotion(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = EmotionResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"情绪枚举 '{entity.name}' 已存在")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
@router.get("/{emotion_id}", response_model=Res[EmotionResponseDTO],
summary="查询情绪枚举",
description="根据情绪枚举id查询情绪枚举信息")
def get_emotion(emotion_id: int, emotion_service: EmotionService = Depends(get_emotion_service)):
entity = emotion_service.get_emotion(emotion_id)
if entity:
res = EmotionResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="情绪枚举不存在")
@router.get("", response_model=Res[List[EmotionResponseDTO]],
summary="查询所有情绪枚举",
description="根据所有情绪枚举信息")
def get_all_emotions(emotion_service: EmotionService = Depends(get_emotion_service)):
entities = emotion_service.get_all_emotions()
if entities:
res = [EmotionResponseDTO(**e.__dict__) for e in entities]
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=[], code=404, message="项目不存在情绪枚举")
# 修改,传入的参数是id
@router.put("/{emotion_id}", response_model=Res[EmotionCreateDTO],
summary="修改情绪枚举信息",
description="根据情绪枚举id修改情绪枚举信息,并且不能修改项目id")
def update_emotion(emotion_id: int, dto: EmotionCreateDTO, emotion_service: EmotionService = Depends(get_emotion_service)):
emotion = emotion_service.get_emotion(emotion_id)
if emotion is None:
return Res(data=None, code=404, message="情绪枚举不存在")
res = emotion_service.update_emotion(emotion_id, dto.dict(exclude_unset=True))
if res:
return Res(data=dto, code=200, message="修改成功")
else:
return Res(data=None, code=400, message="修改失败,情绪枚举已存在")
# 根据id,删除,不开放
@router.delete("/{emotion_id}", response_model=Res,
summary="删除情绪枚举",
description="根据情绪枚举id删除情绪枚举信息")
def delete_emotion(emotion_id: int, emotion_service: EmotionService = Depends(get_emotion_service)):
success = emotion_service.delete_emotion(emotion_id)
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或情绪枚举不存在")
================================================
FILE: SonicVale/app/routers/line_router.py
================================================
import asyncio
import os
import logging
import shutil
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional
from fastapi import APIRouter, Depends, HTTPException, Body, Request, Query
from sqlalchemy.orm import Session
from app.core.config import getConfigPath
from app.core.response import Res
from app.core.ws_manager import manager
from app.db.database import get_db, SessionLocal
from app.dto.line_dto import LineResponseDTO, LineCreateDTO, LineOrderDTO, LineAudioProcessDTO
from app.entity.line_entity import LineEntity
from app.repositories.chapter_repository import ChapterRepository
from app.repositories.llm_provider_repository import LLMProviderRepository
from app.repositories.multi_emotion_voice_repository import MultiEmotionVoiceRepository
from app.repositories.project_repository import ProjectRepository
from app.repositories.line_repository import LineRepository
from app.repositories.role_repository import RoleRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.repositories.voice_repository import VoiceRepository
from app.services.chapter_service import ChapterService
from app.services.project_service import ProjectService
from app.services.line_service import LineService
from app.services.role_service import RoleService
from app.services.voice_service import VoiceService
router = APIRouter(prefix="/lines", tags=["Lines"])
# 依赖注入(实际项目可用 DI 容器)
def get_line_service(db: Session = Depends(get_db)) -> LineService:
repository = LineRepository(db)
role_repository = RoleRepository(db)
tts_repository = TTSProviderRepository(db)
llm_repository = LLMProviderRepository(db)
return LineService(repository, role_repository, tts_repository, llm_repository)
def get_project_service(db: Session = Depends(get_db)) -> ProjectService:
repository = ProjectRepository(db)
return ProjectService(repository)
def get_chapter_service(db: Session = Depends(get_db)) -> ChapterService:
repository = ChapterRepository(db)
return ChapterService(repository)
def get_voice_service(db: Session = Depends(get_db)) -> VoiceService:
repository = VoiceRepository(db)
multi_emotion_voice_repository = MultiEmotionVoiceRepository(db)
return VoiceService(repository, multi_emotion_voice_repository)
def get_role_service(db: Session = Depends(get_db)) -> RoleService:
repository = RoleRepository(db)
return RoleService(repository)
@router.post("/{project_id}", response_model=Res[LineResponseDTO],
summary="创建台词",
description="根据项目ID创建台词" )
def create_line(project_id:int,dto: LineCreateDTO, line_service: LineService = Depends(get_line_service),
project_service: ProjectService = Depends(get_project_service),
chapter_service : ChapterService = Depends(get_chapter_service)):
"""创建台词"""
try:
# DTO → Entity
entity = LineEntity(**dto.__dict__)
# 判断project_id是否存在
project = project_service.get_project(project_id)
if project is None:
return Res(data=None, code=400, message=f"项目 '{project_id}' 不存在")
chapter = chapter_service.get_chapter(dto.chapter_id)
if chapter is None:
return Res(data=None, code=400, message=f"章节 '{dto.chapter_id}' 不存在")
# 调用 Service 创建项目(返回 True/False)
entityRes = line_service.create_line(entity)
# 新增台词,这里搞个audio_path
audio_path = os.path.join(project.project_root_path, str(project_id), str(dto.chapter_id), "audio")
os.makedirs(audio_path, exist_ok=True)
res_path = os.path.join(audio_path, "id_" + str(entityRes.id) + ".wav")
line_service.update_line(entityRes.id, {"audio_path": res_path})
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = LineResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"台词 '{entity.name}' 已存在")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
@router.get("/{line_id}", response_model=Res[LineResponseDTO],
summary="查询台词",
description="根据台词id查询台词信息")
def get_line(line_id: int, line_service: LineService = Depends(get_line_service)):
entity = line_service.get_line(line_id)
if entity:
res = LineResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="项目不存在")
@router.get("/lines/{chapter_id}", response_model=Res[List[LineResponseDTO]],
summary="查询章节下的所有台词",
description="根据章节id查询章节下的所有台词信息")
def get_all_lines(chapter_id: int, line_service: LineService = Depends(get_line_service)):
entities = line_service.get_all_lines(chapter_id)
if entities:
res = [LineResponseDTO(**e.__dict__) for e in entities]
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=[], code=200, message="章节不存在台词")
# 修改,传入的参数是id
@router.put("/{line_id}", response_model=Res[LineCreateDTO],
summary="修改台词信息",
description="根据台词id修改台词信息,并且不能修改章节id")
def update_line(line_id: int, dto: LineCreateDTO, line_service: LineService = Depends(get_line_service)):
line = line_service.get_line(line_id)
if line is None:
return Res(data=None, code=404, message="台词不存在")
res = line_service.update_line(line_id, dto.dict(exclude_unset=True))
if res:
return Res(data=dto, code=200, message="修改成功")
else:
return Res(data=None, code=400, message="修改失败")
# 根据id,删除
@router.delete("/{line_id}", response_model=Res,
summary="删除台词",
description="根据台词id删除台词信息")
def delete_line(line_id: int, line_service: LineService = Depends(get_line_service)):
success = line_service.delete_line(line_id)
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或台词不存在")
# 删除章节下所有台词
@router.delete("/lines/{chapter_id}", response_model=Res,summary="删除章节下所有台词",description="根据章节id删除章节下的所有台词信息")
def delete_all_lines(chapter_id: int, line_service: LineService = Depends(get_line_service)):
success = line_service.delete_all_lines(chapter_id)
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或台词不存在")
@router.put("/batch/orders", response_model=Res[bool])
def batch_update_line_order(
line_orders: List[LineOrderDTO] = Body(...), # 关键:明确从 body 读取“数组”
line_service: LineService = Depends(get_line_service),
):
res = line_service.batch_update_line_order(line_orders)
return Res(data=res, code=200, message="更新成功")
# 完成配音时候,更新音频路径,保证顺序一致
@router.put("/{line_id}/audio_path", response_model=Res[bool])
def update_line_audio_path(
line_id: int,
dto: LineCreateDTO, # 关键:明确从 body 读取“数组”
line_service: LineService = Depends(get_line_service),
):
res = line_service.update_audio_path(line_id,dto)
if not res:
return Res(data=None, code=400, message="更新失败")
return Res(data=res, code=200, message="更新成功")
@router.post("/generate-audio/{project_id}/{chapter_id}")
async def generate_audio(request: Request, project_id: int, dto: LineCreateDTO,line_service: LineService = Depends(get_line_service)):
q = request.app.state.tts_queue # 👈 永远拿到已初始化的同一份队列
if q.full():
# 可选:带上 Retry-After 头
raise HTTPException(status_code=429, detail="队列已满,请稍后重试")
q.put_nowait((project_id, dto))
queue_size = q.qsize() # 入队后的队列大小
line_service.update_line(dto.id, {"status": "processing"})
# 入队后立即广播队列大小,让前端实时看到更新
await manager.broadcast({
"event": "line_update",
"line_id": dto.id,
"status": "queued",
"progress": queue_size,
"meta": f"已入队,等待生成"
})
logging.info("队列剩余数量: %s", queue_size)
return {"code": 200, "message": "已入队", "data": {"line_id": dto.id}}
# 改为异步任务
# @router.post("/generate-audio/{project_id}/{chapter_id}")
# async def generate_audio(project_id : int, chapter_id: int, dto: LineCreateDTO):
# # 立即返回,不阻塞
# asyncio.create_task(_run_line_tts(project_id,dto))
# return {"code": 200, "message": "已入队", "data": {"line_id": dto.id}}
#
#
# TTS_EXECUTOR = ThreadPoolExecutor(max_workers=4) # 线程池大小
# TTS_SEMAPHORE = asyncio.Semaphore(1) # 最多 4 个并行 TTS
# async def _run_line_tts(project_id:int,dto: LineCreateDTO):
# db = SessionLocal()
# line_service = get_line_service(db)
# role_service = get_role_service( db)
# voice_service = get_voice_service(db)
# project_service = get_project_service(db)
# try:
# # 1) 更新为 running
# line_service.update_line(dto.id, {"status": "processing"})
# print("开始生成")
# await manager.broadcast({
# "event": "line_update",
# "line_id": dto.id,
# "status": "processing",
# "progress": 0,
# "meta": f"角色 {dto.role_id} 开始生成"
# })
#
# # 2) 模拟进度
# # 获取角色绑定的音色的reference_path
# role = role_service.get_role(dto.role_id)
# voice = voice_service.get_voice(role.default_voice_id)
# project = project_service.get_project(project_id)
# save_path = dto.audio_path
# loop = asyncio.get_running_loop()
# async with TTS_SEMAPHORE:
# # 可选:设置超时,防挂死
# try:
# res = await asyncio.wait_for(
# loop.run_in_executor(
# TTS_EXECUTOR, # ✅ 用自建线程池
# line_service.generate_audio,
# voice.reference_path,
# project.tts_provider_id, # 若引擎需要 base_url,就换成 project.tts_base_url
# dto.text_content,
# save_path
# ),
# timeout=120 # 例:最多等 5 分钟
# )
# except asyncio.TimeoutError:
# raise RuntimeError("TTS 超时")
#
# # res = chapter_service.generate_audio(voice.reference_path,project.tts_provider_id,dto.text_content,save_path=save_path)
# # 3) 真正合成
# line_service.update_line(dto.id, {"status": "done"})
#
# # 4) 广播完成
# await manager.broadcast({
# "event": "line_update",
# "line_id": dto.id,
# "status": "done",
# "progress": 100,
# "meta": "生成完成",
# "audio_path": dto.audio_path
# })
# except Exception as e:
# line_service.update_line(dto.id, {"status": "failed"})
# await manager.broadcast({
# "event": "line_update",
# "line_id": dto.id,
# "status": "failed",
# "progress": 0,
# "meta": f"失败: {e}"
# })
# finally:
# db.close()
#
#
# # 批量更新line_order
# 处理音频文件,传入倍速,音量大小,以及line_id
@router.post("/process-audio/{line_id}")
async def process_audio(line_id: int, dto: LineAudioProcessDTO, line_service: LineService = Depends(get_line_service)):
res = line_service.process_audio(line_id,dto)
if not res:
return Res(data=None, code=400, message="处理失败")
return Res(data=res, code=200, message="处理成功")
# 导出音频与字幕
@router.get("/export-audio/{chapter_id}")
async def export_audio(chapter_id: int,
single: bool = Query(False, description="是否导出单条音频字幕"),
line_service: LineService = Depends(get_line_service)):
res = line_service.export_audio(chapter_id, single)
# res 现在返回 dict,包含 success, message, audio_path 等字段
if isinstance(res, dict):
if res.get("success"):
return Res(data=res, code=200, message=res.get("message", "导出成功"))
else:
return Res(data=res, code=400, message=res.get("message", "导出失败"))
# 兼容旧的返回格式
if not res:
return Res(data=None, code=400, message="导出失败")
return Res(data=res, code=200, message="导出成功")
# 生成单条音频的字幕(已经有音频)
#
# 矫正字幕 - 拼音匹配矫正
@router.post("/correct-subtitle-pinyin/{chapter_id}")
async def correct_subtitle_pinyin(
chapter_id: int,
line_service: LineService = Depends(get_line_service)
):
"""使用拼音匹配算法矫正字幕"""
lines = line_service.get_all_lines(chapter_id)
if not lines:
logging.info("无台词记录")
return Res(data=None, code=400, message="无台词记录")
paths = [line.audio_path for line in lines]
if not paths or not paths[0]:
logging.info("未找到有效音频路径")
return Res(data=None, code=400, message="未找到有效音频路径")
# 读取所有台词,组成一个文本
text = "\n".join([line.text_content for line in lines])
output_dir_path = os.path.join(os.path.dirname(paths[0]), "result")
output_subtitle_path = os.path.join(output_dir_path, "result.srt")
if not os.path.exists(output_subtitle_path):
logging.info("请先导出音频")
return Res(data=None, code=400, message="请先导出音频")
# 拼音矫正输出到独立文件
pinyin_subtitle_path = os.path.join(output_dir_path, "result_pinyin.srt")
shutil.copy(output_subtitle_path, pinyin_subtitle_path)
line_service.correct_subtitle_pinyin(text, pinyin_subtitle_path)
logging.info("整体字幕矫正完成(拼音匹配):%s", pinyin_subtitle_path)
# 将单条字幕也进行矫正
logging.info("开始对单条字幕进行矫正")
for line in lines:
subtitle_path = line.subtitle_path
line_text = line.text_content
if subtitle_path is not None and line_text is not None and os.path.exists(subtitle_path):
# 单条字幕也输出到 _pinyin 文件
base, ext = os.path.splitext(subtitle_path)
pinyin_single_path = f"{base}_pinyin{ext}"
shutil.copy(subtitle_path, pinyin_single_path)
line_service.correct_subtitle_pinyin(line_text, pinyin_single_path)
logging.info("单条字幕矫正完成:%s", line.id)
return Res(data=None, code=200, message="拼音匹配矫正完成")
# 矫正字幕 - LLM矫正
@router.post("/correct-subtitle-llm/{chapter_id}")
async def correct_subtitle_llm(
chapter_id: int,
batch_size: int = Query(20, description="LLM分批处理时每批的条数"),
line_service: LineService = Depends(get_line_service),
chapter_service: ChapterService = Depends(get_chapter_service),
project_service: ProjectService = Depends(get_project_service)
):
"""使用LLM矫正字幕,自动从项目配置获取LLM信息"""
# 获取章节信息
chapter = chapter_service.get_chapter(chapter_id)
if not chapter:
return Res(data=None, code=400, message="章节不存在")
# 获取项目信息,从中读取LLM配置
project = project_service.get_project(chapter.project_id)
if not project:
return Res(data=None, code=400, message="项目不存在")
if not project.llm_provider_id:
return Res(data=None, code=400, message="项目未配置LLM提供商,请在项目设置中配置")
if not project.llm_model:
return Res(data=None, code=400, message="项目未配置LLM模型,请在项目设置中选择模型")
lines = line_service.get_all_lines(chapter_id)
if not lines:
logging.info("无台词记录")
return Res(data=None, code=400, message="无台词记录")
paths = [line.audio_path for line in lines]
if not paths or not paths[0]:
logging.info("未找到有效音频路径")
return Res(data=None, code=400, message="未找到有效音频路径")
# 读取所有台词,组成一个文本
text = "\n".join([line.text_content for line in lines])
output_dir_path = os.path.join(os.path.dirname(paths[0]), "result")
output_subtitle_path = os.path.join(output_dir_path, "result.srt")
if not os.path.exists(output_subtitle_path):
logging.info("请先导出音频")
return Res(data=None, code=400, message="请先导出音频")
# LLM矫正输出到独立文件
llm_subtitle_path = os.path.join(output_dir_path, "result_llm.srt")
shutil.copy(output_subtitle_path, llm_subtitle_path)
line_service.correct_subtitle_llm(
text, llm_subtitle_path,
llm_provider_id=project.llm_provider_id,
llm_model=project.llm_model,
batch_size=batch_size
)
logging.info("整体字幕矫正完成(LLM):%s", llm_subtitle_path)
# 将单条字幕也进行矫正
logging.info("开始对单条字幕进行矫正")
for line in lines:
subtitle_path = line.subtitle_path
line_text = line.text_content
if subtitle_path is not None and line_text is not None and os.path.exists(subtitle_path):
# 单条字幕也输出到 _llm 文件
base, ext = os.path.splitext(subtitle_path)
llm_single_path = f"{base}_llm{ext}"
shutil.copy(subtitle_path, llm_single_path)
line_service.correct_subtitle_llm(
line_text, llm_single_path,
llm_provider_id=project.llm_provider_id,
llm_model=project.llm_model,
batch_size=batch_size
)
logging.info("单条字幕矫正完成:%s", line.id)
return Res(data=None, code=200, message="LLM矫正完成")
================================================
FILE: SonicVale/app/routers/llm_provider_router.py
================================================
from fastapi import APIRouter, Depends, HTTPException
from typing import List
from sqlalchemy.orm import Session
from app.core.response import Res
from app.db.database import get_db
from app.dto.llm_provider_dto import LLMProviderCreateDTO, LLMProviderResponseDTO
from app.entity.llm_provider_entity import LLMProviderEntity
from app.services.llm_provider_service import LLMProviderService
from app.repositories.llm_provider_repository import LLMProviderRepository
# 初始化 router
router = APIRouter(prefix="/llm_providers", tags=["LLMProviders"])
# 依赖注入(实际LLM供应商可用 DI 容器)
def get_llm_service(db: Session = Depends(get_db)) -> LLMProviderService:
repository = LLMProviderRepository(db) # ✅ 传入 db
return LLMProviderService(repository)
@router.post("/", response_model=Res[LLMProviderResponseDTO],
summary="创建LLM供应商",
description="根据LLM供应商信息创建LLM供应商,LLM供应商名称不可重复")
def create_llm_provider(dto: LLMProviderCreateDTO, service: LLMProviderService = Depends(get_llm_service)):
"""
创建LLM供应商
- dto: 前端 POST JSON 传入参数
- service: Service 层注入
"""
try:
# DTO → Entity
entity = LLMProviderEntity(**dto.__dict__)
# 调用 Service 创建LLM供应商(返回 True/False)
entityRes = service.create_llm_provider(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = LLMProviderResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"LLM供应商 '{entity.name}' 已存在")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# 按id查找
@router.get("/{llm_provider_id}", response_model=Res[LLMProviderResponseDTO],
summary="查询LLM供应商",
description="根据LLM供应商ID查询LLM供应商信息")
def get_llm_provider(llm_provider_id: int, service: LLMProviderService = Depends(get_llm_service)):
entity = service.get_llm_provider(llm_provider_id)
if entity:
res = LLMProviderResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="LLM供应商不存在")
@router.get("/", response_model=Res[List[LLMProviderResponseDTO]],
summary="查询所有LLM供应商",
description="查询所有LLM供应商信息")
def get_all_llm_providers(service: LLMProviderService = Depends(get_llm_service)):
entities = service.get_all_llm_providers()
dtos = [LLMProviderResponseDTO(**e.__dict__) for e in entities]
return Res(data=dtos, code=200, message="查询成功")
# ------------------- 修改LLM供应商 -------------------
@router.put("/{llm_provider_id}", response_model=Res[LLMProviderCreateDTO],
summary="修改LLM供应商",
description="根据LLM供应商ID修改LLM供应商信息")
def update_llm_provider(llm_provider_id: int, dto: LLMProviderCreateDTO, service: LLMProviderService = Depends(get_llm_service)):
# 先根据id进行查找
llm_provider = service.get_llm_provider(llm_provider_id)
if not llm_provider:
return Res(data=None, code=400, message="LLM供应商不存在")
success = service.update_llm_provider(llm_provider_id,dto.dict(exclude_unset=True))
if success:
return Res(data=dto, code=200, message="更新成功")
else:
return Res(data=None, code=400, message="更新失败")
# ------------------- 删除LLM供应商 -------------------
@router.delete("/{llm_provider_id}", response_model=Res,
summary="删除LLM供应商",
description="根据LLM供应商ID删除LLM供应商,并且级联删除LLM供应商下所有章节以及内容")
def delete_llm_provider(llm_provider_id: int, service: LLMProviderService = Depends(get_llm_service)):
success = service.delete_llm_provider(llm_provider_id)
# todo 级联删除LLM供应商所有相关内容,比如LLM供应商下所有章节以及内容
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或LLM供应商不存在")
# 测试供应商
@router.post("/test", response_model=Res)
def test_llm_provider(dto: LLMProviderCreateDTO, service: LLMProviderService = Depends(get_llm_service)):
"""
测试供应商
"""
entity = LLMProviderEntity(**dto.__dict__)
res,msg = service.test_llm_provider(entity)
if res == True:
return Res(data=None, code=200, message="测试成功")
else:
return Res(data=None, code=400, message= msg)
================================================
FILE: SonicVale/app/routers/multi_emotion_voice_router.py
================================================
from typing import List
from fastapi import APIRouter, Depends
from sqlalchemy.orm import Session
from app.core.response import Res
from app.db.database import get_db
from app.dto.multi_emotion_voice_dto import MultiEmotionVoiceCreateDTO, MultiEmotionVoiceResponseDTO
from app.entity.multi_emotion_voice_entity import MultiEmotionVoiceEntity
from app.repositories.emotion_repository import EmotionRepository
from app.repositories.multi_emotion_voice_repository import MultiEmotionVoiceRepository
from app.repositories.strength_repository import StrengthRepository
from app.repositories.voice_repository import VoiceRepository
from app.services.emotion_service import EmotionService
from app.services.multi_emotion_voice_service import MultiEmotionVoiceService
from app.services.strength_service import StrengthService
from app.services.voice_service import VoiceService
router = APIRouter(prefix="/multi_emotion_voices", tags=["MultiEmotionVoice"])
def get_multi_emotion_voice_service(db: Session = Depends(get_db)) -> MultiEmotionVoiceService:
repository = MultiEmotionVoiceRepository(db)
return MultiEmotionVoiceService(repository)
def get_voice_service(db: Session = Depends(get_db)) -> VoiceService:
repository = VoiceRepository(db)
multi_emotion_voice_repository = MultiEmotionVoiceRepository(db)
return VoiceService(repository, multi_emotion_voice_repository)
def get_emotion_service(db: Session = Depends(get_db)) -> EmotionService:
repository = EmotionRepository(db)
return EmotionService(repository)
def get_strength_service(db: Session = Depends(get_db)) -> StrengthService:
repository = StrengthRepository(db)
return StrengthService(repository)
# 根据voice_id获取多音色
@router.get("/voice_id/{voice_id}", response_model=Res[List[MultiEmotionVoiceResponseDTO]],summary="根据voice_id获取多音色", description="根据voice_id获取多音色")
def get_multi_emotion_voice_by_voice_id(voice_id: int, multi_emotion_voice_service: MultiEmotionVoiceService = Depends(get_multi_emotion_voice_service),
voice_service: VoiceService = Depends(get_voice_service)):
# 应该查询voice
voice = voice_service.get_voice(voice_id)
if voice is None:
return Res(code=404, message="音色不存在")
entities = multi_emotion_voice_service.get_multi_emotion_voice_by_voice_id(voice_id)
if entities is None:
return Res(code=404, message="多音色不存在")
else:
res = [MultiEmotionVoiceResponseDTO(**entity.__dict__) for entity in entities]
return Res(data=res, code=200, message="查询成功")
# 查询所有多音色
@router.get("", response_model=Res[List[MultiEmotionVoiceResponseDTO]],summary="查询所有多音色", description="查询所有多音色")
def get_all_multi_emotion_voice(multi_emotion_voice_service: MultiEmotionVoiceService = Depends(get_multi_emotion_voice_service)):
entities = multi_emotion_voice_service.get_all_multi_emotion_voices()
if not entities:
return Res(data=[], code=200, message="查询成功")
entities = [MultiEmotionVoiceResponseDTO(**e.__dict__) for e in entities]
return Res(data=entities, code=200, message="查询成功")
# 创建
@router.post("", response_model=Res[MultiEmotionVoiceResponseDTO],summary="创建多情绪音色", description="创建多情绪音色")
def create_multi_emotion_voice(dto: MultiEmotionVoiceCreateDTO, multi_emotion_voice_service: MultiEmotionVoiceService = Depends(get_multi_emotion_voice_service),
voice_service: VoiceService = Depends(get_voice_service),
emotion_service: EmotionService = Depends(get_emotion_service),
strength_service: StrengthService = Depends(get_strength_service)):
"""创建多音色"""
# 先要判断voice是否存在
voice = voice_service.get_voice(dto.voice_id)
# 判断情绪枚举是否存在
emotion = emotion_service.get_emotion(dto.emotion_id)
# 判断强度枚举是否存在
strength = strength_service.get_strength(dto.strength_id)
if voice is None or emotion is None or strength is None:
return Res(code=500, message="创建失败,音色或者情绪枚举或者情绪强弱枚举不存在,不能创建多情绪音色")
# DTO → Entity
entity = MultiEmotionVoiceEntity(**dto.__dict__)
entity = multi_emotion_voice_service.create_multi_emotion_voice(entity)
if entity is None:
return Res(code=500, message="创建失败,已存在多情绪音色")
else :
entity = MultiEmotionVoiceResponseDTO(**entity.__dict__)
return Res(data=entity, code=200, message="创建成功")
# 修改
@router.put("/{multi_emotion_voice_id}", response_model=Res[MultiEmotionVoiceCreateDTO],summary="修改多情绪音色", description="修改多情绪音色")
def update_multi_emotion_voice(multi_emotion_voice_id: int, dto: MultiEmotionVoiceCreateDTO, multi_emotion_voice_service: MultiEmotionVoiceService = Depends(get_multi_emotion_voice_service)):
"""修改多音色"""
entity = multi_emotion_voice_service.get_multi_emotion_voice_by_id(multi_emotion_voice_id)
if entity is None:
return Res(code=404, message="多音色不存在")
res = multi_emotion_voice_service.update_multi_emotion_voice(multi_emotion_voice_id, dto.dict(exclude_unset=True))
if res is None:
return Res(code=500, message="修改失败")
else:
entityRes = MultiEmotionVoiceResponseDTO(**entity.__dict__)
return Res(data=entityRes, code=200, message="修改成功")
# 删除
@router.delete("/{multi_emotion_voice_id}", response_model=Res[MultiEmotionVoiceResponseDTO],summary="删除多情绪音色", description="删除多情绪音色")
def delete_multi_emotion_voice(multi_emotion_voice_id: int, multi_emotion_voice_service: MultiEmotionVoiceService = Depends(get_multi_emotion_voice_service)):
"""删除多音色"""
res = multi_emotion_voice_service.delete_multi_emotion_voice(multi_emotion_voice_id)
if res:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败")
#
================================================
FILE: SonicVale/app/routers/project_router.py
================================================
import os
import shutil
import logging
from fastapi import APIRouter, Depends, HTTPException
from typing import List
from sqlalchemy.orm import Session
from app.core.config import getConfigPath
from app.core.response import Res
from app.db.database import get_db
from app.dto.project_dto import ProjectCreateDTO, ProjectResponseDTO, ProjectImportDTO
from app.entity.chapter_entity import ChapterEntity
from app.entity.project_entity import ProjectEntity
from app.models.po import ChapterPO
from app.repositories.chapter_repository import ChapterRepository
from app.repositories.line_repository import LineRepository
from app.repositories.llm_provider_repository import LLMProviderRepository
from app.repositories.role_repository import RoleRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.services.chapter_service import ChapterService
from app.services.project_service import ProjectService
from app.repositories.project_repository import ProjectRepository
from app.services.role_service import RoleService
# 初始化 router
router = APIRouter(prefix="/projects", tags=["Projects"])
# 依赖注入(实际项目可用 DI 容器)
def get_service(db: Session = Depends(get_db)) -> ProjectService:
repository = ProjectRepository(db) # ✅ 传入 db
return ProjectService(repository)
def get_chapter_service(db: Session = Depends(get_db)) -> ChapterService:
repository = ChapterRepository(db) # ✅ 传入 db
return ChapterService(repository)
def get_role_service(db: Session = Depends(get_db)) -> RoleService:
repository = RoleRepository(db) # ✅ 传入 db
return RoleService(repository)
@router.post("/", response_model=Res[ProjectResponseDTO],
summary="创建项目",
description="根据项目信息创建项目,项目名称不可重复")
def create_project(dto: ProjectCreateDTO, service: ProjectService = Depends(get_service)):
"""
创建项目
- dto: 前端 POST JSON 传入参数
- service: Service 层注入
"""
try:
# DTO → Entity
entity = ProjectEntity(**dto.__dict__)
# 调用 Service 创建项目(返回 True/False)
entityRes,message = service.create_project(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = ProjectResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=message)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# 按id查找
@router.get("/{project_id}", response_model=Res[ProjectResponseDTO],
summary="查询项目",
description="根据项目ID查询项目信息")
def get_project(project_id: int, service: ProjectService = Depends(get_service)):
entity = service.get_project(project_id)
if entity:
res = ProjectResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="项目不存在")
@router.get("/", response_model=Res[List[ProjectResponseDTO]],
summary="查询所有项目",
description="查询所有项目信息")
def get_all_projects(service: ProjectService = Depends(get_service)):
entities = service.get_all_projects()
dtos = [ProjectResponseDTO(**e.__dict__) for e in entities]
return Res(data=dtos, code=200, message="查询成功")
# ------------------- 修改项目 -------------------
@router.put("/{project_id}", response_model=Res[ProjectCreateDTO],
summary="修改项目",
description="根据项目ID修改项目信息")
def update_project(project_id: int, dto: ProjectCreateDTO, service: ProjectService = Depends(get_service)):
# 先根据id进行查找
project = service.get_project(project_id)
if not project:
return Res(data=None, code=400, message="项目不存在")
success = service.update_project(project_id,dto.dict())
if success:
return Res(data=dto, code=200, message="更新成功")
else:
return Res(data=None, code=400, message="更新失败")
# ------------------- 删除项目 -------------------
@router.delete("/{project_id}", response_model=Res,
summary="删除项目",
description="根据项目ID删除项目,并且级联删除项目下所有章节以及内容")
def delete_project(project_id: int, service: ProjectService = Depends(get_service), chapter_service: ChapterService = Depends(get_chapter_service),role_service: RoleService = Depends(get_role_service)):
# 级联删除项目所有相关内容,比如项目下所有章节以及内容
entities = chapter_service.get_all_chapters(project_id)
for entity in entities:
chapter_service.delete_chapter(entity.id)
# 删除project目录
project = service.get_project(project_id)
project_path = os.path.join(project.project_root_path, str(project_id))
if os.path.exists(project_path):
shutil.rmtree(project_path) # 删除整个文件夹及其所有内容
logging.info("已删除目录及内容: %s", project_path)
else:
logging.info("目录不存在: %s", project_path)
# 还要删除角色库中projet下的所有角色
roles = role_service.get_all_roles(project_id)
for role in roles:
role_service.delete_role(role.id)
success = service.delete_project(project_id)
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或项目不存在")
# 直接导入整本小说内容,然后解析,创建章节
@router.post("/{project_id}/import")
def import_project(project_id: int, dto: ProjectImportDTO,service: ProjectService = Depends(get_service),
chapter_service: ChapterService = Depends(get_chapter_service)):
content = dto.content
# 删除该项目下的所有章节
# chapters = chapter_service.get_all_chapters(project_id)
# for chapter in chapters:
# chapter_service.delete_chapter(chapter.id)
# 解析content
chapter_contents = service.parse_content(content)
if len(chapter_contents) == 0:
return Res(code=400, message="导入失败")
# 批量创建章节
for chapter_content in chapter_contents:
name = chapter_content["chapter_name"]
content = chapter_content["content"]
logging.info("批量创建章节 %s", name)
chapter_service.create_chapter(ChapterEntity(project_id=project_id, title=name, text_content=content))
return Res(code=200, message="导入成功")
================================================
FILE: SonicVale/app/routers/prompt_router.py
================================================
from fastapi import APIRouter, Depends, HTTPException
from typing import List
from sqlalchemy.orm import Session
from app.core.enums import TaskEnum
from app.core.response import Res
from app.db.database import get_db
from app.dto.prompt_dto import PromptCreateDTO, PromptResponseDTO
from app.entity.prompt_entity import PromptEntity
from app.services.prompt_service import PromptService
from app.repositories.prompt_repository import PromptRepository
# 初始化 router
router = APIRouter(prefix="/prompts", tags=["Prompts"])
# 依赖注入(实际提示词可用 DI 容器)
def get_service(db: Session = Depends(get_db)) -> PromptService:
repository = PromptRepository(db) # ✅ 传入 db
return PromptService(repository)
@router.post("/", response_model=Res[PromptResponseDTO],
summary="创建提示词",
description="根据提示词信息创建提示词,提示词名称不可重复")
def create_prompt(dto: PromptCreateDTO, service: PromptService = Depends(get_service)):
"""
创建提示词
- dto: 前端 POST JSON 传入参数
- service: Service 层注入
"""
try:
# DTO → Entity
entity = PromptEntity(**dto.__dict__)
# 调用 Service 创建提示词(返回 True/False)
entityRes = service.create_prompt(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = PromptResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"创建失败,可能是不存在该任务或提示词数据不完整")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# 按id查找
@router.get("/{prompt_id}", response_model=Res[PromptResponseDTO],
summary="查询提示词",
description="根据提示词ID查询提示词信息")
def get_prompt(prompt_id: int, service: PromptService = Depends(get_service)):
entity = service.get_prompt(prompt_id)
if entity:
res = PromptResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="提示词不存在")
@router.get("/", response_model=Res[List[PromptResponseDTO]],
summary="查询所有提示词",
description="查询所有提示词信息")
def get_all_prompts(service: PromptService = Depends(get_service)):
entities = service.get_all_prompts()
dtos = [PromptResponseDTO(**e.__dict__) for e in entities]
return Res(data=dtos, code=200, message="查询成功")
# ------------------- 修改提示词 -------------------
@router.put("/{prompt_id}", response_model=Res[PromptCreateDTO],
summary="修改提示词",
description="根据提示词ID修改提示词信息")
def update_prompt(prompt_id: int, dto: PromptCreateDTO, service: PromptService = Depends(get_service)):
# 先根据id进行查找
prompt = service.get_prompt(prompt_id)
if not prompt:
return Res(data=None, code=400, message="提示词不存在")
success = service.update_prompt(prompt_id,dto.dict(exclude_unset=True))
if success:
return Res(data=dto, code=200, message="更新成功")
else:
return Res(data=None, code=400, message="更新失败,可能是不存在该任务或提示词数据不完整")
# ------------------- 删除提示词 -------------------
@router.delete("/{prompt_id}", response_model=Res,
summary="删除提示词",
description="根据提示词ID删除提示词,并且级联删除提示词下所有章节以及内容")
def delete_prompt(prompt_id: int, service: PromptService = Depends(get_service)):
success = service.delete_prompt(prompt_id)
# todo 级联删除提示词所有相关内容,比如提示词下所有章节以及内容
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或提示词不存在")
# 获取所有的任务列表
@router.get("/tasks/all", response_model=Res[List[str]])
def get_all_tasks(service: PromptService = Depends(get_service)):
tasks = service.get_all_tasks()
return Res(data=tasks, code=200, message="查询成功")
# 根据任务列表获取对应的提示词
@router.get("/tasks/by", response_model=Res[List[PromptResponseDTO]])
def get_prompt_by_task(task: TaskEnum, service: PromptService = Depends(get_service)):
prompts = service.get_prompt_by_task(task.value) # 取枚举的值
dtos = [PromptResponseDTO(**e.__dict__) for e in prompts]
return Res(data=dtos, code=200, message="查询成功")
# 测试供应商
# @router.post("/test", response_model=Res)
# def test_prompt(dto: PromptCreateDTO, service: PromptService = Depends(get_service)):
# """
# 测试供应商
# """
# entity = PromptEntity(**dto.__dict__)
# success = service.test_prompt(entity)
# if success:
# return Res(data=None, code=200, message="测试成功")
# else:
# return Res(data=None, code=400, message="测试失败")
================================================
FILE: SonicVale/app/routers/role_router.py
================================================
from typing import List
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session
from app.core.response import Res
from app.db.database import get_db
from app.dto.role_dto import RoleResponseDTO, RoleCreateDTO
from app.entity.role_entity import RoleEntity
from app.repositories.line_repository import LineRepository
from app.repositories.project_repository import ProjectRepository
from app.repositories.role_repository import RoleRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.repositories.llm_provider_repository import LLMProviderRepository
from app.services.line_service import LineService
from app.services.project_service import ProjectService
from app.services.role_service import RoleService
router = APIRouter(prefix="/roles", tags=["Roles"])
# 依赖注入(实际项目可用 DI 容器)
def get_role_service(db: Session = Depends(get_db)) -> RoleService:
repository = RoleRepository(db)
return RoleService(repository)
def get_project_service(db: Session = Depends(get_db)) -> ProjectService:
repository = ProjectRepository(db)
return ProjectService(repository)
def get_line_service(db: Session = Depends(get_db)) -> LineService:
repository = LineRepository(db)
role_repository = RoleRepository(db)
tts_provider_repository = TTSProviderRepository(db)
llm_provider_repository = LLMProviderRepository(db)
return LineService(repository, role_repository, tts_provider_repository, llm_provider_repository)
@router.post("", response_model=Res[RoleResponseDTO],
summary="创建角色",
description="根据项目ID创建角色,角色名称在同一项目下不可重复" )
def create_role(dto: RoleCreateDTO, role_service: RoleService = Depends(get_role_service),
project_service: ProjectService = Depends(get_project_service)):
"""创建角色"""
try:
# DTO → Entity
entity = RoleEntity(**dto.__dict__)
# 判断project_id是否存在
project = project_service.get_project(dto.project_id)
if project is None:
return Res(data=None, code=400, message=f"项目 '{dto.project_id}' 不存在")
# 调用 Service 创建项目(返回 True/False)
entityRes = role_service.create_role(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = RoleResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"角色 '{entity.name}' 已存在")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
@router.get("/{role_id}", response_model=Res[RoleResponseDTO],
summary="查询角色",
description="根据角色id查询角色信息")
def get_role(role_id: int, role_service: RoleService = Depends(get_role_service)):
entity = role_service.get_role(role_id)
if entity:
res = RoleResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="项目不存在")
@router.get("/project/{project_id}", response_model=Res[List[RoleResponseDTO]],
summary="查询项目下的所有角色",
description="根据项目id查询项目下的所有角色信息")
def get_all_roles(project_id: int, role_service: RoleService = Depends(get_role_service)):
entities = role_service.get_all_roles(project_id)
if entities:
res = [RoleResponseDTO(**e.__dict__) for e in entities]
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=[], code=404, message="项目不存在角色")
# 修改,传入的参数是id
@router.put("/{role_id}", response_model=Res[RoleCreateDTO],
summary="修改角色信息",
description="根据角色id修改角色信息,并且不能修改项目id")
def update_role(role_id: int, dto: RoleCreateDTO, role_service: RoleService = Depends(get_role_service)):
role = role_service.get_role(role_id)
if role is None:
return Res(data=None, code=404, message="角色不存在")
res = role_service.update_role(role_id, dto.dict(exclude_unset=True))
if res:
return Res(data=dto, code=200, message="修改成功")
else:
return Res(data=None, code=400, message="修改失败")
# 根据id,删除
@router.delete("/{role_id}", response_model=Res,
summary="删除角色",
description="根据角色id删除角色信息")
def delete_role(role_id: int, role_service: RoleService = Depends(get_role_service),line_service: LineService = Depends(get_line_service)):
success = role_service.delete_role(role_id)
if success:
# 获取改角色下所有的台词
line_service.clear_role_id(role_id)
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或角色不存在")
# 根据内容进行解析
================================================
FILE: SonicVale/app/routers/strength_router.py
================================================
from typing import List
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session
from app.core.response import Res
from app.db.database import get_db
from app.dto.strength_dto import StrengthResponseDTO, StrengthCreateDTO
from app.entity.strength_entity import StrengthEntity
from app.repositories.strength_repository import StrengthRepository
from app.services.strength_service import StrengthService
router = APIRouter(prefix="/strengths", tags=["Strengths"])
# 依赖注入(实际项目可用 DI 容器)
def get_strength_service(db: Session = Depends(get_db)) -> StrengthService:
repository = StrengthRepository(db)
return StrengthService(repository)
@router.post("", response_model=Res[StrengthResponseDTO],
summary="创建情绪强弱枚举",
description="根据项目ID创建情绪强弱枚举,情绪强弱枚举名称在同一项目下不可重复" )
def create_strength(dto: StrengthCreateDTO, strength_service: StrengthService = Depends(get_strength_service)):
"""创建情绪强弱枚举"""
try:
# DTO → Entity
entity = StrengthEntity(**dto.__dict__)
# 调用 Service 创建项目(返回 True/False)
entityRes = strength_service.create_strength(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = StrengthResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"情绪强弱枚举 '{entity.name}' 已存在")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
@router.get("/{strength_id}", response_model=Res[StrengthResponseDTO],
summary="查询情绪强弱枚举",
description="根据情绪强弱枚举id查询情绪强弱枚举信息")
def get_strength(strength_id: int, strength_service: StrengthService = Depends(get_strength_service)):
entity = strength_service.get_strength(strength_id)
if entity:
res = StrengthResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="情绪强弱枚举不存在")
@router.get("", response_model=Res[List[StrengthResponseDTO]],
summary="查询所有情绪强弱枚举",
description="根据所有情绪强弱枚举信息")
def get_all_strengths(strength_service: StrengthService = Depends(get_strength_service)):
entities = strength_service.get_all_strengths()
if entities:
res = [StrengthResponseDTO(**e.__dict__) for e in entities]
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=[], code=404, message="项目不存在情绪强弱枚举")
# 修改,传入的参数是id
@router.put("/{strength_id}", response_model=Res[StrengthCreateDTO],
summary="修改情绪强弱枚举信息",
description="根据情绪强弱枚举id修改情绪强弱枚举信息,并且不能修改项目id")
def update_strength(strength_id: int, dto: StrengthCreateDTO, strength_service: StrengthService = Depends(get_strength_service)):
strength = strength_service.get_strength(strength_id)
if strength is None:
return Res(data=None, code=404, message="情绪强弱枚举不存在")
res = strength_service.update_strength(strength_id, dto.dict(exclude_unset=True))
if res:
return Res(data=dto, code=200, message="修改成功")
else:
return Res(data=None, code=400, message="修改失败,情绪强弱枚举名已存在")
# 根据id,删除,不开放
@router.delete("/{strength_id}", response_model=Res,
summary="删除情绪强弱枚举",
description="根据情绪强弱枚举id删除情绪强弱枚举信息")
def delete_strength(strength_id: int, strength_service: StrengthService = Depends(get_strength_service)):
success = strength_service.delete_strength(strength_id)
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或情绪强弱枚举不存在")
================================================
FILE: SonicVale/app/routers/tts_provider_router.py
================================================
from fastapi import APIRouter, Depends, HTTPException
from typing import List
from sqlalchemy.orm import Session
from app.core.response import Res
from app.db.database import get_db
from app.dto.tts_provider_dto import TTSProviderCreateDTO, TTSProviderResponseDTO
from app.entity.tts_provider_entity import TTSProviderEntity
from app.services.tts_provider_service import TTSProviderService
from app.repositories.tts_provider_repository import TTSProviderRepository
# 初始化 router
router = APIRouter(prefix="/tts_providers", tags=["TTSProviders"])
# 依赖注入(实际TTS供应商可用 DI 容器)
def get_service(db: Session = Depends(get_db)) -> TTSProviderService:
repository = TTSProviderRepository(db) # ✅ 传入 db
return TTSProviderService(repository)
# 按id查找
@router.get("/{tts_provider_id}", response_model=Res[TTSProviderResponseDTO],
summary="查询TTS供应商",
description="根据TTS供应商ID查询TTS供应商信息")
def get_tts_provider(tts_provider_id: int, service: TTSProviderService = Depends(get_service)):
entity = service.get_tts_provider(tts_provider_id)
if entity:
res = TTSProviderResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="TTS供应商不存在")
@router.get("/", response_model=Res[List[TTSProviderResponseDTO]],
summary="查询所有TTS供应商",
description="查询所有TTS供应商信息")
def get_all_tts_providers(service: TTSProviderService = Depends(get_service)):
entities = service.get_all_tts_providers()
dtos = [TTSProviderResponseDTO(**e.__dict__) for e in entities]
return Res(data=dtos, code=200, message="查询成功")
# ------------------- 修改TTS供应商 -------------------
@router.put("/{tts_provider_id}", response_model=Res[TTSProviderCreateDTO],
summary="修改TTS供应商",
description="根据TTS供应商ID修改TTS供应商信息")
def update_tts_provider(tts_provider_id: int, dto: TTSProviderCreateDTO, service: TTSProviderService = Depends(get_service)):
# 先根据id进行查找
tts_provider = service.get_tts_provider(tts_provider_id)
if not tts_provider:
return Res(data=None, code=400, message="TTS供应商不存在")
success = service.update_tts_provider(tts_provider_id,dto.dict(exclude_unset=True))
if success:
return Res(data=dto, code=200, message="更新成功")
else:
return Res(data=None, code=400, message="更新失败")
# 测试tts是否正常
@router.post("/test", response_model=Res)
def test_tts_provider(dto: TTSProviderCreateDTO, service: TTSProviderService = Depends(get_service)):
"""
测试tts是否正常
"""
entity = TTSProviderEntity(**dto.dict())
success = service.test_tts_provider(entity)
if success:
return Res(data=None, code=200, message="测试成功")
else:
return Res(data=None, code=400, message="测试失败")
# ------------------- 删除TTS供应商 -------------------
# @router.delete("/{tts_provider_id}", response_model=Res,
# summary="删除TTS供应商",
# description="根据TTS供应商ID删除TTS供应商,并且级联删除TTS供应商下所有章节以及内容")
# def delete_tts_provider(tts_provider_id: int, service: TTSProviderService = Depends(get_service)):
# success = service.delete_tts_provider(tts_provider_id)
# # todo 级联删除TTS供应商所有相关内容,比如TTS供应商下所有章节以及内容
# if success:
# return Res(data=None, code=200, message="删除成功")
# else:
# return Res(data=None, code=400, message="删除失败或TTS供应商不存在")
================================================
FILE: SonicVale/app/routers/voice_router.py
================================================
from typing import List
from fastapi import APIRouter, Depends, HTTPException
from fastapi.responses import FileResponse
from sqlalchemy.orm import Session
from app.core.response import Res
from app.db.database import get_db
from app.dto.tts_provider_dto import TTSProviderResponseDTO
from app.dto.voice_dto import VoiceResponseDTO, VoiceCreateDTO, VoiceExportDTO, VoiceImportDTO, VoiceImportResultDTO, VoiceAudioProcessDTO, VoiceCopyDTO
from app.entity.voice_entity import VoiceEntity
from app.repositories.multi_emotion_voice_repository import MultiEmotionVoiceRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.repositories.voice_repository import VoiceRepository
from app.services.tts_provider_service import TTSProviderService
from app.services.voice_service import VoiceService
router = APIRouter(prefix="/voices", tags=["Voices"])
# 依赖注入(实际项目可用 DI 容器)
def get_voice_service(db: Session = Depends(get_db)) -> VoiceService:
repository = VoiceRepository(db)
multi_emotion_voice_repository = MultiEmotionVoiceRepository(db)
return VoiceService(repository, multi_emotion_voice_repository)
def get_tts_provider_service(db: Session = Depends(get_db)) -> TTSProviderService:
repository = TTSProviderRepository(db)
return TTSProviderService(repository)
# ====== 静态路由放在动态路由之前,避免路径冲突 ======
@router.post("/process-audio", response_model=Res[str],
summary="处理音色参考音频",
description="对音色的参考音频进行处理(变速、音量、裁剪等)")
def process_voice_audio(dto: VoiceAudioProcessDTO, voice_service: VoiceService = Depends(get_voice_service)):
"""处理音色参考音频"""
try:
result = voice_service.process_audio(dto)
if result:
return Res(data=dto.audio_path, code=200, message="处理成功")
else:
return Res(data=None, code=400, message="处理失败")
except FileNotFoundError as e:
return Res(data=None, code=404, message=f"音频文件不存在: {str(e)}")
except Exception as e:
return Res(data=None, code=500, message=f"处理失败: {str(e)}")
@router.post("/export", response_model=Res[str],
summary="导出音色库",
description="将指定TTS供应商下的音色打包到zip文件(可选传ids仅导出选中)")
def export_voices(dto: VoiceExportDTO, voice_service: VoiceService = Depends(get_voice_service)):
"""导出音色库到zip文件"""
try:
result = voice_service.export_voices(dto.tts_provider_id, dto.export_path, dto.ids)
if result:
return Res(data=result, code=200, message="导出成功")
else:
return Res(data=None, code=400, message="没有可导出的音色")
except Exception as e:
return Res(data=None, code=500, message=f"导出失败: {str(e)}")
@router.post("/import", response_model=Res[VoiceImportResultDTO],
summary="导入音色库",
description="从zip文件导入音色库,将音频文件复制到指定目录,已存在的音色会跳过")
def import_voices(dto: VoiceImportDTO, voice_service: VoiceService = Depends(get_voice_service)):
"""从zip文件导入音色库"""
try:
success_count, skipped_count, skipped_names = voice_service.import_voices(
dto.tts_provider_id, dto.zip_path, dto.target_dir
)
result = VoiceImportResultDTO(
success_count=success_count,
skipped_count=skipped_count,
skipped_names=skipped_names
)
return Res(data=result, code=200, message=f"导入完成:成功{success_count}个,跳过{skipped_count}个")
except FileNotFoundError as e:
return Res(data=None, code=404, message=str(e))
except ValueError as e:
return Res(data=None, code=400, message=str(e))
except Exception as e:
return Res(data=None, code=500, message=f"导入失败: {str(e)}")
@router.post("/copy", response_model=Res[VoiceResponseDTO],
summary="复制音色",
description="复制现有音色,包括音频文件,生成新的音色记录")
def copy_voice(dto: VoiceCopyDTO, voice_service: VoiceService = Depends(get_voice_service)):
"""复制音色"""
try:
new_voice = voice_service.copy_voice(
dto.source_voice_id, dto.new_name, dto.target_dir
)
res = VoiceResponseDTO(**new_voice.__dict__)
return Res(data=res, code=200, message="复制成功")
except ValueError as e:
return Res(data=None, code=400, message=str(e))
except Exception as e:
return Res(data=None, code=500, message=f"复制失败: {str(e)}")
@router.get("/tts/{tts_provider_id}", response_model=Res[List[VoiceResponseDTO]],
summary="查询tts供应商下的所有音色",
description="根据tts供应商id,查询tts供应商下的所有音色信息")
def get_all_voices(tts_provider_id: int, voice_service: VoiceService = Depends(get_voice_service)):
entities = voice_service.get_all_voices(tts_provider_id)
if entities:
res = [VoiceResponseDTO(**e.__dict__) for e in entities]
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=[], code=404, message="项目不存在音色")
@router.post("", response_model=Res[VoiceResponseDTO],
summary="创建音色",
description="根据项目ID创建音色,音色名称在同一项目下不可重复" )
def create_voice(dto: VoiceCreateDTO, voice_service: VoiceService = Depends(get_voice_service),
tts_provider_service: TTSProviderService = Depends(get_tts_provider_service)):
"""创建音色"""
try:
# DTO → Entity
entity = VoiceEntity(**dto.__dict__)
# 判断tts_id是否存在
tts_provider = tts_provider_service.get_tts_provider(dto.tts_provider_id)
if tts_provider is None:
return Res(data=None, code=400, message=f"tts服务提供商 '{dto.tts_provider_id}' 不存在")
# 调用 Service 创建项目(返回 True/False)
entityRes = voice_service.create_voice(entity)
# 返回统一 Response
if entityRes is not None:
# 创建成功,可以返回 DTO 或者部分字段
res = VoiceResponseDTO(**entityRes.__dict__)
return Res(data=res, code=200, message="创建成功")
else:
return Res(data=None, code=400, message=f"音色 '{entity.name}' 已存在")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# ====== 动态路由放在最后 ======
@router.get("/{voice_id}", response_model=Res[VoiceResponseDTO],
summary="查询音色",
description="根据音色id查询音色信息")
def get_voice(voice_id: int, voice_service: VoiceService = Depends(get_voice_service)):
entity = voice_service.get_voice(voice_id)
if entity:
res = VoiceResponseDTO(**entity.__dict__)
return Res(data=res, code=200, message="查询成功")
else:
return Res(data=None, code=404, message="项目不存在")
# 修改,传入的参数是id
@router.put("/{voice_id}", response_model=Res[VoiceCreateDTO],
summary="修改音色信息",
description="根据音色id修改音色信息,并且不能修改项目id")
def update_voice(voice_id: int, dto: VoiceCreateDTO, voice_service: VoiceService = Depends(get_voice_service)):
voice = voice_service.get_voice(voice_id)
if voice is None:
return Res(data=None, code=404, message="音色不存在")
res = voice_service.update_voice(voice_id, dto.dict())
if res:
return Res(data=dto, code=200, message="修改成功")
else:
return Res(data=None, code=400, message="修改失败")
# 根据 id,删除
@router.delete("/{voice_id}", response_model=Res,
summary="删除音色",
description="根据音色id删除音色信息")
def delete_voice(voice_id: int, voice_service: VoiceService = Depends(get_voice_service)):
success = voice_service.delete_voice(voice_id)
if success:
return Res(data=None, code=200, message="删除成功")
else:
return Res(data=None, code=400, message="删除失败或音色不存在")
# tts_provider的查询和修改
# @router.get("/tts/provider/{tts_provider_id}", response_model=Res[TTSProviderResponseDTO])
# def get_tts_provider(tts_provider_id: int, tts_provider_service: TTSProviderService = Depends(get_tts_provider_service)):
# tts_provider = tts_provider_service.get_tts_provider(tts_provider_id)
# if tts_provider:
# res = TTSProviderResponseDTO(**tts_provider.__dict__)
# return Res(data=res, code=200, message="查询成功")
# else:
# return Res(data=None, code=404, message="tts服务提供商不存在")
#
# # tts_provider的修改
# @router.put("/tts/provider/{tts_provider_id}", response_model=Res[TTSProviderResponseDTO])
# def update_tts_provider(tts_provider_id: int, dto: TTSProviderResponseDTO, tts_provider_service: TTSProviderService = Depends(get_tts_provider_service)):
# # 先判断是否存在
# tts_provider = tts_provider_service.get_tts_provider(tts_provider_id)
# if tts_provider is None:
# return Res(data=None, code=404, message="tts服务提供商不存在")
# tts_provider = tts_provider_service.update_tts_provider(tts_provider_id, dto.dict(exclude_unset=True))
# if tts_provider:
# return Res(data=None, code=200, message="修改成功")
# else:
# return Res(data=None, code=400, message="修改失败")
================================================
FILE: SonicVale/app/services/chapter_service.py
================================================
import json
import logging
import os
import re
import shutil
import threading
from collections import defaultdict
from typing import List
from sqlalchemy import Sequence
from app.core.config import getConfigPath
from app.core.text_correct_engine import TextCorrectorFinal
from app.core.tts_engine import TTSEngine
from app.db.database import SessionLocal
from app.dto.line_dto import LineInitDTO
from app.entity.chapter_entity import ChapterEntity
from app.entity.line_entity import LineEntity
from app.models.po import ChapterPO, RolePO, LinePO
from app.repositories.chapter_repository import ChapterRepository
from app.repositories.line_repository import LineRepository
from app.core.prompts import get_context2lines_prompt, get_add_smart_role_and_voice
from app.repositories.llm_provider_repository import LLMProviderRepository
from app.repositories.project_repository import ProjectRepository
from app.repositories.role_repository import RoleRepository
from app.core.llm_engine import LLMEngine
from app.repositories.voice_repository import VoiceRepository
class ChapterService:
def __init__(self, repository: ChapterRepository):
"""注入 repository"""
self.repository = repository
def create_chapter(self, entity: ChapterEntity):
"""创建新章节
- 检查同名章节是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
chapter = self.repository.get_by_name(entity.title, entity.project_id)
if chapter:
logging.info("同名章节已存在")
return None
# 手动将entity转化为po
po = ChapterPO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = ChapterEntity(**data)
# 将po转化为entity
return entity
def get_chapter(self, chapter_id: int) -> ChapterEntity | None:
"""根据 ID 查询章节"""
po = self.repository.get_by_id(chapter_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = ChapterEntity(**data)
return res
def get_all_chapters(self,project_id: int) -> Sequence[ChapterEntity]:
"""获取所有章节列表"""
pos = self.repository.get_all(project_id)
# pos -> entities
entities = [
ChapterEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_chapter(self, chapter_id: int, data:dict) -> bool:
"""更新章节
- 可以只更新部分字段
- 检查同名冲突
- 检查project_id不能改变
"""
title = data["title"]
project_id = data["project_id"]
if self.repository.get_by_name(title, project_id) and self.repository.get_by_name(title,project_id).id != chapter_id:
return False
po = self.repository.get_by_id(chapter_id)
# 防止改变project_id
if po.project_id != project_id:
return False
self.repository.update(chapter_id, data)
return True
def delete_chapter(self, chapter_id: int) -> bool:
"""删除章节
"""
db = SessionLocal()
try :
chapter = self.repository.get_by_id(chapter_id)
# 移除资源内容
# 删除该路径所有内容
project_repository = ProjectRepository(db)
project = project_repository.get_by_id(chapter.project_id)
chapter_path = os.path.join(project.project_root_path, str(chapter.project_id), str(chapter_id))
if os.path.exists(chapter_path):
shutil.rmtree(chapter_path) # 删除整个文件夹及其所有内容
logging.info("已删除目录及内容: %s", chapter_path)
else:
logging.info("目录不存在: %s", chapter_path)
# 先删除资源,再删除记录
res = self.repository.delete(chapter_id)
# 删除章节下所有台词
line_repository = LineRepository(db)
line_res = line_repository.delete_all_by_chapter_id(chapter_id)
finally:
db.close()
return res
# 先获取章节内容
def split_text(self, chapter_id: int, max_length: int = 1500) -> List[str]:
"""
将文本按标点/换行断句,并按最大长度分组,确保每段以标点结束。
支持中英文标点和换行符。
"""
content = self.get_chapter(chapter_id).text_content
# 去掉空行
content = "\n".join([line for line in content.split("\n") if line.strip()])
# 如果最后没有句号/问号/感叹号/点号,自动补一个句号
if not re.search(r'[。!?.!?]$', content):
content += "。"
# 使用正则分割,支持中英文标点 + 逗号 + 换行
# [] 里列出所有可能的结束符号
sentences = re.findall(r'[^。!?.!?,,\n]*[。!?.!?,,\n]', content, re.MULTILINE | re.DOTALL)
chunks = []
buffer = ""
for sentence in sentences:
if len(buffer) + len(sentence) <= max_length:
buffer += sentence
else:
if buffer:
chunks.append(buffer.strip())
buffer = sentence
if buffer:
chunks.append(buffer.strip())
return chunks
# 然后进行划分
# 然后循环解析,并保存
def fill_prompt(self,template: str, characters: list[str], emotions: list[str], strengths: list[str],
novel_content: str) -> str:
result = template
result = result.replace("{possible_characters}", ", ".join(characters))
result = result.replace("{possible_emotions}", ", ".join(emotions))
result = result.replace("{possible_strengths}", ", ".join(strengths))
result = result.replace("{novel_content}", novel_content)
return result
def para_content(self, prompt:str,chapter_id: int,content: str = None,role_names: List[str] = None,emotion_names: List[str] = None,strength_names: List[str] = None,is_precise_fill: int = 0):
db = SessionLocal()
try :
# 获取content
chapter = self.repository.get_by_id(chapter_id)
# content = chapter.text_content
# 获取角色列表
# role_repository = RoleRepository(db)
# roles = role_repository.get_all(chapter.project_id)
# role_names = [role.name for role in roles]
# 组装prompt
# prompt = get_context2lines_prompt(role_names, content,emotion_names,strength_names)
prompt = self.fill_prompt(prompt, role_names, emotion_names, strength_names, content)
# 获取llm_provider
project_repository = ProjectRepository(db)
project = project_repository.get_by_id(chapter.project_id)
llm_provider_id = project.llm_provider_id
#
llm_provider_repository = LLMProviderRepository(db)
llm_provider = llm_provider_repository.get_by_id(llm_provider_id)
llm = LLMEngine(llm_provider.api_key, llm_provider.api_base_url, project.llm_model, llm_provider.custom_params)
try:
llm.generate_text_test("请输出一份用户信息,严格使用 JSON 格式,不要包含任何额外文字。字段包括:name, age, city")
logging.info("LLM可用")
except Exception as e:
logging.warning("LLM不可用")
return {
"success": False,
"message": f"LLM 不可用: {str(e)}"
}
logging.info("开始内容解析")
try:
result = llm.generate_text(prompt)
# 解析json,并且构造为List[LineInitDTO]
# 解析 JSON 字符串为 Python 对象
parsed_data = llm.save_load_json(result)
if not parsed_data:
return {
"success": False,
"message": "JSON 解析失败或返回空对象",
}
# 验证 parsed_data 是否为有效的字典列表
if not isinstance(parsed_data, list):
logging.error("LLM返回的数据不是列表,实际类型: %s", type(parsed_data))
return {
"success": False,
"message": f"LLM返回的数据格式不正确,期望列表但收到: {type(parsed_data).__name__}",
}
# 验证列表中的每个元素是否为字典
for idx, item in enumerate(parsed_data):
if not isinstance(item, dict):
logging.error("列表第 %d 项不是字典,实际类型: %s, 内容: %s", idx, type(item), str(item)[:100])
return {
"success": False,
"message": f"LLM返回的数据格式不正确,列表第 {idx} 项应为字典但收到: {type(item).__name__}",
}
# 这里进行自动填充
if is_precise_fill == 1:
logging.info("开始自动填充")
corrector = TextCorrectorFinal()
parsed_data = corrector.correct_ai_text(content, parsed_data)
# parsed_data = json.loads(result)
# 构造 List[LineInitDTO]
line_dtos: List[LineInitDTO] = [LineInitDTO(**item) for item in parsed_data]
return {
"success": True,
"data": line_dtos
}
except Exception as e:
logging.exception("调用 LLM 出错: %s", e)
return {
"success": False,
"message": f"调用 LLM 出错: {str(e)}"
}
finally:
db.close()
# 导出指令
# def get_prompt_content(self,project_id, chapter_id,prompt):
# db = SessionLocal()
# try:
# # 获取content
# chapter = self.repository.get_by_id(chapter_id)
# content = chapter.text_content
# # 获取角色列表
# role_repository = RoleRepository(db)
# roles = role_repository.get_all(chapter.project_id)
# role_names = [role.name for role in roles]
# # 组装prompt
# # 获取project
#
# prompt = self.fill_prompt(prompt, role_names, emotion_names, strength_names, content)
# prompt = get_context2lines_prompt(role_names, content)
# return prompt
# finally:
# db.close()
def add_smart_role_and_voice(self,project,content, role_names, voice_names):
# 智能匹配提示词,要写死吗?
db = SessionLocal()
try:
llm_provider_id = project.llm_provider_id
llm_provider_repository = LLMProviderRepository(db)
llm_provider = llm_provider_repository.get_by_id(llm_provider_id)
llm = LLMEngine(llm_provider.api_key, llm_provider.api_base_url, project.llm_model, llm_provider.custom_params)
prompt = get_add_smart_role_and_voice(content,role_names, voice_names)
result = llm.generate_smart_text(prompt)
parse_data = llm.save_load_json(result)
# 获取项目所有音色
voice_repository = VoiceRepository(db)
voices = voice_repository.get_all(project.tts_provider_id)
# map name- id
voice_id_map = {voice.name: voice.id for voice in voices}
# 对角色进行update
role_repository = RoleRepository(db)
res = []
for item in parse_data:
role = role_repository.get_by_name( item["role_name"],project.id)
if role:
if item["voice_name"]:
logging.info("更新角色音色:%s %s", item["role_name"], item["voice_name"])
role_repository.update(role.id, {"default_voice_id": voice_id_map.get(item["voice_name"])})
res.append({"role_name": item["role_name"], "voice_name": item["voice_name"]})
return True,res
except Exception as e:
logging.exception("LLM智能匹配出错: %s", e)
return False, []
finally:
db.close()
================================================
FILE: SonicVale/app/services/emotion_service.py
================================================
from sqlalchemy import Sequence
from app.entity.emotion_entity import EmotionEntity
from app.models.po import EmotionPO
from app.repositories.emotion_repository import EmotionRepository
class EmotionService:
def __init__(self, repository: EmotionRepository):
"""注入 repository"""
self.repository = repository
def create_emotion(self, entity: EmotionEntity):
"""创建新情绪枚举
- 检查同名情绪枚举是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
emotion = self.repository.get_by_name(entity.name)
if emotion:
return None
# 手动将entity转化为po
po = EmotionPO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = EmotionEntity(**data)
# 将po转化为entity
return entity
def get_emotion(self, emotion_id: int) -> EmotionEntity | None:
"""根据 ID 查询情绪枚举"""
po = self.repository.get_by_id(emotion_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = EmotionEntity(**data)
return res
def get_all_emotions(self) -> Sequence[EmotionEntity]:
"""获取所有情绪枚举列表"""
pos = self.repository.get_all()
# pos -> entities
entities = [
EmotionEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_emotion(self, emotion_id: int, data:dict) -> bool:
"""更新情绪枚举
- 可以只更新部分字段
"""
name = data.get("name")
if self.repository.get_by_name(name):
return False
self.repository.update(emotion_id, data)
return True
def delete_emotion(self, emotion_id: int) -> bool:
"""删除情绪枚举
"""
res = self.repository.delete(emotion_id)
return res
def get_emotion_by_name(self, name: str) -> EmotionEntity | None:
"""根据名称查询情绪枚举"""
po = self.repository.get_by_name(name)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = EmotionEntity(**data)
return res
================================================
FILE: SonicVale/app/services/line_service.py
================================================
import contextlib
import hashlib
import logging
import shutil
import subprocess
import sys
import tempfile
import threading
from collections import defaultdict
from typing import List
from openpyxl import Workbook
from sqlalchemy import Sequence
from app.core.audio_engin import AudioProcessor
from app.core.config import getConfigPath, getFfmpegPath
from app.core.subtitle import subtitle_engine
from app.core.tts_engine import TTSEngine
from app.dto.line_dto import LineCreateDTO, LineOrderDTO, LineAudioProcessDTO
from app.entity.line_entity import LineEntity
from app.models.po import LinePO, RolePO
from app.repositories.line_repository import LineRepository
from app.repositories.role_repository import RoleRepository
from app.repositories.tts_provider_repository import TTSProviderRepository
from app.repositories.llm_provider_repository import LLMProviderRepository
from app.core.llm_engine import LLMEngine
import os
import numpy as np
import soundfile as sf
def _lock_key(path: str) -> str:
return hashlib.md5(path.encode("utf-8")).hexdigest()
_file_locks = defaultdict(threading.Lock)
class LineService:
def __init__(self, repository: LineRepository,role_repository: RoleRepository,tts_provider_repository: TTSProviderRepository, llm_provider_repository: LLMProviderRepository = None):
"""注入 repository"""
self.tts_provider_repository = tts_provider_repository
self.llm_provider_repository = llm_provider_repository
self.role_repository = role_repository
self.repository = repository
def create_line(self, entity: LineEntity):
"""创建新台词
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
# 手动将entity转化为po
po = LinePO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = LineEntity(**data)
# 将po转化为entity
return entity
def get_line(self, line_id: int) -> LineEntity | None:
"""根据 ID 查询台词"""
po = self.repository.get_by_id(line_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = LineEntity(**data)
return res
def get_all_lines(self,chapter_id: int) -> Sequence[LineEntity]:
"""获取所有台词列表"""
pos = self.repository.get_all(chapter_id)
# pos -> entities
entities = [
LineEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def delete_line(self, line_id: int) -> bool:
"""删除台词
"""
# 还要把audio_path删除
po = self.repository.get_by_id(line_id)
if po and po.audio_path:
with contextlib.suppress(FileNotFoundError):
os.remove(po.audio_path)
res = self.repository.delete(line_id)
return res
# 删除章节下所有台词
def delete_all_lines(self, chapter_id: int) -> bool:
"""删除章节下所有台词
"""
# 要移除所有的音频资源
for line in self.get_all_lines(chapter_id):
if line and line.audio_path:
with contextlib.suppress(FileNotFoundError):
os.remove(line.audio_path)
return self.repository.delete_all_by_chapter_id(chapter_id)
# 单个台词新增
def add_new_line(self, line: LineCreateDTO,project_id,chapter_id,index,emotions_dict, strengths_dict,audio_path):
# 先判断角色是否存在
role = self.role_repository.get_by_name(line.role_name,project_id)
if role is None:
# 新增角色
role = self.role_repository.create(RolePO(name=line.role_name, project_id=project_id))
# 获取情绪id
emotion_id = emotions_dict.get(line.emotion_name)
# 获取强度id
strength_id = strengths_dict.get(line.strength_name)
res = self.repository.create(LinePO(text_content=line.text_content, role_id=role.id,
chapter_id=chapter_id,line_order = index+1,emotion_id=emotion_id,strength_id=strength_id))
# 新增台词,这里搞个audio_path
# audio_path = os.path.join(getConfigPath(), str(project_id), str(chapter_id), "audio")
# os.makedirs(audio_path, exist_ok=True)
res_path = os.path.join(audio_path, "id_"+str(res.id) + ".wav")
self.repository.update(res.id, {"audio_path": res_path})
def update_init_lines(self, lines: list, project_id: object, chapter_id: object,emotions_dict, strengths_dict,audio_path) -> None:
for index, line in enumerate(lines):
self.add_new_line(line,project_id,chapter_id,index,emotions_dict, strengths_dict,audio_path)
# 获取章节下所有台词
# 更新line
def update_line(self, line_id: int, data: dict) -> bool:
po = self.repository.get_by_id(line_id)
if po is None:
return False
res = self.repository.update(line_id, data)
if res is None:
return False
return True
# 生成音频(服务器和本地两种方式)
def generate_audio(self, reference_path: str,tts_provider_id,content,emo_text:str,emo_vector:list[float],save_path= None):
#
tts_provider = self.tts_provider_repository.get_by_id(tts_provider_id)
if tts_provider is None:
raise Exception(f"TTS服务提供商不存在(ID: {tts_provider_id})")
if not tts_provider.api_base_url:
raise Exception("TTS服务地址未配置,请先在配置中心设置TTS服务")
tts_engine = TTSEngine(tts_provider.api_base_url)
# 检查参考音频路径是否有效
if not reference_path:
raise Exception("参考音频路径未设置,请检查角色音色配置")
key = _lock_key(reference_path)
lock = _file_locks[key]
with lock:
try:
audio_exists = tts_engine.check_audio_exists(reference_path)
except Exception as e:
raise Exception(f"检查参考音频失败: {str(e)}")
if not audio_exists:
# 检查本地文件是否存在
if not os.path.isfile(reference_path):
raise Exception(f"参考音频文件不存在: {reference_path}")
upload_result = tts_engine.upload_audio(reference_path, reference_path)
if upload_result.get('code') and upload_result.get('code') != 200:
raise Exception(f"上传参考音频失败: {upload_result.get('msg', '未知错误')}")
# 合成音频
return tts_engine.synthesize(content, reference_path, emo_text, emo_vector, save_path)
# 将角色role_id下所有台词的role_id都置位空
def clear_role_id(self, role_id: int):
# 先获取role_id下所有台词实体
pos = self.repository.get_lines_by_role_id(role_id)
for po in pos:
self.repository.update(po.id, {"role_id": None})
def batch_update_line_order(self,line_orders:List[LineOrderDTO]):
for line_order in line_orders:
self.update_line(line_order.id,{"line_order":line_order.line_order})
return True
def update_audio_path(self, id, dto) -> bool:
try:
po = self.get_line(id)
old_path = po.audio_path
new_path = dto.audio_path
if not old_path:
return False # 原始路径为空
if not os.path.exists(old_path):
return False # 原始文件不存在
if os.path.exists(new_path):
return False # 目标文件已存在,避免覆盖
# 确保目标目录存在
os.makedirs(os.path.dirname(new_path), exist_ok=True)
# 重命名文件
shutil.move(old_path, new_path)
# 更新数据库
self.update_line(id, {"audio_path": new_path})
return True
except Exception as e:
logging.exception("[update_audio_path] 失败: %s", e)
return False
def process_audio_ffmpeg(
self,
audio_path: str,
speed: float = 1.0,
volume: float = 1.0,
start_ms: int | None = None,
end_ms: int | None = None,
out_path: str | None = None,
keep_format: bool = True, # 是否保持原文件采样率/声道
default_sr: int = 44100,
default_ch: int = 2
):
"""
使用 ffmpeg 对音频进行变速 (0.5~2.0)、音量调整、可选裁剪。
输出 WAV PCM16。
如果 keep_format=True,则保持输入文件的 sr/ch 不变。
"""
ffmpeg_path = getFfmpegPath()
if not os.path.exists(audio_path):
raise FileNotFoundError(audio_path)
# 获取原始参数
info = sf.info(audio_path)
target_sr = info.samplerate if keep_format else default_sr
target_ch = info.channels if keep_format else default_ch
# 参数规整
speed = float(np.clip(speed or 1.0, 0.5, 2.0))
volume = 1.0 if volume is None else max(0.0, float(volume))
# 输出路径
target_path = out_path or audio_path
os.makedirs(os.path.dirname(target_path) or ".", exist_ok=True)
with tempfile.NamedTemporaryFile(delete=False, suffix=".wav",
dir=os.path.dirname(target_path) or ".") as tmp:
tmp_path = tmp.name
# 构建 ffmpeg 命令
filter_chain = [f"atempo={speed}"]
if abs(volume - 1.0) > 1e-6:
filter_chain.append(f"volume={volume}")
cmd = [ffmpeg_path, "-y"]
if start_ms is not None:
cmd.extend(["-ss", str(start_ms / 1000)])
cmd.extend(["-i", audio_path])
if end_ms is not None:
cmd.extend(["-to", str(end_ms / 1000)])
cmd.extend([
"-af", ",".join(filter_chain),
"-ar", str(target_sr),
"-ac", str(target_ch),
"-c:a", "pcm_s16le",
tmp_path
])
subprocess.run(cmd, check=True,
creationflags=subprocess.CREATE_NO_WINDOW if sys.platform == "win32" else 0)
# 软限幅:避免 clipping
data, sr = sf.read(tmp_path, dtype="float32", always_2d=True)
peak = float(np.max(np.abs(data)))
if peak > 1.0:
data = data / peak
sf.write(tmp_path, data, sr, format="WAV", subtype="PCM_16")
os.replace(tmp_path, target_path)
return target_path
# 删除区间进行拼接
def process_audio_ffmpeg_cut(
self,
audio_path: str,
speed: float = 1.0,
volume: float = 1.0,
start_ms: int | None = None,
end_ms: int | None = None,
silence_sec: float = 0.0, # 末尾静音时长,单位秒
out_path: str | None = None,
keep_format: bool = True, # 是否保持原文件采样率/声道
default_sr: int = 44100,
default_ch: int = 2
):
"""
使用 ffmpeg 对音频进行变速 (0.5~2.0)、音量调整。
删除 [start_ms, end_ms] 区间,并拼接前后音频。
输出 WAV PCM16。
可在末尾附加 silence_sec 秒静音。
"""
ffmpeg_path = getFfmpegPath()
if not os.path.exists(audio_path):
raise FileNotFoundError(audio_path)
# 获取原始参数
info = sf.info(audio_path)
target_sr = info.samplerate if keep_format else default_sr
target_ch = info.channels if keep_format else default_ch
# 参数规整
speed = float(np.clip(speed or 1.0, 0.5, 2.0))
volume = 1.0 if volume is None else max(0.0, float(volume))
# 输出路径
target_path = out_path or audio_path
os.makedirs(os.path.dirname(target_path) or ".", exist_ok=True)
with tempfile.NamedTemporaryFile(delete=False, suffix=".wav",
dir=os.path.dirname(target_path) or ".") as tmp:
tmp_path = tmp.name
# 构建 ffmpeg 命令
if start_ms is None or end_ms is None or end_ms <= start_ms:
# 无剪切
if silence_sec > 0:
# 添加静音
cmd = [
ffmpeg_path, "-y",
"-i", audio_path,
"-f", "lavfi", "-t", str(silence_sec),
"-i", f"anullsrc=channel_layout={'stereo' if target_ch == 2 else 'mono'}:sample_rate={target_sr}",
"-filter_complex",
f"[0:a]atempo={speed},volume={volume}[main];"
f"[main][1:a]concat=n=2:v=0:a=1[out]",
"-map", "[out]",
"-ar", str(target_sr),
"-ac", str(target_ch),
"-c:a", "pcm_s16le",
tmp_path
]
elif silence_sec < 0:
# 裁掉末尾 abs(silence_sec)
cut_dur = info.duration + silence_sec
if cut_dur <= 0:
cut_dur = 0 # 整段裁掉
cmd = [
ffmpeg_path, "-y",
"-i", audio_path,
"-filter_complex",
f"[0:a]atempo={speed},volume={volume},atrim=0:{cut_dur}[out]",
"-map", "[out]",
"-ar", str(target_sr),
"-ac", str(target_ch),
"-c:a", "pcm_s16le",
tmp_path
]
else:
# 不处理末尾
cmd = [
ffmpeg_path, "-y", "-i", audio_path,
"-af", f"atempo={speed},volume={volume}",
"-ar", str(target_sr),
"-ac", str(target_ch),
"-c:a", "pcm_s16le",
tmp_path
]
else:
# 剪切
start_sec = start_ms / 1000
end_sec = end_ms / 1000
if silence_sec > 0:
# 拼接 + 添加静音
cmd = [
ffmpeg_path, "-y",
"-i", audio_path,
"-f", "lavfi", "-t", str(silence_sec),
"-i", f"anullsrc=channel_layout={'stereo' if target_ch == 2 else 'mono'}:sample_rate={target_sr}",
"-filter_complex",
f"[0:a]atrim=0:{start_sec},asetpts=PTS-STARTPTS[first];"
f"[0:a]atrim={end_sec},asetpts=PTS-STARTPTS[second];"
f"[first][second]concat=n=2:v=0:a=1,atempo={speed},volume={volume}[main];"
f"[main][1:a]concat=n=2:v=0:a=1[out]",
"-map", "[out]",
"-ar", str(target_sr),
"-ac", str(target_ch),
"-c:a", "pcm_s16le",
tmp_path
]
elif silence_sec < 0:
# 拼接后再裁掉末尾
cut_dur = info.duration + silence_sec
if cut_dur <= 0:
cut_dur = 0 # 整段裁掉
cmd = [
ffmpeg_path, "-y", "-i", audio_path,
"-filter_complex",
f"[0:a]atrim=0:{start_sec},asetpts=PTS-STARTPTS[first];"
f"[0:a]atrim={end_sec},asetpts=PTS-STARTPTS[second];"
f"[first][second]concat=n=2:v=0:a=1,atempo={speed},volume={volume},atrim=0:{cut_dur}[out]",
"-map", "[out]",
"-ar", str(target_sr),
"-ac", str(target_ch),
"-c:a", "pcm_s16le",
tmp_path
]
else:
# 拼接但不处理末尾
cmd = [
ffmpeg_path, "-y", "-i", audio_path,
"-filter_complex",
f"[0:a]atrim=0:{start_sec},asetpts=PTS-STARTPTS[first];"
f"[0:a]atrim={end_sec},asetpts=PTS-STARTPTS[second];"
f"[first][second]concat=n=2:v=0:a=1,atempo={speed},volume={volume}[out]",
"-map", "[out]",
"-ar", str(target_sr),
"-ac", str(target_ch),
"-c:a", "pcm_s16le",
tmp_path
]
# 执行 ffmpeg
subprocess.run(
cmd, check=True,
creationflags=subprocess.CREATE_NO_WINDOW if sys.platform == "win32" else 0
)
# 软限幅:避免 clipping
data, sr = sf.read(tmp_path, dtype="float32", always_2d=True)
peak = float(np.max(np.abs(data)))
if peak > 1.0:
data = data / peak
sf.write(tmp_path, data, sr, format="WAV", subtype="PCM_16")
os.replace(tmp_path, target_path)
return target_path
def process_audio(self, line_id, dto:LineAudioProcessDTO):
line = self.get_line(line_id)
if line:
# 读取音频文件
# audio_file =self.process_audio_ffmpeg(line.audio_path, dto.speed, dto.volume,dto.start_ms,dto.end_ms)
# 删除拼接
# audio_file = self.process_audio_ffmpeg_cut(line.audio_path, dto.speed, dto.volume, dto.start_ms, dto.end_ms, dto.tail_silence_sec,dto.current_ms)
processor = AudioProcessor(line.audio_path)
start_ms = dto.start_ms
end_ms = dto.end_ms
speed = dto.speed
volume = dto.volume
current_ms = dto.current_ms
silence_sec = dto.silence_sec
# ---------- (1) 优先裁剪 ----------
if start_ms is not None and end_ms is not None and end_ms > start_ms:
logging.info("裁剪")
processor.cut(start_ms, end_ms)
# ---------- (2) 插入静音 ----------
elif current_ms is not None and silence_sec is not None and silence_sec != 0:
logging.info("插入静音")
processor.insert_silence(current_ms, silence_sec)
# ---------- (3) 末尾静音/裁剪 ----------
elif current_ms is None and silence_sec is not None and silence_sec != 0:
logging.info("末尾静音/裁剪")
processor.append_silence(silence_sec)
# ---------- (4) 音量 + 变速 ----------
if speed != 1.0:
processor.change_speed(speed)
if volume != 1.0:
processor.change_volume(volume)
logging.info("音频处理完成")
return True
else:
return False
# 导出音频,合并音频,并且导出字幕
def concat_wav_files(self,paths, out_path, verify=True, block_frames=262144):
"""
按顺序把若干 WAV 合并到 out_path。
假设:采样率与声道一致(如需更稳,可保留 verify=True 做轻校验)。
"""
assert paths and len(paths) >= 1, "至少提供一个文件路径"
os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
# 以首文件格式为准
info0 = sf.info(paths[0])
sr, ch, subtype = info0.samplerate, info0.channels, info0.subtype or "PCM_16"
# 可选校验
if verify:
for p in paths[1:]:
info = sf.info(p)
if info.samplerate != sr or info.channels != ch:
raise ValueError(
f"格式不一致:{p} (sr={info.samplerate}, ch={info.channels}) vs 首文件 (sr={sr}, ch={ch})")
# 流式写入
with sf.SoundFile(out_path, mode='w', samplerate=sr, channels=ch, format='WAV', subtype=subtype) as fout:
for p in paths:
with sf.SoundFile(p, mode='r') as fin:
if verify and (fin.samplerate != sr or fin.channels != ch):
raise ValueError(f"参数不一致:{p}")
while True:
block = fin.read(block_frames, dtype='float32', always_2d=True)
if len(block) == 0:
break
fout.write(block.astype(np.float32, copy=False))
return out_path
def export_lines_to_excel(self,lines, file_path="all_lines.xlsx"):
# 1) 取出所有数据
# lines = self.repository.get_all(chapter_id)
# 2) 创建 Excel 工作簿
wb = Workbook()
ws = wb.active
ws.title = "Lines"
# 3) 写表头(根据你的数据字段调整)
headers = ["序号","角色", "台词"]
ws.append(headers)
# 4) 写内容
for line in lines:
role = self.role_repository.get_by_id(line.role_id)
role_name = role.name if role else "未知角色"
ws.append([
line.line_order,
role_name,
line.text_content
])
# 5) 保存到文件
wb.save(file_path)
return file_path
def export_audio(self, chapter_id, single=False):
"""导出音频与字幕
Returns:
dict: 包含导出结果的详细信息
- success: bool, 是否成功
- message: str, 错误信息(如果失败)
- audio_path: str, 合并后的音频路径
- subtitle_path: str, 字幕路径
- missing_files: list, 缺失的音频文件列表
"""
try:
# 拿到所有的台词
lines = self.repository.get_all(chapter_id)
if not lines:
return {"success": False, "message": "该章节没有台词"}
# 过滤掉空路径和不存在的文件
valid_lines = []
missing_files = []
for line in lines:
if not line.audio_path:
missing_files.append(f"台词#{line.id}(序号{line.line_order}): 无音频路径")
elif not os.path.exists(line.audio_path):
missing_files.append(f"台词#{line.id}(序号{line.line_order}): 文件不存在 - {line.audio_path}")
else:
valid_lines.append(line)
if not valid_lines:
return {
"success": False,
"message": "没有有效的音频文件可导出",
"missing_files": missing_files
}
paths = [line.audio_path for line in valid_lines]
# 把paths[0]的path去掉后面的文件名,得到文件夹路径
output_dir_path = os.path.join(os.path.dirname(paths[0]), "result")
# 不存在就创建
os.makedirs(output_dir_path, exist_ok=True)
# 放到result目录下
output_path = os.path.join(output_dir_path, "result.wav")
# 合并音频文件
try:
self.concat_wav_files(paths, output_path)
except ValueError as e:
return {
"success": False,
"message": f"音频合并失败: {str(e)}",
"missing_files": missing_files
}
except Exception as e:
logging.exception("[export_audio] concat_wav_files 失败")
return {
"success": False,
"message": f"音频合并异常: {str(e)}",
"missing_files": missing_files
}
# 生成字幕
output_subtitle_path = os.path.join(output_dir_path, "result.srt")
try:
subtitle_engine.generate_subtitle(output_path, output_subtitle_path)
except Exception as e:
logging.exception("[export_audio] 生成整体字幕失败")
# 字幕生成失败不影响音频导出,继续执行
# 生成单条字幕(如果需要)
if single:
subtitle_dir_path = os.path.join(os.path.dirname(paths[0]), "subtitles")
# 先清空这个文件夹
shutil.rmtree(subtitle_dir_path, ignore_errors=True)
os.makedirs(subtitle_dir_path, exist_ok=True)
for line in valid_lines:
try:
path = line.audio_path
base_name = os.path.splitext(os.path.basename(path))[0]
subtitle_path = os.path.join(subtitle_dir_path, base_name + ".srt")
subtitle_engine.generate_subtitle(path, subtitle_path)
# 将subtitle_path写进line.subtitle_path
self.repository.update(line.id, {"subtitle_path": subtitle_path})
except Exception as e:
logging.warning(f"[export_audio] 生成单条字幕失败 line#{line.id}: {e}")
# 单条字幕失败不影响整体导出
# 导出所有数据到Excel
try:
self.export_lines_to_excel(lines, os.path.join(output_dir_path, "all_lines.xlsx"))
except Exception as e:
logging.warning(f"[export_audio] 导出Excel失败: {e}")
# Excel导出失败不影响整体导出
result = {
"success": True,
"audio_path": output_path,
"subtitle_path": output_subtitle_path,
"exported_count": len(valid_lines),
"total_count": len(lines)
}
if missing_files:
result["missing_files"] = missing_files
result["message"] = f"导出成功,但有{len(missing_files)}条台词缺少音频"
return result
except Exception as e:
logging.exception("[export_audio] 未预期的错误")
return {"success": False, "message": f"导出失败: {str(e)}"}
def generate_subtitle(self, line_id, dto):
# 获取台词
line = self.get_line(line_id)
if line:
# 将音频文件路径的后缀改为.srt
dto.subtitle_path = os.path.splitext(dto.subtitle_path)[0] + ".srt"
subtitle_engine.generate_subtitle(line.audio_path,dto.subtitle_path)
return dto.subtitle_path
else:
return None
# 字幕矫正 - 拼音匹配
def correct_subtitle_pinyin(self, text, output_subtitle_path):
"""
使用拼音匹配算法矫正字幕
text: 原始正确文本
output_subtitle_path: 字幕文件路径
"""
subtitle_engine.correct_srt_file(text, output_subtitle_path)
# 字幕矫正 - LLM
def correct_subtitle_llm(self, text, output_subtitle_path, llm_provider_id: int, llm_model: str, batch_size: int = 20):
"""
使用LLM矫正字幕
text: 原始正确文本
output_subtitle_path: 字幕文件路径
llm_provider_id: LLM提供商ID
llm_model: LLM模型名称
batch_size: 分批处理时每批的条数
"""
if not self.llm_provider_repository:
raise Exception("LLM Provider Repository 未配置")
llm_provider = self.llm_provider_repository.get_by_id(llm_provider_id)
if llm_provider is None:
raise Exception(f"LLM服务提供商不存在(ID: {llm_provider_id})")
llm_engine = LLMEngine(
api_key=llm_provider.api_key,
base_url=llm_provider.api_base_url,
model_name=llm_model,
custom_params=llm_provider.custom_params or "{}"
)
subtitle_engine.correct_srt_file_with_llm(
text,
output_subtitle_path,
llm_engine=llm_engine,
batch_size=batch_size
)
# 生成字幕
# def generate_subtitle(self, res_path):
# subtitle_engine.generate_subtitle(res_path,res_path+".srt")
================================================
FILE: SonicVale/app/services/llm_provider_service.py
================================================
import json
import logging
from aiohttp.abc import HTTPException
from sqlalchemy import Sequence
from app.core.llm_engine import LLMEngine
from app.entity.llm_provider_entity import LLMProviderEntity
from app.models.po import LLMProviderPO
from app.repositories.llm_provider_repository import LLMProviderRepository
class LLMProviderService:
def __init__(self, repository: LLMProviderRepository):
"""注入 repository"""
self.repository = repository
def create_llm_provider(self, entity: LLMProviderEntity):
"""创建新LLM供应商
- 检查同名LLM供应商是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
llm_provider = self.repository.get_by_name(entity.name)
if llm_provider:
return None
# 手动将entity转化为po
po = LLMProviderPO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = LLMProviderEntity(**data)
# 将po转化为entity
return entity
def get_llm_provider(self, llm_provider_id: int) -> LLMProviderEntity | None:
"""根据 ID 查询LLM供应商"""
po = self.repository.get_by_id(llm_provider_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = LLMProviderEntity(**data)
return res
def get_all_llm_providers(self) -> Sequence[LLMProviderEntity]:
"""获取所有LLM供应商列表"""
pos = self.repository.get_all()
# pos -> entities
entities = [
LLMProviderEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_llm_provider(self, llm_provider_id: int, data:dict) -> bool:
"""更新LLM供应商
- 可以只更新部分字段
- 检查同名冲突
"""
name = data["name"]
if self.repository.get_by_name(name) and self.repository.get_by_name(name).id != llm_provider_id:
return False
self.repository.update(llm_provider_id, data)
return True
def delete_llm_provider(self, llm_provider_id: int) -> bool:
"""删除LLM供应商
- 可以添加业务校验,例如LLM供应商下有章节是否允许删除
- 后续需要级联删除所有章节内容
"""
res = self.repository.delete(llm_provider_id)
return res
def test_llm_provider(self, entity: LLMProviderEntity):
"""测试LLM供应商"""
# 按逗号划分模型名称
if entity.api_base_url is None or entity.api_key is None or entity.model_list is None:
return False
model_lists = entity.model_list.split(",")
custom_params = entity.custom_params
llm = LLMEngine(entity.api_key, entity.api_base_url, model_lists[0],custom_params)
try:
res = llm.generate_text_test("请输出一份用户信息,严格使用 JSON 格式,不要包含任何额外文字。字段包括:name, age, city")
except Exception as e:
return False,str(e)
logging.info("测试结果为:%s", res)
if res is None:
return False,"LLM 未返回任何内容"
# 7. 校验返回是否为合法 JSON
try:
# res = res.replace("```json",'')
# res = res.replace("```",'')
json.loads(res)
except json.JSONDecodeError:
return False, "LLM 返回的内容不是合法 JSON,请检查模型 / 提示词"
return True,"测试成功"
================================================
FILE: SonicVale/app/services/multi_emotion_voice_service.py
================================================
from sqlalchemy import Sequence
from app.entity.multi_emotion_voice_entity import MultiEmotionVoiceEntity
from app.models.po import MultiEmotionVoicePO
from app.repositories.multi_emotion_voice_repository import MultiEmotionVoiceRepository
class MultiEmotionVoiceService:
def __init__(self, repository: MultiEmotionVoiceRepository):
"""注入 repository"""
self.repository = repository
def create_multi_emotion_voice(self, entity: MultiEmotionVoiceEntity):
"""创建新多情绪音色变体
- 检查同名多情绪音色变体是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
if entity.voice_id is None or entity.emotion_id is None or entity.strength_id is None:
return None
multi_emotion_voice = self.repository.get_by_voice_id_emotion_id_strength_id(entity.voice_id, entity.emotion_id, entity.strength_id)
if multi_emotion_voice:
return None
po = MultiEmotionVoicePO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = MultiEmotionVoiceEntity(**data)
# 将po转化为entity
return entity
# 根据voice_id,emotion_id,strength_id查询多情绪音色变体
def get_multi_emotion_voice_by_voice_id_emotion_id_strength_id(self, voice_id: int, emotion_id: int, strength_id: int) -> MultiEmotionVoiceEntity | None:
"""根据voice_id,emotion_id,strength_id查询多情绪音色变体"""
po = self.repository.get_by_voice_id_emotion_id_strength_id(voice_id, emotion_id, strength_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = MultiEmotionVoiceEntity(**data)
return res
def get_multi_emotion_voice_by_voice_id(self, voice_id: int) -> list[MultiEmotionVoiceEntity] | None:
"""根据 voice_id 查询所有的多情绪音色变体"""
pos = self.repository.get_by_voice_id(voice_id)
entities = [
MultiEmotionVoiceEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def get_multi_emotion_voice_by_id(self, multi_emotion_voice_id: int) -> MultiEmotionVoiceEntity | None:
"""根据 ID 获取多情绪音色变体"""
po = self.repository.get_by_id(multi_emotion_voice_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = MultiEmotionVoiceEntity(**data)
return res
def get_all_multi_emotion_voices(self) -> list[MultiEmotionVoiceEntity]:
"""获取所有多情绪音色变体列表"""
pos = self.repository.get_all()
# pos -> entities
entities = [
MultiEmotionVoiceEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_multi_emotion_voice(self, multi_emotion_voice_id: int, data:dict) -> bool:
"""更新多情绪音色变体
- 可以只更新部分字段
- 检查同名冲突
- 检查project_id不能改变
"""
self.repository.update(multi_emotion_voice_id, data)
return True
def delete_multi_emotion_voice(self, multi_emotion_voice_id: int) -> bool:
"""删除多情绪音色变体
"""
res = self.repository.delete(multi_emotion_voice_id)
return res
# 删除voice下所有多情绪音色变体
def delete_multi_emotion_voice_by_voice_id(self, voice_id: int) -> bool:
"""删除voice下所有多情绪音色变体
"""
res = self.repository.delete_multi_emotion_voice_by_voice_id(voice_id)
return res
================================================
FILE: SonicVale/app/services/project_service.py
================================================
import os
import re
import logging
from sqlalchemy import Sequence
from app.entity.project_entity import ProjectEntity
from app.models.po import ProjectPO
from app.repositories.project_repository import ProjectRepository
class ProjectService:
def __init__(self, repository: ProjectRepository):
"""注入 repository"""
self.repository = repository
def create_project(self, entity: ProjectEntity):
"""创建新项目
- 检查同名项目是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
project = self.repository.get_by_name(entity.name)
if project:
return None, "项目已存在"
# 判断项目根路径是否存在
if not os.path.exists(entity.project_root_path):
logging.info("项目根路径不存在")
return None, "项目根路径不存在"
# 手动将entity转化为po
po = ProjectPO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = ProjectEntity(**data)
# 将po转化为entity
return entity, "创建成功"
def get_project(self, project_id: int) -> ProjectEntity | None:
"""根据 ID 查询项目"""
po = self.repository.get_by_id(project_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = ProjectEntity(**data)
return res
def get_all_projects(self) -> Sequence[ProjectEntity]:
"""获取所有项目列表"""
pos = self.repository.get_all()
# pos -> entities
entities = [
ProjectEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_project(self, project_id: int, data:dict) -> bool:
"""更新项目
- 可以只更新部分字段
- 检查同名冲突
"""
name = data["name"]
if self.repository.get_by_name(name) and self.repository.get_by_name(name).id != project_id:
return False
self.repository.update(project_id, data)
return True
def delete_project(self, project_id: int) -> bool:
"""删除项目
- 可以添加业务校验,例如项目下有章节是否允许删除
- 后续需要级联删除所有章节内容
"""
res = self.repository.delete(project_id)
return res
def search_projects(self, keyword: str) -> Sequence[ProjectEntity]:
"""模糊搜索项目"""
# 解析content,按照章节
def parse_content(self, content):
"""解析内容,按照章节"""
# 正则匹配常见章节格式(支持中英文数字)
chapter_pattern = re.compile(
r'(第[\d一二三四五六七八九十百千]+[章回节部卷].*?)(?=\n|$)'
)
# 找到所有章节标题位置
matches = list(chapter_pattern.finditer(content))
chapters = []
# 如果没找到章节,直接返回整个文本
if not matches:
return chapters
for i, match in enumerate(matches):
start = match.end()
end = matches[i + 1].start() if i + 1 < len(matches) else len(content)
chapter_name = match.group(1).strip()
chapter_content = content[start:end].strip()
chapters.append({
"chapter_name": chapter_name,
"content": chapter_content
})
# 排序
# chapters.sort(key=lambda x: x["chapter_name"])
# 不需要排序了,因为是顺序解析得到的
return chapters
================================================
FILE: SonicVale/app/services/prompt_service.py
================================================
from numba.scripts.generate_lower_listing import description
from sqlalchemy import Sequence
from app.core.enums import TaskEnum
from app.core.llm_engine import LLMEngine
from app.core.prompts import get_prompt_str
from app.entity.prompt_entity import PromptEntity
from app.models.po import PromptPO
from app.repositories.prompt_repository import PromptRepository
class PromptService:
def __init__(self, repository: PromptRepository):
"""注入 repository"""
self.repository = repository
# 拆分台词prompt验证
def validate_prompt_with_DUBBING(self, content: str):
REQUIRED_BLOCKS = [
# ("", "", "{possible_characters}"),
# ("", "", "{possible_emotions}"),
# ("", "", "{possible_strengths}"),
("", "", "{novel_content}"),
]
for start, end, placeholder in REQUIRED_BLOCKS:
if start not in content or end not in content or placeholder not in content:
return False
return True
# 创建默认提示词
def create_default_prompt(self):
task = TaskEnum.DUBBING
name = "默认拆分台词提示词"
description = "默认拆分台词提示词"
content = get_prompt_str()
self.create_prompt(PromptEntity(name=name, description=description, content=content, task=task))
return True
def create_prompt(self, entity: PromptEntity):
"""创建新提示词
- 检查同名提示词是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
prompt = self.repository.get_by_name(entity.name)
if prompt:
return None
# 判断task是否存在于task_enum中
if entity.task not in TaskEnum:
return None
# 验证拆分台词的提示词
if entity.task == TaskEnum.DUBBING:
isValid = self.validate_prompt_with_DUBBING(entity.content)
if not isValid:
return None
# 手动将entity转化为po
po = PromptPO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = PromptEntity(**data)
# 将po转化为entity
return entity
def get_prompt(self, prompt_id: int) -> PromptEntity | None:
"""根据 ID 查询提示词"""
po = self.repository.get_by_id(prompt_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = PromptEntity(**data)
return res
def get_all_prompts(self) -> Sequence[PromptEntity]:
"""获取所有提示词列表"""
pos = self.repository.get_all()
# pos -> entities
entities = [
PromptEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_prompt(self, prompt_id: int, data:dict) -> bool:
"""更新提示词
- 可以只更新部分字段
- 检查同名冲突
"""
name = data["name"]
task = data.get("task")
if self.repository.get_by_name(name) and self.repository.get_by_name(name).id != prompt_id:
return False
# 如果改的是content
if TaskEnum(task) == TaskEnum.DUBBING:
if not self.validate_prompt_with_DUBBING(content=data['content']):
return False
self.repository.update(prompt_id, data)
return True
def delete_prompt(self, prompt_id: int) -> bool:
"""删除提示词
- 可以添加业务校验,例如提示词下有章节是否允许删除
- 后续需要级联删除所有章节内容
"""
res = self.repository.delete(prompt_id)
return res
# 根据task 获取提示词列表
def get_prompt_by_task(self, task: str) -> Sequence[PromptEntity]:
pos = self.repository.get_by_task(task)
entities = [
PromptEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
# 获取所有的stak
def get_all_tasks(self) -> Sequence[str]:
# 这些写死了,后面也在这添加,不改成数据库
return list(TaskEnum)
# def test_prompt(self, entity: PromptEntity):
# """测试提示词"""
# # 按逗号划分模型名称
# if entity.api_base_url is None or entity.api_key is None or entity.model_list is None:
# return False
# model_lists = entity.model_list.split(",")
# llm = LLMEngine(entity.api_key, entity.api_base_url, model_lists[0])
# res = llm.generate_text_test("你好")
# if res is not None:
# return True
# return False
# 根据名字获取提示词
def get_prompt_by_name(self, name: str) -> PromptEntity | None:
"""根据名字获取提示词"""
po = self.repository.get_by_name(name)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = PromptEntity(**data)
return res
================================================
FILE: SonicVale/app/services/role_service.py
================================================
from sqlalchemy import Sequence
from app.entity.role_entity import RoleEntity
from app.models.po import RolePO
from app.repositories.role_repository import RoleRepository
class RoleService:
def __init__(self, repository: RoleRepository):
"""注入 repository"""
self.repository = repository
def create_role(self, entity: RoleEntity):
"""创建新角色
- 检查同名角色是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
role = self.repository.get_by_name(entity.name, entity.project_id)
if role:
return None
# 手动将entity转化为po
po = RolePO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = RoleEntity(**data)
# 将po转化为entity
return entity
def get_role(self, role_id: int) -> RoleEntity | None:
"""根据 ID 查询角色"""
po = self.repository.get_by_id(role_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = RoleEntity(**data)
return res
def get_all_roles(self,project_id: int) -> Sequence[RoleEntity]:
"""获取所有角色列表"""
pos = self.repository.get_all(project_id)
# pos -> entities
entities = [
RoleEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_role(self, role_id: int, data:dict) -> bool:
"""更新角色
- 可以只更新部分字段
- 检查同名冲突
- 检查project_id不能改变
"""
name = data["name"]
project_id = data["project_id"]
if self.repository.get_by_name(name, project_id) and self.repository.get_by_name(name,project_id).id != role_id:
return False
po = self.repository.get_by_id(role_id)
# 防止改变project_id
if po.project_id != project_id:
return False
self.repository.update(role_id, data)
return True
def delete_role(self, role_id: int) -> bool:
"""删除角色
"""
res = self.repository.delete(role_id)
return res
================================================
FILE: SonicVale/app/services/strength_service.py
================================================
from sqlalchemy import Sequence
from app.entity.strength_entity import StrengthEntity
from app.models.po import StrengthPO
from app.repositories.strength_repository import StrengthRepository
class StrengthService:
def __init__(self, repository: StrengthRepository):
"""注入 repository"""
self.repository = repository
def create_strength(self, entity: StrengthEntity):
"""创建新情绪强弱枚举
- 检查同名情绪强弱枚举是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
strength = self.repository.get_by_name(entity.name)
if strength:
return None
# 手动将entity转化为po
po = StrengthPO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = StrengthEntity(**data)
# 将po转化为entity
return entity
def get_strength(self, strength_id: int) -> StrengthEntity | None:
"""根据 ID 查询情绪强弱枚举"""
po = self.repository.get_by_id(strength_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = StrengthEntity(**data)
return res
def get_all_strengths(self) -> Sequence[StrengthEntity]:
"""获取所有情绪强弱枚举列表"""
pos = self.repository.get_all()
# pos -> entities
entities = [
StrengthEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_strength(self, strength_id: int, data:dict) -> bool:
"""更新情绪强弱枚举
- 可以只更新部分字段
"""
name = data.get("name")
if name and self.repository.get_by_name(name):
return False
self.repository.update(strength_id, data)
return True
def delete_strength(self, strength_id: int) -> bool:
"""删除情绪强弱枚举
"""
res = self.repository.delete(strength_id)
return res
def get_strength_by_name(self, name: str) -> StrengthEntity | None:
"""根据名称查询情绪强弱枚举"""
po = self.repository.get_by_name(name)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = StrengthEntity(**data)
return res
================================================
FILE: SonicVale/app/services/tts_provider_service.py
================================================
import requests
import logging
from sqlalchemy import Sequence
from app.entity.tts_provider_entity import TTSProviderEntity
from app.models.po import TTSProviderPO
from app.repositories.tts_provider_repository import TTSProviderRepository
class TTSProviderService:
def __init__(self, repository: TTSProviderRepository):
"""注入 repository"""
self.repository = repository
def get_all_tts_providers(self) -> list[TTSProviderEntity]:
"""查询所有tts供应商"""
pos = self.repository.get_all()
res = [TTSProviderEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")}) for po in pos]
return res
def get_tts_provider(self, tts_provider_id: int) -> TTSProviderEntity | None:
"""根据 ID 查询tts供应商"""
po = self.repository.get_by_id(tts_provider_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = TTSProviderEntity(**data)
return res
def update_tts_provider(self, tts_provider_id: int, data:dict) -> bool:
"""更新tts供应商
- 可以只更新部分字段
- 检查同名冲突
- 检查project_id不能改变
"""
name = data["name"]
if self.repository.get_by_name(name) and self.repository.get_by_name(name).id != tts_provider_id:
return False
self.repository.update(tts_provider_id, data)
return True
def delete_tts_provider(self, tts_provider_id: int) -> bool:
"""删除tts供应商
"""
res = self.repository.delete(tts_provider_id)
return res
def create_default_tts_provider(self):
"""创建默认的tts供应商"""
if self.repository.get_by_name("index_tts") :
return
if self.repository.get_by_id(1) :
return
po = TTSProviderPO(name="index_tts", id=1,status=1, api_base_url="", api_key="")
self.repository.create(po)
def test_tts_provider(self, entity: TTSProviderEntity):
# 拿到url
api_base_url = entity.api_base_url
if not api_base_url:
return False
# ping api
# 调用
try:
resp = requests.get(api_base_url, timeout=5)
# 如果返回 200-399 都认为是通的(有些服务会 302 重定向)
if 200 <= resp.status_code < 400:
try:
data = resp.json()
if "endpoints" in data:
return True
else:
logging.error("TTS provider test failed: 'endpoints' missing in response")
return False
except ValueError:
logging.error("TTS provider test failed: response is not valid JSON")
return False
else:
logging.error("TTS provider test failed: status %s", resp.status_code)
return False
except Exception as e:
logging.exception("TTS provider test failed: %s", e)
return False
================================================
FILE: SonicVale/app/services/voice_service.py
================================================
import json
import os
import shutil
import tempfile
import zipfile
from typing import List, Tuple
from sqlalchemy import Sequence
from app.core.audio_engin import AudioProcessor
from app.dto.voice_dto import VoiceAudioProcessDTO
from app.entity.voice_entity import VoiceEntity
from app.models.po import VoicePO
from app.repositories.multi_emotion_voice_repository import MultiEmotionVoiceRepository
from app.repositories.voice_repository import VoiceRepository
class VoiceService:
def __init__(self, repository: VoiceRepository,multi_emotion_voice_repository: MultiEmotionVoiceRepository):
"""注入 repository"""
self.repository = repository
self.multi_emotion_voice_repository = multi_emotion_voice_repository
def create_voice(self, entity: VoiceEntity):
"""创建新音色
- 检查同名音色是否存在
- 如果存在,抛出异常或返回错误
- 调用 repository.create 插入数据库
"""
voice = self.repository.get_by_name(entity.name, entity.tts_provider_id)
if voice:
return None
# 手动将entity转化为po
po = VoicePO(**entity.__dict__)
res = self.repository.create(po)
# res(po) --> entity
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
entity = VoiceEntity(**data)
# 将po转化为entity
return entity
def get_voice(self, voice_id: int) -> VoiceEntity | None:
"""根据 ID 查询音色"""
po = self.repository.get_by_id(voice_id)
if not po:
return None
data = {k: v for k, v in po.__dict__.items() if not k.startswith("_")}
res = VoiceEntity(**data)
return res
def get_all_voices(self,tts_provider_id: int) -> Sequence[VoiceEntity]:
"""获取所有音色列表"""
pos = self.repository.get_all(tts_provider_id)
# pos -> entities
entities = [
VoiceEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
return entities
def update_voice(self, voice_id: int, data:dict) -> bool:
"""更新音色
- 可以只更新部分字段
- 检查同名冲突
- 检查project_id不能改变
"""
name = data["name"]
tts_provider_id = data["tts_provider_id"]
if self.repository.get_by_name(name, tts_provider_id) and self.repository.get_by_name(name,tts_provider_id).id != voice_id:
return False
po = self.repository.get_by_id(voice_id)
# 防止改变project_id
if po.tts_provider_id != tts_provider_id:
return False
self.repository.update(voice_id, data)
return True
def delete_voice(self, voice_id: int) -> bool:
"""删除音色,需要保证事务
"""
res = self.repository.delete(voice_id)
self.multi_emotion_voice_repository.delete_multi_emotion_voice_by_voice_id(voice_id)
return res
def export_voices(self, tts_provider_id: int, export_path: str, ids: List[int] | None = None) -> str:
"""导出音色库到zip文件
- 获取所有音色
- 将音色信息和对应的音频文件打包到zip
- 返回zip文件路径
"""
if ids is None:
voices = self.get_all_voices(tts_provider_id)
else:
pos = self.repository.get_by_ids(tts_provider_id, ids)
voices = [
VoiceEntity(**{k: v for k, v in po.__dict__.items() if not k.startswith("_")})
for po in pos
]
if not voices:
return None
# 确保导出目录存在
os.makedirs(os.path.dirname(export_path) if os.path.dirname(export_path) else ".", exist_ok=True)
# 创建zip文件
with zipfile.ZipFile(export_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
# 准备音色元数据
voices_metadata = []
for voice in voices:
voice_data = {
"name": voice.name,
"description": voice.description,
"is_multi_emotion": voice.is_multi_emotion,
"reference_file": None
}
# 如果有参考音频文件,添加到zip
if voice.reference_path and os.path.exists(voice.reference_path):
# 保持原文件名
file_name = os.path.basename(voice.reference_path)
# 使用音色名称作为子目录,避免文件名冲突
archive_path = f"voices/{voice.name}/{file_name}"
zipf.write(voice.reference_path, archive_path)
voice_data["reference_file"] = archive_path
voices_metadata.append(voice_data)
# 写入元数据文件
metadata_json = json.dumps(voices_metadata, ensure_ascii=False, indent=2)
zipf.writestr("voices_metadata.json", metadata_json)
return export_path
def import_voices(self, tts_provider_id: int, zip_path: str, target_dir: str) -> Tuple[int, int, List[str]]:
"""从zip文件导入音色库
- 解压zip文件
- 将音频文件复制到指定目录
- 添加音色到数据库(跳过重名的)
- 返回: (成功数量, 跳过数量, 跳过的音色名称列表)
"""
if not os.path.exists(zip_path):
raise FileNotFoundError(f"zip文件不存在: {zip_path}")
# 确保目标目录存在
os.makedirs(target_dir, exist_ok=True)
success_count = 0
skipped_count = 0
skipped_names = []
# 创建临时目录解压
with tempfile.TemporaryDirectory() as temp_dir:
# 解压zip文件
with zipfile.ZipFile(zip_path, 'r') as zipf:
zipf.extractall(temp_dir)
# 读取元数据
metadata_path = os.path.join(temp_dir, "voices_metadata.json")
if not os.path.exists(metadata_path):
raise ValueError("无效的音色库文件:缺少voices_metadata.json")
with open(metadata_path, 'r', encoding='utf-8') as f:
voices_metadata = json.load(f)
for voice_data in voices_metadata:
voice_name = voice_data["name"]
# 检查是否已存在同名音色
existing = self.repository.get_by_name(voice_name, tts_provider_id)
if existing:
skipped_count += 1
skipped_names.append(voice_name)
continue
reference_path = None
# 如果有参考音频文件,复制到目标目录
if voice_data.get("reference_file"):
source_file = os.path.join(temp_dir, voice_data["reference_file"])
if os.path.exists(source_file):
# 使用音色名称作为文件名,保留原扩展名
file_ext = os.path.splitext(source_file)[1]
file_name = f"{voice_name}{file_ext}"
dest_file = os.path.join(target_dir, file_name)
shutil.copy2(source_file, dest_file)
reference_path = dest_file
# 创建音色实体
entity = VoiceEntity(
name=voice_name,
tts_provider_id=tts_provider_id,
reference_path=reference_path,
description=voice_data.get("description"),
is_multi_emotion=voice_data.get("is_multi_emotion", 0)
)
# 保存到数据库
po = VoicePO(**entity.__dict__)
self.repository.create(po)
success_count += 1
return success_count, skipped_count, skipped_names
def process_audio(self, dto: VoiceAudioProcessDTO) -> bool:
"""处理音色参考音频
- 变速、音量调整
- 裁剪/删除区间
- 添加/裁剪末尾静音
- 指定位置插入静音
"""
audio_path = dto.audio_path
if not os.path.exists(audio_path):
raise FileNotFoundError(audio_path)
processor = AudioProcessor(audio_path)
start_ms = dto.start_ms
end_ms = dto.end_ms
speed = dto.speed
volume = dto.volume
current_ms = dto.current_ms
silence_sec = dto.silence_sec
# ---------- (1) 优先裁剪 ----------
if start_ms is not None and end_ms is not None and end_ms > start_ms:
processor.cut(start_ms, end_ms)
# ---------- (2) 插入静音 ----------
elif current_ms is not None and silence_sec is not None and silence_sec != 0:
processor.insert_silence(current_ms, silence_sec)
# ---------- (3) 末尾静音/裁剪 ----------
elif current_ms is None and silence_sec is not None and silence_sec != 0:
processor.append_silence(silence_sec)
# ---------- (4) 音量 + 变速 ----------
if speed != 1.0:
processor.change_speed(speed)
if volume != 1.0:
processor.change_volume(volume)
return True
def copy_voice(self, source_voice_id: int, new_name: str, target_dir: str = None) -> VoiceEntity:
"""复制音色
- 获取源音色信息
- 复制音频文件到目标目录
- 创建新音色记录
- 返回新音色实体
"""
# 获取源音色
source_voice = self.get_voice(source_voice_id)
if not source_voice:
raise ValueError("源音色不存在")
# 检查新名称是否已存在
existing = self.repository.get_by_name(new_name, source_voice.tts_provider_id)
if existing:
raise ValueError(f"音色名称 '{new_name}' 已存在")
new_reference_path = None
# 处理音频文件复制
if source_voice.reference_path and os.path.exists(source_voice.reference_path):
# 确定目标目录
if target_dir and target_dir.strip():
dest_dir = target_dir.strip()
else:
# 使用源音频所在目录
dest_dir = os.path.dirname(source_voice.reference_path)
# 确保目标目录存在
os.makedirs(dest_dir, exist_ok=True)
# 获取源文件扩展名
file_ext = os.path.splitext(source_voice.reference_path)[1]
# 使用新音色名作为文件名
new_file_name = f"{new_name}{file_ext}"
new_reference_path = os.path.join(dest_dir, new_file_name)
# 复制文件
shutil.copy2(source_voice.reference_path, new_reference_path)
# 创建新音色实体
new_entity = VoiceEntity(
name=new_name,
tts_provider_id=source_voice.tts_provider_id,
reference_path=new_reference_path,
description=source_voice.description,
is_multi_emotion=source_voice.is_multi_emotion
)
# 保存到数据库
po = VoicePO(**new_entity.__dict__)
res = self.repository.create(po)
# 返回新建的音色实体
data = {k: v for k, v in res.__dict__.items() if not k.startswith("_")}
return VoiceEntity(**data)
================================================
FILE: SonicVale/requirements.txt
================================================
fastapi==0.119.0
numba==0.61.2
numpy==2.3.3
openai==2.8.0
openpyxl==3.1.5
pydantic==2.12.2
pypinyin==0.55.0
Requests==2.32.5
soundfile==0.13.1
SQLAlchemy==2.0.44
starlette==0.48.0
uvicorn==0.37.0
================================================
FILE: sonicvale-front/.gitignore
================================================
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*
node_modules
dist
release
dist-ssr
*.local
*.exe
# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
================================================
FILE: sonicvale-front/.vscode/extensions.json
================================================
{
"recommendations": ["Vue.volar"]
}
================================================
FILE: sonicvale-front/README.md
================================================
# Vue 3 + Vite
This template should help get you started developing with Vue 3 in Vite. The template uses Vue 3 `