Repository: WongKinYiu/yolov9
Branch: main
Commit: 5b1ea9a8b3f0
Files: 116
Total size: 1.2 MB
Directory structure:
gitextract_ys0vmsdq/
├── LICENSE.md
├── README.md
├── benchmarks.py
├── classify/
│ ├── predict.py
│ ├── train.py
│ └── val.py
├── data/
│ ├── coco.yaml
│ └── hyps/
│     └── hyp.scratch-high.yaml
├── detect.py
├── detect_dual.py
├── export.py
├── hubconf.py
├── models/
│ ├── __init__.py
│ ├── common.py
│ ├── detect/
│ │ ├── gelan-c.yaml
│ │ ├── gelan-e.yaml
│ │ ├── gelan-m.yaml
│ │ ├── gelan-s.yaml
│ │ ├── gelan-t.yaml
│ │ ├── gelan.yaml
│ │ ├── yolov7-af.yaml
│ │ ├── yolov9-c.yaml
│ │ ├── yolov9-cf.yaml
│ │ ├── yolov9-e.yaml
│ │ ├── yolov9-m.yaml
│ │ ├── yolov9-s.yaml
│ │ ├── yolov9-t.yaml
│ │ └── yolov9.yaml
│ ├── experimental.py
│ ├── hub/
│ │ ├── anchors.yaml
│ │ ├── yolov3-spp.yaml
│ │ ├── yolov3-tiny.yaml
│ │ └── yolov3.yaml
│ ├── panoptic/
│ │ ├── gelan-c-pan.yaml
│ │ └── yolov7-af-pan.yaml
│ ├── segment/
│ │ ├── gelan-c-dseg.yaml
│ │ ├── gelan-c-seg.yaml
│ │ ├── yolov7-af-seg.yaml
│ │ └── yolov9-c-dseg.yaml
│ ├── tf.py
│ └── yolo.py
├── panoptic/
│ ├── predict.py
│ ├── train.py
│ └── val.py
├── requirements.txt
├── scripts/
│ └── get_coco.sh
├── segment/
│ ├── predict.py
│ ├── train.py
│ ├── train_dual.py
│ ├── val.py
│ └── val_dual.py
├── tools/
│ └── reparameterization.ipynb
├── train.py
├── train_dual.py
├── train_triple.py
├── utils/
│ ├── __init__.py
│ ├── activations.py
│ ├── augmentations.py
│ ├── autoanchor.py
│ ├── autobatch.py
│ ├── callbacks.py
│ ├── coco_utils.py
│ ├── dataloaders.py
│ ├── downloads.py
│ ├── general.py
│ ├── lion.py
│ ├── loggers/
│ │ ├── __init__.py
│ │ ├── clearml/
│ │ │ ├── __init__.py
│ │ │ ├── clearml_utils.py
│ │ │ └── hpo.py
│ │ ├── comet/
│ │ │ ├── __init__.py
│ │ │ ├── comet_utils.py
│ │ │ ├── hpo.py
│ │ │ └── optimizer_config.json
│ │ └── wandb/
│ │     ├── __init__.py
│ │     ├── log_dataset.py
│ │     ├── sweep.py
│ │     ├── sweep.yaml
│ │     └── wandb_utils.py
│ ├── loss.py
│ ├── loss_tal.py
│ ├── loss_tal_dual.py
│ ├── loss_tal_triple.py
│ ├── metrics.py
│ ├── panoptic/
│ │ ├── __init__.py
│ │ ├── augmentations.py
│ │ ├── dataloaders.py
│ │ ├── general.py
│ │ ├── loss.py
│ │ ├── loss_tal.py
│ │ ├── metrics.py
│ │ ├── plots.py
│ │ └── tal/
│ │     ├── __init__.py
│ │     ├── anchor_generator.py
│ │     └── assigner.py
│ ├── plots.py
│ ├── segment/
│ │ ├── __init__.py
│ │ ├── augmentations.py
│ │ ├── dataloaders.py
│ │ ├── general.py
│ │ ├── loss.py
│ │ ├── loss_tal.py
│ │ ├── loss_tal_dual.py
│ │ ├── metrics.py
│ │ ├── plots.py
│ │ └── tal/
│ │     ├── __init__.py
│ │     ├── anchor_generator.py
│ │     └── assigner.py
│ ├── tal/
│ │ ├── __init__.py
│ │ ├── anchor_generator.py
│ │ └── assigner.py
│ ├── torch_utils.py
│ └── triton.py
├── val.py
├── val_dual.py
└── val_triple.py
================================================
FILE CONTENTS
================================================
================================================
FILE: LICENSE.md
================================================
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<https://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.
================================================
FILE: README.md
================================================
# YOLOv9
Implementation of paper - [YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information](https://arxiv.org/abs/2402.13616)
[arXiv](https://arxiv.org/abs/2402.13616)
[Hugging Face Spaces demo](https://huggingface.co/spaces/kadirnar/Yolov9)
[Hugging Face model](https://huggingface.co/merve/yolov9)
[Colab notebook](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-yolov9-object-detection-on-custom-dataset.ipynb)
[LearnOpenCV article](https://learnopencv.com/yolov9-advancing-the-yolo-legacy/)
<div align="center">
<a href="./">
<img src="./figure/performance.png" width="79%"/>
</a>
</div>
## Performance
MS COCO
| Model | Test Size | AP<sup>val</sup> | AP<sub>50</sub><sup>val</sup> | AP<sub>75</sub><sup>val</sup> | Param. | FLOPs |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: |
| [**YOLOv9-T**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-t-converted.pt) | 640 | **38.3%** | **53.1%** | **41.3%** | **2.0M** | **7.7G** |
| [**YOLOv9-S**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-s-converted.pt) | 640 | **46.8%** | **63.4%** | **50.7%** | **7.1M** | **26.4G** |
| [**YOLOv9-M**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-m-converted.pt) | 640 | **51.4%** | **68.1%** | **56.1%** | **20.0M** | **76.3G** |
| [**YOLOv9-C**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-c-converted.pt) | 640 | **53.0%** | **70.2%** | **57.8%** | **25.3M** | **102.1G** |
| [**YOLOv9-E**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-e-converted.pt) | 640 | **55.6%** | **72.8%** | **60.6%** | **57.3M** | **189.0G** |
<!-- | [**YOLOv9 (ReLU)**]() | 640 | **51.9%** | **69.1%** | **56.5%** | **25.3M** | **102.1G** | -->
<!-- tiny, small, and medium models will be released after the paper is accepted and published. -->
## Useful Links
<details><summary> <b>Expand</b> </summary>
Custom training: https://github.com/WongKinYiu/yolov9/issues/30#issuecomment-1960955297
ONNX export: https://github.com/WongKinYiu/yolov9/issues/2#issuecomment-1960519506 https://github.com/WongKinYiu/yolov9/issues/40#issue-2150697688 https://github.com/WongKinYiu/yolov9/issues/130#issue-2162045461
ONNX export for segmentation: https://github.com/WongKinYiu/yolov9/issues/260#issue-2191162150
TensorRT inference: https://github.com/WongKinYiu/yolov9/issues/143#issuecomment-1975049660 https://github.com/WongKinYiu/yolov9/issues/34#issue-2150393690 https://github.com/WongKinYiu/yolov9/issues/79#issue-2153547004 https://github.com/WongKinYiu/yolov9/issues/143#issue-2164002309
QAT TensorRT: https://github.com/WongKinYiu/yolov9/issues/327#issue-2229284136 https://github.com/WongKinYiu/yolov9/issues/253#issue-2189520073
TensorRT inference for segmentation: https://github.com/WongKinYiu/yolov9/issues/446
TFLite: https://github.com/WongKinYiu/yolov9/issues/374#issuecomment-2065751706
OpenVINO: https://github.com/WongKinYiu/yolov9/issues/164#issue-2168540003
C# ONNX inference: https://github.com/WongKinYiu/yolov9/issues/95#issue-2155974619
C# OpenVINO inference: https://github.com/WongKinYiu/yolov9/issues/95#issuecomment-1968131244
OpenCV: https://github.com/WongKinYiu/yolov9/issues/113#issuecomment-1971327672
Hugging Face demo: https://github.com/WongKinYiu/yolov9/issues/45#issuecomment-1961496943
CoLab demo: https://github.com/WongKinYiu/yolov9/pull/18
ONNXSlim export: https://github.com/WongKinYiu/yolov9/pull/37
YOLOv9 ROS: https://github.com/WongKinYiu/yolov9/issues/144#issue-2164210644
YOLOv9 ROS TensorRT: https://github.com/WongKinYiu/yolov9/issues/145#issue-2164218595
YOLOv9 Julia: https://github.com/WongKinYiu/yolov9/issues/141#issuecomment-1973710107
YOLOv9 MLX: https://github.com/WongKinYiu/yolov9/issues/258#issue-2190586540
YOLOv9 StrongSORT with OSNet: https://github.com/WongKinYiu/yolov9/issues/299#issue-2212093340
YOLOv9 ByteTrack: https://github.com/WongKinYiu/yolov9/issues/78#issue-2153512879
YOLOv9 DeepSORT: https://github.com/WongKinYiu/yolov9/issues/98#issue-2156172319
YOLOv9 counting: https://github.com/WongKinYiu/yolov9/issues/84#issue-2153904804
YOLOv9 speed estimation: https://github.com/WongKinYiu/yolov9/issues/456
YOLOv9 face detection: https://github.com/WongKinYiu/yolov9/issues/121#issue-2160218766
YOLOv9 segmentation onnxruntime: https://github.com/WongKinYiu/yolov9/issues/151#issue-2165667350
Comet logging: https://github.com/WongKinYiu/yolov9/pull/110
MLflow logging: https://github.com/WongKinYiu/yolov9/pull/87
AnyLabeling tool: https://github.com/WongKinYiu/yolov9/issues/48#issue-2152139662
AX650N deploy: https://github.com/WongKinYiu/yolov9/issues/96#issue-2156115760
Conda environment: https://github.com/WongKinYiu/yolov9/pull/93
AutoDL docker environment: https://github.com/WongKinYiu/yolov9/issues/112#issue-2158203480
</details>
## Installation
Docker environment (recommended)
<details><summary> <b>Expand</b> </summary>
``` shell
# create the docker container; you can increase the shared memory size if you have more memory available.
nvidia-docker run --name yolov9 -it -v your_coco_path/:/coco/ -v your_code_path/:/yolov9 --shm-size=64g nvcr.io/nvidia/pytorch:21.11-py3
# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx
# pip install required packages
pip install seaborn thop
# go to code folder
cd /yolov9
```
</details>
## Evaluation
[`yolov9-s-converted.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-s-converted.pt) [`yolov9-m-converted.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-m-converted.pt) [`yolov9-c-converted.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-c-converted.pt) [`yolov9-e-converted.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-e-converted.pt)
[`yolov9-s.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-s.pt) [`yolov9-m.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-m.pt) [`yolov9-c.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-c.pt) [`yolov9-e.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-e.pt)
[`gelan-s.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-s.pt) [`gelan-m.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-m.pt) [`gelan-c.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c.pt) [`gelan-e.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-e.pt)
``` shell
# evaluate converted yolov9 models
python val.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.7 --device 0 --weights './yolov9-c-converted.pt' --save-json --name yolov9_c_c_640_val
# evaluate yolov9 models
# python val_dual.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.7 --device 0 --weights './yolov9-c.pt' --save-json --name yolov9_c_640_val
# evaluate gelan models
# python val.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.7 --device 0 --weights './gelan-c.pt' --save-json --name gelan_c_640_val
```
You will get the results:
```
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.530
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.702
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.578
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.362
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.585
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.693
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.392
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.652
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.702
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.541
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.760
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.844
```
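The first three AP lines of this summary correspond to the YOLOv9-C row of the performance table (53.0%, 70.2%, 57.8%). A minimal sketch for extracting these values from a saved log, assuming the log contains lines in exactly the format shown above. `SUMMARY_RE` and `parse_coco_summary` are hypothetical helpers written for illustration, not part of this repository.

```python
import re

# Matches one summary line, e.g.
# "Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.530"
# Whitespace is matched loosely because the real output pads its columns.
SUMMARY_RE = re.compile(
    r"Average\s+(?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)


def parse_coco_summary(text):
    """Return {(metric, iou, area, max_dets): value} for every summary line found."""
    results = {}
    for match in SUMMARY_RE.finditer(text):
        metric, iou, area, max_dets, value = match.groups()
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results


# Sample lines copied from the results block above.
sample = """
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.530
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.702
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.578
"""

summary = parse_coco_summary(sample)
ap = summary[("AP", "0.50:0.95", "all", 100)]
print(f"AP = {ap:.1%}")  # prints "AP = 53.0%"
```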
## Training
Data preparation
``` shell
bash scripts/get_coco.sh
```
* Download MS COCO dataset images ([train](http://images.cocodataset.org/zips/train2017.zip), [val](http://images.cocodataset.org/zips/val2017.zip), [test](http://images.cocodataset.org/zips/test2017.zip)) and [labels](https://github.com/WongKinYiu/yolov7/releases/download/v0.1/coco2017labels-segments.zip). If you have previously used a different version of YOLO, we strongly recommend that you delete the `train2017.cache` and `val2017.cache` files and re-download the [labels](https://github.com/WongKinYiu/yolov7/releases/download/v0.1/coco2017labels-segments.zip).
Single GPU training
``` shell
# train yolov9 models
python train_dual.py --workers 8 --device 0 --batch 16 --data data/coco.yaml --img 640 --cfg models/detect/yolov9-c.yaml --weights '' --name yolov9-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15
# train gelan models
# python train.py --workers 8 --device 0 --batch 32 --data data/coco.yaml --img 640 --cfg models/detect/gelan-c.yaml --weights '' --name gelan-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15
```
Multiple GPU training
``` shell
# train yolov9 models
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_dual.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch 128 --data data/coco.yaml --img 640 --cfg models/detect/yolov9-c.yaml --weights '' --name yolov9-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15
# train gelan models
# python -m torch.distributed.launch --nproc_per_node 4 --master_port 9527 train.py --workers 8 --device 0,1,2,3 --sync-bn --batch 128 --data data/coco.yaml --img 640 --cfg models/detect/gelan-c.yaml --weights '' --name gelan-c --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15
```
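The batch sizes in the commands above are consistent if `--batch` is read as the total batch size shared by all GPUs: 128 over 8 GPUs gives the same 16 images per GPU as the single-GPU YOLOv9 command, and 128 over 4 GPUs gives the 32 used for GELAN. That reading is an assumption based on the numbers in this README, not something confirmed here from the training scripts. A small sketch of the arithmetic, with `per_gpu_batch` as a hypothetical helper:

```python
def per_gpu_batch(total_batch, devices):
    """Split a total batch size evenly across the GPUs named in a --device string."""
    gpu_ids = [d.strip() for d in devices.split(",") if d.strip()]
    if not gpu_ids:
        raise ValueError("no devices given")
    if total_batch % len(gpu_ids) != 0:
        raise ValueError(f"batch {total_batch} is not divisible by {len(gpu_ids)} GPUs")
    return total_batch // len(gpu_ids)


print(per_gpu_batch(16, "0"))                 # single-GPU YOLOv9 command -> 16
print(per_gpu_batch(128, "0,1,2,3,4,5,6,7"))  # 8-GPU YOLOv9 command -> 16
print(per_gpu_batch(128, "0,1,2,3"))          # 4-GPU GELAN command -> 32
```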
## Re-parameterization
See [reparameterization.ipynb](https://github.com/WongKinYiu/yolov9/blob/main/tools/reparameterization.ipynb).
## Inference
<div align="center">
<a href="./">
<img src="./figure/horses_prediction.jpg" width="49%"/>
</a>
</div>
``` shell
# run inference with converted yolov9 models
python detect.py --source './data/images/horses.jpg' --img 640 --device 0 --weights './yolov9-c-converted.pt' --name yolov9_c_c_640_detect
# run inference with yolov9 models
# python detect_dual.py --source './data/images/horses.jpg' --img 640 --device 0 --weights './yolov9-c.pt' --name yolov9_c_640_detect
# run inference with gelan models
# python detect.py --source './data/images/horses.jpg' --img 640 --device 0 --weights './gelan-c.pt' --name gelan_c_c_640_detect
```
```
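The evaluation and inference commands in this README follow one pattern: unconverted `yolov9-*.pt` weights use the `*_dual.py` scripts, while converted YOLOv9 weights and GELAN weights use the plain scripts. A minimal sketch of that rule, inferred only from the file names used in the commands above. `pick_script` and `build_detect_command` are hypothetical helpers, and only flags that appear in this README are used.

```python
from pathlib import Path


def pick_script(weights, task="detect"):
    """Choose the script name by following the pattern of the README commands."""
    if task not in ("detect", "val"):
        raise ValueError(f"unsupported task: {task}")
    stem = Path(weights).stem  # e.g. "yolov9-c-converted"
    # Assumption: only unconverted yolov9 checkpoints need the *_dual.py script.
    dual = stem.startswith("yolov9") and not stem.endswith("-converted")
    return f"{task}_dual.py" if dual else f"{task}.py"


def build_detect_command(weights, source, img=640, device="0"):
    """Assemble an argument list suitable for subprocess.run()."""
    return [
        "python", pick_script(weights, "detect"),
        "--source", source,
        "--img", str(img),
        "--device", device,
        "--weights", weights,
    ]


print(pick_script("./yolov9-c-converted.pt"))  # detect.py
print(pick_script("./yolov9-c.pt"))            # detect_dual.py
print(pick_script("./gelan-c.pt", "val"))      # val.py
print(" ".join(build_detect_command("./yolov9-c.pt", "./data/images/horses.jpg")))
```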
## Citation
```
@article{wang2024yolov9,
title={{YOLOv9}: Learning What You Want to Learn Using Programmable Gradient Information},
author={Wang, Chien-Yao and Liao, Hong-Yuan Mark},
journal={arXiv preprint arXiv:2402.13616},
year={2024}
}
```
```
@article{chang2023yolor,
title={{YOLOR}-Based Multi-Task Learning},
author={Chang, Hung-Shuo and Wang, Chien-Yao and Wang, Richard Robert and Chou, Gene and Liao, Hong-Yuan Mark},
journal={arXiv preprint arXiv:2309.16921},
year={2023}
}
```
## Teaser
Parts of the code for [YOLOR-Based Multi-Task Learning](https://arxiv.org/abs/2309.16921) are released in this repository.
<div align="center">
<a href="./">
<img src="./figure/multitask.png" width="99%"/>
</a>
</div>
#### Object Detection
[`gelan-c-det.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c-det.pt)
`object detection`
``` shell
# coco/labels/{split}/*.txt
# bbox or polygon (1 instance 1 line)
python train.py --workers 8 --device 0 --batch 32 --data data/coco.yaml --img 640 --cfg models/detect/gelan-c.yaml --weights '' --name gelan-c-det --hyp hyp.scratch-high.yaml --min-items 0 --epochs 300 --close-mosaic 10
```
| Model | Test Size | Param. | FLOPs | AP<sup>box</sup> |
| :-- | :-: | :-: | :-: | :-: |
| [**GELAN-C-DET**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c-det.pt) | 640 | 25.3M | 102.1G |**52.3%** |
| [**YOLOv9-C-DET**]() | 640 | 25.3M | 102.1G | **53.0%** |
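The training command above notes that each label file holds one instance per line, given as either a bbox or a polygon. A minimal sketch of reading such a line, assuming the common YOLO text layout: an integer class id followed by normalized coordinates, with four values for a box (`x_center y_center width height`) and an even number of six or more values for a polygon. That layout is an assumption and is not spelled out in this README. `parse_label_line` is a hypothetical helper.

```python
def parse_label_line(line):
    """Parse one label line into a dict with the class id, the kind, and the coordinates."""
    parts = line.split()
    if len(parts) < 5:
        raise ValueError(f"expected a class id and at least 4 values, got: {line!r}")
    cls = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if len(coords) == 4:
        # Assumed bbox layout: x_center, y_center, width, height (normalized to 0..1).
        return {"cls": cls, "kind": "bbox", "coords": coords}
    if len(coords) >= 6 and len(coords) % 2 == 0:
        # Assumed polygon layout: x1 y1 x2 y2 ... grouped into (x, y) points.
        points = list(zip(coords[0::2], coords[1::2]))
        return {"cls": cls, "kind": "polygon", "coords": points}
    raise ValueError(f"unexpected number of coordinates: {len(coords)}")


print(parse_label_line("0 0.5 0.5 0.2 0.3"))
print(parse_label_line("1 0.1 0.1 0.9 0.1 0.5 0.9"))
```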
#### Instance Segmentation
[`gelan-c-seg.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c-seg.pt)
`object detection` `instance segmentation`
``` shell
# coco/labels/{split}/*.txt
# polygon (1 instance 1 line)
python segment/train.py --workers 8 --device 0 --batch 32 --data coco.yaml --img 640 --cfg models/segment/gelan-c-seg.yaml --weights '' --name gelan-c-seg --hyp hyp.scratch-high.yaml --no-overlap --epochs 300 --close-mosaic 10
```
| Model | Test Size | Param. | FLOPs | AP<sup>box</sup> | AP<sup>mask</sup> |
| :-- | :-: | :-: | :-: | :-: | :-: |
| [**GELAN-C-SEG**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c-seg.pt) | 640 | 27.4M | 144.6G | **52.3%** | **42.4%** |
| [**YOLOv9-C-SEG**]() | 640 | 27.4M | 145.5G | **53.3%** | **43.5%** |
#### Panoptic Segmentation
[`gelan-c-pan.pt`](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c-pan.pt)
`object detection` `instance segmentation` `semantic segmentation` `stuff segmentation` `panoptic segmentation`
``` shell
# coco/labels/{split}/*.txt
# polygon (1 instance 1 line)
# coco/stuff/{split}/*.txt
# polygon (1 semantic 1 line)
python panoptic/train.py --workers 8 --device 0 --batch 32 --data coco.yaml --img 640 --cfg models/panoptic/gelan-c-pan.yaml --weights '' --name gelan-c-pan --hyp hyp.scratch-high.yaml --no-overlap --epochs 300 --close-mosaic 10
```
| Model | Test Size | Param. | FLOPs | AP<sup>box</sup> | AP<sup>mask</sup> | mIoU<sub>164k/10k</sub><sup>semantic</sup> | mIoU<sup>stuff</sup> | PQ<sup>panoptic</sup> |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| [**GELAN-C-PAN**](https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c-pan.pt) | 640 | 27.6M | 146.7G | **52.6%** | **42.5%** | **39.0%/48.3%** | **52.7%** | **39.4%** |
| [**YOLOv9-C-PAN**]() | 640 | 28.8M | 187.0G | **52.7%** | **43.0%** | **39.8%/-** | **52.2%** | **40.5%** |
#### Image Captioning (not yet released)
<!--[`gelan-c-cap.pt`]()-->
`object detection` `instance segmentation` `semantic segmentation` `stuff segmentation` `panoptic segmentation` `image captioning`
``` shell
# coco/labels/{split}/*.txt
# polygon (1 instance 1 line)
# coco/stuff/{split}/*.txt
# polygon (1 semantic 1 line)
# coco/annotations/*.json
# json (1 split 1 file)
python caption/train.py --workers 8 --device 0 --batch 32 --data coco.yaml --img 640 --cfg models/caption/gelan-c-cap.yaml --weights '' --name gelan-c-cap --hyp hyp.scratch-high.yaml --no-overlap --epochs 300 --close-mosaic 10
```
| Model | Test Size | Param. | FLOPs | AP<sup>box</sup> | AP<sup>mask</sup> | mIoU<sub>164k/10k</sub><sup>semantic</sup> | mIoU<sup>stuff</sup> | PQ<sup>panoptic</sup> | BLEU@4<sup>caption</sup> | CIDEr<sup>caption</sup> |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| [**GELAN-C-CAP**]() | 640 | 47.5M | - | **51.9%** | **42.6%** | **42.5%/-** | **56.5%** | **41.7%** | **38.8** | **122.3** |
| [**YOLOv9-C-CAP**]() | 640 | 47.5M | - | **52.1%** | **42.6%** | **43.0%/-** | **56.4%** | **42.1%** | **39.1** | **122.0** |
<!--| [**YOLOR-MT**]() | 640 | 79.3M | - | **51.0%** | **41.7%** | **-/49.6%** | **55.9%** | **40.5%** | **35.7** | **112.7** |-->
## Acknowledgements
<details><summary> <b>Expand</b> </summary>
* [https://github.com/AlexeyAB/darknet](https://github.com/AlexeyAB/darknet)
* [https://github.com/WongKinYiu/yolor](https://github.com/WongKinYiu/yolor)
* [https://github.com/WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7)
* [https://github.com/VDIGPKU/DynamicDet](https://github.com/VDIGPKU/DynamicDet)
* [https://github.com/DingXiaoH/RepVGG](https://github.com/DingXiaoH/RepVGG)
* [https://github.com/ultralytics/yolov5](https://github.com/ultralytics/yolov5)
* [https://github.com/meituan/YOLOv6](https://github.com/meituan/YOLOv6)
</details>
================================================
FILE: benchmarks.py
================================================
import argparse
import platform
import sys
import time
from pathlib import Path

import pandas as pd

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLO root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
# ROOT = ROOT.relative_to(Path.cwd())  # relative

import export
from models.experimental import attempt_load
from models.yolo import SegmentationModel
from segment.val import run as val_seg
from utils import notebook_init
from utils.general import LOGGER, check_yaml, file_size, print_args
from utils.torch_utils import select_device
from val import run as val_det


def run(
        weights=ROOT / 'yolo.pt',  # weights path
        imgsz=640,  # inference size (pixels)
        batch_size=1,  # batch size
        data=ROOT / 'data/coco.yaml',  # dataset.yaml path
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        half=False,  # use FP16 half-precision inference
        test=False,  # test exports only
        pt_only=False,  # test PyTorch only
        hard_fail=False,  # throw error on benchmark failure
):
    y, t = [], time.time()
    device = select_device(device)
    model_type = type(attempt_load(weights, fuse=False))  # DetectionModel, SegmentationModel, etc.
    for i, (name, f, suffix, cpu, gpu) in export.export_formats().iterrows():  # index, (name, file, suffix, CPU, GPU)
        try:
            assert i not in (9, 10), 'inference not supported'  # Edge TPU and TF.js are unsupported
            assert i != 5 or platform.system() == 'Darwin', 'inference only supported on macOS>=10.13'  # CoreML
            if 'cpu' in device.type:
                assert cpu, 'inference not supported on CPU'
            if 'cuda' in device.type:
                assert gpu, 'inference not supported on GPU'

            # Export
            if f == '-':
                w = weights  # PyTorch format
            else:
                w = export.run(weights=weights, imgsz=[imgsz], include=[f], device=device, half=half)[-1]  # all others
            assert suffix in str(w), 'export failed'

            # Validate
            if model_type == SegmentationModel:
                result = val_seg(data, w, batch_size, imgsz, plots=False, device=device, task='speed', half=half)
                metric = result[0][7]  # (box(p, r, map50, map), mask(p, r, map50, map), *loss(box, obj, cls))
            else:  # DetectionModel:
                result = val_det(data, w, batch_size, imgsz, plots=False, device=device, task='speed', half=half)
                metric = result[0][3]  # (p, r, map50, map, *loss(box, obj, cls))
            speed = result[2][1]  # times (preprocess, inference, postprocess)
            y.append([name, round(file_size(w), 1), round(metric, 4), round(speed, 2)])  # MB, mAP, t_inference
        except Exception as e:
            if hard_fail:
                assert type(e) is AssertionError, f'Benchmark --hard-fail for {name}: {e}'
            LOGGER.warning(f'WARNING ⚠️ Benchmark failure for {name}: {e}')
            y.append([name, None, None, None])  # mAP, t_inference
        if pt_only and i == 0:
            break  # break after PyTorch

    # Print results
    LOGGER.info('\n')
    parse_opt()
    notebook_init()  # print system info
    # NOTE: `map` below is the Python builtin, which is always truthy, so the first column set is always used
    c = ['Format', 'Size (MB)', 'mAP50-95', 'Inference time (ms)'] if map else ['Format', 'Export', '', '']
    py = pd.DataFrame(y, columns=c)
    LOGGER.info(f'\nBenchmarks complete ({time.time() - t:.2f}s)')
    LOGGER.info(str(py if map else py.iloc[:, :2]))
    if hard_fail and isinstance(hard_fail, str):
        metrics = py['mAP50-95'].array  # values to compare to floor
        floor = eval(hard_fail)  # minimum metric floor to pass
        assert all(x > floor for x in metrics if pd.notna(x)), f'HARD FAIL: mAP50-95 < floor {floor}'
    return py


def test(
        weights=ROOT / 'yolo.pt',  # weights path
        imgsz=640,  # inference size (pixels)
        batch_size=1,  # batch size
        data=ROOT / 'data/coco128.yaml',  # dataset.yaml path
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        half=False,  # use FP16 half-precision inference
        test=False,  # test exports only
        pt_only=False,  # test PyTorch only
        hard_fail=False,  # throw error on benchmark failure
):
    y, t = [], time.time()
    device = select_device(device)
    # Unpack five columns, matching the loop in run(); unpacking only four would raise ValueError
    for i, (name, f, suffix, cpu, gpu) in export.export_formats().iterrows():  # index, (name, file, suffix, CPU, GPU)
        try:
            w = weights if f == '-' else \
                export.run(weights=weights, imgsz=[imgsz], include=[f], device=device, half=half)[-1]  # weights
            assert suffix in str(w), 'export failed'
            y.append([name, True])
        except Exception:
            y.append([name, False])  # export failed

    # Print results
    LOGGER.info('\n')
    parse_opt()
    notebook_init()  # print system info
    py = pd.DataFrame(y, columns=['Format', 'Export'])
    LOGGER.info(f'\nExports complete ({time.time() - t:.2f}s)')
    LOGGER.info(str(py))
    return py


def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default=ROOT / 'yolo.pt', help='weights path')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='inference size (pixels)')
    parser.add_argument('--batch-size', type=int, default=1, help='batch size')
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--test', action='store_true', help='test exports only')
    parser.add_argument('--pt-only', action='store_true', help='test PyTorch only')
    parser.add_argument('--hard-fail', nargs='?', const=True, default=False, help='Exception on error or < min metric')
    opt = parser.parse_args()
    opt.data = check_yaml(opt.data)  # check YAML
    print_args(vars(opt))
    return opt


def main(opt):
    test(**vars(opt)) if opt.test else run(**vars(opt))


if __name__ == "__main__":
    opt = parse_opt()
    main(opt)
================================================
FILE: classify/predict.py
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Run YOLOv5 classification inference on images, videos, directories, globs, YouTube, webcam, streams, etc.

Usage - sources:
    $ python classify/predict.py --weights yolov5s-cls.pt --source 0                               # webcam
                                                                   img.jpg                         # image
                                                                   vid.mp4                         # video
                                                                   screen                          # screenshot
                                                                   path/                           # directory
                                                                   'path/*.jpg'                    # glob
                                                                   'https://youtu.be/Zgi9g1ksQHc'  # YouTube
                                                                   'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream

Usage - formats:
    $ python classify/predict.py --weights yolov5s-cls.pt                 # PyTorch
                                           yolov5s-cls.torchscript        # TorchScript
                                           yolov5s-cls.onnx               # ONNX Runtime or OpenCV DNN with --dnn
                                           yolov5s-cls_openvino_model     # OpenVINO
                                           yolov5s-cls.engine             # TensorRT
                                           yolov5s-cls.mlmodel            # CoreML (macOS-only)
                                           yolov5s-cls_saved_model        # TensorFlow SavedModel
                                           yolov5s-cls.pb                 # TensorFlow GraphDef
                                           yolov5s-cls.tflite             # TensorFlow Lite
                                           yolov5s-cls_edgetpu.tflite     # TensorFlow Edge TPU
                                           yolov5s-cls_paddle_model       # PaddlePaddle
"""

import argparse
import os
import platform
import sys
from pathlib import Path

import torch
import torch.nn.functional as F

FILE = Path(__file__).resolve()
ROOT = FILE.parents[1]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.common import DetectMultiBackend
from utils.augmentations import classify_transforms
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements, colorstr, cv2,
                           increment_path, print_args, strip_optimizer)
from utils.plots import Annotator
from utils.torch_utils import select_device, smart_inference_mode


@smart_inference_mode()
def run(
weights=ROOT / 'yolov5s-cls.pt', # model.pt path(s)
source=ROOT / 'data/images', # file/dir/URL/glob/screen/0(webcam)
data=ROOT / 'data/coco128.yaml', # dataset.yaml path
imgsz=(224, 224), # inference size (height, width)
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
view_img=False, # show results
save_txt=False, # save results to *.txt
nosave=False, # do not save images/videos
augment=False, # augmented inference
visualize=False, # visualize features
update=False, # update all models
project=ROOT / 'runs/predict-cls', # save results to project/name
name='exp', # save results to project/name
exist_ok=False, # existing project/name ok, do not increment
half=False, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
vid_stride=1, # video frame-rate stride
):
source = str(source)
save_img = not nosave and not source.endswith('.txt') # save inference images
is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
screenshot = source.lower().startswith('screen')
if is_url and is_file:
source = check_file(source) # download
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
(save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir
# Load model
device = select_device(device)
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
stride, names, pt = model.stride, model.names, model.pt
imgsz = check_img_size(imgsz, s=stride) # check image size
# Dataloader
bs = 1 # batch_size
if webcam:
view_img = check_imshow(warn=True)
dataset = LoadStreams(source, img_size=imgsz, transforms=classify_transforms(imgsz[0]), vid_stride=vid_stride)
bs = len(dataset)
elif screenshot:
dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
else:
dataset = LoadImages(source, img_size=imgsz, transforms=classify_transforms(imgsz[0]), vid_stride=vid_stride)
vid_path, vid_writer = [None] * bs, [None] * bs
# Run inference
model.warmup(imgsz=(1 if pt else bs, 3, *imgsz)) # warmup
seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
for path, im, im0s, vid_cap, s in dataset:
with dt[0]:
im = torch.Tensor(im).to(model.device)
im = im.half() if model.fp16 else im.float() # uint8 to fp16/32
if len(im.shape) == 3:
im = im[None] # expand for batch dim
# Inference
with dt[1]:
results = model(im)
# Post-process
with dt[2]:
pred = F.softmax(results, dim=1) # probabilities
# Process predictions
for i, prob in enumerate(pred): # per image
seen += 1
if webcam: # batch_size >= 1
p, im0, frame = path[i], im0s[i].copy(), dataset.count
s += f'{i}: '
else:
p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)
p = Path(p) # to Path
save_path = str(save_dir / p.name) # im.jpg
txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}') # im.txt
s += '%gx%g ' % im.shape[2:] # print string
annotator = Annotator(im0, example=str(names), pil=True)
# Print results
top5i = prob.argsort(0, descending=True)[:5].tolist() # top 5 indices
s += f"{', '.join(f'{names[j]} {prob[j]:.2f}' for j in top5i)}, "
# Write results
text = '\n'.join(f'{prob[j]:.2f} {names[j]}' for j in top5i)
if save_img or view_img: # Add top-5 class text to image
annotator.text((32, 32), text, txt_color=(255, 255, 255))
if save_txt: # Write to file
with open(f'{txt_path}.txt', 'a') as f:
f.write(text + '\n')
# Stream results
im0 = annotator.result()
if view_img:
if platform.system() == 'Linux' and p not in windows:
windows.append(p)
cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO) # allow window resize (Linux)
cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
cv2.imshow(str(p), im0)
cv2.waitKey(1) # 1 millisecond
# Save results (image with detections)
if save_img:
if dataset.mode == 'image':
cv2.imwrite(save_path, im0)
else: # 'video' or 'stream'
if vid_path[i] != save_path: # new video
vid_path[i] = save_path
if isinstance(vid_writer[i], cv2.VideoWriter):
vid_writer[i].release() # release previous video writer
if vid_cap: # video
fps = vid_cap.get(cv2.CAP_PROP_FPS)
w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
else: # stream
fps, w, h = 30, im0.shape[1], im0.shape[0]
save_path = str(Path(save_path).with_suffix('.mp4')) # force *.mp4 suffix on results videos
vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
vid_writer[i].write(im0)
# Print time (inference-only)
LOGGER.info(f"{s}{dt[1].dt * 1E3:.1f}ms")
# Print results
t = tuple(x.t / seen * 1E3 for x in dt) # speeds per image
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms post-process per image at shape {(1, 3, *imgsz)}' % t)
if save_txt or save_img:
s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
if update:
strip_optimizer(weights[0]) # update model (to fix SourceChangeWarning)
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s-cls.pt', help='model path(s)')
parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob/screen/0(webcam)')
parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='(optional) dataset.yaml path')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[224], help='inference size h,w')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='show results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--visualize', action='store_true', help='visualize features')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default=ROOT / 'runs/predict-cls', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
opt = parser.parse_args()
opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1 # expand
print_args(vars(opt))
return opt
def main(opt):
check_requirements(exclude=('tensorboard', 'thop'))
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
================================================
FILE: classify/train.py
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Train a YOLOv5 classifier model on a classification dataset
Usage - Single-GPU training:
$ python classify/train.py --model yolov5s-cls.pt --data imagenette160 --epochs 5 --img 224
Usage - Multi-GPU DDP training:
$ python -m torch.distributed.run --nproc_per_node 4 --master_port 1 classify/train.py --model yolov5s-cls.pt --data imagenet --epochs 5 --img 224 --device 0,1,2,3
Datasets: --data mnist, fashion-mnist, cifar10, cifar100, imagenette, imagewoof, imagenet, or 'path/to/data'
YOLOv5-cls models: --model yolov5n-cls.pt, yolov5s-cls.pt, yolov5m-cls.pt, yolov5l-cls.pt, yolov5x-cls.pt
Torchvision models: --model resnet50, efficientnet_b0, etc. See https://pytorch.org/vision/stable/models.html
"""
import argparse
import os
import subprocess
import sys
import time
from copy import deepcopy
from datetime import datetime
from pathlib import Path
import torch
import torch.distributed as dist
import torch.hub as hub
import torch.optim.lr_scheduler as lr_scheduler
import torchvision
from torch.cuda import amp
from tqdm import tqdm
FILE = Path(__file__).resolve()
ROOT = FILE.parents[1] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from classify import val as validate
from models.experimental import attempt_load
from models.yolo import ClassificationModel, DetectionModel
from utils.dataloaders import create_classification_dataloader
from utils.general import (DATASETS_DIR, LOGGER, TQDM_BAR_FORMAT, WorkingDirectory, check_git_info, check_git_status,
check_requirements, colorstr, download, increment_path, init_seeds, print_args, yaml_save)
from utils.loggers import GenericLogger
from utils.plots import imshow_cls
from utils.torch_utils import (ModelEMA, model_info, reshape_classifier_output, select_device, smart_DDP,
smart_optimizer, smartCrossEntropyLoss, torch_distributed_zero_first)
LOCAL_RANK = int(os.getenv('LOCAL_RANK', -1)) # https://pytorch.org/docs/stable/elastic/run.html
RANK = int(os.getenv('RANK', -1))
WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1))
GIT_INFO = check_git_info()
def train(opt, device):
init_seeds(opt.seed + 1 + RANK, deterministic=True)
save_dir, data, bs, epochs, nw, imgsz, pretrained = \
opt.save_dir, Path(opt.data), opt.batch_size, opt.epochs, min(os.cpu_count() - 1, opt.workers), \
opt.imgsz, str(opt.pretrained).lower() == 'true'
cuda = device.type != 'cpu'
# Directories
wdir = save_dir / 'weights'
wdir.mkdir(parents=True, exist_ok=True) # make dir
last, best = wdir / 'last.pt', wdir / 'best.pt'
# Save run settings
yaml_save(save_dir / 'opt.yaml', vars(opt))
# Logger
logger = GenericLogger(opt=opt, console_logger=LOGGER) if RANK in {-1, 0} else None
# Download Dataset
with torch_distributed_zero_first(LOCAL_RANK), WorkingDirectory(ROOT):
data_dir = data if data.is_dir() else (DATASETS_DIR / data)
if not data_dir.is_dir():
LOGGER.info(f'\nDataset not found ⚠️, missing path {data_dir}, attempting download...')
t = time.time()
if str(data) == 'imagenet':
subprocess.run(f"bash {ROOT / 'data/scripts/get_imagenet.sh'}", shell=True, check=True)
else:
url = f'https://github.com/ultralytics/yolov5/releases/download/v1.0/{data}.zip'
download(url, dir=data_dir.parent)
s = f"Dataset download success ✅ ({time.time() - t:.1f}s), saved to {colorstr('bold', data_dir)}\n"
LOGGER.info(s)
# Dataloaders
nc = len([x for x in (data_dir / 'train').glob('*') if x.is_dir()]) # number of classes
trainloader = create_classification_dataloader(path=data_dir / 'train',
imgsz=imgsz,
batch_size=bs // WORLD_SIZE,
augment=True,
cache=opt.cache,
rank=LOCAL_RANK,
workers=nw)
test_dir = data_dir / 'test' if (data_dir / 'test').exists() else data_dir / 'val' # data/test or data/val
if RANK in {-1, 0}:
testloader = create_classification_dataloader(path=test_dir,
imgsz=imgsz,
batch_size=bs // WORLD_SIZE * 2,
augment=False,
cache=opt.cache,
rank=-1,
workers=nw)
# Model
with torch_distributed_zero_first(LOCAL_RANK), WorkingDirectory(ROOT):
if Path(opt.model).is_file() or opt.model.endswith('.pt'):
model = attempt_load(opt.model, device='cpu', fuse=False)
elif opt.model in torchvision.models.__dict__: # TorchVision models i.e. resnet50, efficientnet_b0
model = torchvision.models.__dict__[opt.model](weights='IMAGENET1K_V1' if pretrained else None)
else:
m = hub.list('ultralytics/yolov5') # + hub.list('pytorch/vision') # models
raise ModuleNotFoundError(f'--model {opt.model} not found. Available models are: \n' + '\n'.join(m))
if isinstance(model, DetectionModel):
LOGGER.warning("WARNING ⚠️ pass YOLOv5 classifier model with '-cls' suffix, i.e. '--model yolov5s-cls.pt'")
model = ClassificationModel(model=model, nc=nc, cutoff=opt.cutoff or 10) # convert to classification model
reshape_classifier_output(model, nc) # update class count
for m in model.modules():
if not pretrained and hasattr(m, 'reset_parameters'):
m.reset_parameters()
if isinstance(m, torch.nn.Dropout) and opt.dropout is not None:
m.p = opt.dropout # set dropout
for p in model.parameters():
p.requires_grad = True # for training
model = model.to(device)
# Info
if RANK in {-1, 0}:
model.names = trainloader.dataset.classes # attach class names
model.transforms = testloader.dataset.torch_transforms # attach inference transforms
model_info(model)
if opt.verbose:
LOGGER.info(model)
images, labels = next(iter(trainloader))
file = imshow_cls(images[:25], labels[:25], names=model.names, f=save_dir / 'train_images.jpg')
logger.log_images(file, name='Train Examples')
logger.log_graph(model, imgsz) # log model
# Optimizer
optimizer = smart_optimizer(model, opt.optimizer, opt.lr0, momentum=0.9, decay=opt.decay)
# Scheduler
lrf = 0.01 # final lr (fraction of lr0)
# lf = lambda x: ((1 + math.cos(x * math.pi / epochs)) / 2) * (1 - lrf) + lrf # cosine
lf = lambda x: (1 - x / epochs) * (1 - lrf) + lrf # linear
scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
# scheduler = lr_scheduler.OneCycleLR(optimizer, max_lr=lr0, total_steps=epochs, pct_start=0.1,
# final_div_factor=1 / 25 / lrf)
# EMA
ema = ModelEMA(model) if RANK in {-1, 0} else None
# DDP mode
if cuda and RANK != -1:
model = smart_DDP(model)
# Train
t0 = time.time()
criterion = smartCrossEntropyLoss(label_smoothing=opt.label_smoothing) # loss function
best_fitness = 0.0
scaler = amp.GradScaler(enabled=cuda)
val = test_dir.stem # 'val' or 'test'
LOGGER.info(f'Image sizes {imgsz} train, {imgsz} test\n'
f'Using {nw * WORLD_SIZE} dataloader workers\n'
f"Logging results to {colorstr('bold', save_dir)}\n"
f'Starting {opt.model} training on {data} dataset with {nc} classes for {epochs} epochs...\n\n'
f"{'Epoch':>10}{'GPU_mem':>10}{'train_loss':>12}{f'{val}_loss':>12}{'top1_acc':>12}{'top5_acc':>12}")
for epoch in range(epochs): # loop over the dataset multiple times
tloss, vloss, fitness = 0.0, 0.0, 0.0 # train loss, val loss, fitness
model.train()
if RANK != -1:
trainloader.sampler.set_epoch(epoch)
pbar = enumerate(trainloader)
if RANK in {-1, 0}:
pbar = tqdm(enumerate(trainloader), total=len(trainloader), bar_format=TQDM_BAR_FORMAT)
for i, (images, labels) in pbar: # progress bar
images, labels = images.to(device, non_blocking=True), labels.to(device)
# Forward
with amp.autocast(enabled=cuda): # stability issues when enabled
loss = criterion(model(images), labels)
# Backward
scaler.scale(loss).backward()
# Optimize
scaler.unscale_(optimizer) # unscale gradients
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0) # clip gradients
scaler.step(optimizer)
scaler.update()
optimizer.zero_grad()
if ema:
ema.update(model)
if RANK in {-1, 0}:
# Print
tloss = (tloss * i + loss.item()) / (i + 1) # update mean losses
mem = '%.3gG' % (torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0) # (GB)
pbar.desc = f"{f'{epoch + 1}/{epochs}':>10}{mem:>10}{tloss:>12.3g}" + ' ' * 36
# Test
if i == len(pbar) - 1: # last batch
top1, top5, vloss = validate.run(model=ema.ema,
dataloader=testloader,
criterion=criterion,
pbar=pbar) # test accuracy, loss
fitness = top1 # define fitness as top1 accuracy
# Scheduler
scheduler.step()
# Log metrics
if RANK in {-1, 0}:
# Best fitness
if fitness > best_fitness:
best_fitness = fitness
# Log
metrics = {
"train/loss": tloss,
f"{val}/loss": vloss,
"metrics/accuracy_top1": top1,
"metrics/accuracy_top5": top5,
"lr/0": optimizer.param_groups[0]['lr']} # learning rate
logger.log_metrics(metrics, epoch)
# Save model
final_epoch = epoch + 1 == epochs
if (not opt.nosave) or final_epoch:
ckpt = {
'epoch': epoch,
'best_fitness': best_fitness,
'model': deepcopy(ema.ema).half(), # deepcopy(de_parallel(model)).half(),
'ema': None, # deepcopy(ema.ema).half(),
'updates': ema.updates,
'optimizer': None, # optimizer.state_dict(),
'opt': vars(opt),
'git': GIT_INFO, # {remote, branch, commit} if a git repo
'date': datetime.now().isoformat()}
# Save last, best and delete
torch.save(ckpt, last)
if best_fitness == fitness:
torch.save(ckpt, best)
del ckpt
# Train complete
if RANK in {-1, 0} and final_epoch:
LOGGER.info(f'\nTraining complete ({(time.time() - t0) / 3600:.3f} hours)'
f"\nResults saved to {colorstr('bold', save_dir)}"
f"\nPredict: python classify/predict.py --weights {best} --source im.jpg"
f"\nValidate: python classify/val.py --weights {best} --data {data_dir}"
f"\nExport: python export.py --weights {best} --include onnx"
f"\nPyTorch Hub: model = torch.hub.load('ultralytics/yolov5', 'custom', '{best}')"
f"\nVisualize: https://netron.app\n")
# Plot examples
images, labels = (x[:25] for x in next(iter(testloader))) # first 25 images and labels
pred = torch.max(ema.ema(images.to(device)), 1)[1]
file = imshow_cls(images, labels, pred, model.names, verbose=False, f=save_dir / 'test_images.jpg')
# Log results
meta = {"epochs": epochs, "top1_acc": best_fitness, "date": datetime.now().isoformat()}
logger.log_images(file, name='Test Examples (true-predicted)', epoch=epoch)
logger.log_model(best, epochs, metadata=meta)
def parse_opt(known=False):
parser = argparse.ArgumentParser()
parser.add_argument('--model', type=str, default='yolov5s-cls.pt', help='initial weights path')
parser.add_argument('--data', type=str, default='imagenette160', help='cifar10, cifar100, mnist, imagenet, ...')
parser.add_argument('--epochs', type=int, default=10, help='total training epochs')
parser.add_argument('--batch-size', type=int, default=64, help='total batch size for all GPUs')
parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=224, help='train, val image size (pixels)')
parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
parser.add_argument('--cache', type=str, nargs='?', const='ram', help='--cache images in "ram" (default) or "disk"')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
parser.add_argument('--project', default=ROOT / 'runs/train-cls', help='save to project/name')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--pretrained', nargs='?', const=True, default=True, help='start from pretrained weights; pass --pretrained False to train from scratch')
parser.add_argument('--optimizer', choices=['SGD', 'Adam', 'AdamW', 'RMSProp'], default='Adam', help='optimizer')
parser.add_argument('--lr0', type=float, default=0.001, help='initial learning rate')
parser.add_argument('--decay', type=float, default=5e-5, help='weight decay')
parser.add_argument('--label-smoothing', type=float, default=0.1, help='Label smoothing epsilon')
parser.add_argument('--cutoff', type=int, default=None, help='Model layer cutoff index for Classify() head')
parser.add_argument('--dropout', type=float, default=None, help='Dropout (fraction)')
parser.add_argument('--verbose', action='store_true', help='Verbose mode')
parser.add_argument('--seed', type=int, default=0, help='Global training seed')
parser.add_argument('--local_rank', type=int, default=-1, help='Automatic DDP Multi-GPU argument, do not modify')
return parser.parse_known_args()[0] if known else parser.parse_args()
def main(opt):
# Checks
if RANK in {-1, 0}:
print_args(vars(opt))
check_git_status()
check_requirements()
# DDP mode
device = select_device(opt.device, batch_size=opt.batch_size)
if LOCAL_RANK != -1:
assert opt.batch_size != -1, 'AutoBatch is coming soon for classification, please pass a valid --batch-size'
assert opt.batch_size % WORLD_SIZE == 0, f'--batch-size {opt.batch_size} must be multiple of WORLD_SIZE'
assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command'
torch.cuda.set_device(LOCAL_RANK)
device = torch.device('cuda', LOCAL_RANK)
dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo")
# Parameters
opt.save_dir = increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok) # increment run
# Train
train(opt, device)
def run(**kwargs):
# Usage: from yolov5 import classify; classify.train.run(data='mnist', imgsz=320, model='yolov5m-cls.pt')
opt = parse_opt(True)
for k, v in kwargs.items():
setattr(opt, k, v)
main(opt)
return opt
if __name__ == "__main__":
opt = parse_opt()
main(opt)
================================================
FILE: classify/val.py
================================================
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Validate a trained YOLOv5 classification model on a classification dataset
Usage:
$ bash data/scripts/get_imagenet.sh --val # download ImageNet val split (6.3G, 50000 images)
$ python classify/val.py --weights yolov5m-cls.pt --data ../datasets/imagenet --img 224 # validate ImageNet
Usage - formats:
$ python classify/val.py --weights yolov5s-cls.pt # PyTorch
yolov5s-cls.torchscript # TorchScript
yolov5s-cls.onnx # ONNX Runtime or OpenCV DNN with --dnn
yolov5s-cls_openvino_model # OpenVINO
yolov5s-cls.engine # TensorRT
yolov5s-cls.mlmodel # CoreML (macOS-only)
yolov5s-cls_saved_model # TensorFlow SavedModel
yolov5s-cls.pb # TensorFlow GraphDef
yolov5s-cls.tflite # TensorFlow Lite
yolov5s-cls_edgetpu.tflite # TensorFlow Edge TPU
yolov5s-cls_paddle_model # PaddlePaddle
"""
import argparse
import os
import sys
from pathlib import Path
import torch
from tqdm import tqdm
FILE = Path(__file__).resolve()
ROOT = FILE.parents[1] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.common import DetectMultiBackend
from utils.dataloaders import create_classification_dataloader
from utils.general import (LOGGER, TQDM_BAR_FORMAT, Profile, check_img_size, check_requirements, colorstr,
increment_path, print_args)
from utils.torch_utils import select_device, smart_inference_mode
@smart_inference_mode()
def run(
data=ROOT / '../datasets/mnist', # dataset dir
weights=ROOT / 'yolov5s-cls.pt', # model.pt path(s)
batch_size=128, # batch size
imgsz=224, # inference size (pixels)
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
workers=8, # max dataloader workers (per RANK in DDP mode)
verbose=False, # verbose output
project=ROOT / 'runs/val-cls', # save to project/name
name='exp', # save to project/name
exist_ok=False, # existing project/name ok, do not increment
half=False, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
model=None,
dataloader=None,
criterion=None,
pbar=None,
):
# Initialize/load model and set device
training = model is not None
if training: # called by train.py
device, pt, jit, engine = next(model.parameters()).device, True, False, False # get model device, PyTorch model
half &= device.type != 'cpu' # half precision only supported on CUDA
model.half() if half else model.float()
else: # called directly
device = select_device(device, batch_size=batch_size)
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
save_dir.mkdir(parents=True, exist_ok=True) # make dir
# Load model
model = DetectMultiBackend(weights, device=device, dnn=dnn, fp16=half)
stride, pt, jit, engine = model.stride, model.pt, model.jit, model.engine
imgsz = check_img_size(imgsz, s=stride) # check image size
half = model.fp16 # FP16 supported on limited backends with CUDA
if engine:
batch_size = model.batch_size
else:
device = model.device
if not (pt or jit):
batch_size = 1 # export.py models default to batch-size 1
LOGGER.info(f'Forcing --batch-size 1 square inference (1,3,{imgsz},{imgsz}) for non-PyTorch models')
# Dataloader
data = Path(data)
test_dir = data / 'test' if (data / 'test').exists() else data / 'val' # data/test or data/val
dataloader = create_classification_dataloader(path=test_dir,
imgsz=imgsz,
batch_size=batch_size,
augment=False,
rank=-1,
workers=workers)
model.eval()
pred, targets, loss, dt = [], [], 0, (Profile(), Profile(), Profile())
n = len(dataloader) # number of batches
action = 'validating' if dataloader.dataset.root.stem == 'val' else 'testing'
desc = f"{pbar.desc[:-36]}{action:>36}" if pbar else f"{action}"
bar = tqdm(dataloader, desc, n, not training, bar_format=TQDM_BAR_FORMAT, position=0)
with torch.cuda.amp.autocast(enabled=device.type != 'cpu'):
for images, labels in bar:
with dt[0]:
images, labels = images.to(device, non_blocking=True), labels.to(device)
with dt[1]:
y = model(images)
with dt[2]:
pred.append(y.argsort(1, descending=True)[:, :5])
targets.append(labels)
if criterion:
loss += criterion(y, labels)
loss /= n
pred, targets = torch.cat(pred), torch.cat(targets)
correct = (targets[:, None] == pred).float()
acc = torch.stack((correct[:, 0], correct.max(1).values), dim=1) # (top1, top5) accuracy
top1, top5 = acc.mean(0).tolist()
if pbar:
pbar.desc = f"{pbar.desc[:-36]}{loss:>12.3g}{top1:>12.3g}{top5:>12.3g}"
if verbose: # all classes
LOGGER.info(f"{'Class':>24}{'Images':>12}{'top1_acc':>12}{'top5_acc':>12}")
LOGGER.info(f"{'all':>24}{targets.shape[0]:>12}{top1:>12.3g}{top5:>12.3g}")
for i, c in model.names.items():
aci = acc[targets == i]
top1i, top5i = aci.mean(0).tolist()
LOGGER.info(f"{c:>24}{aci.shape[0]:>12}{top1i:>12.3g}{top5i:>12.3g}")
# Print results
t = tuple(x.t / len(dataloader.dataset.samples) * 1E3 for x in dt) # speeds per image
shape = (1, 3, imgsz, imgsz)
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms post-process per image at shape {shape}' % t)
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}")
return top1, top5, loss
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--data', type=str, default=ROOT / '../datasets/mnist', help='dataset path')
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s-cls.pt', help='model.pt path(s)')
parser.add_argument('--batch-size', type=int, default=128, help='batch size')
parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=224, help='inference size (pixels)')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
parser.add_argument('--verbose', nargs='?', const=True, default=True, help='verbose output')
parser.add_argument('--project', default=ROOT / 'runs/val-cls', help='save to project/name')
parser.add_argument('--name', default='exp', help='save to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
opt = parser.parse_args()
print_args(vars(opt))
return opt
def main(opt):
check_requirements(exclude=('tensorboard', 'thop'))
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
================================================
FILE: data/coco.yaml
================================================
path: ../datasets/coco # dataset root dir
train: train2017.txt # train images (relative to 'path') 118287 images
val: val2017.txt # val images (relative to 'path') 5000 images
test: test-dev2017.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794
# Classes
names:
0: person
1: bicycle
2: car
3: motorcycle
4: airplane
5: bus
6: train
7: truck
8: boat
9: traffic light
10: fire hydrant
11: stop sign
12: parking meter
13: bench
14: bird
15: cat
16: dog
17: horse
18: sheep
19: cow
20: elephant
21: bear
22: zebra
23: giraffe
24: backpack
25: umbrella
26: handbag
27: tie
28: suitcase
29: frisbee
30: skis
31: snowboard
32: sports ball
33: kite
34: baseball bat
35: baseball glove
36: skateboard
37: surfboard
38: tennis racket
39: bottle
40: wine glass
41: cup
42: fork
43: knife
44: spoon
45: bowl
46: banana
47: apple
48: sandwich
49: orange
50: broccoli
51: carrot
52: hot dog
53: pizza
54: donut
55: cake
56: chair
57: couch
58: potted plant
59: bed
60: dining table
61: toilet
62: tv
63: laptop
64: mouse
65: remote
66: keyboard
67: cell phone
68: microwave
69: oven
70: toaster
71: sink
72: refrigerator
73: book
74: clock
75: vase
76: scissors
77: teddy bear
78: hair drier
79: toothbrush
# stuff names
stuff_names: [
'banner', 'blanket', 'branch', 'bridge', 'building-other', 'bush', 'cabinet', 'cage',
'cardboard', 'carpet', 'ceiling-other', 'ceiling-tile', 'cloth', 'clothes', 'clouds', 'counter', 'cupboard',
'curtain', 'desk-stuff', 'dirt', 'door-stuff', 'fence', 'floor-marble', 'floor-other', 'floor-stone', 'floor-tile',
'floor-wood', 'flower', 'fog', 'food-other', 'fruit', 'furniture-other', 'grass', 'gravel', 'ground-other', 'hill',
'house', 'leaves', 'light', 'mat', 'metal', 'mirror-stuff', 'moss', 'mountain', 'mud', 'napkin', 'net', 'paper',
'pavement', 'pillow', 'plant-other', 'plastic', 'platform', 'playingfield', 'railing', 'railroad', 'river', 'road',
'rock', 'roof', 'rug', 'salad', 'sand', 'sea', 'shelf', 'sky-other', 'skyscraper', 'snow', 'solid-other', 'stairs',
'stone', 'straw', 'structural-other', 'table', 'tent', 'textile-other', 'towel', 'tree', 'vegetable', 'wall-brick',
'wall-concrete', 'wall-other', 'wall-panel', 'wall-stone', 'wall-tile', 'wall-wood', 'water-other', 'waterdrops',
'window-blind', 'window-other', 'wood',
# other
'other',
# unlabeled
'unlabeled'
]
# Download script/URL (optional)
download: |
from utils.general import download, Path
# Download labels
#segments = True # segment or box labels
#dir = Path(yaml['path']) # dataset root dir
#url = 'https://github.com/WongKinYiu/yolov7/releases/download/v0.1/'
#urls = [url + ('coco2017labels-segments.zip' if segments else 'coco2017labels.zip')] # labels
#download(urls, dir=dir.parent)
# Download data
#urls = ['http://images.cocodataset.org/zips/train2017.zip', # 19G, 118k images
# 'http://images.cocodataset.org/zips/val2017.zip', # 1G, 5k images
# 'http://images.cocodataset.org/zips/test2017.zip'] # 7G, 41k images (optional)
#download(urls, dir=dir / 'images', threads=3)
================================================
FILE: data/hyps/hyp.scratch-high.yaml
================================================
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.01 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 7.5 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 0.7 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
dfl: 1.5 # dfl loss gain
iou_t: 0.20 # IoU training threshold
anchor_t: 5.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (EfficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.9 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.15 # image mixup (probability)
copy_paste: 0.3 # segment copy-paste (probability)
================================================
FILE: detect.py
================================================
import argparse
import os
import platform
import sys
from pathlib import Path
import torch
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0] # YOLO root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.common import DetectMultiBackend
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements, colorstr, cv2,
increment_path, non_max_suppression, print_args, scale_boxes, strip_optimizer, xyxy2xywh)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import select_device, smart_inference_mode
@smart_inference_mode()
def run(
weights=ROOT / 'yolo.pt', # model path or triton URL
source=ROOT / 'data/images', # file/dir/URL/glob/screen/0(webcam)
data=ROOT / 'data/coco.yaml', # dataset.yaml path
imgsz=(640, 640), # inference size (height, width)
conf_thres=0.25, # confidence threshold
iou_thres=0.45, # NMS IOU threshold
max_det=1000, # maximum detections per image
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
view_img=False, # show results
save_txt=False, # save results to *.txt
save_conf=False, # save confidences in --save-txt labels
save_crop=False, # save cropped prediction boxes
nosave=False, # do not save images/videos
classes=None, # filter by class: --classes 0, or --classes 0 2 3
agnostic_nms=False, # class-agnostic NMS
augment=False, # augmented inference
visualize=False, # visualize features
update=False, # update all models
project=ROOT / 'runs/detect', # save results to project/name
name='exp', # save results to project/name
exist_ok=False, # existing project/name ok, do not increment
line_thickness=3, # bounding box thickness (pixels)
hide_labels=False, # hide labels
hide_conf=False, # hide confidences
half=False, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
vid_stride=1, # video frame-rate stride
):
source = str(source)
save_img = not nosave and not source.endswith('.txt') # save inference images
is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
screenshot = source.lower().startswith('screen')
if is_url and is_file:
source = check_file(source) # download
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
(save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir
# Load model
device = select_device(device)
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
stride, names, pt = model.stride, model.names, model.pt
imgsz = check_img_size(imgsz, s=stride) # check image size
# Dataloader
bs = 1 # batch_size
if webcam:
view_img = check_imshow(warn=True)
dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
bs = len(dataset)
elif screenshot:
dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
else:
dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
vid_path, vid_writer = [None] * bs, [None] * bs
# Run inference
model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz)) # warmup
seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
for path, im, im0s, vid_cap, s in dataset:
with dt[0]:
im = torch.from_numpy(im).to(model.device)
im = im.half() if model.fp16 else im.float() # uint8 to fp16/32
im /= 255 # 0 - 255 to 0.0 - 1.0
if len(im.shape) == 3:
im = im[None] # expand for batch dim
# Inference
with dt[1]:
visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
pred = model(im, augment=augment, visualize=visualize)
# NMS
with dt[2]:
pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
# Second-stage classifier (optional)
# pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)
# Process predictions
for i, det in enumerate(pred): # per image
seen += 1
if webcam: # batch_size >= 1
p, im0, frame = path[i], im0s[i].copy(), dataset.count
s += f'{i}: '
else:
p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)
p = Path(p) # to Path
save_path = str(save_dir / p.name) # im.jpg
txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}') # im.txt
s += '%gx%g ' % im.shape[2:] # print string
gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh
imc = im0.copy() if save_crop else im0 # for save_crop
annotator = Annotator(im0, line_width=line_thickness, example=str(names))
if len(det):
# Rescale boxes from img_size to im0 size
det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], im0.shape).round()
# Print results
for c in det[:, 5].unique():
n = (det[:, 5] == c).sum() # detections per class
s += f"{n} {names[int(c)]}{'s' * (n > 1)}, " # add to string
# Write results
for *xyxy, conf, cls in reversed(det):
if save_txt: # Write to file
xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
line = (cls, *xywh, conf) if save_conf else (cls, *xywh) # label format
with open(f'{txt_path}.txt', 'a') as f:
f.write(('%g ' * len(line)).rstrip() % line + '\n')
if save_img or save_crop or view_img: # Add bbox to image
c = int(cls) # integer class
label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}')
annotator.box_label(xyxy, label, color=colors(c, True))
if save_crop:
save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True)
# Stream results
im0 = annotator.result()
if view_img:
if platform.system() == 'Linux' and p not in windows:
windows.append(p)
cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO) # allow window resize (Linux)
cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
cv2.imshow(str(p), im0)
cv2.waitKey(1) # 1 millisecond
# Save results (image with detections)
if save_img:
if dataset.mode == 'image':
cv2.imwrite(save_path, im0)
else: # 'video' or 'stream'
if vid_path[i] != save_path: # new video
vid_path[i] = save_path
if isinstance(vid_writer[i], cv2.VideoWriter):
vid_writer[i].release() # release previous video writer
if vid_cap: # video
fps = vid_cap.get(cv2.CAP_PROP_FPS)
w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
else: # stream
fps, w, h = 30, im0.shape[1], im0.shape[0]
save_path = str(Path(save_path).with_suffix('.mp4')) # force *.mp4 suffix on results videos
vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
vid_writer[i].write(im0)
# Print time (inference-only)
LOGGER.info(f"{s}{'' if len(det) else '(no detections), '}{dt[1].dt * 1E3:.1f}ms")
# Print results
t = tuple(x.t / seen * 1E3 for x in dt) # speeds per image
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
if save_txt or save_img:
s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
if update:
strip_optimizer(weights[0]) # update model (to fix SourceChangeWarning)
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolo.pt', help='model path or triton URL')
parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob/screen/0(webcam)')
parser.add_argument('--data', type=str, default=ROOT / 'data/coco.yaml', help='(optional) dataset.yaml path')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='show results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--visualize', action='store_true', help='visualize features')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
opt = parser.parse_args()
opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1 # expand
print_args(vars(opt))
return opt
def main(opt):
# check_requirements(exclude=('tensorboard', 'thop'))
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
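# Illustrative sketch (not part of the original detect.py): how the --save-txt branch in run()
# turns a pixel-space xyxy box into a normalized "cls cx cy w h [conf]" label line.
# The helper names below (xyxy_to_normalized_xywh, format_label) are hypothetical, and plain
# Python floats stand in for the torch tensors and xyxy2xywh used in the script.

```python
def xyxy_to_normalized_xywh(xyxy, img_w, img_h):
    """Convert (x1, y1, x2, y2) in pixels to (cx, cy, w, h) scaled to the 0-1 range."""
    x1, y1, x2, y2 = xyxy
    cx = (x1 + x2) / 2 / img_w  # box centre x, divided by image width
    cy = (y1 + y2) / 2 / img_h  # box centre y, divided by image height
    w = (x2 - x1) / img_w       # box width, divided by image width
    h = (y2 - y1) / img_h       # box height, divided by image height
    return cx, cy, w, h


def format_label(cls, xywh, conf=None):
    """Build one label line; the confidence is appended only when it is given."""
    line = (cls, *xywh, conf) if conf is not None else (cls, *xywh)
    return ('%g ' * len(line)).rstrip() % line


box = xyxy_to_normalized_xywh((100, 200, 300, 400), img_w=640, img_h=480)
label = format_label(0, box, conf=0.9)
print(label)
```

# Dividing by (w, h, w, h) plays the role of the `gn` gain tensor in run(), and the '%g'
# format keeps at most six significant digits per value.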
================================================
FILE: detect_dual.py
================================================
import argparse
import os
import platform
import sys
from pathlib import Path
import torch
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0] # YOLO root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.common import DetectMultiBackend
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements, colorstr, cv2,
increment_path, non_max_suppression, print_args, scale_boxes, strip_optimizer, xyxy2xywh)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import select_device, smart_inference_mode
@smart_inference_mode()
def run(
weights=ROOT / 'yolo.pt', # model path or triton URL
source=ROOT / 'data/images', # file/dir/URL/glob/screen/0(webcam)
data=ROOT / 'data/coco.yaml', # dataset.yaml path
imgsz=(640, 640), # inference size (height, width)
conf_thres=0.25, # confidence threshold
iou_thres=0.45, # NMS IOU threshold
max_det=1000, # maximum detections per image
device='', # cuda device, i.e. 0 or 0,1,2,3 or cpu
view_img=False, # show results
save_txt=False, # save results to *.txt
save_conf=False, # save confidences in --save-txt labels
save_crop=False, # save cropped prediction boxes
nosave=False, # do not save images/videos
classes=None, # filter by class: --classes 0, or --classes 0 2 3
agnostic_nms=False, # class-agnostic NMS
augment=False, # augmented inference
visualize=False, # visualize features
update=False, # update all models
project=ROOT / 'runs/detect', # save results to project/name
name='exp', # save results to project/name
exist_ok=False, # existing project/name ok, do not increment
line_thickness=3, # bounding box thickness (pixels)
hide_labels=False, # hide labels
hide_conf=False, # hide confidences
half=False, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
vid_stride=1, # video frame-rate stride
):
source = str(source)
save_img = not nosave and not source.endswith('.txt') # save inference images
is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
screenshot = source.lower().startswith('screen')
if is_url and is_file:
source = check_file(source) # download
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
(save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir
# Load model
device = select_device(device)
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
stride, names, pt = model.stride, model.names, model.pt
imgsz = check_img_size(imgsz, s=stride) # check image size
# Dataloader
bs = 1 # batch_size
if webcam:
view_img = check_imshow(warn=True)
dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
bs = len(dataset)
elif screenshot:
dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
else:
dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt, vid_stride=vid_stride)
vid_path, vid_writer = [None] * bs, [None] * bs
# Run inference
model.warmup(imgsz=(1 if pt or model.triton else bs, 3, *imgsz)) # warmup
seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
for path, im, im0s, vid_cap, s in dataset:
with dt[0]:
im = torch.from_numpy(im).to(model.device)
im = im.half() if model.fp16 else im.float() # uint8 to fp16/32
im /= 255 # 0 - 255 to 0.0 - 1.0
if len(im.shape) == 3:
im = im[None] # expand for batch dim
# Inference
with dt[1]:
visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
pred = model(im, augment=augment, visualize=visualize)
pred = pred[0][1]
# NMS
with dt[2]:
pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
# Second-stage classifier (optional)
# pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)
# Process predictions
for i, det in enumerate(pred): # per image
seen += 1
if webcam: # batch_size >= 1
p, im0, frame = path[i], im0s[i].copy(), dataset.count
s += f'{i}: '
else:
p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)
p = Path(p) # to Path
save_path = str(save_dir / p.name) # im.jpg
txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}') # im.txt
s += '%gx%g ' % im.shape[2:] # print string
gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh
imc = im0.copy() if save_crop else im0 # for save_crop
annotator = Annotator(im0, line_width=line_thickness, example=str(names))
if len(det):
# Rescale boxes from img_size to im0 size
det[:, :4] = scale_boxes(im.shape[2:], det[:, :4], im0.shape).round()
# Print results
for c in det[:, 5].unique():
n = (det[:, 5] == c).sum() # detections per class
s += f"{n} {names[int(c)]}{'s' * (n > 1)}, " # add to string
# Write results
for *xyxy, conf, cls in reversed(det):
if save_txt: # Write to file
xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
line = (cls, *xywh, conf) if save_conf else (cls, *xywh) # label format
with open(f'{txt_path}.txt', 'a') as f:
f.write(('%g ' * len(line)).rstrip() % line + '\n')
if save_img or save_crop or view_img: # Add bbox to image
c = int(cls) # integer class
label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}')
annotator.box_label(xyxy, label, color=colors(c, True))
if save_crop:
save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True)
# Stream results
im0 = annotator.result()
if view_img:
if platform.system() == 'Linux' and p not in windows:
windows.append(p)
cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO) # allow window resize (Linux)
cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
cv2.imshow(str(p), im0)
cv2.waitKey(1) # 1 millisecond
# Save results (image with detections)
if save_img:
if dataset.mode == 'image':
cv2.imwrite(save_path, im0)
else: # 'video' or 'stream'
if vid_path[i] != save_path: # new video
vid_path[i] = save_path
if isinstance(vid_writer[i], cv2.VideoWriter):
vid_writer[i].release() # release previous video writer
if vid_cap: # video
fps = vid_cap.get(cv2.CAP_PROP_FPS)
w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
else: # stream
fps, w, h = 30, im0.shape[1], im0.shape[0]
save_path = str(Path(save_path).with_suffix('.mp4')) # force *.mp4 suffix on results videos
vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
vid_writer[i].write(im0)
# Print time (inference-only)
LOGGER.info(f"{s}{'' if len(det) else '(no detections), '}{dt[1].dt * 1E3:.1f}ms")
# Print results
t = tuple(x.t / seen * 1E3 for x in dt) # speeds per image
LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
if save_txt or save_img:
s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
if update:
strip_optimizer(weights[0]) # update model (to fix SourceChangeWarning)
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolo.pt', help='model path or triton URL')
parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob/screen/0(webcam)')
parser.add_argument('--data', type=str, default=ROOT / 'data/coco.yaml', help='(optional) dataset.yaml path')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='show results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--visualize', action='store_true', help='visualize features')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default=ROOT / 'runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
opt = parser.parse_args()
opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1 # expand
print_args(vars(opt))
return opt
def main(opt):
# check_requirements(exclude=('tensorboard', 'thop'))
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
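# Illustrative sketch (not part of the original detect_dual.py): how parse_opt() expands a
# single --imgsz value into a (height, width) pair. The wrapper name parse_imgsz is
# hypothetical; only the --imgsz argument of the real parser is reproduced here.

```python
import argparse


def parse_imgsz(argv):
    """Parse --imgsz and repeat a single value so the result always has two entries."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640])
    opt = parser.parse_args(argv)
    # One value: the list is doubled ([640] -> [640, 640]); two values: multiplied by 1 (unchanged)
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1
    return opt.imgsz


print(parse_imgsz([]))
print(parse_imgsz(['--img', '320']))
print(parse_imgsz(['--imgsz', '480', '640']))
```

# The conditional binds as `opt.imgsz *= (2 if ... else 1)`, so the expression is a list
# repetition rather than an arithmetic scaling of the image size.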
================================================
FILE: export.py
================================================
import argparse
import contextlib
import json
import os
import platform
import re
import subprocess
import sys
import time
import warnings
from pathlib import Path
import pandas as pd
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0] # YOLO root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
if platform.system() != 'Windows':
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.experimental import attempt_load, End2End
from models.yolo import ClassificationModel, Detect, DDetect, DualDetect, DualDDetect, DetectionModel, SegmentationModel
from utils.dataloaders import LoadImages
from utils.general import (LOGGER, Profile, check_dataset, check_img_size, check_requirements, check_version,
check_yaml, colorstr, file_size, get_default_args, print_args, url2file, yaml_save)
from utils.torch_utils import select_device, smart_inference_mode
MACOS = platform.system() == 'Darwin' # macOS environment
def export_formats():
# YOLO export formats
x = [
['PyTorch', '-', '.pt', True, True],
['TorchScript', 'torchscript', '.torchscript', True, True],
['ONNX', 'onnx', '.onnx', True, True],
['ONNX END2END', 'onnx_end2end', '_end2end.onnx', True, True],
['OpenVINO', 'openvino', '_openvino_model', True, False],
['TensorRT', 'engine', '.engine', False, True],
['CoreML', 'coreml', '.mlmodel', True, False],
['TensorFlow SavedModel', 'saved_model', '_saved_model', True, True],
['TensorFlow GraphDef', 'pb', '.pb', True, True],
['TensorFlow Lite', 'tflite', '.tflite', True, False],
['TensorFlow Edge TPU', 'edgetpu', '_edgetpu.tflite', False, False],
['TensorFlow.js', 'tfjs', '_web_model', False, False],
['PaddlePaddle', 'paddle', '_paddle_model', True, True],]
return pd.DataFrame(x, columns=['Format', 'Argument', 'Suffix', 'CPU', 'GPU'])
def try_export(inner_func):
# YOLO export decorator, i.e. @try_export
inner_args = get_default_args(inner_func)
def outer_func(*args, **kwargs):
prefix = inner_args['prefix']
try:
with Profile() as dt:
f, model = inner_func(*args, **kwargs)
LOGGER.info(f'{prefix} export success ✅ {dt.t:.1f}s, saved as {f} ({file_size(f):.1f} MB)')
return f, model
except Exception as e:
LOGGER.info(f'{prefix} export failure ❌ {dt.t:.1f}s: {e}')
return None, None
return outer_func
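# Illustrative sketch (not part of the original export.py): the try_export pattern reduced to
# the standard library. Assumptions: get_default_args is re-implemented here with
# inspect.signature as a stand-in for the utils.general helper, time.perf_counter replaces
# Profile, and export_ok / export_fail are hypothetical exporters used only for demonstration.

```python
import functools
import inspect
import logging
import time

LOGGER = logging.getLogger('export_sketch')  # hypothetical logger name


def get_default_args(func):
    """Return the keyword defaults declared in a function signature."""
    signature = inspect.signature(func)
    return {k: v.default for k, v in signature.parameters.items() if v.default is not inspect.Parameter.empty}


def try_export(inner_func):
    """Wrap an exporter so that a failure is logged and reported as (None, None)."""
    inner_args = get_default_args(inner_func)

    @functools.wraps(inner_func)
    def outer_func(*args, **kwargs):
        prefix = inner_args['prefix']  # every wrapped exporter must declare a `prefix` default
        start = time.perf_counter()
        try:
            f, model = inner_func(*args, **kwargs)
            LOGGER.info(f'{prefix} export success {time.perf_counter() - start:.1f}s, saved as {f}')
            return f, model
        except Exception as e:
            LOGGER.info(f'{prefix} export failure {time.perf_counter() - start:.1f}s: {e}')
            return None, None

    return outer_func


@try_export
def export_ok(file, prefix='OK:'):
    return file + '.out', None


@try_export
def export_fail(file, prefix='FAIL:'):
    raise RuntimeError('unsupported layer')


ok_result = export_ok('model')
fail_result = export_fail('model')
print(ok_result, fail_result)
```

# Because the wrapper swallows exceptions, callers have to check for a None path to detect
# a failed export instead of relying on a raised error.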
@try_export
def export_torchscript(model, im, file, optimize, prefix=colorstr('TorchScript:')):
# YOLO TorchScript model export
LOGGER.info(f'\n{prefix} starting export with torch {torch.__version__}...')
f = file.with_suffix('.torchscript')
ts = torch.jit.trace(model, im, strict=False)
d = {"shape": im.shape, "stride": int(max(model.stride)), "names": model.names}
extra_files = {'config.txt': json.dumps(d)} # torch._C.ExtraFilesMap()
if optimize: # https://pytorch.org/tutorials/recipes/mobile_interpreter.html
optimize_for_mobile(ts)._save_for_lite_interpreter(str(f), _extra_files=extra_files)
else:
ts.save(str(f), _extra_files=extra_files)
return f, None
@try_export
def export_onnx(model, im, file, opset, dynamic, simplify, prefix=colorstr('ONNX:')):
# YOLO ONNX export
check_requirements('onnx')
import onnx
LOGGER.info(f'\n{prefix} starting export with onnx {onnx.__version__}...')
f = file.with_suffix('.onnx')
output_names = ['output0', 'output1'] if isinstance(model, SegmentationModel) else ['output0']
if dynamic:
dynamic = {'images': {0: 'batch', 2: 'height', 3: 'width'}} # shape(1,3,640,640)
if isinstance(model, SegmentationModel):
dynamic['output0'] = {0: 'batch', 1: 'anchors'} # shape(1,25200,85)
dynamic['output1'] = {0: 'batch', 2: 'mask_height', 3: 'mask_width'} # shape(1,32,160,160)
elif isinstance(model, DetectionModel):
dynamic['output0'] = {0: 'batch', 1: 'anchors'} # shape(1,25200,85)
torch.onnx.export(
model.cpu() if dynamic else model, # --dynamic only compatible with cpu
im.cpu() if dynamic else im,
f,
verbose=False,
opset_version=opset,
do_constant_folding=True,
input_names=['images'],
output_names=output_names,
dynamic_axes=dynamic or None)
# Checks
model_onnx = onnx.load(f) # load onnx model
onnx.checker.check_model(model_onnx) # check onnx model
# Metadata
d = {'stride': int(max(model.stride)), 'names': model.names}
for k, v in d.items():
meta = model_onnx.metadata_props.add()
meta.key, meta.value = k, str(v)
onnx.save(model_onnx, f)
# Simplify
if simplify:
try:
cuda = torch.cuda.is_available()
check_requirements(('onnxruntime-gpu' if cuda else 'onnxruntime', 'onnx-simplifier>=0.4.1'))
import onnxsim
LOGGER.info(f'{prefix} simplifying with onnx-simplifier {onnxsim.__version__}...')
model_onnx, check = onnxsim.simplify(model_onnx)
assert check, 'assert check failed'
onnx.save(model_onnx, f)
except Exception as e:
LOGGER.info(f'{prefix} simplifier failure: {e}')
return f, model_onnx
@try_export
def export_onnx_end2end(model, im, file, simplify, topk_all, iou_thres, conf_thres, device, labels, prefix=colorstr('ONNX END2END:')):
# YOLO ONNX END2END export (model wrapped with End2End so NMS is part of the exported graph)
check_requirements('onnx')
import onnx
LOGGER.info(f'\n{prefix} starting export with onnx {onnx.__version__}...')
f = os.path.splitext(file)[0] + "-end2end.onnx"
batch_size = 'batch'
dynamic_axes = {'images': {0: 'batch', 2: 'height', 3: 'width'}} # variable length axes
output_axes = {
'num_dets': {0: 'batch'},
'det_boxes': {0: 'batch'},
'det_scores': {0: 'batch'},
'det_classes': {0: 'batch'},
}
dynamic_axes.update(output_axes)
model = End2End(model, topk_all, iou_thres, conf_thres, None, device, labels)
output_names = ['num_dets', 'det_boxes', 'det_scores', 'det_classes']
shapes = [batch_size, 1, batch_size, topk_all, 4, batch_size, topk_all, batch_size, topk_all] # output dims, consumed in order below
torch.onnx.export(model,
im,
f,
verbose=False,
export_params=True, # store the trained parameter weights inside the model file
opset_version=12,
do_constant_folding=True, # whether to execute constant folding for optimization
input_names=['images'],
output_names=output_names,
dynamic_axes=dynamic_axes)
# Checks
model_onnx = onnx.load(f) # load onnx model
onnx.checker.check_model(model_onnx) # check onnx model
for i in model_onnx.graph.output:
for j in i.type.tensor_type.shape.dim:
j.dim_param = str(shapes.pop(0))
if simplify:
try:
import onnxsim
LOGGER.info(f'{prefix} simplifying with onnx-simplifier {onnxsim.__version__}...')
model_onnx, check = onnxsim.simplify(model_onnx)
assert check, 'assert check failed'
except Exception as e:
LOGGER.info(f'{prefix} simplifier failure: {e}')
# print(onnx.helper.printable_graph(onnx_model.graph)) # print a human readable model
onnx.save(model_onnx, f)
LOGGER.info(f'{prefix} ONNX export success, saved as {f}')
return f, model_onnx
@try_export
def export_openvino(file, metadata, half, prefix=colorstr('OpenVINO:')):
# YOLO OpenVINO export
check_requirements('openvino-dev') # requires openvino-dev: https://pypi.org/project/openvino-dev/
import openvino.inference_engine as ie
LOGGER.info(f'\n{prefix} starting export with openvino {ie.__version__}...')
f = str(file).replace('.pt', f'_openvino_model{os.sep}')
#cmd = f"mo --input_model {file.with_suffix('.onnx')} --output_dir {f} --data_type {'FP16' if half else 'FP32'}"
#cmd = f"mo --input_model {file.with_suffix('.onnx')} --output_dir {f} {"--compress_to_fp16" if half else ""}"
half_arg = "--compress_to_fp16" if half else ""
cmd = f"mo --input_model {file.with_suffix('.onnx')} --output_dir {f} {half_arg}"
subprocess.run(cmd.split(), check=True, env=os.environ) # export
yaml_save(Path(f) / file.with_suffix('.yaml').name, metadata) # add metadata.yaml
return f, None
@try_export
def export_paddle(model, im, file, metadata, prefix=colorstr('PaddlePaddle:')):
# YOLO Paddle export
check_requirements(('paddlepaddle', 'x2paddle'))
import x2paddle
from x2paddle.convert import pytorch2paddle
LOGGER.info(f'\n{prefix} starting export with X2Paddle {x2paddle.__version__}...')
f = str(file).replace('.pt', f'_paddle_model{os.sep}')
pytorch2paddle(module=model, save_dir=f, jit_type='trace', input_examples=[im]) # export
yaml_save(Path(f) / file.with_suffix('.yaml').name, metadata) # add metadata.yaml
return f, None
@try_export
def export_coreml(model, im, file, int8, half, prefix=colorstr('CoreML:')):
# YOLO CoreML export
check_requirements('coremltools')
import coremltools as ct
LOGGER.info(f'\n{prefix} starting export with coremltools {ct.__version__}...')
f = file.with_suffix('.mlmodel')
ts = torch.jit.trace(model, im, strict=False) # TorchScript model
ct_model = ct.convert(ts, inputs=[ct.ImageType('image', shape=im.shape, scale=1 / 255, bias=[0, 0, 0])])
bits, mode = (8, 'kmeans_lut') if int8 else (16, 'linear') if half else (32, None)
if bits < 32:
if MACOS: # quantization only supported on macOS
with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=DeprecationWarning) # suppress numpy==1.20 float warning
ct_model = ct.models.neural_network.quantization_utils.quantize_weights(ct_model, bits, mode)
else:
LOGGER.info(f'{prefix} quantization only supported on macOS, skipping...')
ct_model.save(f)
return f, ct_model
@try_export
def export_engine(model, im, file, half, dynamic, simplify, workspace=4, verbose=False, prefix=colorstr('TensorRT:')):
# YOLO TensorRT export https://developer.nvidia.com/tensorrt
assert im.device.type != 'cpu', 'export running on CPU but must be on GPU, i.e. `python export.py --device 0`'
try:
import tensorrt as trt
except Exception:
if platform.system() == 'Linux':
check_requirements('nvidia-tensorrt', cmds='-U --index-url https://pypi.ngc.nvidia.com')
import tensorrt as trt
if trt.__version__[0] == '7': # TensorRT 7 handling https://github.com/ultralytics/yolov5/issues/6012
grid = model.model[-1].anchor_grid
model.model[-1].anchor_grid = [a[..., :1, :1, :] for a in grid]
export_onnx(model, im, file, 12, dynamic, simplify) # opset 12
model.model[-1].anchor_grid = grid
else: # TensorRT >= 8
check_version(trt.__version__, '8.0.0', hard=True) # require tensorrt>=8.0.0
export_onnx(model, im, file, 12, dynamic, simplify) # opset 12
onnx = file.with_suffix('.onnx')
LOGGER.info(f'\n{prefix} starting export with TensorRT {trt.__version__}...')
assert onnx.exists(), f'failed to export ONNX file: {onnx}'
f = file.with_suffix('.engine') # TensorRT engine file
logger = trt.Logger(trt.Logger.INFO)
if verbose:
logger.min_severity = trt.Logger.Severity.VERBOSE
builder = trt.Builder(logger)
config = builder.create_builder_config()
config.max_workspace_size = workspace * 1 << 30
# config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace << 30) # fix TRT 8.4 deprecation notice
flag = (1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
network = builder.create_network(flag)
parser = trt.OnnxParser(network, logger)
if not parser.parse_from_file(str(onnx)):
raise RuntimeError(f'failed to load ONNX file: {onnx}')
inputs = [network.get_input(i) for i in range(network.num_inputs)]
outputs = [network.get_output(i) for i in range(network.num_outputs)]
for inp in inputs:
LOGGER.info(f'{prefix} input "{inp.name}" with shape{inp.shape} {inp.dtype}')
for out in outputs:
LOGGER.info(f'{prefix} output "{out.name}" with shape{out.shape} {out.dtype}')
if dynamic:
if im.shape[0] <= 1:
LOGGER.warning(f"{prefix} WARNING ⚠️ --dynamic model requires maximum --batch-size argument")
profile = builder.create_optimization_profile()
for inp in inputs:
profile.set_shape(inp.name, (1, *im.shape[1:]), (max(1, im.shape[0] // 2), *im.shape[1:]), im.shape)
config.add_optimization_profile(profile)
LOGGER.info(f'{prefix} building FP{16 if builder.platform_has_fast_fp16 and half else 32} engine as {f}')
if builder.platform_has_fast_fp16 and half:
config.set_flag(trt.BuilderFlag.FP16)
with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
t.write(engine.serialize())
return f, None
@try_export
def export_saved_model(model,
im,
file,
dynamic,
tf_nms=False,
agnostic_nms=False,
topk_per_class=100,
topk_all=100,
iou_thres=0.45,
conf_thres=0.25,
keras=False,
prefix=colorstr('TensorFlow SavedModel:')):
# YOLO TensorFlow SavedModel export
try:
import tensorflow as tf
except Exception:
check_requirements(f"tensorflow{'' if torch.cuda.is_available() else '-macos' if MACOS else '-cpu'}")
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
from models.tf import TFModel
LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
f = str(file).replace('.pt', '_saved_model')
batch_size, ch, *imgsz = list(im.shape) # BCHW
tf_model = TFModel(cfg=model.yaml, model=model, nc=model.nc, imgsz=imgsz)
im = tf.zeros((batch_size, *imgsz, ch)) # BHWC order for TensorFlow
_ = tf_model.predict(im, tf_nms, agnostic_nms, topk_per_class, topk_all, iou_thres, conf_thres)
inputs = tf.keras.Input(shape=(*imgsz, ch), batch_size=None if dynamic else batch_size)
outputs = tf_model.predict(inputs, tf_nms, agnostic_nms, topk_per_class, topk_all, iou_thres, conf_thres)
keras_model = tf.keras.Model(inputs=inputs, outputs=outputs)
keras_model.trainable = False
keras_model.summary()
if keras:
keras_model.save(f, save_format='tf')
else:
spec = tf.TensorSpec(keras_model.inputs[0].shape, keras_model.inputs[0].dtype)
m = tf.function(lambda x: keras_model(x)) # full model
m = m.get_concrete_function(spec)
frozen_func = convert_variables_to_constants_v2(m)
tfm = tf.Module()
tfm.__call__ = tf.function(lambda x: frozen_func(x)[:4] if tf_nms else frozen_func(x), [spec])
tfm.__call__(im)
tf.saved_model.save(tfm,
f,
options=tf.saved_model.SaveOptions(experimental_custom_gradients=False) if check_version(
tf.__version__, '2.6') else tf.saved_model.SaveOptions())
return f, keras_model
@try_export
def export_pb(keras_model, file, prefix=colorstr('TensorFlow GraphDef:')):
# YOLO TensorFlow GraphDef *.pb export https://github.com/leimao/Frozen_Graph_TensorFlow
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
f = file.with_suffix('.pb')
m = tf.function(lambda x: keras_model(x)) # full model
m = m.get_concrete_function(tf.TensorSpec(keras_model.inputs[0].shape, keras_model.inputs[0].dtype))
frozen_func = convert_variables_to_constants_v2(m)
frozen_func.graph.as_graph_def()
tf.io.write_graph(graph_or_graph_def=frozen_func.graph, logdir=str(f.parent), name=f.name, as_text=False)
return f, None
@try_export
def export_tflite(keras_model, im, file, int8, data, nms, agnostic_nms, prefix=colorstr('TensorFlow Lite:')):
# YOLO TensorFlow Lite export
import tensorflow as tf
LOGGER.info(f'\n{prefix} starting export with tensorflow {tf.__version__}...')
batch_size, ch, *imgsz = list(im.shape) # BCHW
f = str(file).replace('.pt', '-fp16.tflite')
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
converter.target_spec.supported_types = [tf.float16]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
if int8:
from models.tf import representative_dataset_gen
dataset = LoadImages(check_dataset(check_yaml(data))['train'], img_size=imgsz, auto=False)
converter.representative_dataset = lambda: representative_dataset_gen(dataset, ncalib=100)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = []
converter.inference_input_type = tf.uint8 # or tf.int8
converter.inference_output_type = tf.uint8 # or tf.int8
converter.experimental_new_quantizer = True
f = str(file).replace('.pt', '-int8.tflite')
if nms or agnostic_nms:
converter.target_spec.supported_ops.append(tf.lite.OpsSet.SELECT_TF_OPS)
tflite_model = converter.convert()
Path(f).write_bytes(tflite_model) # write and close the file handle deterministically
return f, None
@try_export
def export_edgetpu(file, prefix=colorstr('Edge TPU:')):
# YOLO Edge TPU export https://coral.ai/docs/edgetpu/models-intro/
cmd = 'edgetpu_compiler --version'
help_url = 'https://coral.ai/docs/edgetpu/compiler/'
assert platform.system() == 'Linux', f'export only supported on Linux. See {help_url}'
if subprocess.run(f'{cmd} >/dev/null', shell=True).returncode != 0:
LOGGER.info(f'\n{prefix} export requires Edge TPU compiler. Attempting install from {help_url}')
sudo = subprocess.run('sudo --version >/dev/null', shell=True).returncode == 0 # sudo installed on system
for c in (
'curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -',
'echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list',
'sudo apt-get update', 'sudo apt-get install edgetpu-compiler'):
subprocess.run(c if sudo else c.replace('sudo ', ''), shell=True, check=True)
ver = subprocess.run(cmd, shell=True, capture_output=True, check=True).stdout.decode().split()[-1]
LOGGER.info(f'\n{prefix} starting export with Edge TPU compiler {ver}...')
f = str(file).replace('.pt', '-int8_edgetpu.tflite') # Edge TPU model
f_tfl = str(file).replace('.pt', '-int8.tflite') # TFLite model
cmd = f"edgetpu_compiler -s -d -k 10 --out_dir {file.parent} {f_tfl}"
subprocess.run(cmd.split(), check=True)
return f, None
@try_export
def export_tfjs(file, prefix=colorstr('TensorFlow.js:')):
# YOLO TensorFlow.js export
check_requirements('tensorflowjs')
import tensorflowjs as tfjs
LOGGER.info(f'\n{prefix} starting export with tensorflowjs {tfjs.__version__}...')
f = str(file).replace('.pt', '_web_model') # js dir
f_pb = file.with_suffix('.pb') # *.pb path
f_json = f'{f}/model.json' # *.json path
cmd = f'tensorflowjs_converter --input_format=tf_frozen_model ' \
f'--output_node_names=Identity,Identity_1,Identity_2,Identity_3 {f_pb} {f}'
subprocess.run(cmd.split())
json = Path(f_json).read_text()
with open(f_json, 'w') as j: # sort JSON Identity_* in ascending order
subst = re.sub(
r'{"outputs": {"Identity.?.?": {"name": "Identity.?.?"}, '
r'"Identity.?.?": {"name": "Identity.?.?"}, '
r'"Identity.?.?": {"name": "Identity.?.?"}, '
r'"Identity.?.?": {"name": "Identity.?.?"}}}', r'{"outputs": {"Identity": {"name": "Identity"}, '
r'"Identity_1": {"name": "Identity_1"}, '
r'"Identity_2": {"name": "Identity_2"}, '
r'"Identity_3": {"name": "Identity_3"}}}', json)
j.write(subst)
return f, None
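# Illustrative sketch (not part of the exporter): the re.sub() call in export_tfjs() above does not
# sort anything; it overwrites the four "Identity*" output entries with a fixed canonical order.
# The snippet below reuses the same pattern/replacement pair on a made-up model.json fragment.
# The names PATTERN, CANONICAL, canonicalize_outputs and the sample string are hypothetical.

```python
import re

# Same pattern and replacement as in export_tfjs(); a '{' that does not start a valid
# quantifier is treated as a literal brace by Python's re module.
PATTERN = (r'{"outputs": {"Identity.?.?": {"name": "Identity.?.?"}, '
           r'"Identity.?.?": {"name": "Identity.?.?"}, '
           r'"Identity.?.?": {"name": "Identity.?.?"}, '
           r'"Identity.?.?": {"name": "Identity.?.?"}}}')
CANONICAL = (r'{"outputs": {"Identity": {"name": "Identity"}, '
             r'"Identity_1": {"name": "Identity_1"}, '
             r'"Identity_2": {"name": "Identity_2"}, '
             r'"Identity_3": {"name": "Identity_3"}}}')


def canonicalize_outputs(text):
    """Rewrite the four Identity output entries into ascending order."""
    return re.sub(PATTERN, CANONICAL, text)


# Hypothetical fragment with the outputs in shuffled order.
shuffled = ('{"signature": {"outputs": {"Identity_2": {"name": "Identity_2"}, '
            '"Identity": {"name": "Identity"}, '
            '"Identity_3": {"name": "Identity_3"}, '
            '"Identity_1": {"name": "Identity_1"}}}}')
fixed = canonicalize_outputs(shuffled)
```

# Text that does not contain exactly four adjacent Identity entries is returned unchanged.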
def add_tflite_metadata(file, metadata, num_outputs):
# Add metadata to *.tflite models per https://www.tensorflow.org/lite/models/convert/metadata
with contextlib.suppress(ImportError):
# check_requirements('tflite_support')
from tflite_support import flatbuffers
from tflite_support import metadata as _metadata
from tflite_support import metadata_schema_py_generated as _metadata_fb
tmp_file = Path('/tmp/meta.txt')
with open(tmp_file, 'w') as meta_f:
meta_f.write(str(metadata))
model_meta = _metadata_fb.ModelMetadataT()
label_file = _metadata_fb.AssociatedFileT()
label_file.name = tmp_file.name
model_meta.associatedFiles = [label_file]
subgraph = _metadata_fb.SubGraphMetadataT()
subgraph.inputTensorMetadata = [_metadata_fb.TensorMetadataT()]
subgraph.outputTensorMetadata = [_metadata_fb.TensorMetadataT()] * num_outputs
model_meta.subgraphMetadata = [subgraph]
b = flatbuffers.Builder(0)
b.Finish(model_meta.Pack(b), _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER)
metadata_buf = b.Output()
populator = _metadata.MetadataPopulator.with_model_file(file)
populator.load_metadata_buffer(metadata_buf)
populator.load_associated_files([str(tmp_file)])
populator.populate()
tmp_file.unlink()
@smart_inference_mode()
def run(
data=ROOT / 'data/coco.yaml', # 'dataset.yaml path'
weights=ROOT / 'yolo.pt', # weights path
imgsz=(640, 640), # image (height, width)
batch_size=1, # batch size
device='cpu', # cuda device, i.e. 0 or 0,1,2,3 or cpu
include=('torchscript', 'onnx'), # include formats
half=False, # FP16 half-precision export
inplace=False, # set YOLO Detect() inplace=True
keras=False, # use Keras
optimize=False, # TorchScript: optimize for mobile
int8=False, # CoreML/TF INT8 quantization
dynamic=False, # ONNX/TF/TensorRT: dynamic axes
simplify=False, # ONNX: simplify model
opset=12, # ONNX: opset version
verbose=False, # TensorRT: verbose log
workspace=4, # TensorRT: workspace size (GB)
nms=False, # TF: add NMS to model
agnostic_nms=False, # TF: add agnostic NMS to model
topk_per_class=100, # TF.js NMS: topk per class to keep
topk_all=100, # TF.js NMS: topk for all classes to keep
iou_thres=0.45, # TF.js NMS: IoU threshold
conf_thres=0.25, # TF.js NMS: confidence threshold
):
t = time.time()
include = [x.lower() for x in include] # to lowercase
fmts = tuple(export_formats()['Argument'][1:]) # --include arguments
flags = [x in include for x in fmts]
assert sum(flags) == len(include), f'ERROR: Invalid --include {include}, valid --include arguments are {fmts}'
jit, onnx, onnx_end2end, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle = flags # export booleans
file = Path(url2file(weights) if str(weights).startswith(('http:/', 'https:/')) else weights) # PyTorch weights
# Load PyTorch model
device = select_device(device)
if half:
assert device.type != 'cpu' or coreml, '--half only compatible with GPU export, i.e. use --device 0'
assert not dynamic, '--half not compatible with --dynamic, i.e. use either --half or --dynamic but not both'
model = attempt_load(weights, device=device, inplace=True, fuse=True) # load FP32 model
# Checks
imgsz *= 2 if len(imgsz) == 1 else 1 # expand
if optimize:
assert device.type == 'cpu', '--optimize not compatible with cuda devices, i.e. use --device cpu'
# Input
gs = int(max(model.stride)) # grid size (max stride)
imgsz = [check_img_size(x, gs) for x in imgsz] # verify img_size are gs-multiples
im = torch.zeros(batch_size, 3, *imgsz).to(device) # image size(1,3,320,192) BCHW iDetection
# Update model
model.eval()
for k, m in model.named_modules():
if isinstance(m, (Detect, DDetect, DualDetect, DualDDetect)):
m.inplace = inplace
m.dynamic = dynamic
m.export = True
for _ in range(2):
y = model(im) # dry runs
if half and not coreml:
im, model = im.half(), model.half() # to FP16
shape = tuple((y[0] if isinstance(y, (tuple, list)) else y).shape) # model output shape
metadata = {'stride': int(max(model.stride)), 'names': model.names} # model metadata
LOGGER.info(f"\n{colorstr('PyTorch:')} starting from {file} with output shape {shape} ({file_size(file):.1f} MB)")
# Exports
f = [''] * len(fmts) # exported filenames
warnings.filterwarnings(action='ignore', category=torch.jit.TracerWarning) # suppress TracerWarning
if jit: # TorchScript
f[0], _ = export_torchscript(model, im, file, optimize)
if engine: # TensorRT required before ONNX
f[1], _ = export_engine(model, im, file, half, dynamic, simplify, workspace, verbose)
if onnx or xml: # OpenVINO requires ONNX
f[2], _ = export_onnx(model, im, file, opset, dynamic, simplify)
if onnx_end2end:
if isinstance(model, DetectionModel):
labels = model.names
f[2], _ = export_onnx_end2end(model, im, file, simplify, topk_all, iou_thres, conf_thres, device, len(labels))
else:
raise RuntimeError("The model is not a DetectionModel.")
if xml: # OpenVINO
f[3], _ = export_openvino(file, metadata, half)
if coreml: # CoreML
f[4], _ = export_coreml(model, im, file, int8, half)
if any((saved_model, pb, tflite, edgetpu, tfjs)): # TensorFlow formats
assert not tflite or not tfjs, 'TFLite and TF.js models must be exported separately, please pass only one type.'
assert not isinstance(model, ClassificationModel), 'ClassificationModel export to TF formats not yet supported.'
f[5], s_model = export_saved_model(model.cpu(),
im,
file,
dynamic,
tf_nms=nms or agnostic_nms or tfjs,
agnostic_nms=agnostic_nms or tfjs,
topk_per_class=topk_per_class,
topk_all=topk_all,
iou_thres=iou_thres,
conf_thres=conf_thres,
keras=keras)
if pb or tfjs: # pb prerequisite to tfjs
f[6], _ = export_pb(s_model, file)
if tflite or edgetpu:
f[7], _ = export_tflite(s_model, im, file, int8 or edgetpu, data=data, nms=nms, agnostic_nms=agnostic_nms)
if edgetpu:
f[8], _ = export_edgetpu(file)
add_tflite_metadata(f[8] or f[7], metadata, num_outputs=len(s_model.outputs))
if tfjs:
f[9], _ = export_tfjs(file)
if paddle: # PaddlePaddle
f[10], _ = export_paddle(model, im, file, metadata)
# Finish
f = [str(x) for x in f if x] # filter out '' and None
if any(f):
cls, det, seg = (isinstance(model, x) for x in (ClassificationModel, DetectionModel, SegmentationModel)) # type
dir = Path('segment' if seg else 'classify' if cls else '')
h = '--half' if half else '' # --half FP16 inference arg
s = "# WARNING ⚠️ ClassificationModel not yet supported for PyTorch Hub AutoShape inference" if cls else \
"# WARNING ⚠️ SegmentationModel not yet supported for PyTorch Hub AutoShape inference" if seg else ''
if onnx_end2end:
LOGGER.info(f'\nExport complete ({time.time() - t:.1f}s)'
f"\nResults saved to {colorstr('bold', file.parent.resolve())}"
f"\nVisualize: https://netron.app")
else:
LOGGER.info(f'\nExport complete ({time.time() - t:.1f}s)'
f"\nResults saved to {colorstr('bold', file.parent.resolve())}"
f"\nDetect: python {dir / ('detect.py' if det else 'predict.py')} --weights {f[-1]} {h}"
f"\nValidate: python {dir / 'val.py'} --weights {f[-1]} {h}"
f"\nPyTorch Hub: model = torch.hub.load('ultralytics/yolov5', 'custom', '{f[-1]}') {s}"
f"\nVisualize: https://netron.app")
return f # return list of exported files/dirs
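# Illustrative sketch (not part of the exporter): how run() turns --include into per-format
# booleans. The format keys are copied from the --include help string in parse_opt(); that
# export_formats() yields them in exactly this order is an assumption. FMTS and include_flags
# are hypothetical names, and the sketch raises ValueError where run() uses an assert.

```python
FMTS = ('torchscript', 'onnx', 'onnx_end2end', 'openvino', 'engine', 'coreml',
        'saved_model', 'pb', 'tflite', 'edgetpu', 'tfjs', 'paddle')


def include_flags(include, fmts=FMTS):
    """Return {format: bool} for the requested formats, rejecting unknown names."""
    include = [x.lower() for x in include]  # case-insensitive, as in run()
    flags = [x in include for x in fmts]
    # Every requested name must match exactly one known format; an unknown name
    # (or a duplicate) makes the counts disagree.
    if sum(flags) != len(include):
        raise ValueError(f'Invalid --include {include}, valid arguments are {fmts}')
    return dict(zip(fmts, flags))


selected = include_flags(['ONNX', 'tflite'])
```

# A misspelled format such as 'onnxx' is rejected before any model is loaded.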
def parse_opt():
parser = argparse.ArgumentParser()
parser.add_argument('--data', type=str, default=ROOT / 'data/coco.yaml', help='dataset.yaml path')
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolo.pt', help='model.pt path(s)')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640, 640], help='image (h, w)')
parser.add_argument('--batch-size', type=int, default=1, help='batch size')
parser.add_argument('--device', default='cpu', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--half', action='store_true', help='FP16 half-precision export')
parser.add_argument('--inplace', action='store_true', help='set YOLO Detect() inplace=True')
parser.add_argument('--keras', action='store_true', help='TF: use Keras')
parser.add_argument('--optimize', action='store_true', help='TorchScript: optimize for mobile')
parser.add_argument('--int8', action='store_true', help='CoreML/TF INT8 quantization')
parser.add_argument('--dynamic', action='store_true', help='ONNX/TF/TensorRT: dynamic axes')
parser.add_argument('--simplify', action='store_true', help='ONNX: simplify model')
parser.add_argument('--opset', type=int, default=12, help='ONNX: opset version')
parser.add_argument('--verbose', action='store_true', help='TensorRT: verbose log')
parser.add_argument('--workspace', type=int, default=4, help='TensorRT: workspace size (GB)')
parser.add_argument('--nms', action='store_true', help='TF: add NMS to model')
parser.add_argument('--agnostic-nms', action='store_true', help='TF: add agnostic NMS to model')
parser.add_argument('--topk-per-class', type=int, default=100, help='TF.js NMS: topk per class to keep')
parser.add_argument('--topk-all', type=int, default=100, help='ONNX END2END/TF.js NMS: topk for all classes to keep')
parser.add_argument('--iou-thres', type=float, default=0.45, help='ONNX END2END/TF.js NMS: IoU threshold')
parser.add_argument('--conf-thres', type=float, default=0.25, help='ONNX END2END/TF.js NMS: confidence threshold')
parser.add_argument(
'--include',
nargs='+',
default=['torchscript'],
help='torchscript, onnx, onnx_end2end, openvino, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle')
opt = parser.parse_args()
if 'onnx_end2end' in opt.include:
opt.simplify = True
opt.dynamic = True
opt.inplace = True
opt.half = False
print_args(vars(opt))
return opt
def main(opt):
for opt.weights in (opt.weights if isinstance(opt.weights, list) else [opt.weights]):
run(**vars(opt))
if __name__ == "__main__":
opt = parse_opt()
main(opt)
================================================
FILE: hubconf.py
================================================
import torch
def _create(name, pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):
"""Creates or loads a YOLO model
Arguments:
name (str): model name 'yolov3' or path 'path/to/best.pt'
pretrained (bool): load pretrained weights into the model
channels (int): number of input channels
classes (int): number of model classes
autoshape (bool): apply YOLO .autoshape() wrapper to model
verbose (bool): print all information to screen
device (str, torch.device, None): device to use for model parameters
Returns:
YOLO model
"""
from pathlib import Path
from models.common import AutoShape, DetectMultiBackend
from models.experimental import attempt_load
from models.yolo import ClassificationModel, DetectionModel, SegmentationModel
from utils.downloads import attempt_download
from utils.general import LOGGER, check_requirements, intersect_dicts, logging
from utils.torch_utils import select_device
if not verbose:
LOGGER.setLevel(logging.WARNING)
check_requirements(exclude=('opencv-python', 'tensorboard', 'thop'))
name = Path(name)
path = name.with_suffix('.pt') if name.suffix == '' and not name.is_dir() else name # checkpoint path
try:
device = select_device(device)
if pretrained and channels == 3 and classes == 80:
try:
model = DetectMultiBackend(path, device=device, fuse=autoshape) # detection model
if autoshape:
if model.pt and isinstance(model.model, ClassificationModel):
LOGGER.warning('WARNING ⚠️ YOLO ClassificationModel is not yet AutoShape compatible. '
'You must pass torch tensors in BCHW to this model, i.e. shape(1,3,224,224).')
elif model.pt and isinstance(model.model, SegmentationModel):
LOGGER.warning('WARNING ⚠️ YOLO SegmentationModel is not yet AutoShape compatible. '
'You will not be able to run inference with this model.')
else:
model = AutoShape(model) # for file/URI/PIL/cv2/np inputs and NMS
except Exception:
model = attempt_load(path, device=device, fuse=False) # arbitrary model
else:
cfg = list((Path(__file__).parent / 'models').rglob(f'{path.stem}.yaml'))[0] # model.yaml path
model = DetectionModel(cfg, channels, classes) # create model
if pretrained:
ckpt = torch.load(attempt_download(path), map_location=device) # load
csd = ckpt['model'].float().state_dict() # checkpoint state_dict as FP32
csd = intersect_dicts(csd, model.state_dict(), exclude=['anchors']) # intersect
model.load_state_dict(csd, strict=False) # load
if len(ckpt['model'].names) == classes:
model.names = ckpt['model'].names # set class names attribute
if not verbose:
LOGGER.setLevel(logging.INFO) # reset to default
return model.to(device)
except Exception as e:
help_url = 'https://github.com/ultralytics/yolov5/issues/36'
s = f'{e}. Cache may be out of date, try `force_reload=True` or see {help_url} for help.'
raise Exception(s) from e
def custom(path='path/to/model.pt', autoshape=True, _verbose=True, device=None):
# YOLO custom or local model
return _create(path, autoshape=autoshape, verbose=_verbose, device=device)
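# Illustrative sketch (not part of hubconf.py): the checkpoint-path rule used in _create().
# A bare model name gets a '.pt' suffix, while names that already have a suffix, or that
# point at an existing directory, are left untouched. resolve_checkpoint is a hypothetical
# helper name, and 'yolov9-c' is only an example model name.

```python
import tempfile
from pathlib import Path


def resolve_checkpoint(name):
    """Mirror of the `path = ...` expression in _create()."""
    name = Path(name)
    return name.with_suffix('.pt') if name.suffix == '' and not name.is_dir() else name


bare = resolve_checkpoint('yolov9-c')          # no suffix, not a directory
explicit = resolve_checkpoint('runs/best.pt')  # suffix already present
with tempfile.TemporaryDirectory() as d:
    directory = resolve_checkpoint(d)          # existing directory is kept as-is
    directory_unchanged = directory == Path(d)
```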
if __name__ == '__main__':
import argparse
from pathlib import Path
import numpy as np
from PIL import Image
from utils.general import cv2, print_args
# Argparser
parser = argparse.ArgumentParser()
parser.add_argument('--model', type=str, default='yolo', help='model name')
opt = parser.parse_args()
print_args(vars(opt))
# Model
model = _create(name=opt.model, pretrained=True, channels=3, classes=80, autoshape=True, verbose=True)
# model = custom(path='path/to/model.pt') # custom
# Images
imgs = [
'data/images/zidane.jpg', # filename
Path('data/images/zidane.jpg'), # Path
'https://ultralytics.com/images/zidane.jpg', # URI
cv2.imread('data/images/bus.jpg')[:, :, ::-1], # OpenCV
Image.open('data/images/bus.jpg'), # PIL
np.zeros((320, 640, 3))] # numpy
# Inference
results = model(imgs, size=320) # batched inference
# Results
results.print()
results.save()
================================================
FILE: models/__init__.py
================================================
# init
================================================
FILE: models/common.py
================================================
import ast
import contextlib
import json
import math
import platform
import warnings
import zipfile
from collections import OrderedDict, namedtuple
from copy import copy
from pathlib import Path
from urllib.parse import urlparse
from typing import Optional
import cv2
import numpy as np
import pandas as pd
import requests
import torch
import torch.nn as nn
from IPython.display import display
from PIL import Image
from torch.cuda import amp
from utils import TryExcept
from utils.dataloaders import exif_transpose, letterbox
from utils.general import (LOGGER, ROOT, Profile, check_requirements, check_suffix, check_version, colorstr,
increment_path, is_notebook, make_divisible, non_max_suppression, scale_boxes,
xywh2xyxy, xyxy2xywh, yaml_load)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import copy_attr, smart_inference_mode
def autopad(k, p=None, d=1): # kernel, padding, dilation
# Pad to 'same' shape outputs
if d > 1:
k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k] # actual kernel-size
if p is None:
p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad
return p
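# Illustrative sketch (not part of models/common.py): what autopad() returns and why that
# padding keeps the spatial size at stride 1. autopad is copied from above; conv_out is a
# hypothetical helper implementing the standard convolution output-size formula.

```python
def autopad(k, p=None, d=1):  # kernel, padding, dilation
    # Pad to 'same' shape outputs
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]
    return p


def conv_out(n, k, s=1, p=0, d=1):
    """Output length of a convolution along one spatial axis."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1


same_plain = conv_out(80, k=3, s=1, p=autopad(3))              # 3x3, no dilation
same_dilated = conv_out(80, k=3, s=1, p=autopad(3, d=2), d=2)  # 3x3, dilation 2
```

# An explicit p is passed through unchanged, so autopad(3, p=0) yields 0.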
class Conv(nn.Module):
# Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)
default_act = nn.SiLU() # default activation
def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
super().__init__()
self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
self.bn = nn.BatchNorm2d(c2)
self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()
def forward(self, x):
return self.act(self.bn(self.conv(x)))
def forward_fuse(self, x):
return self.act(self.conv(x))
class AConv(nn.Module):
def __init__(self, c1, c2): # ch_in, ch_out
super().__init__()
self.cv1 = Conv(c1, c2, 3, 2, 1)
def forward(self, x):
x = torch.nn.functional.avg_pool2d(x, 2, 1, 0, False, True)
return self.cv1(x)
class ADown(nn.Module):
def __init__(self, c1, c2): # ch_in, ch_out
super().__init__()
self.c = c2 // 2
self.cv1 = Conv(c1 // 2, self.c, 3, 2, 1)
self.cv2 = Conv(c1 // 2, self.c, 1, 1, 0)
def forward(self, x):
x = torch.nn.functional.avg_pool2d(x, 2, 1, 0, False, True)
x1,x2 = x.chunk(2, 1)
x1 = self.cv1(x1)
x2 = torch.nn.functional.max_pool2d(x2, 3, 2, 1)
x2 = self.cv2(x2)
return torch.cat((x1, x2), 1)
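# Illustrative sketch (not part of models/common.py): spatial-size arithmetic for ADown.
# avg_pool2d(x, 2, 1, 0) shrinks each side by one, then both branches (3x3 stride-2 conv,
# and 3x3 stride-2 max pool followed by a 1x1 conv) apply the same kernel/stride/padding,
# so they agree in size and can be concatenated. window_out and adown_spatial are
# hypothetical helper names.

```python
def window_out(n, k, s, p):
    """Output length of a pooling or (undilated) conv window along one axis."""
    return (n + 2 * p - k) // s + 1


def adown_spatial(n):
    """Spatial size produced by ADown for an input side of length n."""
    n = window_out(n, k=2, s=1, p=0)            # avg_pool2d(x, 2, 1, 0): n - 1
    conv_branch = window_out(n, k=3, s=2, p=1)  # cv1: Conv(..., 3, 2, 1)
    pool_branch = window_out(n, k=3, s=2, p=1)  # max_pool2d(x2, 3, 2, 1), then 1x1 conv
    assert conv_branch == pool_branch           # required for torch.cat along channels
    return conv_branch


sizes = {n: adown_spatial(n) for n in (640, 80, 7)}
```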
class RepConvN(nn.Module):
"""RepConv is a basic rep-style block, including training and deploy status
This code is based on https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py
"""
default_act = nn.SiLU() # default activation
def __init__(self, c1, c2, k=3, s=1, p=1, g=1, d=1, act=True, bn=False, deploy=False):
super().__init__()
assert k == 3 and p == 1
self.g = g
self.c1 = c1
self.c2 = c2
self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()
self.bn = None
self.conv1 = Conv(c1, c2, k, s, p=p, g=g, act=False)
self.conv2 = Conv(c1, c2, 1, s, p=(p - k // 2), g=g, act=False)
def forward_fuse(self, x):
"""Forward process"""
return self.act(self.conv(x))
def forward(self, x):
"""Forward process"""
id_out = 0 if self.bn is None else self.bn(x)
return self.act(self.conv1(x) + self.conv2(x) + id_out)
def get_equivalent_kernel_bias(self):
kernel3x3, bias3x3 = self._fuse_bn_tensor(self.conv1)
kernel1x1, bias1x1 = self._fuse_bn_tensor(self.conv2)
kernelid, biasid = self._fuse_bn_tensor(self.bn)
return kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid, bias3x3 + bias1x1 + biasid
def _avg_to_3x3_tensor(self, avgp):
channels = self.c1
groups = self.g
kernel_size = avgp.kernel_size
input_dim = channels // groups
k = torch.zeros((channels, input_dim, kernel_size, kernel_size))
k[np.arange(channels), np.tile(np.arange(input_dim), groups), :, :] = 1.0 / kernel_size ** 2
return k
def _pad_1x1_to_3x3_tensor(self, kernel1x1):
if kernel1x1 is None:
return 0
else:
return torch.nn.functional.pad(kernel1x1, [1, 1, 1, 1])
def _fuse_bn_tensor(self, branch):
if branch is None:
return 0, 0
if isinstance(branch, Conv):
kernel = branch.conv.weight
running_mean = branch.bn.running_mean
running_var = branch.bn.running_var
gamma = branch.bn.weight
beta = branch.bn.bias
eps = branch.bn.eps
elif isinstance(branch, nn.BatchNorm2d):
if not hasattr(self, 'id_tensor'):
input_dim = self.c1 // self.g
kernel_value = np.zeros((self.c1, input_dim, 3, 3), dtype=np.float32)
for i in range(self.c1):
kernel_value[i, i % input_dim, 1, 1] = 1
self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)
kernel = self.id_tensor
running_mean = branch.running_mean
running_var = branch.running_var
gamma = branch.weight
beta = branch.bias
eps = branch.eps
std = (running_var + eps).sqrt()
t = (gamma / std).reshape(-1, 1, 1, 1)
return kernel * t, beta - running_mean * gamma / std
def fuse_convs(self):
if hasattr(self, 'conv'):
return
kernel, bias = self.get_equivalent_kernel_bias()
self.conv = nn.Conv2d(in_channels=self.conv1.conv.in_channels,
out_channels=self.conv1.conv.out_channels,
kernel_size=self.conv1.conv.kernel_size,
stride=self.conv1.conv.stride,
padding=self.conv1.conv.padding,
dilation=self.conv1.conv.dilation,
groups=self.conv1.conv.groups,
bias=True).requires_grad_(False)
self.conv.weight.data = kernel
self.conv.bias.data = bias
for para in self.parameters():
para.detach_()
self.__delattr__('conv1')
self.__delattr__('conv2')
if hasattr(self, 'nm'):
self.__delattr__('nm')
if hasattr(self, 'bn'):
self.__delattr__('bn')
if hasattr(self, 'id_tensor'):
self.__delattr__('id_tensor')
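# Illustrative sketch (not part of models/common.py): the algebra behind _fuse_bn_tensor()
# and get_equivalent_kernel_bias(), reduced to scalars. Each branch is "conv then BatchNorm";
# folding gives kernel * gamma / std and beta - mean * gamma / std, and the two folded
# branches add up to a single kernel and bias. All numbers and names below are made up for
# illustration; a real layer does this per output channel on tensors.

```python
import math

EPS = 1e-5  # default eps of torch.nn.BatchNorm2d


def conv_bn(x, w, gamma, beta, mean, var):
    """Scalar 'convolution' followed by inference-mode batch norm."""
    return (w * x - mean) / math.sqrt(var + EPS) * gamma + beta


def fold_bn(w, gamma, beta, mean, var):
    """Fold batch-norm statistics into an equivalent (kernel, bias) pair."""
    std = math.sqrt(var + EPS)
    return w * gamma / std, beta - mean * gamma / std


BRANCH_3X3 = dict(w=0.7, gamma=1.2, beta=0.1, mean=0.05, var=0.9)
BRANCH_1X1 = dict(w=-0.3, gamma=0.8, beta=-0.2, mean=0.4, var=1.5)


def two_branch(x):
    """Training-time form: sum of both conv+BN branches."""
    return conv_bn(x, **BRANCH_3X3) + conv_bn(x, **BRANCH_1X1)


def fused(x):
    """Deploy-time form: one kernel and one bias."""
    k3, b3 = fold_bn(**BRANCH_3X3)
    k1, b1 = fold_bn(**BRANCH_1X1)
    return (k3 + k1) * x + (b3 + b1)
```

# Both forms agree up to floating-point rounding, which is why fuse_convs() can drop
# conv1/conv2 after building the single fused convolution.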
class SP(nn.Module):
def __init__(self, k=3, s=1):
super(SP, self).__init__()
self.m = nn.MaxPool2d(kernel_size=k, stride=s, padding=k // 2)
def forward(self, x):
return self.m(x)
class MP(nn.Module):
# Max pooling
def __init__(self, k=2):
super(MP, self).__init__()
self.m = nn.MaxPool2d(kernel_size=k, stride=k)
def forward(self, x):
return self.m(x)
class ConvTranspose(nn.Module):
# Convolution transpose 2d layer
default_act = nn.SiLU() # default activation
def __init__(self, c1, c2, k=2, s=2, p=0, bn=True, act=True):
super().__init__()
self.conv_transpose = nn.ConvTranspose2d(c1, c2, k, s, p, bias=not bn)
self.bn = nn.BatchNorm2d(c2) if bn else nn.Identity()
self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()
def forward(self, x):
return self.act(self.bn(self.conv_transpose(x)))
class DWConv(Conv):
# Depth-wise convolution
def __init__(self, c1, c2, k=1, s=1, d=1, act=True): # ch_in, ch_out, kernel, stride, dilation, activation
super().__init__(c1, c2, k, s, g=math.gcd(c1, c2), d=d, act=act)
class DWConvTranspose2d(nn.ConvTranspose2d):
# Depth-wise transpose convolution
def __init__(self, c1, c2, k=1, s=1, p1=0, p2=0): # ch_in, ch_out, kernel, stride, padding, padding_out
super().__init__(c1, c2, k, s, p1, p2, groups=math.gcd(c1, c2))
class DFL(nn.Module):
# Distribution Focal Loss (DFL) module: softmax over c1 bins, then the expected bin index via a fixed 1x1 conv
def __init__(self, c1=17):
super().__init__()
self.conv = nn.Conv2d(c1, 1, 1, bias=False).requires_grad_(False)
self.conv.weight.data[:] = nn.Parameter(torch.arange(c1, dtype=torch.float).view(1, c1, 1, 1)) # / 120.0
self.c1 = c1
# self.bn = nn.BatchNorm2d(4)
def forward(self, x):
b, c, a = x.shape # batch, channels, anchors
return self.conv(x.view(b, 4, self.c1, a).transpose(2, 1).softmax(1)).view(b, 4, a)
# return self.conv(x.view(b, self.c1, 4, a).softmax(1)).view(b, 4, a)
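# Illustrative sketch (not part of models/common.py): what DFL.forward() computes for one
# box side. The logits over c1 bins go through a softmax, and the frozen conv with weights
# 0..c1-1 takes the expected bin index. dfl_expectation is a hypothetical scalar helper.

```python
import math


def dfl_expectation(logits):
    """Softmax over the bins, then the probability-weighted mean bin index."""
    m = max(logits)                             # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return sum(i * e / total for i, e in enumerate(exps))


uniform = dfl_expectation([0.0] * 17)           # flat distribution over bins 0..16
peaked_logits = [0.0] * 17
peaked_logits[3] = 50.0                         # nearly all mass on bin 3
peaked = dfl_expectation(peaked_logits)
```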
class BottleneckBase(nn.Module):
# Standard bottleneck with (1, 3) kernels by default
def __init__(self, c1, c2, shortcut=True, g=1, k=(1, 3), e=0.5): # ch_in, ch_out, shortcut, groups, kernels, expand
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, k[0], 1)
self.cv2 = Conv(c_, c2, k[1], 1, g=g)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
class RBottleneckBase(nn.Module):
# Bottleneck with reversed (3, 1) kernels by default
def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 1), e=0.5): # ch_in, ch_out, shortcut, groups, kernels, expand
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, k[0], 1)
self.cv2 = Conv(c_, c2, k[1], 1, g=g)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
class RepNRBottleneckBase(nn.Module):
# Bottleneck with (3, 1) kernels whose first conv is a RepConvN
def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 1), e=0.5): # ch_in, ch_out, shortcut, groups, kernels, expand
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = RepConvN(c1, c_, k[0], 1)
self.cv2 = Conv(c_, c2, k[1], 1, g=g)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
class Bottleneck(nn.Module):
# Standard bottleneck
def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5): # ch_in, ch_out, shortcut, groups, kernels, expand
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, k[0], 1)
self.cv2 = Conv(c_, c2, k[1], 1, g=g)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
class RepNBottleneck(nn.Module):
# Bottleneck whose first conv is a RepConvN
def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5): # ch_in, ch_out, shortcut, groups, kernels, expand
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = RepConvN(c1, c_, k[0], 1)
self.cv2 = Conv(c_, c2, k[1], 1, g=g)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
class Res(nn.Module):
# ResNet bottleneck
def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion
super(Res, self).__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c_, c_, 3, 1, g=g)
self.cv3 = Conv(c_, c2, 1, 1)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv3(self.cv2(self.cv1(x))) if self.add else self.cv3(self.cv2(self.cv1(x)))
class RepNRes(nn.Module):
# ResNet bottleneck whose 3x3 conv is a RepConvN
def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_out, shortcut, groups, expansion
super(RepNRes, self).__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = RepConvN(c_, c_, 3, 1, g=g)
self.cv3 = Conv(c_, c2, 1, 1)
self.add = shortcut and c1 == c2
def forward(self, x):
return x + self.cv3(self.cv2(self.cv1(x))) if self.add else self.cv3(self.cv2(self.cv1(x)))
class BottleneckCSP(nn.Module):
# CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
self.cv4 = Conv(2 * c_, c2, 1, 1)
self.bn = nn.BatchNorm2d(2 * c_) # applied to cat(cv2, cv3)
self.act = nn.SiLU()
self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
def forward(self, x):
y1 = self.cv3(self.m(self.cv1(x)))
y2 = self.cv2(x)
return self.cv4(self.act(self.bn(torch.cat((y1, y2), 1))))
class CSP(nn.Module):
# CSP Bottleneck with 3 convolutions
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c1, c_, 1, 1)
self.cv3 = Conv(2 * c_, c2, 1) # optional act=FReLU(c2)
self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
def forward(self, x):
return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))
class RepNCSP(nn.Module):
# CSP Bottleneck with 3 convolutions, built from RepNBottleneck blocks
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c1, c_, 1, 1)
self.cv3 = Conv(2 * c_, c2, 1) # optional act=FReLU(c2)
self.m = nn.Sequential(*(RepNBottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
def forward(self, x):
return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))
class CSPBase(nn.Module):
# CSP Bottleneck with 3 convolutions, built from BottleneckBase blocks
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ch_out, number, shortcut, groups, expansion
super().__init__()
c_ = int(c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c1, c_, 1, 1)
self.cv3 = Conv(2 * c_, c2, 1) # optional act=FReLU(c2)
self.m = nn.Sequential(*(BottleneckBase(c_, c_, shortcut, g, e=1.0) for _ in range(n)))
def forward(self, x):
return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))
class SPP(nn.Module):
# Spatial Pyramid Pooling (SPP) layer https://arxiv.org/abs/1406.4729
def __init__(self, c1, c2, k=(5, 9, 13)):
super().__init__()
c_ = c1 // 2 # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
def forward(self, x):
x = self.cv1(x)
with warnings.catch_warnings():
warnings.simplefilter('ignore') # suppress torch 1.9.0 max_pool2d() warning
return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))
class ASPP(torch.nn.Module):
def __init__(self, in_channels, out_channels):
super().__init__()
kernel_sizes = [1, 3, 3, 1]
dilations = [1, 3, 6, 1]
paddings = [0, 3, 6, 0]
self.aspp = torch.nn.ModuleList()
for aspp_idx in range(len(kernel_sizes)):
conv = torch.nn.Conv2d(
in_channels,
out_channels,
kernel_size=kernel_sizes[aspp_idx],
stride=1,
dilation=dilations[aspp_idx],
padding=paddings[aspp_idx],
bias=True)
self.aspp.append(conv)
self.gap = torch.nn.AdaptiveAvgPool2d(1)
self.aspp_num = len(kernel_sizes)
for m in self.modules():
if isinstance(m, torch.nn.Conv2d):
n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
m.weight.data.normal_(0, math.sqrt(2. / n))
m.bias.data.fill_(0)
def forward(self, x):
avg_x = self.gap(x)
out = []
for aspp_idx in range(self.aspp_num):
inp = avg_x if (aspp_idx == self.aspp_num - 1) else x
out.append(F.relu_(self.aspp[aspp_idx](inp)))
out[-1] = out[-1].expand_as(out[-2])
out = torch.cat(out, dim=1)
return out
class SPPCSPC(nn.Module):
# CSP SPP https://github.com/WongKinYiu/CrossStagePartialNetworks
def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
super(SPPCSPC, self).__init__()
c_ = int(2 * c2 * e) # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c1, c_, 1, 1)
self.cv3 = Conv(c_, c_, 3, 1)
self.cv4 = Conv(c_, c_, 1, 1)
self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
self.cv5 = Conv(4 * c_, c_, 1, 1)
self.cv6 = Conv(c_, c_, 3, 1)
self.cv7 = Conv(2 * c_, c2, 1, 1)
def forward(self, x):
x1 = self.cv4(self.cv3(self.cv1(x)))
y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
y2 = self.cv2(x)
return self.cv7(torch.cat((y1, y2), dim=1))
class SPPF(nn.Module):
# Spatial Pyramid Pooling - Fast (SPPF) layer by Glenn Jocher
def __init__(self, c1, c2, k=5): # equivalent to SPP(k=(5, 9, 13))
super().__init__()
c_ = c1 // 2 # hidden channels
self.cv1 = Conv(c1, c_, 1, 1)
self.cv2 = Conv(c_ * 4, c2, 1, 1)
self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
# self.m = SoftPool2d(kernel_size=k, stride=1, padding=k // 2)
def forward(self, x):
x = self.cv1(x)
with warnings.catch_warnings():
warnings.simplefilter('ignore') # suppress torch 1.9.0 max_pool2d() warning
y1 = self.m(x)
y2 = self.m(y1)
return self.cv2(torch.cat((x, y1, y2, self.m(y2)), 1))
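# Illustrative sketch, not part of the repository: the comment on SPPF.__init__ states
# that the layer is equivalent to SPP(k=(5, 9, 13)). The pure-Python 1-D stand-in below
# (max_pool_1d and signal are hypothetical names, and no torch is used) checks the idea
# behind that claim: stacking stride-1 max pools of width 5 reproduces widths 9 and 13.

```python
def max_pool_1d(values, k):
    """Stride-1 max pool with k // 2 padding; padded cells never win the max."""
    r = k // 2
    n = len(values)
    # Clipping the window to the valid range mimics padding that cannot win the max.
    return [max(values[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]

y1 = max_pool_1d(signal, 5)  # plays the role of y1 = self.m(x)
y2 = max_pool_1d(y1, 5)      # plays the role of y2 = self.m(y1)
y3 = max_pool_1d(y2, 5)      # plays the role of self.m(y2)

same_as_k9 = y2 == max_pool_1d(signal, 9)
same_as_k13 = y3 == max_pool_1d(signal, 13)
print(same_as_k9, same_as_k13)
```

# Reusing one small pool three times is presumably why SPPF needs only a single
# nn.MaxPool2d module where SPP keeps one per kernel size.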
import torch.nn.functional as F
from torch.nn.modules.utils import _pair
class ReOrg(nn.Module):
# yolo
def __init__(self):
super(ReOrg, self).__init__()
def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2)
return torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
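# Illustrative sketch, not part of the repository: ReOrg.forward slices the last two
# dimensions with stride 2 and concatenates the four results along the channel axis.
# The stand-in below applies the same four slices to one channel stored as nested
# lists (reorg_2d and grid are hypothetical names), keeping the concat order of the code.

```python
def reorg_2d(grid):
    """Split one H x W channel into four (H/2) x (W/2) channels, in ReOrg's concat order."""
    even_rows, odd_rows = grid[::2], grid[1::2]
    return [
        [row[::2] for row in even_rows],   # x[..., ::2, ::2]
        [row[::2] for row in odd_rows],    # x[..., 1::2, ::2]
        [row[1::2] for row in even_rows],  # x[..., ::2, 1::2]
        [row[1::2] for row in odd_rows],   # x[..., 1::2, 1::2]
    ]


grid = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
channels = reorg_2d(grid)
for channel in channels:
    print(channel)
```

# Every input value lands in exactly one output channel, so the operation rearranges
# data without discarding any of it.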
class Contract(nn.Module):
# Contract width-height into channels, i.e. x(1,64,80,80) to x(1,256,40,40)
def __init__(self, gain=2):
super().__init__()
self.gain = gain
def forward(self, x):
b, c, h, w = x.size() # assert (h % s == 0) and (w % s == 0), 'Indivisible gain'
s = self.gain
x = x.view(b, c, h // s, s, w // s, s) # x(1,64,40,2,40,2)
x = x.permute(0, 3, 5, 1, 2, 4).contiguous() # x(1,2,2,64,40,40)
return x.view(b, c * s * s, h // s, w // s) # x(1,256,40,40)
class Expand(nn.Module):
# Expand channels into width-height, i.e. x(1,64,80,80) to x(1,16,160,160)
def __init__(self, gain=2):
super().__init__()
self.gain = gain
def forward(self, x):
b, c, h, w = x.size() # assert c % s ** 2 == 0, 'Indivisible gain'
s = self.gain
x = x.view(b, s, s, c // s ** 2, h, w) # x(1,2,2,16,80,80)
x = x.permute(0, 3, 4, 1, 5, 2).contiguous() # x(1,16,80,2,80,2)
return x.view(b, c // s ** 2, h * s, w * s) # x(1,16,160,160)
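# Illustrative sketch, not part of the repository: the shape comments in Contract and
# Expand can be checked with plain tuple arithmetic. contract_shape and expand_shape
# are hypothetical helpers that model only the output shapes and the divisibility
# requirements, not the permutation of the data itself.

```python
def contract_shape(shape, gain=2):
    """Shape after Contract: width and height fold into channels."""
    b, c, h, w = shape
    if h % gain or w % gain:
        raise ValueError('height and width must be divisible by gain')
    return (b, c * gain * gain, h // gain, w // gain)


def expand_shape(shape, gain=2):
    """Shape after Expand: channels unfold into width and height."""
    b, c, h, w = shape
    if c % gain ** 2:
        raise ValueError('channels must be divisible by gain ** 2')
    return (b, c // gain ** 2, h * gain, w * gain)


start = (1, 64, 80, 80)
contracted = contract_shape(start)
expanded = expand_shape(start)
round_trip = expand_shape(contracted)
print(contracted, expanded, round_trip)
```

# The two shape maps are inverses of each other; whether the tensor contents also
# round-trip depends on the permute orders in the modules above, which this sketch
# does not model.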
class Concat(nn.Module):
# Concatenate a list of tensors along dimension
def __init__(self, dimension=1):
super().__init__()
self.d = dimension
def forward(self, x):
return torch.cat(x, self.d)
class Shortcut(nn.Module):
def __init__(self, dimension=0):
super(Shortcut, self).__init__()
self.d = dimension
def forward(self, x):
return x[0]+x[1]
class Silence(nn.Module):
def __init__(self):
super(Silence, self).__init__()
def forward(self, x):
return x
##### GELAN #####
class SPPELAN(nn.Module):
# spp-elan
def __init__(self, c1, c2, c3): # ch_in, ch_out, hidden channels
super().__init__()
self.c = c3
self.cv1 = Conv(c1, c3, 1, 1)
self.cv2 = SP(5)
self.cv3 = SP(5)
self.cv4 = SP(5)
self.cv5 = Conv(4*c3, c2, 1, 1)
def forward(self, x):
y = [self.cv1(x)]
y.extend(m(y[-1]) for m in [self.cv2, self.cv3, self.cv4])
return self.cv5(torch.cat(y, 1))
class ELAN1(nn.Module):
def __init__(self, c1, c2, c3, c4): # ch_in, ch_out, split channels, branch channels
super().__init__()
self.c = c3//2
self.cv1 = Conv(c1, c3, 1, 1)
self.cv2 = Conv(c3//2, c4, 3, 1)
self.cv3 = Conv(c4, c4, 3, 1)
self.cv4 = Conv(c3+(2*c4), c2, 1, 1)
def forward(self, x):
y = list(self.cv1(x).chunk(2, 1))
y.extend(m(y[-1]) for m in [self.cv2, self.cv3])
return self.cv4(torch.cat(y, 1))
def forward_split(self, x):
y = list(self.cv1(x).split((self.c, self.c), 1))
y.extend(m(y[-1]) for m in [self.cv2, self.cv3])
return self.cv4(torch.cat(y, 1))
class RepNCSPELAN4(nn.Module):
# csp-elan
def __init__(self, c1, c2, c3, c4, c5=1): # ch_in, ch_out, split channels, branch channels, RepNCSP repeats
super().__init__()
self.c = c3//2
self.cv1 = Conv(c1, c3, 1, 1)
self.cv2 = nn.Sequential(RepNCSP(c3//2, c4, c5), Conv(c4, c4, 3, 1))
self.cv3 = nn.Sequential(RepNCSP(c4, c4, c5), Conv(c4, c4, 3, 1))
self.cv4 = Conv(c3+(2*c4), c2, 1, 1)
def forward(self, x):
y = list(self.cv1(x).chunk(2, 1))
y.extend(m(y[-1]) for m in [self.cv2, self.cv3])
return self.cv4(torch.cat(y, 1))
def forward_split(self, x):
y = list(self.cv1(x).split((self.c, self.c), 1))
y.extend(m(y[-1]) for m in [self.cv2, self.cv3])
return self.cv4(torch.cat(y, 1))
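# Illustrative sketch, not part of the repository: ELAN1 and RepNCSPELAN4 both build
# cv4 with c3+(2*c4) input channels. The hypothetical helper below counts the channels
# that reach the concat, assuming c3 is even so that chunk(2, 1) yields two equal
# halves. The example widths (128 and 64) are made up for illustration.

```python
def elan_concat_channels(c3, c4):
    """Channels entering cv4: both halves of cv1's output plus one map per later branch."""
    halves = [c3 // 2, c3 // 2]  # cv1(x).chunk(2, 1)
    branches = [c4, c4]          # outputs of cv2 and cv3
    return sum(halves + branches)


c3, c4 = 128, 64
total = elan_concat_channels(c3, c4)
print(total, total == c3 + 2 * c4)
```

# Only the second half feeds cv2, but both halves are kept in the concat, which is
# why the full c3 appears in the channel count.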
#################
##### YOLOR #####
class ImplicitA(nn.Module):
def __init__(self, channel):
super(ImplicitA, self).__init__()
self.channel = channel
self.implicit = nn.Parameter(torch.zeros(1, channel, 1, 1))
nn.init.normal_(self.implicit, std=.02)
def forward(self, x):
return self.implicit + x
class ImplicitM(nn.Module):
def __init__(self, channel):
super(ImplicitM, self).__init__()
self.channel = channel
self.implicit = nn.Parameter(torch.ones(1, channel, 1, 1))
nn.init.normal_(self.implicit, mean=1., std=.02)
def forward(self, x):
return self.implicit * x
#################
##### CBNet #####
class CBLinear(nn.Module):
def __init__(self, c1, c2s, k=1, s=1, p=None, g=1): # ch_in, ch_outs, kernel, stride, padding, groups
super(CBLinear, self).__init__()
self.c2s = c2s
self.conv = nn.Conv2d(c1, sum(c2s), k, s, autopad(k, p), groups=g, bias=True)
def forward(self, x):
outs = self.conv(x).split(self.c2s, dim=1)
return outs
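# Illustrative sketch, not part of the repository: CBLinear runs one convolution with
# sum(c2s) output channels and then splits the result into pieces of widths c2s. The
# hypothetical helper below lists the channel range each piece would cover; the widths
# in the example are made up.

```python
def split_ranges(c2s):
    """Half-open (start, stop) channel ranges for consecutive pieces of widths c2s."""
    ranges, start = [], 0
    for width in c2s:
        ranges.append((start, start + width))
        start += width
    return ranges


widths = [64, 128, 256]
ranges = split_ranges(widths)
print(sum(widths), ranges)
```

# A single wide convolution followed by a split produces all outputs in one pass,
# and CBFuse then selects pieces by index via self.idx.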
class CBFuse(nn.Module):
def __init__(self, idx):
super(CBFuse, self).__init__()
self.idx = idx
def forward(self, xs):
target_size = xs[-1].shape[2:]
res = [F.interpolate(x[self.idx[i]], size=target_size, mode='nearest') for i, x in enumerate(xs[:-1])]
out = torch.sum(torch.stack(res + xs[-1:]), dim=0)
return out
#################
class DetectMultiBackend(nn.Module):
# YOLO MultiBackend class for python inference on various backends
def __init__(self, weights='yolo.pt', device=torch.device('cpu'), dnn=False, data=None, fp16=False, fuse=True):
# Usage:
# PyTorch: weights = *.pt
# TorchScript: *.torchscript
# ONNX Runtime: *.onnx
# ONNX OpenCV DNN: *.onnx --dnn
# OpenVINO: *_openvino_model
# CoreML: *.mlmodel
# TensorRT: *.engine
# TensorFlow SavedModel: *_saved_model
# TensorFlow GraphDef: *.pb
# TensorFlow Lite: *.tflite
# TensorFlow Edge TPU: *_edgetpu.tflite
# PaddlePaddle: *_paddle_model
from models.experimental import attempt_download, attempt_load # scoped to avoid circular import
super().__init__()
w = str(weights[0] if isinstance(weights, list) else weights)
pt, jit, onnx, onnx_end2end, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle, triton = self._model_type(w)
fp16 &= pt or jit or onnx or engine # FP16
nhwc = coreml or saved_model or pb or tflite or edgetpu # BHWC formats (vs torch BCHW)
stride = 32 # default stride
cuda = torch.cuda.is_available() and device.type != 'cpu' # use CUDA
if not (pt or triton):
w = attempt_download(w) # download if not local
if pt: # PyTorch
model = attempt_load(weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse)
stride = max(int(model.stride.max()), 32) # model stride
names = model.module.names if hasattr(model, 'module') else model.names # get class names
model.half() if fp16 else model.float()
self.model = model # explicitly assign for to(), cpu(), cuda(), half()
elif jit: # TorchScript
LOGGER.info(f'Loading {w} for TorchScript inference...')
extra_files = {'config.txt': ''} # model metadata
model = torch.jit.load(w, _extra_files=extra_files, map_location=device)
model.half() if fp16 else model.float()
if extra_files['config.txt']: # load metadata dict
d = json.loads(extra_files['config.txt'],
object_hook=lambda d: {int(k) if k.isdigit() else k: v
for k, v in d.items()})
stride, names = int(d['stride']), d['names']
elif dnn: # ONNX OpenCV DNN
LOGGER.info(f'Loading {w} for ONNX OpenCV DNN inference...')
check_requirements('opencv-python>=4.5.4')
net = cv2.dnn.readNetFromONNX(w)
elif onnx: # ONNX Runtime
LOGGER.info(f'Loading {w} for ONNX Runtime inference...')
check_requirements(('onnx', 'onnxruntime-gpu' if cuda else 'onnxruntime'))
import onnxruntime
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
session = onnxruntime.InferenceSession(w, providers=providers)
output_names = [x.name for x in session.get_outputs()]
meta = session.get_modelmeta().custom_metadata_map # metadata
if 'stride' in meta:
stride, names = int(meta['stride']), ast.literal_eval(meta['names']) # parse literal only, never execute metadata
elif xml: # OpenVINO
LOGGER.info(f'Loading {w} for OpenVINO inference...')
check_requirements('openvino') # requires openvino-dev: https://pypi.org/project/openvino-dev/
from openvino.runtime import Core, Layout, get_batch
ie = Core()
if not Path(w).is_file(): # if not *.xml
w = next(Path(w).glob('*.xml')) # get *.xml file from *_openvino_model dir
network = ie.read_model(model=w, weights=Path(w).with_suffix('.bin'))
if network.get_parameters()[0].get_layout().empty:
network.get_parameters()[0].set_layout(Layout("NCHW"))
batch_dim = get_batch(network)
if batch_dim.is_static:
batch_size = batch_dim.get_length()
executable_network = ie.compile_model(network, device_name="CPU") # device_name="MYRIAD" for Intel NCS2
stride, names = self._load_metadata(Path(w).with_suffix('.yaml')) # load metadata
elif engine: # TensorRT
LOGGER.info(f'Loading {w} for TensorRT inference...')
import tensorrt as trt # https://developer.nvidia.com/nvidia-tensorrt-download
check_version(trt.__version__, '7.0.0', hard=True) # require tensorrt>=7.0.0
if device.type == 'cpu':
device = torch.device('cuda:0')
Binding = namedtuple('Binding', ('name', 'dtype', 'shape', 'data', 'ptr'))
logger = trt.Logger(trt.Logger.INFO)
with open(w, 'rb') as f, trt.Runtime(logger) as runtime:
model = runtime.deserialize_cuda_engine(f.read())
context = model.create_execution_context()
bindings = OrderedDict()
output_names = []
fp16 = False # default updated below
dynamic = False
for i in range(model.num_bindings):
name = model.get_binding_name(i)
dtype = trt.nptype(model.get_binding_dtype(i))
if model.binding_is_input(i):
if -1 in tuple(model.get_binding_shape(i)): # dynamic
dynamic = True
context.set_binding_shape(i, tuple(model.get_profile_shape(0, i)[2]))
if dtype == np.float16:
fp16 = True
else: # output
output_names.append(name)
shape = tuple(context.get_binding_shape(i))
im = torch.from_numpy(np.empty(shape, dtype=dtype)).to(device)
bindings[name] = Binding(name, dtype, shape, im, int(im.data_ptr()))
binding_addrs = OrderedDict((n, d.ptr) for n, d in bindings.items())
batch_size = bindings['images'].shape[0] # if dynamic, this is instead max batch size
elif coreml: # CoreML
LOGGER.info(f'Loading {w} for CoreML inference...')
import coremltools as ct
model = ct.models.MLModel(w)
elif saved_model: # TF SavedModel
LOGGER.info(f'Loading {w} for TensorFlow SavedModel inference...')
import tensorflow as tf
keras = False # assume TF1 saved_model
model = tf.keras.models.load_model(w) if keras else tf.saved_model.load(w)
elif pb: # GraphDef https://www.tensorflow.org/guide/migrate#a_graphpb_or_graphpbtxt
LOGGER.info(f'Loading {w} for TensorFlow GraphDef inference...')
import tensorflow as tf
def wrap_frozen_graph(gd, inputs, outputs):
x = tf.compat.v1.wrap_function(lambda: tf.compat.v1.import_graph_def(gd, name=""), []) # wrapped
ge = x.graph.as_graph_element
return x.prune(tf.nest.map_structure(ge, inputs), tf.nest.map_structure(ge, outputs))
def gd_outputs(gd):
name_list, input_list = [], []
for node in gd.node: # tensorflow.core.framework.node_def_pb2.NodeDef
name_list.append(node.name)
input_list.extend(node.input)
return sorted(f'{x}:0' for x in list(set(name_list) - set(input_list)) if not x.startswith('NoOp'))
gd = tf.Graph().as_graph_def() # TF GraphDef
with open(w, 'rb') as f:
gd.ParseFromString(f.read())
frozen_func = wrap_frozen_graph(gd, inputs="x:0", outputs=gd_outputs(gd))
elif tflite or edgetpu: # https://www.tensorflow.org/lite/guide/python#install_tensorflow_lite_for_python
try: # https://coral.ai/docs/edgetpu/tflite-python/#update-existing-tf-lite-code-for-the-edge-tpu
from tflite_runtime.interpreter import Interpreter, load_delegate
except ImportError:
import tensorflow as tf
Interpreter, load_delegate = tf.lite.Interpreter, tf.lite.experimental.load_delegate
if edgetpu: # TF Edge TPU https://coral.ai/software/#edgetpu-runtime
LOGGER.info(f'Loading {w} for TensorFlow Lite Edge TPU inference...')
delegate = {
'Linux': 'libedgetpu.so.1',
'Darwin': 'libedgetpu.1.dylib',
'Windows': 'edgetpu.dll'}[platform.system()]
interpreter = Interpreter(model_path=w, experimental_delegates=[load_delegate(delegate)])
else: # TFLite
LOGGER.info(f'Loading {w} for TensorFlow Lite inference...')
interpreter = Interpreter(model_path=w) # load TFLite model
interpreter.allocate_tensors() # allocate
input_details = interpreter.get_input_details() # inputs
output_details = interpreter.get_output_details() # outputs
# load metadata
with contextlib.suppress(zipfile.BadZipFile):
with zipfile.ZipFile(w, "r") as model:
meta_file = model.namelist()[0]
meta = ast.literal_eval(model.read(meta_file).decode("utf-8"))
stride, names = int(meta['stride']), meta['names']
elif tfjs: # TF.js
raise NotImplementedError('ERROR: YOLO TF.js inference is not supported')
elif paddle: # PaddlePaddle
LOGGER.info(f'Loading {w} for PaddlePaddle inference...')
check_requirements('paddlepaddle-gpu' if cuda else 'paddlepaddle')
import paddle.inference as pdi
if not Path(w).is_file(): # if not *.pdmodel
w = next(Path(w).rglob('*.pdmodel')) # get *.pdmodel file from *_paddle_model dir
weights = Path(w).with_suffix('.pdiparams')
config = pdi.Config(str(w), str(weights))
if cuda:
config.enable_use_gpu(memory_pool_init_size_mb=2048, device_id=0)
predictor = pdi.create_predictor(config)
input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
output_names = predictor.get_output_names()
elif triton: # NVIDIA Triton Inference Server
LOGGER.info(f'Using {w} as Triton Inference Server...')
check_requirements('tritonclient[all]')
from utils.triton import TritonRemoteModel
model = TritonRemoteModel(url=w)
nhwc = model.runtime.startswith("tensorflow")
else:
raise NotImplementedError(f'ERROR: {w} is not a supported format')
# class names
if 'names' not in locals():
names = yaml_load(data)['names'] if data else {i: f'class{i}' for i in range(999)}
if names[0] == 'n01440764' and len(names) == 1000: # ImageNet
names = yaml_load(ROOT / 'data/ImageNet.yaml')['names'] # human-readable names
self.__dict__.update(locals()) # assign all variables to self
def forward(self, im, augment=False, visualize=False):
# YOLO MultiBackend inference
b, ch, h, w = im.shape # batch, channel, height, width
if self.fp16 and im.dtype != torch.float16:
im = im.half() # to FP16
if self.nhwc:
im = im.permute(0, 2, 3, 1) # torch BCHW to numpy BHWC shape(1,320,192,3)
if self.pt: # PyTorch
y = self.model(im, augment=augment, visualize=visualize) if augment or visualize else self.model(im)
elif self.jit: # TorchScript
y = self.model(im)
elif self.dnn: # ONNX OpenCV DNN
im = im.cpu().numpy() # torch to numpy
self.net.setInput(im)
y = self.net.forward()
elif self.onnx: # ONNX Runtime
im = im.cpu().numpy() # torch to numpy
y = self.session.run(self.output_names, {self.session.get_inputs()[0].name: im})
elif self.xml: # OpenVINO
im = im.cpu().numpy() # FP32
y = list(self.executable_network([im]).values())
elif self.engine: # TensorRT
if self.dynamic and im.shape != self.bindings['images'].shape:
i = self.model.get_binding_index('images')
self.context.set_binding_shape(i, im.shape) # reshape if dynamic
self.bindings['images'] = self.bindings['images']._replace(shape=im.shape)
for name in self.output_names:
i = self.model.get_binding_index(name)
self.bindings[name].data.resize_(tuple(self.context.get_binding_shape(i)))
s = self.bindings['images'].shape
assert im.shape == s, f"input size {im.shape} {'>' if self.dynamic else 'not equal to'} max model size {s}"
self.binding_addrs['images'] = int(im.data_ptr())
self.context.execute_v2(list(self.binding_addrs.values()))
y = [self.bindings[x].data for x in sorted(self.output_names)]
elif self.coreml: # CoreML
im = im.cpu().numpy()
im = Image.fromarray((im[0] * 255).astype('uint8'))
# im = im.resize((192, 320), Image.ANTIALIAS)
y = self.model.predict({'image': im}) # coordinates are xywh normalized
if 'confidence' in y:
box = xywh2xyxy(y['coordinates'] * [[w, h, w, h]]) # xyxy pixels
conf, cls = y['confidence'].max(1), y['confidence'].argmax(1).astype(float) # np.float alias was removed in NumPy 1.24
y = np.concatenate((box, conf.reshape(-1, 1), cls.reshape(-1, 1)), 1)
else:
y = list(reversed(y.values())) # reversed for segmentation models (pred, proto)
elif self.paddle: # PaddlePaddle
im = im.cpu().numpy().astype(np.float32)
self.input_handle.copy_from_cpu(im)
SYMBOL INDEX (1072 symbols across 74 files)
FILE: benchmarks.py
function run (line 25) | def run(
function test (line 87) | def test(
function parse_opt (line 119) | def parse_opt():
function main (line 136) | def main(opt):
FILE: classify/predict.py
function run (line 54) | def run(
function parse_opt (line 192) | def parse_opt():
function main (line 217) | def main(opt):
FILE: classify/train.py
function train (line 56) | def train(opt, device):
function parse_opt (line 271) | def parse_opt(known=False):
function main (line 298) | def main(opt):
function run (line 322) | def run(**kwargs):
FILE: classify/val.py
function run (line 45) | def run(
function parse_opt (line 144) | def parse_opt():
function main (line 163) | def main(opt):
FILE: detect.py
function run (line 24) | def run(
function parse_opt (line 189) | def parse_opt():
function main (line 224) | def main(opt):
FILE: detect_dual.py
function run (line 24) | def run(
function parse_opt (line 190) | def parse_opt():
function main (line 225) | def main(opt):
FILE: export.py
function export_formats (line 34) | def export_formats():
function try_export (line 53) | def try_export(inner_func):
function export_torchscript (line 72) | def export_torchscript(model, im, file, optimize, prefix=colorstr('Torch...
function export_onnx (line 88) | def export_onnx(model, im, file, opset, dynamic, simplify, prefix=colors...
function export_onnx_end2end (line 144) | def export_onnx_end2end(model, im, file, simplify, topk_all, iou_thres, ...
function export_openvino (line 202) | def export_openvino(file, metadata, half, prefix=colorstr('OpenVINO:')):
function export_paddle (line 220) | def export_paddle(model, im, file, metadata, prefix=colorstr('PaddlePadd...
function export_coreml (line 235) | def export_coreml(model, im, file, int8, half, prefix=colorstr('CoreML:')):
function export_engine (line 258) | def export_engine(model, im, file, half, dynamic, simplify, workspace=4,...
function export_saved_model (line 320) | def export_saved_model(model,
function export_pb (line 372) | def export_pb(keras_model, file, prefix=colorstr('TensorFlow GraphDef:')):
function export_tflite (line 389) | def export_tflite(keras_model, im, file, int8, data, nms, agnostic_nms, ...
function export_edgetpu (line 420) | def export_edgetpu(file, prefix=colorstr('Edge TPU:')):
function export_tfjs (line 445) | def export_tfjs(file, prefix=colorstr('TensorFlow.js:')):
function add_tflite_metadata (line 473) | def add_tflite_metadata(file, metadata, num_outputs):
function run (line 507) | def run(
function parse_opt (line 639) | def parse_opt():
function main (line 679) | def main(opt):
FILE: hubconf.py
function _create (line 4) | def _create(name, pretrained=True, channels=3, classes=80, autoshape=Tru...
function custom (line 69) | def custom(path='path/to/model.pt', autoshape=True, _verbose=True, devic...
FILE: models/common.py
function autopad (line 34) | def autopad(k, p=None, d=1): # kernel, padding, dilation
class Conv (line 43) | class Conv(nn.Module):
method __init__ (line 47) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
method forward (line 53) | def forward(self, x):
method forward_fuse (line 56) | def forward_fuse(self, x):
class AConv (line 60) | class AConv(nn.Module):
method __init__ (line 61) | def __init__(self, c1, c2): # ch_in, ch_out, shortcut, kernels, group...
method forward (line 65) | def forward(self, x):
class ADown (line 70) | class ADown(nn.Module):
method __init__ (line 71) | def __init__(self, c1, c2): # ch_in, ch_out, shortcut, kernels, group...
method forward (line 77) | def forward(self, x):
class RepConvN (line 86) | class RepConvN(nn.Module):
method __init__ (line 92) | def __init__(self, c1, c2, k=3, s=1, p=1, g=1, d=1, act=True, bn=False...
method forward_fuse (line 104) | def forward_fuse(self, x):
method forward (line 108) | def forward(self, x):
method get_equivalent_kernel_bias (line 113) | def get_equivalent_kernel_bias(self):
method _avg_to_3x3_tensor (line 119) | def _avg_to_3x3_tensor(self, avgp):
method _pad_1x1_to_3x3_tensor (line 128) | def _pad_1x1_to_3x3_tensor(self, kernel1x1):
method _fuse_bn_tensor (line 134) | def _fuse_bn_tensor(self, branch):
method fuse_convs (line 161) | def fuse_convs(self):
class SP (line 187) | class SP(nn.Module):
method __init__ (line 188) | def __init__(self, k=3, s=1):
method forward (line 192) | def forward(self, x):
class MP (line 196) | class MP(nn.Module):
method __init__ (line 198) | def __init__(self, k=2):
method forward (line 202) | def forward(self, x):
class ConvTranspose (line 206) | class ConvTranspose(nn.Module):
method __init__ (line 210) | def __init__(self, c1, c2, k=2, s=2, p=0, bn=True, act=True):
method forward (line 216) | def forward(self, x):
class DWConv (line 220) | class DWConv(Conv):
method __init__ (line 222) | def __init__(self, c1, c2, k=1, s=1, d=1, act=True): # ch_in, ch_out,...
class DWConvTranspose2d (line 226) | class DWConvTranspose2d(nn.ConvTranspose2d):
method __init__ (line 228) | def __init__(self, c1, c2, k=1, s=1, p1=0, p2=0): # ch_in, ch_out, ke...
class DFL (line 232) | class DFL(nn.Module):
method __init__ (line 234) | def __init__(self, c1=17):
method forward (line 241) | def forward(self, x):
class BottleneckBase (line 247) | class BottleneckBase(nn.Module):
method __init__ (line 249) | def __init__(self, c1, c2, shortcut=True, g=1, k=(1, 3), e=0.5): # ch...
method forward (line 256) | def forward(self, x):
class RBottleneckBase (line 260) | class RBottleneckBase(nn.Module):
method __init__ (line 262) | def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 1), e=0.5): # ch...
method forward (line 269) | def forward(self, x):
class RepNRBottleneckBase (line 273) | class RepNRBottleneckBase(nn.Module):
method __init__ (line 275) | def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 1), e=0.5): # ch...
method forward (line 282) | def forward(self, x):
class Bottleneck (line 286) | class Bottleneck(nn.Module):
method __init__ (line 288) | def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5): # ch...
method forward (line 295) | def forward(self, x):
class RepNBottleneck (line 299) | class RepNBottleneck(nn.Module):
method __init__ (line 301) | def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5): # ch...
method forward (line 308) | def forward(self, x):
class Res (line 312) | class Res(nn.Module):
method __init__ (line 314) | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_ou...
method forward (line 322) | def forward(self, x):
class RepNRes (line 326) | class RepNRes(nn.Module):
method __init__ (line 328) | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5): # ch_in, ch_ou...
method forward (line 336) | def forward(self, x):
class BottleneckCSP (line 340) | class BottleneckCSP(nn.Module):
method __init__ (line 342) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ...
method forward (line 353) | def forward(self, x):
class CSP (line 359) | class CSP(nn.Module):
method __init__ (line 361) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ...
method forward (line 369) | def forward(self, x):
class RepNCSP (line 373) | class RepNCSP(nn.Module):
method __init__ (line 375) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ...
method forward (line 383) | def forward(self, x):
class CSPBase (line 387) | class CSPBase(nn.Module):
method __init__ (line 389) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5): # ch_in, ...
method forward (line 397) | def forward(self, x):
class SPP (line 401) | class SPP(nn.Module):
method __init__ (line 403) | def __init__(self, c1, c2, k=(5, 9, 13)):
method forward (line 410) | def forward(self, x):
class ASPP (line 417) | class ASPP(torch.nn.Module):
method __init__ (line 419) | def __init__(self, in_channels, out_channels):
method forward (line 443) | def forward(self, x):
class SPPCSPC (line 454) | class SPPCSPC(nn.Module):
method __init__ (line 456) | def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 1...
method forward (line 468) | def forward(self, x):
class SPPF (line 475) | class SPPF(nn.Module):
method __init__ (line 477) | def __init__(self, c1, c2, k=5): # equivalent to SPP(k=(5, 9, 13))
method forward (line 485) | def forward(self, x):
class ReOrg (line 498) | class ReOrg(nn.Module):
method __init__ (line 500) | def __init__(self):
method forward (line 503) | def forward(self, x): # x(b,c,w,h) -> y(b,4c,w/2,h/2)
class Contract (line 507) | class Contract(nn.Module):
method __init__ (line 509) | def __init__(self, gain=2):
method forward (line 513) | def forward(self, x):
class Expand (line 521) | class Expand(nn.Module):
method __init__ (line 523) | def __init__(self, gain=2):
method forward (line 527) | def forward(self, x):
class Concat (line 535) | class Concat(nn.Module):
method __init__ (line 537) | def __init__(self, dimension=1):
method forward (line 541) | def forward(self, x):
class Shortcut (line 545) | class Shortcut(nn.Module):
method __init__ (line 546) | def __init__(self, dimension=0):
method forward (line 550) | def forward(self, x):
class Silence (line 554) | class Silence(nn.Module):
method __init__ (line 555) | def __init__(self):
method forward (line 557) | def forward(self, x):
class SPPELAN (line 563) | class SPPELAN(nn.Module):
method __init__ (line 565) | def __init__(self, c1, c2, c3): # ch_in, ch_out, number, shortcut, gr...
method forward (line 574) | def forward(self, x):
class ELAN1 (line 580) | class ELAN1(nn.Module):
method __init__ (line 582) | def __init__(self, c1, c2, c3, c4): # ch_in, ch_out, number, shortcut...
method forward (line 590) | def forward(self, x):
method forward_split (line 595) | def forward_split(self, x):
class RepNCSPELAN4 (line 601) | class RepNCSPELAN4(nn.Module):
method __init__ (line 603) | def __init__(self, c1, c2, c3, c4, c5=1): # ch_in, ch_out, number, sh...
method forward (line 611) | def forward(self, x):
method forward_split (line 616) | def forward_split(self, x):
class ImplicitA (line 626) | class ImplicitA(nn.Module):
method __init__ (line 627) | def __init__(self, channel):
method forward (line 633) | def forward(self, x):
class ImplicitM (line 637) | class ImplicitM(nn.Module):
method __init__ (line 638) | def __init__(self, channel):
method forward (line 644) | def forward(self, x):
class CBLinear (line 652) | class CBLinear(nn.Module):
method __init__ (line 653) | def __init__(self, c1, c2s, k=1, s=1, p=None, g=1): # ch_in, ch_outs,...
method forward (line 658) | def forward(self, x):
class CBFuse (line 662) | class CBFuse(nn.Module):
method __init__ (line 663) | def __init__(self, idx):
method forward (line 667) | def forward(self, xs):
class DetectMultiBackend (line 676) | class DetectMultiBackend(nn.Module):
method __init__ (line 678) | def __init__(self, weights='yolo.pt', device=torch.device('cpu'), dnn=...
method forward (line 866) | def forward(self, im, augment=False, visualize=False):
method from_numpy (line 948) | def from_numpy(self, x):
method warmup (line 951) | def warmup(self, imgsz=(1, 3, 640, 640)):
method _model_type (line 960) | def _model_type(p='path/to/model.pt'):
method _load_metadata (line 975) | def _load_metadata(f=Path('path/to/meta.yaml')):
class AutoShape (line 983) | class AutoShape(nn.Module):
method __init__ (line 993) | def __init__(self, model, verbose=True):
method _apply (line 1006) | def _apply(self, fn):
method forward (line 1019) | def forward(self, ims, size=640, augment=False, profile=False):
class Detections (line 1083) | class Detections:
method __init__ (line 1085) | def __init__(self, ims, pred, files, times=(0, 0, 0), names=None, shap...
method _run (line 1102) | def _run(self, pprint=False, show=False, save=False, crop=False, rende...
method show (line 1148) | def show(self, labels=True):
method save (line 1151) | def save(self, labels=True, save_dir='runs/detect/exp', exist_ok=False):
method crop (line 1155) | def crop(self, save=True, save_dir='runs/detect/exp', exist_ok=False):
method render (line 1159) | def render(self, labels=True):
method pandas (line 1163) | def pandas(self):
method tolist (line 1173) | def tolist(self):
method print (line 1182) | def print(self):
method __len__ (line 1185) | def __len__(self): # override len(results)
method __str__ (line 1188) | def __str__(self): # override print(results)
method __repr__ (line 1191) | def __repr__(self):
class Proto (line 1195) | class Proto(nn.Module):
method __init__ (line 1197) | def __init__(self, c1, c_=256, c2=32): # ch_in, number of protos, num...
method forward (line 1204) | def forward(self, x):
class UConv (line 1208) | class UConv(nn.Module):
method __init__ (line 1209) | def __init__(self, c1, c_=256, c2=256): # ch_in, number of protos, nu...
method forward (line 1216) | def forward(self, x):
class Classify (line 1220) | class Classify(nn.Module):
method __init__ (line 1222) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1): # ch_in, ch_out, k...
method forward (line 1230) | def forward(self, x):
FILE: models/experimental.py
class Sum (line 10) | class Sum(nn.Module):
method __init__ (line 12) | def __init__(self, n, weight=False): # n: number of inputs
method forward (line 19) | def forward(self, x):
class MixConv2d (line 31) | class MixConv2d(nn.Module):
method __init__ (line 33) | def __init__(self, c1, c2, k=(1, 3), s=1, equal_ch=True): # ch_in, ch...
method forward (line 52) | def forward(self, x):
class Ensemble (line 56) | class Ensemble(nn.ModuleList):
method __init__ (line 58) | def __init__(self):
method forward (line 61) | def forward(self, x, augment=False, profile=False, visualize=False):
class ORT_NMS (line 69) | class ORT_NMS(torch.autograd.Function):
method forward (line 72) | def forward(ctx,
method symbolic (line 89) | def symbolic(g, boxes, scores, max_output_boxes_per_class, iou_thresho...
class TRT_NMS (line 93) | class TRT_NMS(torch.autograd.Function):
method forward (line 96) | def forward(
method symbolic (line 117) | def symbolic(g,
class ONNX_ORT (line 142) | class ONNX_ORT(nn.Module):
method __init__ (line 144) | def __init__(self, max_obj=100, iou_thres=0.45, score_thres=0.25, max_...
method forward (line 156) | def forward(self, x):
class ONNX_TRT (line 184) | class ONNX_TRT(nn.Module):
method __init__ (line 186) | def __init__(self, max_obj=100, iou_thres=0.45, score_thres=0.25, max_...
method forward (line 199) | def forward(self, x):
class End2End (line 219) | class End2End(nn.Module):
method __init__ (line 221) | def __init__(self, model, max_obj=100, iou_thres=0.45, score_thres=0.2...
method forward (line 231) | def forward(self, x):
function attempt_load (line 237) | def attempt_load(weights, device=None, inplace=True, fuse=True):
FILE: models/tf.py
class TFBN (line 26) | class TFBN(keras.layers.Layer):
method __init__ (line 28) | def __init__(self, w=None):
method call (line 37) | def call(self, inputs):
class TFPad (line 41) | class TFPad(keras.layers.Layer):
method __init__ (line 43) | def __init__(self, pad):
method call (line 50) | def call(self, inputs):
class TFConv (line 54) | class TFConv(keras.layers.Layer):
method __init__ (line 56) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True, w=None):
method call (line 74) | def call(self, inputs):
class TFDWConv (line 78) | class TFDWConv(keras.layers.Layer):
method __init__ (line 80) | def __init__(self, c1, c2, k=1, s=1, p=None, act=True, w=None):
method call (line 96) | def call(self, inputs):
class TFDWConvTranspose2d (line 100) | class TFDWConvTranspose2d(keras.layers.Layer):
method __init__ (line 102) | def __init__(self, c1, c2, k=1, s=1, p1=0, p2=0, w=None):
method call (line 119) | def call(self, inputs):
class TFFocus (line 123) | class TFFocus(keras.layers.Layer):
method __init__ (line 125) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True, w=None):
method call (line 130) | def call(self, inputs): # x(b,w,h,c) -> y(b,w/2,h/2,4c)
class TFBottleneck (line 136) | class TFBottleneck(keras.layers.Layer):
method __init__ (line 138) | def __init__(self, c1, c2, shortcut=True, g=1, e=0.5, w=None): # ch_i...
method call (line 145) | def call(self, inputs):
class TFCrossConv (line 149) | class TFCrossConv(keras.layers.Layer):
method __init__ (line 151) | def __init__(self, c1, c2, k=3, s=1, g=1, e=1.0, shortcut=False, w=None):
method call (line 158) | def call(self, inputs):
class TFConv2d (line 162) | class TFConv2d(keras.layers.Layer):
method __init__ (line 164) | def __init__(self, c1, c2, k, s=1, g=1, bias=True, w=None):
method call (line 176) | def call(self, inputs):
class TFBottleneckCSP (line 180) | class TFBottleneckCSP(keras.layers.Layer):
method __init__ (line 182) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, w=None):
method call (line 194) | def call(self, inputs):
class TFC3 (line 200) | class TFC3(keras.layers.Layer):
method __init__ (line 202) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, w=None):
method call (line 211) | def call(self, inputs):
class TFC3x (line 215) | class TFC3x(keras.layers.Layer):
method __init__ (line 217) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, w=None):
method call (line 227) | def call(self, inputs):
class TFSPP (line 231) | class TFSPP(keras.layers.Layer):
method __init__ (line 233) | def __init__(self, c1, c2, k=(5, 9, 13), w=None):
method call (line 240) | def call(self, inputs):
class TFSPPF (line 245) | class TFSPPF(keras.layers.Layer):
method __init__ (line 247) | def __init__(self, c1, c2, k=5, w=None):
method call (line 254) | def call(self, inputs):
class TFDetect (line 261) | class TFDetect(keras.layers.Layer):
method __init__ (line 263) | def __init__(self, nc=80, anchors=(), ch=(), imgsz=(640, 640), w=None)...
method call (line 280) | def call(self, inputs):
method _make_grid (line 304) | def _make_grid(nx=20, ny=20):
class TFSegment (line 311) | class TFSegment(TFDetect):
method __init__ (line 313) | def __init__(self, nc=80, anchors=(), nm=32, npr=256, ch=(), imgsz=(64...
method call (line 322) | def call(self, x):
class TFProto (line 330) | class TFProto(keras.layers.Layer):
method __init__ (line 332) | def __init__(self, c1, c_=256, c2=32, w=None):
method call (line 339) | def call(self, inputs):
class TFUpsample (line 343) | class TFUpsample(keras.layers.Layer):
method __init__ (line 345) | def __init__(self, size, scale_factor, mode, w=None): # warning: all ...
method call (line 354) | def call(self, inputs):
class TFConcat (line 358) | class TFConcat(keras.layers.Layer):
method __init__ (line 360) | def __init__(self, dimension=1, w=None):
method call (line 365) | def call(self, inputs):
function parse_model (line 369) | def parse_model(d, ch, model, imgsz): # model_dict, input_channels(3)
class TFModel (line 425) | class TFModel:
method __init__ (line 427) | def __init__(self, cfg='yolo.yaml', ch=3, nc=None, model=None, imgsz=(...
method predict (line 443) | def predict(self,
method _xywh2xyxy (line 486) | def _xywh2xyxy(xywh):
class AgnosticNMS (line 492) | class AgnosticNMS(keras.layers.Layer):
method call (line 494) | def call(self, input, topk_all, iou_thres, conf_thres):
method _nms (line 502) | def _nms(x, topk_all=100, iou_thres=0.45, conf_thres=0.25): # agnosti...
function activations (line 530) | def activations(act=nn.SiLU):
function representative_dataset_gen (line 542) | def representative_dataset_gen(dataset, ncalib=100):
function run (line 553) | def run(
function parse_opt (line 578) | def parse_opt():
function main (line 590) | def main(opt):
FILE: models/yolo.py
class Detect (line 29) | class Detect(nn.Module):
method __init__ (line 37) | def __init__(self, nc=80, ch=(), inplace=True): # detection layer
method forward (line 53) | def forward(self, x):
method bias_init (line 68) | def bias_init(self):
class DDetect (line 78) | class DDetect(nn.Module):
method __init__ (line 86) | def __init__(self, nc=80, ch=(), inplace=True): # detection layer
method forward (line 102) | def forward(self, x):
method bias_init (line 117) | def bias_init(self):
class DualDetect (line 127) | class DualDetect(nn.Module):
method __init__ (line 135) | def __init__(self, nc=80, ch=(), inplace=True): # detection layer
method forward (line 157) | def forward(self, x):
method bias_init (line 177) | def bias_init(self):
class DualDDetect (line 190) | class DualDDetect(nn.Module):
method __init__ (line 198) | def __init__(self, nc=80, ch=(), inplace=True): # detection layer
method forward (line 220) | def forward(self, x):
method bias_init (line 246) | def bias_init(self):
class TripleDetect (line 259) | class TripleDetect(nn.Module):
method __init__ (line 267) | def __init__(self, nc=80, ch=(), inplace=True): # detection layer
method forward (line 295) | def forward(self, x):
method bias_init (line 319) | def bias_init(self):
class TripleDDetect (line 335) | class TripleDDetect(nn.Module):
method __init__ (line 343) | def __init__(self, nc=80, ch=(), inplace=True): # detection layer
method forward (line 377) | def forward(self, x):
method bias_init (line 403) | def bias_init(self):
class Segment (line 419) | class Segment(Detect):
method __init__ (line 421) | def __init__(self, nc=80, nm=32, npr=256, ch=(), inplace=True):
method forward (line 431) | def forward(self, x):
class DSegment (line 442) | class DSegment(DDetect):
method __init__ (line 444) | def __init__(self, nc=80, nm=32, npr=256, ch=(), inplace=True):
method forward (line 455) | def forward(self, x):
class DualDSegment (line 466) | class DualDSegment(DualDDetect):
method __init__ (line 468) | def __init__(self, nc=80, nm=32, npr=256, ch=(), inplace=True):
method forward (line 482) | def forward(self, x):
class Panoptic (line 494) | class Panoptic(Detect):
method __init__ (line 496) | def __init__(self, nc=80, sem_nc=93, nm=32, npr=256, ch=(), inplace=Tr...
method forward (line 509) | def forward(self, x):
class BaseModel (line 521) | class BaseModel(nn.Module):
method forward (line 523) | def forward(self, x, profile=False, visualize=False):
method _forward_once (line 526) | def _forward_once(self, x, profile=False, visualize=False):
method _profile_one_layer (line 539) | def _profile_one_layer(self, m, x, dt):
method fuse (line 552) | def fuse(self): # fuse model Conv2d() + BatchNorm2d() layers
method info (line 565) | def info(self, verbose=False, img_size=640): # print model information
method _apply (line 568) | def _apply(self, fn):
class DetectionModel (line 580) | class DetectionModel(BaseModel):
method __init__ (line 582) | def __init__(self, cfg='yolo.yaml', ch=3, nc=None, anchors=None): # m...
method forward (line 630) | def forward(self, x, augment=False, profile=False, visualize=False):
method _forward_augment (line 635) | def _forward_augment(self, x):
method _descale_pred (line 649) | def _descale_pred(self, p, flips, scale, img_size):
method _clip_augmented (line 666) | def _clip_augmented(self, y):
class SegmentationModel (line 681) | class SegmentationModel(DetectionModel):
method __init__ (line 683) | def __init__(self, cfg='yolo-seg.yaml', ch=3, nc=None, anchors=None):
class ClassificationModel (line 687) | class ClassificationModel(BaseModel):
method __init__ (line 689) | def __init__(self, cfg=None, model=None, nc=1000, cutoff=10): # yaml,...
method _from_detection_model (line 693) | def _from_detection_model(self, model, nc=1000, cutoff=10):
method _from_yaml (line 708) | def _from_yaml(self, cfg):
function parse_model (line 713) | def parse_model(d, ch): # model_dict, input_channels(3)
FILE: panoptic/predict.py
function run (line 26) | def run(
function parse_opt (line 203) | def parse_opt():
function main (line 239) | def main(opt):
FILE: panoptic/train.py
function train (line 51) | def train(hyp, opt, device, callbacks): # hyp is path/to/hyp.yaml or hy...
function parse_opt (line 462) | def parse_opt(known=False):
function main (line 510) | def main(opt, callbacks=Callbacks()):
function run (line 651) | def run(**kwargs):
FILE: panoptic/val.py
function save_one_txt (line 37) | def save_one_txt(predn, save_conf, shape, file):
function save_one_json (line 47) | def save_one_json(predn, jdict, path, class_map, pred_masks):
function process_batch (line 71) | def process_batch(detections, labels, iouv, pred_masks=None, gt_masks=No...
function run (line 109) | def run(
function parse_opt (line 530) | def parse_opt():
function main (line 562) | def main(opt):
FILE: segment/predict.py
function run (line 26) | def run(
function parse_opt (line 203) | def parse_opt():
function main (line 239) | def main(opt):
FILE: segment/train.py
function train (line 51) | def train(hyp, opt, device, callbacks): # hyp is path/to/hyp.yaml or hy...
function parse_opt (line 449) | def parse_opt(known=False):
function main (line 494) | def main(opt, callbacks=Callbacks()):
function run (line 635) | def run(**kwargs):
FILE: segment/train_dual.py
function train (line 52) | def train(hyp, opt, device, callbacks): # hyp is path/to/hyp.yaml or hy...
function parse_opt (line 450) | def parse_opt(known=False):
function main (line 495) | def main(opt, callbacks=Callbacks()):
function run (line 636) | def run(**kwargs):
FILE: segment/val.py
function save_one_txt (line 35) | def save_one_txt(predn, save_conf, shape, file):
function save_one_json (line 45) | def save_one_json(predn, jdict, path, class_map, pred_masks):
function process_batch (line 69) | def process_batch(detections, labels, iouv, pred_masks=None, gt_masks=No...
function run (line 107) | def run(
function parse_opt (line 390) | def parse_opt():
function main (line 422) | def main(opt):
FILE: segment/val_dual.py
function save_one_txt (line 35) | def save_one_txt(predn, save_conf, shape, file):
function save_one_json (line 45) | def save_one_json(predn, jdict, path, class_map, pred_masks):
function process_batch (line 69) | def process_batch(detections, labels, iouv, pred_masks=None, gt_masks=No...
function run (line 107) | def run(
function parse_opt (line 391) | def parse_opt():
function main (line 423) | def main(opt):
FILE: train.py
function train (line 51) | def train(hyp, opt, device, callbacks): # hyp is path/to/hyp.yaml or hy...
function parse_opt (line 430) | def parse_opt(known=False):
function main (line 482) | def main(opt, callbacks=Callbacks()):
function run (line 623) | def run(**kwargs):
FILE: train_dual.py
function train (line 54) | def train(hyp, opt, device, callbacks): # hyp is path/to/hyp.yaml or hy...
function parse_opt (line 438) | def parse_opt(known=False):
function main (line 490) | def main(opt, callbacks=Callbacks()):
function run (line 633) | def run(**kwargs):
FILE: train_triple.py
function train (line 52) | def train(hyp, opt, device, callbacks): # hyp is path/to/hyp.yaml or hy...
function parse_opt (line 432) | def parse_opt(known=False):
function main (line 482) | def main(opt, callbacks=Callbacks()):
function run (line 625) | def run(**kwargs):
FILE: utils/__init__.py
function emojis (line 6) | def emojis(str=''):
class TryExcept (line 11) | class TryExcept(contextlib.ContextDecorator):
method __init__ (line 13) | def __init__(self, msg=''):
method __enter__ (line 16) | def __enter__(self):
method __exit__ (line 19) | def __exit__(self, exc_type, value, traceback):
function threaded (line 25) | def threaded(func):
function join_threads (line 35) | def join_threads(verbose=False):
function notebook_init (line 45) | def notebook_init(verbose=True):
FILE: utils/activations.py
class SiLU (line 6) | class SiLU(nn.Module):
method forward (line 9) | def forward(x):
class Hardswish (line 13) | class Hardswish(nn.Module):
method forward (line 16) | def forward(x):
class Mish (line 21) | class Mish(nn.Module):
method forward (line 24) | def forward(x):
class MemoryEfficientMish (line 28) | class MemoryEfficientMish(nn.Module):
class F (line 30) | class F(torch.autograd.Function):
method forward (line 33) | def forward(ctx, x):
method backward (line 38) | def backward(ctx, grad_output):
method forward (line 44) | def forward(self, x):
class FReLU (line 48) | class FReLU(nn.Module):
method __init__ (line 50) | def __init__(self, c1, k=3): # ch_in, kernel
method forward (line 55) | def forward(self, x):
class AconC (line 59) | class AconC(nn.Module):
method __init__ (line 65) | def __init__(self, c1):
method forward (line 71) | def forward(self, x):
class MetaAconC (line 76) | class MetaAconC(nn.Module):
method __init__ (line 82) | def __init__(self, c1, k=1, s=1, r=16): # ch_in, kernel, stride, r
method forward (line 92) | def forward(self, x):
FILE: utils/augmentations.py
class Albumentations (line 17) | class Albumentations:
method __init__ (line 19) | def __init__(self, size=640):
method __call__ (line 43) | def __call__(self, im, labels, p=1.0):
function normalize (line 50) | def normalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD, inplace=False):
function denormalize (line 55) | def denormalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD):
function augment_hsv (line 62) | def augment_hsv(im, hgain=0.5, sgain=0.5, vgain=0.5):
function hist_equalize (line 78) | def hist_equalize(im, clahe=True, bgr=False):
function replicate (line 89) | def replicate(im, labels):
function letterbox (line 106) | def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True...
function random_perspective (line 139) | def random_perspective(im,
function copy_paste (line 235) | def copy_paste(im, labels, segments, p=0.5):
function cutout (line 260) | def cutout(im, labels, p=0.5):
function mixup (line 287) | def mixup(im, labels, im2, labels2):
function box_candidates (line 295) | def box_candidates(box1, box2, wh_thr=2, ar_thr=100, area_thr=0.1, eps=1...
function classify_albumentations (line 303) | def classify_albumentations(
function classify_transforms (line 345) | def classify_transforms(size=224):
class LetterBox (line 352) | class LetterBox:
method __init__ (line 354) | def __init__(self, size=(640, 640), auto=False, stride=32):
method __call__ (line 360) | def __call__(self, im): # im = np.array HWC
class CenterCrop (line 371) | class CenterCrop:
method __init__ (line 373) | def __init__(self, size=640):
method __call__ (line 377) | def __call__(self, im): # im = np.array HWC
class ToTensor (line 384) | class ToTensor:
method __init__ (line 386) | def __init__(self, half=False):
method __call__ (line 390) | def __call__(self, im): # im = np.array HWC in BGR order
FILE: utils/autoanchor.py
function check_anchor_order (line 14) | def check_anchor_order(m):
function check_anchors (line 25) | def check_anchors(dataset, model, thr=4.0, imgsz=640):
function kmean_anchors (line 62) | def kmean_anchors(dataset='./data/coco128.yaml', n=9, img_size=640, thr=...
FILE: utils/autobatch.py
function check_train_batch_size (line 10) | def check_train_batch_size(model, imgsz=640, amp=True):
function autobatch (line 16) | def autobatch(model, imgsz=640, fraction=0.8, batch_size=16):
FILE: utils/callbacks.py
class Callbacks (line 4) | class Callbacks:
method __init__ (line 9) | def __init__(self):
method register_action (line 33) | def register_action(self, hook, name='', callback=None):
method get_registered_actions (line 46) | def get_registered_actions(self, hook=None):
method run (line 55) | def run(self, hook, *args, thread=False, **kwargs):
FILE: utils/coco_utils.py
function getCocoIds (line 53) | def getCocoIds(name = 'semantic'):
function getMappingId (line 63) | def getMappingId(index, name = 'semantic'):
function getMappingIndex (line 67) | def getMappingIndex(id, name = 'semantic'):
function annToRLE (line 72) | def annToRLE(ann, img_size):
function annToMask (line 89) | def annToMask(ann, img_size):
function convert_to_polys (line 95) | def convert_to_polys(mask):
FILE: utils/dataloaders.py
function get_hash (line 47) | def get_hash(paths):
function exif_size (line 55) | def exif_size(img):
function exif_transpose (line 65) | def exif_transpose(image):
function seed_worker (line 91) | def seed_worker(worker_id):
function create_dataloader (line 98) | def create_dataloader(path,
class InfiniteDataLoader (line 154) | class InfiniteDataLoader(dataloader.DataLoader):
method __init__ (line 160) | def __init__(self, *args, **kwargs):
method __len__ (line 165) | def __len__(self):
method __iter__ (line 168) | def __iter__(self):
class _RepeatSampler (line 173) | class _RepeatSampler:
method __init__ (line 180) | def __init__(self, sampler):
method __iter__ (line 183) | def __iter__(self):
class LoadScreenshots (line 188) | class LoadScreenshots:
method __init__ (line 190) | def __init__(self, source, img_size=640, stride=32, auto=True, transfo...
method __iter__ (line 219) | def __iter__(self):
method __next__ (line 222) | def __next__(self):
class LoadImages (line 237) | class LoadImages:
method __init__ (line 239) | def __init__(self, path, img_size=640, stride=32, auto=True, transform...
method __iter__ (line 272) | def __iter__(self):
method __next__ (line 276) | def __next__(self):
method _new_video (line 316) | def _new_video(self, path):
method _cv2_rotate (line 324) | def _cv2_rotate(self, im):
method __len__ (line 334) | def __len__(self):
class LoadStreams (line 338) | class LoadStreams:
method __init__ (line 340) | def __init__(self, sources='streams.txt', img_size=640, stride=32, aut...
method update (line 384) | def update(self, i, cap, stream):
method __iter__ (line 400) | def __iter__(self):
method __next__ (line 404) | def __next__(self):
method __len__ (line 420) | def __len__(self):
function img2label_paths (line 424) | def img2label_paths(img_paths):
class LoadImagesAndLabels (line 430) | class LoadImagesAndLabels(Dataset):
method __init__ (line 435) | def __init__(self,
method check_cache_ram (line 585) | def check_cache_ram(self, safety_margin=0.1, prefix=''):
method cache_labels (line 602) | def cache_labels(self, path=Path('./labels.cache'), prefix=''):
method __len__ (line 640) | def __len__(self):
method __getitem__ (line 649) | def __getitem__(self, index):
method load_image (line 723) | def load_image(self, i):
method cache_images_to_disk (line 740) | def cache_images_to_disk(self, i):
method load_mosaic (line 746) | def load_mosaic(self, index):
method load_mosaic9 (line 804) | def load_mosaic9(self, index):
method collate_fn (line 882) | def collate_fn(batch):
method collate_fn4 (line 889) | def collate_fn4(batch):
function flatten_recursive (line 916) | def flatten_recursive(path=DATASETS_DIR / 'coco128'):
function extract_boxes (line 926) | def extract_boxes(path=DATASETS_DIR / 'coco128'): # from utils.dataload...
function autosplit (line 960) | def autosplit(path=DATASETS_DIR / 'coco128/images', weights=(0.9, 0.1, 0...
function verify_image_label (line 986) | def verify_image_label(args):
class HUBDatasetStats (line 1038) | class HUBDatasetStats():
method __init__ (line 1053) | def __init__(self, path='coco128.yaml', autodownload=False):
method _find_yaml (line 1072) | def _find_yaml(dir):
method _unzip (line 1082) | def _unzip(self, path):
method _hub_ops (line 1092) | def _hub_ops(self, f, max_dim=1920):
method get_json (line 1110) | def get_json(self, save=False, verbose=False):
method process_images (line 1145) | def process_images(self):
class ClassificationDataset (line 1159) | class ClassificationDataset(torchvision.datasets.ImageFolder):
method __init__ (line 1168) | def __init__(self, root, augment, imgsz, cache=False):
method __getitem__ (line 1176) | def __getitem__(self, i):
function create_classification_dataloader (line 1193) | def create_classification_dataloader(path,
FILE: utils/downloads.py
function is_url (line 11) | def is_url(url, check=True):
function gsutil_getsize (line 22) | def gsutil_getsize(url=''):
function url_getsize (line 28) | def url_getsize(url='https://ultralytics.com/images/bus.jpg'):
function safe_download (line 34) | def safe_download(file, url, url2=None, min_bytes=1E0, error_msg=''):
function attempt_download (line 57) | def attempt_download(file, repo='ultralytics/yolov5', release='v7.0'):
FILE: utils/general.py
function is_ascii (line 58) | def is_ascii(s=''):
function is_chinese (line 64) | def is_chinese(s='人工智能'):
function is_colab (line 69) | def is_colab():
function is_notebook (line 74) | def is_notebook():
function is_kaggle (line 80) | def is_kaggle():
function is_docker (line 85) | def is_docker() -> bool:
function is_writeable (line 96) | def is_writeable(dir, test=False):
function set_logging (line 113) | def set_logging(name=LOGGING_NAME, verbose=True):
function user_config_dir (line 142) | def user_config_dir(dir='Ultralytics', env_var='YOLOV5_CONFIG_DIR'):
class Profile (line 158) | class Profile(contextlib.ContextDecorator):
method __init__ (line 160) | def __init__(self, t=0.0):
method __enter__ (line 164) | def __enter__(self):
method __exit__ (line 168) | def __exit__(self, type, value, traceback):
method time (line 172) | def time(self):
class Timeout (line 178) | class Timeout(contextlib.ContextDecorator):
method __init__ (line 180) | def __init__(self, seconds, *, timeout_msg='', suppress_timeout_errors...
method _timeout_handler (line 185) | def _timeout_handler(self, signum, frame):
method __enter__ (line 188) | def __enter__(self):
method __exit__ (line 193) | def __exit__(self, exc_type, exc_val, exc_tb):
class WorkingDirectory (line 200) | class WorkingDirectory(contextlib.ContextDecorator):
method __init__ (line 202) | def __init__(self, new_dir):
method __enter__ (line 206) | def __enter__(self):
method __exit__ (line 209) | def __exit__(self, exc_type, exc_val, exc_tb):
function methods (line 213) | def methods(instance):
function print_args (line 218) | def print_args(args: Optional[dict] = None, show_file=True, show_func=Fa...
function init_seeds (line 233) | def init_seeds(seed=0, deterministic=False):
function intersect_dicts (line 248) | def intersect_dicts(da, db, exclude=()):
function get_default_args (line 253) | def get_default_args(func):
function get_latest_run (line 259) | def get_latest_run(search_dir='.'):
function file_age (line 265) | def file_age(path=__file__):
function file_date (line 271) | def file_date(path=__file__):
function file_size (line 277) | def file_size(path):
function check_online (line 289) | def check_online():
function git_describe (line 304) | def git_describe(path=ROOT): # path must be a directory
function check_git_status (line 315) | def check_git_status(repo='WongKinYiu/yolov9', branch='main'):
function check_git_info (line 342) | def check_git_info(path='.'):
function check_python (line 359) | def check_python(minimum='3.7.0'):
function check_version (line 364) | def check_version(current='0.0.0', minimum='0.0.0', name='version ', pin...
function check_requirements (line 377) | def check_requirements(requirements=ROOT / 'requirements.txt', exclude=(...
function check_img_size (line 411) | def check_img_size(imgsz, s=32, floor=0):
function check_imshow (line 423) | def check_imshow(warn=False):
function check_suffix (line 439) | def check_suffix(file='yolo.pt', suffix=('.pt',), msg=''):
function check_yaml (line 450) | def check_yaml(file, suffix=('.yaml', '.yml')):
function check_file (line 455) | def check_file(file, suffix=''):
function check_font (line 483) | def check_font(font=FONT, progress=False):
function check_dataset (line 493) | def check_dataset(data, autodownload=True):
function check_amp (line 559) | def check_amp(model):
function yaml_load (line 587) | def yaml_load(file='data.yaml'):
function yaml_save (line 593) | def yaml_save(file='data.yaml', data={}):
function unzip_file (line 599) | def unzip_file(file, path=None, exclude=('.DS_Store', '__MACOSX')):
function url2file (line 609) | def url2file(url):
function download (line 615) | def download(url, dir='.', unzip=True, delete=True, curl=False, threads=...
function make_divisible (line 664) | def make_divisible(x, divisor):
function clean_str (line 671) | def clean_str(s):
function one_cycle (line 676) | def one_cycle(y1=0.0, y2=1.0, steps=100):
function one_flat_cycle (line 681) | def one_flat_cycle(y1=0.0, y2=1.0, steps=100):
function colorstr (line 687) | def colorstr(*input):
function labels_to_class_weights (line 713) | def labels_to_class_weights(labels, nc=80):
function labels_to_image_weights (line 732) | def labels_to_image_weights(labels, nc=80, class_weights=np.ones(80)):
function coco80_to_coco91_class (line 739) | def coco80_to_coco91_class(): # converts 80-index (val2014) to 91-index...
function xyxy2xywh (line 751) | def xyxy2xywh(x):
function xywh2xyxy (line 761) | def xywh2xyxy(x):
function xywhn2xyxy (line 771) | def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0):
function xyxy2xywhn (line 781) | def xyxy2xywhn(x, w=640, h=640, clip=False, eps=0.0):
function xyn2xy (line 793) | def xyn2xy(x, w=640, h=640, padw=0, padh=0):
function segment2box (line 801) | def segment2box(segment, width=640, height=640):
function segments2boxes (line 809) | def segments2boxes(segments):
function resample_segments (line 818) | def resample_segments(segments, n=1000):
function scale_boxes (line 828) | def scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None):
function scale_segments (line 844) | def scale_segments(img1_shape, segments, img0_shape, ratio_pad=None, nor...
function clip_boxes (line 863) | def clip_boxes(boxes, shape):
function clip_segments (line 875) | def clip_segments(segments, shape):
function non_max_suppression (line 885) | def non_max_suppression(
function strip_optimizer (line 997) | def strip_optimizer(f='best.pt', s=''): # from utils.general import *; ...
function print_mutation (line 1013) | def print_mutation(keys, results, hyp, save_dir, bucket, prefix=colorstr...
function apply_classifier (line 1052) | def apply_classifier(x, model, img, im0):
function increment_path (line 1087) | def increment_path(path, exist_ok=False, sep='', mkdir=False):
function imread (line 1117) | def imread(path, flags=cv2.IMREAD_COLOR):
function imwrite (line 1121) | def imwrite(path, im):
function imshow (line 1129) | def imshow(path, im):
FILE: utils/lion.py
class Lion (line 6) | class Lion(Optimizer):
method __init__ (line 9) | def __init__(self, params, lr=1e-4, betas=(0.9, 0.99), weight_decay=0.0):
method step (line 30) | def step(self, closure=None):
FILE: utils/loggers/__init__.py
class Loggers (line 52) | class Loggers():
method __init__ (line 54) | def __init__(self, save_dir=None, weights=None, opt=None, hyp=None, lo...
method remote_dataset (line 133) | def remote_dataset(self):
method on_train_start (line 145) | def on_train_start(self):
method on_pretrain_routine_start (line 149) | def on_pretrain_routine_start(self):
method on_pretrain_routine_end (line 153) | def on_pretrain_routine_end(self, labels, names):
method on_train_batch_end (line 165) | def on_train_batch_end(self, model, ni, imgs, targets, paths, vals):
method on_train_epoch_end (line 185) | def on_train_epoch_end(self, epoch):
method on_val_start (line 193) | def on_val_start(self):
method on_val_image_end (line 197) | def on_val_image_end(self, pred, predn, path, names, im):
method on_val_batch_end (line 204) | def on_val_batch_end(self, batch_i, im, targets, paths, shapes, out):
method on_val_end (line 208) | def on_val_end(self, nt, tp, fp, p, r, f1, ap, ap50, ap_class, confusi...
method on_fit_epoch_end (line 220) | def on_fit_epoch_end(self, vals, epoch, best_fitness, fi):
method on_model_save (line 253) | def on_model_save(self, last, epoch, final_epoch, best_fitness, fi):
method on_train_end (line 266) | def on_train_end(self, last, best, epoch, results):
method on_params_update (line 298) | def on_params_update(self, params: dict):
class GenericLogger (line 306) | class GenericLogger:
method __init__ (line 316) | def __init__(self, opt, console_logger, include=('tb', 'wandb')):
method log_metrics (line 335) | def log_metrics(self, metrics, epoch):
method log_images (line 351) | def log_images(self, files, name='Images', epoch=0):
method log_graph (line 363) | def log_graph(self, model, imgsz=(640, 640)):
method log_model (line 368) | def log_model(self, model_path, epoch=0, metadata={}):
method update_params (line 375) | def update_params(self, params):
function log_tensorboard_graph (line 381) | def log_tensorboard_graph(tb, model, imgsz=(640, 640)):
function web_project_name (line 394) | def web_project_name(project):
FILE: utils/loggers/clearml/clearml_utils.py
function construct_dataset (line 20) | def construct_dataset(clearml_info_string):
class ClearmlLogger (line 55) | class ClearmlLogger:
method __init__ (line 66) | def __init__(self, opt, hyp):
method log_debug_samples (line 109) | def log_debug_samples(self, files, title='Debug Samples'):
method log_image_with_boxes (line 126) | def log_image_with_boxes(self, image_path, boxes, class_names, image, ...
FILE: utils/loggers/comet/__init__.py
class CometLogger (line 64) | class CometLogger:
method __init__ (line 69) | def __init__(self, opt, hyp, run_id=None, job_type="Training", **exper...
method _get_experiment (line 164) | def _get_experiment(self, mode, experiment_id=None):
method log_metrics (line 193) | def log_metrics(self, log_dict, **kwargs):
method log_parameters (line 196) | def log_parameters(self, log_dict, **kwargs):
method log_asset (line 199) | def log_asset(self, asset_path, **kwargs):
method log_asset_data (line 202) | def log_asset_data(self, asset, **kwargs):
method log_image (line 205) | def log_image(self, img, **kwargs):
method log_model (line 208) | def log_model(self, path, opt, epoch, fitness_score, best_model=False):
method check_dataset (line 230) | def check_dataset(self, data_file):
method log_predictions (line 244) | def log_predictions(self, image, labelsn, path, shape, predn):
method preprocess_prediction (line 288) | def preprocess_prediction(self, image, labels, shape, pred):
method add_assets_to_artifact (line 307) | def add_assets_to_artifact(self, artifact, path, asset_path, split):
method upload_dataset_artifact (line 324) | def upload_dataset_artifact(self):
method download_dataset_artifact (line 348) | def download_dataset_artifact(self, artifact_path):
method update_data_paths (line 368) | def update_data_paths(self, data_dict):
method on_pretrain_routine_end (line 379) | def on_pretrain_routine_end(self, paths):
method on_train_start (line 392) | def on_train_start(self):
method on_train_epoch_start (line 395) | def on_train_epoch_start(self):
method on_train_epoch_end (line 398) | def on_train_epoch_end(self, epoch):
method on_train_batch_start (line 403) | def on_train_batch_start(self):
method on_train_batch_end (line 406) | def on_train_batch_end(self, log_dict, step):
method on_train_end (line 413) | def on_train_end(self, files, save_dir, last, best, epoch, results):
method on_val_start (line 440) | def on_val_start(self):
method on_val_batch_start (line 443) | def on_val_batch_start(self):
method on_val_batch_end (line 446) | def on_val_batch_end(self, batch_i, images, targets, paths, shapes, ou...
method on_val_end (line 464) | def on_val_end(self, nt, tp, fp, p, r, f1, ap, ap50, ap_class, confusi...
method on_fit_epoch_end (line 497) | def on_fit_epoch_end(self, result, epoch):
method on_model_save (line 500) | def on_model_save(self, last, epoch, final_epoch, best_fitness, fi):
method on_params_update (line 504) | def on_params_update(self, params):
method finish_run (line 507) | def finish_run(self):
FILE: utils/loggers/comet/comet_utils.py
function download_model_checkpoint (line 19) | def download_model_checkpoint(opt, experiment):
function set_opt_parameters (line 66) | def set_opt_parameters(opt, experiment):
function check_comet_weights (line 97) | def check_comet_weights(opt):
function check_comet_resume (line 124) | def check_comet_resume(opt):
FILE: utils/loggers/comet/hpo.py
function get_args (line 27) | def get_args(known=False):
function run (line 83) | def run(parameters, opt):
FILE: utils/loggers/wandb/log_dataset.py
function create_dataset_artifact (line 10) | def create_dataset_artifact(opt):
FILE: utils/loggers/wandb/sweep.py
function sweep (line 17) | def sweep():
FILE: utils/loggers/wandb/wandb_utils.py
function remove_prefix (line 32) | def remove_prefix(from_string, prefix=WANDB_ARTIFACT_PREFIX):
function check_wandb_config_file (line 36) | def check_wandb_config_file(data_config_file):
function check_wandb_dataset (line 43) | def check_wandb_dataset(data_file):
function get_run_info (line 62) | def get_run_info(run_path):
function check_wandb_resume (line 71) | def check_wandb_resume(opt):
function process_wandb_config_ddp_mode (line 85) | def process_wandb_config_ddp_mode(opt):
class WandbLogger (line 109) | class WandbLogger():
method __init__ (line 123) | def __init__(self, opt, run_id=None, job_type='Training'):
method check_and_upload_dataset (line 203) | def check_and_upload_dataset(self, opt):
method setup_training (line 220) | def setup_training(self, opt):
method download_dataset_artifact (line 272) | def download_dataset_artifact(self, path, alias):
method download_model_artifact (line 292) | def download_model_artifact(self, opt):
method log_model (line 310) | def log_model(self, path, opt, epoch, fitness_score, best_model=False):
method log_dataset_artifact (line 335) | def log_dataset_artifact(self, data_file, single_cls, project, overwri...
method map_val_table_path (line 393) | def map_val_table_path(self):
method create_dataset_table (line 403) | def create_dataset_table(self, dataset: LoadImagesAndLabels, class_to_...
method log_training_progress (line 449) | def log_training_progress(self, predn, path, names):
method val_one_image (line 492) | def val_one_image(self, pred, predn, path, names, im):
method log (line 520) | def log(self, log_dict):
method end_epoch (line 531) | def end_epoch(self, best_result=False):
method finish_run (line 566) | def finish_run(self):
function all_logging_disabled (line 578) | def all_logging_disabled(highest_level=logging.CRITICAL):
FILE: utils/loss.py
function smooth_BCE (line 9) | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues...
class BCEBlurWithLogitsLoss (line 14) | class BCEBlurWithLogitsLoss(nn.Module):
method __init__ (line 16) | def __init__(self, alpha=0.05):
method forward (line 21) | def forward(self, pred, true):
class FocalLoss (line 31) | class FocalLoss(nn.Module):
method __init__ (line 33) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 41) | def forward(self, pred, true):
class QFocalLoss (line 61) | class QFocalLoss(nn.Module):
method __init__ (line 63) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 71) | def forward(self, pred, true):
class ComputeLoss (line 87) | class ComputeLoss:
method __init__ (line 91) | def __init__(self, model, autobalance=False):
method __call__ (line 116) | def __call__(self, p, targets): # predictions, targets
method build_targets (line 171) | def build_targets(self, p, targets):
class ComputeLoss_NEW (line 228) | class ComputeLoss_NEW:
method __init__ (line 232) | def __init__(self, model, autobalance=False):
method __call__ (line 258) | def __call__(self, p, targets): # predictions, targets
method build_targets (line 303) | def build_targets(self, p, targets):
FILE: utils/loss_tal.py
function smooth_BCE (line 14) | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues...
class VarifocalLoss (line 19) | class VarifocalLoss(nn.Module):
method __init__ (line 21) | def __init__(self):
method forward (line 24) | def forward(self, pred_score, gt_score, label, alpha=0.75, gamma=2.0):
class FocalLoss (line 32) | class FocalLoss(nn.Module):
method __init__ (line 34) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 42) | def forward(self, pred, true):
class BboxLoss (line 62) | class BboxLoss(nn.Module):
method __init__ (line 63) | def __init__(self, reg_max, use_dfl=False):
method forward (line 68) | def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes...
method _df_loss (line 94) | def _df_loss(self, pred_dist, target):
class ComputeLoss (line 106) | class ComputeLoss:
method __init__ (line 108) | def __init__(self, model, use_dfl=True):
method preprocess (line 142) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 157) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 165) | def __call__(self, p, targets, img=None, epoch=0):
FILE: utils/loss_tal_dual.py
function smooth_BCE (line 14) | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues...
class VarifocalLoss (line 19) | class VarifocalLoss(nn.Module):
method __init__ (line 21) | def __init__(self):
method forward (line 24) | def forward(self, pred_score, gt_score, label, alpha=0.75, gamma=2.0):
class FocalLoss (line 32) | class FocalLoss(nn.Module):
method __init__ (line 34) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 42) | def forward(self, pred, true):
class BboxLoss (line 62) | class BboxLoss(nn.Module):
method __init__ (line 63) | def __init__(self, reg_max, use_dfl=False):
method forward (line 68) | def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes...
method _df_loss (line 94) | def _df_loss(self, pred_dist, target):
class ComputeLoss (line 106) | class ComputeLoss:
method __init__ (line 108) | def __init__(self, model, use_dfl=True):
method preprocess (line 147) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 162) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 170) | def __call__(self, p, targets, img=None, epoch=0):
class ComputeLossLH (line 254) | class ComputeLossLH:
method __init__ (line 256) | def __init__(self, model, use_dfl=True):
method preprocess (line 290) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 305) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 313) | def __call__(self, p, targets, img=None, epoch=0):
FILE: utils/loss_tal_triple.py
function smooth_BCE (line 14) | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues...
class VarifocalLoss (line 19) | class VarifocalLoss(nn.Module):
method __init__ (line 21) | def __init__(self):
method forward (line 24) | def forward(self, pred_score, gt_score, label, alpha=0.75, gamma=2.0):
class FocalLoss (line 32) | class FocalLoss(nn.Module):
method __init__ (line 34) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 42) | def forward(self, pred, true):
class BboxLoss (line 62) | class BboxLoss(nn.Module):
method __init__ (line 63) | def __init__(self, reg_max, use_dfl=False):
method forward (line 68) | def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes...
method _df_loss (line 94) | def _df_loss(self, pred_dist, target):
class ComputeLoss (line 106) | class ComputeLoss:
method __init__ (line 108) | def __init__(self, model, use_dfl=True):
method preprocess (line 152) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 167) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 175) | def __call__(self, p, targets, img=None, epoch=0):
FILE: utils/metrics.py
function fitness (line 12) | def fitness(x):
function smooth (line 18) | def smooth(y, f=0.05):
function ap_per_class (line 26) | def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir='....
function compute_ap (line 93) | def compute_ap(recall, precision):
class ConfusionMatrix (line 121) | class ConfusionMatrix:
method __init__ (line 123) | def __init__(self, nc, conf=0.25, iou_thres=0.45):
method process_batch (line 129) | def process_batch(self, detections, labels):
method matrix (line 175) | def matrix(self):
method tp_fp (line 178) | def tp_fp(self):
method plot (line 185) | def plot(self, normalize=True, save_dir='', names=()):
method print (line 215) | def print(self):
class WIoU_Scale (line 220) | class WIoU_Scale:
method __init__ (line 233) | def __init__(self, iou):
method _update (line 238) | def _update(cls, self):
method _scaled_loss (line 243) | def _scaled_loss(cls, self, gamma=1.9, delta=3):
function bbox_iou (line 254) | def bbox_iou(box1, box2, xywh=True, GIoU=False, DIoU=False, CIoU=False, ...
function box_iou (line 300) | def box_iou(box1, box2, eps=1e-7):
function bbox_ioa (line 321) | def bbox_ioa(box1, box2, eps=1e-7):
function wh_iou (line 343) | def wh_iou(wh1, wh2, eps=1e-7):
function plot_pr_curve (line 355) | def plot_pr_curve(px, py, ap, save_dir=Path('pr_curve.png'), names=()):
function plot_mc_curve (line 378) | def plot_mc_curve(px, py, save_dir=Path('mc_curve.png'), names=(), xlabe...
FILE: utils/panoptic/augmentations.py
function mixup (line 12) | def mixup(im, labels, segments, seg_cls, semantic_masks, im2, labels2, s...
function random_perspective (line 23) | def random_perspective(im,
function letterbox (line 126) | def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True...
function copy_paste (line 159) | def copy_paste(im, labels, segments, seg_cls, semantic_masks, p=0.5):
FILE: utils/panoptic/dataloaders.py
function create_dataloader (line 26) | def create_dataloader(path,
function img2stuff_paths (line 85) | def img2stuff_paths(img_paths):
class LoadImagesAndLabelsAndMasks (line 91) | class LoadImagesAndLabelsAndMasks(LoadImagesAndLabels): # for training/...
method __init__ (line 93) | def __init__(
method __getitem__ (line 172) | def __getitem__(self, index):
method load_mosaic (line 306) | def load_mosaic(self, index):
method cache_seg_labels (line 373) | def cache_seg_labels(self, path = Path('./labels_stuff.cache'), prefix...
method collate_fn (line 412) | def collate_fn(batch):
function polygon2mask (line 421) | def polygon2mask(img_size, polygons, color=1, downsample_ratio=1):
function polygons2masks (line 441) | def polygons2masks(img_size, polygons, color, downsample_ratio=1):
function polygons2masks_overlap (line 456) | def polygons2masks_overlap(img_size, segments, downsample_ratio=1):
FILE: utils/panoptic/general.py
function crop_mask (line 7) | def crop_mask(masks, boxes):
function process_mask_upsample (line 25) | def process_mask_upsample(protos, masks_in, bboxes, shape):
function process_mask (line 43) | def process_mask(protos, masks_in, bboxes, shape, upsample=False):
function scale_image (line 70) | def scale_image(im1_shape, masks, im0_shape, ratio_pad=None):
function mask_iou (line 98) | def mask_iou(mask1, mask2, eps=1e-7):
function masks_iou (line 111) | def masks_iou(mask1, mask2, eps=1e-7):
function masks2segments (line 124) | def masks2segments(masks, strategy='largest'):
FILE: utils/panoptic/loss.py
class ComputeLoss (line 12) | class ComputeLoss:
method __init__ (line 14) | def __init__(self, model, autobalance=False, overlap=False):
method __call__ (line 44) | def __call__(self, preds, targets, masks): # predictions, targets, model
method single_mask_loss (line 112) | def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
method build_targets (line 118) | def build_targets(self, p, targets):
FILE: utils/panoptic/loss_tal.py
function smooth_BCE (line 17) | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues...
class VarifocalLoss (line 22) | class VarifocalLoss(nn.Module):
method __init__ (line 24) | def __init__(self):
method forward (line 27) | def forward(self, pred_score, gt_score, label, alpha=0.75, gamma=2.0):
class FocalLoss (line 35) | class FocalLoss(nn.Module):
method __init__ (line 37) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 45) | def forward(self, pred, true):
class BboxLoss (line 65) | class BboxLoss(nn.Module):
method __init__ (line 66) | def __init__(self, reg_max, use_dfl=False):
method forward (line 71) | def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes...
method _df_loss (line 110) | def _df_loss(self, pred_dist, target):
class ComputeLoss (line 122) | class ComputeLoss:
method __init__ (line 124) | def __init__(self, model, use_dfl=True, overlap=True):
method preprocess (line 160) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 175) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 183) | def __call__(self, p, targets, masks, semasks, img=None, epoch=0):
method single_mask_loss (line 281) | def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
FILE: utils/panoptic/metrics.py
function fitness (line 7) | def fitness(x):
function ap_per_class_box_and_mask (line 13) | def ap_per_class_box_and_mask(
class Metric (line 62) | class Metric:
method __init__ (line 64) | def __init__(self) -> None:
method ap50 (line 72) | def ap50(self):
method ap (line 80) | def ap(self):
method mp (line 88) | def mp(self):
method mr (line 96) | def mr(self):
method map50 (line 104) | def map50(self):
method map (line 112) | def map(self):
method mean_results (line 119) | def mean_results(self):
method class_result (line 123) | def class_result(self, i):
method get_maps (line 127) | def get_maps(self, nc):
method update (line 133) | def update(self, results):
class Metrics (line 146) | class Metrics:
method __init__ (line 149) | def __init__(self) -> None:
method update (line 153) | def update(self, results):
method mean_results (line 161) | def mean_results(self):
method class_result (line 164) | def class_result(self, i):
method get_maps (line 167) | def get_maps(self, nc):
method ap_class_index (line 171) | def ap_class_index(self):
class Semantic_Metrics (line 176) | class Semantic_Metrics:
method __init__ (line 177) | def __init__(self, nc, device):
method update (line 185) | def update(self, pred_masks, target_masks):
method results (line 214) | def results(self):
method reset (line 227) | def reset(self):
FILE: utils/panoptic/plots.py
function plot_images_and_masks (line 18) | def plot_images_and_masks(images, targets, masks, semasks, paths=None, f...
function plot_results_with_masks (line 132) | def plot_results_with_masks(file="path/to/results.csv", dir="", best=True):
FILE: utils/panoptic/tal/anchor_generator.py
function make_anchors (line 8) | def make_anchors(feats, strides, grid_cell_offset=0.5):
function dist2bbox (line 23) | def dist2bbox(distance, anchor_points, xywh=True, dim=-1):
function bbox2dist (line 35) | def bbox2dist(anchor_points, bbox, reg_max):
FILE: utils/panoptic/tal/assigner.py
function select_candidates_in_gts (line 8) | def select_candidates_in_gts(xy_centers, gt_bboxes, eps=1e-9):
function select_highest_overlaps (line 25) | def select_highest_overlaps(mask_pos, overlaps, n_max_boxes):
class TaskAlignedAssigner (line 51) | class TaskAlignedAssigner(nn.Module):
method __init__ (line 52) | def __init__(self, topk=13, num_classes=80, alpha=1.0, beta=6.0, eps=1...
method forward (line 62) | def forward(self, pd_scores, pd_bboxes, anc_points, gt_labels, gt_bbox...
method get_pos_mask (line 107) | def get_pos_mask(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc...
method get_box_metrics (line 121) | def get_box_metrics(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes):
method select_topk_candidates (line 135) | def select_topk_candidates(self, metrics, largest=True, topk_mask=None):
method get_targets (line 158) | def get_targets(self, gt_labels, gt_bboxes, target_gt_idx, fg_mask):
FILE: utils/plots.py
class Colors (line 29) | class Colors:
method __init__ (line 31) | def __init__(self):
method __call__ (line 38) | def __call__(self, i, bgr=False):
method hex2rgb (line 43) | def hex2rgb(h): # rgb order (PIL)
function check_pil_font (line 50) | def check_pil_font(font=FONT, size=10):
class Annotator (line 66) | class Annotator:
method __init__ (line 68) | def __init__(self, im, line_width=None, font_size=None, font='Arial.tt...
method box_label (line 81) | def box_label(self, box, label='', color=(128, 128, 128), txt_color=(2...
method masks (line 112) | def masks(self, masks, colors, im_gpu=None, alpha=0.5):
method rectangle (line 158) | def rectangle(self, xy, fill=None, outline=None, width=1):
method text (line 162) | def text(self, xy, text, txt_color=(255, 255, 255), anchor='top'):
method fromarray (line 169) | def fromarray(self, im):
method result (line 174) | def result(self):
function feature_visualization (line 179) | def feature_visualization(x, module_type, stage, n=32, save_dir=Path('ru...
function hist2d (line 207) | def hist2d(x, y, n=100):
function butter_lowpass_filtfilt (line 216) | def butter_lowpass_filtfilt(data, cutoff=1500, fs=50000, order=5):
function output_to_target (line 229) | def output_to_target(output, max_det=300):
function plot_images (line 240) | def plot_images(images, targets, paths=None, fname='images.jpg', names=N...
function plot_lr_scheduler (line 304) | def plot_lr_scheduler(optimizer, scheduler, epochs=300, save_dir=''):
function plot_val_txt (line 321) | def plot_val_txt(): # from utils.plots import *; plot_val()
function plot_targets_txt (line 338) | def plot_targets_txt(): # from utils.plots import *; plot_targets_txt()
function plot_val_study (line 351) | def plot_val_study(file='', dir='', x=None): # from utils.plots import ...
function plot_labels (line 397) | def plot_labels(labels, names=(), save_dir=Path('')):
function imshow_cls (line 442) | def imshow_cls(im, labels=None, pred=None, names=None, nmax=25, verbose=...
function plot_evolve (line 471) | def plot_evolve(evolve_csv='path/to/evolve.csv'): # from utils.plots im...
function plot_results (line 498) | def plot_results(file='path/to/results.csv', dir=''):
function profile_idetection (line 524) | def profile_idetection(start=0, stop=0, labels=(), save_dir=''):
function save_one_box (line 555) | def save_one_box(xyxy, im, file=Path('im.jpg'), gain=1.02, pad=10, squar...
FILE: utils/segment/augmentations.py
function mixup (line 11) | def mixup(im, labels, segments, im2, labels2, segments2):
function random_perspective (line 20) | def random_perspective(im,
FILE: utils/segment/dataloaders.py
function create_dataloader (line 18) | def create_dataloader(path,
class LoadImagesAndLabelsAndMasks (line 78) | class LoadImagesAndLabelsAndMasks(LoadImagesAndLabels): # for training/...
method __init__ (line 80) | def __init__(
method __getitem__ (line 103) | def __getitem__(self, index):
method load_mosaic (line 204) | def load_mosaic(self, index):
method collate_fn (line 263) | def collate_fn(batch):
function polygon2mask (line 271) | def polygon2mask(img_size, polygons, color=1, downsample_ratio=1):
function polygons2masks (line 291) | def polygons2masks(img_size, polygons, color, downsample_ratio=1):
function polygons2masks_overlap (line 306) | def polygons2masks_overlap(img_size, segments, downsample_ratio=1):
FILE: utils/segment/general.py
function crop_mask (line 7) | def crop_mask(masks, boxes):
function process_mask_upsample (line 25) | def process_mask_upsample(protos, masks_in, bboxes, shape):
function process_mask (line 43) | def process_mask(protos, masks_in, bboxes, shape, upsample=False):
function scale_image (line 70) | def scale_image(im1_shape, masks, im0_shape, ratio_pad=None):
function mask_iou (line 98) | def mask_iou(mask1, mask2, eps=1e-7):
function masks_iou (line 111) | def masks_iou(mask1, mask2, eps=1e-7):
function masks2segments (line 124) | def masks2segments(masks, strategy='largest'):
FILE: utils/segment/loss.py
class ComputeLoss (line 12) | class ComputeLoss:
method __init__ (line 14) | def __init__(self, model, autobalance=False, overlap=False):
method __call__ (line 44) | def __call__(self, preds, targets, masks): # predictions, targets, model
method single_mask_loss (line 112) | def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
method build_targets (line 118) | def build_targets(self, p, targets):
FILE: utils/segment/loss_tal.py
function smooth_BCE (line 17) | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues...
class VarifocalLoss (line 22) | class VarifocalLoss(nn.Module):
method __init__ (line 24) | def __init__(self):
method forward (line 27) | def forward(self, pred_score, gt_score, label, alpha=0.75, gamma=2.0):
class FocalLoss (line 35) | class FocalLoss(nn.Module):
method __init__ (line 37) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 45) | def forward(self, pred, true):
class BboxLoss (line 65) | class BboxLoss(nn.Module):
method __init__ (line 66) | def __init__(self, reg_max, use_dfl=False):
method forward (line 71) | def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes...
method _df_loss (line 97) | def _df_loss(self, pred_dist, target):
class ComputeLoss (line 109) | class ComputeLoss:
method __init__ (line 111) | def __init__(self, model, use_dfl=True, overlap=True):
method preprocess (line 147) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 162) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 170) | def __call__(self, p, targets, masks, img=None, epoch=0):
method single_mask_loss (line 246) | def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
FILE: utils/segment/loss_tal_dual.py
function smooth_BCE (line 17) | def smooth_BCE(eps=0.1): # https://github.com/ultralytics/yolov3/issues...
class VarifocalLoss (line 22) | class VarifocalLoss(nn.Module):
method __init__ (line 24) | def __init__(self):
method forward (line 27) | def forward(self, pred_score, gt_score, label, alpha=0.75, gamma=2.0):
class FocalLoss (line 35) | class FocalLoss(nn.Module):
method __init__ (line 37) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
method forward (line 45) | def forward(self, pred, true):
class BboxLoss (line 65) | class BboxLoss(nn.Module):
method __init__ (line 66) | def __init__(self, reg_max, use_dfl=False):
method forward (line 71) | def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes...
method _df_loss (line 97) | def _df_loss(self, pred_dist, target):
class ComputeLoss (line 109) | class ComputeLoss:
method __init__ (line 111) | def __init__(self, model, use_dfl=True, overlap=True):
method preprocess (line 152) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 167) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 175) | def __call__(self, p, targets, masks, img=None, epoch=0):
method single_mask_loss (line 311) | def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
class ComputeLossLH (line 326) | class ComputeLossLH:
method __init__ (line 328) | def __init__(self, model, use_dfl=True, overlap=True):
method preprocess (line 364) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 379) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 387) | def __call__(self, p, targets, masks, img=None, epoch=0):
method single_mask_loss (line 513) | def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
class ComputeLossLH0 (line 528) | class ComputeLossLH0:
method __init__ (line 530) | def __init__(self, model, use_dfl=True, overlap=True):
method preprocess (line 566) | def preprocess(self, targets, batch_size, scale_tensor):
method bbox_decode (line 581) | def bbox_decode(self, anchor_points, pred_dist):
method __call__ (line 589) | def __call__(self, p, targets, masks, img=None, epoch=0):
method single_mask_loss (line 715) | def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
FILE: utils/segment/metrics.py
function fitness (line 6) | def fitness(x):
function ap_per_class_box_and_mask (line 12) | def ap_per_class_box_and_mask(
class Metric (line 61) | class Metric:
method __init__ (line 63) | def __init__(self) -> None:
method ap50 (line 71) | def ap50(self):
method ap (line 79) | def ap(self):
method mp (line 87) | def mp(self):
method mr (line 95) | def mr(self):
method map50 (line 103) | def map50(self):
method map (line 111) | def map(self):
method mean_results (line 118) | def mean_results(self):
method class_result (line 122) | def class_result(self, i):
method get_maps (line 126) | def get_maps(self, nc):
method update (line 132) | def update(self, results):
class Metrics (line 145) | class Metrics:
method __init__ (line 148) | def __init__(self) -> None:
method update (line 152) | def update(self, results):
method mean_results (line 160) | def mean_results(self):
method class_result (line 163) | def class_result(self, i):
method get_maps (line 166) | def get_maps(self, nc):
method ap_class_index (line 170) | def ap_class_index(self):
FILE: utils/segment/plots.py
function plot_images_and_masks (line 17) | def plot_images_and_masks(images, targets, masks, paths=None, fname='ima...
function plot_results_with_masks (line 111) | def plot_results_with_masks(file="path/to/results.csv", dir="", best=True):
FILE: utils/segment/tal/anchor_generator.py
function make_anchors (line 8) | def make_anchors(feats, strides, grid_cell_offset=0.5):
function dist2bbox (line 23) | def dist2bbox(distance, anchor_points, xywh=True, dim=-1):
function bbox2dist (line 35) | def bbox2dist(anchor_points, bbox, reg_max):
FILE: utils/segment/tal/assigner.py
function select_candidates_in_gts (line 8) | def select_candidates_in_gts(xy_centers, gt_bboxes, eps=1e-9):
function select_highest_overlaps (line 25) | def select_highest_overlaps(mask_pos, overlaps, n_max_boxes):
class TaskAlignedAssigner (line 51) | class TaskAlignedAssigner(nn.Module):
method __init__ (line 52) | def __init__(self, topk=13, num_classes=80, alpha=1.0, beta=6.0, eps=1...
method forward (line 62) | def forward(self, pd_scores, pd_bboxes, anc_points, gt_labels, gt_bbox...
method get_pos_mask (line 107) | def get_pos_mask(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc...
method get_box_metrics (line 121) | def get_box_metrics(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes):
method select_topk_candidates (line 134) | def select_topk_candidates(self, metrics, largest=True, topk_mask=None):
method get_targets (line 157) | def get_targets(self, gt_labels, gt_bboxes, target_gt_idx, fg_mask):
FILE: utils/tal/anchor_generator.py
function make_anchors (line 8) | def make_anchors(feats, strides, grid_cell_offset=0.5):
function dist2bbox (line 23) | def dist2bbox(distance, anchor_points, xywh=True, dim=-1):
function bbox2dist (line 35) | def bbox2dist(anchor_points, bbox, reg_max):
FILE: utils/tal/assigner.py
function select_candidates_in_gts (line 8) | def select_candidates_in_gts(xy_centers, gt_bboxes, eps=1e-9):
function select_highest_overlaps (line 25) | def select_highest_overlaps(mask_pos, overlaps, n_max_boxes):
class TaskAlignedAssigner (line 51) | class TaskAlignedAssigner(nn.Module):
method __init__ (line 52) | def __init__(self, topk=13, num_classes=80, alpha=1.0, beta=6.0, eps=1...
method forward (line 62) | def forward(self, pd_scores, pd_bboxes, anc_points, gt_labels, gt_bbox...
method get_pos_mask (line 106) | def get_pos_mask(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc...
method get_box_metrics (line 120) | def get_box_metrics(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes):
method select_topk_candidates (line 133) | def select_topk_candidates(self, metrics, largest=True, topk_mask=None):
method get_targets (line 156) | def get_targets(self, gt_labels, gt_bboxes, target_gt_idx, fg_mask):
FILE: utils/torch_utils.py
function smart_inference_mode (line 34) | def smart_inference_mode(torch_1_9=check_version(torch.__version__, '1.9...
function smartCrossEntropyLoss (line 42) | def smartCrossEntropyLoss(label_smoothing=0.0):
function smart_DDP (line 51) | def smart_DDP(model):
function reshape_classifier_output (line 62) | def reshape_classifier_output(model, n=1000):
function torch_distributed_zero_first (line 85) | def torch_distributed_zero_first(local_rank: int):
function device_count (line 94) | def device_count():
function select_device (line 104) | def select_device(device='', batch_size=0, newline=True):
function time_sync (line 140) | def time_sync():
function profile (line 147) | def profile(input, ops, n=10, device=None):
function is_parallel (line 198) | def is_parallel(model):
function de_parallel (line 203) | def de_parallel(model):
function initialize_weights (line 208) | def initialize_weights(model):
function find_modules (line 220) | def find_modules(model, mclass=nn.Conv2d):
function sparsity (line 225) | def sparsity(model):
function prune (line 234) | def prune(model, amount=0.3):
function fuse_conv_and_bn (line 244) | def fuse_conv_and_bn(conv, bn):
function model_info (line 268) | def model_info(model, verbose=False, imgsz=640):
function scale_img (line 293) | def scale_img(img, ratio=1.0, same_shape=False, gs=32): # img(16,3,256,...
function copy_attr (line 305) | def copy_attr(a, b, include=(), exclude=()):
function smart_optimizer (line 314) | def smart_optimizer(model, name='Adam', lr=0.001, momentum=0.9, decay=1e...
function smart_hub_load (line 446) | def smart_hub_load(repo='ultralytics/yolov5', model='yolov5s', **kwargs):
function smart_resume (line 458) | def smart_resume(ckpt, optimizer, ema=None, weights='yolov5s.pt', epochs...
class EarlyStopping (line 478) | class EarlyStopping:
method __init__ (line 480) | def __init__(self, patience=30):
method __call__ (line 486) | def __call__(self, epoch, fitness):
class ModelEMA (line 501) | class ModelEMA:
method __init__ (line 507) | def __init__(self, model, decay=0.9999, tau=2000, updates=0):
method update (line 515) | def update(self, model):
method update_attr (line 527) | def update_attr(self, model, include=(), exclude=('process_group', 're...
FILE: utils/triton.py
class TritonRemoteModel (line 7) | class TritonRemoteModel:
method __init__ (line 13) | def __init__(self, url: str):
method runtime (line 47) | def runtime(self):
method __call__ (line 51) | def __call__(self, *args, **kwargs) -> typing.Union[torch.Tensor, typi...
method _create_inputs (line 64) | def _create_inputs(self, *args, **kwargs):
FILE: val.py
function save_one_txt (line 28) | def save_one_txt(predn, save_conf, shape, file):
function save_one_json (line 38) | def save_one_json(predn, jdict, path, class_map):
function process_batch (line 51) | def process_batch(detections, labels, iouv):
function run (line 77) | def run(
function parse_opt (line 321) | def parse_opt():
function main (line 354) | def main(opt):
FILE: val_dual.py
function save_one_txt (line 28) | def save_one_txt(predn, save_conf, shape, file):
function save_one_json (line 38) | def save_one_json(predn, jdict, path, class_map):
function process_batch (line 51) | def process_batch(detections, labels, iouv):
function run (line 77) | def run(
function parse_opt (line 325) | def parse_opt():
function main (line 358) | def main(opt):
FILE: val_triple.py
function save_one_txt (line 28) | def save_one_txt(predn, save_conf, shape, file):
function save_one_json (line 38) | def save_one_json(predn, jdict, path, class_map):
function process_batch (line 51) | def process_batch(detections, labels, iouv):
function run (line 77) | def run(
function parse_opt (line 323) | def parse_opt():
function main (line 356) | def main(opt):
Condensed preview: 116 files, each showing path, character count, and a content snippet (1,337K chars of full structured content).
[
{
"path": "LICENSE.md",
"chars": 35149,
"preview": " GNU GENERAL PUBLIC LICENSE\n Version 3, 29 June 2007\n\n Copyright (C) 2007 Free "
},
{
"path": "README.md",
"chars": 16480,
"preview": "# YOLOv9\n\nImplementation of paper - [YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information](ht"
},
{
"path": "benchmarks.py",
"chars": 6263,
"preview": "import argparse\nimport platform\nimport sys\nimport time\nfrom pathlib import Path\n\nimport pandas as pd\n\nFILE = Path(__file"
},
{
"path": "classify/predict.py",
"chars": 11490,
"preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nRun YOLOv5 classification inference on images, videos, directories, globs"
},
{
"path": "classify/train.py",
"chars": 16322,
"preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nTrain a YOLOv5 classifier model on a classification dataset\n\nUsage - Sing"
},
{
"path": "classify/val.py",
"chars": 8047,
"preview": "# YOLOv5 🚀 by Ultralytics, GPL-3.0 license\n\"\"\"\nValidate a trained YOLOv5 classification model on a classification datase"
},
{
"path": "data/coco.yaml",
"chars": 3271,
"preview": "path: ../datasets/coco # dataset root dir\ntrain: train2017.txt # train images (relative to 'path') 118287 images\nval: "
},
{
"path": "data/hyps/hyp.scratch-high.yaml",
"chars": 1400,
"preview": "lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)\nlrf: 0.01 # final OneCycleLR learning rate (lr0 * lrf)\nmomentu"
},
{
"path": "detect.py",
"chars": 12240,
"preview": "import argparse\nimport os\nimport platform\nimport sys\nfrom pathlib import Path\n\nimport torch\n\nFILE = Path(__file__).resol"
},
{
"path": "detect_dual.py",
"chars": 12270,
"preview": "import argparse\nimport os\nimport platform\nimport sys\nfrom pathlib import Path\n\nimport torch\n\nFILE = Path(__file__).resol"
},
{
"path": "export.py",
"chars": 32325,
"preview": "import argparse\nimport contextlib\nimport json\nimport os\nimport platform\nimport re\nimport subprocess\nimport sys\nimport ti"
},
{
"path": "hubconf.py",
"chars": 4647,
"preview": "import torch\n\n\ndef _create(name, pretrained=True, channels=3, classes=80, autoshape=True, verbose=True, device=None):\n "
},
{
"path": "models/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "models/common.py",
"chars": 54495,
"preview": "import ast\nimport contextlib\nimport json\nimport math\nimport platform\nimport warnings\nimport zipfile\nfrom collections imp"
},
{
"path": "models/detect/gelan-c.yaml",
"chars": 1730,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/gelan-e.yaml",
"chars": 2868,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/gelan-m.yaml",
"chars": 1715,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/gelan-s.yaml",
"chars": 1697,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/gelan-t.yaml",
"chars": 1681,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/gelan.yaml",
"chars": 1753,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/yolov7-af.yaml",
"chars": 3869,
"preview": "# YOLOv7\n\n# Parameters\nnc: 80 # number of classes\ndepth_multiple: 1. # model depth multiple\nwidth_multiple: 1. # laye"
},
{
"path": "models/detect/yolov9-c.yaml",
"chars": 2745,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/yolov9-cf.yaml",
"chars": 2816,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/yolov9-e.yaml",
"chars": 3431,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/yolov9-m.yaml",
"chars": 2613,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/yolov9-s.yaml",
"chars": 2150,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/yolov9-t.yaml",
"chars": 2129,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/detect/yolov9.yaml",
"chars": 2667,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/experimental.py",
"chars": 11623,
"preview": "import math\n\nimport numpy as np\nimport torch\nimport torch.nn as nn\n\nfrom utils.downloads import attempt_download\n\n\nclass"
},
{
"path": "models/hub/anchors.yaml",
"chars": 3307,
"preview": "# YOLOv3 & YOLOv5\n# Default anchors for COCO data\n\n\n# P5 ---------------------------------------------------------------"
},
{
"path": "models/hub/yolov3-spp.yaml",
"chars": 1530,
"preview": "# YOLOv3\n\n# Parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/hub/yolov3-tiny.yaml",
"chars": 1195,
"preview": "# YOLOv3\n\n# Parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/hub/yolov3.yaml",
"chars": 1521,
"preview": "# YOLOv3\n\n# Parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/panoptic/gelan-c-pan.yaml",
"chars": 1747,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/panoptic/yolov7-af-pan.yaml",
"chars": 3893,
"preview": "# YOLOv7\n\n# Parameters\nnc: 80 # number of classes\nsem_nc: 93 # number of stuff classes\ndepth_multiple: 1.0 # model de"
},
{
"path": "models/segment/gelan-c-dseg.yaml",
"chars": 1884,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/segment/gelan-c-seg.yaml",
"chars": 1740,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/segment/yolov7-af-seg.yaml",
"chars": 3849,
"preview": "# YOLOv7\n\n# Parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/segment/yolov9-c-dseg.yaml",
"chars": 3004,
"preview": "# YOLOv9\n\n# parameters\nnc: 80 # number of classes\ndepth_multiple: 1.0 # model depth multiple\nwidth_multiple: 1.0 # la"
},
{
"path": "models/tf.py",
"chars": 26659,
"preview": "import argparse\nimport sys\nfrom copy import deepcopy\nfrom pathlib import Path\n\nFILE = Path(__file__).resolve()\nROOT = FI"
},
{
"path": "models/yolo.py",
"chars": 40858,
"preview": "import argparse\nimport os\nimport platform\nimport sys\nfrom copy import deepcopy\nfrom pathlib import Path\n\nFILE = Path(__f"
},
{
"path": "panoptic/predict.py",
"chars": 12944,
"preview": "import argparse\nimport os\nimport platform\nimport sys\nfrom pathlib import Path\n\nimport torch\n\nFILE = Path(__file__).resol"
},
{
"path": "panoptic/train.py",
"chars": 34899,
"preview": "import argparse\nimport math\nimport os\nimport random\nimport sys\nimport time\nfrom copy import deepcopy\nfrom datetime impor"
},
{
"path": "panoptic/val.py",
"chars": 29773,
"preview": "import argparse\nimport json\nimport os\nimport sys\nfrom multiprocessing.pool import ThreadPool\nfrom pathlib import Path\n\ni"
},
{
"path": "requirements.txt",
"chars": 1073,
"preview": "# requirements\n# Usage: pip install -r requirements.txt\n\n# Base --------------------------------------------------------"
},
{
"path": "scripts/get_coco.sh",
"chars": 820,
"preview": "#!/bin/bash\n# COCO 2017 dataset http://cocodataset.org\n# Download command: bash ./scripts/get_coco.sh\n\n# Download/unzip "
},
{
"path": "segment/predict.py",
"chars": 12941,
"preview": "import argparse\nimport os\nimport platform\nimport sys\nfrom pathlib import Path\n\nimport torch\n\nFILE = Path(__file__).resol"
},
{
"path": "segment/train.py",
"chars": 33886,
"preview": "import argparse\nimport math\nimport os\nimport random\nimport sys\nimport time\nfrom copy import deepcopy\nfrom datetime impor"
},
{
"path": "segment/train_dual.py",
"chars": 33966,
"preview": "import argparse\nimport math\nimport os\nimport random\nimport sys\nimport time\nfrom copy import deepcopy\nfrom datetime impor"
},
{
"path": "segment/val.py",
"chars": 22931,
"preview": "import argparse\nimport json\nimport os\nimport sys\nfrom multiprocessing.pool import ThreadPool\nfrom pathlib import Path\n\ni"
},
{
"path": "segment/val_dual.py",
"chars": 22963,
"preview": "import argparse\nimport json\nimport os\nimport sys\nfrom multiprocessing.pool import ThreadPool\nfrom pathlib import Path\n\ni"
},
{
"path": "tools/reparameterization.ipynb",
"chars": 18383,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"code\",\n \"execution_count\": null,\n \"id\": \"4beac401\",\n \"metadata\": {},\n \"output"
},
{
"path": "train.py",
"chars": 33792,
"preview": "import argparse\nimport math\nimport os\nimport random\nimport sys\nimport time\nfrom copy import deepcopy\nfrom datetime impor"
},
{
"path": "train_dual.py",
"chars": 34308,
"preview": "import argparse\nimport math\nimport os\nimport random\nimport sys\nimport time\nfrom copy import deepcopy\nfrom datetime impor"
},
{
"path": "train_triple.py",
"chars": 33825,
"preview": "import argparse\nimport math\nimport os\nimport random\nimport sys\nimport time\nfrom copy import deepcopy\nfrom datetime impor"
},
{
"path": "utils/__init__.py",
"chars": 2201,
"preview": "import contextlib\nimport platform\nimport threading\n\n\ndef emojis(str=''):\n # Return platform-dependent emoji-safe vers"
},
{
"path": "utils/activations.py",
"chars": 3373,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass SiLU(nn.Module):\n # SiLU activation https:"
},
{
"path": "utils/augmentations.py",
"chars": 17059,
"preview": "import math\nimport random\n\nimport cv2\nimport numpy as np\nimport torch\nimport torchvision.transforms as T\nimport torchvis"
},
{
"path": "utils/autoanchor.py",
"chars": 7328,
"preview": "import random\n\nimport numpy as np\nimport torch\nimport yaml\nfrom tqdm import tqdm\n\nfrom utils import TryExcept\nfrom utils"
},
{
"path": "utils/autobatch.py",
"chars": 2907,
"preview": "from copy import deepcopy\n\nimport numpy as np\nimport torch\n\nfrom utils.general import LOGGER, colorstr\nfrom utils.torch_"
},
{
"path": "utils/callbacks.py",
"chars": 2591,
"preview": "import threading\n\n\nclass Callbacks:\n \"\"\"\"\n Handles all registered callbacks for YOLOv5 Hooks\n \"\"\"\n\n def __in"
},
{
"path": "utils/coco_utils.py",
"chars": 3256,
"preview": "import cv2\n\nfrom pycocotools.coco import COCO\nfrom pycocotools import mask as maskUtils\n\n# coco id: https://tech.amikeli"
},
{
"path": "utils/dataloaders.py",
"chars": 55574,
"preview": "import contextlib\nimport glob\nimport hashlib\nimport json\nimport math\nimport os\nimport random\nimport shutil\nimport time\nf"
},
{
"path": "utils/downloads.py",
"chars": 4581,
"preview": "import logging\nimport os\nimport subprocess\nimport urllib\nfrom pathlib import Path\n\nimport requests\nimport torch\n\n\ndef is"
},
{
"path": "utils/general.py",
"chars": 46900,
"preview": "import contextlib\nimport glob\nimport inspect\nimport logging\nimport logging.config\nimport math\nimport os\nimport platform\n"
},
{
"path": "utils/lion.py",
"chars": 2518,
"preview": "\"\"\"PyTorch implementation of the Lion optimizer.\"\"\"\nimport torch\nfrom torch.optim.optimizer import Optimizer\n\n\nclass Lio"
},
{
"path": "utils/loggers/__init__.py",
"chars": 17024,
"preview": "import os\nimport warnings\nfrom pathlib import Path\n\nimport pkg_resources as pkg\nimport torch\nfrom torch.utils.tensorboar"
},
{
"path": "utils/loggers/clearml/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "utils/loggers/clearml/clearml_utils.py",
"chars": 7553,
"preview": "\"\"\"Main Logger class for ClearML experiment tracking.\"\"\"\nimport glob\nimport re\nfrom pathlib import Path\n\nimport numpy as"
},
{
"path": "utils/loggers/clearml/hpo.py",
"chars": 5271,
"preview": "from clearml import Task\n# Connecting ClearML with the current process,\n# from here on everything is logged automaticall"
},
{
"path": "utils/loggers/comet/__init__.py",
"chars": 18731,
"preview": "import glob\nimport json\nimport logging\nimport os\nimport sys\nfrom pathlib import Path\n\nlogger = logging.getLogger(__name_"
},
{
"path": "utils/loggers/comet/comet_utils.py",
"chars": 4751,
"preview": "import logging\nimport os\nfrom urllib.parse import urlparse\n\ntry:\n import comet_ml\nexcept (ModuleNotFoundError, Import"
},
{
"path": "utils/loggers/comet/hpo.py",
"chars": 6653,
"preview": "import argparse\nimport json\nimport logging\nimport os\nimport sys\nfrom pathlib import Path\n\nimport comet_ml\n\nlogger = logg"
},
{
"path": "utils/loggers/comet/optimizer_config.json",
"chars": 3020,
"preview": "{\n \"algorithm\": \"random\",\n \"parameters\": {\n \"anchor_t\": {\n \"type\": \"discrete\",\n \"values\": [\n 2,\n "
},
{
"path": "utils/loggers/wandb/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "utils/loggers/wandb/log_dataset.py",
"chars": 1032,
"preview": "import argparse\n\nfrom wandb_utils import WandbLogger\n\nfrom utils.general import LOGGER\n\nWANDB_ARTIFACT_PREFIX = 'wandb-a"
},
{
"path": "utils/loggers/wandb/sweep.py",
"chars": 1213,
"preview": "import sys\nfrom pathlib import Path\n\nimport wandb\n\nFILE = Path(__file__).resolve()\nROOT = FILE.parents[3] # YOLOv5 root"
},
{
"path": "utils/loggers/wandb/sweep.yaml",
"chars": 2463,
"preview": "# Hyperparameters for training\n# To set range-\n# Provide min and max values as:\n# parameter:\n#\n# min: scala"
},
{
"path": "utils/loggers/wandb/wandb_utils.py",
"chars": 28239,
"preview": "\"\"\"Utilities and tools for tracking runs with Weights & Biases.\"\"\"\n\nimport logging\nimport os\nimport sys\nfrom contextlib "
},
{
"path": "utils/loss.py",
"chars": 16076,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom utils.metrics import bbox_iou\nfrom utils.torch_"
},
{
"path": "utils/loss_tal.py",
"chars": 9688,
"preview": "import os\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom utils.general import xywh2xyxy\nfrom "
},
{
"path": "utils/loss_tal_dual.py",
"chars": 18037,
"preview": "import os\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom utils.general import xywh2xyxy\nfrom "
},
{
"path": "utils/loss_tal_triple.py",
"chars": 13592,
"preview": "import os\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom utils.general import xywh2xyxy\nfrom "
},
{
"path": "utils/metrics.py",
"chars": 15909,
"preview": "import math\nimport warnings\nfrom pathlib import Path\n\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport torch\n\nf"
},
{
"path": "utils/panoptic/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "utils/panoptic/augmentations.py",
"chars": 7672,
"preview": "import math\nimport random\n\nimport cv2\nimport numpy as np\n\nfrom ..augmentations import box_candidates\nfrom ..general impo"
},
{
"path": "utils/panoptic/dataloaders.py",
"chars": 21532,
"preview": "import os\nimport random\n\nimport pickle\nfrom pathlib import Path\n\nfrom itertools import repeat\nfrom multiprocessing.pool "
},
{
"path": "utils/panoptic/general.py",
"chars": 4934,
"preview": "import cv2\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\n\n\ndef crop_mask(masks, boxes):\n \"\"\"\n \"C"
},
{
"path": "utils/panoptic/loss.py",
"chars": 8620,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom ..general import xywh2xyxy\nfrom ..loss import F"
},
{
"path": "utils/panoptic/loss_tal.py",
"chars": 13132,
"preview": "import os\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom torchvision.ops import sigmoid_focal"
},
{
"path": "utils/panoptic/metrics.py",
"chars": 8319,
"preview": "import numpy as np\nimport torch\n\nfrom ..metrics import ap_per_class\n\n\ndef fitness(x):\n # Model fitness as a weighted "
},
{
"path": "utils/panoptic/plots.py",
"chars": 7167,
"preview": "import contextlib\nimport math\nfrom pathlib import Path\n\nimport cv2\nimport matplotlib.pyplot as plt\nimport numpy as np\nim"
},
{
"path": "utils/panoptic/tal/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "utils/panoptic/tal/anchor_generator.py",
"chars": 1557,
"preview": "import torch\n\nfrom utils.general import check_version\n\nTORCH_1_10 = check_version(torch.__version__, '1.10.0')\n\n\ndef mak"
},
{
"path": "utils/panoptic/tal/assigner.py",
"chars": 8450,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom utils.metrics import bbox_iou\n\n\ndef select_cand"
},
{
"path": "utils/plots.py",
"chars": 25206,
"preview": "import contextlib\nimport math\nimport os\nfrom copy import copy\nfrom pathlib import Path\nfrom urllib.error import URLError"
},
{
"path": "utils/segment/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "utils/segment/augmentations.py",
"chars": 3672,
"preview": "import math\nimport random\n\nimport cv2\nimport numpy as np\n\nfrom ..augmentations import box_candidates\nfrom ..general impo"
},
{
"path": "utils/segment/dataloaders.py",
"chars": 13851,
"preview": "import os\nimport random\n\nimport cv2\nimport numpy as np\nimport torch\nfrom torch.utils.data import DataLoader, distributed"
},
{
"path": "utils/segment/general.py",
"chars": 4934,
"preview": "import cv2\nimport numpy as np\nimport torch\nimport torch.nn.functional as F\n\n\ndef crop_mask(masks, boxes):\n \"\"\"\n \"C"
},
{
"path": "utils/segment/loss.py",
"chars": 8620,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom ..general import xywh2xyxy\nfrom ..loss import F"
},
{
"path": "utils/segment/loss_tal.py",
"chars": 12024,
"preview": "import os\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom torchvision.ops import sigmoid_focal"
},
{
"path": "utils/segment/loss_tal_dual.py",
"chars": 34900,
"preview": "import os\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom torchvision.ops import sigmoid_focal"
},
{
"path": "utils/segment/metrics.py",
"chars": 5377,
"preview": "import numpy as np\n\nfrom ..metrics import ap_per_class\n\n\ndef fitness(x):\n # Model fitness as a weighted combination o"
},
{
"path": "utils/segment/plots.py",
"chars": 6390,
"preview": "import contextlib\nimport math\nfrom pathlib import Path\n\nimport cv2\nimport matplotlib.pyplot as plt\nimport numpy as np\nim"
},
{
"path": "utils/segment/tal/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "utils/segment/tal/anchor_generator.py",
"chars": 1557,
"preview": "import torch\n\nfrom utils.general import check_version\n\nTORCH_1_10 = check_version(torch.__version__, '1.10.0')\n\n\ndef mak"
},
{
"path": "utils/segment/tal/assigner.py",
"chars": 8316,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom utils.metrics import bbox_iou\n\n\ndef select_cand"
},
{
"path": "utils/tal/__init__.py",
"chars": 6,
"preview": "# init"
},
{
"path": "utils/tal/anchor_generator.py",
"chars": 1557,
"preview": "import torch\n\nfrom utils.general import check_version\n\nTORCH_1_10 = check_version(torch.__version__, '1.10.0')\n\n\ndef mak"
},
{
"path": "utils/tal/assigner.py",
"chars": 8231,
"preview": "import torch\nimport torch.nn as nn\nimport torch.nn.functional as F\n\nfrom utils.metrics import bbox_iou\n\n\ndef select_cand"
},
{
"path": "utils/torch_utils.py",
"chars": 23367,
"preview": "import math\nimport os\nimport platform\nimport subprocess\nimport time\nimport warnings\nfrom contextlib import contextmanage"
},
{
"path": "utils/triton.py",
"chars": 3528,
"preview": "import typing\nfrom urllib.parse import urlparse\n\nimport torch\n\n\nclass TritonRemoteModel:\n \"\"\" A wrapper over a model "
},
{
"path": "val.py",
"chars": 19469,
"preview": "import argparse\nimport json\nimport os\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\nimport torch\nfrom tqdm imp"
},
{
"path": "val_dual.py",
"chars": 19583,
"preview": "import argparse\nimport json\nimport os\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\nimport torch\nfrom tqdm imp"
},
{
"path": "val_triple.py",
"chars": 19537,
"preview": "import argparse\nimport json\nimport os\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\nimport torch\nfrom tqdm imp"
}
]
About this extraction
This page contains the full source code of the WongKinYiu/yolov9 GitHub repository, extracted and formatted as plain text. The extraction includes 116 files (1.2 MB), approximately 364.7k tokens, and a symbol index with 1072 extracted functions, classes, methods, constants, and types.
Extracted by GitExtract. Built by Nikandr Surkov.