Repository: RizwanMunawar/yolov8-object-tracking
Branch: main
Commit: 3606d19c21ef
Files: 77
Total size: 639.3 KB

Directory structure:
yolov8-object-tracking/

├── LICENSE
├── README.md
├── __init__.py
├── models/
│   └── v8/
│       ├── yolov8l.yaml
│       ├── yolov8m.yaml
│       ├── yolov8n.yaml
│       ├── yolov8s.yaml
│       ├── yolov8x.yaml
│       └── yolov8x6.yaml
├── nn/
│   ├── __init__.py
│   ├── autobackend.py
│   ├── modules.py
│   └── tasks.py
├── requirements.txt
└── yolo/
    ├── cli.py
    ├── configs/
    │   ├── __init__.py
    │   ├── default.yaml
    │   └── hydra_patch.py
    ├── data/
    │   ├── __init__.py
    │   ├── augment.py
    │   ├── base.py
    │   ├── build.py
    │   ├── dataloaders/
    │   │   ├── __init__.py
    │   │   ├── stream_loaders.py
    │   │   ├── v5augmentations.py
    │   │   └── v5loader.py
    │   ├── dataset.py
    │   ├── dataset_wrappers.py
    │   ├── datasets/
    │   │   ├── Argoverse.yaml
    │   │   ├── GlobalWheat2020.yaml
    │   │   ├── ImageNet.yaml
    │   │   ├── Objects365.yaml
    │   │   ├── SKU-110K.yaml
    │   │   ├── VOC.yaml
    │   │   ├── VisDrone.yaml
    │   │   ├── coco.yaml
    │   │   ├── coco128-seg.yaml
    │   │   ├── coco128.yaml
    │   │   └── xView.yaml
    │   ├── scripts/
    │   │   ├── download_weights.sh
    │   │   ├── get_coco.sh
    │   │   ├── get_coco128.sh
    │   │   └── get_imagenet.sh
    │   └── utils.py
    ├── engine/
    │   ├── __init__.py
    │   ├── exporter.py
    │   ├── model.py
    │   ├── predictor.py
    │   ├── sort.py
    │   ├── trainer.py
    │   └── validator.py
    ├── utils/
    │   ├── __init__.py
    │   ├── autobatch.py
    │   ├── callbacks/
    │   │   ├── __init__.py
    │   │   ├── base.py
    │   │   ├── clearml.py
    │   │   ├── comet.py
    │   │   ├── hub.py
    │   │   └── tensorboard.py
    │   ├── checks.py
    │   ├── dist.py
    │   ├── downloads.py
    │   ├── files.py
    │   ├── instance.py
    │   ├── loss.py
    │   ├── metrics.py
    │   ├── ops.py
    │   ├── plotting.py
    │   ├── tal.py
    │   └── torch_utils.py
    └── v8/
        ├── __init__.py
        └── detect/
            ├── __init__.py
            ├── detect_and_trk.py
            ├── predict.py
            ├── sort.py
            ├── train.py
            └── val.py

================================================
FILE CONTENTS
================================================

================================================
FILE: LICENSE
================================================
                    GNU AFFERO GENERAL PUBLIC LICENSE
                       Version 3, 19 November 2007

 Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.

  The licenses for most software and other practical works are designed
to take away your freedom to share and change the works.  By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.

  When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

  Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.

  A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate.  Many developers of free software are heartened and
encouraged by the resulting cooperation.  However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.

  The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community.  It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server.  Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.

  An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals.  This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.

  The precise terms and conditions for copying, distribution and
modification follow.

                       TERMS AND CONDITIONS

  0. Definitions.

  "This License" refers to version 3 of the GNU Affero General Public License.

  "Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

  "The Program" refers to any copyrightable work licensed under this
License.  Each licensee is addressed as "you".  "Licensees" and
"recipients" may be individuals or organizations.

  To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy.  The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

  A "covered work" means either the unmodified Program or a work based
on the Program.

  To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy.  Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

  To "convey" a work means any kind of propagation that enables other
parties to make or receive copies.  Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

  An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License.  If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

  1. Source Code.

  The "source code" for a work means the preferred form of the work
for making modifications to it.  "Object code" means any non-source
form of a work.

  A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

  The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form.  A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

  The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities.  However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work.  For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

  The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

  The Corresponding Source for a work in source code form is that
same work.

  2. Basic Permissions.

  All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met.  This License explicitly affirms your unlimited
permission to run the unmodified Program.  The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work.  This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

  You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force.  You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright.  Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

  Conveying under any other circumstances is permitted solely under
the conditions stated below.  Sublicensing is not allowed; section 10
makes it unnecessary.

  3. Protecting Users' Legal Rights From Anti-Circumvention Law.

  No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

  When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

  4. Conveying Verbatim Copies.

  You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

  You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

  5. Conveying Modified Source Versions.

  You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

    a) The work must carry prominent notices stating that you modified
    it, and giving a relevant date.

    b) The work must carry prominent notices stating that it is
    released under this License and any conditions added under section
    7.  This requirement modifies the requirement in section 4 to
    "keep intact all notices".

    c) You must license the entire work, as a whole, under this
    License to anyone who comes into possession of a copy.  This
    License will therefore apply, along with any applicable section 7
    additional terms, to the whole of the work, and all its parts,
    regardless of how they are packaged.  This License gives no
    permission to license the work in any other way, but it does not
    invalidate such permission if you have separately received it.

    d) If the work has interactive user interfaces, each must display
    Appropriate Legal Notices; however, if the Program has interactive
    interfaces that do not display Appropriate Legal Notices, your
    work need not make them do so.

  A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit.  Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

  6. Conveying Non-Source Forms.

  You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

    a) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by the
    Corresponding Source fixed on a durable physical medium
    customarily used for software interchange.

    b) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by a
    written offer, valid for at least three years and valid for as
    long as you offer spare parts or customer support for that product
    model, to give anyone who possesses the object code either (1) a
    copy of the Corresponding Source for all the software in the
    product that is covered by this License, on a durable physical
    medium customarily used for software interchange, for a price no
    more than your reasonable cost of physically performing this
    conveying of source, or (2) access to copy the
    Corresponding Source from a network server at no charge.

    c) Convey individual copies of the object code with a copy of the
    written offer to provide the Corresponding Source.  This
    alternative is allowed only occasionally and noncommercially, and
    only if you received the object code with such an offer, in accord
    with subsection 6b.

    d) Convey the object code by offering access from a designated
    place (gratis or for a charge), and offer equivalent access to the
    Corresponding Source in the same way through the same place at no
    further charge.  You need not require recipients to copy the
    Corresponding Source along with the object code.  If the place to
    copy the object code is a network server, the Corresponding Source
    may be on a different server (operated by you or a third party)
    that supports equivalent copying facilities, provided you maintain
    clear directions next to the object code saying where to find the
    Corresponding Source.  Regardless of what server hosts the
    Corresponding Source, you remain obligated to ensure that it is
    available for as long as needed to satisfy these requirements.

    e) Convey the object code using peer-to-peer transmission, provided
    you inform other peers where the object code and Corresponding
    Source of the work are being offered to the general public at no
    charge under subsection 6d.

  A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

  A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling.  In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage.  For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product.  A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

  "Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source.  The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

  If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information.  But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

  The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed.  Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

  Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.

  7. Additional Terms.

  "Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law.  If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.

  When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it.  (Additional permissions may be written to require their own
removal in certain cases when you modify the work.)  You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.

  Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:

    a) Disclaiming warranty or limiting liability differently from the
    terms of sections 15 and 16 of this License; or

    b) Requiring preservation of specified reasonable legal notices or
    author attributions in that material or in the Appropriate Legal
    Notices displayed by works containing it; or

    c) Prohibiting misrepresentation of the origin of that material, or
    requiring that modified versions of such material be marked in
    reasonable ways as different from the original version; or

    d) Limiting the use for publicity purposes of names of licensors or
    authors of the material; or

    e) Declining to grant rights under trademark law for use of some
    trade names, trademarks, or service marks; or

    f) Requiring indemnification of licensors and authors of that
    material by anyone who conveys the material (or modified versions of
    it) with contractual assumptions of liability to the recipient, for
    any liability that these contractual assumptions directly impose on
    those licensors and authors.

  All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10.  If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term.  If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.

  If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.

  Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.

  8. Termination.

  You may not propagate or modify a covered work except as expressly
provided under this License.  Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).

  However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.

  Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

  Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License.  If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.

  9. Acceptance Not Required for Having Copies.

  You are not required to accept this License in order to receive or
run a copy of the Program.  Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance.  However,
nothing other than this License grants you permission to propagate or
modify any covered work.  These actions infringe copyright if you do
not accept this License.  Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.

  10. Automatic Licensing of Downstream Recipients.

  Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License.  You are not responsible
for enforcing compliance by third parties with this License.

  An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations.  If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.

  You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License.  For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.

  11. Patents.

  A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based.  The
work thus licensed is called the contributor's "contributor version".

  A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version.  For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.

  Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.

  In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement).  To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.

  If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients.  "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.

  If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.

  A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License.  You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.

  Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.

  12. No Surrender of Others' Freedom.

  If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all.  For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.

  13. Remote Network Interaction; Use with the GNU General Public License.

  Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software.  This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.

  Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work.  The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.

  14. Revised Versions of this License.

  The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time.  Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

  Each version is given a distinguishing version number.  If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation.  If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.

  If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.

  Later license versions may give you additional or different
permissions.  However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.

  15. Disclaimer of Warranty.

  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  16. Limitation of Liability.

  IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.

  17. Interpretation of Sections 15 and 16.

  If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.

                     END OF TERMS AND CONDITIONS

            How to Apply These Terms to Your New Programs

  If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

  To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Affero General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Affero General Public License for more details.

    You should have received a copy of the GNU Affero General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

  If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source.  For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code.  There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.

  You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<https://www.gnu.org/licenses/>.


================================================
FILE: README.md
================================================
# yolov8-object-tracking 

This is compatible only with `ultralytics==8.0.0`. However, I highly recommend using the latest version of the Ultralytics package and referring to the official Ultralytics codebase here: [GitHub Repository](https://github.com/ultralytics/ultralytics/).

[![Static Badge](https://img.shields.io/badge/yolov8-blog-blue)](https://muhammadrizwanmunawar.medium.com/train-yolov8-on-custom-data-6d28cd348262)

### Steps to run Code

- Clone the repository
```bash
git clone https://github.com/RizwanMunawar/yolov8-object-tracking.git
```

- Move to the cloned folder

```bash
cd yolov8-object-tracking
```

- Install the ultralytics package
```bash
pip install ultralytics==8.0.0
```

- Run tracking with one of the commands below
```bash
# Video file
python yolo\v8\detect\detect_and_trk.py model=yolov8s.pt source="test.mp4" show=True

# Image file
python yolo\v8\detect\detect_and_trk.py model=yolov8m.pt source="path to image"

# Webcam
python yolo\v8\detect\detect_and_trk.py model=yolov8m.pt source=0 show=True

# External camera
python yolo\v8\detect\detect_and_trk.py model=yolov8m.pt source=1 show=True
```
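The commands above pass Hydra-style `key=value` overrides on the command line. As a rough illustration of how such tokens map onto a config dict, here is a minimal sketch (this is not the repo's actual parser, which uses Hydra; `parse_overrides` is a hypothetical helper for illustration only):

```python
def parse_overrides(args):
    """Parse 'key=value' CLI tokens into a config dict (illustrative sketch)."""
    cfg = {}
    for token in args:
        key, _, value = token.partition("=")
        # Interpret booleans and integers; leave everything else as a string
        if value in ("True", "False"):
            cfg[key] = value == "True"
        elif value.isdigit():
            cfg[key] = int(value)
        else:
            cfg[key] = value.strip('"')
    return cfg

print(parse_overrides(["model=yolov8s.pt", "source=0", "show=True"]))
# → {'model': 'yolov8s.pt', 'source': 0, 'show': True}
```

This mirrors why `source=0` selects the webcam (it becomes the integer device index 0) while `source="test.mp4"` stays a file path.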

- The output file will be saved in `runs/detect/train` with the original filename


### Results 📊
<table>
  <tr>
    <td>YOLOv8s Object Tracking</td>
    <td>YOLOv8m Object Tracking</td>
  </tr>
  <tr>
    <td><img src="https://user-images.githubusercontent.com/62513924/211671576-7d39829a-f8f5-4e25-b30a-530548c11a24.png"></td>
    <td><img src="https://user-images.githubusercontent.com/62513924/211672010-7415ef8b-7941-4545-8434-377d94675299.png"></td>
  </tr>
 </table>

### Star History

[![Star History Chart](https://api.star-history.com/svg?repos=RizwanMunawar/yolov8-object-tracking&type=date&legend=top-left)](https://www.star-history.com/#RizwanMunawar/yolov8-object-tracking&type=date&legend=top-left)


### References 🔗
- 🔗 https://github.com/ultralytics/ultralytics
- 🔗 https://github.com/abewley/sort
- 🔗 https://docs.ultralytics.com/

**Some of my articles/research papers | Awesome computer vision resources for learning 🚀**

| Article Title & Link | Published Date |
|-----------------------|----------------|
| [Ultralytics YOLO11: Object Detection and Instance Segmentation🤯](https://muhammadrizwanmunawar.medium.com/ultralytics-yolo11-object-detection-and-instance-segmentation-88ef0239a811) | ![Published Date](https://img.shields.io/badge/published_Date-2024--10--27-brightgreen) |
| [Parking Management using Ultralytics YOLO11](https://muhammadrizwanmunawar.medium.com/parking-management-using-ultralytics-yolo11-fba4c6bc62bc) | ![Published Date](https://img.shields.io/badge/published_Date-2024--11--10-brightgreen) |
| [My 🖐️Computer Vision Hobby Projects that Yielded Earnings](https://muhammadrizwanmunawar.medium.com/my-️computer-vision-hobby-projects-that-yielded-earnings-7923c9b9eead) | ![Published Date](https://img.shields.io/badge/published_Date-2023--09--10-brightgreen) |
| [Best Resources to Learn Computer Vision](https://muhammadrizwanmunawar.medium.com/best-resources-to-learn-computer-vision-311352ed0833) | ![Published Date](https://img.shields.io/badge/published_Date-2023--06--30-brightgreen) |
| [Roadmap for Computer Vision Engineer](https://medium.com/augmented-startups/roadmap-for-computer-vision-engineer-45167b94518c) | ![Published Date](https://img.shields.io/badge/published_Date-2022--08--07-brightgreen) |
| [How did I spend 2022 in the Computer Vision Field](https://www.linkedin.com/pulse/how-did-i-spend-2022-computer-vision-field-muhammad-rizwan-munawar) | ![Published Date](https://img.shields.io/badge/published_Date-2022--12--20-brightgreen) |
| [Domain Feature Mapping with YOLOv7 for Automated Edge-Based Pallet Racking Inspections](https://www.mdpi.com/1424-8220/22/18/6927) | ![Published Date](https://img.shields.io/badge/published_Date-2022--09--13-brightgreen) |
| [Exudate Regeneration for Automated Exudate Detection in Retinal Fundus Images](https://ieeexplore.ieee.org/document/9885192) | ![Published Date](https://img.shields.io/badge/published_Date-2022--09--12-brightgreen) |
| [Feature Mapping for Rice Leaf Defect Detection Based on a Custom Convolutional Architecture](https://www.mdpi.com/2304-8158/11/23/3914) | ![Published Date](https://img.shields.io/badge/published_Date-2022--12--04-brightgreen) |
| [Yolov5, Yolo-x, Yolo-r, Yolov7 Performance Comparison: A Survey](https://aircconline.com/csit/papers/vol12/csit121602.pdf) | ![Published Date](https://img.shields.io/badge/published_Date-2022--09--24-brightgreen) |
| [Explainable AI in Drug Sensitivity Prediction on Cancer Cell Lines](https://ieeexplore.ieee.org/document/9922931) | ![Published Date](https://img.shields.io/badge/published_Date-2022--09--23-brightgreen) |
| [Train YOLOv8 on Custom Data](https://medium.com/augmented-startups/train-yolov8-on-custom-data-6d28cd348262) | ![Published Date](https://img.shields.io/badge/published_Date-2022--09--23-brightgreen) |


**More Information**

For more details, you can reach out to me on [Medium](https://muhammadrizwanmunawar.medium.com/) or connect with me on [LinkedIn](https://www.linkedin.com/in/muhammadrizwanmunawar/)


================================================
FILE: __init__.py
================================================
from hub import checks
from engine.model import YOLO
from utils import ops
from . import v8

================================================
FILE: models/v8/yolov8l.yaml
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.00  # scales module repeats
width_multiple: 1.00  # scales convolution channels

# YOLOv8.0l backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [512, True]]
  - [-1, 1, SPPF, [512, 5]]  # 9

# YOLOv8.0l head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [512]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)


================================================
FILE: models/v8/yolov8m.yaml
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.67  # scales module repeats
width_multiple: 0.75  # scales convolution channels

# YOLOv8.0m backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [768, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [768, True]]
  - [-1, 1, SPPF, [768, 5]]  # 9

# YOLOv8.0m head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [768]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)


================================================
FILE: models/v8/yolov8n.yaml
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # scales module repeats
width_multiple: 0.25  # scales convolution channels

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)

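The `depth_multiple` and `width_multiple` parameters in these YAML files scale the base layer definitions per model size: repeats are multiplied by `depth_multiple` (with a floor of 1), and channel counts are multiplied by `width_multiple` and rounded to a hardware-friendly multiple of 8. A rough sketch of that scaling, mirroring Ultralytics' `make_divisible` convention (an approximation for illustration, not the repo's exact `parse_model` code):

```python
import math

def make_divisible(x, divisor=8):
    """Round a channel count up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor

def scale(channels, repeats, width_multiple, depth_multiple):
    """Apply the YAML width/depth multiples to one layer definition."""
    c = make_divisible(channels * width_multiple)
    n = max(round(repeats * depth_multiple), 1)
    return c, n

# yolov8n (width 0.25, depth 0.33): the [512, 6, C2f] backbone block becomes:
print(scale(512, 6, 0.25, 0.33))  # → (128, 2)
```

This is how the same YAML layer list yields everything from the nano to the extra-large model by changing only the two multiples.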

================================================
FILE: models/v8/yolov8s.yaml
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # scales module repeats
width_multiple: 0.50  # scales convolution channels

# YOLOv8.0s backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9

# YOLOv8.0s head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)


================================================
FILE: models/v8/yolov8x.yaml
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.00  # scales module repeats
width_multiple: 1.25  # scales convolution channels

# YOLOv8.0x backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [512, True]]
  - [-1, 1, SPPF, [512, 5]]  # 9

# YOLOv8.0x head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [512]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)


================================================
FILE: models/v8/yolov8x6.yaml
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.00  # scales module repeats
width_multiple: 1.25  # scales convolution channels

# YOLOv8.0x6 backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [512, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 9-P6/64
  - [-1, 3, C2f, [512, True]]
  - [-1, 1, SPPF, [512, 5]]  # 11

# YOLOv8.0x6 head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 8], 1, Concat, [1]]  # cat backbone P5
  - [-1, 3, C2, [512, False]]  # 14

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2, [512, False]]  # 17

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2, [256, False]]  # 20 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 17], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2, [512, False]]  # 23 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 14], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2, [512, False]]  # 26 (P5/32-large)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 11], 1, Concat, [1]]  # cat head P6
  - [-1, 3, C2, [512, False]]  # 29 (P6/64-xlarge)

  - [[20, 23, 26, 29], 1, Detect, [nc]]  # Detect(P3, P4, P5, P6)


================================================
FILE: nn/__init__.py
================================================


================================================
FILE: nn/autobackend.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import json
import platform
from collections import OrderedDict, namedtuple
from pathlib import Path
from urllib.parse import urlparse

import cv2
import numpy as np
import torch
import torch.nn as nn
from PIL import Image

from yolo.utils import LOGGER, ROOT, yaml_load
from yolo.utils.checks import check_requirements, check_suffix, check_version
from yolo.utils.downloads import attempt_download, is_url
from yolo.utils.ops import xywh2xyxy


class AutoBackend(nn.Module):

    def __init__(self, weights='yolov8n.pt', device=torch.device('cpu'), dnn=False, data=None, fp16=False, fuse=True):
        """
        Ultralytics YOLO MultiBackend class for Python inference on various backends

        Args:
          weights: path to the weights file, or an in-memory torch.nn.Module. Defaults to yolov8n.pt
          device: the device to run the model on.
          dnn: if True, use OpenCV's DNN module for ONNX inference. Defaults to False
          data: optional path to a dataset YAML providing class names
          fp16: if True, use half precision. Defaults to False
          fuse: whether to fuse the model. Defaults to True

        Supported format and their usage:
            | Platform              | weights          |
            |-----------------------|------------------|
            | PyTorch               | *.pt             |
            | TorchScript           | *.torchscript    |
            | ONNX Runtime          | *.onnx           |
            | ONNX OpenCV DNN       | *.onnx --dnn     |
            | OpenVINO              | *.xml            |
            | CoreML                | *.mlmodel        |
            | TensorRT              | *.engine         |
            | TensorFlow SavedModel | *_saved_model    |
            | TensorFlow GraphDef   | *.pb             |
            | TensorFlow Lite       | *.tflite         |
            | TensorFlow Edge TPU   | *_edgetpu.tflite |
            | PaddlePaddle          | *_paddle_model   |
        """
        super().__init__()
        w = str(weights[0] if isinstance(weights, list) else weights)
        nn_module = isinstance(weights, torch.nn.Module)
        pt, jit, onnx, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle, triton = self._model_type(w)
        fp16 &= pt or jit or onnx or engine or nn_module  # FP16
        nhwc = coreml or saved_model or pb or tflite or edgetpu  # BHWC formats (vs torch BCHW)
        stride = 32  # default stride
        cuda = torch.cuda.is_available() and device.type != 'cpu'  # use CUDA
        if not (pt or triton or nn_module):
            w = attempt_download(w)  # download if not local

        # NOTE: special case: in-memory pytorch model
        if nn_module:
            model = weights.to(device)
            model = model.fuse() if fuse else model
            names = model.module.names if hasattr(model, 'module') else model.names  # get class names
            model.half() if fp16 else model.float()
            self.model = model  # explicitly assign for to(), cpu(), cuda(), half()
            pt = True
        elif pt:  # PyTorch
            from nn.tasks import attempt_load_weights
            model = attempt_load_weights(weights if isinstance(weights, list) else w,
                                         device=device,
                                         inplace=True,
                                         fuse=fuse)
            stride = max(int(model.stride.max()), 32)  # model stride
            names = model.module.names if hasattr(model, 'module') else model.names  # get class names
            model.half() if fp16 else model.float()
            self.model = model  # explicitly assign for to(), cpu(), cuda(), half()
        elif jit:  # TorchScript
            LOGGER.info(f'Loading {w} for TorchScript inference...')
            extra_files = {'config.txt': ''}  # model metadata
            model = torch.jit.load(w, _extra_files=extra_files, map_location=device)
            model.half() if fp16 else model.float()
            if extra_files['config.txt']:  # load metadata dict
                d = json.loads(extra_files['config.txt'],
                               object_hook=lambda d: {int(k) if k.isdigit() else k: v
                                                      for k, v in d.items()})
                stride, names = int(d['stride']), d['names']
        elif dnn:  # ONNX OpenCV DNN
            LOGGER.info(f'Loading {w} for ONNX OpenCV DNN inference...')
            check_requirements('opencv-python>=4.5.4')
            net = cv2.dnn.readNetFromONNX(w)
        elif onnx:  # ONNX Runtime
            LOGGER.info(f'Loading {w} for ONNX Runtime inference...')
            check_requirements(('onnx', 'onnxruntime-gpu' if cuda else 'onnxruntime'))
            import onnxruntime
            providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
            session = onnxruntime.InferenceSession(w, providers=providers)
            output_names = [x.name for x in session.get_outputs()]
            meta = session.get_modelmeta().custom_metadata_map  # metadata
            if 'stride' in meta:
                stride, names = int(meta['stride']), eval(meta['names'])
        elif xml:  # OpenVINO
            LOGGER.info(f'Loading {w} for OpenVINO inference...')
            check_requirements('openvino')  # requires openvino-dev: https://pypi.org/project/openvino-dev/
            from openvino.runtime import Core, Layout, get_batch  # noqa
            ie = Core()
            if not Path(w).is_file():  # if not *.xml
                w = next(Path(w).glob('*.xml'))  # get *.xml file from *_openvino_model dir
            network = ie.read_model(model=w, weights=Path(w).with_suffix('.bin'))
            if network.get_parameters()[0].get_layout().empty:
                network.get_parameters()[0].set_layout(Layout("NCHW"))
            batch_dim = get_batch(network)
            if batch_dim.is_static:
                batch_size = batch_dim.get_length()
            executable_network = ie.compile_model(network, device_name="CPU")  # device_name="MYRIAD" for Intel NCS2
            stride, names = self._load_metadata(Path(w).with_suffix('.yaml'))  # load metadata
        elif engine:  # TensorRT
            LOGGER.info(f'Loading {w} for TensorRT inference...')
            import tensorrt as trt  # https://developer.nvidia.com/nvidia-tensorrt-download
            check_version(trt.__version__, '7.0.0', hard=True)  # require tensorrt>=7.0.0
            if device.type == 'cpu':
                device = torch.device('cuda:0')
            Binding = namedtuple('Binding', ('name', 'dtype', 'shape', 'data', 'ptr'))
            logger = trt.Logger(trt.Logger.INFO)
            with open(w, 'rb') as f, trt.Runtime(logger) as runtime:
                model = runtime.deserialize_cuda_engine(f.read())
            context = model.create_execution_context()
            bindings = OrderedDict()
            output_names = []
            fp16 = False  # default updated below
            dynamic = False
            for i in range(model.num_bindings):
                name = model.get_binding_name(i)
                dtype = trt.nptype(model.get_binding_dtype(i))
                if model.binding_is_input(i):
                    if -1 in tuple(model.get_binding_shape(i)):  # dynamic
                        dynamic = True
                        context.set_binding_shape(i, tuple(model.get_profile_shape(0, i)[2]))
                    if dtype == np.float16:
                        fp16 = True
                else:  # output
                    output_names.append(name)
                shape = tuple(context.get_binding_shape(i))
                im = torch.from_numpy(np.empty(shape, dtype=dtype)).to(device)
                bindings[name] = Binding(name, dtype, shape, im, int(im.data_ptr()))
            binding_addrs = OrderedDict((n, d.ptr) for n, d in bindings.items())
            batch_size = bindings['images'].shape[0]  # if dynamic, this is instead max batch size
        elif coreml:  # CoreML
            LOGGER.info(f'Loading {w} for CoreML inference...')
            import coremltools as ct
            model = ct.models.MLModel(w)
        elif saved_model:  # TF SavedModel
            LOGGER.info(f'Loading {w} for TensorFlow SavedModel inference...')
            import tensorflow as tf
            keras = False  # assume TF1 saved_model
            model = tf.keras.models.load_model(w) if keras else tf.saved_model.load(w)
        elif pb:  # GraphDef https://www.tensorflow.org/guide/migrate#a_graphpb_or_graphpbtxt
            LOGGER.info(f'Loading {w} for TensorFlow GraphDef inference...')
            import tensorflow as tf

            def wrap_frozen_graph(gd, inputs, outputs):
                x = tf.compat.v1.wrap_function(lambda: tf.compat.v1.import_graph_def(gd, name=""), [])  # wrapped
                ge = x.graph.as_graph_element
                return x.prune(tf.nest.map_structure(ge, inputs), tf.nest.map_structure(ge, outputs))

            def gd_outputs(gd):
                name_list, input_list = [], []
                for node in gd.node:  # tensorflow.core.framework.node_def_pb2.NodeDef
                    name_list.append(node.name)
                    input_list.extend(node.input)
                return sorted(f'{x}:0' for x in list(set(name_list) - set(input_list)) if not x.startswith('NoOp'))

            gd = tf.Graph().as_graph_def()  # TF GraphDef
            with open(w, 'rb') as f:
                gd.ParseFromString(f.read())
            frozen_func = wrap_frozen_graph(gd, inputs="x:0", outputs=gd_outputs(gd))
        elif tflite or edgetpu:  # https://www.tensorflow.org/lite/guide/python#install_tensorflow_lite_for_python
            try:  # https://coral.ai/docs/edgetpu/tflite-python/#update-existing-tf-lite-code-for-the-edge-tpu
                from tflite_runtime.interpreter import Interpreter, load_delegate
            except ImportError:
                import tensorflow as tf
                Interpreter, load_delegate = tf.lite.Interpreter, tf.lite.experimental.load_delegate,
            if edgetpu:  # TF Edge TPU https://coral.ai/software/#edgetpu-runtime
                LOGGER.info(f'Loading {w} for TensorFlow Lite Edge TPU inference...')
                delegate = {
                    'Linux': 'libedgetpu.so.1',
                    'Darwin': 'libedgetpu.1.dylib',
                    'Windows': 'edgetpu.dll'}[platform.system()]
                interpreter = Interpreter(model_path=w, experimental_delegates=[load_delegate(delegate)])
            else:  # TFLite
                LOGGER.info(f'Loading {w} for TensorFlow Lite inference...')
                interpreter = Interpreter(model_path=w)  # load TFLite model
            interpreter.allocate_tensors()  # allocate
            input_details = interpreter.get_input_details()  # inputs
            output_details = interpreter.get_output_details()  # outputs
        elif tfjs:  # TF.js
            raise NotImplementedError('ERROR: YOLOv8 TF.js inference is not supported')
        elif paddle:  # PaddlePaddle
            LOGGER.info(f'Loading {w} for PaddlePaddle inference...')
            check_requirements('paddlepaddle-gpu' if cuda else 'paddlepaddle')
            import paddle.inference as pdi
            if not Path(w).is_file():  # if not *.pdmodel
                w = next(Path(w).rglob('*.pdmodel'))  # get *.pdmodel file from *_paddle_model dir
            weights = Path(w).with_suffix('.pdiparams')
            config = pdi.Config(str(w), str(weights))
            if cuda:
                config.enable_use_gpu(memory_pool_init_size_mb=2048, device_id=0)
            predictor = pdi.create_predictor(config)
            input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
            output_names = predictor.get_output_names()
        elif triton:  # NVIDIA Triton Inference Server
            LOGGER.info('Triton Inference Server not supported...')
            '''
            TODO:
            check_requirements('tritonclient[all]')
            from utils.triton import TritonRemoteModel
            model = TritonRemoteModel(url=w)
            nhwc = model.runtime.startswith("tensorflow")
            '''
        else:
            raise NotImplementedError(f'ERROR: {w} is not a supported format')

        # class names
        if 'names' not in locals():
            names = yaml_load(data)['names'] if data else {i: f'class{i}' for i in range(999)}
        if names[0] == 'n01440764' and len(names) == 1000:  # ImageNet
            names = yaml_load(ROOT / 'yolo/data/datasets/ImageNet.yaml')['names']  # human-readable names

        self.__dict__.update(locals())  # assign all variables to self

    def forward(self, im, augment=False, visualize=False):
        """
        Runs inference on the given model

        Args:
          im: the image tensor
          augment: whether to augment the image. Defaults to False
          visualize: if True, then the network will output the feature maps of the last convolutional layer.
        Defaults to False
        """
        # AutoBackend inference
        b, ch, h, w = im.shape  # batch, channel, height, width
        if self.fp16 and im.dtype != torch.float16:
            im = im.half()  # to FP16
        if self.nhwc:
            im = im.permute(0, 2, 3, 1)  # torch BCHW to numpy BHWC shape(1,320,192,3)

        if self.pt or self.nn_module:  # PyTorch
            y = self.model(im, augment=augment, visualize=visualize) if augment or visualize else self.model(im)
        elif self.jit:  # TorchScript
            y = self.model(im)
        elif self.dnn:  # ONNX OpenCV DNN
            im = im.cpu().numpy()  # torch to numpy
            self.net.setInput(im)
            y = self.net.forward()
        elif self.onnx:  # ONNX Runtime
            im = im.cpu().numpy()  # torch to numpy
            y = self.session.run(self.output_names, {self.session.get_inputs()[0].name: im})
        elif self.xml:  # OpenVINO
            im = im.cpu().numpy()  # FP32
            y = list(self.executable_network([im]).values())
        elif self.engine:  # TensorRT
            if self.dynamic and im.shape != self.bindings['images'].shape:
                i = self.model.get_binding_index('images')
                self.context.set_binding_shape(i, im.shape)  # reshape if dynamic
                self.bindings['images'] = self.bindings['images']._replace(shape=im.shape)
                for name in self.output_names:
                    i = self.model.get_binding_index(name)
                    self.bindings[name].data.resize_(tuple(self.context.get_binding_shape(i)))
            s = self.bindings['images'].shape
            assert im.shape == s, f"input size {im.shape} {'>' if self.dynamic else 'not equal to'} max model size {s}"
            self.binding_addrs['images'] = int(im.data_ptr())
            self.context.execute_v2(list(self.binding_addrs.values()))
            y = [self.bindings[x].data for x in sorted(self.output_names)]
        elif self.coreml:  # CoreML
            im = im.cpu().numpy()
            im = Image.fromarray((im[0] * 255).astype('uint8'))
            # im = im.resize((192, 320), Image.ANTIALIAS)
            y = self.model.predict({'image': im})  # coordinates are xywh normalized
            if 'confidence' in y:
                box = xywh2xyxy(y['coordinates'] * [[w, h, w, h]])  # xyxy pixels
                conf, cls = y['confidence'].max(1), y['confidence'].argmax(1).astype(np.float32)  # np.float removed in NumPy>=1.24
                y = np.concatenate((box, conf.reshape(-1, 1), cls.reshape(-1, 1)), 1)
            else:
                y = list(reversed(y.values()))  # reversed for segmentation models (pred, proto)
        elif self.paddle:  # PaddlePaddle
            im = im.cpu().numpy().astype(np.float32)
            self.input_handle.copy_from_cpu(im)
            self.predictor.run()
            y = [self.predictor.get_output_handle(x).copy_to_cpu() for x in self.output_names]
        elif self.triton:  # NVIDIA Triton Inference Server
            y = self.model(im)
        else:  # TensorFlow (SavedModel, GraphDef, Lite, Edge TPU)
            im = im.cpu().numpy()
            if self.saved_model:  # SavedModel
                y = self.model(im, training=False) if self.keras else self.model(im)
            elif self.pb:  # GraphDef
                y = self.frozen_func(x=self.tf.constant(im))
            else:  # Lite or Edge TPU
                input = self.input_details[0]
                int8 = input['dtype'] == np.uint8  # is TFLite quantized uint8 model
                if int8:
                    scale, zero_point = input['quantization']
                    im = (im / scale + zero_point).astype(np.uint8)  # de-scale
                self.interpreter.set_tensor(input['index'], im)
                self.interpreter.invoke()
                y = []
                for output in self.output_details:
                    x = self.interpreter.get_tensor(output['index'])
                    if int8:
                        scale, zero_point = output['quantization']
                        x = (x.astype(np.float32) - zero_point) * scale  # re-scale
                    y.append(x)
            y = [x if isinstance(x, np.ndarray) else x.numpy() for x in y]
            y[0][..., :4] *= [w, h, w, h]  # xywh normalized to pixels

        if isinstance(y, (list, tuple)):
            return self.from_numpy(y[0]) if len(y) == 1 else [self.from_numpy(x) for x in y]
        else:
            return self.from_numpy(y)

    def from_numpy(self, x):
        """
        `from_numpy` converts a numpy array to a tensor

        Args:
          x: the numpy array to convert
        """
        return torch.from_numpy(x).to(self.device) if isinstance(x, np.ndarray) else x

    def warmup(self, imgsz=(1, 3, 640, 640)):
        """
        Warmup model by running inference once

        Args:
          imgsz: the size of the image you want to run inference on.
        """
        warmup_types = self.pt, self.jit, self.onnx, self.engine, self.saved_model, self.pb, self.triton, self.nn_module
        if any(warmup_types) and (self.device.type != 'cpu' or self.triton):
            im = torch.empty(*imgsz, dtype=torch.half if self.fp16 else torch.float, device=self.device)  # input
            for _ in range(2 if self.jit else 1):  # run twice for TorchScript
                self.forward(im)  # warmup

    @staticmethod
    def _model_type(p='path/to/model.pt'):
        """
        This function takes a path to a model file and returns the model type

        Args:
          p: path to the model file. Defaults to path/to/model.pt
        """
        # Return model type from model path, i.e. path='path/to/model.onnx' -> type=onnx
        # types = [pt, jit, onnx, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle]
        from yolo.engine.exporter import export_formats
        sf = list(export_formats().Suffix)  # export suffixes
        if not is_url(p, check=False) and not isinstance(p, str):
            check_suffix(p, sf)  # checks
        url = urlparse(p)  # if url may be Triton inference server
        types = [s in Path(p).name for s in sf]
        types[8] &= not types[9]  # tflite &= not edgetpu
        triton = not any(types) and all([any(s in url.scheme for s in ["http", "grpc"]), url.netloc])
        return types + [triton]

    @staticmethod
    def _load_metadata(f=Path('path/to/meta.yaml')):
        """
        > Loads the metadata from a yaml file

        Args:
          f: The path to the metadata file.
        """
        from yolo.utils.files import yaml_load

        # Load metadata from meta.yaml if it exists
        if f.exists():
            d = yaml_load(f)
            return d['stride'], d['names']  # assign stride, names
        return None, None
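
`_model_type` above matches each export suffix against the path name, with one special case: a `.tflite` file whose name contains `_edgetpu` counts as Edge TPU, not plain TFLite. A standalone sketch of that matching logic (the suffix list is written out here by hand to mirror `export_formats()`; treat it as illustrative, not authoritative):

```python
from pathlib import Path

# Hypothetical suffix list mirroring export_formats().Suffix order
SUFFIXES = ['.pt', '.torchscript', '.onnx', '_openvino_model', '.engine', '.mlmodel',
            '_saved_model', '.pb', '.tflite', '_edgetpu.tflite', '_web_model', '_paddle_model']

def model_type(p):
    types = [s in Path(p).name for s in SUFFIXES]
    types[8] &= not types[9]  # a *_edgetpu.tflite file is Edge TPU, not generic TFLite
    return types

print(model_type('yolov8n.onnx').index(True))            # 2 -> ONNX
print(model_type('yolov8n_edgetpu.tflite').index(True))  # 9 -> Edge TPU
```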


================================================
FILE: nn/modules.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license
"""
Common modules
"""

import math
import warnings
from copy import copy
from pathlib import Path

import cv2
import numpy as np
import pandas as pd
import requests
import torch
import torch.nn as nn
from PIL import Image, ImageOps
from torch.cuda import amp

from nn.autobackend import AutoBackend
from yolo.data.augment import LetterBox
from yolo.utils import LOGGER, colorstr
from yolo.utils.files import increment_path
from yolo.utils.ops import Profile, make_divisible, non_max_suppression, scale_boxes, xyxy2xywh
from yolo.utils.plotting import Annotator, colors, save_one_box
from yolo.utils.tal import dist2bbox, make_anchors
from yolo.utils.torch_utils import copy_attr, smart_inference_mode

# from utils.plots import feature_visualization TODO


def autopad(k, p=None, d=1):  # kernel, padding, dilation
    # Pad to 'same' shape outputs
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]  # actual kernel-size
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p
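
The padding rule above produces 'same' output shapes for odd kernels, after first inflating the kernel by the dilation factor. A quick sanity check, reproducing `autopad` standalone:

```python
def autopad(k, p=None, d=1):
    # Effective kernel size grows with dilation: k_eff = d * (k - 1) + 1
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # 'same' padding for odd kernels
    return p

print(autopad(3))        # 1: a 3x3 conv needs padding 1 to preserve H and W
print(autopad(3, d=2))   # 2: dilation 2 makes the effective kernel 5, so padding 2
print(autopad((1, 3)))   # [0, 1]: per-dimension padding for asymmetric kernels
```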


class Conv(nn.Module):
    # Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)
    default_act = nn.SiLU()  # default activation

    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        return self.act(self.conv(x))


class DWConv(Conv):
    # Depth-wise convolution
    def __init__(self, c1, c2, k=1, s=1, d=1, act=True):  # ch_in, ch_out, kernel, stride, dilation, activation
        super().__init__(c1, c2, k, s, g=math.gcd(c1, c2), d=d, act=act)


class DWConvTranspose2d(nn.ConvTranspose2d):
    # Depth-wise transpose convolution
    def __init__(self, c1, c2, k=1, s=1, p1=0, p2=0):  # ch_in, ch_out, kernel, stride, padding, padding_out
        super().__init__(c1, c2, k, s, p1, p2, groups=math.gcd(c1, c2))


class ConvTranspose(nn.Module):
    # Convolution transpose 2d layer
    default_act = nn.SiLU()  # default activation

    def __init__(self, c1, c2, k=2, s=2, p=0, bn=True, act=True):
        super().__init__()
        self.conv_transpose = nn.ConvTranspose2d(c1, c2, k, s, p, bias=not bn)
        self.bn = nn.BatchNorm2d(c2) if bn else nn.Identity()
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv_transpose(x)))


class DFL(nn.Module):
    # Integral module of Distribution Focal Loss (DFL) proposed in Generalized Focal Loss https://arxiv.org/abs/2006.04388
    def __init__(self, c1=16):
        super().__init__()
        self.conv = nn.Conv2d(c1, 1, 1, bias=False).requires_grad_(False)
        x = torch.arange(c1, dtype=torch.float)
        self.conv.weight.data[:] = nn.Parameter(x.view(1, c1, 1, 1))
        self.c1 = c1

    def forward(self, x):
        b, c, a = x.shape  # batch, channels, anchors
        return self.conv(x.view(b, 4, self.c1, a).transpose(2, 1).softmax(1)).view(b, 4, a)
        # return self.conv(x.view(b, self.c1, 4, a).softmax(1)).view(b, 4, a)
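
DFL treats each box side as a discrete distribution over `c1` bins and decodes it as the softmax expectation; the frozen 1x1 conv with weights `0..c1-1` computes exactly that expected value. A minimal NumPy sketch of the decoding for one side (hypothetical logits, not the module itself):

```python
import numpy as np

def dfl_decode(logits):
    """Decode one side's distance from c1 bin logits via softmax expectation."""
    e = np.exp(logits - logits.max())   # numerically stable softmax
    probs = e / e.sum()
    return float((probs * np.arange(len(logits))).sum())  # E[bin index]

# A sharply peaked distribution decodes to (almost exactly) its argmax bin
logits = np.full(16, -10.0)
logits[7] = 10.0
print(round(dfl_decode(logits), 3))  # 7.0
```

Uniform logits decode to the midpoint (7.5 for 16 bins), which is why DFL can express sub-bin distances that a plain argmax could not.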


class TransformerLayer(nn.Module):
    # Transformer layer https://arxiv.org/abs/2010.11929 (LayerNorm layers removed for better performance)
    def __init__(self, c, num_heads):
        super().__init__()
        self.q = nn.Linear(c, c, bias=False)
        self.k = nn.Linear(c, c, bias=False)
        self.v = nn.Linear(c, c, bias=False)
        self.ma = nn.MultiheadAttention(embed_dim=c, num_heads=num_heads)
        self.fc1 = nn.Linear(c, c, bias=False)
        self.fc2 = nn.Linear(c, c, bias=False)

    def forward(self, x):
        x = self.ma(self.q(x), self.k(x), self.v(x))[0] + x
        x = self.fc2(self.fc1(x)) + x
        return x


class TransformerBlock(nn.Module):
    # Vision Transformer https://arxiv.org/abs/2010.11929
    def __init__(self, c1, c2, num_heads, num_layers):
        super().__init__()
        self.conv = None
        if c1 != c2:
            self.conv = Conv(c1, c2)
        self.linear = nn.Linear(c2, c2)  # learnable position embedding
        self.tr = nn.Sequential(*(TransformerLayer(c2, num_heads) for _ in range(num_layers)))
        self.c2 = c2

    def forward(self, x):
        if self.conv is not None:
            x = self.conv(x)
        b, _, w, h = x.shape
        p = x.flatten(2).permute(2, 0, 1)
        return self.tr(p + self.linear(p)).permute(1, 2, 0).reshape(b, self.c2, w, h)


class Bottleneck(nn.Module):
    # Standard bottleneck
    def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):  # ch_in, ch_out, shortcut, groups, kernels, expand
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, k[0], 1)
        self.cv2 = Conv(c_, c2, k[1], 1, g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


class BottleneckCSP(nn.Module):
    # CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
        self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
        self.cv4 = Conv(2 * c_, c2, 1, 1)
        self.bn = nn.BatchNorm2d(2 * c_)  # applied to cat(cv2, cv3)
        self.act = nn.SiLU()
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

    def forward(self, x):
        y1 = self.cv3(self.m(self.cv1(x)))
        y2 = self.cv2(x)
        return self.cv4(self.act(self.bn(torch.cat((y1, y2), 1))))


class C3(nn.Module):
    # CSP Bottleneck with 3 convolutions
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))


class C2(nn.Module):
    # CSP Bottleneck with 2 convolutions
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv(2 * self.c, c2, 1)  # optional act=FReLU(c2)
        # self.attention = ChannelAttention(2 * self.c)  # or SpatialAttention()
        self.m = nn.Sequential(*(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n)))

    def forward(self, x):
        a, b = self.cv1(x).split((self.c, self.c), 1)
        return self.cv2(torch.cat((self.m(a), b), 1))


class C2f(nn.Module):
    # Faster implementation of CSP Bottleneck with 2 convolutions
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):  # ch_in, ch_out, number, shortcut, groups, expansion
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

    def forward(self, x):
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))
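
C2f splits cv1's output into two halves of `self.c` channels, then appends each bottleneck's output to the list, so cv2 receives `(2 + n) * self.c` channels. A quick channel-bookkeeping check (pure arithmetic, no torch; the helper name is just for illustration):

```python
def c2f_cat_channels(c2, n, e=0.5):
    c = int(c2 * e)      # hidden channels per branch, as in C2f.__init__
    return (2 + n) * c   # channels entering cv2 after the concatenation

print(c2f_cat_channels(c2=128, n=3))  # 320 = (2 + 3) * 64
```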


class ChannelAttention(nn.Module):
    # Channel-attention module https://github.com/open-mmlab/mmdetection/tree/v3.0.0rc1/configs/rtmdet
    def __init__(self, channels: int) -> None:
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Conv2d(channels, channels, 1, 1, 0, bias=True)
        self.act = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.act(self.fc(self.pool(x)))


class SpatialAttention(nn.Module):
    # Spatial-attention module
    def __init__(self, kernel_size=7):
        super().__init__()
        assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
        padding = 3 if kernel_size == 7 else 1
        self.cv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
        self.act = nn.Sigmoid()

    def forward(self, x):
        return x * self.act(self.cv1(torch.cat([torch.mean(x, 1, keepdim=True), torch.max(x, 1, keepdim=True)[0]], 1)))


class CBAM(nn.Module):
    # Convolutional Block Attention Module https://arxiv.org/abs/1807.06521
    def __init__(self, c1, ratio=16, kernel_size=7):  # ch_in, reduction ratio (unused), spatial kernel size
        super().__init__()
        self.channel_attention = ChannelAttention(c1)
        self.spatial_attention = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.spatial_attention(self.channel_attention(x))


class C1(nn.Module):
    # CSP Bottleneck with 1 convolution
    def __init__(self, c1, c2, n=1):  # ch_in, ch_out, number
        super().__init__()
        self.cv1 = Conv(c1, c2, 1, 1)
        self.m = nn.Sequential(*(Conv(c2, c2, 3) for _ in range(n)))

    def forward(self, x):
        y = self.cv1(x)
        return self.m(y) + y


class C3x(C3):
    # C3 module with cross-convolutions
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        self.c_ = int(c2 * e)
        self.m = nn.Sequential(*(Bottleneck(self.c_, self.c_, shortcut, g, k=((1, 3), (3, 1)), e=1) for _ in range(n)))


class C3TR(C3):
    # C3 module with TransformerBlock()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)
        self.m = TransformerBlock(c_, c_, 4, n)


class C3Ghost(C3):
    # C3 module with GhostBottleneck()
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(GhostBottleneck(c_, c_) for _ in range(n)))


class SPP(nn.Module):
    # Spatial Pyramid Pooling (SPP) layer https://arxiv.org/abs/1406.4729
    def __init__(self, c1, c2, k=(5, 9, 13)):
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))


class SPPF(nn.Module):
    # Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
    def __init__(self, c1, c2, k=5):  # equivalent to SPP(k=(5, 9, 13))
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * 4, c2, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            y1 = self.m(x)
            y2 = self.m(y1)
            return self.cv2(torch.cat((x, y1, y2, self.m(y2)), 1))
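
SPPF replaces SPP's parallel k=5/9/13 pools with three chained k=5 pools: for stride-1 'same'-padded max pooling, two stacked 5-pools have the receptive field of one 9-pool and three of one 13-pool, so the concatenated outputs match `SPP(k=(5, 9, 13))` while reusing intermediate results. A 1-D NumPy check of that equivalence (the `maxpool1d` helper is a hand-rolled stand-in, stride 1, padded with -inf):

```python
import numpy as np

def maxpool1d(x, k):
    # stride-1 max pool with 'same' padding (pad with -inf so edges are handled exactly)
    pad = k // 2
    xp = np.pad(x, pad, constant_values=-np.inf)
    return np.array([xp[i:i + k].max() for i in range(len(x))])

x = np.random.rand(32)
y1 = maxpool1d(x, 5)
y2 = maxpool1d(y1, 5)  # equivalent to maxpool1d(x, 9)
y3 = maxpool1d(y2, 5)  # equivalent to maxpool1d(x, 13)
print(np.allclose(y2, maxpool1d(x, 9)), np.allclose(y3, maxpool1d(x, 13)))  # True True
```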


class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act=act)
        # self.contract = Contract(gain=2)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat((x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]), 1))
        # return self.conv(self.contract(x))
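
Focus is a space-to-depth slice: every 2x2 spatial patch is moved into channels, halving H and W and quadrupling C before the conv. A NumPy sketch of just the slicing step:

```python
import numpy as np

x = np.arange(16).reshape(1, 1, 4, 4)  # (b, c, h, w)
y = np.concatenate((x[..., ::2, ::2],     # top-left pixel of each 2x2 patch
                    x[..., 1::2, ::2],    # bottom-left
                    x[..., ::2, 1::2],    # top-right
                    x[..., 1::2, 1::2]),  # bottom-right
                   axis=1)
print(y.shape)  # (1, 4, 2, 2): channels x4, spatial /2
```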


class GhostConv(nn.Module):
    # Ghost Convolution https://github.com/huawei-noah/ghostnet
    def __init__(self, c1, c2, k=1, s=1, g=1, act=True):  # ch_in, ch_out, kernel, stride, groups
        super().__init__()
        c_ = c2 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, k, s, None, g, act=act)
        self.cv2 = Conv(c_, c_, 5, 1, None, c_, act=act)

    def forward(self, x):
        y = self.cv1(x)
        return torch.cat((y, self.cv2(y)), 1)


class GhostBottleneck(nn.Module):
    # Ghost Bottleneck https://github.com/huawei-noah/ghostnet
    def __init__(self, c1, c2, k=3, s=1):  # ch_in, ch_out, kernel, stride
        super().__init__()
        c_ = c2 // 2
        self.conv = nn.Sequential(
            GhostConv(c1, c_, 1, 1),  # pw
            DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(),  # dw
            GhostConv(c_, c2, 1, 1, act=False))  # pw-linear
        self.shortcut = nn.Sequential(DWConv(c1, c1, k, s, act=False), Conv(c1, c2, 1, 1,
                                                                            act=False)) if s == 2 else nn.Identity()

    def forward(self, x):
        return self.conv(x) + self.shortcut(x)


class Concat(nn.Module):
    # Concatenate a list of tensors along dimension
    def __init__(self, dimension=1):
        super().__init__()
        self.d = dimension

    def forward(self, x):
        return torch.cat(x, self.d)


class AutoShape(nn.Module):
    # YOLOv5 input-robust model wrapper for passing cv2/np/PIL/torch inputs. Includes preprocessing, inference and NMS
    conf = 0.25  # NMS confidence threshold
    iou = 0.45  # NMS IoU threshold
    agnostic = False  # NMS class-agnostic
    multi_label = False  # NMS multiple labels per box
    classes = None  # (optional list) filter by class, i.e. = [0, 15, 16] for COCO persons, cats and dogs
    max_det = 1000  # maximum number of detections per image
    amp = False  # Automatic Mixed Precision (AMP) inference

    def __init__(self, model, verbose=True):
        super().__init__()
        if verbose:
            LOGGER.info('Adding AutoShape... ')
        copy_attr(self, model, include=('yaml', 'nc', 'hyp', 'names', 'stride', 'abc'), exclude=())  # copy attributes
        self.dmb = isinstance(model, AutoBackend)  # AutoBackend() instance
        self.pt = not self.dmb or model.pt  # PyTorch model
        self.model = model.eval()
        if self.pt:
            m = self.model.model.model[-1] if self.dmb else self.model.model[-1]  # Detect()
            m.inplace = False  # Detect.inplace=False for safe multithread inference
            m.export = True  # do not output loss values

    def _apply(self, fn):
        # Apply to(), cpu(), cuda(), half() to model tensors that are not parameters or registered buffers
        self = super()._apply(fn)
        if self.pt:
            m = self.model.model.model[-1] if self.dmb else self.model.model[-1]  # Detect()
            m.stride = fn(m.stride)
            m.grid = list(map(fn, m.grid))
            if isinstance(m.anchor_grid, list):
                m.anchor_grid = list(map(fn, m.anchor_grid))
        return self

    @smart_inference_mode()
    def forward(self, ims, size=640, augment=False, profile=False):
        # Inference from various sources. For size(height=640, width=1280), RGB images example inputs are:
        #   file:        ims = 'data/images/zidane.jpg'  # str or PosixPath
        #   URI:             = 'https://ultralytics.com/images/zidane.jpg'
        #   OpenCV:          = cv2.imread('image.jpg')[:,:,::-1]  # HWC BGR to RGB x(640,1280,3)
        #   PIL:             = Image.open('image.jpg') or ImageGrab.grab()  # HWC x(640,1280,3)
        #   numpy:           = np.zeros((640,1280,3))  # HWC
        #   torch:           = torch.zeros(16,3,320,640)  # BCHW (scaled to size=640, 0-1 values)
        #   multiple:        = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...]  # list of images

        dt = (Profile(), Profile(), Profile())
        with dt[0]:
            if isinstance(size, int):  # expand
                size = (size, size)
            p = next(self.model.parameters()) if self.pt else torch.empty(1, device=self.model.device)  # param
            autocast = self.amp and (p.device.type != 'cpu')  # Automatic Mixed Precision (AMP) inference
            if isinstance(ims, torch.Tensor):  # torch
                with amp.autocast(autocast):
                    return self.model(ims.to(p.device).type_as(p), augment=augment)  # inference

            # Pre-process
            n, ims = (len(ims), list(ims)) if isinstance(ims, (list, tuple)) else (1, [ims])  # number, list of images
            shape0, shape1, files = [], [], []  # image and inference shapes, filenames
            for i, im in enumerate(ims):
                f = f'image{i}'  # filename
                if isinstance(im, (str, Path)):  # filename or uri
                    im, f = Image.open(requests.get(im, stream=True).raw if str(im).startswith('http') else im), im
                    im = np.asarray(ImageOps.exif_transpose(im))
                elif isinstance(im, Image.Image):  # PIL Image
                    im, f = np.asarray(ImageOps.exif_transpose(im)), getattr(im, 'filename', f) or f
                files.append(Path(f).with_suffix('.jpg').name)
                if im.shape[0] < 5:  # image in CHW
                    im = im.transpose((1, 2, 0))  # reverse dataloader .transpose(2, 0, 1)
                im = im[..., :3] if im.ndim == 3 else cv2.cvtColor(im, cv2.COLOR_GRAY2BGR)  # enforce 3ch input
                s = im.shape[:2]  # HWC
                shape0.append(s)  # image shape
                g = max(size) / max(s)  # gain
                shape1.append([y * g for y in s])
                ims[i] = im if im.data.contiguous else np.ascontiguousarray(im)  # update
            shape1 = [make_divisible(x, self.stride) for x in np.array(shape1).max(0)] if self.pt else size  # inf shape
            x = [LetterBox(shape1, auto=False)(image=im)["img"] for im in ims]  # pad
            x = np.ascontiguousarray(np.array(x).transpose((0, 3, 1, 2)))  # stack and BHWC to BCHW
            x = torch.from_numpy(x).to(p.device).type_as(p) / 255  # uint8 to fp16/32

        with amp.autocast(autocast):
            # Inference
            with dt[1]:
                y = self.model(x, augment=augment)  # forward

            # Post-process
            with dt[2]:
                y = non_max_suppression(y if self.dmb else y[0],
                                        self.conf,
                                        self.iou,
                                        self.classes,
                                        self.agnostic,
                                        self.multi_label,
                                        max_det=self.max_det)  # NMS
                for i in range(n):
                    scale_boxes(shape1, y[i][:, :4], shape0[i])

            return Detections(ims, y, files, dt, self.names, x.shape)


class Detections:
    # YOLOv5 detections class for inference results
    def __init__(self, ims, pred, files, times=(0, 0, 0), names=None, shape=None):
        super().__init__()
        d = pred[0].device  # device
        gn = [torch.tensor([*(im.shape[i] for i in [1, 0, 1, 0]), 1, 1], device=d) for im in ims]  # normalizations
        self.ims = ims  # list of images as numpy arrays
        self.pred = pred  # list of tensors pred[0] = (xyxy, conf, cls)
        self.names = names  # class names
        self.files = files  # image filenames
        self.times = times  # profiling times
        self.xyxy = pred  # xyxy pixels
        self.xywh = [xyxy2xywh(x) for x in pred]  # xywh pixels
        self.xyxyn = [x / g for x, g in zip(self.xyxy, gn)]  # xyxy normalized
        self.xywhn = [x / g for x, g in zip(self.xywh, gn)]  # xywh normalized
        self.n = len(self.pred)  # number of images (batch size)
        self.t = tuple(x.t / self.n * 1E3 for x in times)  # timestamps (ms)
        self.s = tuple(shape)  # inference BCHW shape

    def _run(self, pprint=False, show=False, save=False, crop=False, render=False, labels=True, save_dir=Path('')):
        s, crops = '', []
        for i, (im, pred) in enumerate(zip(self.ims, self.pred)):
            s += f'\nimage {i + 1}/{len(self.pred)}: {im.shape[0]}x{im.shape[1]} '  # string
            if pred.shape[0]:
                for c in pred[:, -1].unique():
                    n = (pred[:, -1] == c).sum()  # detections per class
                    s += f"{n} {self.names[int(c)]}{'s' * (n > 1)}, "  # add to string
                s = s.rstrip(', ')
                if show or save or render or crop:
                    annotator = Annotator(im, example=str(self.names))
                    for *box, conf, cls in reversed(pred):  # xyxy, confidence, class
                        label = f'{self.names[int(cls)]} {conf:.2f}'
                        if crop:
                            file = save_dir / 'crops' / self.names[int(cls)] / self.files[i] if save else None
                            crops.append({
                                'box': box,
                                'conf': conf,
                                'cls': cls,
                                'label': label,
                                'im': save_one_box(box, im, file=file, save=save)})
                        else:  # all others
                            annotator.box_label(box, label if labels else '', color=colors(cls))
                    im = annotator.im
            else:
                s += '(no detections)'

            im = Image.fromarray(im.astype(np.uint8)) if isinstance(im, np.ndarray) else im  # from np
            if show:
                im.show(self.files[i])  # show
            if save:
                f = self.files[i]
                im.save(save_dir / f)  # save
                if i == self.n - 1:
                    LOGGER.info(f"Saved {self.n} image{'s' * (self.n > 1)} to {colorstr('bold', save_dir)}")
            if render:
                self.ims[i] = np.asarray(im)
        if pprint:
            s = s.lstrip('\n')
            return f'{s}\nSpeed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {self.s}' % self.t
        if crop:
            if save:
                LOGGER.info(f'Saved results to {save_dir}\n')
            return crops

    def show(self, labels=True):
        self._run(show=True, labels=labels)  # show results

    def save(self, labels=True, save_dir='runs/detect/exp', exist_ok=False):
        save_dir = increment_path(save_dir, exist_ok, mkdir=True)  # increment save_dir
        self._run(save=True, labels=labels, save_dir=save_dir)  # save results

    def crop(self, save=True, save_dir='runs/detect/exp', exist_ok=False):
        save_dir = increment_path(save_dir, exist_ok, mkdir=True) if save else None
        return self._run(crop=True, save=save, save_dir=save_dir)  # crop results

    def render(self, labels=True):
        self._run(render=True, labels=labels)  # render results
        return self.ims

    def pandas(self):
        # return detections as pandas DataFrames, i.e. print(results.pandas().xyxy[0])
        new = copy(self)  # return copy
        ca = 'xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'class', 'name'  # xyxy columns
        cb = 'xcenter', 'ycenter', 'width', 'height', 'confidence', 'class', 'name'  # xywh columns
        for k, c in zip(['xyxy', 'xyxyn', 'xywh', 'xywhn'], [ca, ca, cb, cb]):
            a = [[x[:5] + [int(x[5]), self.names[int(x[5])]] for x in x.tolist()] for x in getattr(self, k)]  # update
            setattr(new, k, [pd.DataFrame(x, columns=c) for x in a])
        return new

    def tolist(self):
        # return a list of Detections objects, i.e. 'for result in results.tolist():'
        r = range(self.n)  # iterable
        x = [Detections([self.ims[i]], [self.pred[i]], [self.files[i]], self.times, self.names, self.s) for i in r]
        # for d in x:
        #    for k in ['ims', 'pred', 'xyxy', 'xyxyn', 'xywh', 'xywhn']:
        #        setattr(d, k, getattr(d, k)[0])  # pop out of list
        return x

    def print(self):
        LOGGER.info(self.__str__())

    def __len__(self):  # override len(results)
        return self.n

    def __str__(self):  # override print(results)
        return self._run(pprint=True)  # print results

    def __repr__(self):
        return f'YOLOv5 {self.__class__} instance\n' + self.__str__()


class Proto(nn.Module):
    # YOLOv8 mask Proto module for segmentation models
    def __init__(self, c1, c_=256, c2=32):  # ch_in, number of protos, number of masks
        super().__init__()
        self.cv1 = Conv(c1, c_, k=3)
        self.upsample = nn.ConvTranspose2d(c_, c_, 2, 2, 0, bias=True)  # nn.Upsample(scale_factor=2, mode='nearest')
        self.cv2 = Conv(c_, c_, k=3)
        self.cv3 = Conv(c_, c2)

    def forward(self, x):
        return self.cv3(self.cv2(self.upsample(self.cv1(x))))


class Ensemble(nn.ModuleList):
    # Ensemble of models
    def __init__(self):
        super().__init__()

    def forward(self, x, augment=False, profile=False, visualize=False):
        y = [module(x, augment, profile, visualize)[0] for module in self]
        # y = torch.stack(y).max(0)[0]  # max ensemble
        # y = torch.stack(y).mean(0)  # mean ensemble
        y = torch.cat(y, 1)  # nms ensemble
        return y, None  # inference, train output


# heads
class Detect(nn.Module):
    # YOLOv8 anchor-free Detect head for detection models
    dynamic = False  # force grid reconstruction
    export = False  # export mode
    shape = None
    anchors = torch.empty(0)  # init
    strides = torch.empty(0)  # init

    def __init__(self, nc=80, ch=()):  # detection layer
        super().__init__()
        self.nc = nc  # number of classes
        self.nl = len(ch)  # number of detection layers
        self.reg_max = 16  # DFL channels (ch[0] // 16 to scale 4/8/12/16/20 for n/s/m/l/x)
        self.no = nc + self.reg_max * 4  # number of outputs per anchor
        self.stride = torch.zeros(self.nl)  # strides computed during build

        c2, c3 = max((16, ch[0] // 4, self.reg_max * 4)), max(ch[0], self.nc)  # channels
        self.cv2 = nn.ModuleList(
            nn.Sequential(Conv(x, c2, 3), Conv(c2, c2, 3), nn.Conv2d(c2, 4 * self.reg_max, 1)) for x in ch)
        self.cv3 = nn.ModuleList(nn.Sequential(Conv(x, c3, 3), Conv(c3, c3, 3), nn.Conv2d(c3, self.nc, 1)) for x in ch)
        self.dfl = DFL(self.reg_max) if self.reg_max > 1 else nn.Identity()

    def forward(self, x):
        shape = x[0].shape  # BCHW
        for i in range(self.nl):
            x[i] = torch.cat((self.cv2[i](x[i]), self.cv3[i](x[i])), 1)
        if self.training:
            return x
        elif self.dynamic or self.shape != shape:
            self.anchors, self.strides = (x.transpose(0, 1) for x in make_anchors(x, self.stride, 0.5))
            self.shape = shape

        box, cls = torch.cat([xi.view(shape[0], self.no, -1) for xi in x], 2).split((self.reg_max * 4, self.nc), 1)
        dbox = dist2bbox(self.dfl(box), self.anchors.unsqueeze(0), xywh=True, dim=1) * self.strides
        y = torch.cat((dbox, cls.sigmoid()), 1)
        return y if self.export else (y, x)

    def bias_init(self):
        # Initialize Detect() biases, WARNING: requires stride availability
        m = self  # self.model[-1]  # Detect() module
        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1
        # ncf = math.log(0.6 / (m.nc - 0.999999)) if cf is None else torch.log(cf / cf.sum())  # nominal class frequency
        for a, b, s in zip(m.cv2, m.cv3, m.stride):  # from
            a[-1].bias.data[:] = 1.0  # box
            b[-1].bias.data[:m.nc] = math.log(5 / m.nc / (640 / s) ** 2)  # cls (.01 objects, 80 classes, 640 img)


class Segment(Detect):
    # YOLOv8 Segment head for segmentation models
    def __init__(self, nc=80, nm=32, npr=256, ch=()):
        super().__init__(nc, ch)
        self.nm = nm  # number of masks
        self.npr = npr  # number of protos
        self.proto = Proto(ch[0], self.npr, self.nm)  # protos
        self.detect = Detect.forward

        c4 = max(ch[0] // 4, self.nm)
        self.cv4 = nn.ModuleList(nn.Sequential(Conv(x, c4, 3), Conv(c4, c4, 3), nn.Conv2d(c4, self.nm, 1)) for x in ch)

    def forward(self, x):
        p = self.proto(x[0])  # mask protos
        bs = p.shape[0]  # batch size

        mc = torch.cat([self.cv4[i](x[i]).view(bs, self.nm, -1) for i in range(self.nl)], 2)  # mask coefficients
        x = self.detect(self, x)
        if self.training:
            return x, mc, p
        return (torch.cat([x, mc], 1), p) if self.export else (torch.cat([x[0], mc], 1), (x[1], mc, p))


class Classify(nn.Module):
    # YOLOv8 classification head, i.e. x(b,c1,20,20) to x(b,c2)
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        c_ = 1280  # efficientnet_b0 size
        self.conv = Conv(c1, c_, k, s, autopad(k, p), g)
        self.pool = nn.AdaptiveAvgPool2d(1)  # to x(b,c_,1,1)
        self.drop = nn.Dropout(p=0.0, inplace=True)
        self.linear = nn.Linear(c_, c2)  # to x(b,c2)

    def forward(self, x):
        if isinstance(x, list):
            x = torch.cat(x, 1)
        return self.linear(self.drop(self.pool(self.conv(x)).flatten(1)))
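
# For reference alongside Detect.forward above: a minimal pure-Python sketch of the
# dist2bbox() decoding it relies on. The function name dist2bbox_xywh is hypothetical;
# the real helper operates on batched tensors and the result is then multiplied by the
# per-level strides. This is an illustrative re-derivation, not the library implementation.

```python
def dist2bbox_xywh(anchor, dist):
    """Decode DFL distances (left, top, right, bottom) from an anchor point
    into an (cx, cy, w, h) box, mirroring dist2bbox(..., xywh=True).

    anchor: (x, y) anchor-point coordinates
    dist:   (l, t, r, b) distances predicted relative to the anchor
    """
    ax, ay = anchor
    l, t, r, b = dist
    x1, y1 = ax - l, ay - t  # top-left corner
    x2, y2 = ax + r, ay + b  # bottom-right corner
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)
```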


================================================
FILE: nn/tasks.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import contextlib
from copy import deepcopy

import thop
import torch
import torch.nn as nn

from nn.modules import (C1, C2, C3, C3TR, SPP, SPPF, Bottleneck, BottleneckCSP, C2f, C3Ghost, C3x, Classify,
                                    Concat, Conv, ConvTranspose, Detect, DWConv, DWConvTranspose2d, Ensemble, Focus,
                                    GhostBottleneck, GhostConv, Segment)
from yolo.utils import DEFAULT_CONFIG_DICT, DEFAULT_CONFIG_KEYS, LOGGER, colorstr, yaml_load
from yolo.utils.checks import check_yaml
from yolo.utils.torch_utils import (fuse_conv_and_bn, initialize_weights, intersect_dicts, make_divisible,
                                                model_info, scale_img, time_sync)


class BaseModel(nn.Module):
    """
    BaseModel is the base class for all models in the Ultralytics YOLO family.
    """

    def forward(self, x, profile=False, visualize=False):
        """
        > `forward` is a wrapper for `_forward_once` that runs the model on a single scale

        Args:
          x: the input image
          profile: whether to profile the model. Defaults to False
          visualize: if True, will return the intermediate feature maps. Defaults to False

        Returns:
          The output of the network.
        """
        return self._forward_once(x, profile, visualize)

    def _forward_once(self, x, profile=False, visualize=False):
        """
        > Forward pass of the network

        Args:
          x: input to the model
          profile: if True, the time taken for each layer will be printed. Defaults to False
          visualize: If True, it will save the feature maps of the model. Defaults to False

        Returns:
          The last layer of the model.
        """
        y, dt = [], []  # outputs
        for m in self.model:
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
            if profile:
                self._profile_one_layer(m, x, dt)
            x = m(x)  # run
            y.append(x if m.i in self.save else None)  # save output
            if visualize:
                pass
                # TODO: feature_visualization(x, m.type, m.i, save_dir=visualize)
        return x

    def _profile_one_layer(self, m, x, dt):
        """
        It takes a model, an input, and a list of times, and it profiles the model on the input, appending
        the time to the list

        Args:
          m: the model
          x: the input image
          dt: list of time taken for each layer
        """
        c = m == self.model[-1]  # is final layer, copy input as inplace fix
        o = thop.profile(m, inputs=(x.copy() if c else x,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPs
        t = time_sync()
        for _ in range(10):
            m(x.copy() if c else x)
        dt.append((time_sync() - t) * 100)
        if m == self.model[0]:
            LOGGER.info(f"{'time (ms)':>10s} {'GFLOPs':>10s} {'params':>10s}  module")
        LOGGER.info(f'{dt[-1]:10.2f} {o:10.2f} {m.np:10.0f}  {m.type}')
        if c:
            LOGGER.info(f"{sum(dt):10.2f} {'-':>10s} {'-':>10s}  Total")

    def fuse(self):
        """
        > It takes a model and fuses the Conv2d() and BatchNorm2d() layers into a single layer

        Returns:
          The model is being returned.
        """
        LOGGER.info('Fusing layers... ')
        for m in self.model.modules():
            if isinstance(m, (Conv, DWConv)) and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.forward_fuse  # update forward
        self.info()
        return self

    def info(self, verbose=False, imgsz=640):
        """
        Prints model information

        Args:
          verbose: if True, prints out the model information. Defaults to False
          imgsz: the size of the image that the model will be trained on. Defaults to 640
        """
        model_info(self, verbose, imgsz)

    def _apply(self, fn):
        """
        `_apply()` is a function that applies a function to all the tensors in the model that are not
        parameters or registered buffers

        Args:
          fn: the function to apply to the model

        Returns:
          A model that is a Detect() object.
        """
        self = super()._apply(fn)
        m = self.model[-1]  # Detect()
        if isinstance(m, (Detect, Segment)):
            m.stride = fn(m.stride)
            m.anchors = fn(m.anchors)
            m.strides = fn(m.strides)
        return self

    def load(self, weights):
        """
        > This function loads the weights of the model from a file

        Args:
          weights: The weights to load into the model.
        """
        # Force all tasks to implement this function
        raise NotImplementedError("This function needs to be implemented by derived classes!")


class DetectionModel(BaseModel):
    # YOLOv8 detection model
    def __init__(self, cfg='yolov8n.yaml', ch=3, nc=None, verbose=True):  # model, input channels, number of classes
        super().__init__()
        self.yaml = cfg if isinstance(cfg, dict) else yaml_load(check_yaml(cfg), append_filename=True)  # cfg dict

        # Define model
        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels
        if nc and nc != self.yaml['nc']:
            LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
            self.yaml['nc'] = nc  # override yaml value
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch], verbose=verbose)  # model, savelist
        self.names = {i: f'{i}' for i in range(self.yaml['nc'])}  # default names dict
        self.inplace = self.yaml.get('inplace', True)

        # Build strides
        m = self.model[-1]  # Detect()
        if isinstance(m, (Detect, Segment)):
            s = 256  # 2x min stride
            m.inplace = self.inplace
            forward = lambda x: self.forward(x)[0] if isinstance(m, Segment) else self.forward(x)
            m.stride = torch.tensor([s / x.shape[-2] for x in forward(torch.zeros(1, ch, s, s))])  # forward
            self.stride = m.stride
            m.bias_init()  # only run once

        # Init weights, biases
        initialize_weights(self)
        if verbose:
            self.info()
            LOGGER.info('')

    def forward(self, x, augment=False, profile=False, visualize=False):
        if augment:
            return self._forward_augment(x)  # augmented inference, None
        return self._forward_once(x, profile, visualize)  # single-scale inference, train

    def _forward_augment(self, x):
        img_size = x.shape[-2:]  # height, width
        s = [1, 0.83, 0.67]  # scales
        f = [None, 3, None]  # flips (2-ud, 3-lr)
        y = []  # outputs
        for si, fi in zip(s, f):
            xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max()))
            yi = self._forward_once(xi)[0]  # forward
            # cv2.imwrite(f'img_{si}.jpg', 255 * xi[0].cpu().numpy().transpose((1, 2, 0))[:, :, ::-1])  # save
            yi = self._descale_pred(yi, fi, si, img_size)
            y.append(yi)
        y = self._clip_augmented(y)  # clip augmented tails
        return torch.cat(y, -1), None  # augmented inference, train

    @staticmethod
    def _descale_pred(p, flips, scale, img_size, dim=1):
        # de-scale predictions following augmented inference (inverse operation)
        p[:, :4] /= scale  # de-scale
        x, y, wh, cls = p.split((1, 1, 2, p.shape[dim] - 4), dim)
        if flips == 2:
            y = img_size[0] - y  # de-flip ud
        elif flips == 3:
            x = img_size[1] - x  # de-flip lr
        return torch.cat((x, y, wh, cls), dim)

    def _clip_augmented(self, y):
        # Clip YOLOv8 augmented inference tails
        nl = self.model[-1].nl  # number of detection layers (P3-P5)
        g = sum(4 ** x for x in range(nl))  # grid points
        e = 1  # exclude layer count
        i = (y[0].shape[-1] // g) * sum(4 ** x for x in range(e))  # indices
        y[0] = y[0][..., :-i]  # large
        i = (y[-1].shape[-1] // g) * sum(4 ** (nl - 1 - x) for x in range(e))  # indices
        y[-1] = y[-1][..., i:]  # small
        return y
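
# The tail clipping above leans on a small piece of arithmetic: with nl strided
# outputs, per-level prediction counts form a geometric series in powers of 4.
# A hedged pure-Python sketch of that bookkeeping (the function name
# augmented_tail_sizes is hypothetical; the real method slices tensors in place):

```python
def augmented_tail_sizes(total, nl=3, e=1):
    """Compute how many predictions _clip_augmented removes from each end.

    total: number of predictions in one augmented output
    nl:    number of detection layers (P3-P5)
    e:     number of layers to exclude
    Returns (clip_large, clip_small): elements dropped from the tail of the
    first (up-scaled) output and the head of the last (down-scaled) output.
    """
    g = sum(4 ** x for x in range(nl))  # relative grid points, e.g. 1 + 4 + 16 = 21
    clip_large = (total // g) * sum(4 ** x for x in range(e))             # coarsest-level share
    clip_small = (total // g) * sum(4 ** (nl - 1 - x) for x in range(e))  # finest-level share
    return clip_large, clip_small
```

For a 640x640 input, the three levels contribute 80*80 + 40*40 + 20*20 = 8400 predictions, so one unit of the series is 8400 // 21 = 400.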

    def load(self, weights, verbose=True):
        csd = weights.float().state_dict()  # checkpoint state_dict as FP32
        csd = intersect_dicts(csd, self.state_dict())  # intersect
        self.load_state_dict(csd, strict=False)  # load
        if verbose:
            LOGGER.info(f'Transferred {len(csd)}/{len(self.model.state_dict())} items from pretrained weights')
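
# load() above filters the checkpoint through intersect_dicts() before the
# non-strict load_state_dict() call. A simplified sketch of that filter over
# plain shape mappings (intersect_shapes is a hypothetical stand-in; the real
# helper in yolo.utils.torch_utils compares tensor shapes directly):

```python
def intersect_shapes(ckpt_shapes, model_shapes):
    """Keep checkpoint entries whose key exists in the model with a matching
    shape, so only compatible weights are transferred.

    ckpt_shapes / model_shapes: mapping of parameter name -> shape tuple
    """
    return {k: s for k, s in ckpt_shapes.items() if model_shapes.get(k) == s}
```

Mismatched heads (e.g. a different class count) are silently skipped, which is what makes transfer learning from pretrained weights work.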


class SegmentationModel(DetectionModel):
    # YOLOv8 segmentation model
    def __init__(self, cfg='yolov8n-seg.yaml', ch=3, nc=None, verbose=True):
        super().__init__(cfg, ch, nc, verbose)


class ClassificationModel(BaseModel):
    # YOLOv8 classification model
    def __init__(self,
                 cfg=None,
                 model=None,
                 ch=3,
                 nc=1000,
                 cutoff=10,
                 verbose=True):  # yaml, model, number of classes, cutoff index
        super().__init__()
        self._from_detection_model(model, nc, cutoff) if model is not None else self._from_yaml(cfg, ch, nc, verbose)

    def _from_detection_model(self, model, nc=1000, cutoff=10):
        # Create a YOLOv8 classification model from a YOLOv8 detection model
        from nn.autobackend import AutoBackend
        if isinstance(model, AutoBackend):
            model = model.model  # unwrap DetectMultiBackend
        model.model = model.model[:cutoff]  # backbone
        m = model.model[-1]  # last layer
        ch = m.conv.in_channels if hasattr(m, 'conv') else m.cv1.conv.in_channels  # ch into module
        c = Classify(ch, nc)  # Classify()
        c.i, c.f, c.type = m.i, m.f, 'nn.modules.Classify'  # index, from, type
        model.model[-1] = c  # replace
        self.model = model.model
        self.stride = model.stride
        self.save = []
        self.nc = nc

    def _from_yaml(self, cfg, ch, nc, verbose):
        self.yaml = cfg if isinstance(cfg, dict) else yaml_load(check_yaml(cfg), append_filename=True)  # cfg dict
        # Define model
        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels
        if nc and nc != self.yaml['nc']:
            LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
            self.yaml['nc'] = nc  # override yaml value
        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch], verbose=verbose)  # model, savelist
        self.names = {i: f'{i}' for i in range(self.yaml['nc'])}  # default names dict
        self.info()

    def load(self, weights):
        model = weights["model"] if isinstance(weights, dict) else weights  # torchvision models are not dicts
        csd = model.float().state_dict()
        csd = intersect_dicts(csd, self.state_dict())  # intersect
        self.load_state_dict(csd, strict=False)  # load

    @staticmethod
    def reshape_outputs(model, nc):
        # Update a TorchVision classification model to class count 'nc' if required
        name, m = list((model.model if hasattr(model, 'model') else model).named_children())[-1]  # last module
        if isinstance(m, Classify):  # YOLO Classify() head
            if m.linear.out_features != nc:
                m.linear = nn.Linear(m.linear.in_features, nc)
        elif isinstance(m, nn.Linear):  # ResNet, EfficientNet
            if m.out_features != nc:
                setattr(model, name, nn.Linear(m.in_features, nc))
        elif isinstance(m, nn.Sequential):
            types = [type(x) for x in m]
            if nn.Linear in types:
                i = types.index(nn.Linear)  # nn.Linear index
                if m[i].out_features != nc:
                    m[i] = nn.Linear(m[i].in_features, nc)
            elif nn.Conv2d in types:
                i = types.index(nn.Conv2d)  # nn.Conv2d index
                if m[i].out_channels != nc:
                    m[i] = nn.Conv2d(m[i].in_channels, nc, m[i].kernel_size, m[i].stride, bias=m[i].bias is not None)


# Functions ------------------------------------------------------------------------------------------------------------


def attempt_load_weights(weights, device=None, inplace=True, fuse=False):
    # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
    from yolo.utils.downloads import attempt_download

    model = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        ckpt = torch.load(attempt_download(w), map_location='cpu')  # load
        args = {**DEFAULT_CONFIG_DICT, **ckpt['train_args']}  # combine model and default args, preferring model args
        ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float()  # FP32 model

        # Model compatibility updates
        ckpt.args = {k: v for k, v in args.items() if k in DEFAULT_CONFIG_KEYS}  # attach args to model
        ckpt.pt_path = w  # attach *.pt file path to model
        if not hasattr(ckpt, 'stride'):
            ckpt.stride = torch.tensor([32.])

        # Append
        model.append(ckpt.fuse().eval() if fuse and hasattr(ckpt, 'fuse') else ckpt.eval())  # model in eval mode

    # Module compatibility updates
    for m in model.modules():
        t = type(m)
        if t in (nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU, Detect, Segment):
            m.inplace = inplace  # torch 1.7.0 compatibility
        elif t is nn.Upsample and not hasattr(m, 'recompute_scale_factor'):
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility

    # Return model
    if len(model) == 1:
        return model[-1]

    # Return ensemble
    LOGGER.info(f'Ensemble created with {weights}\n')
    for k in 'names', 'nc', 'yaml':
        setattr(model, k, getattr(model[0], k))
    model.stride = model[torch.argmax(torch.tensor([m.stride.max() for m in model])).int()].stride  # max stride
    assert all(model[0].nc == m.nc for m in model), f'Models have different class counts: {[m.nc for m in model]}'
    return model


def attempt_load_one_weight(weight, device=None, inplace=True, fuse=False):
    # Loads a single model's weights
    from yolo.utils.downloads import attempt_download

    ckpt = torch.load(attempt_download(weight), map_location='cpu')  # load
    args = {**DEFAULT_CONFIG_DICT, **ckpt['train_args']}  # combine model and default args, preferring model args
    model = (ckpt.get('ema') or ckpt['model']).to(device).float()  # FP32 model

    # Model compatibility updates
    model.args = {k: v for k, v in args.items() if k in DEFAULT_CONFIG_KEYS}  # attach args to model
    model.pt_path = weight  # attach *.pt file path to model
    if not hasattr(model, 'stride'):
        model.stride = torch.tensor([32.])

    model = model.fuse().eval() if fuse and hasattr(model, 'fuse') else model.eval()  # model in eval mode

    # Module compatibility updates
    for m in model.modules():
        t = type(m)
        if t in (nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU, Detect, Segment):
            m.inplace = inplace  # torch 1.7.0 compatibility
        elif t is nn.Upsample and not hasattr(m, 'recompute_scale_factor'):
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility

    # Return model and ckpt
    return model, ckpt


def parse_model(d, ch, verbose=True):  # model_dict, input_channels(3)
    # Parse a YOLO model.yaml dictionary
    if verbose:
        LOGGER.info(f"\n{'':>3}{'from':>20}{'n':>3}{'params':>10}  {'module':<45}{'arguments':<30}")
    nc, gd, gw, act = d['nc'], d['depth_multiple'], d['width_multiple'], d.get('activation')
    if act:
        Conv.default_act = eval(act)  # redefine default activation, i.e. Conv.default_act = nn.SiLU()
        if verbose:
            LOGGER.info(f"{colorstr('activation:')} {act}")  # print

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
        m = eval(m) if isinstance(m, str) else m  # eval strings
        for j, a in enumerate(args):
            with contextlib.suppress(NameError):
                args[j] = eval(a) if isinstance(a, str) else a  # eval strings

        n = n_ = max(round(n * gd), 1) if n > 1 else n  # depth gain
        if m in {
                Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, Focus,
                BottleneckCSP, C1, C2, C2f, C3, C3TR, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x}:
            c1, c2 = ch[f], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(c2 * gw, 8)

            args = [c1, c2, *args[1:]]
            if m in {BottleneckCSP, C1, C2, C2f, C3, C3TR, C3Ghost, C3x}:
                args.insert(2, n)  # number of repeats
                n = 1
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum(ch[x] for x in f)
        elif m in {Detect, Segment}:
            args.append([ch[x] for x in f])
            if m is Segment:
                args[2] = make_divisible(args[2] * gw, 8)
        else:
            c2 = ch[f]

        m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
        t = str(m)[8:-2].replace('__main__.', '')  # module type
        m_.np = sum(x.numel() for x in m_.parameters())  # number params (attached per instance, not to the class)
        m_.i, m_.f, m_.type = i, f, t  # attach index, 'from' index, type
        if verbose:
            LOGGER.info(f'{i:>3}{str(f):>20}{n_:>3}{m_.np:10.0f}  {t:<45}{str(args):<30}')  # print
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        if i == 0:
            ch = []
        ch.append(c2)
    return nn.Sequential(*layers), sorted(save)
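
# parse_model() scales each module's output channels by the yaml's width_multiple
# and rounds to a multiple of 8 via make_divisible(). A pure-Python sketch of that
# rounding, assuming the common ceil-to-multiple definition (the real helper lives
# in yolo.utils.torch_utils and also handles tensor inputs):

```python
import math

def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor

# Width scaling for a layer declared with 256 output channels, using the
# width_multiple values from the models/v8 yamls: 0.25 (n), 0.50 (s), 1.00 (l).
scaled = {w: make_divisible(256 * w) for w in (0.25, 0.50, 1.00)}
```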


================================================
FILE: requirements.txt
================================================
# Ultralytics requirements
# Usage: pip install -r requirements.txt

# Base ----------------------------------------
hydra-core>=1.2.0
matplotlib>=3.2.2
numpy>=1.18.5
opencv-python>=4.1.1
Pillow>=7.1.2
PyYAML>=5.3.1
requests>=2.23.0
scipy>=1.4.1
torch>=1.7.0
torchvision>=0.8.1
tqdm>=4.64.0
ultralytics==8.0.0

# Logging -------------------------------------
tensorboard>=2.4.1
# clearml
# comet

# Tracking ------------------------------------
filterpy
scikit-image

# Plotting ------------------------------------
pandas>=1.1.4
seaborn>=0.11.0

# Export --------------------------------------
# coremltools>=6.0  # CoreML export
# onnx>=1.12.0  # ONNX export
# onnx-simplifier>=0.4.1  # ONNX simplifier
# nvidia-pyindex  # TensorRT export
# nvidia-tensorrt  # TensorRT export
# scikit-learn==0.19.2  # CoreML quantization
# tensorflow>=2.4.1  # TF exports (-cpu, -aarch64, -macos)
# tensorflowjs>=3.9.0  # TF.js export
# openvino-dev  # OpenVINO export

# Extras --------------------------------------
ipython  # interactive notebook
psutil  # system utilization
thop>=0.1.1  # FLOPs computation
# albumentations>=1.0.3
# pycocotools>=2.0.6  # COCO mAP
# roboflow

# HUB -----------------------------------------
GitPython>=3.1.24


================================================
FILE: yolo/cli.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import shutil
from pathlib import Path

import hydra

import hub, yolo
from yolo.utils import DEFAULT_CONFIG, LOGGER, colorstr

DIR = Path(__file__).parent


@hydra.main(version_base=None, config_path=str(DEFAULT_CONFIG.parent.relative_to(DIR)), config_name=DEFAULT_CONFIG.name)
def cli(cfg):
    """
    Run a specified task and mode with the given configuration.

    Args:
        cfg (DictConfig): Configuration for the task and mode.
    """
    # LOGGER.info(f"{colorstr(f'Ultralytics YOLO v{ultralytics.__version__}')}")
    task, mode = cfg.task.lower(), cfg.mode.lower()

    # Special case for initializing the configuration
    if task == "init":
        shutil.copy2(DEFAULT_CONFIG, Path.cwd())
        LOGGER.info(f"""
        {colorstr("YOLO:")} configuration saved to {Path.cwd() / DEFAULT_CONFIG.name}.
        To run experiments using custom configuration:
        yolo task='task' mode='mode' --config-name config_file.yaml
                    """)
        return

    # Mapping from task to module
    task_module_map = {"detect": yolo.v8.detect, "segment": yolo.v8.segment, "classify": yolo.v8.classify}
    module = task_module_map.get(task)
    if not module:
        raise SyntaxError(f"task not recognized. Choices are {', '.join(task_module_map.keys())}")

    # Mapping from mode to function
    mode_func_map = {
        "train": module.train,
        "val": module.val,
        "predict": module.predict,
        "export": yolo.engine.exporter.export,
        "checks": hub.checks}
    func = mode_func_map.get(mode)
    if not func:
        raise SyntaxError(f"mode not recognized. Choices are {', '.join(mode_func_map.keys())}")

    func(cfg)


================================================
FILE: yolo/configs/__init__.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

from pathlib import Path
from typing import Dict, Union

from omegaconf import DictConfig, OmegaConf

from yolo.configs.hydra_patch import check_config_mismatch


def get_config(config: Union[str, DictConfig], overrides: Union[str, Dict] = None):
    """
    Load and merge configuration data from a file or dictionary.

    Args:
        config (Union[str, DictConfig]): Configuration data in the form of a file name or a DictConfig object.
        overrides (Union[str, Dict], optional): Overrides in the form of a file name or a dictionary. Default is None.

    Returns:
        DictConfig: The merged configuration.
    """
    if overrides is None:
        overrides = {}
    if isinstance(config, (str, Path)):
        config = OmegaConf.load(config)
    elif isinstance(config, Dict):
        config = OmegaConf.create(config)
    # override
    if isinstance(overrides, str):
        overrides = OmegaConf.load(overrides)
    elif isinstance(overrides, Dict):
        overrides = OmegaConf.create(overrides)

    check_config_mismatch(dict(overrides).keys(), dict(config).keys())

    return OmegaConf.merge(config, overrides)
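
# A minimal plain-dict sketch of what get_config() does: load a base config,
# reject override keys that the config does not know about (the job of
# check_config_mismatch), then merge with override keys winning. merge_config
# is a hypothetical stand-in; the real function goes through OmegaConf.

```python
def merge_config(config, overrides=None):
    """Merge override keys into a base config, rejecting unknown keys
    (a simplified stand-in for check_config_mismatch + OmegaConf.merge)."""
    overrides = overrides or {}
    unknown = [k for k in overrides if k not in config]
    if unknown:
        raise KeyError(f"unknown config keys: {unknown}")
    return {**config, **overrides}  # later keys win, as with OmegaConf.merge
```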


================================================
FILE: yolo/configs/default.yaml
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license
# Default training settings and hyperparameters for medium-augmentation COCO training

task: "detect" # choices=['detect', 'segment', 'classify', 'init'] # init is a special case. Specify task to run.
mode: "train" # choices=['train', 'val', 'predict'] # mode to run task in.

# Train settings -------------------------------------------------------------------------------------------------------
model: null # i.e. yolov8n.pt, yolov8n.yaml. Path to model file
data: null # i.e. coco128.yaml. Path to data file
epochs: 100 # number of epochs to train for
patience: 50  # TODO: epochs to wait for no observable improvement for early stopping of training
batch: 16 # number of images per batch
imgsz: 640 # size of input images
save: True # save checkpoints
cache: False # True/ram, disk or False. Use cache for data loading
device: null # cuda device, i.e. 0 or 0,1,2,3 or cpu. Device to run on
workers: 8 # number of worker threads for data loading
project: null # project name
name: null # experiment name
exist_ok: False # whether to overwrite existing experiment
pretrained: False # whether to use a pretrained model
optimizer: 'SGD' # optimizer to use, choices=['SGD', 'Adam', 'AdamW', 'RMSProp']
verbose: False # whether to print verbose output
seed: 0 # random seed for reproducibility
deterministic: True # whether to enable deterministic mode
single_cls: False # train multi-class data as single-class
image_weights: False # use weighted image selection for training
rect: False # support rectangular training
cos_lr: False # use cosine learning rate scheduler
close_mosaic: 10 # disable mosaic augmentation for final 10 epochs
resume: False # resume training from last checkpoint
# Segmentation
overlap_mask: True # masks should overlap during training
mask_ratio: 4 # mask downsample ratio
# Classification
dropout: 0.0  # dropout regularization fraction (classify train only)

# Val/Test settings ----------------------------------------------------------------------------------------------------
val: True # validate/test during training
save_json: False # save results to JSON file
save_hybrid: False # save hybrid version of labels (labels + additional predictions)
conf: null # object confidence threshold for detection (default 0.25 predict, 0.001 val)
iou: 0.7 # intersection over union (IoU) threshold for NMS
max_det: 300 # maximum number of detections per image
half: False # use half precision (FP16)
dnn: False # use OpenCV DNN for ONNX inference
plots: True # show plots during training

# Prediction settings --------------------------------------------------------------------------------------------------
source: null # source directory for images or videos
show: False # show results if possible
save_txt: False # save results as .txt file
save_conf: False # save results with confidence scores
save_crop: False # save cropped images with results
hide_labels: False # hide labels
hide_conf: True # hide confidence scores
vid_stride: 1 # video frame-rate stride
line_thickness: 3 # bounding box thickness (pixels)
visualize: False # visualize results
augment: False # apply test-time augmentation to prediction sources
agnostic_nms: False # class-agnostic NMS
retina_masks: False # use high-resolution segmentation masks

# Export settings ------------------------------------------------------------------------------------------------------
format: torchscript # format to export to
keras: False  # use Keras
optimize: False  # TorchScript: optimize for mobile
int8: False  # CoreML/TF INT8 quantization
dynamic: False  # ONNX/TF/TensorRT: dynamic axes
simplify: False  # ONNX: simplify model
opset: 17  # ONNX: opset version
workspace: 4  # TensorRT: workspace size (GB)
nms: False  # CoreML: add NMS

# Hyperparameters ------------------------------------------------------------------------------------------------------
lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.01  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 7.5  # box loss gain
cls: 0.5  # cls loss gain (scale with pixels)
dfl: 1.5  # dfl loss gain
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
label_smoothing: 0.0  # label smoothing (fraction)
nbs: 64  # nominal batch size
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.0  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.0  # image mixup (probability)
copy_paste: 0.0  # segment copy-paste (probability)

# Hydra configs --------------------------------------------------------------------------------------------------------
hydra:
  output_subdir: null  # disable hydra directory creation
  run:
    dir: .

# Debug, do not modify -------------------------------------------------------------------------------------------------
v5loader: False  # use legacy YOLOv5 dataloader
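As a worked example of how `lr0` and `lrf` above interact, assuming a linear decay schedule in the YOLOv5 style (the actual scheduler is defined in the trainer code, not in this YAML file, so the formula here is an illustrative assumption):

```python
# Hedged sketch: how lr0 and lrf combine under an assumed linear decay
# schedule. The learning rate falls from lr0 at epoch 0 to lr0 * lrf at
# the final epoch.
lr0 = 0.01    # initial learning rate (from the config above)
lrf = 0.01    # final LR fraction: final lr = lr0 * lrf
epochs = 100  # illustrative training length

def lr_at(epoch):
    # linear interpolation from lr0 down to lr0 * lrf over `epochs`
    return lr0 * ((1 - epoch / epochs) * (1.0 - lrf) + lrf)

start_lr = lr_at(0)       # lr0, i.e. 0.01
final_lr = lr_at(epochs)  # lr0 * lrf, i.e. about 1e-4
```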


================================================
FILE: yolo/configs/hydra_patch.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import sys
from difflib import get_close_matches
from textwrap import dedent

import hydra
from hydra.errors import ConfigCompositionException
from omegaconf import OmegaConf, open_dict  # noqa
from omegaconf.errors import ConfigAttributeError, ConfigKeyError, OmegaConfBaseException  # noqa

from yolo.utils import LOGGER, colorstr


def override_config(overrides, cfg):
    override_keys = [override.key_or_group for override in overrides]
    check_config_mismatch(override_keys, cfg.keys())
    for override in overrides:
        if override.package is not None:
            raise ConfigCompositionException(f"Override {override.input_line} looks like a config group"
                                             f" override, but config group '{override.key_or_group}' does not exist.")

        key = override.key_or_group
        value = override.value()
        try:
            if override.is_delete():
                config_val = OmegaConf.select(cfg, key, throw_on_missing=False)
                if config_val is None:
                    raise ConfigCompositionException(f"Could not delete from config. '{override.key_or_group}'"
                                                     " does not exist.")
                elif value is not None and value != config_val:
                    raise ConfigCompositionException("Could not delete from config. The value of"
                                                     f" '{override.key_or_group}' is {config_val} and not"
                                                     f" {value}.")

                last_dot = key.rfind(".")
                with open_dict(cfg):
                    if last_dot == -1:
                        del cfg[key]
                    else:
                        node = OmegaConf.select(cfg, key[:last_dot])
                        del node[key[last_dot + 1:]]

            elif override.is_add():
                if OmegaConf.select(cfg, key, throw_on_missing=False) is None or isinstance(value, (dict, list)):
                    OmegaConf.update(cfg, key, value, merge=True, force_add=True)
                else:
                    assert override.input_line is not None
                    raise ConfigCompositionException(
                        dedent(f"""\
                    Could not append to config. An item is already at '{override.key_or_group}'.
                    Either remove + prefix: '{override.input_line[1:]}'
                    Or add a second + to add or override '{override.key_or_group}': '+{override.input_line}'
                    """))
            elif override.is_force_add():
                OmegaConf.update(cfg, key, value, merge=True, force_add=True)
            else:
                try:
                    OmegaConf.update(cfg, key, value, merge=True)
                except (ConfigAttributeError, ConfigKeyError) as ex:
                    raise ConfigCompositionException(f"Could not override '{override.key_or_group}'."
                                                     f"\nTo append to your config use +{override.input_line}") from ex
        except OmegaConfBaseException as ex:
            raise ConfigCompositionException(f"Error merging override {override.input_line}").with_traceback(
                sys.exc_info()[2]) from ex


def check_config_mismatch(overrides, cfg):
    mismatched = [option for option in overrides if option not in cfg and 'hydra.' not in option]

    for option in mismatched:
        LOGGER.info(f"{colorstr(option)} is not a valid key. Similar keys: {get_close_matches(option, cfg, 3, 0.6)}")
    if mismatched:
        sys.exit()


hydra._internal.config_loader_impl.ConfigLoaderImpl._apply_overrides_to_config = override_config
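`check_config_mismatch` above relies on `difflib.get_close_matches` to suggest near-miss keys for a mistyped override. A minimal standalone sketch of that behaviour (the key names are illustrative, not the full config):

```python
# Minimal sketch of the key-suggestion logic in check_config_mismatch:
# get_close_matches returns up to n candidates whose similarity ratio to
# the query is at least `cutoff` (0.6, the same threshold used above).
from difflib import get_close_matches

valid_keys = ["epochs", "batch", "imgsz", "lr0"]  # illustrative config keys
typo = "epcohs"

suggestions = get_close_matches(typo, valid_keys, n=3, cutoff=0.6)
# "epcohs" is similar enough to "epochs" to be suggested; "batch" is not
```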


================================================
FILE: yolo/data/__init__.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

from .base import BaseDataset
from .build import build_classification_dataloader, build_dataloader
from .dataset import ClassificationDataset, SemanticDataset, YOLODataset
from .dataset_wrappers import MixAndRectDataset


================================================
FILE: yolo/data/augment.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import math
import random
from copy import deepcopy

import cv2
import numpy as np
import torch
import torchvision.transforms as T

from ..utils import LOGGER, colorstr
from ..utils.checks import check_version
from ..utils.instance import Instances
from ..utils.metrics import bbox_ioa
from ..utils.ops import segment2box
from .utils import IMAGENET_MEAN, IMAGENET_STD, polygons2masks, polygons2masks_overlap


# TODO: we might need a BaseTransform to make all these augmentations compatible with both classification and semantic segmentation
class BaseTransform:

    def __init__(self) -> None:
        pass

    def apply_image(self, labels):
        pass

    def apply_instances(self, labels):
        pass

    def apply_semantic(self, labels):
        pass

    def __call__(self, labels):
        self.apply_image(labels)
        self.apply_instances(labels)
        self.apply_semantic(labels)


class Compose:

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, data):
        for t in self.transforms:
            data = t(data)
        return data

    def append(self, transform):
        self.transforms.append(transform)

    def tolist(self):
        return self.transforms

    def __repr__(self):
        format_string = f"{self.__class__.__name__}("
        for t in self.transforms:
            format_string += "\n"
            format_string += f"    {t}"
        format_string += "\n)"
        return format_string
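The `Compose` class above simply chains callables, each receiving the previous one's output. A minimal standalone equivalent of that loop (the lambdas are placeholders for real transforms):

```python
# Standalone equivalent of Compose.__call__ above: transforms are applied
# in order, and each one receives the previous transform's output.
transforms = [lambda x: x + 1, lambda x: x * 2]  # placeholder transforms

def compose(data, transforms):
    for t in transforms:
        data = t(data)
    return data

result = compose(3, transforms)  # (3 + 1) * 2 -> 8
```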


class BaseMixTransform:
    """This implementation is from mmyolo"""

    def __init__(self, dataset, pre_transform=None, p=0.0) -> None:
        self.dataset = dataset
        self.pre_transform = pre_transform
        self.p = p

    def __call__(self, labels):
        if random.uniform(0, 1) > self.p:
            return labels

        # get index of one or three other images
        indexes = self.get_indexes()
        if isinstance(indexes, int):
            indexes = [indexes]

        # get image information that will be used for Mosaic or MixUp
        mix_labels = [self.dataset.get_label_info(i) for i in indexes]

        if self.pre_transform is not None:
            for i, data in enumerate(mix_labels):
                mix_labels[i] = self.pre_transform(data)
        labels["mix_labels"] = mix_labels

        # Mosaic or MixUp
        labels = self._mix_transform(labels)
        labels.pop("mix_labels", None)
        return labels

    def _mix_transform(self, labels):
        raise NotImplementedError

    def get_indexes(self):
        raise NotImplementedError


class Mosaic(BaseMixTransform):
    """Mosaic augmentation.
    Args:
        imgsz (Sequence[int]): Image size after mosaic pipeline of single
            image. The shape order should be (height, width).
            Default to (640, 640).
    """

    def __init__(self, dataset, imgsz=640, p=1.0, border=(0, 0)):
        assert 0 <= p <= 1.0, f"The probability should be in range [0, 1], but got {p}."
        super().__init__(dataset=dataset, p=p)
        self.dataset = dataset
        self.imgsz = imgsz
        self.border = border

    def get_indexes(self):
        return [random.randint(0, len(self.dataset) - 1) for _ in range(3)]

    def _mix_transform(self, labels):
        mosaic_labels = []
        assert labels.get("rect_shape", None) is None, "rect and mosaic are mutually exclusive."
        assert len(labels.get("mix_labels", [])) > 0, "There are no other images for mosaic augment."
        s = self.imgsz
        yc, xc = (int(random.uniform(-x, 2 * s + x)) for x in self.border)  # mosaic center x, y
        for i in range(4):
            labels_patch = (labels if i == 0 else labels["mix_labels"][i - 1]).copy()
            # Load image
            img = labels_patch["img"]
            h, w = labels_patch["resized_shape"]

            # place img in img4
            if i == 0:  # top left
                img4 = np.full((s * 2, s * 2, img.shape[2]), 114, dtype=np.uint8)  # base image with 4 tiles
                x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc  # xmin, ymin, xmax, ymax (large image)
                x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h  # xmin, ymin, xmax, ymax (small image)
            elif i == 1:  # top right
                x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc
                x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
            elif i == 2:  # bottom left
                x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h)
                x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
            elif i == 3:  # bottom right
                x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h)
                x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)

            img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]  # img4[ymin:ymax, xmin:xmax]
            padw = x1a - x1b
            padh = y1a - y1b

            labels_patch = self._update_labels(labels_patch, padw, padh)
            mosaic_labels.append(labels_patch)
        final_labels = self._cat_labels(mosaic_labels)
        final_labels["img"] = img4
        return final_labels

    def _update_labels(self, labels, padw, padh):
        """Update labels"""
        nh, nw = labels["img"].shape[:2]
        labels["instances"].convert_bbox(format="xyxy")
        labels["instances"].denormalize(nw, nh)
        labels["instances"].add_padding(padw, padh)
        return labels

    def _cat_labels(self, mosaic_labels):
        if len(mosaic_labels) == 0:
            return {}
        cls = []
        instances = []
        for labels in mosaic_labels:
            cls.append(labels["cls"])
            instances.append(labels["instances"])
        final_labels = {
            "ori_shape": mosaic_labels[0]["ori_shape"],
            "resized_shape": (self.imgsz * 2, self.imgsz * 2),
            "im_file": mosaic_labels[0]["im_file"],
            "cls": np.concatenate(cls, 0),
            "instances": Instances.concatenate(instances, axis=0)}
        final_labels["instances"].clip(self.imgsz * 2, self.imgsz * 2)
        return final_labels


class MixUp(BaseMixTransform):

    def __init__(self, dataset, pre_transform=None, p=0.0) -> None:
        super().__init__(dataset=dataset, pre_transform=pre_transform, p=p)

    def get_indexes(self):
        return random.randint(0, len(self.dataset) - 1)

    def _mix_transform(self, labels):
        # Applies MixUp augmentation https://arxiv.org/pdf/1710.09412.pdf
        r = np.random.beta(32.0, 32.0)  # mixup ratio, alpha=beta=32.0
        labels2 = labels["mix_labels"][0]
        labels["img"] = (labels["img"] * r + labels2["img"] * (1 - r)).astype(np.uint8)
        labels["instances"] = Instances.concatenate([labels["instances"], labels2["instances"]], axis=0)
        labels["cls"] = np.concatenate([labels["cls"], labels2["cls"]], 0)
        return labels
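The image blend inside `_mix_transform` above can be reproduced in isolation. A hedged numpy sketch, with random arrays standing in for dataset samples:

```python
# Standalone sketch of the MixUp image blend used in _mix_transform above:
# two images are combined with a ratio drawn from Beta(32, 32), which
# concentrates near 0.5, so the result is roughly an even blend.
import numpy as np

rng = np.random.default_rng(0)
img1 = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
img2 = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

r = np.random.beta(32.0, 32.0)  # mixup ratio, alpha = beta = 32.0
mixed = (img1 * r + img2 * (1 - r)).astype(np.uint8)
```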


class RandomPerspective:

    def __init__(self, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, border=(0, 0)):
        self.degrees = degrees
        self.translate = translate
        self.scale = scale
        self.shear = shear
        self.perspective = perspective
        # mosaic border
        self.border = border

    def affine_transform(self, img):
        # Center
        C = np.eye(3)

        C[0, 2] = -img.shape[1] / 2  # x translation (pixels)
        C[1, 2] = -img.shape[0] / 2  # y translation (pixels)

        # Perspective
        P = np.eye(3)
        P[2, 0] = random.uniform(-self.perspective, self.perspective)  # x perspective (about y)
        P[2, 1] = random.uniform(-self.perspective, self.perspective)  # y perspective (about x)

        # Rotation and Scale
        R = np.eye(3)
        a = random.uniform(-self.degrees, self.degrees)
        # a += random.choice([-180, -90, 0, 90])  # add 90deg rotations to small rotations
        s = random.uniform(1 - self.scale, 1 + self.scale)
        # s = 2 ** random.uniform(-scale, scale)
        R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)

        # Shear
        S = np.eye(3)
        S[0, 1] = math.tan(random.uniform(-self.shear, self.shear) * math.pi / 180)  # x shear (deg)
        S[1, 0] = math.tan(random.uniform(-self.shear, self.shear) * math.pi / 180)  # y shear (deg)

        # Translation
        T = np.eye(3)
        T[0, 2] = random.uniform(0.5 - self.translate, 0.5 + self.translate) * self.size[0]  # x translation (pixels)
        T[1, 2] = random.uniform(0.5 - self.translate, 0.5 + self.translate) * self.size[1]  # y translation (pixels)

        # Combined rotation matrix
        M = T @ S @ R @ P @ C  # order of operations (right to left) is IMPORTANT
        # affine image
        if (self.border[0] != 0) or (self.border[1] != 0) or (M != np.eye(3)).any():  # image changed
            if self.perspective:
                img = cv2.warpPerspective(img, M, dsize=self.size, borderValue=(114, 114, 114))
            else:  # affine
                img = cv2.warpAffine(img, M[:2], dsize=self.size, borderValue=(114, 114, 114))
        return img, M, s

    def apply_bboxes(self, bboxes, M):
        """apply affine to bboxes only.

        Args:
            bboxes(ndarray): list of bboxes, xyxy format, with shape (num_bboxes, 4).
            M(ndarray): affine matrix.
        Returns:
            new_bboxes(ndarray): bboxes after affine, [num_bboxes, 4].
        """
        n = len(bboxes)
        if n == 0:
            return bboxes

        xy = np.ones((n * 4, 3))
        xy[:, :2] = bboxes[:, [0, 1, 2, 3, 0, 3, 2, 1]].reshape(n * 4, 2)  # x1y1, x2y2, x1y2, x2y1
        xy = xy @ M.T  # transform
        xy = (xy[:, :2] / xy[:, 2:3] if self.perspective else xy[:, :2]).reshape(n, 8)  # perspective rescale or affine

        # create new boxes
        x = xy[:, [0, 2, 4, 6]]
        y = xy[:, [1, 3, 5, 7]]
        return np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T

    def apply_segments(self, segments, M):
        """apply affine to segments and generate new bboxes from segments.

        Args:
            segments(ndarray): list of segments, [num_samples, 500, 2].
            M(ndarray): affine matrix.
        Returns:
            new_segments(ndarray): list of segments after affine, [num_samples, 500, 2].
            new_bboxes(ndarray): bboxes after affine, [N, 4].
        """
        n, num = segments.shape[:2]
        if n == 0:
            return [], segments

        xy = np.ones((n * num, 3))
        segments = segments.reshape(-1, 2)
        xy[:, :2] = segments
        xy = xy @ M.T  # transform
        xy = xy[:, :2] / xy[:, 2:3]
        segments = xy.reshape(n, -1, 2)
        bboxes = np.stack([segment2box(xy, self.size[0], self.size[1]) for xy in segments], 0)
        return bboxes, segments

    def apply_keypoints(self, keypoints, M):
        """apply affine to keypoints.

        Args:
            keypoints(ndarray): keypoints, [N, 17, 2].
            M(ndarray): affine matrix.
        Return:
            new_keypoints(ndarray): keypoints after affine, [N, 17, 2].
        """
        n = len(keypoints)
        if n == 0:
            return keypoints
        new_keypoints = np.ones((n * 17, 3))
        new_keypoints[:, :2] = keypoints.reshape(n * 17, 2)  # num_kpt is hardcoded to 17
        new_keypoints = new_keypoints @ M.T  # transform
        new_keypoints = (new_keypoints[:, :2] / new_keypoints[:, 2:3]).reshape(n, 34)  # perspective rescale or affine
        new_keypoints[keypoints.reshape(-1, 34) == 0] = 0
        x_kpts = new_keypoints[:, list(range(0, 34, 2))]
        y_kpts = new_keypoints[:, list(range(1, 34, 2))]

        # compute the mask once so zeroing x_kpts does not change the mask applied to y_kpts
        out_of_bounds = np.logical_or.reduce((x_kpts < 0, x_kpts > self.size[0], y_kpts < 0, y_kpts > self.size[1]))
        x_kpts[out_of_bounds] = 0
        y_kpts[out_of_bounds] = 0
        new_keypoints[:, list(range(0, 34, 2))] = x_kpts
        new_keypoints[:, list(range(1, 34, 2))] = y_kpts
        return new_keypoints.reshape(n, 17, 2)

    def __call__(self, labels):
        """
        Affine images and targets.

        Args:
            labels(Dict): a dict of `bboxes`, `segments`, `keypoints`.
        """
        img = labels["img"]
        cls = labels["cls"]
        instances = labels.pop("instances")
        # make sure the coord formats are right
        instances.convert_bbox(format="xyxy")
        instances.denormalize(*img.shape[:2][::-1])

        self.size = img.shape[1] + self.border[1] * 2, img.shape[0] + self.border[0] * 2  # w, h
        # M is affine matrix
        # scale for func:`box_candidates`
        img, M, scale = self.affine_transform(img)

        bboxes = self.apply_bboxes(instances.bboxes, M)

        segments = instances.segments
        keypoints = instances.keypoints
        # update bboxes if there are segments.
        if len(segments):
            bboxes, segments = self.apply_segments(segments, M)

        if keypoints is not None:
            keypoints = self.apply_keypoints(keypoints, M)
        new_instances = Instances(bboxes, segments, keypoints, bbox_format="xyxy", normalized=False)
        # clip
        new_instances.clip(*self.size)

        # filter instances
        instances.scale(scale_w=scale, scale_h=scale, bbox_only=True)
        # make the bboxes have the same scale as new_bboxes
        i = self.box_candidates(box1=instances.bboxes.T,
                                box2=new_instances.bboxes.T,
                                area_thr=0.01 if len(segments) else 0.10)
        labels["instances"] = new_instances[i]
        labels["cls"] = cls[i]
        labels["img"] = img
        labels["resized_shape"] = img.shape[:2]
        return labels

    def box_candidates(self, box1, box2, wh_thr=2, ar_thr=100, area_thr=0.1, eps=1e-16):  # box1(4,n), box2(4,n)
        # Compute box candidates: box1 before augment, box2 after augment, wh_thr (pixels), aspect_ratio_thr, area_ratio
        w1, h1 = box1[2] - box1[0], box1[3] - box1[1]
        w2, h2 = box2[2] - box2[0], box2[3] - box2[1]
        ar = np.maximum(w2 / (h2 + eps), h2 / (w2 + eps))  # aspect ratio
        return (w2 > wh_thr) & (h2 > wh_thr) & (w2 * h2 / (w1 * h1 + eps) > area_thr) & (ar < ar_thr)  # candidates
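The corner-point trick in `apply_bboxes` above can be checked outside the class. A hedged sketch with an identity transform, so the boxes must come back unchanged:

```python
# Standalone check of the corner-point trick used in apply_bboxes: each
# xyxy box is expanded to its 4 corners, pushed through a 3x3 matrix, and
# re-boxed with per-box min/max. With M = identity the boxes are unchanged.
import numpy as np

def transform_boxes(bboxes, M):
    n = len(bboxes)
    xy = np.ones((n * 4, 3))
    xy[:, :2] = bboxes[:, [0, 1, 2, 3, 0, 3, 2, 1]].reshape(n * 4, 2)  # 4 corners per box
    xy = (xy @ M.T)[:, :2].reshape(n, 8)  # affine only (no perspective divide)
    x, y = xy[:, [0, 2, 4, 6]], xy[:, [1, 3, 5, 7]]
    return np.concatenate((x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T

boxes = np.array([[10.0, 20.0, 110.0, 220.0]])
out = transform_boxes(boxes, np.eye(3))  # identity -> boxes unchanged
```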


class RandomHSV:

    def __init__(self, hgain=0.5, sgain=0.5, vgain=0.5) -> None:
        self.hgain = hgain
        self.sgain = sgain
        self.vgain = vgain

    def __call__(self, labels):
        img = labels["img"]
        if self.hgain or self.sgain or self.vgain:
            r = np.random.uniform(-1, 1, 3) * [self.hgain, self.sgain, self.vgain] + 1  # random gains
            hue, sat, val = cv2.split(cv2.cvtColor(img, cv2.COLOR_BGR2HSV))
            dtype = img.dtype  # uint8

            x = np.arange(0, 256, dtype=r.dtype)
            lut_hue = ((x * r[0]) % 180).astype(dtype)
            lut_sat = np.clip(x * r[1], 0, 255).astype(dtype)
            lut_val = np.clip(x * r[2], 0, 255).astype(dtype)

            im_hsv = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val)))
            cv2.cvtColor(im_hsv, cv2.COLOR_HSV2BGR, dst=img)  # no return needed
        return labels
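The lookup tables built in `RandomHSV.__call__` above can be checked in isolation; note that OpenCV stores 8-bit hue in [0, 180), so gained hue values wrap modulo 180 while saturation and value are clipped. A sketch with illustrative gains:

```python
# Standalone sketch of the LUT construction in RandomHSV.__call__ above:
# hue wraps mod 180 (OpenCV's 8-bit hue range); saturation and value are
# clipped to [0, 255]. The gains here are illustrative, not sampled.
import numpy as np

r = np.array([1.5, 1.2, 0.8])  # example gains for h, s, v
x = np.arange(0, 256, dtype=r.dtype)
lut_hue = ((x * r[0]) % 180).astype(np.uint8)       # 120 * 1.5 = 180 -> wraps to 0
lut_sat = np.clip(x * r[1], 0, 255).astype(np.uint8)  # 255 * 1.2 clips to 255
lut_val = np.clip(x * r[2], 0, 255).astype(np.uint8)
```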


class RandomFlip:

    def __init__(self, p=0.5, direction="horizontal") -> None:
        assert direction in ["horizontal", "vertical"], f"Support direction `horizontal` or `vertical`, got {direction}"
        assert 0 <= p <= 1.0

        self.p = p
        self.direction = direction

    def __call__(self, labels):
        img = labels["img"]
        instances = labels.pop("instances")
        instances.convert_bbox(format="xywh")
        h, w = img.shape[:2]
        h = 1 if instances.normalized else h
        w = 1 if instances.normalized else w

        # Flip up-down
        if self.direction == "vertical" and random.random() < self.p:
            img = np.flipud(img)
            instances.flipud(h)
        if self.direction == "horizontal" and random.random() < self.p:
            img = np.fliplr(img)
            instances.fliplr(w)
        labels["img"] = np.ascontiguousarray(img)
        labels["instances"] = instances
        return labels


class LetterBox:
    """Resize image and padding for detection, instance segmentation, pose"""

    def __init__(self, new_shape=(640, 640), auto=False, scaleFill=False, scaleup=True, stride=32):
        self.new_shape = new_shape
        self.auto = auto
        self.scaleFill = scaleFill
        self.scaleup = scaleup
        self.stride = stride

    def __call__(self, labels=None, image=None):
        if labels is None:
            labels = {}
        img = labels.get("img") if image is None else image
        shape = img.shape[:2]  # current shape [height, width]
        new_shape = labels.pop("rect_shape", self.new_shape)
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        # Scale ratio (new / old)
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not self.scaleup:  # only scale down, do not scale up (for better val mAP)
            r = min(r, 1.0)

        # Compute padding
        ratio = r, r  # width, height ratios
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
        if self.auto:  # minimum rectangle
            dw, dh = np.mod(dw, self.stride), np.mod(dh, self.stride)  # wh padding
        elif self.scaleFill:  # stretch
            dw, dh = 0.0, 0.0
            new_unpad = (new_shape[1], new_shape[0])
            ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

        dw /= 2  # divide padding into 2 sides
        dh /= 2
        if labels.get("ratio_pad"):
            labels["ratio_pad"] = (labels["ratio_pad"], (dw, dh))  # for evaluation

        if shape[::-1] != new_unpad:  # resize
            img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT,
                                 value=(114, 114, 114))  # add border

        if len(labels):
            labels = self._update_labels(labels, ratio, dw, dh)
            labels["img"] = img
            labels["resized_shape"] = new_shape
            return labels
        else:
            return img

    def _update_labels(self, labels, ratio, padw, padh):
        """Update labels"""
        labels["instances"].convert_bbox(format="xyxy")
        labels["instances"].denormalize(*labels["img"].shape[:2][::-1])
        labels["instances"].scale(*ratio)
        labels["instances"].add_padding(padw, padh)
        return labels
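The scale-and-pad arithmetic in `LetterBox.__call__` above, worked through for a 720x1280 frame going into a 640x640 canvas (with `auto=False` and `scaleup=True`):

```python
# Worked example of the LetterBox arithmetic above: a 720x1280 frame is
# scaled by r = 0.5 to 360x640, then padded with 140 px on top and bottom
# to fill the 640x640 canvas.
shape = (720, 1280)  # current (height, width)
new_shape = (640, 640)

r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])           # 0.5
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))      # (640, 360) as (w, h)
dw = new_shape[1] - new_unpad[0]  # 0:   total width padding
dh = new_shape[0] - new_unpad[1]  # 280: total height padding
dw, dh = dw / 2, dh / 2           # split evenly between both sides
```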


class CopyPaste:

    def __init__(self, p=0.5) -> None:
        self.p = p

    def __call__(self, labels):
        # Implement Copy-Paste augmentation https://arxiv.org/abs/2012.07177, labels as nx5 np.array(cls, xyxy)
        im = labels["img"]
        cls = labels["cls"]
        instances = labels.pop("instances")
        instances.convert_bbox(format="xyxy")
        if self.p and len(instances.segments):
            n = len(instances)
            _, w, _ = im.shape  # height, width, channels
            im_new = np.zeros(im.shape, np.uint8)

            # calculate ioa first then select indexes randomly
            ins_flip = deepcopy(instances)
            ins_flip.fliplr(w)

            ioa = bbox_ioa(ins_flip.bboxes, instances.bboxes)  # intersection over area, (N, M)
            indexes = np.nonzero((ioa < 0.30).all(1))[0]  # (N, )
            n = len(indexes)
            for j in random.sample(list(indexes), k=round(self.p * n)):
                cls = np.concatenate((cls, cls[[j]]), axis=0)
                instances = Instances.concatenate((instances, ins_flip[[j]]), axis=0)
                cv2.drawContours(im_new, instances.segments[[j]].astype(np.int32), -1, (1, 1, 1), cv2.FILLED)

            result = cv2.flip(im, 1)  # augment segments (flip left-right)
            i = cv2.flip(im_new, 1).astype(bool)
            im[i] = result[i]  # cv2.imwrite('debug.jpg', im)  # debug

        labels["img"] = im
        labels["cls"] = cls
        labels["instances"] = instances
        return labels


class Albumentations:
    # YOLOv5 Albumentations class (optional, only used if package is installed)
    def __init__(self, p=1.0):
        self.p = p
        self.transform = None
        prefix = colorstr("albumentations: ")
        try:
            import albumentations as A

            check_version(A.__version__, "1.0.3", hard=True)  # version requirement

            T = [
                A.Blur(p=0.01),
                A.MedianBlur(p=0.01),
                A.ToGray(p=0.01),
                A.CLAHE(p=0.01),
                A.RandomBrightnessContrast(p=0.0),
                A.RandomGamma(p=0.0),
                A.ImageCompression(quality_lower=75, p=0.0),]  # transforms
            self.transform = A.Compose(T, bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]))

            LOGGER.info(prefix + ", ".join(f"{x}".replace("always_apply=False, ", "") for x in T if x.p))
        except ImportError:  # package not installed, skip
            pass
        except Exception as e:
            LOGGER.info(f"{prefix}{e}")

    def __call__(self, labels):
        im = labels["img"]
        cls = labels["cls"]
        if len(cls):
            labels["instances"].convert_bbox("xywh")
            labels["instances"].normalize(*im.shape[:2][::-1])
            bboxes = labels["instances"].bboxes
            # TODO: add support for segments and keypoints
            if self.transform and random.random() < self.p:
                new = self.transform(image=im, bboxes=bboxes, class_labels=cls)  # transformed
                labels["img"] = new["image"]
                labels["cls"] = np.array(new["class_labels"])
            labels["instances"].update(bboxes=bboxes)
        return labels


# TODO: technically this is not an augmentation, maybe we should put this in another file
class Format:

    def __init__(self,
                 bbox_format="xywh",
                 normalize=True,
                 return_mask=False,
                 return_keypoint=False,
                 mask_ratio=4,
                 mask_overlap=True,
                 batch_idx=True):
        self.bbox_format = bbox_format
        self.normalize = normalize
        self.return_mask = return_mask  # set False when training detection only
        self.return_keypoint = return_keypoint
        self.mask_ratio = mask_ratio
        self.mask_overlap = mask_overlap
        self.batch_idx = batch_idx  # keep the batch indexes

    def __call__(self, labels):
        img = labels["img"]
        h, w = img.shape[:2]
        cls = labels.pop("cls")
        instances = labels.pop("instances")
        instances.convert_bbox(format=self.bbox_format)
        instances.denormalize(w, h)
        nl = len(instances)

        if self.return_mask:
            if nl:
                masks, instances, cls = self._format_segments(instances, cls, w, h)
                masks = torch.from_numpy(masks)
            else:
                masks = torch.zeros(1 if self.mask_overlap else nl, img.shape[0] // self.mask_ratio,
                                    img.shape[1] // self.mask_ratio)
            labels["masks"] = masks
        if self.normalize:
            instances.normalize(w, h)
        labels["img"] = self._format_img(img)
        labels["cls"] = torch.from_numpy(cls) if nl else torch.zeros(nl)
        labels["bboxes"] = torch.from_numpy(instances.bboxes) if nl else torch.zeros((nl, 4))
        if self.return_keypoint:
            labels["keypoints"] = torch.from_numpy(instances.keypoints) if nl else torch.zeros((nl, 17, 2))
        # then we can use collate_fn
        if self.batch_idx:
            labels["batch_idx"] = torch.zeros(nl)
        return labels

    def _format_img(self, img):
        if len(img.shape) < 3:
            img = np.expand_dims(img, -1)
        img = np.ascontiguousarray(img.transpose(2, 0, 1)[::-1])
        img = torch.from_numpy(img)
        return img

    def _format_segments(self, instances, cls, w, h):
        """convert polygon points to bitmap"""
        segments = instances.segments
        if self.mask_overlap:
            masks, sorted_idx = polygons2masks_overlap((h, w), segments, downsample_ratio=self.mask_ratio)
            masks = masks[None]  # (640, 640) -> (1, 640, 640)
            instances = instances[sorted_idx]
            cls = cls[sorted_idx]
        else:
            masks = polygons2masks((h, w), segments, color=1, downsample_ratio=self.mask_ratio)

        return masks, instances, cls
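`_format_img` above does two things at once: an HWC to CHW transpose, and a BGR to RGB flip via `[::-1]` on the new leading channel axis. A standalone numpy sketch of that conversion:

```python
# Standalone sketch of the _format_img conversion above: transpose(2, 0, 1)
# moves HWC -> CHW, then [::-1] on the new channel axis flips BGR -> RGB.
import numpy as np

img = np.zeros((4, 5, 3), dtype=np.uint8)  # HWC, BGR channel order
img[..., 0] = 10  # blue channel
img[..., 2] = 30  # red channel

out = np.ascontiguousarray(img.transpose(2, 0, 1)[::-1])  # CHW, RGB order
```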


def mosaic_transforms(dataset, imgsz, hyp):
    pre_transform = Compose([
        Mosaic(dataset, imgsz=imgsz, p=hyp.mosaic, border=[-imgsz // 2, -imgsz // 2]),
        CopyPaste(p=hyp.copy_paste),
        RandomPerspective(
            degrees=hyp.degrees,
            translate=hyp.translate,
            scale=hyp.scale,
            shear=hyp.shear,
            perspective=hyp.perspective,
            border=[-imgsz // 2, -imgsz // 2],
        ),])
    return Compose([
        pre_transform,
        MixUp(dataset, pre_transform=pre_transform, p=hyp.mixup),
        Albumentations(p=1.0),
        RandomHSV(hgain=hyp.hsv_h, sgain=hyp.hsv_s, vgain=hyp.hsv_v),
        RandomFlip(direction="vertical", p=hyp.flipud),
        RandomFlip(direction="horizontal", p=hyp.fliplr),])  # transforms


def affine_transforms(imgsz, hyp):
    return Compose([
        LetterBox(new_shape=(imgsz, imgsz)),
        RandomPerspective(
            degrees=hyp.degrees,
            translate=hyp.translate,
            scale=hyp.scale,
            shear=hyp.shear,
            perspective=hyp.perspective,
            border=[0, 0],
        ),
        Albumentations(p=1.0),
        RandomHSV(hgain=hyp.hsv_h, sgain=hyp.hsv_s, vgain=hyp.hsv_v),
        RandomFlip(direction="vertical", p=hyp.flipud),
        RandomFlip(direction="horizontal", p=hyp.fliplr),])  # transforms


# Classification augmentations -----------------------------------------------------------------------------------------
def classify_transforms(size=224):
    # Transforms to apply if albumentations not installed
    assert isinstance(size, int), f"ERROR: classify_transforms size {size} must be integer, not (list, tuple)"
    # T.Compose([T.ToTensor(), T.Resize(size), T.CenterCrop(size), T.Normalize(IMAGENET_MEAN, IMAGENET_STD)])
    return T.Compose([CenterCrop(size), ToTensor(), T.Normalize(IMAGENET_MEAN, IMAGENET_STD)])


def classify_albumentations(
        augment=True,
        size=224,
        scale=(0.08, 1.0),
        hflip=0.5,
        vflip=0.0,
        jitter=0.4,
        mean=IMAGENET_MEAN,
        std=IMAGENET_STD,
        auto_aug=False,
):
    # YOLOv5 classification Albumentations (optional, only used if package is installed)
    prefix = colorstr("albumentations: ")
    try:
        import albumentations as A
        from albumentations.pytorch import ToTensorV2

        check_version(A.__version__, "1.0.3", hard=True)  # version requirement
        if augment:  # Resize and crop
            T = [A.RandomResizedCrop(height=size, width=size, scale=scale)]
            if auto_aug:
                # TODO: implement AugMix, AutoAug & RandAug in albumentation
                LOGGER.info(f"{prefix}auto augmentations are currently not supported")
            else:
                if hflip > 0:
                    T += [A.HorizontalFlip(p=hflip)]
                if vflip > 0:
                    T += [A.VerticalFlip(p=vflip)]
                if jitter > 0:
                    color_jitter = (float(jitter),) * 3  # repeat value for brightness, contrast, saturation, 0 hue
                    T += [A.ColorJitter(*color_jitter, 0)]
        else:  # Use fixed crop for eval set (reproducibility)
            T = [A.SmallestMaxSize(max_size=size), A.CenterCrop(height=size, width=size)]
        T += [A.Normalize(mean=mean, std=std), ToTensorV2()]  # Normalize and convert to Tensor
        LOGGER.info(prefix + ", ".join(f"{x}".replace("always_apply=False, ", "") for x in T if x.p))
        return A.Compose(T)

    except ImportError:  # package not installed, skip
        pass
    except Exception as e:
        LOGGER.info(f"{prefix}{e}")


class ClassifyLetterBox:
    # YOLOv5 LetterBox class for image preprocessing, i.e. T.Compose([LetterBox(size), ToTensor()])
    def __init__(self, size=(640, 640), auto=False, stride=32):
        super().__init__()
        self.h, self.w = (size, size) if isinstance(size, int) else size
        self.auto = auto  # pass max size integer, automatically solve for short side using stride
        self.stride = stride  # used with auto

    def __call__(self, im):  # im = np.array HWC
        imh, imw = im.shape[:2]
        r = min(self.h / imh, self.w / imw)  # ratio of new/old
        h, w = round(imh * r), round(imw * r)  # resized image
        # parenthesize the else-branch: `a if cond else b, c` parses as `(a if cond else b), c`
        hs, ws = (math.ceil(x / self.stride) * self.stride for x in (h, w)) if self.auto else (self.h, self.w)
        top, left = round((hs - h) / 2 - 0.1), round((ws - w) / 2 - 0.1)
        im_out = np.full((hs, ws, 3), 114, dtype=im.dtype)  # pad canvas to (hs, ws), stride-aligned when auto
        im_out[top:top + h, left:left + w] = cv2.resize(im, (w, h), interpolation=cv2.INTER_LINEAR)
        return im_out
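
# The resize/pad arithmetic in ClassifyLetterBox.__call__ can be checked in
# isolation. Below is a minimal standalone sketch (no cv2; `_letterbox_geometry`
# is a hypothetical helper, not part of this module's API) that reproduces the
# scale ratio, resized dims, and top/left padding offsets for a 640x640 target.
def _letterbox_geometry(imh, imw, target_h=640, target_w=640):
    r = min(target_h / imh, target_w / imw)  # ratio of new/old
    h, w = round(imh * r), round(imw * r)  # resized image dims
    top, left = round((target_h - h) / 2 - 0.1), round((target_w - w) / 2 - 0.1)
    return (h, w), (top, left)
# e.g. a 480x640 input keeps its size and is centered with 80 px of top padding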


class CenterCrop:
    # YOLOv5 CenterCrop class for image preprocessing, i.e. T.Compose([CenterCrop(size), ToTensor()])
    def __init__(self, size=640):
        super().__init__()
        self.h, self.w = (size, size) if isinstance(size, int) else size

    def __call__(self, im):  # im = np.array HWC
        imh, imw = im.shape[:2]
        m = min(imh, imw)  # min dimension
        top, left = (imh - m) // 2, (imw - m) // 2
        return cv2.resize(im[top:top + m, left:left + m], (self.w, self.h), interpolation=cv2.INTER_LINEAR)


class ToTensor:
    # YOLOv5 ToTensor class for image preprocessing, i.e. T.Compose([LetterBox(size), ToTensor()])
    def __init__(self, half=False):
        super().__init__()
        self.half = half

    def __call__(self, im):  # im = np.array HWC in BGR order
        im = np.ascontiguousarray(im.transpose((2, 0, 1))[::-1])  # HWC to CHW -> BGR to RGB -> contiguous
        im = torch.from_numpy(im)  # to torch
        im = im.half() if self.half else im.float()  # uint8 to fp16/32
        im /= 255.0  # 0-255 to 0.0-1.0
        return im
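
# A numpy-only sketch of the conversion ToTensor performs, assuming a uint8
# HWC BGR input: reorder channels to RGB, HWC -> CHW, scale to 0.0-1.0.
# (`_to_tensor_like` is a hypothetical name for illustration, not module API.)
def _to_tensor_like(im):
    chw = np.ascontiguousarray(im.transpose(2, 0, 1)[::-1])  # HWC to CHW, BGR to RGB
    return chw.astype(np.float32) / 255.0  # uint8 0-255 to float 0.0-1.0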


================================================
FILE: yolo/data/base.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import glob
import math
import os
from multiprocessing.pool import ThreadPool
from pathlib import Path
from typing import Optional

import cv2
import numpy as np
from torch.utils.data import Dataset
from tqdm import tqdm

from ..utils import NUM_THREADS, TQDM_BAR_FORMAT
from .utils import HELP_URL, IMG_FORMATS, LOCAL_RANK


class BaseDataset(Dataset):
    """Base Dataset.
    Args:
        img_path (str): image path.
        pipeline (dict): a dict of image transforms.
        label_path (str): label path, this can also be an ann_file or other custom label path.
    """

    def __init__(
        self,
        img_path,
        imgsz=640,
        label_path=None,
        cache=False,
        augment=True,
        hyp=None,
        prefix="",
        rect=False,
        batch_size=None,
        stride=32,
        pad=0.5,
        single_cls=False,
    ):
        super().__init__()
        self.img_path = img_path
        self.imgsz = imgsz
        self.label_path = label_path
        self.augment = augment
        self.single_cls = single_cls
        self.prefix = prefix

        self.im_files = self.get_img_files(self.img_path)
        self.labels = self.get_labels()
        if self.single_cls:
            self.update_labels(include_class=[])

        self.ni = len(self.labels)

        # rect stuff
        self.rect = rect
        self.batch_size = batch_size
        self.stride = stride
        self.pad = pad
        if self.rect:
            assert self.batch_size is not None
            self.set_rectangle()

        # cache stuff
        self.ims = [None] * self.ni
        self.npy_files = [Path(f).with_suffix(".npy") for f in self.im_files]
        if cache:
            self.cache_images(cache)

        # transforms
        self.transforms = self.build_transforms(hyp=hyp)

    def get_img_files(self, img_path):
        """Read image files."""
        try:
            f = []  # image files
            for p in img_path if isinstance(img_path, list) else [img_path]:
                p = Path(p)  # os-agnostic
                if p.is_dir():  # dir
                    f += glob.glob(str(p / "**" / "*.*"), recursive=True)
                    # f = list(p.rglob('*.*'))  # pathlib
                elif p.is_file():  # file
                    with open(p) as t:
                        t = t.read().strip().splitlines()
                        parent = str(p.parent) + os.sep
                        f += [x.replace("./", parent) if x.startswith("./") else x for x in t]  # local to global path
                        # f += [p.parent / x.lstrip(os.sep) for x in t]  # local to global path (pathlib)
                else:
                    raise FileNotFoundError(f"{self.prefix}{p} does not exist")
            im_files = sorted(x.replace("/", os.sep) for x in f if x.split(".")[-1].lower() in IMG_FORMATS)
            # self.img_files = sorted([x for x in f if x.suffix[1:].lower() in IMG_FORMATS])  # pathlib
            assert im_files, f"{self.prefix}No images found"
        except Exception as e:
            raise FileNotFoundError(f"{self.prefix}Error loading data from {img_path}: {e}\n{HELP_URL}") from e
        return im_files

    def update_labels(self, include_class: Optional[list]):
        """include_class, filter labels to include only these classes (optional)"""
        include_class_array = np.array(include_class).reshape(1, -1)
        for i in range(len(self.labels)):
            if include_class:
                cls = self.labels[i]["cls"]
                bboxes = self.labels[i]["bboxes"]
                segments = self.labels[i]["segments"]
                j = (cls == include_class_array).any(1)
                self.labels[i]["cls"] = cls[j]
                self.labels[i]["bboxes"] = bboxes[j]
                if segments:
                    self.labels[i]["segments"] = segments[j]
            if self.single_cls:
                self.labels[i]["cls"] = 0

    def load_image(self, i):
        # Loads 1 image from dataset index 'i', returns (im, resized hw)
        im, f, fn = self.ims[i], self.im_files[i], self.npy_files[i]
        if im is None:  # not cached in RAM
            if fn.exists():  # load npy
                im = np.load(fn)
            else:  # read image
                im = cv2.imread(f)  # BGR
                assert im is not None, f"Image Not Found {f}"
            h0, w0 = im.shape[:2]  # orig hw
            r = self.imgsz / max(h0, w0)  # ratio
            if r != 1:  # if sizes are not equal
                interp = cv2.INTER_LINEAR if (self.augment or r > 1) else cv2.INTER_AREA
                im = cv2.resize(im, (math.ceil(w0 * r), math.ceil(h0 * r)), interpolation=interp)
            return im, (h0, w0), im.shape[:2]  # im, hw_original, hw_resized
        return self.ims[i], self.im_hw0[i], self.im_hw[i]  # im, hw_original, hw_resized

    def cache_images(self, cache):
        # cache images to memory or disk
        gb = 0  # Gigabytes of cached images
        self.im_hw0, self.im_hw = [None] * self.ni, [None] * self.ni
        fcn = self.cache_images_to_disk if cache == "disk" else self.load_image
        results = ThreadPool(NUM_THREADS).imap(fcn, range(self.ni))
        pbar = tqdm(enumerate(results), total=self.ni, bar_format=TQDM_BAR_FORMAT, disable=LOCAL_RANK > 0)
        for i, x in pbar:
            if cache == "disk":
                gb += self.npy_files[i].stat().st_size
            else:  # 'ram'
                self.ims[i], self.im_hw0[i], self.im_hw[i] = x  # im, hw_orig, hw_resized = load_image(self, i)
                gb += self.ims[i].nbytes
            pbar.desc = f"{self.prefix}Caching images ({gb / 1E9:.1f}GB {cache})"
        pbar.close()

    def cache_images_to_disk(self, i):
        # Saves an image as an *.npy file for faster loading
        f = self.npy_files[i]
        if not f.exists():
            np.save(f.as_posix(), cv2.imread(self.im_files[i]))

    def set_rectangle(self):
        bi = np.floor(np.arange(self.ni) / self.batch_size).astype(int)  # batch index
        nb = bi[-1] + 1  # number of batches

        s = np.array([x.pop("shape") for x in self.labels])  # hw
        ar = s[:, 0] / s[:, 1]  # aspect ratio
        irect = ar.argsort()
        self.im_files = [self.im_files[i] for i in irect]
        self.labels = [self.labels[i] for i in irect]
        ar = ar[irect]

        # Set training image shapes
        shapes = [[1, 1]] * nb
        for i in range(nb):
            ari = ar[bi == i]
            mini, maxi = ari.min(), ari.max()
            if maxi < 1:
                shapes[i] = [maxi, 1]
            elif mini > 1:
                shapes[i] = [1, 1 / mini]

        self.batch_shapes = np.ceil(np.array(shapes) * self.imgsz / self.stride + self.pad).astype(int) * self.stride
        self.batch = bi  # batch index of image

    def __getitem__(self, index):
        return self.transforms(self.get_label_info(index))

    def get_label_info(self, index):
        label = self.labels[index].copy()
        label["img"], label["ori_shape"], label["resized_shape"] = self.load_image(index)
        label["ratio_pad"] = (
            label["resized_shape"][0] / label["ori_shape"][0],
            label["resized_shape"][1] / label["ori_shape"][1],
        )  # for evaluation
        if self.rect:
            label["rect_shape"] = self.batch_shapes[self.batch[index]]
        label = self.update_labels_info(label)
        return label

    def __len__(self):
        return len(self.im_files)

    def update_labels_info(self, label):
        """custom your label format here"""
        return label

    def build_transforms(self, hyp=None):
        """Users can custom augmentations here
        like:
            if self.augment:
                # training transforms
                return Compose([])
            else:
                # val transforms
                return Compose([])
        """
        raise NotImplementedError

    def get_labels(self):
        """Users can custom their own format here.
        Make sure your output is a list with each element like below:
            dict(
                im_file=im_file,
                shape=shape,  # format: (height, width)
                cls=cls,
                bboxes=bboxes, # xywh
                segments=segments,  # xy
                keypoints=keypoints, # xy
                normalized=True, # or False
                bbox_format="xyxy",  # or xywh, ltwh
            )
        """
        raise NotImplementedError


================================================
FILE: yolo/data/build.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import os
import random

import numpy as np
import torch
from torch.utils.data import DataLoader, dataloader, distributed

from ..utils import LOGGER, colorstr
from ..utils.torch_utils import torch_distributed_zero_first
from .dataset import ClassificationDataset, YOLODataset
from .utils import PIN_MEMORY, RANK


class InfiniteDataLoader(dataloader.DataLoader):
    """Dataloader that reuses workers

    Uses same syntax as vanilla DataLoader
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        object.__setattr__(self, "batch_sampler", _RepeatSampler(self.batch_sampler))
        self.iterator = super().__iter__()

    def __len__(self):
        return len(self.batch_sampler.sampler)

    def __iter__(self):
        for _ in range(len(self)):
            yield next(self.iterator)


class _RepeatSampler:
    """Sampler that repeats forever

    Args:
        sampler (Sampler)
    """

    def __init__(self, sampler):
        self.sampler = sampler

    def __iter__(self):
        while True:
            yield from iter(self.sampler)
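
# _RepeatSampler simply restarts its wrapped sampler forever, which is what
# lets InfiniteDataLoader reuse its workers across epochs. A minimal sketch
# with a plain list standing in for a torch Sampler (`_RepeatListSampler` is
# a hypothetical stand-in for illustration only):
class _RepeatListSampler:
    def __init__(self, sampler):
        self.sampler = sampler

    def __iter__(self):
        while True:  # restart the underlying iterable when it is exhausted
            yield from iter(self.sampler)
# itertools.islice(iter(_RepeatListSampler([0, 1, 2])), 7) cycles 0, 1, 2, 0, ...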


def seed_worker(worker_id):
    # Set dataloader worker seed https://pytorch.org/docs/stable/notes/randomness.html#dataloader
    worker_seed = torch.initial_seed() % 2 ** 32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


def build_dataloader(cfg, batch_size, img_path, stride=32, label_path=None, rank=-1, mode="train"):
    assert mode in ["train", "val"]
    shuffle = mode == "train"
    if cfg.rect and shuffle:
        LOGGER.warning("WARNING ⚠️ --rect is incompatible with DataLoader shuffle, setting shuffle=False")
        shuffle = False
    with torch_distributed_zero_first(rank):  # init dataset *.cache only once if DDP
        dataset = YOLODataset(
            img_path=img_path,
            label_path=label_path,
            imgsz=cfg.imgsz,
            batch_size=batch_size,
            augment=mode == "train",  # augmentation
            hyp=cfg,  # TODO: probably add a get_hyps_from_cfg function
            rect=cfg.rect if mode == "train" else True,  # rectangular batches
            cache=cfg.get("cache", None),
            single_cls=cfg.get("single_cls", False),
            stride=int(stride),
            pad=0.0 if mode == "train" else 0.5,
            prefix=colorstr(f"{mode}: "),
            use_segments=cfg.task == "segment",
            use_keypoints=cfg.task == "keypoint")

    batch_size = min(batch_size, len(dataset))
    nd = torch.cuda.device_count()  # number of CUDA devices
    workers = cfg.workers if mode == "train" else cfg.workers * 2
    nw = min([os.cpu_count() // max(nd, 1), batch_size if batch_size > 1 else 0, workers])  # number of workers
    sampler = None if rank == -1 else distributed.DistributedSampler(dataset, shuffle=shuffle)
    loader = DataLoader if cfg.image_weights or cfg.close_mosaic else InfiniteDataLoader  # allow attribute updates
    generator = torch.Generator()
    generator.manual_seed(6148914691236517205 + RANK)
    return loader(dataset=dataset,
                  batch_size=batch_size,
                  shuffle=shuffle and sampler is None,
                  num_workers=nw,
                  sampler=sampler,
                  pin_memory=PIN_MEMORY,
                  collate_fn=getattr(dataset, "collate_fn", None),
                  worker_init_fn=seed_worker,
                  generator=generator), dataset


# build classification
# TODO: using cfg like `build_dataloader`
def build_classification_dataloader(path,
                                    imgsz=224,
                                    batch_size=16,
                                    augment=True,
                                    cache=False,
                                    rank=-1,
                                    workers=8,
                                    shuffle=True):
    # Returns Dataloader object to be used with YOLOv5 Classifier
    with torch_distributed_zero_first(rank):  # init dataset *.cache only once if DDP
        dataset = ClassificationDataset(root=path, imgsz=imgsz, augment=augment, cache=cache)
    batch_size = min(batch_size, len(dataset))
    nd = torch.cuda.device_count()
    nw = min([os.cpu_count() // max(nd, 1), batch_size if batch_size > 1 else 0, workers])
    sampler = None if rank == -1 else distributed.DistributedSampler(dataset, shuffle=shuffle)
    generator = torch.Generator()
    generator.manual_seed(6148914691236517205 + RANK)
    return InfiniteDataLoader(dataset,
                              batch_size=batch_size,
                              shuffle=shuffle and sampler is None,
                              num_workers=nw,
                              sampler=sampler,
                              pin_memory=PIN_MEMORY,
                              worker_init_fn=seed_worker,
                              generator=generator)  # or DataLoader(persistent_workers=True)


================================================
FILE: yolo/data/dataloaders/__init__.py
================================================


================================================
FILE: yolo/data/dataloaders/stream_loaders.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license

import glob
import math
import os
import time
from pathlib import Path
from threading import Thread
from urllib.parse import urlparse

import cv2
import numpy as np
import torch

from ultralytics.yolo.data.augment import LetterBox
from ultralytics.yolo.data.utils import IMG_FORMATS, VID_FORMATS
from ultralytics.yolo.utils import LOGGER, is_colab, is_kaggle, ops
from ultralytics.yolo.utils.checks import check_requirements


class LoadStreams:
    # YOLOv5 streamloader, i.e. `python detect.py --source 'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP streams`
    def __init__(self, sources='file.streams', imgsz=640, stride=32, auto=True, transforms=None, vid_stride=1):
        torch.backends.cudnn.benchmark = True  # faster for fixed-size inference
        self.mode = 'stream'
        self.imgsz = imgsz
        self.stride = stride
        self.vid_stride = vid_stride  # video frame-rate stride
        sources = Path(sources).read_text().rsplit() if os.path.isfile(sources) else [sources]
        n = len(sources)
        self.sources = [ops.clean_str(x) for x in sources]  # clean source names for later
        self.imgs, self.fps, self.frames, self.threads = [None] * n, [0] * n, [0] * n, [None] * n
        for i, s in enumerate(sources):  # index, source
            # Start thread to read frames from video stream
            st = f'{i + 1}/{n}: {s}... '
            if urlparse(s).hostname in ('www.youtube.com', 'youtube.com', 'youtu.be'):  # if source is YouTube video
                # YouTube format i.e. 'https://www.youtube.com/watch?v=Zgi9g1ksQHc' or 'https://youtu.be/Zgi9g1ksQHc'
                check_requirements(('pafy', 'youtube_dl==2020.12.2'))
                import pafy
                s = pafy.new(s).getbest(preftype="mp4").url  # YouTube URL
            s = int(s) if s.isnumeric() else s  # i.e. s = '0' local webcam (safer than eval)
            if s == 0:
                assert not is_colab(), '--source 0 webcam unsupported on Colab. Rerun command in a local environment.'
                assert not is_kaggle(), '--source 0 webcam unsupported on Kaggle. Rerun command in a local environment.'
            cap = cv2.VideoCapture(s)
            assert cap.isOpened(), f'{st}Failed to open {s}'
            w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
            h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
            fps = cap.get(cv2.CAP_PROP_FPS)  # warning: may return 0 or nan
            self.frames[i] = max(int(cap.get(cv2.CAP_PROP_FRAME_COUNT)), 0) or float('inf')  # infinite stream fallback
            self.fps[i] = max((fps if math.isfinite(fps) else 0) % 100, 0) or 30  # 30 FPS fallback

            _, self.imgs[i] = cap.read()  # guarantee first frame
            self.threads[i] = Thread(target=self.update, args=(i, cap, s), daemon=True)
            LOGGER.info(f"{st} Success ({self.frames[i]} frames {w}x{h} at {self.fps[i]:.2f} FPS)")
            self.threads[i].start()
        LOGGER.info('')  # newline

        # check for common shapes
        s = np.stack([LetterBox(imgsz, auto, stride=stride)(image=x).shape for x in self.imgs])
        self.rect = np.unique(s, axis=0).shape[0] == 1  # rect inference if all shapes equal
        self.auto = auto and self.rect
        self.transforms = transforms  # optional
        if not self.rect:
            LOGGER.warning('WARNING ⚠️ Stream shapes differ. For optimal performance supply similarly-shaped streams.')

    def update(self, i, cap, stream):
        # Read stream `i` frames in daemon thread
        n, f = 0, self.frames[i]  # frame number, total frame count
        while cap.isOpened() and n < f:
            n += 1
            cap.grab()  # .read() = .grab() followed by .retrieve()
            if n % self.vid_stride == 0:
                success, im = cap.retrieve()
                if success:
                    self.imgs[i] = im
                else:
                    LOGGER.warning('WARNING ⚠️ Video stream unresponsive, please check your IP camera connection.')
                    self.imgs[i] = np.zeros_like(self.imgs[i])
                    cap.open(stream)  # re-open stream if signal was lost
            time.sleep(0.0)  # wait time

    def __iter__(self):
        self.count = -1
        return self

    def __next__(self):
        self.count += 1
        if not all(x.is_alive() for x in self.threads) or cv2.waitKey(1) == ord('q'):  # q to quit
            cv2.destroyAllWindows()
            raise StopIteration

        im0 = self.imgs.copy()
        if self.transforms:
            im = np.stack([self.transforms(x) for x in im0])  # transforms
        else:
            im = np.stack([LetterBox(self.imgsz, self.auto, stride=self.stride)(image=x) for x in im0])
            im = im[..., ::-1].transpose((0, 3, 1, 2))  # BGR to RGB, BHWC to BCHW
            im = np.ascontiguousarray(im)  # contiguous

        return self.sources, im, im0, None, ''

    def __len__(self):
        return len(self.sources)  # 1E12 frames = 32 streams at 30 FPS for 30 years


class LoadScreenshots:
    # YOLOv5 screenshot dataloader, i.e. `python detect.py --source "screen 0 100 100 512 256"`
    def __init__(self, source, imgsz=640, stride=32, auto=True, transforms=None):
        # source = [screen_number left top width height] (pixels)
        check_requirements('mss')
        import mss

        source, *params = source.split()
        self.screen, left, top, width, height = 0, None, None, None, None  # default to full screen 0
        if len(params) == 1:
            self.screen = int(params[0])
        elif len(params) == 4:
            left, top, width, height = (int(x) for x in params)
        elif len(params) == 5:
            self.screen, left, top, width, height = (int(x) for x in params)
        self.imgsz = imgsz
        self.stride = stride
        self.transforms = transforms
        self.auto = auto
        self.mode = 'stream'
        self.frame = 0
        self.sct = mss.mss()

        # Parse monitor shape
        monitor = self.sct.monitors[self.screen]
        self.top = monitor["top"] if top is None else (monitor["top"] + top)
        self.left = monitor["left"] if left is None else (monitor["left"] + left)
        self.width = width or monitor["width"]
        self.height = height or monitor["height"]
        self.monitor = {"left": self.left, "top": self.top, "width": self.width, "height": self.height}

    def __iter__(self):
        return self

    def __next__(self):
        # mss screen capture: get raw pixels from the screen as np array
        im0 = np.array(self.sct.grab(self.monitor))[:, :, :3]  # [:, :, :3] BGRA to BGR
        s = f"screen {self.screen} (LTWH): {self.left},{self.top},{self.width},{self.height}: "

        if self.transforms:
            im = self.transforms(im0)  # transforms
        else:
            im = LetterBox(self.imgsz, self.auto, stride=self.stride)(image=im0)
            im = im.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
            im = np.ascontiguousarray(im)  # contiguous
        self.frame += 1
        return str(self.screen), im, im0, None, s  # screen, img, original img, im0s, s


class LoadImages:
    # YOLOv5 image/video dataloader, i.e. `python detect.py --source image.jpg/vid.mp4`
    def __init__(self, path, imgsz=640, stride=32, auto=True, transforms=None, vid_stride=1):
        if isinstance(path, str) and Path(path).suffix == ".txt":  # *.txt file with img/vid/dir on each line
            path = Path(path).read_text().rsplit()
        files = []
        for p in sorted(path) if isinstance(path, (list, tuple)) else [path]:
            p = str(Path(p).resolve())
            if '*' in p:
                files.extend(sorted(glob.glob(p, recursive=True)))  # glob
            elif os.path.isdir(p):
                files.extend(sorted(glob.glob(os.path.join(p, '*.*'))))  # dir
            elif os.path.isfile(p):
                files.append(p)  # files
            else:
                raise FileNotFoundError(f'{p} does not exist')

        images = [x for x in files if x.split('.')[-1].lower() in IMG_FORMATS]
        videos = [x for x in files if x.split('.')[-1].lower() in VID_FORMATS]
        ni, nv = len(images), len(videos)

        self.imgsz = imgsz
        self.stride = stride
        self.files = images + videos
        self.nf = ni + nv  # number of files
        self.video_flag = [False] * ni + [True] * nv
        self.mode = 'image'
        self.auto = auto
        self.transforms = transforms  # optional
        self.vid_stride = vid_stride  # video frame-rate stride
        if any(videos):
            self._new_video(videos[0])  # new video
        else:
            self.cap = None
        assert self.nf > 0, f'No images or videos found in {path}. ' \
                            f'Supported formats are:\nimages: {IMG_FORMATS}\nvideos: {VID_FORMATS}'

    def __iter__(self):
        self.count = 0
        return self

    def __next__(self):
        if self.count == self.nf:
            raise StopIteration
        path = self.files[self.count]

        if self.video_flag[self.count]:
            # Read video
            self.mode = 'video'
            for _ in range(self.vid_stride):
                self.cap.grab()
            ret_val, im0 = self.cap.retrieve()
            while not ret_val:
                self.count += 1
                self.cap.release()
                if self.count == self.nf:  # last video
                    raise StopIteration
                path = self.files[self.count]
                self._new_video(path)
                ret_val, im0 = self.cap.read()

            self.frame += 1
            # im0 = self._cv2_rotate(im0)  # for use if cv2 autorotation is False
            s = f'video {self.count + 1}/{self.nf} ({self.frame}/{self.frames}) {path}: '

        else:
            # Read image
            self.count += 1
            im0 = cv2.imread(path)  # BGR
            assert im0 is not None, f'Image Not Found {path}'
            s = f'image {self.count}/{self.nf} {path}: '

        if self.transforms:
            im = self.transforms(im0)  # transforms
        else:
            im = LetterBox(self.imgsz, self.auto, stride=self.stride)(image=im0)
            im = im.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
            im = np.ascontiguousarray(im)  # contiguous

        return path, im, im0, self.cap, s

    def _new_video(self, path):
        # Create a new video capture object
        self.frame = 0
        self.cap = cv2.VideoCapture(path)
        self.frames = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT) / self.vid_stride)
        self.orientation = int(self.cap.get(cv2.CAP_PROP_ORIENTATION_META))  # rotation degrees
        # self.cap.set(cv2.CAP_PROP_ORIENTATION_AUTO, 0)  # disable https://github.com/ultralytics/yolov5/issues/8493

    def _cv2_rotate(self, im):
        # Rotate a cv2 video manually
        if self.orientation == 0:
            return cv2.rotate(im, cv2.ROTATE_90_CLOCKWISE)
        elif self.orientation == 180:
            return cv2.rotate(im, cv2.ROTATE_90_COUNTERCLOCKWISE)
        elif self.orientation == 90:
            return cv2.rotate(im, cv2.ROTATE_180)
        return im

    def __len__(self):
        return self.nf  # number of files


================================================
FILE: yolo/data/dataloaders/v5augmentations.py
================================================
# Ultralytics YOLO 🚀, GPL-3.0 license
"""
Image augmentation functions
"""

import math
import random

import cv2
import numpy as np
import torch
import torchvision.transforms as T
import torchvision.transforms.functional as TF

from ultralytics.yolo.utils import LOGGER, colorstr
from ultralytics.yolo.utils.checks import check_version
from ultralytics.yolo.utils.metrics import bbox_ioa
from ultralytics.yolo.utils.ops import resample_segments, segment2box, xywhn2xyxy

IMAGENET_MEAN = 0.485, 0.456, 0.406  # RGB mean
IMAGENET_STD = 0.229, 0.224, 0.225  # RGB standard deviation


class Albumentations:
    # YOLOv5 Albumentations class (optional, only used if package is installed)
    def __init__(self, size=640):
        self.transform = None
        prefix = colorstr('albumentations: ')
        try:
            import albumentations as A
            check_version(A.__version__, '1.0.3', hard=True)  # version requirement

            T = [
                A.RandomResizedCrop(height=size, width=size, scale=(0.8, 1.0), ratio=(0.9, 1.11), p=0.0),
                A.Blur(p=0.01),
                A.MedianBlur(p=0.01),
                A.ToGray(p=0.01),
                A.CLAHE(p=0.01),
                A.RandomBrightnessContrast(p=0.0),
                A.RandomGamma(p=0.0),
                A.ImageCompression(quality_lower=75, p=0.0)]  # transforms
            self.transform = A.Compose(T, bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))

            LOGGER.info(prefix + ', '.join(f'{x}'.replace('always_apply=False, ', '') for x in T if x.p))
        except ImportError:  # package not installed, skip
            pass
        except Exception as e:
            LOGGER.info(f'{prefix}{e}')

    def __call__(self, im, labels, p=1.0):
        if self.transform and random.random() < p:
            new = self.transform(image=im, bboxes=labels[:, 1:], class_labels=labels[:, 0])  # transformed
            im, labels = new['image'], np.array([[c, *b] for c, b in zip(new['class_labels'], new['bboxes'])])
        return im, labels
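Editor's sketch (not part of the upstream code) of the label round trip in `Albumentations.__call__` above: labels arrive as an (n, 5) array `[cls, x, y, w, h]`; the class column is split off to satisfy Albumentations' `label_fields` contract and re-attached afterwards. The array values here are illustrative:

```python
import numpy as np

labels = np.array([[0, 0.5, 0.5, 0.2, 0.3],
                   [2, 0.1, 0.2, 0.05, 0.05]])     # [cls, x, y, w, h] per row
bboxes, cls = labels[:, 1:], labels[:, 0]          # what gets passed to the transform
rebuilt = np.array([[c, *b] for c, b in zip(cls, bboxes)])  # what __call__ rebuilds
```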


def normalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD, inplace=False):
    # Normalize RGB images x per ImageNet stats in BCHW format, i.e. = (x - mean) / std
    return TF.normalize(x, mean, std, inplace=inplace)


def denormalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    # Denormalize RGB images x per ImageNet stats in BCHW format, i.e. = x * std + mean
    for i in range(3):
        x[:, i] = x[:, i] * std[i] + mean[i]
    return x
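Editor's sketch: `normalize()` computes `(x - mean) / std` channel-wise on BCHW tensors and `denormalize()` inverts it with `x * std + mean`, so the two are an exact round trip. The same arithmetic in plain NumPy (array names are illustrative):

```python
import numpy as np

_MEAN = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)  # IMAGENET_MEAN
_STD = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)   # IMAGENET_STD

x = np.random.rand(2, 3, 4, 4)   # fake BCHW batch in [0, 1]
y = (x - _MEAN) / _STD           # what normalize() computes
x_back = y * _STD + _MEAN        # what denormalize() computes
```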


def augment_hsv(im, hgain=0.5, sgain=0.5, vgain=0.5):
    # HSV color-space augmentation
    if hgain or sgain or vgain:
        r = np.random.uniform(-1, 1, 3) * [hgain, sgain, vgain] + 1  # random gains
        hue, sat, val = cv2.split(cv2.cvtColor(im, cv2.COLOR_BGR2HSV))
        dtype = im.dtype  # uint8

        x = np.arange(0, 256, dtype=r.dtype)
        lut_hue = ((x * r[0]) % 180).astype(dtype)
        lut_sat = np.clip(x * r[1], 0, 255).astype(dtype)
        lut_val = np.clip(x * r[2], 0, 255).astype(dtype)

        im_hsv = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val)))
        cv2.cvtColor(im_hsv, cv2.COLOR_HSV2BGR, dst=im)  # no return needed
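A minimal NumPy sketch (editor's addition) of the look-up-table trick `augment_hsv()` uses: rather than multiplying every pixel by the random gain, a 256-entry table is built once and each pixel value is mapped through it. For uint8 arrays, `cv2.LUT(channel, lut)` is equivalent to the fancy-indexing `lut[channel]` below; the function and variable names are illustrative:

```python
import numpy as np

def apply_gain_via_lut(channel, gain):
    """Scale a uint8 channel by `gain` via a precomputed look-up table."""
    x = np.arange(0, 256, dtype=np.float64)
    lut = np.clip(x * gain, 0, 255).astype(np.uint8)  # same shape as lut_sat/lut_val above
    return lut[channel]                               # stands in for cv2.LUT(channel, lut)

sat = np.array([[0, 100, 200], [50, 128, 255]], dtype=np.uint8)
out = apply_gain_via_lut(sat, 1.5)  # scaled, then clipped to the uint8 range
```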


def hist_equalize(im, clahe=True, bgr=False):
    # Equalize histogram on a BGR or RGB image 'im' (selected by 'bgr') with im.shape(n,m,3) and range 0-255
    yuv = cv2.cvtColor(im, cv2.COLOR_BGR2YUV if bgr else cv2.COLOR_RGB2YUV)
    if clahe:
        c = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        yuv[:, :, 0] = c.apply(yuv[:, :, 0])
    else:
        yuv[:, :, 0] = cv2.equalizeHist(yuv[:, :, 0])  # equalize Y channel histogram
    return cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR if bgr else cv2.COLOR_YUV2RGB)  # convert YUV back to BGR/RGB


def replicate(im, labels):
    # Replicate labels
    h, w = im.shape[:2]
    boxes = labels[:, 1:].astype(int)
    x1, y1, x2, y2 = boxes.T
    s = ((x2 - x1) + (y2 - y1)) / 2  # side length (pixels)
    for i in s.argsort()[:round(s.size * 0.5)]:  # smallest indices
        x1b, y1b, x2b, y2b = boxes[i]
        bh, bw = y2b - y1b, x2b - x1b
        yc, xc = int(random.uniform(0, h - bh)), int(random.uniform(0, w - bw))  # offset x, y
        x1a, y1a, x2a, y2a = [xc, yc, xc + bw, yc + bh]
        im[y1a:y2a, x1a:x2a] = im[y1b:y2b, x1b:x2b]  # im4[ymin:ymax, xmin:xmax]
        labels = np.append(labels, [[labels[i, 0], x1a, y1a, x2a, y2a]], axis=0)

    return im, labels


def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
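Editor's sketch of the arithmetic `letterbox()` performs in its default `auto=True` branch, without OpenCV: pick the smaller scale ratio, round the resized size, then pad each dimension up to the next stride multiple (the real function additionally splits that padding evenly between the two sides). The helper name is illustrative:

```python
def letterbox_dims(h, w, new_shape=640, stride=32):
    r = min(new_shape / h, new_shape / w)    # scale ratio (new / old)
    new_unpad = round(w * r), round(h * r)   # resized (width, height)
    dw = (new_shape - new_unpad[0]) % stride  # width padding, minimum rectangle
    dh = (new_shape - new_unpad[1]) % stride  # height padding
    return (new_unpad[1] + dh, new_unpad[0] + dw), r  # padded (height, width)

# A 1080x1920 frame letterboxed to 640 with stride 32:
shape, r = letterbox_dims(1080, 1920)
```

Both output dimensions end up divisible by the stride, which is the "stride-multiple constraint" the docstring refers to.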


def random_perspective(im,
                       targets=(),
                       segments=(),
                       degrees=10,
                       translate=.1,
                       scale=.1,
                       shear=10,
                       perspective=0.0,
                       border=(0, 0)):
    # torchvision.transforms.RandomAffine(degrees=(-10, 10), translate=(0.1, 0.1), scale=(0.9, 1.1), shear=(-10, 10))
    # targets = [cls, xyxy]

    height = im.shape[0] + border[0] * 2  # shape(h,w,c)
    width = im.shape[1] + border[1] * 2

    # Center
    C = np.eye(3)
    C[0, 2] = -im.shape[1] / 2  # x translation (pixels)
    C[1, 2] = -im.shape[0] / 2  # y translation (pixels)

    # Perspective
    P = np.eye(3)
    P[2, 0] = random.uniform(-perspective, perspective)  # x perspective (about y)
    P[2, 1] = random.uniform(-perspective, perspective)  # y perspective (about x)

    # Rotation and Scale
    R = np.eye(3)
    a = random.uniform(-degrees, degrees)
    # a += random.choice([-180, -90, 0, 90])  # add 90deg rotations to small rotations
    s = random.uniform(1 - scale, 1 + scale)
    # s = 2 ** random.uniform(-scale, scale)
    R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)

    # Shear
    S = np.eye(3)
    S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # x shear (deg)
    S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # y shear (deg)

    # Translation
    T = np.eye(3)
    T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width  # x translation (pixels)
    T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height  # y translation (pixels)

    # Combined rotation matrix
    M = T @ S @ R @ P @ C  # order of operations (right to left) is IMPORTANT
    if (border[0] != 0) or (border[1] != 0) or (M != np.eye(3)).any():  # image changed
        if perspective:
            im = cv2.warpPerspective(im, M, dsize=(width, height), borderValue=(114, 114, 114))
        else:  # affine
            im = cv2.warpAffine(im, M[:2], dsize=(width, height), borderValue=(114, 114, 114))

    # Visualize
    # import matplotlib.pyplot as plt
    # ax = plt.subplots(1, 2, figsize=(12, 6))[1].ravel()
    # ax[0].imshow(im[:, :, ::-1])  # base
    # ax[1].imshow(im2[:, :, ::-1])  # warped

    # Transform label coordinates
    n = len(targets)
    if n:
        use_segments = any(x.any() for x in segments)
        new = np.zeros((n, 4))
        if use_segments:  # warp segments
            segments = resample_segments(segments)  # upsample
            for i, segment in enumerate(segments):
                xy = np.ones((len(segment), 3))
                xy[:, :2] = segment
                xy = xy @ M.T  # transform
                xy = xy[:, :2] / xy[:, 2:3] if perspective else xy[:, :2]  # perspective rescale or affine
SYMBOL INDEX (819 symbols across 44 files)

FILE: nn/autobackend.py
  class AutoBackend (line 21) | class AutoBackend(nn.Module):
    method __init__ (line 23) | def __init__(self, weights='yolov8n.pt', device=torch.device('cpu'), d...
    method forward (line 234) | def forward(self, im, augment=False, visualize=False):
    method from_numpy (line 325) | def from_numpy(self, x):
    method warmup (line 334) | def warmup(self, imgsz=(1, 3, 640, 640)):
    method _model_type (line 348) | def _model_type(p='path/to/model.pt'):
    method _load_metadata (line 368) | def _load_metadata(f=Path('path/to/meta.yaml')):

FILE: nn/modules.py
  function autopad (line 32) | def autopad(k, p=None, d=1):  # kernel, padding, dilation
  class Conv (line 41) | class Conv(nn.Module):
    method __init__ (line 45) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
    method forward (line 51) | def forward(self, x):
    method forward_fuse (line 54) | def forward_fuse(self, x):
  class DWConv (line 58) | class DWConv(Conv):
    method __init__ (line 60) | def __init__(self, c1, c2, k=1, s=1, d=1, act=True):  # ch_in, ch_out,...
  class DWConvTranspose2d (line 64) | class DWConvTranspose2d(nn.ConvTranspose2d):
    method __init__ (line 66) | def __init__(self, c1, c2, k=1, s=1, p1=0, p2=0):  # ch_in, ch_out, ke...
  class ConvTranspose (line 70) | class ConvTranspose(nn.Module):
    method __init__ (line 74) | def __init__(self, c1, c2, k=2, s=2, p=0, bn=True, act=True):
    method forward (line 80) | def forward(self, x):
  class DFL (line 84) | class DFL(nn.Module):
    method __init__ (line 86) | def __init__(self, c1=16):
    method forward (line 93) | def forward(self, x):
  class TransformerLayer (line 99) | class TransformerLayer(nn.Module):
    method __init__ (line 101) | def __init__(self, c, num_heads):
    method forward (line 110) | def forward(self, x):
  class TransformerBlock (line 116) | class TransformerBlock(nn.Module):
    method __init__ (line 118) | def __init__(self, c1, c2, num_heads, num_layers):
    method forward (line 127) | def forward(self, x):
  class Bottleneck (line 135) | class Bottleneck(nn.Module):
    method __init__ (line 137) | def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):  # ch...
    method forward (line 144) | def forward(self, x):
  class BottleneckCSP (line 148) | class BottleneckCSP(nn.Module):
    method __init__ (line 150) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ...
    method forward (line 161) | def forward(self, x):
  class C3 (line 167) | class C3(nn.Module):
    method __init__ (line 169) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ...
    method forward (line 177) | def forward(self, x):
  class C2 (line 181) | class C2(nn.Module):
    method __init__ (line 183) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):  # ch_in, ...
    method forward (line 191) | def forward(self, x):
  class C2f (line 196) | class C2f(nn.Module):
    method __init__ (line 198) | def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):  # ch_in,...
    method forward (line 205) | def forward(self, x):
  class ChannelAttention (line 211) | class ChannelAttention(nn.Module):
    method __init__ (line 213) | def __init__(self, channels: int) -> None:
    method forward (line 219) | def forward(self, x: torch.Tensor) -> torch.Tensor:
  class SpatialAttention (line 223) | class SpatialAttention(nn.Module):
    method __init__ (line 225) | def __init__(self, kernel_size=7):
    method forward (line 232) | def forward(self, x):
  class CBAM (line 236) | class CBAM(nn.Module):
    method __init__ (line 238) | def __init__(self, c1, ratio=16, kernel_size=7):  # ch_in, ch_out, num...
    method forward (line 243) | def forward(self, x):
  class C1 (line 247) | class C1(nn.Module):
    method __init__ (line 249) | def __init__(self, c1, c2, n=1):  # ch_in, ch_out, number, shortcut, g...
    method forward (line 254) | def forward(self, x):
  class C3x (line 259) | class C3x(C3):
    method __init__ (line 261) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
  class C3TR (line 267) | class C3TR(C3):
    method __init__ (line 269) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
  class C3Ghost (line 275) | class C3Ghost(C3):
    method __init__ (line 277) | def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
  class SPP (line 283) | class SPP(nn.Module):
    method __init__ (line 285) | def __init__(self, c1, c2, k=(5, 9, 13)):
    method forward (line 292) | def forward(self, x):
  class SPPF (line 299) | class SPPF(nn.Module):
    method __init__ (line 301) | def __init__(self, c1, c2, k=5):  # equivalent to SPP(k=(5, 9, 13))
    method forward (line 308) | def forward(self, x):
  class Focus (line 317) | class Focus(nn.Module):
    method __init__ (line 319) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in,...
    method forward (line 324) | def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
  class GhostConv (line 329) | class GhostConv(nn.Module):
    method __init__ (line 331) | def __init__(self, c1, c2, k=1, s=1, g=1, act=True):  # ch_in, ch_out,...
    method forward (line 337) | def forward(self, x):
  class GhostBottleneck (line 342) | class GhostBottleneck(nn.Module):
    method __init__ (line 344) | def __init__(self, c1, c2, k=3, s=1):  # ch_in, ch_out, kernel, stride
    method forward (line 354) | def forward(self, x):
  class Concat (line 358) | class Concat(nn.Module):
    method __init__ (line 360) | def __init__(self, dimension=1):
    method forward (line 364) | def forward(self, x):
  class AutoShape (line 368) | class AutoShape(nn.Module):
    method __init__ (line 378) | def __init__(self, model, verbose=True):
    method _apply (line 391) | def _apply(self, fn):
    method forward (line 403) | def forward(self, ims, size=640, augment=False, profile=False):
  class Detections (line 467) | class Detections:
    method __init__ (line 469) | def __init__(self, ims, pred, files, times=(0, 0, 0), names=None, shap...
    method _run (line 486) | def _run(self, pprint=False, show=False, save=False, crop=False, rende...
    method show (line 531) | def show(self, labels=True):
    method save (line 534) | def save(self, labels=True, save_dir='runs/detect/exp', exist_ok=False):
    method crop (line 538) | def crop(self, save=True, save_dir='runs/detect/exp', exist_ok=False):
    method render (line 542) | def render(self, labels=True):
    method pandas (line 546) | def pandas(self):
    method tolist (line 556) | def tolist(self):
    method print (line 565) | def print(self):
    method __len__ (line 568) | def __len__(self):  # override len(results)
    method __str__ (line 571) | def __str__(self):  # override print(results)
    method __repr__ (line 574) | def __repr__(self):
  class Proto (line 578) | class Proto(nn.Module):
    method __init__ (line 580) | def __init__(self, c1, c_=256, c2=32):  # ch_in, number of protos, num...
    method forward (line 587) | def forward(self, x):
  class Ensemble (line 591) | class Ensemble(nn.ModuleList):
    method __init__ (line 593) | def __init__(self):
    method forward (line 596) | def forward(self, x, augment=False, profile=False, visualize=False):
  class Detect (line 605) | class Detect(nn.Module):
    method __init__ (line 613) | def __init__(self, nc=80, ch=()):  # detection layer
    method forward (line 627) | def forward(self, x):
    method bias_init (line 642) | def bias_init(self):
  class Segment (line 652) | class Segment(Detect):
    method __init__ (line 654) | def __init__(self, nc=80, nm=32, npr=256, ch=()):
    method forward (line 664) | def forward(self, x):
  class Classify (line 675) | class Classify(nn.Module):
    method __init__ (line 677) | def __init__(self, c1, c2, k=1, s=1, p=None, g=1):  # ch_in, ch_out, k...
    method forward (line 685) | def forward(self, x):

FILE: nn/tasks.py
  class BaseModel (line 19) | class BaseModel(nn.Module):
    method forward (line 24) | def forward(self, x, profile=False, visualize=False):
    method _forward_once (line 38) | def _forward_once(self, x, profile=False, visualize=False):
    method _profile_one_layer (line 63) | def _profile_one_layer(self, m, x, dt):
    method fuse (line 85) | def fuse(self):
    method info (line 101) | def info(self, verbose=False, imgsz=640):
    method _apply (line 111) | def _apply(self, fn):
    method load (line 130) | def load(self, weights):
  class DetectionModel (line 141) | class DetectionModel(BaseModel):
    method __init__ (line 143) | def __init__(self, cfg='yolov8n.yaml', ch=3, nc=None, verbose=True):  ...
    method forward (line 172) | def forward(self, x, augment=False, profile=False, visualize=False):
    method _forward_augment (line 177) | def _forward_augment(self, x):
    method _descale_pred (line 192) | def _descale_pred(p, flips, scale, img_size, dim=1):
    method _clip_augmented (line 202) | def _clip_augmented(self, y):
    method load (line 213) | def load(self, weights, verbose=True):
  class SegmentationModel (line 221) | class SegmentationModel(DetectionModel):
    method __init__ (line 223) | def __init__(self, cfg='yolov8n-seg.yaml', ch=3, nc=None, verbose=True):
  class ClassificationModel (line 227) | class ClassificationModel(BaseModel):
    method __init__ (line 229) | def __init__(self,
    method _from_detection_model (line 239) | def _from_detection_model(self, model, nc=1000, cutoff=10):
    method _from_yaml (line 255) | def _from_yaml(self, cfg, ch, nc, verbose):
    method load (line 266) | def load(self, weights):
    method reshape_outputs (line 273) | def reshape_outputs(model, nc):
  function attempt_load_weights (line 297) | def attempt_load_weights(weights, device=None, inplace=True, fuse=False):
  function attempt_load_one_weight (line 337) | def attempt_load_one_weight(weight, device=None, inplace=True, fuse=False):
  function parse_model (line 365) | def parse_model(d, ch, verbose=True):  # model_dict, input_channels(3)

FILE: yolo/cli.py
  function cli (line 15) | def cli(cfg):

FILE: yolo/configs/__init__.py
  function get_config (line 11) | def get_config(config: Union[str, DictConfig], overrides: Union[str, Dic...

FILE: yolo/configs/hydra_patch.py
  function override_config (line 15) | def override_config(overrides, cfg):
  function check_config_mismatch (line 68) | def check_config_mismatch(overrides, cfg):

FILE: yolo/data/augment.py
  class BaseTransform (line 21) | class BaseTransform:
    method __init__ (line 23) | def __init__(self) -> None:
    method apply_image (line 26) | def apply_image(self, labels):
    method apply_instances (line 29) | def apply_instances(self, labels):
    method apply_semantic (line 32) | def apply_semantic(self, labels):
    method __call__ (line 35) | def __call__(self, labels):
  class Compose (line 41) | class Compose:
    method __init__ (line 43) | def __init__(self, transforms):
    method __call__ (line 46) | def __call__(self, data):
    method append (line 51) | def append(self, transform):
    method tolist (line 54) | def tolist(self):
    method __repr__ (line 57) | def __repr__(self):
  class BaseMixTransform (line 66) | class BaseMixTransform:
    method __init__ (line 69) | def __init__(self, dataset, pre_transform=None, p=0.0) -> None:
    method __call__ (line 74) | def __call__(self, labels):
    method _mix_transform (line 96) | def _mix_transform(self, labels):
    method get_indexes (line 99) | def get_indexes(self):
  class Mosaic (line 103) | class Mosaic(BaseMixTransform):
    method __init__ (line 111) | def __init__(self, dataset, imgsz=640, p=1.0, border=(0, 0)):
    method get_indexes (line 118) | def get_indexes(self):
    method _mix_transform (line 121) | def _mix_transform(self, labels):
    method _update_labels (line 158) | def _update_labels(self, labels, padw, padh):
    method _cat_labels (line 166) | def _cat_labels(self, mosaic_labels):
  class MixUp (line 184) | class MixUp(BaseMixTransform):
    method __init__ (line 186) | def __init__(self, dataset, pre_transform=None, p=0.0) -> None:
    method get_indexes (line 189) | def get_indexes(self):
    method _mix_transform (line 192) | def _mix_transform(self, labels):
  class RandomPerspective (line 202) | class RandomPerspective:
    method __init__ (line 204) | def __init__(self, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, p...
    method affine_transform (line 213) | def affine_transform(self, img):
    method apply_bboxes (line 253) | def apply_bboxes(self, bboxes, M):
    method apply_segments (line 276) | def apply_segments(self, segments, M):
    method apply_keypoints (line 299) | def apply_keypoints(self, keypoints, M):
    method __call__ (line 325) | def __call__(self, labels):
    method box_candidates (line 370) | def box_candidates(self, box1, box2, wh_thr=2, ar_thr=100, area_thr=0....
  class RandomHSV (line 378) | class RandomHSV:
    method __init__ (line 380) | def __init__(self, hgain=0.5, sgain=0.5, vgain=0.5) -> None:
    method __call__ (line 385) | def __call__(self, labels):
  class RandomFlip (line 402) | class RandomFlip:
    method __init__ (line 404) | def __init__(self, p=0.5, direction="horizontal") -> None:
    method __call__ (line 411) | def __call__(self, labels):
  class LetterBox (line 431) | class LetterBox:
    method __init__ (line 434) | def __init__(self, new_shape=(640, 640), auto=False, scaleFill=False, ...
    method __call__ (line 441) | def __call__(self, labels=None, image=None):
    method _update_labels (line 486) | def _update_labels(self, labels, ratio, padw, padh):
  class CopyPaste (line 495) | class CopyPaste:
    method __init__ (line 497) | def __init__(self, p=0.5) -> None:
    method __call__ (line 500) | def __call__(self, labels):
  class Albumentations (line 533) | class Albumentations:
    method __init__ (line 535) | def __init__(self, p=1.0):
    method __call__ (line 560) | def __call__(self, labels):
  class Format (line 577) | class Format:
    method __init__ (line 579) | def __init__(self,
    method __call__ (line 595) | def __call__(self, labels):
    method _format_img (line 624) | def _format_img(self, img):
    method _format_segments (line 631) | def _format_segments(self, instances, cls, w, h):
  function mosaic_transforms (line 645) | def mosaic_transforms(dataset, imgsz, hyp):
  function affine_transforms (line 666) | def affine_transforms(imgsz, hyp):
  function classify_transforms (line 684) | def classify_transforms(size=224):
  function classify_albumentations (line 691) | def classify_albumentations(
  class ClassifyLetterBox (line 734) | class ClassifyLetterBox:
    method __init__ (line 736) | def __init__(self, size=(640, 640), auto=False, stride=32):
    method __call__ (line 742) | def __call__(self, im):  # im = np.array HWC
  class CenterCrop (line 753) | class CenterCrop:
    method __init__ (line 755) | def __init__(self, size=640):
    method __call__ (line 759) | def __call__(self, im):  # im = np.array HWC
  class ToTensor (line 766) | class ToTensor:
    method __init__ (line 768) | def __init__(self, half=False):
    method __call__ (line 772) | def __call__(self, im):  # im = np.array HWC in BGR order

FILE: yolo/data/base.py
  class BaseDataset (line 19) | class BaseDataset(Dataset):
    method __init__ (line 27) | def __init__(
    method get_img_files (line 75) | def get_img_files(self, img_path):
    method update_labels (line 99) | def update_labels(self, include_class: Optional[list]):
    method load_image (line 115) | def load_image(self, i):
    method cache_images (line 132) | def cache_images(self, cache):
    method cache_images_to_disk (line 148) | def cache_images_to_disk(self, i):
    method set_rectangle (line 154) | def set_rectangle(self):
    method __getitem__ (line 178) | def __getitem__(self, index):
    method get_label_info (line 181) | def get_label_info(self, index):
    method __len__ (line 193) | def __len__(self):
    method update_labels_info (line 196) | def update_labels_info(self, label):
    method build_transforms (line 200) | def build_transforms(self, hyp=None):
    method get_labels (line 212) | def get_labels(self):

FILE: yolo/data/build.py
  class InfiniteDataLoader (line 16) | class InfiniteDataLoader(dataloader.DataLoader):
    method __init__ (line 22) | def __init__(self, *args, **kwargs):
    method __len__ (line 27) | def __len__(self):
    method __iter__ (line 30) | def __iter__(self):
  class _RepeatSampler (line 35) | class _RepeatSampler:
    method __init__ (line 42) | def __init__(self, sampler):
    method __iter__ (line 45) | def __iter__(self):
  function seed_worker (line 50) | def seed_worker(worker_id):
  function build_dataloader (line 57) | def build_dataloader(cfg, batch_size, img_path, stride=32, label_path=No...
  function build_classification_dataloader (line 101) | def build_classification_dataloader(path,

FILE: yolo/data/dataloaders/stream_loaders.py
  class LoadStreams (line 21) | class LoadStreams:
    method __init__ (line 23) | def __init__(self, sources='file.streams', imgsz=640, stride=32, auto=...
    method update (line 67) | def update(self, i, cap, stream):
    method __iter__ (line 83) | def __iter__(self):
    method __next__ (line 87) | def __next__(self):
    method __len__ (line 103) | def __len__(self):
  class LoadScreenshots (line 107) | class LoadScreenshots:
    method __init__ (line 109) | def __init__(self, source, imgsz=640, stride=32, auto=True, transforms...
    method __iter__ (line 138) | def __iter__(self):
    method __next__ (line 141) | def __next__(self):
  class LoadImages (line 156) | class LoadImages:
    method __init__ (line 158) | def __init__(self, path, imgsz=640, stride=32, auto=True, transforms=N...
    method __iter__ (line 193) | def __iter__(self):
    method __next__ (line 197) | def __next__(self):
    method _new_video (line 237) | def _new_video(self, path):
    method _cv2_rotate (line 245) | def _cv2_rotate(self, im):
    method __len__ (line 255) | def __len__(self):

FILE: yolo/data/dataloaders/v5augmentations.py
  class Albumentations (line 24) | class Albumentations:
    method __init__ (line 26) | def __init__(self, size=640):
    method __call__ (line 50) | def __call__(self, im, labels, p=1.0):
  function normalize (line 57) | def normalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD, inplace=False):
  function denormalize (line 62) | def denormalize(x, mean=IMAGENET_MEAN, std=IMAGENET_STD):
  function augment_hsv (line 69) | def augment_hsv(im, hgain=0.5, sgain=0.5, vgain=0.5):
  function hist_equalize (line 85) | def hist_equalize(im, clahe=True, bgr=False):
  function replicate (line 96) | def replicate(im, labels):
  function letterbox (line 113) | def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True...
  function random_perspective (line 146) | def random_perspective(im,
  function copy_paste (line 242) | def copy_paste(im, labels, segments, p=0.5):
  function cutout (line 267) | def cutout(im, labels, p=0.5):
  function mixup (line 294) | def mixup(im, labels, im2, labels2):
  function box_candidates (line 302) | def box_candidates(box1, box2, wh_thr=2, ar_thr=100, area_thr=0.1, eps=1...
  function classify_albumentations (line 310) | def classify_albumentations(
  function classify_transforms (line 352) | def classify_transforms(size=224):
  class LetterBox (line 359) | class LetterBox:
    method __init__ (line 361) | def __init__(self, size=(640, 640), auto=False, stride=32):
    method __call__ (line 367) | def __call__(self, im):  # im = np.array HWC
  class CenterCrop (line 378) | class CenterCrop:
    method __init__ (line 380) | def __init__(self, size=640):
    method __call__ (line 384) | def __call__(self, im):  # im = np.array HWC
  class ToTensor (line 391) | class ToTensor:
    method __init__ (line 393) | def __init__(self, half=False):
    method __call__ (line 397) | def __call__(self, im):  # im = np.array HWC in BGR order

FILE: yolo/data/dataloaders/v5loader.py
  function get_hash (line 54) | def get_hash(paths):
  function exif_size (line 62) | def exif_size(img):
  function exif_transpose (line 72) | def exif_transpose(image):
  function seed_worker (line 98) | def seed_worker(worker_id):
  function create_dataloader (line 105) | def create_dataloader(path,
  class InfiniteDataLoader (line 160) | class InfiniteDataLoader(dataloader.DataLoader):
    method __init__ (line 166) | def __init__(self, *args, **kwargs):
    method __len__ (line 171) | def __len__(self):
    method __iter__ (line 174) | def __iter__(self):
  class _RepeatSampler (line 179) | class _RepeatSampler:
    method __init__ (line 186) | def __init__(self, sampler):
    method __iter__ (line 189) | def __iter__(self):
  class LoadScreenshots (line 194) | class LoadScreenshots:
    method __init__ (line 196) | def __init__(self, source, img_size=640, stride=32, auto=True, transfo...
    method __iter__ (line 225) | def __iter__(self):
    method __next__ (line 228) | def __next__(self):
  class LoadImages (line 243) | class LoadImages:
    method __init__ (line 245) | def __init__(self, path, img_size=640, stride=32, auto=True, transform...
    method __iter__ (line 280) | def __iter__(self):
    method __next__ (line 284) | def __next__(self):
    method _new_video (line 324) | def _new_video(self, path):
    method _cv2_rotate (line 332) | def _cv2_rotate(self, im):
    method __len__ (line 342) | def __len__(self):
  class LoadStreams (line 346) | class LoadStreams:
    method __init__ (line 348) | def __init__(self, sources='file.streams', img_size=640, stride=32, au...
    method update (line 392) | def update(self, i, cap, stream):
    method __iter__ (line 408) | def __iter__(self):
    method __next__ (line 412) | def __next__(self):
    method __len__ (line 428) | def __len__(self):
  function img2label_paths (line 432) | def img2label_paths(img_paths):
  class LoadImagesAndLabels (line 438) | class LoadImagesAndLabels(Dataset):
    method __init__ (line 443) | def __init__(self,
    method check_cache_ram (line 593) | def check_cache_ram(self, safety_margin=0.1, prefix=''):
    method cache_labels (line 610) | def cache_labels(self, path=Path('./labels.cache'), prefix=''):
    method __len__ (line 648) | def __len__(self):
    method __getitem__ (line 657) | def __getitem__(self, index):
    method load_image (line 731) | def load_image(self, i):
    method cache_images_to_disk (line 748) | def cache_images_to_disk(self, i):
    method load_mosaic (line 754) | def load_mosaic(self, index):
    method load_mosaic9 (line 812) | def load_mosaic9(self, index):
    method collate_fn (line 890) | def collate_fn(batch):
    method collate_fn_old (line 906) | def collate_fn_old(batch):
  function flatten_recursive (line 915) | def flatten_recursive(path=DATASETS_DIR / 'coco128'):
  function extract_boxes (line 925) | def extract_boxes(path=DATASETS_DIR / 'coco128'):  # from utils.dataload...
  function autosplit (line 959) | def autosplit(path=DATASETS_DIR / 'coco128/images', weights=(0.9, 0.1, 0...
  function verify_image_label (line 985) | def verify_image_label(args):
  class HUBDatasetStats (line 1037) | class HUBDatasetStats():
    method __init__ (line 1052) | def __init__(self, path='coco128.yaml', autodownload=False):
    method _find_yaml (line 1071) | def _find_yaml(dir):
    method _unzip (line 1081) | def _unzip(self, path):
    method _hub_ops (line 1091) | def _hub_ops(self, f, max_dim=1920):
    method get_json (line 1109) | def get_json(self, save=False, verbose=False):
    method process_images (line 1144) | def process_images(self):
  class ClassificationDataset (line 1158) | class ClassificationDataset(torchvision.datasets.ImageFolder):
    method __init__ (line 1167) | def __init__(self, root, augment, imgsz, cache=False):
    method __getitem__ (line 1175) | def __getitem__(self, i):
  function create_classification_dataloader (line 1192) | def create_classification_dataloader(path,

FILE: yolo/data/dataset.py
  class YOLODataset (line 16) | class YOLODataset(BaseDataset):
    method __init__ (line 25) | def __init__(
    method cache_labels (line 48) | def cache_labels(self, path=Path("./labels.cache")):
    method get_labels (line 100) | def get_labels(self):
    method build_transforms (line 127) | def build_transforms(self, hyp=None):
    method close_mosaic (line 141) | def close_mosaic(self, hyp):
    method update_labels_info (line 150) | def update_labels_info(self, label):
    method collate_fn (line 163) | def collate_fn(batch):
  class ClassificationDataset (line 184) | class ClassificationDataset(torchvision.datasets.ImageFolder):
    method __init__ (line 193) | def __init__(self, root, augment, imgsz, cache=False):
    method __getitem__ (line 201) | def __getitem__(self, i):
    method __len__ (line 217) | def __len__(self) -> int:
  class SemanticDataset (line 222) | class SemanticDataset(BaseDataset):
    method __init__ (line 224) | def __init__(self):

FILE: yolo/data/dataset_wrappers.py
  class MixAndRectDataset (line 9) | class MixAndRectDataset:
    method __init__ (line 17) | def __init__(self, dataset):
    method __len__ (line 21) | def __len__(self):
    method __getitem__ (line 24) | def __getitem__(self, index):

FILE: yolo/data/utils.py
  function img2label_paths (line 39) | def img2label_paths(img_paths):
  function get_hash (line 45) | def get_hash(paths):
  function exif_size (line 53) | def exif_size(img):
  function verify_image_label (line 63) | def verify_image_label(args):
  function polygon2mask (line 133) | def polygon2mask(imgsz, polygons, color=1, downsample_ratio=1):
  function polygons2masks (line 154) | def polygons2masks(imgsz, polygons, color, downsample_ratio=1):
  function polygons2masks_overlap (line 169) | def polygons2masks_overlap(imgsz, segments, downsample_ratio=1):
  function check_dataset_yaml (line 194) | def check_dataset_yaml(data, autodownload=True):
  function check_dataset (line 259) | def check_dataset(dataset: str):
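The `img2label_paths` helper listed above encodes the YOLO dataset convention that labels mirror images with an `images/` → `labels/` substitution. A minimal sketch reconstructed from that convention (not copied from the file; the repo's version may differ in details such as using `rsplit`):

```python
import os

def img2label_paths(img_paths):
    """Map .../images/xxx.jpg -> .../labels/xxx.txt (YOLO dataset convention)."""
    sa = f"{os.sep}images{os.sep}"  # image directory substring
    sb = f"{os.sep}labels{os.sep}"  # label directory substring
    return [os.path.splitext(p.replace(sa, sb, 1))[0] + '.txt' for p in img_paths]
```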

FILE: yolo/engine/exporter.py
  function export_formats (line 83) | def export_formats():
  function try_export (line 101) | def try_export(inner_func):
  class Exporter (line 119) | class Exporter:
    method __init__ (line 130) | def __init__(self, config=DEFAULT_CONFIG, overrides=None):
    method __call__ (line 145) | def __call__(self, model=None):
    method _export_torchscript (line 258) | def _export_torchscript(self, prefix=colorstr('TorchScript:')):
    method _export_onnx (line 275) | def _export_onnx(self, prefix=colorstr('ONNX:')):
    method _export_openvino (line 328) | def _export_openvino(self, prefix=colorstr('OpenVINO:')):
    method _export_paddle (line 343) | def _export_paddle(self, prefix=colorstr('PaddlePaddle:')):
    method _export_coreml (line 357) | def _export_coreml(self, prefix=colorstr('CoreML:')):
    method _export_engine (line 397) | def _export_engine(self, workspace=4, verbose=False, prefix=colorstr('...
    method _export_saved_model (line 454) | def _export_saved_model(self,
    method _export_saved_model_OLD (line 488) | def _export_saved_model_OLD(self,
    method _export_pb (line 535) | def _export_pb(self, keras_model, file, prefix=colorstr('TensorFlow Gr...
    method _export_tflite (line 551) | def _export_tflite(self, keras_model, int8, data, nms, agnostic_nms, p...
    method _export_edgetpu (line 591) | def _export_edgetpu(self, prefix=colorstr('Edge TPU:')):
    method _export_tfjs (line 617) | def _export_tfjs(self, prefix=colorstr('TensorFlow.js:')):
    method _add_tflite_metadata (line 643) | def _add_tflite_metadata(self, file, num_outputs):
    method _pipeline_coreml (line 675) | def _pipeline_coreml(self, model, prefix=colorstr('CoreML Pipeline:')):
    method run_callbacks (line 796) | def run_callbacks(self, event: str):
  function export (line 802) | def export(cfg):

FILE: yolo/engine/model.py
  class YOLO (line 26) | class YOLO:
    method __init__ (line 33) | def __init__(self, model='yolov8n.yaml', type="v8") -> None:
    method __call__ (line 57) | def __call__(self, source, **kwargs):
    method _new (line 60) | def _new(self, cfg: str, verbose=True):
    method _load (line 76) | def _load(self, weights: str):
    method reset (line 91) | def reset(self):
    method info (line 101) | def info(self, verbose=False):
    method fuse (line 110) | def fuse(self):
    method predict (line 114) | def predict(self, source, **kwargs):
    method val (line 134) | def val(self, data=None, **kwargs):
    method export (line 153) | def export(self, **kwargs):
    method train (line 169) | def train(self, **kwargs):
    method to (line 195) | def to(self, device):
    method _guess_ops_from_task (line 204) | def _guess_ops_from_task(self, task):
    method _reset_ckpt_args (line 214) | def _reset_ckpt_args(args):

FILE: yolo/engine/predictor.py
  class BasePredictor (line 44) | class BasePredictor:
    method __init__ (line 64) | def __init__(self, config=DEFAULT_CONFIG, overrides=None):
    method preprocess (line 98) | def preprocess(self, img):
    method get_annotator (line 101) | def get_annotator(self, img):
    method get_tracker (line 104) | def get_tracker(self,img):
    method write_results (line 106) | def write_results(self, pred, batch, print_string):
    method postprocess (line 109) | def postprocess(self, preds, img, orig_img):
    method setup (line 112) | def setup(self, source=None, model=None):
    method __call__ (line 170) | def __call__(self, source=None, model=None):
    method show (line 224) | def show(self, p):
    method save_preds (line 233) | def save_preds(self, vid_cap, idx, save_path):
    method run_callbacks (line 253) | def run_callbacks(self, event: str):

FILE: yolo/engine/sort.py
  function linear_assignment (line 18) | def linear_assignment(cost_matrix):
  function iou_batch (line 30) | def iou_batch(bb_test, bb_gt):
  function convert_bbox_to_z (line 48) | def convert_bbox_to_z(bbox):
  function convert_x_to_bbox (line 61) | def convert_x_to_bbox(x, score=None):
  class KalmanBoxTracker (line 70) | class KalmanBoxTracker(object):
    method __init__ (line 73) | def __init__(self, bbox):
    method update (line 108) | def update(self, bbox):
    method predict (line 123) | def predict(self):
    method get_state (line 143) | def get_state(self):
  function associate_detections_to_trackers (line 160) | def associate_detections_to_trackers(detections, trackers, iou_threshold...
  class Sort (line 209) | class Sort(object):
    method __init__ (line 210) | def __init__(self, max_age=1, min_hits=3, iou_threshold=0.3):
    method getTrackers (line 219) | def getTrackers(self,):
    method update (line 222) | def update(self, dets= np.empty((0,6))):
  function parse_args (line 272) | def parse_args():
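The SORT tracker in sort.py matches detections to existing trackers with `iou_batch`, a pairwise IoU over two box sets. An independent sketch of that computation, written from the signature and the standard SORT formulation (not the repository's exact code):

```python
import numpy as np

def iou_batch(bb_test, bb_gt):
    """Pairwise IoU between two sets of [x1, y1, x2, y2] boxes.

    Returns an (N, M) matrix for N test boxes and M ground-truth boxes.
    """
    bb_test = np.expand_dims(bb_test, 1)  # (N, 1, 4)
    bb_gt = np.expand_dims(bb_gt, 0)      # (1, M, 4)

    # Intersection rectangle, clamped to non-negative width/height
    xx1 = np.maximum(bb_test[..., 0], bb_gt[..., 0])
    yy1 = np.maximum(bb_test[..., 1], bb_gt[..., 1])
    xx2 = np.minimum(bb_test[..., 2], bb_gt[..., 2])
    yy2 = np.minimum(bb_test[..., 3], bb_gt[..., 3])
    inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)

    area_test = (bb_test[..., 2] - bb_test[..., 0]) * (bb_test[..., 3] - bb_test[..., 1])
    area_gt = (bb_gt[..., 2] - bb_gt[..., 0]) * (bb_gt[..., 3] - bb_gt[..., 1])
    return inter / (area_test + area_gt - inter)
```

The resulting matrix feeds `associate_detections_to_trackers`, which thresholds it and solves the assignment via `linear_assignment`.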

FILE: yolo/engine/trainer.py
  class BaseTrainer (line 39) | class BaseTrainer:
    method __init__ (line 76) | def __init__(self, config=DEFAULT_CONFIG, overrides=None):
    method add_callback (line 150) | def add_callback(self, event: str, callback):
    method set_callback (line 156) | def set_callback(self, event: str, callback):
    method run_callbacks (line 162) | def run_callbacks(self, event: str):
    method train (line 166) | def train(self):
    method _setup_ddp (line 179) | def _setup_ddp(self, rank, world_size):
    method _setup_train (line 187) | def _setup_train(self, rank, world_size):
    method _do_train (line 235) | def _do_train(self, rank=-1, world_size=1):
    method save_model (line 358) | def save_model(self):
    method get_dataset (line 376) | def get_dataset(self, data):
    method setup_model (line 382) | def setup_model(self):
    method optimizer_step (line 399) | def optimizer_step(self):
    method preprocess_batch (line 408) | def preprocess_batch(self, batch):
    method validate (line 414) | def validate(self):
    method log (line 424) | def log(self, text, rank=-1):
    method get_model (line 436) | def get_model(self, cfg=None, weights=None, verbose=True):
    method get_validator (line 439) | def get_validator(self):
    method get_dataloader (line 442) | def get_dataloader(self, dataset_path, batch_size=16, rank=0):
    method criterion (line 448) | def criterion(self, preds, batch):
    method label_loss_items (line 454) | def label_loss_items(self, loss_items=None, prefix="train"):
    method set_model_attributes (line 461) | def set_model_attributes(self):
    method build_targets (line 467) | def build_targets(self, preds, targets):
    method progress_string (line 470) | def progress_string(self):
    method plot_training_samples (line 474) | def plot_training_samples(self, batch, ni):
    method save_metrics (line 477) | def save_metrics(self, metrics):
    method plot_metrics (line 484) | def plot_metrics(self):
    method final_eval (line 487) | def final_eval(self):
    method check_resume (line 498) | def check_resume(self):
    method resume_training (line 509) | def resume_training(self, ckpt):
    method build_optimizer (line 534) | def build_optimizer(model, name='Adam', lr=0.001, momentum=0.9, decay=...

FILE: yolo/engine/validator.py
  class BaseValidator (line 20) | class BaseValidator:
    method __init__ (line 41) | def __init__(self, dataloader=None, save_dir=None, pbar=None, logger=N...
    method __call__ (line 76) | def __call__(self, trainer=None, model=None):
    method run_callbacks (line 178) | def run_callbacks(self, event: str):
    method get_dataloader (line 182) | def get_dataloader(self, dataset_path, batch_size):
    method preprocess (line 185) | def preprocess(self, batch):
    method postprocess (line 188) | def postprocess(self, preds):
    method init_metrics (line 191) | def init_metrics(self, model):
    method update_metrics (line 194) | def update_metrics(self, preds, batch):
    method get_stats (line 197) | def get_stats(self):
    method check_stats (line 200) | def check_stats(self, stats):
    method print_results (line 203) | def print_results(self):
    method get_desc (line 206) | def get_desc(self):
    method metric_keys (line 210) | def metric_keys(self):
    method plot_val_samples (line 214) | def plot_val_samples(self, batch, ni):
    method plot_predictions (line 217) | def plot_predictions(self, batch, preds, ni):
    method pred_to_json (line 220) | def pred_to_json(self, preds, batch):
    method eval_json (line 223) | def eval_json(self, stats):

FILE: yolo/utils/__init__.py
  function is_colab (line 77) | def is_colab():
  function is_kaggle (line 88) | def is_kaggle():
  function is_jupyter_notebook (line 98) | def is_jupyter_notebook():
  function is_docker (line 115) | def is_docker() -> bool:
  function is_git_directory (line 126) | def is_git_directory() -> bool:
  function is_pip_package (line 142) | def is_pip_package(filepath: str = __name__) -> bool:
  function is_dir_writeable (line 161) | def is_dir_writeable(dir_path: str) -> bool:
  function get_git_root_dir (line 179) | def get_git_root_dir():
  function get_default_args (line 191) | def get_default_args(func):
  function get_user_config_dir (line 197) | def get_user_config_dir(sub_dir='Ultralytics'):
  function emojis (line 233) | def emojis(string=''):
  function colorstr (line 238) | def colorstr(*input):
  function set_logging (line 264) | def set_logging(name=LOGGING_NAME, verbose=True):
  class TryExcept (line 286) | class TryExcept(contextlib.ContextDecorator):
    method __init__ (line 288) | def __init__(self, msg=''):
    method __enter__ (line 291) | def __enter__(self):
    method __exit__ (line 294) | def __exit__(self, exc_type, value, traceback):
  function threaded (line 300) | def threaded(func):
  function yaml_save (line 310) | def yaml_save(file='data.yaml', data=None):
  function yaml_load (line 331) | def yaml_load(file='data.yaml', append_filename=False):
  function get_settings (line 347) | def get_settings(file=USER_CONFIG_DIR / 'settings.yaml'):
  function set_settings (line 400) | def set_settings(kwargs, file=USER_CONFIG_DIR / 'settings.yaml'):
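`colorstr` in utils/__init__.py is the ANSI-escape helper used throughout the export and logging paths. A shortened sketch of how such a helper works (the repo's version supports a larger color table; this reimplementation is illustrative only):

```python
def colorstr(*inputs):
    """Wrap a string in ANSI color codes, e.g. colorstr('red', 'error') or colorstr('hello')."""
    # With a single argument, default to bold blue (matching common YOLO usage)
    *args, string = inputs if len(inputs) > 1 else ('blue', 'bold', inputs[0])
    colors = {
        'black': '\033[30m', 'red': '\033[31m', 'green': '\033[32m',
        'yellow': '\033[33m', 'blue': '\033[34m', 'bold': '\033[1m',
        'end': '\033[0m'}
    return ''.join(colors[x] for x in args) + str(string) + colors['end']
```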

FILE: yolo/utils/autobatch.py
  function check_train_batch_size (line 15) | def check_train_batch_size(model, imgsz=640, amp=True):
  function autobatch (line 21) | def autobatch(model, imgsz=640, fraction=0.7, batch_size=16):

FILE: yolo/utils/callbacks/base.py
  function on_pretrain_routine_start (line 8) | def on_pretrain_routine_start(trainer):
  function on_pretrain_routine_end (line 12) | def on_pretrain_routine_end(trainer):
  function on_train_start (line 16) | def on_train_start(trainer):
  function on_train_epoch_start (line 20) | def on_train_epoch_start(trainer):
  function on_train_batch_start (line 24) | def on_train_batch_start(trainer):
  function optimizer_step (line 28) | def optimizer_step(trainer):
  function on_before_zero_grad (line 32) | def on_before_zero_grad(trainer):
  function on_train_batch_end (line 36) | def on_train_batch_end(trainer):
  function on_train_epoch_end (line 40) | def on_train_epoch_end(trainer):
  function on_fit_epoch_end (line 44) | def on_fit_epoch_end(trainer):
  function on_model_save (line 48) | def on_model_save(trainer):
  function on_train_end (line 52) | def on_train_end(trainer):
  function on_params_update (line 56) | def on_params_update(trainer):
  function teardown (line 60) | def teardown(trainer):
  function on_val_start (line 65) | def on_val_start(validator):
  function on_val_batch_start (line 69) | def on_val_batch_start(validator):
  function on_val_batch_end (line 73) | def on_val_batch_end(validator):
  function on_val_end (line 77) | def on_val_end(validator):
  function on_predict_start (line 82) | def on_predict_start(predictor):
  function on_predict_batch_start (line 86) | def on_predict_batch_start(predictor):
  function on_predict_batch_end (line 90) | def on_predict_batch_end(predictor):
  function on_predict_end (line 94) | def on_predict_end(predictor):
  function on_export_start (line 99) | def on_export_start(exporter):
  function on_export_end (line 103) | def on_export_end(exporter):
  function add_integration_callbacks (line 141) | def add_integration_callbacks(instance):

FILE: yolo/utils/callbacks/clearml.py
  function _log_images (line 14) | def _log_images(imgs_dict, group="", step=0):
  function on_pretrain_routine_start (line 21) | def on_pretrain_routine_start(trainer):
  function on_train_epoch_end (line 32) | def on_train_epoch_end(trainer):
  function on_fit_epoch_end (line 37) | def on_fit_epoch_end(trainer):
  function on_train_end (line 46) | def on_train_end(trainer):

FILE: yolo/utils/callbacks/comet.py
  function on_pretrain_routine_start (line 12) | def on_pretrain_routine_start(trainer):
  function on_train_epoch_end (line 17) | def on_train_epoch_end(trainer):
  function on_fit_epoch_end (line 25) | def on_fit_epoch_end(trainer):
  function on_train_end (line 36) | def on_train_end(trainer):

FILE: yolo/utils/callbacks/hub.py
  function on_pretrain_routine_end (line 12) | def on_pretrain_routine_end(trainer):
  function on_fit_epoch_end (line 20) | def on_fit_epoch_end(trainer):
  function on_model_save (line 30) | def on_model_save(trainer):
  function on_train_end (line 41) | def on_train_end(trainer):
  function on_train_start (line 52) | def on_train_start(trainer):
  function on_val_start (line 56) | def on_val_start(validator):
  function on_predict_start (line 60) | def on_predict_start(predictor):
  function on_export_start (line 64) | def on_export_start(exporter):

FILE: yolo/utils/callbacks/tensorboard.py
  function _log_scalars (line 8) | def _log_scalars(scalars, step=0):
  function on_pretrain_routine_start (line 13) | def on_pretrain_routine_start(trainer):
  function on_batch_end (line 18) | def on_batch_end(trainer):
  function on_fit_epoch_end (line 22) | def on_fit_epoch_end(trainer):

FILE: yolo/utils/checks.py
  function is_ascii (line 21) | def is_ascii(s) -> bool:
  function check_imgsz (line 38) | def check_imgsz(imgsz, stride=32, min_dim=1, floor=0):
  function check_version (line 72) | def check_version(current: str = "0.0.0",
  function check_font (line 103) | def check_font(font: str = FONT, progress: bool = False) -> None:
  function check_online (line 127) | def check_online() -> bool:
  function check_python (line 143) | def check_python(minimum: str = '3.7.0') -> bool:
  function check_requirements (line 157) | def check_requirements(requirements=ROOT.parent / 'requirements.txt', ex...
  function check_suffix (line 191) | def check_suffix(file='yolov8n.pt', suffix=('.pt',), msg=''):
  function check_file (line 202) | def check_file(file, suffix=''):
  function check_yaml (line 227) | def check_yaml(file, suffix=('.yaml', '.yml')):
  function check_imshow (line 232) | def check_imshow(warn=False):
  function git_describe (line 248) | def git_describe(path=ROOT):  # path must be a directory
  function print_args (line 257) | def print_args(args: Optional[dict] = None, show_file=True, show_func=Fa...
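`check_version` in checks.py gates features on a minimum package version. A simplified sketch of the comparison it implies (the real signature takes extra arguments such as a name and pinned flag, and plain integer parsing like this would fail on pre-release strings such as `1.9.0rc1`):

```python
def parse_version(v: str):
    """'1.9.0' -> (1, 9, 0); tolerant of short versions like '2.0'."""
    return tuple(int(x) for x in v.split('.')[:3])

def check_version(current: str = '0.0.0', minimum: str = '0.0.0', hard: bool = False) -> bool:
    """Return True if current >= minimum; raise instead when `hard` is set."""
    result = parse_version(current) >= parse_version(minimum)
    if hard and not result:
        raise AssertionError(f'version {current} is below required minimum {minimum}')
    return result
```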

FILE: yolo/utils/dist.py
  function find_free_network_port (line 12) | def find_free_network_port() -> int:
  function generate_ddp_file (line 26) | def generate_ddp_file(trainer):
  function generate_ddp_command (line 47) | def generate_ddp_command(world_size, trainer):
  function ddp_cleanup (line 58) | def ddp_cleanup(command, trainer):
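`find_free_network_port` in dist.py exists because single-node DDP needs a `MASTER_PORT` that is not already taken. The standard trick is to bind to port 0 and let the OS choose; a self-contained sketch of that idea:

```python
import socket

def find_free_network_port() -> int:
    """Bind to port 0 so the OS assigns a free ephemeral port, then return it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(('127.0.0.1', 0))
        return s.getsockname()[1]
```

Note the port is only guaranteed free at the moment of the call; a race is possible if another process grabs it before DDP binds.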

FILE: yolo/utils/downloads.py
  function safe_download (line 18) | def safe_download(file, url, url2=None, min_bytes=1E0, error_msg=''):
  function is_url (line 39) | def is_url(url, check=True):
  function attempt_download (line 50) | def attempt_download(file, repo='ultralytics/assets', release='v0.0.0'):
  function download (line 100) | def download(url, dir=Path.cwd(), unzip=True, delete=True, curl=False, t...

FILE: yolo/utils/files.py
  class WorkingDirectory (line 12) | class WorkingDirectory(contextlib.ContextDecorator):
    method __init__ (line 14) | def __init__(self, new_dir):
    method __enter__ (line 18) | def __enter__(self):
    method __exit__ (line 21) | def __exit__(self, exc_type, exc_val, exc_tb):
  function increment_path (line 25) | def increment_path(path, exist_ok=False, sep='', mkdir=False):
  function unzip_file (line 60) | def unzip_file(file, path=None, exclude=('.DS_Store', '__MACOSX')):
  function file_age (line 70) | def file_age(path=__file__):
  function file_date (line 76) | def file_date(path=__file__):
  function file_size (line 82) | def file_size(path):
  function url2file (line 94) | def url2file(url):
  function get_latest_run (line 100) | def get_latest_run(search_dir='.'):
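`increment_path` in files.py produces the familiar `runs/exp`, `runs/exp2`, `runs/exp3`, ... experiment directories. A minimal sketch of that behavior based on its signature (the repository's version also handles `mkdir`):

```python
from pathlib import Path

def increment_path(path, exist_ok=False, sep=''):
    """Increment a path: runs/exp -> runs/exp2, runs/exp3, ... while it exists."""
    path = Path(path)
    if path.exists() and not exist_ok:
        suffix = path.suffix
        stem_path = path.with_suffix('') if suffix else path
        for n in range(2, 9999):
            candidate = Path(f'{stem_path}{sep}{n}{suffix}')
            if not candidate.exists():
                return candidate
    return path
```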

FILE: yolo/utils/instance.py
  function _ntuple (line 14) | def _ntuple(n):
  class Bboxes (line 32) | class Bboxes:
    method __init__ (line 35) | def __init__(self, bboxes, format="xyxy") -> None:
    method convert (line 66) | def convert(self, format):
    method areas (line 79) | def areas(self):
    method mul (line 99) | def mul(self, scale):
    method add (line 113) | def add(self, offset):
    method __len__ (line 127) | def __len__(self):
    method concatenate (line 131) | def concatenate(cls, boxes_list: List["Bboxes"], axis=0) -> "Bboxes":
    method __getitem__ (line 150) | def __getitem__(self, index) -> "Bboxes":
  class Instances (line 165) | class Instances:
    method __init__ (line 167) | def __init__(self, bboxes, segments=None, keypoints=None, bbox_format=...
    method convert_bbox (line 189) | def convert_bbox(self, format):
    method bbox_areas (line 192) | def bbox_areas(self):
    method scale (line 195) | def scale(self, scale_w, scale_h, bbox_only=False):
    method denormalize (line 206) | def denormalize(self, w, h):
    method normalize (line 217) | def normalize(self, w, h):
    method add_padding (line 228) | def add_padding(self, padw, padh):
    method __getitem__ (line 238) | def __getitem__(self, index) -> "Instances":
    method flipud (line 258) | def flipud(self, h):
    method fliplr (line 270) | def fliplr(self, w):
    method clip (line 282) | def clip(self, w, h):
    method update (line 295) | def update(self, bboxes, segments=None, keypoints=None):
    method __len__ (line 303) | def __len__(self):
    method concatenate (line 307) | def concatenate(cls, instances_list: List["Instances"], axis=0) -> "In...
    method bboxes (line 336) | def bboxes(self):

FILE: yolo/utils/loss.py
  class VarifocalLoss (line 11) | class VarifocalLoss(nn.Module):
    method __init__ (line 13) | def __init__(self):
    method forward (line 16) | def forward(self, pred_score, gt_score, label, alpha=0.75, gamma=2.0):
  class BboxLoss (line 24) | class BboxLoss(nn.Module):
    method __init__ (line 26) | def __init__(self, reg_max, use_dfl=False):
    method forward (line 31) | def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes...
    method _df_loss (line 48) | def _df_loss(pred_dist, target):
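`VarifocalLoss.forward` in loss.py takes `(pred_score, gt_score, label, alpha, gamma)`, which matches the varifocal loss weighting scheme: positives are weighted by their IoU-aware target score, while easy negatives are down-weighted by a focal term. A NumPy sketch of just that weight (an assumption from the standard formulation; the actual loss multiplies this by binary cross-entropy in torch):

```python
import numpy as np

def varifocal_weight(pred_score, gt_score, label, alpha=0.75, gamma=2.0):
    """Per-element varifocal weight: gt_score for positives (label=1),
    alpha * sigmoid(pred)^gamma for negatives (label=0)."""
    p = 1.0 / (1.0 + np.exp(-pred_score))  # sigmoid of raw logits
    return alpha * p ** gamma * (1 - label) + gt_score * label
```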

FILE: yolo/utils/metrics.py
  function box_area (line 18) | def box_area(box):
  function bbox_ioa (line 23) | def bbox_ioa(box1, box2, eps=1e-7):
  function box_iou (line 45) | def box_iou(box1, box2, eps=1e-7):
  function bbox_iou (line 66) | def bbox_iou(box1, box2, xywh=True, GIoU=False, DIoU=False, CIoU=False, ...
  function mask_iou (line 107) | def mask_iou(mask1, mask2, eps=1e-7):
  function masks_iou (line 119) | def masks_iou(mask1, mask2, eps=1e-7):
  function smooth_BCE (line 131) | def smooth_BCE(eps=0.1):  # https://github.com/ultralytics/yolov3/issues...
  class FocalLoss (line 137) | class FocalLoss(nn.Module):
    method __init__ (line 139) | def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
    method forward (line 147) | def forward(self, pred, true):
  class ConfusionMatrix (line 167) | class ConfusionMatrix:
    method __init__ (line 169) | def __init__(self, nc, conf=0.25, iou_thres=0.45):
    method process_batch (line 175) | def process_batch(self, detections, labels):
    method matrix (line 221) | def matrix(self):
    method tp_fp (line 224) | def tp_fp(self):
    method plot (line 231) | def plot(self, normalize=True, save_dir='', names=()):
    method print (line 261) | def print(self):
  function smooth (line 266) | def smooth(y, f=0.05):
  function plot_pr_curve (line 274) | def plot_pr_curve(px, py, ap, save_dir=Path('pr_curve.png'), names=()):
  function plot_mc_curve (line 296) | def plot_mc_curve(px, py, save_dir=Path('mc_curve.png'), names=(), xlabe...
  function compute_ap (line 318) | def compute_ap(recall, precision):
  function ap_per_class (line 346) | def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir=Pa...
  class Metric (line 413) | class Metric:
    method __init__ (line 415) | def __init__(self) -> None:
    method ap50 (line 423) | def ap50(self):
    method ap (line 431) | def ap(self):
    method mp (line 439) | def mp(self):
    method mr (line 447) | def mr(self):
    method map50 (line 455) | def map50(self):
    method map (line 463) | def map(self):
    method mean_results (line 470) | def mean_results(self):
    method class_result (line 474) | def class_result(self, i):
    method get_maps (line 478) | def get_maps(self, nc):
    method fitness (line 484) | def fitness(self):
    method update (line 489) | def update(self, results):
  class DetMetrics (line 497) | class DetMetrics:
    method __init__ (line 499) | def __init__(self, save_dir=Path("."), plot=False, names=()) -> None:
    method process (line 505) | def process(self, tp, conf, pred_cls, target_cls):
    method keys (line 511) | def keys(self):
    method mean_results (line 514) | def mean_results(self):
    method class_result (line 517) | def class_result(self, i):
    method get_maps (line 520) | def get_maps(self, nc):
    method fitness (line 524) | def fitness(self):
    method ap_class_index (line 528) | def ap_class_index(self):
    method results_dict (line 532) | def results_dict(self):
  class SegmentMetrics (line 536) | class SegmentMetrics:
    method __init__ (line 538) | def __init__(self, save_dir=Path("."), plot=False, names=()) -> None:
    method process (line 545) | def process(self, tp_m, tp_b, conf, pred_cls, target_cls):
    method keys (line 566) | def keys(self):
    method mean_results (line 571) | def mean_results(self):
    method class_result (line 574) | def class_result(self, i):
    method get_maps (line 577) | def get_maps(self, nc):
    method fitness (line 581) | def fitness(self):
    method ap_class_index (line 585) | def ap_class_index(self):
    method results_dict (line 590) | def results_dict(self):
  class ClassifyMetrics (line 594) | class ClassifyMetrics:
    method __init__ (line 596) | def __init__(self) -> None:
    method process (line 600) | def process(self, targets, pred):
    method fitness (line 608) | def fitness(self):
    method results_dict (line 612) | def results_dict(self):
    method keys (line 616) | def keys(self):

FILE: yolo/utils/ops.py
  class Profile (line 19) | class Profile(contextlib.ContextDecorator):
    method __init__ (line 21) | def __init__(self, t=0.0):
    method __enter__ (line 25) | def __enter__(self):
    method __exit__ (line 29) | def __exit__(self, type, value, traceback):
    method time (line 33) | def time(self):
  function coco80_to_coco91_class (line 39) | def coco80_to_coco91_class():  # converts 80-index (val2014) to 91-index...
  function segment2box (line 51) | def segment2box(segment, width=640, height=640):
  function scale_boxes (line 70) | def scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None):
  function make_divisible (line 97) | def make_divisible(x, divisor):
  function non_max_suppression (line 104) | def non_max_suppression(
  function clip_boxes (line 232) | def clip_boxes(boxes, shape):
  function clip_coords (line 251) | def clip_coords(boxes, shape):
  function scale_image (line 263) | def scale_image(im1_shape, masks, im0_shape, ratio_pad=None):
  function xyxy2xywh (line 298) | def xyxy2xywh(x):
  function xywh2xyxy (line 317) | def xywh2xyxy(x):
  function xywhn2xyxy (line 335) | def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0):
  function xyxy2xywhn (line 357) | def xyxy2xywhn(x, w=640, h=640, clip=False, eps=0.0):
  function xyn2xy (line 382) | def xyn2xy(x, w=640, h=640, padw=0, padh=0):
  function xywh2ltwh (line 402) | def xywh2ltwh(x):
  function xyxy2ltwh (line 418) | def xyxy2ltwh(x):
  function ltwh2xywh (line 434) | def ltwh2xywh(x):
  function ltwh2xyxy (line 447) | def ltwh2xyxy(x):
  function segments2boxes (line 464) | def segments2boxes(segments):
  function resample_segments (line 482) | def resample_segments(segments, n=1000):
  function crop_mask (line 502) | def crop_mask(masks, boxes):
  function process_mask_upsample (line 521) | def process_mask_upsample(protos, masks_in, bboxes, shape):
  function process_mask (line 542) | def process_mask(protos, masks_in, bboxes, shape, upsample=False):
  function process_mask_native (line 573) | def process_mask_native(protos, masks_in, bboxes, shape):
  function scale_segments (line 599) | def scale_segments(img1_shape, segments, img0_shape, ratio_pad=None, nor...
  function masks2segments (line 630) | def masks2segments(masks, strategy='largest'):
  function clip_segments (line 655) | def clip_segments(segments, shape):
  function clean_str (line 672) | def clean_str(s):
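Many ops.py helpers are box-coordinate format conversions. A sketch of the two most common, `xyxy2xywh` and `xywh2xyxy`, written from the standard definitions (the repo's versions also accept torch tensors):

```python
import numpy as np

def xyxy2xywh(x):
    """[x1, y1, x2, y2] -> [center_x, center_y, width, height]."""
    y = np.copy(x)
    y[..., 0] = (x[..., 0] + x[..., 2]) / 2  # center x
    y[..., 1] = (x[..., 1] + x[..., 3]) / 2  # center y
    y[..., 2] = x[..., 2] - x[..., 0]        # width
    y[..., 3] = x[..., 3] - x[..., 1]        # height
    return y

def xywh2xyxy(x):
    """[center_x, center_y, width, height] -> [x1, y1, x2, y2]."""
    y = np.copy(x)
    y[..., 0] = x[..., 0] - x[..., 2] / 2  # top-left x
    y[..., 1] = x[..., 1] - x[..., 3] / 2  # top-left y
    y[..., 2] = x[..., 0] + x[..., 2] / 2  # bottom-right x
    y[..., 3] = x[..., 1] + x[..., 3] / 2  # bottom-right y
    return y
```

The two are exact inverses, which is why NMS can run in xyxy space while labels are stored in (normalized) xywh.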

FILE: yolo/utils/plotting.py
  class Colors (line 22) | class Colors:
    method __init__ (line 24) | def __init__(self):
    method __call__ (line 31) | def __call__(self, i, bgr=False):
    method hex2rgb (line 36) | def hex2rgb(h):  # rgb order (PIL)
  class Annotator (line 43) | class Annotator:
    method __init__ (line 45) | def __init__(self, im, line_width=None, font_size=None, font='Arial.tt...
    method box_label (line 58) | def box_label(self, box, label='', color=(128, 128, 128), txt_color=(2...
    method masks (line 89) | def masks(self, masks, colors, im_gpu, alpha=0.5, retina_masks=False):
    method rectangle (line 120) | def rectangle(self, xy, fill=None, outline=None, width=1):
    method text (line 124) | def text(self, xy, text, txt_color=(255, 255, 255), anchor='top'):
    method fromarray (line 131) | def fromarray(self, im):
    method result (line 136) | def result(self):
  function check_pil_font (line 141) | def check_pil_font(font=FONT, size=10):
  function save_one_box (line 157) | def save_one_box(xyxy, im, file=Path('im.jpg'), gain=1.02, pad=10, squar...
  function plot_images (line 176) | def plot_images(images,
  function plot_results (line 280) | def plot_results(file='path/to/results.csv', dir='', segment=False):
  function output_to_target (line 311) | def output_to_target(output, max_det=300):
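The `Colors` class in plotting.py deterministically maps a class index to an RGB tuple by cycling a fixed hex palette. A shortened sketch (the repository's palette has many more entries; the hex values here are a truncated subset for illustration):

```python
class Colors:
    """Cycle through a fixed palette of hex colors, indexed with wrap-around."""

    def __init__(self):
        hexs = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231')
        self.palette = [self.hex2rgb(f'#{h}') for h in hexs]
        self.n = len(self.palette)

    def __call__(self, i, bgr=False):
        c = self.palette[int(i) % self.n]
        return (c[2], c[1], c[0]) if bgr else c  # OpenCV wants BGR, PIL wants RGB

    @staticmethod
    def hex2rgb(h):
        """'#FF3838' -> (255, 56, 56)."""
        return tuple(int(h[1 + j:1 + j + 2], 16) for j in (0, 2, 4))
```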

FILE: yolo/utils/tal.py
  function select_candidates_in_gts (line 13) | def select_candidates_in_gts(xy_centers, gt_bboxes, eps=1e-9):
  function select_highest_overlaps (line 30) | def select_highest_overlaps(mask_pos, overlaps, n_max_boxes):
  class TaskAlignedAssigner (line 56) | class TaskAlignedAssigner(nn.Module):
    method __init__ (line 58) | def __init__(self, topk=13, num_classes=80, alpha=1.0, beta=6.0, eps=1...
    method forward (line 68) | def forward(self, pd_scores, pd_bboxes, anc_points, gt_labels, gt_bbox...
    method get_pos_mask (line 111) | def get_pos_mask(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc...
    method get_box_metrics (line 124) | def get_box_metrics(self, pd_scores, pd_bboxes, gt_labels, gt_bboxes):
    method select_topk_candidates (line 135) | def select_topk_candidates(self, metrics, largest=True, topk_mask=None):
    method get_targets (line 155) | def get_targets(self, gt_labels, gt_bboxes, target_gt_idx, fg_mask):
  function make_anchors (line 181) | def make_anchors(feats, strides, grid_cell_offset=0.5):
  function dist2bbox (line 196) | def dist2bbox(distance, anchor_points, xywh=True, dim=-1):
  function bbox2dist (line 208) | def bbox2dist(anchor_points, bbox, reg_max):
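`dist2bbox` in tal.py decodes the head's predicted (left, top, right, bottom) distances around each anchor point back into boxes. A NumPy sketch of that decoding, written from the signature (the repo's version operates on torch tensors):

```python
import numpy as np

def dist2bbox(distance, anchor_points, xywh=True):
    """Decode (left, top, right, bottom) distances around anchors into boxes."""
    lt, rb = np.split(distance, 2, axis=-1)  # left-top and right-bottom offsets
    x1y1 = anchor_points - lt
    x2y2 = anchor_points + rb
    if xywh:
        c_xy = (x1y1 + x2y2) / 2  # box center
        wh = x2y2 - x1y1          # box size
        return np.concatenate([c_xy, wh], axis=-1)
    return np.concatenate([x1y1, x2y2], axis=-1)
```

`bbox2dist` is the inverse used when building DFL targets, with distances clamped to `reg_max`.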

FILE: yolo/utils/torch_utils.py
  function torch_distributed_zero_first (line 32) | def torch_distributed_zero_first(local_rank: int):
  function smart_inference_mode (line 42) | def smart_inference_mode(torch_1_9=check_version(torch.__version__, '1.9...
  function DDP_model (line 50) | def DDP_model(model):
  function select_device (line 61) | def select_device(device='', batch_size=0, newline=False):
  function time_sync (line 97) | def time_sync():
  function fuse_conv_and_bn (line 104) | def fuse_conv_and_bn(conv, bn):
  function model_info (line 128) | def model_info(model, verbose=False, imgsz=640):
  function get_num_params (line 145) | def get_num_params(model):
  function get_num_gradients (line 149) | def get_num_gradients(model):
  function get_flops (line 153) | def get_flops(model, imgsz=640):
  function initialize_weights (line 167) | def initialize_weights(model):
  function scale_img (line 179) | def scale_img(img, ratio=1.0, same_shape=False, gs=32):  # img(16,3,256,...
  function make_divisible (line 191) | def make_divisible(x, divisor):
  function copy_attr (line 198) | def copy_attr(a, b, include=(), exclude=()):
  function intersect_dicts (line 207) | def intersect_dicts(da, db, exclude=()):
  function is_parallel (line 212) | def is_parallel(model):
  function de_parallel (line 217) | def de_parallel(model):
  function one_cycle (line 222) | def one_cycle(y1=0.0, y2=1.0, steps=100):
  function init_seeds (line 227) | def init_seeds(seed=0, deterministic=False):
  class ModelEMA (line 242) | class ModelEMA:
    method __init__ (line 248) | def __init__(self, model, decay=0.9999, tau=2000, updates=0):
    method update (line 256) | def update(self, model):
    method update_attr (line 268) | def update_attr(self, model, include=(), exclude=('process_group', 're...
  function strip_optimizer (line 273) | def strip_optimizer(f='best.pt', s=''):
  function guess_task_from_head (line 306) | def guess_task_from_head(head):
  function profile (line 321) | def profile(input, ops, n=10, device=None):
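Two of the small helpers indexed above are self-describing from their signatures; a sketch of how `one_cycle` (a cosine learning-rate ramp) and `make_divisible` (rounding channel counts up to a multiple) plausibly behave, matching the conventional Ultralytics definitions:

```python
import math

def one_cycle(y1=0.0, y2=1.0, steps=100):
    # Sinusoidal ramp from y1 to y2 over `steps`, used as an LR-lambda
    return lambda x: ((1 - math.cos(x * math.pi / steps)) / 2) * (y2 - y1) + y1

def make_divisible(x, divisor):
    # Round x up to the nearest multiple of `divisor` (e.g. channel widths)
    return math.ceil(x / divisor) * divisor

lf = one_cycle(1.0, 0.1, steps=100)
print(lf(0), lf(100))          # ramps smoothly from 1.0 down to ~0.1
print(make_divisible(50, 8))   # 56
```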

FILE: yolo/v8/detect/detect_and_trk.py
  function init_tracker (line 13) | def init_tracker():
  function draw_boxes (line 23) | def draw_boxes(img, bbox, identities=None, categories=None, names=None, ...
  function random_color_list (line 42) | def random_color_list():
  class DetectionPredictor (line 55) | class DetectionPredictor(BasePredictor):
    method get_annotator (line 57) | def get_annotator(self, img):
    method preprocess (line 60) | def preprocess(self, img):
    method postprocess (line 66) | def postprocess(self, preds, img, orig_img):
    method write_results (line 79) | def write_results(self, idx, preds, batch):
  function predict (line 141) | def predict(cfg):

FILE: yolo/v8/detect/predict.py
  class DetectionPredictor (line 12) | class DetectionPredictor(BasePredictor):
    method get_annotator (line 14) | def get_annotator(self, img):
    method preprocess (line 17) | def preprocess(self, img):
    method postprocess (line 23) | def postprocess(self, preds, img, orig_img):
    method write_results (line 36) | def write_results(self, idx, preds, batch):
  function predict (line 87) | def predict(cfg):

FILE: yolo/v8/detect/sort.py
  function linear_assignment (line 18) | def linear_assignment(cost_matrix):
  function iou_batch (line 30) | def iou_batch(bb_test, bb_gt):
  function convert_bbox_to_z (line 48) | def convert_bbox_to_z(bbox):
  function convert_x_to_bbox (line 61) | def convert_x_to_bbox(x, score=None):
  class KalmanBoxTracker (line 70) | class KalmanBoxTracker(object):
    method __init__ (line 73) | def __init__(self, bbox):
    method update (line 108) | def update(self, bbox):
    method predict (line 123) | def predict(self):
    method get_state (line 143) | def get_state(self):
  function associate_detections_to_trackers (line 160) | def associate_detections_to_trackers(detections, trackers, iou_threshold...
  class Sort (line 209) | class Sort(object):
    method __init__ (line 210) | def __init__(self, max_age=1, min_hits=3, iou_threshold=0.3):
    method getTrackers (line 219) | def getTrackers(self,):
    method update (line 222) | def update(self, dets= np.empty((0,6))):
  function parse_args (line 272) | def parse_args():
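The SORT file above associates detections to trackers by pairwise IoU before Hungarian matching. A self-contained NumPy sketch of the standard broadcast-based IoU computation that `iou_batch` names (written from the standard SORT formulation, not copied from this file):

```python
import numpy as np

def iou_batch(bb_test, bb_gt):
    # Pairwise IoU between two sets of [x1, y1, x2, y2] boxes via broadcasting
    bb_gt = np.expand_dims(bb_gt, 0)      # (1, M, 4)
    bb_test = np.expand_dims(bb_test, 1)  # (N, 1, 4)
    xx1 = np.maximum(bb_test[..., 0], bb_gt[..., 0])
    yy1 = np.maximum(bb_test[..., 1], bb_gt[..., 1])
    xx2 = np.minimum(bb_test[..., 2], bb_gt[..., 2])
    yy2 = np.minimum(bb_test[..., 3], bb_gt[..., 3])
    w = np.maximum(0.0, xx2 - xx1)
    h = np.maximum(0.0, yy2 - yy1)
    inter = w * h
    area_test = (bb_test[..., 2] - bb_test[..., 0]) * (bb_test[..., 3] - bb_test[..., 1])
    area_gt = (bb_gt[..., 2] - bb_gt[..., 0]) * (bb_gt[..., 3] - bb_gt[..., 1])
    return inter / (area_test + area_gt - inter)

a = np.array([[0.0, 0.0, 10.0, 10.0]])
b = np.array([[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 15.0, 15.0]])
print(iou_batch(a, b))  # identical boxes -> 1.0; partial overlap -> ~0.143
```

Pairs whose IoU exceeds `iou_threshold` (default 0.3 in `Sort.__init__`) are kept as matches; unmatched detections spawn new `KalmanBoxTracker` instances.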

FILE: yolo/v8/detect/train.py
  class DetectionTrainer (line 23) | class DetectionTrainer(BaseTrainer):
    method get_dataloader (line 25) | def get_dataloader(self, dataset_path, batch_size, mode="train", rank=0):
    method preprocess_batch (line 46) | def preprocess_batch(self, batch):
    method set_model_attributes (line 50) | def set_model_attributes(self):
    method get_model (line 60) | def get_model(self, cfg=None, weights=None, verbose=True):
    method get_validator (line 67) | def get_validator(self):
    method criterion (line 74) | def criterion(self, preds, batch):
    method label_loss_items (line 79) | def label_loss_items(self, loss_items=None, prefix="train"):
    method progress_string (line 91) | def progress_string(self):
    method plot_training_samples (line 95) | def plot_training_samples(self, batch, ni):
    method plot_metrics (line 103) | def plot_metrics(self):
  class Loss (line 108) | class Loss:
    method __init__ (line 110) | def __init__(self, model):  # model must be de-paralleled
    method preprocess (line 129) | def preprocess(self, targets, batch_size, scale_tensor):
    method bbox_decode (line 144) | def bbox_decode(self, anchor_points, pred_dist):
    method __call__ (line 152) | def __call__(self, preds, batch):
  function train (line 199) | def train(cfg):
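`Loss.bbox_decode` above converts distribution-focal-loss (DFL) logits into distances: each box side is predicted as a discrete distribution over `reg_max` bins, and the expectation over bin indices gives the continuous value. A hypothetical NumPy illustration of that mechanism (`dfl_decode` is an illustrative name; the repo's version operates on torch tensors):

```python
import numpy as np

def dfl_decode(pred_dist, reg_max=16):
    # pred_dist: (batch, anchors, 4 * reg_max) raw logits
    b, a, c = pred_dist.shape
    x = pred_dist.reshape(b, a, 4, reg_max)
    x = x - x.max(-1, keepdims=True)                  # numerically stable softmax
    probs = np.exp(x) / np.exp(x).sum(-1, keepdims=True)
    proj = np.arange(reg_max, dtype=np.float64)       # bin indices 0..reg_max-1
    return probs @ proj                               # expected distance per side

logits = np.zeros((1, 1, 64))                         # uniform over 16 bins
print(dfl_decode(logits))                             # (reg_max - 1) / 2 = 7.5 per side
```

The decoded distances are then mapped to boxes with `dist2bbox` before computing the IoU loss.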

FILE: yolo/v8/detect/val.py
  class DetectionValidator (line 20) | class DetectionValidator(BaseValidator):
    method __init__ (line 22) | def __init__(self, dataloader=None, save_dir=None, pbar=None, logger=N...
    method preprocess (line 31) | def preprocess(self, batch):
    method init_metrics (line 44) | def init_metrics(self, model):
    method get_desc (line 57) | def get_desc(self):
    method postprocess (line 60) | def postprocess(self, preds):
    method update_metrics (line 70) | def update_metrics(self, preds, batch):
    method get_stats (line 113) | def get_stats(self):
    method print_results (line 120) | def print_results(self):
    method _process_batch (line 135) | def _process_batch(self, detections, labels):
    method get_dataloader (line 160) | def get_dataloader(self, dataset_path, batch_size):
    method plot_val_samples (line 178) | def plot_val_samples(self, batch, ni):
    method plot_predictions (line 187) | def plot_predictions(self, batch, preds, ni):
    method pred_to_json (line 194) | def pred_to_json(self, predn, filename):
    method eval_json (line 206) | def eval_json(self, stats):
  function val (line 233) | def val(cfg):
Condensed preview — 77 files, each showing path, character count, and a content snippet (full structured content: 694K chars).
[
  {
    "path": "LICENSE",
    "chars": 34523,
    "preview": "                    GNU AFFERO GENERAL PUBLIC LICENSE\n                       Version 3, 19 November 2007\n\n Copyright (C)"
  },
  {
    "path": "README.md",
    "chars": 5180,
    "preview": "# yolov8-object-tracking \n\nThis is compatible only with `ultralytics==8.0.0`. However, I highly recommend using the late"
  },
  {
    "path": "__init__.py",
    "chars": 94,
    "preview": "from hub import checks\r\nfrom engine.model import YOLO\r\nfrom utils import ops\r\nfrom . import v8"
  },
  {
    "path": "models/v8/yolov8l.yaml",
    "chars": 1236,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\n# Parameters\r\nnc: 80  # number of classes\r\ndepth_multiple: 1.00  # scales modul"
  },
  {
    "path": "models/v8/yolov8m.yaml",
    "chars": 1236,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\n# Parameters\r\nnc: 80  # number of classes\r\ndepth_multiple: 0.67  # scales modul"
  },
  {
    "path": "models/v8/yolov8n.yaml",
    "chars": 1240,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\n# Parameters\r\nnc: 80  # number of classes\r\ndepth_multiple: 0.33  # scales modul"
  },
  {
    "path": "models/v8/yolov8s.yaml",
    "chars": 1240,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\n# Parameters\r\nnc: 80  # number of classes\r\ndepth_multiple: 0.33  # scales modul"
  },
  {
    "path": "models/v8/yolov8x.yaml",
    "chars": 1236,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\n# Parameters\r\nnc: 80  # number of classes\r\ndepth_multiple: 1.00  # scales modul"
  },
  {
    "path": "models/v8/yolov8x6.yaml",
    "chars": 1616,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\n# Parameters\r\nnc: 80  # number of classes\r\ndepth_multiple: 1.00  # scales modul"
  },
  {
    "path": "nn/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "nn/autobackend.py",
    "chars": 20307,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport json\r\nimport platform\r\nfrom collections import OrderedDict, namedtuple\r\n"
  },
  {
    "path": "nn/modules.py",
    "chars": 30679,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nCommon modules\r\n\"\"\"\r\n\r\nimport math\r\nimport warnings\r\nfrom copy import copy\r\n"
  },
  {
    "path": "nn/tasks.py",
    "chars": 18592,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport contextlib\r\nfrom copy import deepcopy\r\n\r\nimport thop\r\nimport torch\r\nimpo"
  },
  {
    "path": "requirements.txt",
    "chars": 1246,
    "preview": "# Ultralytics requirements\r\n# Usage: pip install -r requirements.txt\r\n\r\n# Base ----------------------------------------\r"
  },
  {
    "path": "yolo/cli.py",
    "chars": 1765,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport shutil\r\nfrom pathlib import Path\r\n\r\nimport hydra\r\n\r\nimport hub, yolo\r\nfr"
  },
  {
    "path": "yolo/configs/__init__.py",
    "chars": 1236,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom pathlib import Path\r\nfrom typing import Dict, Union\r\n\r\nfrom omegaconf impo"
  },
  {
    "path": "yolo/configs/default.yaml",
    "chars": 5535,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Default training settings and hyperparameters for medium-augmentation COCO trai"
  },
  {
    "path": "yolo/configs/hydra_patch.py",
    "chars": 3818,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport sys\r\nfrom difflib import get_close_matches\r\nfrom textwrap import dedent\r"
  },
  {
    "path": "yolo/data/__init__.py",
    "chars": 265,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom .base import BaseDataset\r\nfrom .build import build_classification_dataload"
  },
  {
    "path": "yolo/data/augment.py",
    "chars": 31485,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport math\r\nimport random\r\nfrom copy import deepcopy\r\n\r\nimport cv2\r\nimport num"
  },
  {
    "path": "yolo/data/base.py",
    "chars": 8766,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport glob\r\nimport math\r\nimport os\r\nfrom multiprocessing.pool import ThreadPoo"
  },
  {
    "path": "yolo/data/build.py",
    "chars": 5085,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport os\r\nimport random\r\n\r\nimport numpy as np\r\nimport torch\r\nfrom torch.utils."
  },
  {
    "path": "yolo/data/dataloaders/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "yolo/data/dataloaders/stream_loaders.py",
    "chars": 11526,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport glob\r\nimport math\r\nimport os\r\nimport time\r\nfrom pathlib import Path\r\nfro"
  },
  {
    "path": "yolo/data/dataloaders/v5augmentations.py",
    "chars": 17642,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nImage augmentation functions\r\n\"\"\"\r\n\r\nimport math\r\nimport random\r\n\r\nimport cv"
  },
  {
    "path": "yolo/data/dataloaders/v5loader.py",
    "chars": 56515,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nDataloaders and dataset utils\r\n\"\"\"\r\n\r\nimport contextlib\r\nimport glob\r\nimport"
  },
  {
    "path": "yolo/data/dataset.py",
    "chars": 9757,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom itertools import repeat\r\nfrom multiprocessing.pool import Pool\r\nfrom pathl"
  },
  {
    "path": "yolo/data/dataset_wrappers.py",
    "chars": 1366,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport collections\r\nfrom copy import deepcopy\r\n\r\nfrom .augment import LetterBox"
  },
  {
    "path": "yolo/data/datasets/Argoverse.yaml",
    "chars": 2779,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Argoverse-HD dataset (ring-front-center camera) http://www.cs.cmu.edu/~mengtial"
  },
  {
    "path": "yolo/data/datasets/GlobalWheat2020.yaml",
    "chars": 1911,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Global Wheat 2020 dataset http://www.global-wheat.com/ by University of Saskatc"
  },
  {
    "path": "yolo/data/datasets/ImageNet.yaml",
    "chars": 19865,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# ImageNet-1k dataset https://www.image-net.org/index.php by Stanford University\r"
  },
  {
    "path": "yolo/data/datasets/Objects365.yaml",
    "chars": 9615,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Objects365 dataset https://www.objects365.org/ by Megvii\r\n# Example usage: pyth"
  },
  {
    "path": "yolo/data/datasets/SKU-110K.yaml",
    "chars": 2366,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# SKU-110K retail items dataset https://github.com/eg4000/SKU110K_CVPR19 by Trax "
  },
  {
    "path": "yolo/data/datasets/VOC.yaml",
    "chars": 3565,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC by University of Oxfo"
  },
  {
    "path": "yolo/data/datasets/VisDrone.yaml",
    "chars": 3013,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# VisDrone2019-DET dataset https://github.com/VisDrone/VisDrone-Dataset by Tianji"
  },
  {
    "path": "yolo/data/datasets/coco.yaml",
    "chars": 2574,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# COCO 2017 dataset http://cocodataset.org by Microsoft\r\n# Example usage: python "
  },
  {
    "path": "yolo/data/datasets/coco128-seg.yaml",
    "chars": 1939,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# COCO128-seg dataset https://www.kaggle.com/ultralytics/coco128 (first 128 image"
  },
  {
    "path": "yolo/data/datasets/coco128.yaml",
    "chars": 1923,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# COCO128 dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images fr"
  },
  {
    "path": "yolo/data/datasets/xView.yaml",
    "chars": 5295,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# DIUx xView 2018 Challenge https://challenge.xviewdataset.org by U.S. National G"
  },
  {
    "path": "yolo/data/scripts/download_weights.sh",
    "chars": 628,
    "preview": "#!/bin/bash\r\n# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Download latest models from https://github.com/ultralytics/yolov5/"
  },
  {
    "path": "yolo/data/scripts/get_coco.sh",
    "chars": 1757,
    "preview": "#!/bin/bash\r\n# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Download COCO 2017 dataset http://cocodataset.org\r\n# Example usage"
  },
  {
    "path": "yolo/data/scripts/get_coco128.sh",
    "chars": 607,
    "preview": "#!/bin/bash\r\n# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Download COCO128 dataset https://www.kaggle.com/ultralytics/coco12"
  },
  {
    "path": "yolo/data/scripts/get_imagenet.sh",
    "chars": 1694,
    "preview": "#!/bin/bash\r\n# Ultralytics YOLO 🚀, GPL-3.0 license\r\n# Download ILSVRC2012 ImageNet dataset https://image-net.org\r\n# Exam"
  },
  {
    "path": "yolo/data/utils.py",
    "chars": 13689,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport contextlib\r\nimport hashlib\r\nimport os\r\nimport subprocess\r\nimport time\r\nf"
  },
  {
    "path": "yolo/engine/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "yolo/engine/exporter.py",
    "chars": 40923,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nExport a YOLOv5 PyTorch model to other formats. TensorFlow exports authored "
  },
  {
    "path": "yolo/engine/model.py",
    "chars": 8477,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom pathlib import Path\r\n\r\nfrom ultralytics import yolo  # noqa\r\nfrom nn.tasks"
  },
  {
    "path": "yolo/engine/predictor.py",
    "chars": 11866,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nRun prediction on images, videos, directories, globs, YouTube, webcam, strea"
  },
  {
    "path": "yolo/engine/sort.py",
    "chars": 13248,
    "preview": "from __future__ import print_function\n\nimport os\nimport numpy as np\nimport matplotlib\nmatplotlib.use('Agg')\nimport matpl"
  },
  {
    "path": "yolo/engine/trainer.py",
    "chars": 25403,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nSimple training loop; Boilerplate that could apply to any arbitrary neural n"
  },
  {
    "path": "yolo/engine/validator.py",
    "chars": 9136,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport json\r\nfrom collections import defaultdict\r\nfrom pathlib import Path\r\n\r\ni"
  },
  {
    "path": "yolo/utils/__init__.py",
    "chars": 14084,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport contextlib\r\nimport inspect\r\nimport logging.config\r\nimport os\r\nimport pla"
  },
  {
    "path": "yolo/utils/autobatch.py",
    "chars": 3045,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nAuto-batch utils\r\n\"\"\"\r\n\r\nfrom copy import deepcopy\r\n\r\nimport numpy as np\r\nim"
  },
  {
    "path": "yolo/utils/callbacks/__init__.py",
    "chars": 64,
    "preview": "from .base import add_integration_callbacks, default_callbacks\r\n"
  },
  {
    "path": "yolo/utils/callbacks/base.py",
    "chars": 3328,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nBase callbacks\r\n\"\"\"\r\n\r\n\r\n# Trainer callbacks -------------------------------"
  },
  {
    "path": "yolo/utils/callbacks/clearml.py",
    "chars": 1913,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom yolo.utils.torch_utils import get_flops, get_num_params\r\n\r\ntry:\r\n    impor"
  },
  {
    "path": "yolo/utils/callbacks/comet.py",
    "chars": 1627,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom yolo.utils.torch_utils import get_flops, get_num_params\r\n\r\ntry:\r\n    impor"
  },
  {
    "path": "yolo/utils/callbacks/hub.py",
    "chars": 2615,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport json\r\nfrom time import time\r\n\r\nimport torch\r\n\r\nfrom hub.utils import PRE"
  },
  {
    "path": "yolo/utils/callbacks/tensorboard.py",
    "chars": 749,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom torch.utils.tensorboard import SummaryWriter\r\n\r\nwriter = None  # TensorBoa"
  },
  {
    "path": "yolo/utils/checks.py",
    "chars": 10329,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport glob\r\nimport inspect\r\nimport math\r\nimport platform\r\nimport urllib\r\nfrom "
  },
  {
    "path": "yolo/utils/dist.py",
    "chars": 2328,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport os\r\nimport shutil\r\nimport socket\r\nimport sys\r\nimport tempfile\r\n\r\nfrom . "
  },
  {
    "path": "yolo/utils/downloads.py",
    "chars": 6602,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport logging\r\nimport os\r\nimport subprocess\r\nimport urllib\r\nfrom itertools imp"
  },
  {
    "path": "yolo/utils/files.py",
    "chars": 3918,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport contextlib\r\nimport glob\r\nimport os\r\nimport urllib\r\nfrom datetime import "
  },
  {
    "path": "yolo/utils/instance.py",
    "chars": 11694,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom collections import abc\r\nfrom itertools import repeat\r\nfrom numbers import "
  },
  {
    "path": "yolo/utils/loss.py",
    "chars": 2193,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n\r\nfrom .m"
  },
  {
    "path": "yolo/utils/metrics.py",
    "chars": 23271,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\"\"\"\r\nModel validation metrics\r\n\"\"\"\r\nimport math\r\nimport warnings\r\nfrom pathlib im"
  },
  {
    "path": "yolo/utils/ops.py",
    "chars": 25598,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport contextlib\r\nimport math\r\nimport re\r\nimport time\r\n\r\nimport cv2\r\nimport nu"
  },
  {
    "path": "yolo/utils/plotting.py",
    "chars": 14824,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport contextlib\r\nimport math\r\nfrom pathlib import Path\r\nfrom urllib.error imp"
  },
  {
    "path": "yolo/utils/tal.py",
    "chars": 9788,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport torch\r\nimport torch.nn as nn\r\nimport torch.nn.functional as F\r\n\r\nfrom .c"
  },
  {
    "path": "yolo/utils/torch_utils.py",
    "chars": 16102,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport math\r\nimport os\r\nimport platform\r\nimport random\r\nimport time\r\nfrom conte"
  },
  {
    "path": "yolo/v8/__init__.py",
    "chars": 280,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom pathlib import Path\r\n\r\nfrom yolo.v8 import classify, detect, segment\r\n\r\nRO"
  },
  {
    "path": "yolo/v8/detect/__init__.py",
    "chars": 177,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom .predict import DetectionPredictor, predict\r\nfrom .train import DetectionT"
  },
  {
    "path": "yolo/v8/detect/detect_and_trk.py",
    "chars": 5586,
    "preview": "import hydra\r\nimport torch\r\nimport cv2\r\nfrom random import randint\r\nfrom sort import *\r\nfrom ultralytics.yolo.engine.pre"
  },
  {
    "path": "yolo/v8/detect/predict.py",
    "chars": 3950,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\n\nimport hydra\nimport torch\n\nfrom ultralytics.yolo.engine.predictor import BasePred"
  },
  {
    "path": "yolo/v8/detect/sort.py",
    "chars": 13248,
    "preview": "from __future__ import print_function\n\nimport os\nimport numpy as np\nimport matplotlib\nmatplotlib.use('Agg')\nimport matpl"
  },
  {
    "path": "yolo/v8/detect/train.py",
    "chars": 9892,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nfrom copy import copy\r\n\r\nimport hydra\r\nimport torch\r\nimport torch.nn as nn\r\n\r\nf"
  },
  {
    "path": "yolo/v8/detect/val.py",
    "chars": 11956,
    "preview": "# Ultralytics YOLO 🚀, GPL-3.0 license\r\n\r\nimport os\r\nfrom pathlib import Path\r\n\r\nimport hydra\r\nimport numpy as np\r\nimport"
  }
]

About this extraction

This page contains the full source code of the RizwanMunawar/yolov8-object-tracking GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction covers 77 files (639.3 KB, approximately 177.2k tokens) and a symbol index of 819 functions, classes, methods, constants, and types.

Extracted by GitExtract, built by Nikandr Surkov.
