main 5a7e5f6f6963 cached
72 files
557.1 KB
138.1k tokens
739 symbols
1 requests
Download .txt
Showing preview only (585K chars total). Download the full file or copy to clipboard to get everything.
Repository: harubaru/discord-stable-diffusion
Branch: main
Commit: 5a7e5f6f6963
Files: 72
Total size: 557.1 KB

Directory structure:
gitextract_gdf_fm52/

├── .gitignore
├── LICENSE
├── README.md
├── __main__.py
├── models/
│   ├── .keep
│   └── v1-inference.yaml
├── requirements.txt
├── run.bat
├── run.sh
├── setup.bat
├── setup.sh
├── src/
│   ├── bot/
│   │   ├── shanghai.py
│   │   └── stablecog.py
│   ├── core/
│   │   └── logging.py
│   ├── scripts/
│   │   └── win10patch.py
│   └── stablediffusion/
│       ├── dream.py
│       ├── inpaint.py
│       ├── ldm/
│       │   ├── __init__.py
│       │   ├── data/
│       │   │   ├── __init__.py
│       │   │   ├── base.py
│       │   │   ├── imagenet.py
│       │   │   ├── lsun.py
│       │   │   ├── personalized.py
│       │   │   └── personalized_style.py
│       │   ├── dream/
│       │   │   ├── conditioning.py
│       │   │   ├── devices.py
│       │   │   ├── generator/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── base.py
│       │   │   │   ├── img2img.py
│       │   │   │   ├── inpaint.py
│       │   │   │   └── txt2img.py
│       │   │   ├── image_util.py
│       │   │   ├── pngwriter.py
│       │   │   ├── readline.py
│       │   │   └── server.py
│       │   ├── generate.py
│       │   ├── gfpgan/
│       │   │   └── gfpgan_tools.py
│       │   ├── lr_scheduler.py
│       │   ├── models/
│       │   │   ├── autoencoder.py
│       │   │   └── diffusion/
│       │   │       ├── __init__.py
│       │   │       ├── classifier.py
│       │   │       ├── ddim.py
│       │   │       ├── ddpm.py
│       │   │       ├── ksampler.py
│       │   │       └── plms.py
│       │   ├── modules/
│       │   │   ├── attention.py
│       │   │   ├── diffusionmodules/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── model.py
│       │   │   │   ├── openaimodel.py
│       │   │   │   └── util.py
│       │   │   ├── distributions/
│       │   │   │   ├── __init__.py
│       │   │   │   └── distributions.py
│       │   │   ├── ema.py
│       │   │   ├── embedding_manager.py
│       │   │   ├── encoders/
│       │   │   │   ├── __init__.py
│       │   │   │   └── modules.py
│       │   │   ├── image_degradation/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── bsrgan.py
│       │   │   │   ├── bsrgan_light.py
│       │   │   │   └── utils_image.py
│       │   │   ├── losses/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── contperceptual.py
│       │   │   │   └── vqperceptual.py
│       │   │   └── x_transformer.py
│       │   ├── simplet2i.py
│       │   └── util.py
│       ├── text2image_compvis.py
│       ├── text2image_diffusers.py
│       └── translation.py
├── storage/
│   ├── init/
│   │   └── .keep
│   └── outputs/
│       └── .keep
└── win10fix.bat

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
storage/outputs/*.png
storage/init/*.png

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
log.txt

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
waifu-diffusion/


================================================
FILE: LICENSE
================================================
                    GNU GENERAL PUBLIC LICENSE
                       Version 2, June 1991

 Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users.  This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it.  (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.)  You can apply it to
your programs, too.

  When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.

  To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

  For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have.  You must make sure that they, too, receive or can get the
source code.  And you must show them these terms so they know their
rights.

  We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

  Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software.  If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

  Finally, any free program is threatened constantly by software
patents.  We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary.  To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.

  The precise terms and conditions for copying, distribution and
modification follow.

                    GNU GENERAL PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License.  The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language.  (Hereinafter, translation is included without limitation in
the term "modification".)  Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.

  1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.

You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.

  2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

    a) You must cause the modified files to carry prominent notices
    stating that you changed the files and the date of any change.

    b) You must cause any work that you distribute or publish, that in
    whole or in part contains or is derived from the Program or any
    part thereof, to be licensed as a whole at no charge to all third
    parties under the terms of this License.

    c) If the modified program normally reads commands interactively
    when run, you must cause it, when started running for such
    interactive use in the most ordinary way, to print or display an
    announcement including an appropriate copyright notice and a
    notice that there is no warranty (or else, saying that you provide
    a warranty) and that users may redistribute the program under
    these conditions, and telling the user how to view a copy of this
    License.  (Exception: if the Program itself is interactive but
    does not normally print such an announcement, your work based on
    the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

  3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

    a) Accompany it with the complete corresponding machine-readable
    source code, which must be distributed under the terms of Sections
    1 and 2 above on a medium customarily used for software interchange; or,

    b) Accompany it with a written offer, valid for at least three
    years, to give any third party, for a charge no more than your
    cost of physically performing source distribution, a complete
    machine-readable copy of the corresponding source code, to be
    distributed under the terms of Sections 1 and 2 above on a medium
    customarily used for software interchange; or,

    c) Accompany it with the information you received as to the offer
    to distribute corresponding source code.  (This alternative is
    allowed only for noncommercial distribution and only if you
    received the program in object code or executable form with such
    an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for
making modifications to it.  For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable.  However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.

  4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License.  Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.

  5. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Program or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.

  6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.

  7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all.  For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

  8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded.  In such case, this License incorporates
the limitation as if written in the body of this License.

  9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time.  Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number.  If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation.  If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.

  10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission.  For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this.  Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

                            NO WARRANTY

  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

                     END OF TERMS AND CONDITIONS

            How to Apply These Terms to Your New Programs

  If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

  To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program; if not, write to the Free Software Foundation, Inc.,
    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

    Gnomovision version 69, Copyright (C) year name of author
    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it
    under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License.  Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary.  Here is a sample; alter the names:

  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
  `Gnomovision' (which makes passes at compilers) written by James Hacker.

  <signature of Ty Coon>, 1 April 1989
  Ty Coon, President of Vice

This General Public License does not permit incorporating your program into
proprietary programs.  If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library.  If this is what you want to do, use the GNU Lesser General
Public License instead of this License.


================================================
FILE: README.md
================================================
# Shanghai - AI Powered Art in a Discord Bot!

<img src=https://cdn.discordapp.com/attachments/971549874514444358/1012400070559277086/1502073419.png?3867929 width=50% height=50%>

### Any questions or need help? Come hop on by to our Discord server!

[![Discord Server](https://discordapp.com/api/guilds/930499730843250783/widget.png?style=banner2)](https://discord.gg/Sx6Spmsgx7)


## Setup
Make sure you have the [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads) installed

Clone the repository and enter it
````
git clone https://github.com/harubaru/discord-stable-diffusion.git
cd discord-stable-diffusion
````

#### WINDOWS SETUP
Run `setup.bat`. If you run into any errors, try running the file as administrator

If you are on a Windows 10 system, run `win10patch.bat`

Modify the `run.bat` file, where
* `--model_path` is the path to the model (make sure to replace any backslashes with double backslashes),
* `--token=` is the token to the Discord bot
* `--hf_token=` is your huggingface token (can be found [here](https://huggingface.co/settings/tokens))

Run the `run.bat` file
#### LINUX SETUP
Run `./setup.sh`. If you run into any errors, try using `sudo ./setup.sh`

Modify the `run.sh` file, where
* `--model_path` is the path to the model,
* `--token=` is the token to the Discord bot
* `--hf_token=` is your huggingface token (can be found [here](https://huggingface.co/settings/tokens))

Run `./run.sh`

### Quickstart
#### Text to Image

To generate an image from text, use the ``/dream`` command and include your prompt as the query. There's tons of parameters to play with so go wild!

![image](https://user-images.githubusercontent.com/26317155/186722689-3cbca12a-531c-47f7-b87f-99918e9ed232.png)

![image](https://user-images.githubusercontent.com/26317155/186721768-3684f629-90c3-4ef2-82b8-1310200df437.png)


#### Image to Image

To generate an image from another image, use the ``/dream`` command and include the `init_image` and `strength` parameters. The image needs to be attached to the message.

![image](https://user-images.githubusercontent.com/26317155/186722463-ec3a6d24-36c1-48f8-b09a-57651706848c.png)

![image](https://user-images.githubusercontent.com/26317155/186722528-7e652a21-fd02-4071-9fc1-87a31dfb6e63.png)


#### (Experimental) Inpainting

To fill in a mask in an image, supply a prompt, the `init_image`, `mask_image` and `strength` parameters. The mask needs to consist of black pixels in a transparent image.

![image](https://user-images.githubusercontent.com/26317155/186722970-71a662dc-16a8-4bb4-8696-3bafb3e08e65.png)



================================================
FILE: __main__.py
================================================
import os
import sys
import argparse
import asyncio
from src.core.logging import get_logger
from src.bot.shanghai import Shanghai

logger = get_logger(__name__)

def parse_args():
    parser = argparse.ArgumentParser(
        description='Shanghai - A Discord bot for AI powered utilities.',
        usage='shanghai [arguments]'
    )

    parser.add_argument('--prefix', type=str, help='The prefix to use for commands.', default='s!')
    parser.add_argument('--token', type=str, help='The token to use for authentication.')
    parser.add_argument('--hf_token', type=str, help='The token to use for HuggingFace authentication.', default=None)
    parser.add_argument('--model_path', type=str, help='Path to the model.', default=None)

    return parser.parse_args()

async def shutdown(bot):
    await bot.close()

def main():
    shanghai = None
    args = parse_args()
    
    try:
        shanghai = Shanghai(args)
        logger.info('Executing bot.')
        shanghai.run(args.token)
    except KeyboardInterrupt:
        logger.info('Keyboard interrupt received. Exiting.')
        asyncio.run(shutdown(shanghai))
    except SystemExit:
        logger.info('System exit received. Exiting.')
        asyncio.run(shutdown(shanghai))
    except Exception as e:
        logger.error(e)
        asyncio.run(shutdown(shanghai))
    finally:
        sys.exit(0)

if __name__ == '__main__':
    main()

================================================
FILE: models/.keep
================================================
壊れたカーテンの隙間から
壁を埋めるのは
暴言?妄言?知りません。

================================================
FILE: models/v1-inference.yaml
================================================
model:
  base_learning_rate: 1.0e-04
  target: src.stablediffusion.ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false   # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    scheduler_config: # 10000 warmup steps
      target: src.stablediffusion.ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 10000 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    personalization_config:
      target: src.stablediffusion.ldm.modules.embedding_manager.EmbeddingManager
      params:
        placeholder_strings: ["*"]
        initializer_words: ["sculpture"]
        per_image_tokens: false
        num_vectors_per_token: 1
        progressive_words: False

    unet_config:
      target: src.stablediffusion.ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: src.stablediffusion.ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: src.stablediffusion.ldm.modules.encoders.modules.FrozenCLIPEmbedder


================================================
FILE: requirements.txt
================================================
--extra-index-url https://download.pytorch.org/whl/cu117
torch
diffusers
numpy
Pillow
pydantic
git+https://github.com/Pycord-Development/pycord
omegaconf==2.1.1
pytorch-lightning==1.4.2
taming-transformers-rom1504==0.0.6
test-tube>=0.7.5
torch-fidelity==0.3.0
torchmetrics==0.6.0
transformers==4.19.2
git+https://github.com/openai/CLIP.git@main#egg=clip
git+https://github.com/lstein/k-diffusion.git@master#egg=k-diffusion


================================================
FILE: run.bat
================================================
venv\Scripts\python.exe . --model_path "" --token=""

================================================
FILE: run.sh
================================================
venv/bin/python . --model_path "" --token="" --hf_token=""


================================================
FILE: setup.bat
================================================
python -m venv venv
venv\Scripts\pip.exe install -r requirements.txt

================================================
FILE: setup.sh
================================================
python -m venv venv
venv/bin/pip install -r requirements.txt


================================================
FILE: src/bot/shanghai.py
================================================
import asyncio
import os
from abc import ABC

import discord
from discord.ext import commands
from src.core.logging import get_logger


class Shanghai(commands.Bot, ABC):
    def __init__(self, args):
        intents = discord.Intents.default()
        intents.members = True
        super().__init__(command_prefix=args.prefix, intents=intents)
        self.args = args
        self.logger = get_logger(__name__)
        self.load_extension('src.bot.stablecog')

    async def on_ready(self):
        self.logger.info(f'Logged in as {self.user.name} ({self.user.id})')
        await self.change_presence(
            activity=discord.Activity(type=discord.ActivityType.watching, name='you over the seven seas.'))

    async def on_message(self, message):
        if message.author == self.user:
            try:
                # Check if the message from Shanghai was actually a generation
                if message.embeds[0].fields[0].name == 'command':
                    await message.add_reaction('❌')
            except:
                pass

    async def on_raw_reaction_add(self, ctx):
        if ctx.emoji.name == '❌':
            message = await self.get_channel(ctx.channel_id).fetch_message(ctx.message_id)
            if message.embeds:
                # look at the message footer to see if the generation was by the user who reacted
                if message.embeds[0].footer.text == f'{ctx.member.name}#{ctx.member.discriminator}':
                    await message.delete()


================================================
FILE: src/bot/stablecog.py
================================================
import traceback
from asyncio import AbstractEventLoop
from threading import Thread

import requests
import asyncio
import discord
from discord.ext import commands
from typing import Optional
from io import BytesIO
from PIL import Image
from discord import option
import random
import time

from src.stablediffusion.text2image_compvis import Text2Image

embed_color = discord.Colour.from_rgb(215, 195, 134)


class QueueObject:
    def __init__(self, ctx, prompt, height, width, guidance_scale, steps, seed, strength,
                 init_image, mask_image, sampler_name, command_str):
        self.ctx = ctx
        self.prompt = prompt
        self.height = height
        self.width = width
        self.guidance_scale = guidance_scale
        self.steps = steps
        self.seed = seed
        self.strength = strength
        self.init_image = init_image
        self.mask_image = mask_image
        self.sampler_name = sampler_name
        self.command_str = command_str


class StableCog(commands.Cog, name='Stable Diffusion', description='Create images from natural language.'):
    def __init__(self, bot):
        self.dream_thread = Thread()
        self.text2image_model = Text2Image(model_path=bot.args.model_path)
        self.event_loop = asyncio.get_event_loop()
        self.queue = []
        self.bot = bot


    @commands.slash_command(name='dream', description='Create an image.')
    @option(
        'prompt',
        str,
        description='A prompt to condition the model with.',
        required=True,
    )
    @option(
        'height',
        int,
        description='Height of the generated image.',
        required=False,
        choices=[x for x in range(192, 832, 64)]
    )
    @option(
        'width',
        int,
        description='Width of the generated image.',
        required=False,
        choices=[x for x in range(192, 832, 64)]
    )
    @option(
        'guidance_scale',
        float,
        description='Classifier-Free Guidance scale',
        required=False,
    )
    @option(
        'steps',
        int,
        description='The amount of steps to sample the model',
        required=False,
        choices=[x for x in range(5, 55, 5)]
    )
    @option(
        'sampler',
        str,
        description='The sampler to use for generation',
        required=False,
        choices=['ddim', 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms', 'plms'],
        default='ddim'
    )
    @option(
        'seed',
        int,
        description='The seed to use for reproduceability',
        required=False,
    )
    @option(
        'strength',
        float,
        description='The strength (0.0 to 1.0) used to apply the prompt to the init_image/mask_image'
    )
    @option(
        'init_image',
        discord.Attachment,
        description='The image to initialize the latents with for denoising',
        required=False,
    )
    @option(
        'mask_image',
        discord.Attachment,
        description='The mask image to use for inpainting',
        required=False,
    )
    async def dream_handler(self, ctx: discord.ApplicationContext, *, prompt: str, height: Optional[int] = 512,
                            width: Optional[int] = 512, guidance_scale: Optional[float] = 7.0,
                            steps: Optional[int] = 30,
                            sampler: Optional[str] = 'k_euler_a',
                            seed: Optional[int] = -1, strength: Optional[float] = None,
                            init_image: Optional[discord.Attachment] = None,
                            mask_image: Optional[discord.Attachment] = None):
        print(f'Request -- {ctx.author.name}#{ctx.author.discriminator} -- Prompt: {prompt}')

        if seed == -1:
            seed = random.randint(0, 0xFFFFFFFF)

        command_str = '/dream'
        command_str = command_str + f' prompt:{prompt} height:{str(height)} width:{width} guidance_scale:{guidance_scale} steps:{steps} sampler:{sampler} seed:{seed}'
        if init_image or mask_image:
            command_str = command_str + f' strength:{strength}'

        if self.dream_thread.is_alive():
            user_already_in_queue = False
            for queue_object in self.queue:
                if queue_object.ctx.author.id == ctx.author.id:
                    user_already_in_queue = True
                    break
            if user_already_in_queue:
                await ctx.send_response(
                    content=f'Please wait for your current image to finish generating before generating a new image',
                    ephemeral=True)
            else:
                self.queue.append(QueueObject(ctx, prompt, height, width, guidance_scale, steps, seed,
                                              strength,
                                              init_image, mask_image, sampler, command_str))
                await ctx.send_response(
                    content=f'Dreaming for <@{ctx.author.id}> - Queue Position: ``{len(self.queue)}`` - ``{command_str}``')
        else:
            await self.process_dream(QueueObject(ctx, prompt, height, width, guidance_scale, steps, seed,
                                                 strength,
                                                 init_image, mask_image, sampler, command_str))
            await ctx.send_response(
                content=f'Dreaming for <@{ctx.author.id}> - Queue Position: ``{len(self.queue)}`` - ``{command_str}``')

    async def process_dream(self, queue_object: QueueObject):
        self.dream_thread = Thread(target=self.dream,
                                   args=(self.event_loop, queue_object))
        self.dream_thread.start()

    def dream(self, event_loop: AbstractEventLoop, queue_object: QueueObject):
        try:
            start_time = time.time()
            if (queue_object.init_image is None) and (queue_object.mask_image is None):
                samples, seed = self.text2image_model.dream(queue_object.prompt, queue_object.steps, False, False, 0.0,
                                                            1, 1, queue_object.guidance_scale, queue_object.seed,
                                                            queue_object.height, queue_object.width, False,
                                                            queue_object.sampler_name)
            elif queue_object.init_image is not None:
                image = Image.open(requests.get(queue_object.init_image.url, stream=True).raw).convert('RGB')
                samples, seed = self.text2image_model.translation(queue_object.prompt, image, queue_object.steps, 0.0,
                                                                  0,
                                                                  0, queue_object.guidance_scale,
                                                                  queue_object.strength, queue_object.seed,
                                                                  queue_object.height, queue_object.width,
                                                                  queue_object.sampler_name)
            else:
                image = Image.open(requests.get(queue_object.init_image.url, stream=True).raw).convert('RGB')
                mask = Image.open(requests.get(queue_object.mask_image.url, stream=True).raw).convert('RGB')
                samples, seed = self.text2image_model.inpaint(queue_object.prompt, image, mask, queue_object.steps, 0.0,
                                                              1, 1, queue_object.guidance_scale,
                                                              denoising_strength=queue_object.strength,
                                                              seed=queue_object.seed, height=queue_object.height,
                                                              width=queue_object.width,
                                                              sampler_name=queue_object.sampler_name)
            end_time = time.time()

            with BytesIO() as buffer:
                samples[0].save(buffer, 'PNG')
                buffer.seek(0)
                embed = discord.Embed()
                embed.colour = embed_color
                embed.add_field(name='command', value=f'``{queue_object.command_str}``', inline=False)
                embed.add_field(name='compute used', value='``{0:.3f}`` seconds'.format(end_time - start_time),
                                inline=False)
                embed.add_field(name='delete', value='React with ❌ to delete your own generation')
                # fix errors if user doesn't have pfp
                if queue_object.ctx.author.avatar is None:
                    embed.set_footer(
                        text=f'{queue_object.ctx.author.name}#{queue_object.ctx.author.discriminator}')
                else:
                    embed.set_footer(
                        text=f'{queue_object.ctx.author.name}#{queue_object.ctx.author.discriminator}',
                        icon_url=queue_object.ctx.author.avatar.url)

                event_loop.create_task(
                    queue_object.ctx.channel.send(content=f'<@{queue_object.ctx.author.id}>', embed=embed,
                                                  file=discord.File(fp=buffer, filename=f'{seed}.png')))
        except Exception as e:
            embed = discord.Embed(title='txt2img failed', description=f'{e}\n{traceback.print_exc()}',
                                  color=embed_color)
            event_loop.create_task(queue_object.ctx.channel.send(embed=embed))
        if self.queue:
            event_loop.create_task(self.process_dream(self.queue.pop(0)))


def setup(bot):
    bot.add_cog(StableCog(bot))


================================================
FILE: src/core/logging.py
================================================
import logging

logging.basicConfig(level=logging.INFO,
                    format='[%(asctime)s] %(levelname)s: %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S')

def get_logger(name):
    return logging.getLogger(name)

================================================
FILE: src/scripts/win10patch.py
================================================
try:
    file_path = 'venv\\lib\\site-packages\\torch\\distributed\\elastic\\timer\\file_based_local_timer.py'
    with open(file_path, 'r+') as file:
        old = file.read()
        if 'SIGKILL' not in old:
            print(file_path + ' already patched!')
            exit(0)
        file.seek(0)
        file.write(old.replace('SIGKILL', 'SIGINT'))
    print('Patched ' + file_path)
except Exception as e:
    print('Patch failed! Please report this either on github or to salt#7234\nReason: ' + str(e))


================================================
FILE: src/stablediffusion/dream.py
================================================
import inspect
import warnings
from typing import List, Optional, Union

import torch

from tqdm.auto import tqdm
from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer

from diffusers import AutoencoderKL, UNet2DConditionModel, DiffusionPipeline, DDIMScheduler, LMSDiscreteScheduler, PNDMScheduler


from PIL import Image


class StableDiffusionPipeline(DiffusionPipeline):
    def __init__(
        self,
        vae: AutoencoderKL,
        text_encoder: CLIPTextModel,
        tokenizer: CLIPTokenizer,
        unet: UNet2DConditionModel,
        scheduler: Union[DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler]
    ):
        super().__init__()
        scheduler = scheduler.set_format("pt")
        self.register_modules(
            vae=vae,
            text_encoder=text_encoder,
            tokenizer=tokenizer,
            unet=unet,
            scheduler=scheduler,
        )

    @torch.no_grad()
    def __call__(
        self,
        prompt: Union[str, List[str]],
        height: Optional[int] = 512,
        width: Optional[int] = 512,
        num_inference_steps: Optional[int] = 50,
        guidance_scale: Optional[float] = 7.5,
        eta: Optional[float] = 0.0,
        generator: Optional[torch.Generator] = None,
        output_type: Optional[str] = "pil",
        progress: Optional[bool] = False,
        **kwargs,
    ):
        if "torch_device" in kwargs:
            device = kwargs.pop("torch_device")
            warnings.warn(
                "`torch_device` is deprecated as an input argument to `__call__` and will be removed in v0.3.0."
                " Consider using `pipe.to(torch_device)` instead."
            )

            # Set device as before (to be removed in 0.3.0)
            if device is None:
                device = "cuda" if torch.cuda.is_available() else "cpu"
            self.to(device)

        if isinstance(prompt, str):
            batch_size = 1
        elif isinstance(prompt, list):
            batch_size = len(prompt)
        else:
            raise ValueError(f"`prompt` has to be of type `str` or `list` but is {type(prompt)}")

        if height % 8 != 0 or width % 8 != 0:
            raise ValueError(f"`height` and `width` have to be divisible by 8 but are {height} and {width}.")

        # get prompt text embeddings
        text_input = self.tokenizer(
            prompt,
            padding="max_length",
            max_length=self.tokenizer.model_max_length,
            truncation=True,
            return_tensors="pt",
        )
        text_embeddings = self.text_encoder(text_input.input_ids.to(self.device))[0]

        # here `guidance_scale` is defined analog to the guidance weight `w` of equation (2)
        # of the Imagen paper: https://arxiv.org/pdf/2205.11487.pdf . `guidance_scale = 1`
        # corresponds to doing no classifier free guidance.
        do_classifier_free_guidance = guidance_scale > 1.0
        # get unconditional embeddings for classifier free guidance
        if do_classifier_free_guidance:
            max_length = text_input.input_ids.shape[-1]
            uncond_input = self.tokenizer(
                [""] * batch_size, padding="max_length", max_length=max_length, return_tensors="pt"
            )
            uncond_embeddings = self.text_encoder(uncond_input.input_ids.to(self.device))[0]

            # For classifier free guidance, we need to do two forward passes.
            # Here we concatenate the unconditional and text embeddings into a single batch
            # to avoid doing two forward passes
            text_embeddings = torch.cat([uncond_embeddings, text_embeddings])

        # get the intial random noise
        latents = torch.randn(
            (batch_size, self.unet.in_channels, height // 8, width // 8),
            generator=generator,
            device=self.device,
        )

        # set timesteps
        accepts_offset = "offset" in set(inspect.signature(self.scheduler.set_timesteps).parameters.keys())
        extra_set_kwargs = {}
        if accepts_offset:
            extra_set_kwargs["offset"] = 1

        self.scheduler.set_timesteps(num_inference_steps, **extra_set_kwargs)

        # if we use LMSDiscreteScheduler, let's make sure latents are mulitplied by sigmas
        if isinstance(self.scheduler, LMSDiscreteScheduler):
            latents = latents * self.scheduler.sigmas[0]

        # prepare extra kwargs for the scheduler step, since not all schedulers have the same signature
        # eta (η) is only used with the DDIMScheduler, it will be ignored for other schedulers.
        # eta corresponds to η in DDIM paper: https://arxiv.org/abs/2010.02502
        # and should be between [0, 1]
        accepts_eta = "eta" in set(inspect.signature(self.scheduler.step).parameters.keys())
        extra_step_kwargs = {}
        if accepts_eta:
            extra_step_kwargs["eta"] = eta
        
        images = []

        for i, t in tqdm(enumerate(self.scheduler.timesteps)):
            # expand the latents if we are doing classifier free guidance
            latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents
            if isinstance(self.scheduler, LMSDiscreteScheduler):
                sigma = self.scheduler.sigmas[i]
                latent_model_input = latent_model_input / ((sigma**2 + 1) ** 0.5)

            # predict the noise residual
            noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings)["sample"]

            # perform guidance
            if do_classifier_free_guidance:
                noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
                noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)

            # compute the previous noisy sample x_t -> x_t-1
            if isinstance(self.scheduler, LMSDiscreteScheduler):
                latents = self.scheduler.step(noise_pred, i, latents, **extra_step_kwargs)["prev_sample"]
            else:
                latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs)["prev_sample"]
            
            if progress:
                latent_image = self.vae.decode(1 / 0.18215 * latents)
                latent_image = (latent_image / 2 + 0.5).clamp(0, 1)
                latent_image = latent_image.cpu().permute(0, 2, 3, 1).numpy()

                if latent_image.ndim == 3:
                    latent_image = latent_image[None, ...]
                latent_image = (latent_image * 255).round().astype('uint8')
                latent_image = [Image.fromarray(image) for image in latent_image]
                images.append(latent_image[0])


        if progress:
            images[0].save(f'output.gif', save_all=True, append_images=images[1:], optimize=False, loop=0, duration=125)

        # scale and decode the image latents with vae
        latents = 1 / 0.18215 * latents
        image = self.vae.decode(latents)

        image = (image / 2 + 0.5).clamp(0, 1)
        image = image.cpu().permute(0, 2, 3, 1).numpy()

        if output_type == "pil":
            image = self.numpy_to_pil(image)

        return {"sample": image}


================================================
FILE: src/stablediffusion/inpaint.py
================================================
import inspect
from typing import List, Optional, Union

import numpy as np
import torch

import PIL
from diffusers import AutoencoderKL, DDIMScheduler, DiffusionPipeline, PNDMScheduler, UNet2DConditionModel
from tqdm.auto import tqdm
from transformers import CLIPTextModel, CLIPTokenizer


def preprocess(image):
    w, h = image.size
    w, h = map(lambda x: x - x % 32, (w, h))  # resize to integer multiple of 32
    image = image.resize((w, h), resample=PIL.Image.LANCZOS)
    image = np.array(image).astype(np.float32) / 255.0
    image = image[None].transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return 2.0 * image - 1.0

def preprocess_mask(mask):
    mask=mask.convert("L")
    w, h = mask.size
    mask = mask.resize((int(w / 8), int(h / 8)), resample=PIL.Image.LANCZOS)
    mask = np.array(mask).astype(np.float32) / 255.0
    mask = np.tile(mask,(4,1,1))
    mask = mask[None].transpose(0, 1, 2, 3)#what does this step do?
    mask = torch.from_numpy(mask).bool()
    return (mask).long()

class StableDiffusionInpaintingPipeline(DiffusionPipeline):
    def __init__(
        self,
        vae: AutoencoderKL,
        text_encoder: CLIPTextModel,
        tokenizer: CLIPTokenizer,
        unet: UNet2DConditionModel,
        scheduler: Union[DDIMScheduler, PNDMScheduler],
    ):
        super().__init__()
        scheduler = scheduler.set_format("pt")
        self.register_modules(
            vae=vae,
            text_encoder=text_encoder,
            tokenizer=tokenizer,
            unet=unet,
            scheduler=scheduler,
        )

    @torch.no_grad()
    def __call__(
        self,
        prompt: Union[str, List[str]],
        init_image: torch.FloatTensor,
        mask_image: torch.FloatTensor,
        strength: float = 0.8,
        num_inference_steps: Optional[int] = 50,
        guidance_scale: Optional[float] = 7.5,
        eta: Optional[float] = 0.0,
        generator: Optional[torch.Generator] = None,
        output_type: Optional[str] = "pil",
    ):

        if isinstance(prompt, str):
            batch_size = 1
        elif isinstance(prompt, list):
            batch_size = len(prompt)
        else:
            raise ValueError(f"`prompt` has to be of type `str` or `list` but is {type(prompt)}")

        if strength < 0 or strength > 1:
            raise ValueError(f"The value of strength should in [0.0, 1.0] but is {strength}")

        # set timesteps
        accepts_offset = "offset" in set(inspect.signature(self.scheduler.set_timesteps).parameters.keys())
        extra_set_kwargs = {}
        offset = 0
        if accepts_offset:
            offset = 1
            extra_set_kwargs["offset"] = 1

        self.scheduler.set_timesteps(num_inference_steps, **extra_set_kwargs)

        # encode the init image into latents and scale the latents
        init_latents = self.vae.encode(init_image.to(self.device)).sample()
        init_latents = 0.18215 * init_latents
        init_latents_orig = init_latents

        # prepare init_latents noise to latents
        init_latents = torch.cat([init_latents] * batch_size)

        # preprocess mask
        mask = preprocess_mask(mask_image).to(self.device)
        mask = torch.cat([mask] * batch_size)

        # get the original timestep using init_timestep
        init_timestep = int(num_inference_steps * strength) + offset
        init_timestep = min(init_timestep, num_inference_steps)
        timesteps = self.scheduler.timesteps[-init_timestep]
        timesteps = torch.tensor([timesteps] * batch_size, dtype=torch.long, device=self.device)

        # add noise to latents using the timesteps
        noise = torch.randn(init_latents.shape, generator=generator, device=self.device)
        init_latents = self.scheduler.add_noise(init_latents, noise, timesteps)

        # get prompt text embeddings
        text_input = self.tokenizer(
            prompt,
            padding="max_length",
            max_length=self.tokenizer.model_max_length,
            truncation=True,
            return_tensors="pt",
        )
        text_embeddings = self.text_encoder(text_input.input_ids.to(self.device))[0]

        # here `guidance_scale` is defined analog to the guidance weight `w` of equation (2)
        # of the Imagen paper: https://arxiv.org/pdf/2205.11487.pdf . `guidance_scale = 1`
        # corresponds to doing no classifier free guidance.
        do_classifier_free_guidance = guidance_scale > 1.0
        # get unconditional embeddings for classifier free guidance
        if do_classifier_free_guidance:
            max_length = text_input.input_ids.shape[-1]
            uncond_input = self.tokenizer(
                [""] * batch_size, padding="max_length", max_length=max_length, return_tensors="pt"
            )
            uncond_embeddings = self.text_encoder(uncond_input.input_ids.to(self.device))[0]

            # For classifier free guidance, we need to do two forward passes.
            # Here we concatenate the unconditional and text embeddings into a single batch
            # to avoid doing two forward passes
            text_embeddings = torch.cat([uncond_embeddings, text_embeddings])

        # prepare extra kwargs for the scheduler step, since not all schedulers have the same signature
        # eta (η) is only used with the DDIMScheduler, it will be ignored for other schedulers.
        # eta corresponds to η in DDIM paper: https://arxiv.org/abs/2010.02502
        # and should be between [0, 1]
        accepts_eta = "eta" in set(inspect.signature(self.scheduler.step).parameters.keys())
        extra_step_kwargs = {}
        if accepts_eta:
            extra_step_kwargs["eta"] = eta

        latents = init_latents
        t_start = max(num_inference_steps - init_timestep + offset, 0)
        for i, t in tqdm(enumerate(self.scheduler.timesteps[t_start:])):
            # expand the latents if we are doing classifier free guidance
            latent_model_input = torch.cat([latents] * 2) if do_classifier_free_guidance else latents

            # predict the noise residual
            noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings)["sample"]

            # perform guidance
            if do_classifier_free_guidance:
                noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
                noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)

            # compute the previous noisy sample x_t -> x_t-1
            latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs)["prev_sample"]

            #masking
            init_latents_proper = self.scheduler.add_noise(init_latents_orig, noise, t)
            latents = ( init_latents_proper * mask ) + ( latents * (1-mask) )

        # scale and decode the image latents with vae
        latents = 1 / 0.18215 * latents
        image = self.vae.decode(latents)

        image = (image / 2 + 0.5).clamp(0, 1)
        image = image.cpu().permute(0, 2, 3, 1).numpy()

        if output_type == "pil":
            image = self.numpy_to_pil(image)

        return {"sample": image, "nsfw_content_detected": False}

================================================
FILE: src/stablediffusion/ldm/__init__.py
================================================
from .generate import Generate

================================================
FILE: src/stablediffusion/ldm/data/__init__.py
================================================


================================================
FILE: src/stablediffusion/ldm/data/base.py
================================================
from abc import abstractmethod
from torch.utils.data import (
    Dataset,
    ConcatDataset,
    ChainDataset,
    IterableDataset,
)


class Txt2ImgIterableBaseDataset(IterableDataset):
    """
    Define an interface to make the IterableDatasets for text2img data chainable
    """

    def __init__(self, num_records=0, valid_ids=None, size=256):
        super().__init__()
        self.num_records = num_records
        self.valid_ids = valid_ids
        self.sample_ids = valid_ids
        self.size = size

        print(
            f'{self.__class__.__name__} dataset contains {self.__len__()} examples.'
        )

    def __len__(self):
        return self.num_records

    @abstractmethod
    def __iter__(self):
        pass


================================================
FILE: src/stablediffusion/ldm/data/imagenet.py
================================================
import os, yaml, pickle, shutil, tarfile, glob
import cv2
import albumentations
import PIL
import numpy as np
import torchvision.transforms.functional as TF
from omegaconf import OmegaConf
from functools import partial
from PIL import Image
from tqdm import tqdm
from torch.utils.data import Dataset, Subset

import taming.data.utils as tdu
from taming.data.imagenet import (
    str_to_indices,
    give_synsets_from_indices,
    download,
    retrieve,
)
from taming.data.imagenet import ImagePaths

from ldm.modules.image_degradation import (
    degradation_fn_bsr,
    degradation_fn_bsr_light,
)


def synset2idx(path_to_yaml='data/index_synset.yaml'):
    with open(path_to_yaml) as f:
        di2s = yaml.load(f)
    return dict((v, k) for k, v in di2s.items())


class ImageNetBase(Dataset):
    def __init__(self, config=None):
        self.config = config or OmegaConf.create()
        if not type(self.config) == dict:
            self.config = OmegaConf.to_container(self.config)
        self.keep_orig_class_label = self.config.get(
            'keep_orig_class_label', False
        )
        self.process_images = True  # if False we skip loading & processing images and self.data contains filepaths
        self._prepare()
        self._prepare_synset_to_human()
        self._prepare_idx_to_synset()
        self._prepare_human_to_integer_label()
        self._load()

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        return self.data[i]

    def _prepare(self):
        raise NotImplementedError()

    def _filter_relpaths(self, relpaths):
        ignore = set(
            [
                'n06596364_9591.JPEG',
            ]
        )
        relpaths = [
            rpath for rpath in relpaths if not rpath.split('/')[-1] in ignore
        ]
        if 'sub_indices' in self.config:
            indices = str_to_indices(self.config['sub_indices'])
            synsets = give_synsets_from_indices(
                indices, path_to_yaml=self.idx2syn
            )  # returns a list of strings
            self.synset2idx = synset2idx(path_to_yaml=self.idx2syn)
            files = []
            for rpath in relpaths:
                syn = rpath.split('/')[0]
                if syn in synsets:
                    files.append(rpath)
            return files
        else:
            return relpaths

    def _prepare_synset_to_human(self):
        SIZE = 2655750
        URL = 'https://heibox.uni-heidelberg.de/f/9f28e956cd304264bb82/?dl=1'
        self.human_dict = os.path.join(self.root, 'synset_human.txt')
        if (
            not os.path.exists(self.human_dict)
            or not os.path.getsize(self.human_dict) == SIZE
        ):
            download(URL, self.human_dict)

    def _prepare_idx_to_synset(self):
        URL = 'https://heibox.uni-heidelberg.de/f/d835d5b6ceda4d3aa910/?dl=1'
        self.idx2syn = os.path.join(self.root, 'index_synset.yaml')
        if not os.path.exists(self.idx2syn):
            download(URL, self.idx2syn)

    def _prepare_human_to_integer_label(self):
        URL = 'https://heibox.uni-heidelberg.de/f/2362b797d5be43b883f6/?dl=1'
        self.human2integer = os.path.join(
            self.root, 'imagenet1000_clsidx_to_labels.txt'
        )
        if not os.path.exists(self.human2integer):
            download(URL, self.human2integer)
        with open(self.human2integer, 'r') as f:
            lines = f.read().splitlines()
            assert len(lines) == 1000
            self.human2integer_dict = dict()
            for line in lines:
                value, key = line.split(':')
                self.human2integer_dict[key] = int(value)

    def _load(self):
        with open(self.txt_filelist, 'r') as f:
            self.relpaths = f.read().splitlines()
            l1 = len(self.relpaths)
            self.relpaths = self._filter_relpaths(self.relpaths)
            print(
                'Removed {} files from filelist during filtering.'.format(
                    l1 - len(self.relpaths)
                )
            )

        self.synsets = [p.split('/')[0] for p in self.relpaths]
        self.abspaths = [os.path.join(self.datadir, p) for p in self.relpaths]

        unique_synsets = np.unique(self.synsets)
        class_dict = dict(
            (synset, i) for i, synset in enumerate(unique_synsets)
        )
        if not self.keep_orig_class_label:
            self.class_labels = [class_dict[s] for s in self.synsets]
        else:
            self.class_labels = [self.synset2idx[s] for s in self.synsets]

        with open(self.human_dict, 'r') as f:
            human_dict = f.read().splitlines()
            human_dict = dict(line.split(maxsplit=1) for line in human_dict)

        self.human_labels = [human_dict[s] for s in self.synsets]

        labels = {
            'relpath': np.array(self.relpaths),
            'synsets': np.array(self.synsets),
            'class_label': np.array(self.class_labels),
            'human_label': np.array(self.human_labels),
        }

        if self.process_images:
            self.size = retrieve(self.config, 'size', default=256)
            self.data = ImagePaths(
                self.abspaths,
                labels=labels,
                size=self.size,
                random_crop=self.random_crop,
            )
        else:
            self.data = self.abspaths


class ImageNetTrain(ImageNetBase):
    NAME = 'ILSVRC2012_train'
    URL = 'http://www.image-net.org/challenges/LSVRC/2012/'
    AT_HASH = 'a306397ccf9c2ead27155983c254227c0fd938e2'
    FILES = [
        'ILSVRC2012_img_train.tar',
    ]
    SIZES = [
        147897477120,
    ]

    def __init__(self, process_images=True, data_root=None, **kwargs):
        self.process_images = process_images
        self.data_root = data_root
        super().__init__(**kwargs)

    def _prepare(self):
        if self.data_root:
            self.root = os.path.join(self.data_root, self.NAME)
        else:
            cachedir = os.environ.get(
                'XDG_CACHE_HOME', os.path.expanduser('~/.cache')
            )
            self.root = os.path.join(cachedir, 'autoencoders/data', self.NAME)

        self.datadir = os.path.join(self.root, 'data')
        self.txt_filelist = os.path.join(self.root, 'filelist.txt')
        self.expected_length = 1281167
        self.random_crop = retrieve(
            self.config, 'ImageNetTrain/random_crop', default=True
        )
        if not tdu.is_prepared(self.root):
            # prep
            print('Preparing dataset {} in {}'.format(self.NAME, self.root))

            datadir = self.datadir
            if not os.path.exists(datadir):
                path = os.path.join(self.root, self.FILES[0])
                if (
                    not os.path.exists(path)
                    or not os.path.getsize(path) == self.SIZES[0]
                ):
                    import academictorrents as at

                    atpath = at.get(self.AT_HASH, datastore=self.root)
                    assert atpath == path

                print('Extracting {} to {}'.format(path, datadir))
                os.makedirs(datadir, exist_ok=True)
                with tarfile.open(path, 'r:') as tar:
                    tar.extractall(path=datadir)

                print('Extracting sub-tars.')
                subpaths = sorted(glob.glob(os.path.join(datadir, '*.tar')))
                for subpath in tqdm(subpaths):
                    subdir = subpath[: -len('.tar')]
                    os.makedirs(subdir, exist_ok=True)
                    with tarfile.open(subpath, 'r:') as tar:
                        tar.extractall(path=subdir)

            filelist = glob.glob(os.path.join(datadir, '**', '*.JPEG'))
            filelist = [os.path.relpath(p, start=datadir) for p in filelist]
            filelist = sorted(filelist)
            filelist = '\n'.join(filelist) + '\n'
            with open(self.txt_filelist, 'w') as f:
                f.write(filelist)

            tdu.mark_prepared(self.root)


class ImageNetValidation(ImageNetBase):
    NAME = 'ILSVRC2012_validation'
    URL = 'http://www.image-net.org/challenges/LSVRC/2012/'
    AT_HASH = '5d6d0df7ed81efd49ca99ea4737e0ae5e3a5f2e5'
    VS_URL = 'https://heibox.uni-heidelberg.de/f/3e0f6e9c624e45f2bd73/?dl=1'
    FILES = [
        'ILSVRC2012_img_val.tar',
        'validation_synset.txt',
    ]
    SIZES = [
        6744924160,
        1950000,
    ]

    def __init__(self, process_images=True, data_root=None, **kwargs):
        self.data_root = data_root
        self.process_images = process_images
        super().__init__(**kwargs)

    def _prepare(self):
        if self.data_root:
            self.root = os.path.join(self.data_root, self.NAME)
        else:
            cachedir = os.environ.get(
                'XDG_CACHE_HOME', os.path.expanduser('~/.cache')
            )
            self.root = os.path.join(cachedir, 'autoencoders/data', self.NAME)
        self.datadir = os.path.join(self.root, 'data')
        self.txt_filelist = os.path.join(self.root, 'filelist.txt')
        self.expected_length = 50000
        self.random_crop = retrieve(
            self.config, 'ImageNetValidation/random_crop', default=False
        )
        if not tdu.is_prepared(self.root):
            # prep
            print('Preparing dataset {} in {}'.format(self.NAME, self.root))

            datadir = self.datadir
            if not os.path.exists(datadir):
                path = os.path.join(self.root, self.FILES[0])
                if (
                    not os.path.exists(path)
                    or not os.path.getsize(path) == self.SIZES[0]
                ):
                    import academictorrents as at

                    atpath = at.get(self.AT_HASH, datastore=self.root)
                    assert atpath == path

                print('Extracting {} to {}'.format(path, datadir))
                os.makedirs(datadir, exist_ok=True)
                with tarfile.open(path, 'r:') as tar:
                    tar.extractall(path=datadir)

                vspath = os.path.join(self.root, self.FILES[1])
                if (
                    not os.path.exists(vspath)
                    or not os.path.getsize(vspath) == self.SIZES[1]
                ):
                    download(self.VS_URL, vspath)

                with open(vspath, 'r') as f:
                    synset_dict = f.read().splitlines()
                    synset_dict = dict(line.split() for line in synset_dict)

                print('Reorganizing into synset folders')
                synsets = np.unique(list(synset_dict.values()))
                for s in synsets:
                    os.makedirs(os.path.join(datadir, s), exist_ok=True)
                for k, v in synset_dict.items():
                    src = os.path.join(datadir, k)
                    dst = os.path.join(datadir, v)
                    shutil.move(src, dst)

            filelist = glob.glob(os.path.join(datadir, '**', '*.JPEG'))
            filelist = [os.path.relpath(p, start=datadir) for p in filelist]
            filelist = sorted(filelist)
            filelist = '\n'.join(filelist) + '\n'
            with open(self.txt_filelist, 'w') as f:
                f.write(filelist)

            tdu.mark_prepared(self.root)


class ImageNetSR(Dataset):
    def __init__(
        self,
        size=None,
        degradation=None,
        downscale_f=4,
        min_crop_f=0.5,
        max_crop_f=1.0,
        random_crop=True,
    ):
        """
        Imagenet Superresolution Dataloader
        Performs following ops in order:
        1.  crops a crop of size s from image either as random or center crop
        2.  resizes crop to size with cv2.area_interpolation
        3.  degrades resized crop with degradation_fn

        :param size: resizing to size after cropping
        :param degradation: degradation_fn, e.g. cv_bicubic or bsrgan_light
        :param downscale_f: Low Resolution Downsample factor
        :param min_crop_f: determines crop size s,
          where s = c * min_img_side_len with c sampled from interval (min_crop_f, max_crop_f)
        :param max_crop_f: ""
        :param data_root:
        :param random_crop:
        """
        self.base = self.get_base()
        assert size
        assert (size / downscale_f).is_integer()
        self.size = size
        self.LR_size = int(size / downscale_f)
        self.min_crop_f = min_crop_f
        self.max_crop_f = max_crop_f
        assert max_crop_f <= 1.0
        self.center_crop = not random_crop

        self.image_rescaler = albumentations.SmallestMaxSize(
            max_size=size, interpolation=cv2.INTER_AREA
        )

        self.pil_interpolation = (
            False  # gets reset later if incase interp_op is from pillow
        )

        if degradation == 'bsrgan':
            self.degradation_process = partial(
                degradation_fn_bsr, sf=downscale_f
            )

        elif degradation == 'bsrgan_light':
            self.degradation_process = partial(
                degradation_fn_bsr_light, sf=downscale_f
            )

        else:
            interpolation_fn = {
                'cv_nearest': cv2.INTER_NEAREST,
                'cv_bilinear': cv2.INTER_LINEAR,
                'cv_bicubic': cv2.INTER_CUBIC,
                'cv_area': cv2.INTER_AREA,
                'cv_lanczos': cv2.INTER_LANCZOS4,
                'pil_nearest': PIL.Image.NEAREST,
                'pil_bilinear': PIL.Image.BILINEAR,
                'pil_bicubic': PIL.Image.BICUBIC,
                'pil_box': PIL.Image.BOX,
                'pil_hamming': PIL.Image.HAMMING,
                'pil_lanczos': PIL.Image.LANCZOS,
            }[degradation]

            self.pil_interpolation = degradation.startswith('pil_')

            if self.pil_interpolation:
                self.degradation_process = partial(
                    TF.resize,
                    size=self.LR_size,
                    interpolation=interpolation_fn,
                )

            else:
                self.degradation_process = albumentations.SmallestMaxSize(
                    max_size=self.LR_size, interpolation=interpolation_fn
                )

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        example = self.base[i]
        image = Image.open(example['file_path_'])

        if not image.mode == 'RGB':
            image = image.convert('RGB')

        image = np.array(image).astype(np.uint8)

        min_side_len = min(image.shape[:2])
        crop_side_len = min_side_len * np.random.uniform(
            self.min_crop_f, self.max_crop_f, size=None
        )
        crop_side_len = int(crop_side_len)

        if self.center_crop:
            self.cropper = albumentations.CenterCrop(
                height=crop_side_len, width=crop_side_len
            )

        else:
            self.cropper = albumentations.RandomCrop(
                height=crop_side_len, width=crop_side_len
            )

        image = self.cropper(image=image)['image']
        image = self.image_rescaler(image=image)['image']

        if self.pil_interpolation:
            image_pil = PIL.Image.fromarray(image)
            LR_image = self.degradation_process(image_pil)
            LR_image = np.array(LR_image).astype(np.uint8)

        else:
            LR_image = self.degradation_process(image=image)['image']

        example['image'] = (image / 127.5 - 1.0).astype(np.float32)
        example['LR_image'] = (LR_image / 127.5 - 1.0).astype(np.float32)

        return example


class ImageNetSRTrain(ImageNetSR):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def get_base(self):
        with open('data/imagenet_train_hr_indices.p', 'rb') as f:
            indices = pickle.load(f)
        dset = ImageNetTrain(
            process_images=False,
        )
        return Subset(dset, indices)


class ImageNetSRValidation(ImageNetSR):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def get_base(self):
        with open('data/imagenet_val_hr_indices.p', 'rb') as f:
            indices = pickle.load(f)
        dset = ImageNetValidation(
            process_images=False,
        )
        return Subset(dset, indices)


================================================
FILE: src/stablediffusion/ldm/data/lsun.py
================================================
import os
import numpy as np
import PIL
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class LSUNBase(Dataset):
    def __init__(
        self,
        txt_file,
        data_root,
        size=None,
        interpolation='bicubic',
        flip_p=0.5,
    ):
        self.data_paths = txt_file
        self.data_root = data_root
        with open(self.data_paths, 'r') as f:
            self.image_paths = f.read().splitlines()
        self._length = len(self.image_paths)
        self.labels = {
            'relative_file_path_': [l for l in self.image_paths],
            'file_path_': [
                os.path.join(self.data_root, l) for l in self.image_paths
            ],
        }

        self.size = size
        self.interpolation = {
            'linear': PIL.Image.LINEAR,
            'bilinear': PIL.Image.BILINEAR,
            'bicubic': PIL.Image.BICUBIC,
            'lanczos': PIL.Image.LANCZOS,
        }[interpolation]
        self.flip = transforms.RandomHorizontalFlip(p=flip_p)

    def __len__(self):
        return self._length

    def __getitem__(self, i):
        example = dict((k, self.labels[k][i]) for k in self.labels)
        image = Image.open(example['file_path_'])
        if not image.mode == 'RGB':
            image = image.convert('RGB')

        # default to score-sde preprocessing
        img = np.array(image).astype(np.uint8)
        crop = min(img.shape[0], img.shape[1])
        h, w, = (
            img.shape[0],
            img.shape[1],
        )
        img = img[
            (h - crop) // 2 : (h + crop) // 2,
            (w - crop) // 2 : (w + crop) // 2,
        ]

        image = Image.fromarray(img)
        if self.size is not None:
            image = image.resize(
                (self.size, self.size), resample=self.interpolation
            )

        image = self.flip(image)
        image = np.array(image).astype(np.uint8)
        example['image'] = (image / 127.5 - 1.0).astype(np.float32)
        return example


class LSUNChurchesTrain(LSUNBase):
    def __init__(self, **kwargs):
        super().__init__(
            txt_file='data/lsun/church_outdoor_train.txt',
            data_root='data/lsun/churches',
            **kwargs
        )


class LSUNChurchesValidation(LSUNBase):
    def __init__(self, flip_p=0.0, **kwargs):
        super().__init__(
            txt_file='data/lsun/church_outdoor_val.txt',
            data_root='data/lsun/churches',
            flip_p=flip_p,
            **kwargs
        )


class LSUNBedroomsTrain(LSUNBase):
    def __init__(self, **kwargs):
        super().__init__(
            txt_file='data/lsun/bedrooms_train.txt',
            data_root='data/lsun/bedrooms',
            **kwargs
        )


class LSUNBedroomsValidation(LSUNBase):
    def __init__(self, flip_p=0.0, **kwargs):
        super().__init__(
            txt_file='data/lsun/bedrooms_val.txt',
            data_root='data/lsun/bedrooms',
            flip_p=flip_p,
            **kwargs
        )


class LSUNCatsTrain(LSUNBase):
    def __init__(self, **kwargs):
        super().__init__(
            txt_file='data/lsun/cat_train.txt',
            data_root='data/lsun/cats',
            **kwargs
        )


class LSUNCatsValidation(LSUNBase):
    def __init__(self, flip_p=0.0, **kwargs):
        super().__init__(
            txt_file='data/lsun/cat_val.txt',
            data_root='data/lsun/cats',
            flip_p=flip_p,
            **kwargs
        )


================================================
FILE: src/stablediffusion/ldm/data/personalized.py
================================================
import os
import numpy as np
import PIL
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

import random

imagenet_templates_smallest = [
    'a photo of a {}',
]

imagenet_templates_small = [
    'a photo of a {}',
    'a rendering of a {}',
    'a cropped photo of the {}',
    'the photo of a {}',
    'a photo of a clean {}',
    'a photo of a dirty {}',
    'a dark photo of the {}',
    'a photo of my {}',
    'a photo of the cool {}',
    'a close-up photo of a {}',
    'a bright photo of the {}',
    'a cropped photo of a {}',
    'a photo of the {}',
    'a good photo of the {}',
    'a photo of one {}',
    'a close-up photo of the {}',
    'a rendition of the {}',
    'a photo of the clean {}',
    'a rendition of a {}',
    'a photo of a nice {}',
    'a good photo of a {}',
    'a photo of the nice {}',
    'a photo of the small {}',
    'a photo of the weird {}',
    'a photo of the large {}',
    'a photo of a cool {}',
    'a photo of a small {}',
]

imagenet_dual_templates_small = [
    'a photo of a {} with {}',
    'a rendering of a {} with {}',
    'a cropped photo of the {} with {}',
    'the photo of a {} with {}',
    'a photo of a clean {} with {}',
    'a photo of a dirty {} with {}',
    'a dark photo of the {} with {}',
    'a photo of my {} with {}',
    'a photo of the cool {} with {}',
    'a close-up photo of a {} with {}',
    'a bright photo of the {} with {}',
    'a cropped photo of a {} with {}',
    'a photo of the {} with {}',
    'a good photo of the {} with {}',
    'a photo of one {} with {}',
    'a close-up photo of the {} with {}',
    'a rendition of the {} with {}',
    'a photo of the clean {} with {}',
    'a rendition of a {} with {}',
    'a photo of a nice {} with {}',
    'a good photo of a {} with {}',
    'a photo of the nice {} with {}',
    'a photo of the small {} with {}',
    'a photo of the weird {} with {}',
    'a photo of the large {} with {}',
    'a photo of a cool {} with {}',
    'a photo of a small {} with {}',
]

per_img_token_list = [
    'א',
    'ב',
    'ג',
    'ד',
    'ה',
    'ו',
    'ז',
    'ח',
    'ט',
    'י',
    'כ',
    'ל',
    'מ',
    'נ',
    'ס',
    'ע',
    'פ',
    'צ',
    'ק',
    'ר',
    'ש',
    'ת',
]


class PersonalizedBase(Dataset):
    def __init__(
        self,
        data_root,
        size=None,
        repeats=100,
        interpolation='bicubic',
        flip_p=0.5,
        set='train',
        placeholder_token='*',
        per_image_tokens=False,
        center_crop=False,
        mixing_prob=0.25,
        coarse_class_text=None,
    ):

        self.data_root = data_root

        self.image_paths = [
            os.path.join(self.data_root, file_path)
            for file_path in os.listdir(self.data_root)
        ]

        # self._length = len(self.image_paths)
        self.num_images = len(self.image_paths)
        self._length = self.num_images

        self.placeholder_token = placeholder_token

        self.per_image_tokens = per_image_tokens
        self.center_crop = center_crop
        self.mixing_prob = mixing_prob

        self.coarse_class_text = coarse_class_text

        if per_image_tokens:
            assert self.num_images < len(
                per_img_token_list
            ), f"Can't use per-image tokens when the training set contains more than {len(per_img_token_list)} tokens. To enable larger sets, add more tokens to 'per_img_token_list'."

        if set == 'train':
            self._length = self.num_images * repeats

        self.size = size
        self.interpolation = {
            'linear': PIL.Image.LINEAR,
            'bilinear': PIL.Image.BILINEAR,
            'bicubic': PIL.Image.BICUBIC,
            'lanczos': PIL.Image.LANCZOS,
        }[interpolation]
        self.flip = transforms.RandomHorizontalFlip(p=flip_p)

    def __len__(self):
        return self._length

    def __getitem__(self, i):
        example = {}
        image = Image.open(self.image_paths[i % self.num_images])

        if not image.mode == 'RGB':
            image = image.convert('RGB')

        placeholder_string = self.placeholder_token
        if self.coarse_class_text:
            placeholder_string = (
                f'{self.coarse_class_text} {placeholder_string}'
            )

        if self.per_image_tokens and np.random.uniform() < self.mixing_prob:
            text = random.choice(imagenet_dual_templates_small).format(
                placeholder_string, per_img_token_list[i % self.num_images]
            )
        else:
            text = random.choice(imagenet_templates_small).format(
                placeholder_string
            )

        example['caption'] = text

        # default to score-sde preprocessing
        img = np.array(image).astype(np.uint8)

        if self.center_crop:
            crop = min(img.shape[0], img.shape[1])
            h, w, = (
                img.shape[0],
                img.shape[1],
            )
            img = img[
                (h - crop) // 2 : (h + crop) // 2,
                (w - crop) // 2 : (w + crop) // 2,
            ]

        image = Image.fromarray(img)
        if self.size is not None:
            image = image.resize(
                (self.size, self.size), resample=self.interpolation
            )

        image = self.flip(image)
        image = np.array(image).astype(np.uint8)
        example['image'] = (image / 127.5 - 1.0).astype(np.float32)
        return example


================================================
FILE: src/stablediffusion/ldm/data/personalized_style.py
================================================
import os
import numpy as np
import PIL
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

import random

imagenet_templates_small = [
    'a painting in the style of {}',
    'a rendering in the style of {}',
    'a cropped painting in the style of {}',
    'the painting in the style of {}',
    'a clean painting in the style of {}',
    'a dirty painting in the style of {}',
    'a dark painting in the style of {}',
    'a picture in the style of {}',
    'a cool painting in the style of {}',
    'a close-up painting in the style of {}',
    'a bright painting in the style of {}',
    'a cropped painting in the style of {}',
    'a good painting in the style of {}',
    'a close-up painting in the style of {}',
    'a rendition in the style of {}',
    'a nice painting in the style of {}',
    'a small painting in the style of {}',
    'a weird painting in the style of {}',
    'a large painting in the style of {}',
]

imagenet_dual_templates_small = [
    'a painting in the style of {} with {}',
    'a rendering in the style of {} with {}',
    'a cropped painting in the style of {} with {}',
    'the painting in the style of {} with {}',
    'a clean painting in the style of {} with {}',
    'a dirty painting in the style of {} with {}',
    'a dark painting in the style of {} with {}',
    'a cool painting in the style of {} with {}',
    'a close-up painting in the style of {} with {}',
    'a bright painting in the style of {} with {}',
    'a cropped painting in the style of {} with {}',
    'a good painting in the style of {} with {}',
    'a painting of one {} in the style of {}',
    'a nice painting in the style of {} with {}',
    'a small painting in the style of {} with {}',
    'a weird painting in the style of {} with {}',
    'a large painting in the style of {} with {}',
]

per_img_token_list = [
    'א',
    'ב',
    'ג',
    'ד',
    'ה',
    'ו',
    'ז',
    'ח',
    'ט',
    'י',
    'כ',
    'ל',
    'מ',
    'נ',
    'ס',
    'ע',
    'פ',
    'צ',
    'ק',
    'ר',
    'ש',
    'ת',
]


class PersonalizedBase(Dataset):
    def __init__(
        self,
        data_root,
        size=None,
        repeats=100,
        interpolation='bicubic',
        flip_p=0.5,
        set='train',
        placeholder_token='*',
        per_image_tokens=False,
        center_crop=False,
    ):

        self.data_root = data_root

        self.image_paths = [
            os.path.join(self.data_root, file_path)
            for file_path in os.listdir(self.data_root)
        ]

        # self._length = len(self.image_paths)
        self.num_images = len(self.image_paths)
        self._length = self.num_images

        self.placeholder_token = placeholder_token

        self.per_image_tokens = per_image_tokens
        self.center_crop = center_crop

        if per_image_tokens:
            assert self.num_images < len(
                per_img_token_list
            ), f"Can't use per-image tokens when the training set contains more than {len(per_img_token_list)} tokens. To enable larger sets, add more tokens to 'per_img_token_list'."

        if set == 'train':
            self._length = self.num_images * repeats

        self.size = size
        self.interpolation = {
            'linear': PIL.Image.LINEAR,
            'bilinear': PIL.Image.BILINEAR,
            'bicubic': PIL.Image.BICUBIC,
            'lanczos': PIL.Image.LANCZOS,
        }[interpolation]
        self.flip = transforms.RandomHorizontalFlip(p=flip_p)

    def __len__(self):
        return self._length

    def __getitem__(self, i):
        example = {}
        image = Image.open(self.image_paths[i % self.num_images])

        if not image.mode == 'RGB':
            image = image.convert('RGB')

        if self.per_image_tokens and np.random.uniform() < 0.25:
            text = random.choice(imagenet_dual_templates_small).format(
                self.placeholder_token, per_img_token_list[i % self.num_images]
            )
        else:
            text = random.choice(imagenet_templates_small).format(
                self.placeholder_token
            )

        example['caption'] = text

        # default to score-sde preprocessing
        img = np.array(image).astype(np.uint8)

        if self.center_crop:
            crop = min(img.shape[0], img.shape[1])
            h, w, = (
                img.shape[0],
                img.shape[1],
            )
            img = img[
                (h - crop) // 2 : (h + crop) // 2,
                (w - crop) // 2 : (w + crop) // 2,
            ]

        image = Image.fromarray(img)
        if self.size is not None:
            image = image.resize(
                (self.size, self.size), resample=self.interpolation
            )

        image = self.flip(image)
        image = np.array(image).astype(np.uint8)
        example['image'] = (image / 127.5 - 1.0).astype(np.float32)
        return example


================================================
FILE: src/stablediffusion/ldm/dream/conditioning.py
================================================
'''
This module handles the generation of the conditioning tensors, including management of
weighted subprompts.

Useful function exports:

get_uc_and_c()                  get the conditioned and unconditioned latent
split_weighted_subpromopts()    split subprompts, normalize and weight them
log_tokenization()              print out colour-coded tokens and warn if truncated

'''
import re
import torch

def get_uc_and_c(prompt, model, log_tokens=False, skip_normalize=False):
    uc = model.get_learned_conditioning([''])

    # get weighted sub-prompts
    weighted_subprompts = split_weighted_subprompts(
        prompt, skip_normalize
    )

    if len(weighted_subprompts) > 1:
        # i dont know if this is correct.. but it works
        c = torch.zeros_like(uc)
        # normalize each "sub prompt" and add it
        for subprompt, weight in weighted_subprompts:
            log_tokenization(subprompt, model, log_tokens)
            c = torch.add(
                c,
                model.get_learned_conditioning([subprompt]),
                alpha=weight,
            )
    else:   # just standard 1 prompt
        log_tokenization(prompt, model, log_tokens)
        c = model.get_learned_conditioning([prompt])
    return (uc, c)

def split_weighted_subprompts(text, skip_normalize=False)->list:
    """
    grabs all text up to the first occurrence of ':'
    uses the grabbed text as a sub-prompt, and takes the value following ':' as weight
    if ':' has no value defined, defaults to 1.0
    repeats until no text remaining
    """
    prompt_parser = re.compile("""
            (?P<prompt>     # capture group for 'prompt'
            (?:\\\:|[^:])+  # match one or more non ':' characters or escaped colons '\:'
            )               # end 'prompt'
            (?:             # non-capture group
            :+              # match one or more ':' characters
            (?P<weight>     # capture group for 'weight'
            -?\d+(?:\.\d+)? # match positive or negative integer or decimal number
            )?              # end weight capture group, make optional
            \s*             # strip spaces after weight
            |               # OR
            $               # else, if no ':' then match end of line
            )               # end non-capture group
            """, re.VERBOSE)
    parsed_prompts = [(match.group("prompt").replace("\\:", ":"), float(
        match.group("weight") or 1)) for match in re.finditer(prompt_parser, text)]
    if skip_normalize:
        return parsed_prompts
    weight_sum = sum(map(lambda x: x[1], parsed_prompts))
    if weight_sum == 0:
        print(
            "Warning: Subprompt weights add up to zero. Discarding and using even weights instead.")
        equal_weight = 1 / len(parsed_prompts)
        return [(x[0], equal_weight) for x in parsed_prompts]
    return [(x[0], x[1] / weight_sum) for x in parsed_prompts]
        
# shows how the prompt is tokenized
# usually tokens have '</w>' to indicate end-of-word,
# but for readability it has been replaced with ' '
def log_tokenization(text, model, log=False):
    if not log:
        return
    tokens    = model.cond_stage_model.tokenizer._tokenize(text)
    tokenized = ""
    discarded = ""
    usedTokens = 0
    totalTokens = len(tokens)
    for i in range(0, totalTokens):
        token = tokens[i].replace('</w>', ' ')
        # alternate color
        s = (usedTokens % 6) + 1
        if i < model.cond_stage_model.max_length:
            tokenized = tokenized + f"\x1b[0;3{s};40m{token}"
            usedTokens += 1
        else:  # over max token length
            discarded = discarded + f"\x1b[0;3{s};40m{token}"
        print(f"\n>> Tokens ({usedTokens}):\n{tokenized}\x1b[0m")
        if discarded != "":
            print(
                f">> Tokens Discarded ({totalTokens-usedTokens}):\n{discarded}\x1b[0m"
            )


================================================
FILE: src/stablediffusion/ldm/dream/devices.py
================================================
import torch
from torch import autocast
from contextlib import contextmanager, nullcontext

def choose_torch_device() -> str:
    '''Convenience routine for guessing which GPU device to run model on'''
    if torch.cuda.is_available():
        return 'cuda'
    if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
        return 'mps'
    return 'cpu'

def choose_autocast_device(device):
    '''Returns an autocast compatible device from a torch device'''
    device_type = device.type # this returns 'mps' on M1
    # autocast only supports cuda or cpu
    if device_type in ('cuda','cpu'):
        return device_type,autocast
    else:
        return 'cpu',nullcontext


================================================
FILE: src/stablediffusion/ldm/dream/generator/__init__.py
================================================
'''
Initialization file for the ldm.dream.generator package
'''
from .base import Generator


================================================
FILE: src/stablediffusion/ldm/dream/generator/base.py
================================================
'''
Base class for ldm.dream.generator.*
including img2img, txt2img, and inpaint
'''
import torch
import numpy as  np
import random
from tqdm import tqdm, trange
from PIL               import Image
from einops import rearrange, repeat
from pytorch_lightning import seed_everything
from src.stablediffusion.ldm.dream.devices import choose_autocast_device

downsampling = 8

class Generator():
    def __init__(self,model):
        self.model               = model
        self.seed                = None
        self.latent_channels     = model.channels
        self.downsampling_factor = downsampling   # BUG: should come from model or config
        self.variation_amount    = 0
        self.with_variations     = []

    # this is going to be overridden in img2img.py, txt2img.py and inpaint.py
    def get_make_image(self,prompt,**kwargs):
        """
        Returns a function returning an image derived from the prompt and the initial image
        Return value depends on the seed at the time you call it
        """
        raise NotImplementedError("image_iterator() must be implemented in a descendent class")

    def set_variation(self, seed, variation_amount, with_variations):
        self.seed             = seed
        self.variation_amount = variation_amount
        self.with_variations  = with_variations

    def generate(self,prompt,init_image,width,height,iterations=1,seed=None,
                 image_callback=None, step_callback=None,
                 **kwargs):
        device_type,scope   = choose_autocast_device(self.model.device)
        make_image          = self.get_make_image(
            prompt,
            init_image    = init_image,
            width         = width,
            height        = height,
            step_callback = step_callback,
            **kwargs
        )

        results             = []
        seed                = seed if seed else self.new_seed()
        seed, initial_noise = self.generate_initial_noise(seed, width, height)
        with scope(device_type), self.model.ema_scope():
            for n in trange(iterations, desc='Generating'):
                x_T = None
                if self.variation_amount > 0:
                    seed_everything(seed)
                    target_noise = self.get_noise(width,height)
                    x_T = self.slerp(self.variation_amount, initial_noise, target_noise)
                elif initial_noise is not None:
                    # i.e. we specified particular variations
                    x_T = initial_noise
                else:
                    seed_everything(seed)
                    if self.model.device.type == 'mps':
                        x_T = self.get_noise(width,height)

                # make_image will do the equivalent of get_noise itself
                image = make_image(x_T)
                results.append([image, seed])
                if image_callback is not None:
                    image_callback(image, seed)
                seed = self.new_seed()
        return results
    
    def sample_to_image(self,samples):
        """
        Returns a function returning an image derived from the prompt and the initial image
        Return value depends on the seed at the time you call it
        """
        x_samples = self.model.decode_first_stage(samples)
        x_samples = torch.clamp((x_samples + 1.0) / 2.0, min=0.0, max=1.0)
        if len(x_samples) != 1:
            raise Exception(
                f'>> expected to get a single image, but got {len(x_samples)}')
        x_sample = 255.0 * rearrange(
            x_samples[0].cpu().numpy(), 'c h w -> h w c'
        )
        return Image.fromarray(x_sample.astype(np.uint8))

    def generate_initial_noise(self, seed, width, height):
        initial_noise = None
        if self.variation_amount > 0 or len(self.with_variations) > 0:
            # use fixed initial noise plus random noise per iteration
            seed_everything(seed)
            initial_noise = self.get_noise(width,height)
            for v_seed, v_weight in self.with_variations:
                seed = v_seed
                seed_everything(seed)
                next_noise = self.get_noise(width,height)
                initial_noise = self.slerp(v_weight, initial_noise, next_noise)
            if self.variation_amount > 0:
                random.seed() # reset RNG to an actually random state, so we can get a random seed for variations
                seed = random.randrange(0,np.iinfo(np.uint32).max)
            return (seed, initial_noise)
        else:
            return (seed, None)

    # returns a tensor filled with random numbers from a normal distribution
    def get_noise(self,width,height):
        """
        Returns a tensor filled with random numbers, either form a normal distribution
        (txt2img) or from the latent image (img2img, inpaint)
        """
        raise NotImplementedError("get_noise() must be implemented in a descendent class")
    
    def new_seed(self):
        self.seed = random.randrange(0, np.iinfo(np.uint32).max)
        return self.seed

    def slerp(self, t, v0, v1, DOT_THRESHOLD=0.9995):
        '''
        Spherical linear interpolation
        Args:
            t (float/np.ndarray): Float value between 0.0 and 1.0
            v0 (np.ndarray): Starting vector
            v1 (np.ndarray): Final vector
            DOT_THRESHOLD (float): Threshold for considering the two vectors as
                                colineal. Not recommended to alter this.
        Returns:
            v2 (np.ndarray): Interpolation vector between v0 and v1
        '''
        inputs_are_torch = False
        if not isinstance(v0, np.ndarray):
            inputs_are_torch = True
            v0 = v0.detach().cpu().numpy()
        if not isinstance(v1, np.ndarray):
            inputs_are_torch = True
            v1 = v1.detach().cpu().numpy()

        dot = np.sum(v0 * v1 / (np.linalg.norm(v0) * np.linalg.norm(v1)))
        if np.abs(dot) > DOT_THRESHOLD:
            v2 = (1 - t) * v0 + t * v1
        else:
            theta_0 = np.arccos(dot)
            sin_theta_0 = np.sin(theta_0)
            theta_t = theta_0 * t
            sin_theta_t = np.sin(theta_t)
            s0 = np.sin(theta_0 - theta_t) / sin_theta_0
            s1 = sin_theta_t / sin_theta_0
            v2 = s0 * v0 + s1 * v1

        if inputs_are_torch:
            v2 = torch.from_numpy(v2).to(self.model.device)

        return v2



================================================
FILE: src/stablediffusion/ldm/dream/generator/img2img.py
================================================
'''
ldm.dream.generator.txt2img descends from src.stablediffusion.ldm.dream.generator
'''

import torch
import numpy as  np
from src.stablediffusion.ldm.dream.devices             import choose_autocast_device
from src.stablediffusion.ldm.dream.generator.base      import Generator
from src.stablediffusion.ldm.models.diffusion.ddim     import DDIMSampler

class Img2Img(Generator):
    def __init__(self,model):
        super().__init__(model)
        self.init_latent         = None    # by get_noise()
    
    @torch.no_grad()
    def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                       conditioning,init_image,strength,step_callback=None,**kwargs):
        """
        Returns a function returning an image derived from the prompt and the initial image
        Return value depends on the seed at the time you call it.
        """

        # PLMS sampler not supported yet, so ignore previous sampler
        if not isinstance(sampler,DDIMSampler):
            print(
                f">> sampler '{sampler.__class__.__name__}' is not yet supported. Using DDIM sampler"
            )
            sampler = DDIMSampler(self.model, device=self.model.device)

        sampler.make_schedule(
            ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
        )

        device_type,scope   = choose_autocast_device(self.model.device)
        with scope(device_type):
            self.init_latent = self.model.get_first_stage_encoding(
                self.model.encode_first_stage(init_image)
            ) # move to latent space

        t_enc = int(strength * steps)
        uc, c   = conditioning

        @torch.no_grad()
        def make_image(x_T):
            # encode (scaled latent)
            z_enc = sampler.stochastic_encode(
                self.init_latent,
                torch.tensor([t_enc]).to(self.model.device),
                noise=x_T
            )
            # decode it
            samples = sampler.decode(
                z_enc,
                c,
                t_enc,
                img_callback = step_callback,
                unconditional_guidance_scale=cfg_scale,
                unconditional_conditioning=uc,
            )
            return self.sample_to_image(samples)

        return make_image

    def get_noise(self,width,height):
        device      = self.model.device
        init_latent = self.init_latent
        assert init_latent is not None,'call to get_noise() when init_latent not set'
        if device.type == 'mps':
            return torch.randn_like(init_latent, device='cpu').to(device)
        else:
            return torch.randn_like(init_latent, device=device)


================================================
FILE: src/stablediffusion/ldm/dream/generator/inpaint.py
================================================
'''
ldm.dream.generator.inpaint descends from src.stablediffusion.ldm.dream.generator
'''

import torch
import numpy as  np
from einops import rearrange, repeat
from src.stablediffusion.ldm.dream.devices             import choose_autocast_device
from src.stablediffusion.ldm.dream.generator.img2img   import Img2Img
from src.stablediffusion.ldm.models.diffusion.ddim     import DDIMSampler

class Inpaint(Img2Img):
    def __init__(self,model):
        self.init_latent = None
        super().__init__(model)
    
    @torch.no_grad()
    def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                       conditioning,init_image,mask_image,strength,
                       step_callback=None,**kwargs):
        """
        Returns a function returning an image derived from the prompt and
        the initial image + mask.  Return value depends on the seed at
        the time you call it.  kwargs are 'init_latent' and 'strength'
        """

        mask_image = mask_image[0][0].unsqueeze(0).repeat(4,1,1).unsqueeze(0)
        mask_image = repeat(mask_image, '1 ... -> b ...', b=1)

        # PLMS sampler not supported yet, so ignore previous sampler
        if not isinstance(sampler,DDIMSampler):
            print(
                f">> sampler '{sampler.__class__.__name__}' is not yet supported. Using DDIM sampler"
            )
            sampler = DDIMSampler(self.model, device=self.model.device)

            sampler.make_schedule(
                ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
            )

        device_type,scope   = choose_autocast_device(self.model.device)
        with scope(device_type):
            self.init_latent = self.model.get_first_stage_encoding(
                self.model.encode_first_stage(init_image)
            ) # move to latent space

        t_enc   = int(strength * steps)
        uc, c   = conditioning

        print(f">> target t_enc is {t_enc} steps")

        @torch.no_grad()
        def make_image(x_T):
            # encode (scaled latent)
            z_enc = sampler.stochastic_encode(
                self.init_latent,
                torch.tensor([t_enc]).to(self.model.device),
                noise=x_T
            )
                                       
            # decode it
            samples = sampler.decode(
                z_enc,
                c,
                t_enc,
                img_callback                 = step_callback,
                unconditional_guidance_scale = cfg_scale,
                unconditional_conditioning = uc,
                mask                       = mask_image,
                init_latent                = self.init_latent
            )
            return self.sample_to_image(samples)

        return make_image





================================================
FILE: src/stablediffusion/ldm/dream/generator/txt2img.py
================================================
'''
ldm.dream.generator.txt2img inherits from src.stablediffusion.ldm.dream.generator
'''

import torch
import numpy as  np
from src.stablediffusion.ldm.dream.generator.base import Generator

class Txt2Img(Generator):
    def __init__(self,model):
        super().__init__(model)
    
    @torch.no_grad()
    def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
                       conditioning,width,height,step_callback=None,**kwargs):
        """
        Returns a function returning an image derived from the prompt and the initial image
        Return value depends on the seed at the time you call it
        kwargs are 'width' and 'height'
        """
        uc, c   = conditioning

        @torch.no_grad()
        def make_image(x_T):
            shape = [
                self.latent_channels,
                height // self.downsampling_factor,
                width  // self.downsampling_factor,
            ]
            samples, _ = sampler.sample(
                batch_size                   = 1,
                S                            = steps,
                x_T                          = x_T,
                conditioning                 = c,
                shape                        = shape,
                verbose                      = False,
                unconditional_guidance_scale = cfg_scale,
                unconditional_conditioning   = uc,
                eta                          = ddim_eta,
                img_callback                 = step_callback
            )
            return self.sample_to_image(samples)

        return make_image


    # returns a tensor filled with random numbers from a normal distribution
    def get_noise(self,width,height):
        device         = self.model.device
        if device.type == 'mps':
            return torch.randn([1,
                                self.latent_channels,
                                height // self.downsampling_factor,
                                width  // self.downsampling_factor],
                               device='cpu').to(device)
        else:
            return torch.randn([1,
                                self.latent_channels,
                                height // self.downsampling_factor,
                                width  // self.downsampling_factor],
                               device=device)


================================================
FILE: src/stablediffusion/ldm/dream/image_util.py
================================================
from math import sqrt, floor, ceil
from PIL import Image

class InitImageResizer():
    """Simple class to create resized copies of an Image while preserving the aspect ratio."""
    def __init__(self,Image):
        self.image = Image

    def resize(self,width=None,height=None) -> Image:
        """
        Return a copy of the image resized to fit within
        a box width x height. The aspect ratio is 
        maintained. If neither width nor height are provided, 
        then returns a copy of the original image. If one or the other is
        provided, then the other will be calculated from the
        aspect ratio.

        Everything is floored to the nearest multiple of 64 so
        that it can be passed to img2img()
        """
        im    = self.image
        
        ar = im.width/float(im.height)

        # Infer missing values from aspect ratio
        if not(width or height): # both missing
            width  = im.width
            height = im.height
        elif not height:           # height missing
            height = int(width/ar)
        elif not width:            # width missing
            width  = int(height*ar)

        # rw and rh are the resizing width and height for the image
        # they maintain the aspect ratio, but may not completelyl fill up
        # the requested destination size
        (rw,rh) = (width,int(width/ar)) if im.width>=im.height else (int(height*ar),height)

        #round everything to multiples of 64
        width,height,rw,rh = map(
            lambda x: x-x%64, (width,height,rw,rh)
        )

        # no resize necessary, but return a copy
        if im.width == width and im.height == height:
            return im.copy()
        
        # otherwise resize the original image so that it fits inside the bounding box
        resized_image = self.image.resize((rw,rh),resample=Image.Resampling.LANCZOS)
        return resized_image

def make_grid(image_list, rows=None, cols=None):
    image_cnt = len(image_list)
    if None in (rows, cols):
        rows = floor(sqrt(image_cnt))  # try to make it square
        cols = ceil(image_cnt / rows)
    width = image_list[0].width
    height = image_list[0].height

    grid_img = Image.new('RGB', (width * cols, height * rows))
    i = 0
    for r in range(0, rows):
        for c in range(0, cols):
            if i >= len(image_list):
                break
            grid_img.paste(image_list[i], (c * width, r * height))
            i = i + 1

    return grid_img



================================================
FILE: src/stablediffusion/ldm/dream/pngwriter.py
================================================
"""
Two helper classes for dealing with PNG images and their path names.
PngWriter -- Converts Images generated by T2I into PNGs, finds
             appropriate names for them, and writes prompt metadata
             into the PNG.
PromptFormatter -- Utility for converting a Namespace of prompt parameters
             back into a formatted prompt string with command-line switches.
"""
import os
import re
from PIL import PngImagePlugin

# -------------------image generation utils-----


class PngWriter:
    def __init__(self, outdir):
        self.outdir = outdir
        os.makedirs(outdir, exist_ok=True)

    # gives the next unique prefix in outdir
    def unique_prefix(self):
        # sort reverse alphabetically until we find max+1
        dirlist = sorted(os.listdir(self.outdir), reverse=True)
        # find the first filename that matches our pattern or return 000000.0.png
        existing_name = next(
            (f for f in dirlist if re.match('^(\d+)\..*\.png', f)),
            '0000000.0.png',
        )
        basecount = int(existing_name.split('.', 1)[0]) + 1
        return f'{basecount:06}'

    # saves image named _image_ to outdir/name, writing metadata from prompt
    # returns full path of output
    def save_image_and_prompt_to_png(self, image, prompt, name):
        path = os.path.join(self.outdir, name)
        info = PngImagePlugin.PngInfo()
        info.add_text('Dream', prompt)
        image.save(path, 'PNG', pnginfo=info)
        return path


class PromptFormatter:
    def __init__(self, t2i, opt):
        self.t2i = t2i
        self.opt = opt

    # note: the t2i object should provide all these values.
    # there should be no need to or against opt values
    def normalize_prompt(self):
        """Normalize the prompt and switches"""
        t2i = self.t2i
        opt = self.opt

        switches = list()
        switches.append(f'"{opt.prompt}"')
        switches.append(f'-s{opt.steps        or t2i.steps}')
        switches.append(f'-W{opt.width        or t2i.width}')
        switches.append(f'-H{opt.height       or t2i.height}')
        switches.append(f'-C{opt.cfg_scale    or t2i.cfg_scale}')
        switches.append(f'-A{opt.sampler_name or t2i.sampler_name}')
# to do: put model name into the t2i object
#        switches.append(f'--model{t2i.model_name}')
        if opt.seamless or t2i.seamless:
            switches.append(f'--seamless')
        if opt.init_img:
            switches.append(f'-I{opt.init_img}')
        if opt.fit:
            switches.append(f'--fit')
        if opt.strength and opt.init_img is not None:
            switches.append(f'-f{opt.strength or t2i.strength}')
        if opt.gfpgan_strength:
            switches.append(f'-G{opt.gfpgan_strength}')
        if opt.upscale:
            switches.append(f'-U {" ".join([str(u) for u in opt.upscale])}')
        if opt.variation_amount > 0:
            switches.append(f'-v{opt.variation_amount}')
        if opt.with_variations:
            formatted_variations = ','.join(f'{seed}:{weight}' for seed, weight in opt.with_variations)
            switches.append(f'-V{formatted_variations}')
        return ' '.join(switches)


================================================
FILE: src/stablediffusion/ldm/dream/readline.py
================================================
"""
Readline helper functions for dream.py (linux and mac only).
"""
import os
import re
import atexit

# ---------------readline utilities---------------------
try:
    import readline

    readline_available = True
except:
    readline_available = False


class Completer:
    def __init__(self, options):
        self.options = sorted(options)
        return

    def complete(self, text, state):
        buffer = readline.get_line_buffer()

        if text.startswith(('-I', '--init_img','-M','--init_mask')):
            return self._path_completions(text, state, ('.png','.jpg','.jpeg'))

        if buffer.strip().endswith('cd') or text.startswith(('.', '/')):
            return self._path_completions(text, state, ())

        response = None
        if state == 0:
            # This is the first time for this text, so build a match list.
            if text:
                self.matches = [
                    s for s in self.options if s and s.startswith(text)
                ]
            else:
                self.matches = self.options[:]

        # Return the state'th item from the match list,
        # if we have that many.
        try:
            response = self.matches[state]
        except IndexError:
            response = None
        return response

    def _path_completions(self, text, state, extensions):
        # get the path so far
        # TODO: replace this mess with a regular expression match
        if text.startswith('-I'):
            path = text.replace('-I', '', 1).lstrip()
        elif text.startswith('--init_img='):
            path = text.replace('--init_img=', '', 1).lstrip()
        elif text.startswith('--init_mask='):
            path = text.replace('--init_mask=', '', 1).lstrip()
        elif text.startswith('-M'):
            path = text.replace('-M', '', 1).lstrip()
        else:
            path = text

        matches = list()

        path = os.path.expanduser(path)
        if len(path) == 0:
            matches.append(text + './')
        else:
            dir = os.path.dirname(path)
            dir_list = os.listdir(dir)
            for n in dir_list:
                if n.startswith('.') and len(n) > 1:
                    continue
                full_path = os.path.join(dir, n)
                if full_path.startswith(path):
                    if os.path.isdir(full_path):
                        matches.append(
                            os.path.join(os.path.dirname(text), n) + '/'
                        )
                    elif n.endswith(extensions):
                        matches.append(os.path.join(os.path.dirname(text), n))

        try:
            response = matches[state]
        except IndexError:
            response = None
        return response


if readline_available:
    readline.set_completer(
        Completer(
            [
                '--steps','-s',
                '--seed','-S',
                '--iterations','-n',
                '--width','-W','--height','-H',
                '--cfg_scale','-C',
                '--grid','-g',
                '--individual','-i',
                '--init_img','-I',
                '--init_mask','-M',
                '--strength','-f',
                '--variants','-v',
                '--outdir','-o',
                '--sampler','-A','-m',
                '--embedding_path',
                '--device',
                '--grid','-g',
                '--gfpgan_strength','-G',
                '--upscale','-U',
                '-save_orig','--save_original',
                '--skip_normalize','-x',
                '--log_tokenization','t',
            ]
        ).complete
    )
    readline.set_completer_delims(' ')
    readline.parse_and_bind('tab: complete')

    histfile = os.path.join(os.path.expanduser('~'), '.dream_history')
    try:
        readline.read_history_file(histfile)
        readline.set_history_length(1000)
    except FileNotFoundError:
        pass
    atexit.register(readline.write_history_file, histfile)


================================================
FILE: src/stablediffusion/ldm/dream/server.py
================================================
import argparse
import json
import base64
import mimetypes
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from src.stablediffusion.ldm.dream.pngwriter import PngWriter, PromptFormatter
from threading import Event

def build_opt(post_data, seed, gfpgan_model_exists):
    opt = argparse.Namespace()
    setattr(opt, 'prompt', post_data['prompt'])
    setattr(opt, 'init_img', post_data['initimg'])
    setattr(opt, 'strength', float(post_data['strength']))
    setattr(opt, 'iterations', int(post_data['iterations']))
    setattr(opt, 'steps', int(post_data['steps']))
    setattr(opt, 'width', int(post_data['width']))
    setattr(opt, 'height', int(post_data['height']))
    setattr(opt, 'seamless', 'seamless' in post_data)
    setattr(opt, 'fit', 'fit' in post_data)
    setattr(opt, 'mask', 'mask' in post_data)
    setattr(opt, 'invert_mask', 'invert_mask' in post_data)
    setattr(opt, 'cfg_scale', float(post_data['cfg_scale']))
    setattr(opt, 'sampler_name', post_data['sampler_name'])
    setattr(opt, 'gfpgan_strength', float(post_data['gfpgan_strength']) if gfpgan_model_exists else 0)
    setattr(opt, 'upscale', [int(post_data['upscale_level']), float(post_data['upscale_strength'])] if post_data['upscale_level'] != '' else None)
    setattr(opt, 'progress_images', 'progress_images' in post_data)
    setattr(opt, 'seed', None if int(post_data['seed']) == -1 else int(post_data['seed']))
    setattr(opt, 'variation_amount', float(post_data['variation_amount']) if int(post_data['seed']) != -1 else 0)
    setattr(opt, 'with_variations', [])

    broken = False
    if int(post_data['seed']) != -1 and post_data['with_variations'] != '':
        for part in post_data['with_variations'].split(','):
            seed_and_weight = part.split(':')
            if len(seed_and_weight) != 2:
                print(f'could not parse with_variation part "{part}"')
                broken = True
                break
            try:
                seed = int(seed_and_weight[0])
                weight = float(seed_and_weight[1])
            except ValueError:
                print(f'could not parse with_variation part "{part}"')
                broken = True
                break
            opt.with_variations.append([seed, weight])
    
    if broken:
        raise CanceledException

    if len(opt.with_variations) == 0:
        opt.with_variations = None

    return opt

class CanceledException(Exception):
    pass

class DreamServer(BaseHTTPRequestHandler):
    model = None
    outdir = None
    canceled = Event()

    def do_GET(self):
        if self.path == "/":
            self.send_response(200)
            self.send_header("Content-type", "text/html")
            self.end_headers()
            with open("./static/dream_web/index.html", "rb") as content:
                self.wfile.write(content.read())
        elif self.path == "/config.js":
            # unfortunately this import can't be at the top level, since that would cause a circular import
            from src.stablediffusion.ldm.gfpgan.gfpgan_tools import gfpgan_model_exists
            self.send_response(200)
            self.send_header("Content-type", "application/javascript")
            self.end_headers()
            config = {
                'gfpgan_model_exists': gfpgan_model_exists
            }
            self.wfile.write(bytes("let config = " + json.dumps(config) + ";\n", "utf-8"))
        elif self.path == "/run_log.json":
            self.send_response(200)
            self.send_header("Content-type", "application/json")
            self.end_headers()
            output = []
            
            log_file = os.path.join(self.outdir, "dream_web_log.txt")
            if os.path.exists(log_file):
                with open(log_file, "r") as log:
                    for line in log:
                        url, config = line.split(": {", maxsplit=1)
                        config = json.loads("{" + config)
                        config["url"] = url.lstrip(".")
                        if os.path.exists(url):
                            output.append(config)

            self.wfile.write(bytes(json.dumps({"run_log": output}), "utf-8"))
        elif self.path == "/cancel":
            self.canceled.set()
            self.send_response(200)
            self.send_header("Content-type", "application/json")
            self.end_headers()
            self.wfile.write(bytes('{}', 'utf8'))
        else:
            path = "." + self.path
            cwd = os.path.realpath(os.getcwd())
            is_in_cwd = os.path.commonprefix((os.path.realpath(path), cwd)) == cwd
            if not (is_in_cwd and os.path.exists(path)):
                self.send_response(404)
                return
            mime_type = mimetypes.guess_type(path)[0]
            if mime_type is not None:
                self.send_response(200)
                self.send_header("Content-type", mime_type)
                self.end_headers()
                with open("." + self.path, "rb") as content:
                    self.wfile.write(content.read())
            else:
                self.send_response(404)

    def do_POST(self):
        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.end_headers()

        # unfortunately this import can't be at the top level, since that would cause a circular import
        from src.stablediffusion.ldm.gfpgan.gfpgan_tools import gfpgan_model_exists

        content_length = int(self.headers['Content-Length'])
        post_data = json.loads(self.rfile.read(content_length))
        opt = build_opt(post_data, self.model.seed, gfpgan_model_exists)

        self.canceled.clear()
        print(f">> Request to generate with prompt: {opt.prompt}")
        # In order to handle upscaled images, the PngWriter needs to maintain state
        # across images generated by each call to prompt2img(), so we define it in
        # the outer scope of image_done()
        config = post_data.copy() # Shallow copy
        config['initimg'] = config.pop('initimg_name', '')

        images_generated = 0    # helps keep track of when upscaling is started
        images_upscaled = 0     # helps keep track of when upscaling is completed
        pngwriter = PngWriter(self.outdir)

        prefix = pngwriter.unique_prefix()
        # if upscaling is requested, then this will be called twice, once when
        # the images are first generated, and then again when after upscaling
        # is complete. The upscaling replaces the original file, so the second
        # entry should not be inserted into the image list.
        def image_done(image, seed, upscaled=False):
            name = f'{prefix}.{seed}.png'
            iter_opt = argparse.Namespace(**vars(opt)) # copy
            if opt.variation_amount > 0:
                this_variation = [[seed, opt.variation_amount]]
                if opt.with_variations is None:
                    iter_opt.with_variations = this_variation
                else:
                    iter_opt.with_variations = opt.with_variations + this_variation
                iter_opt.variation_amount = 0
            elif opt.with_variations is None:
                iter_opt.seed = seed
            normalized_prompt = PromptFormatter(self.model, iter_opt).normalize_prompt()
            path = pngwriter.save_image_and_prompt_to_png(image, f'{normalized_prompt} -S{iter_opt.seed}', name)

            if int(config['seed']) == -1:
                config['seed'] = seed
            # Append post_data to log, but only once!
            if not upscaled:
                with open(os.path.join(self.outdir, "dream_web_log.txt"), "a") as log:
                    log.write(f"{path}: {json.dumps(config)}\n")

                self.wfile.write(bytes(json.dumps(
                    {'event': 'result', 'url': path, 'seed': seed, 'config': config}
                ) + '\n',"utf-8"))

            # control state of the "postprocessing..." message
            upscaling_requested = opt.upscale or opt.gfpgan_strength > 0
            nonlocal images_generated # NB: Is this bad python style? It is typical usage in a perl closure.
            nonlocal images_upscaled  # NB: Is this bad python style? It is typical usage in a perl closure.
            if upscaled:
                images_upscaled += 1
            else:
                images_generated += 1
            if upscaling_requested:
                action = None
                if images_generated >= opt.iterations:
                    if images_upscaled < opt.iterations:
                        action = 'upscaling-started'
                    else:
                        action = 'upscaling-done'
                if action:
                    x = images_upscaled + 1
                    self.wfile.write(bytes(json.dumps(
                        {'event': action, 'processed_file_cnt': f'{x}/{opt.iterations}'}
                    ) + '\n',"utf-8"))

        step_writer = PngWriter(os.path.join(self.outdir, "intermediates"))
        step_index = 1
        def image_progress(sample, step):
            if self.canceled.is_set():
                self.wfile.write(bytes(json.dumps({'event':'canceled'}) + '\n', 'utf-8'))
                raise CanceledException
            path = None
            # since rendering images is moderately expensive, only render every 5th image
            # and don't bother with the last one, since it'll render anyway
            nonlocal step_index
            if opt.progress_images and step % 5 == 0 and step < opt.steps - 1:
                image = self.model.sample_to_image(sample)
                name = f'{prefix}.{opt.seed}.{step_index}.png'
                metadata = f'{opt.prompt} -S{opt.seed} [intermediate]'
                path = step_writer.save_image_and_prompt_to_png(image, metadata, name)
                step_index += 1
            self.wfile.write(bytes(json.dumps(
                {'event': 'step', 'step': step + 1, 'url': path}
            ) + '\n',"utf-8"))

        try:
            if opt.init_img is None:
                # Run txt2img
                self.model.prompt2image(**vars(opt), step_callback=image_progress, image_callback=image_done)
            else:
                # Decode initimg as base64 to temp file
                with open("./img2img-tmp.png", "wb") as f:
                    initimg = opt.init_img.split(",")[1] # Ignore mime type
                    f.write(base64.b64decode(initimg))
                opt1 = argparse.Namespace(**vars(opt))
                opt1.init_img = "./img2img-tmp.png"

                try:
                    # Run img2img
                    self.model.prompt2image(**vars(opt1), step_callback=image_progress, image_callback=image_done)
                finally:
                    # Remove the temp file
                    os.remove("./img2img-tmp.png")
        except CanceledException:
            print(f"Canceled.")
            return


class ThreadingDreamServer(ThreadingHTTPServer):
    def __init__(self, server_address):
        super(ThreadingDreamServer, self).__init__(server_address, DreamServer)


================================================
FILE: src/stablediffusion/ldm/generate.py
================================================
# Copyright (c) 2022 Lincoln D. Stein (https://github.com/lstein)

# Derived from source code carrying the following copyrights
# Copyright (c) 2022 Machine Vision and Learning Group, LMU Munich
# Copyright (c) 2022 Robin Rombach and Patrick Esser and contributors

import torch
import numpy as np
import random
import os
import time
import re
import sys
import traceback
import transformers

from omegaconf import OmegaConf
from PIL import Image, ImageOps
from torch import nn
from pytorch_lightning import seed_everything

from src.stablediffusion.ldm.util                      import instantiate_from_config
from src.stablediffusion.ldm.models.diffusion.ddim     import DDIMSampler
from src.stablediffusion.ldm.models.diffusion.plms     import PLMSSampler
from src.stablediffusion.ldm.models.diffusion.ksampler import KSampler
from src.stablediffusion.ldm.dream.pngwriter           import PngWriter
from src.stablediffusion.ldm.dream.image_util          import InitImageResizer
from src.stablediffusion.ldm.dream.devices             import choose_torch_device
from src.stablediffusion.ldm.dream.conditioning        import get_uc_and_c

"""Simplified text to image API for stable diffusion/latent diffusion

Example Usage:

from src.stablediffusion.ldm.generate import Generate

# Create an object with default values
gr = Generate()

# do the slow model initialization
gr.load_model()

# Do the fast inference & image generation. Any options passed here
# override the default values assigned during class initialization
# Will call load_model() if the model was not previously loaded and so
# may be slow at first.
# The method returns a list of images. Each row of the list is a sub-list of [filename,seed]
results = gr.prompt2png(prompt     = "an astronaut riding a horse",
                         outdir     = "./outputs/samples",
                         iterations = 3)

for row in results:
    print(f'filename={row[0]}')
    print(f'seed    ={row[1]}')

# Same thing, but using an initial image.
results = gr.prompt2png(prompt   = "an astronaut riding a horse",
                         outdir   = "./outputs/,
                         iterations = 3,
                         init_img = "./sketches/horse+rider.png")

for row in results:
    print(f'filename={row[0]}')
    print(f'seed    ={row[1]}')

# Same thing, but we return a series of Image objects, which lets you manipulate them,
# combine them, and save them under arbitrary names

results = gr.prompt2image(prompt   = "an astronaut riding a horse"
                           outdir   = "./outputs/")
for row in results:
    im   = row[0]
    seed = row[1]
    im.save(f'./outputs/samples/an_astronaut_riding_a_horse-{seed}.png')
    im.thumbnail(100,100).save('./outputs/samples/astronaut_thumb.jpg')

Note that the old txt2img() and img2img() calls are deprecated but will
still work.

The full list of arguments to Generate() are:
gr = Generate(
          weights     = path to model weights ('models/ldm/stable-diffusion-v1/model.ckpt')
          config     = path to model configuraiton ('configs/stable-diffusion/v1-inference.yaml')
          iterations  = <integer>     // how many times to run the sampling (1)
          steps       = <integer>     // 50
          seed        = <integer>     // current system time
          sampler_name= ['ddim', 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms', 'plms']  // k_lms
          grid        = <boolean>     // false
          width       = <integer>     // image width, multiple of 64 (512)
          height      = <integer>     // image height, multiple of 64 (512)
          cfg_scale   = <float>       // condition-free guidance scale (7.5)
          )

"""


class Generate:
    """Generate class
    Stores default values for multiple configuration items
    """

    def __init__(
            self,
            iterations            = 1,
            steps                 = 50,
            cfg_scale             = 7.5,
            weights               = 'models/ldm/stable-diffusion-v1/model.ckpt',
            config                = 'configs/stable-diffusion/v1-inference.yaml',
            grid                  = False,
            width                 = 512,
            height                = 512,
            sampler_name          = 'k_lms',
            ddim_eta              = 0.0,  # deterministic
            precision             = 'autocast',
            full_precision        = False,
            strength              = 0.75,  # default in scripts/img2img.py
            seamless              = False,
            embedding_path        = None,
            device_type           = 'cuda',
            ignore_ctrl_c         = False,
    ):
        self.iterations               = iterations
        self.width                    = width
        self.height                   = height
        self.steps                    = steps
        self.cfg_scale                = cfg_scale
        self.weights                  = weights
        self.config                   = config
        self.sampler_name             = sampler_name
        self.grid                     = grid
        self.ddim_eta                 = ddim_eta
        self.precision                = precision
        self.full_precision           = True if choose_torch_device() == 'mps' else full_precision
        self.strength                 = strength
        self.seamless                 = seamless
        self.embedding_path           = embedding_path
        self.device_type              = device_type
        self.ignore_ctrl_c            = ignore_ctrl_c    # note, this logic probably doesn't belong here...
        self.model                    = None     # empty for now
        self.sampler                  = None
        self.device                   = None
        self.generators               = {}
        self.base_generator           = None
        self.seed                     = None

        if device_type == 'cuda' and not torch.cuda.is_available():
            device_type = choose_torch_device()
            print(">> cuda not available, using device", device_type)
        self.device = torch.device(device_type)

        # for VRAM usage statistics
        device_type          = choose_torch_device()
        self.session_peakmem = torch.cuda.max_memory_allocated() if device_type == 'cuda' else None
        transformers.logging.set_verbosity_error()

    def prompt2png(self, prompt, outdir, **kwargs):
        """
        Takes a prompt and an output directory, writes out the requested number
        of PNG files, and returns an array of [[filename,seed],[filename,seed]...]
        Optional named arguments are the same as those passed to Generate and prompt2image()
        """
        results = self.prompt2image(prompt, **kwargs)
        pngwriter = PngWriter(outdir)
        prefix = pngwriter.unique_prefix()
        outputs = []
        for image, seed in results:
            name = f'{prefix}.{seed}.png'
            path = pngwriter.save_image_and_prompt_to_png(
                image, f'{prompt} -S{seed}', name)
            outputs.append([path, seed])
        return outputs

    def txt2img(self, prompt, **kwargs):
        outdir = kwargs.pop('outdir', 'outputs/img-samples')
        return self.prompt2png(prompt, outdir, **kwargs)

    def img2img(self, prompt, **kwargs):
        outdir = kwargs.pop('outdir', 'outputs/img-samples')
        assert (
            'init_img' in kwargs
        ), 'call to img2img() must include the init_img argument'
        return self.prompt2png(prompt, outdir, **kwargs)

    def prompt2image(
            self,
            # these are common
            prompt,
            iterations     =    None,
            steps          =    None,
            seed           =    None,
            cfg_scale      =    None,
            ddim_eta       =    None,
            skip_normalize =    False,
            image_callback =    None,
            step_callback  =    None,
            width          =    None,
            height         =    None,
            sampler_name   =    None,
            seamless       =    False,
            log_tokenization=  False,
            with_variations =   None,
            variation_amount =  0.0,
            # these are specific to img2img and inpaint
            init_img       =    None,
            init_mask      =    None,
            fit            =    False,
            strength       =    None,
            # these are specific to GFPGAN/ESRGAN
            gfpgan_strength=    0,
            save_original  =    False,
            upscale        =    None,
            **args,
    ):   # eat up additional cruft
        """
        ldm.generate.prompt2image() is the common entry point for txt2img() and img2img()
        It takes the following arguments:
           prompt                          // prompt string (no default)
           iterations                      // iterations (1); image count=iterations
           steps                           // refinement steps per iteration
           seed                            // seed for random number generator
           width                           // width of image, in multiples of 64 (512)
           height                          // height of image, in multiples of 64 (512)
           cfg_scale                       // how strongly the prompt influences the image (7.5) (must be >1)
           seamless                        // whether the generated image should tile
           init_img                        // path to an initial image
           strength                        // strength for noising/unnoising init_img. 0.0 preserves image exactly, 1.0 replaces it completely
           gfpgan_strength                 // strength for GFPGAN. 0.0 preserves image exactly, 1.0 replaces it completely
           ddim_eta                        // image randomness (eta=0.0 means the same seed always produces the same image)
           step_callback                   // a function or method that will be called each step
           image_callback                  // a function or method that will be called each time an image is generated
           with_variations                 // a weighted list [(seed_1, weight_1), (seed_2, weight_2), ...] of variations which should be applied before doing any generation
           variation_amount                // optional 0-1 value to slerp from -S noise to random noise (allows variations on an image)

        To use the step callback, define a function that receives two arguments:
        - Image GPU data
        - The step number

        To use the image callback, define a function of method that receives two arguments, an Image object
        and the seed. You can then do whatever you like with the image, including converting it to
        different formats and manipulating it. For example:

            def process_image(image,seed):
                image.save(f{'images/seed.png'})

        The callback used by the prompt2png() can be found in ldm/dream_util.py. It contains code
        to create the requested output directory, select a unique informative name for each image, and
        write the prompt into the PNG metadata.
        """
        # TODO: convert this into a getattr() loop
        steps                 = steps      or self.steps
        width                 = width      or self.width
        height                = height     or self.height
        seamless              = seamless   or self.seamless
        cfg_scale             = cfg_scale  or self.cfg_scale
        ddim_eta              = ddim_eta   or self.ddim_eta
        iterations            = iterations or self.iterations
        strength              = strength   or self.strength
        self.seed             = seed
        self.log_tokenization = log_tokenization
        with_variations = [] if with_variations is None else with_variations

        model = (
            self.load_model()
        )  # will instantiate the model or return it from cache

        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
                m.padding_mode = 'circular' if seamless else m._orig_padding_mode
        
        assert cfg_scale > 1.0, 'CFG_Scale (-C) must be >1.0'
        assert (
            0.0 < strength < 1.0
        ), 'img2img and inpaint strength can only work with 0.0 < strength < 1.0'
        assert (
                0.0 <= variation_amount <= 1.0
        ), '-v --variation_amount must be in [0.0, 1.0]'

        # check this logic - doesn't look right
        if len(with_variations) > 0 or variation_amount > 1.0:
            assert seed is not None,\
                'seed must be specified when using with_variations'
            if variation_amount == 0.0:
                assert iterations == 1,\
                    'when using --with_variations, multiple iterations are only possible when using --variation_amount'
            assert all(0 <= weight <= 1 for _, weight in with_variations),\
                f'variation weights must be in [0.0, 1.0]: got {[weight for _, weight in with_variations]}'

        width, height, _ = self._resolution_check(width, height, log=True)

        if sampler_name and (sampler_name != self.sampler_name):
            self.sampler_name = sampler_name
            self._set_sampler()

        tic = time.time()
        if torch.cuda.is_available():
            torch.cuda.reset_peak_memory_stats()

        results          = list()
        init_image       = None
        mask_image       = None

        try:
            uc, c = get_uc_and_c(
                prompt, model=self.model,
                skip_normalize=skip_normalize,
                log_tokens=self.log_tokenization
            )

            (init_image,mask_image) = self._make_images(init_img,init_mask, width, height, fit)
            
            if (init_image is not None) and (mask_image is not None):
                generator = self._make_inpaint()
            elif init_image is not None:
                generator = self._make_img2img()
            else:
                generator = self._make_txt2img()

            generator.set_variation(self.seed, variation_amount, with_variations)
            results = generator.generate(
                prompt,
                iterations     = iterations,
                seed           = self.seed,
                sampler        = self.sampler,
                steps          = steps,
                cfg_scale      = cfg_scale,
                conditioning   = (uc,c),
                ddim_eta       = ddim_eta,
                image_callback = image_callback,  # called after the final image is generated
                step_callback  = step_callback,   # called after each intermediate image is generated
                width          = width,
                height         = height,
                init_image     = init_image,      # notice that init_image is different from init_img
                mask_image     = mask_image,
                strength       = strength,
            )

            if upscale is not None or gfpgan_strength > 0:
                self.upscale_and_reconstruct(results,
                                             upscale        = upscale,
                                             strength       = gfpgan_strength,
                                             save_original  = save_original,
                                             image_callback = image_callback)

        except KeyboardInterrupt:
            print('*interrupted*')
            if not self.ignore_ctrl_c:
                raise KeyboardInterrupt
            print(
                '>> Partial results will be returned; if --grid was requested, nothing will be returned.'
            )
        except RuntimeError as e:
            print(traceback.format_exc(), file=sys.stderr)
            print('>> Could not generate image.')

        toc = time.time()
        print('>> Usage stats:')
        print(
            f'>>   {len(results)} image(s) generated in', '%4.2fs' % (toc - tic)
        )
        if torch.cuda.is_available() and self.device.type == 'cuda':
            print(
                f'>>   Max VRAM used for this generation:',
                '%4.2fG.' % (torch.cuda.max_memory_allocated() / 1e9),
                'Current VRAM utilization:'
                '%4.2fG' % (torch.cuda.memory_allocated() / 1e9),
            )

            self.session_peakmem = max(
                self.session_peakmem, torch.cuda.max_memory_allocated()
            )
            print(
                f'>>   Max VRAM used since script start: ',
                '%4.2fG' % (self.session_peakmem / 1e9),
            )
        return results

    def _make_images(self, img_path, mask_path, width, height, fit=False):
        init_image      = None
        init_mask       = None
        if not img_path:
            return None,None

        image        = self._load_img(img_path, width, height, fit=fit) # this returns an Image
        init_image   = self._create_init_image(image)                   # this returns a torch tensor

        if self._has_transparency(image) and not mask_path:      # if image has a transparent area and no mask was provided, then try to generate mask
            print('>> Initial image has transparent areas. Will inpaint in these regions.')
            if self._check_for_erasure(image):
                print(
                    '>> WARNING: Colors underneath the transparent region seem to have been erased.\n',
                    '>>          Inpainting will be suboptimal. Please preserve the colors when making\n',
                    '>>          a transparency mask, or provide mask explicitly using --init_mask (-M).'
                )
            init_mask = self._create_init_mask(image)                   # this returns a torch tensor

        if mask_path:
            mask_image  = self._load_img(mask_path, width, height, fit=fit) # this returns an Image
            init_mask   = self._create_init_mask(mask_image)

        return init_image,init_mask

    def _make_img2img(self):
        if not self.generators.get('img2img'):
            from src.stablediffusion.ldm.dream.generator.img2img import Img2Img
            self.generators['img2img'] = Img2Img(self.model)
        return self.generators['img2img']

    def _make_txt2img(self):
        if not self.generators.get('txt2img'):
            from src.stablediffusion.ldm.dream.generator.txt2img import Txt2Img
            self.generators['txt2img'] = Txt2Img(self.model)
        return self.generators['txt2img']

    def _make_inpaint(self):
        if not self.generators.get('inpaint'):
            from src.stablediffusion.ldm.dream.generator.inpaint import Inpaint
            self.generators['inpaint'] = Inpaint(self.model)
        return self.generators['inpaint']

    def load_model(self):
        """Load and initialize the model from configuration variables passed at object creation time"""
        if self.model is None:
            seed_everything(random.randrange(0, np.iinfo(np.uint32).max))
            try:
                config = OmegaConf.load(self.config)
                model = self._load_model_from_config(config, self.weights)
                if self.embedding_path is not None:
                    model.embedding_manager.load(
                        self.embedding_path, self.full_precision
                    )
                self.model = model.to(self.device)
                # model.to doesn't change the cond_stage_model.device used to move the tokenizer output, so set it here
                self.model.cond_stage_model.device = self.device
            except AttributeError as e:
                print(f'>> Error loading model. {str(e)}', file=sys.stderr)
                print(traceback.format_exc(), file=sys.stderr)
                raise SystemExit from e

            self._set_sampler()

            for m in self.model.modules():
                if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
                    m._orig_padding_mode = m.padding_mode

        return self.model

    def upscale_and_reconstruct(self,
                                image_list,
                                upscale       = None,
                                strength      =  0.0,
                                save_original = False,
                                image_callback = None):
        try:
            if upscale is not None:
                from src.stablediffusion.ldm.gfpgan.gfpgan_tools import real_esrgan_upscale
            if strength > 0:
                from src.stablediffusion.ldm.gfpgan.gfpgan_tools import run_gfpgan
        except (ModuleNotFoundError, ImportError):
            print(traceback.format_exc(), file=sys.stderr)
            print('>> You may need to install the ESRGAN and/or GFPGAN modules')
            return
            
        for r in image_list:
            image, seed = r
            try:
                if upscale is not None:
                    if len(upscale) < 2:
                        upscale.append(0.75)
                    image = real_esrgan_upscale(
                        image,
                        upscale[1],
                        int(upscale[0]),
                        seed,
                    )
                if strength > 0:
                    image = run_gfpgan(
                        image, strength, seed, 1
                    )
            except Exception as e:
                print(
                    f'>> Error running RealESRGAN or GFPGAN. Your image was not upscaled.\n{e}'
                )

            if image_callback is not None:
                image_callback(image, seed, upscaled=True)
            else:
                r[0] = image

    # to help WebGUI - front end to generator util function
    def sample_to_image(self,samples):
        return self._sample_to_image(samples)

    def _sample_to_image(self,samples):
        if not self.base_generator:
            from src.stablediffusion.ldm.dream.generator import Generator
            self.base_generator = Generator(self.model)
        return self.base_generator.sample_to_image(samples)

    def _set_sampler(self):
        msg = f'>> Setting Sampler to {self.sampler_name}'
        if self.sampler_name == 'plms':
            self.sampler = PLMSSampler(self.model, device=self.device)
        elif self.sampler_name == 'ddim':
            self.sampler = DDIMSampler(self.model, device=self.device)
        elif self.sampler_name == 'k_dpm_2_a':
            self.sampler = KSampler(
                self.model, 'dpm_2_ancestral', device=self.device
            )
        elif self.sampler_name == 'k_dpm_2':
            self.sampler = KSampler(self.model, 'dpm_2', device=self.device)
        elif self.sampler_name == 'k_euler_a':
            self.sampler = KSampler(
                self.model, 'euler_ancestral', device=self.device
            )
        elif self.sampler_name == 'k_euler':
            self.sampler = KSampler(self.model, 'euler', device=self.device)
        elif self.sampler_name == 'k_heun':
            self.sampler = KSampler(self.model, 'heun', device=self.device)
        elif self.sampler_name == 'k_lms':
            self.sampler = KSampler(self.model, 'lms', device=self.device)
        else:
            msg = f'>> Unsupported Sampler: {self.sampler_name}, Defaulting to plms'
            self.sampler = PLMSSampler(self.model, device=self.device)

        print(msg)

    def _load_model_from_config(self, config, ckpt):
        print(f'>> Loading model from {ckpt}')

        # for usage statistics
        device_type = choose_torch_device()
        if device_type == 'cuda':
            torch.cuda.reset_peak_memory_stats() 
        tic = time.time()

        # this does the work
        pl_sd = torch.load(ckpt, map_location='cpu')
        sd = pl_sd['state_dict']
        model = instantiate_from_config(config.model)
        m, u = model.load_state_dict(sd, strict=False)
        
        if self.full_precision:
            print(
                '>> Using slower but more accurate full-precision math (--full_precision)'
            )
        else:
            print(
                '>> Using half precision math. Call with --full_precision to use more accurate but VRAM-intensive full precision.'
            )
            model.half()
        model.to(self.device)
        model.eval()

        # usage statistics
        toc = time.time()
        print(
            f'>> Model loaded in', '%4.2fs' % (toc - tic)
        )
        if device_type == 'cuda':
            print(
                '>> Max VRAM used to load the model:',
                '%4.2fG' % (torch.cuda.max_memory_allocated() / 1e9),
                '\n>> Current VRAM usage:'
                '%4.2fG' % (torch.cuda.memory_allocated() / 1e9),
            )

        return model

    def _load_img(self, path, width, height, fit=False):
        assert os.path.exists(path), f'>> {path}: File not found'

        #        with Image.open(path) as img:
        #            image = img.convert('RGBA')
        image = Image.open(path)
        print(
            f'>> loaded input image of size {image.width}x{image.height} from {path}'
        )
        if fit:
            image = self._fit_image(image,(width,height))
        else:
            image = self._squeeze_image(image)
        return image

    def _create_init_image(self,image):
        image = image.convert('RGB')
        # print(
        #     f'>> DEBUG: writing the image to img.png'
        # )
        # image.save('img.png')
        image = np.array(image).astype(np.float32) / 255.0
        image = image[None].transpose(0, 3, 1, 2)
        image = torch.from_numpy(image)
        image = 2.0 * image - 1.0 
        return image.to(self.device)

    def _create_init_mask(self, image):
        # convert into a black/white mask
        image = self._image_to_mask(image)
        image = image.convert('RGB')
        # BUG: We need to use the model's downsample factor rather than hardcoding "8"
        from src.stablediffusion.ldm.dream.generator.base import downsampling
        image = image.resize((image.width//downsampling, image.height//downsampling), resample=Image.Resampling.LANCZOS)
        # print(
        #     f'>> DEBUG: writing the mask to mask.png'
        #     )
        # image.save('mask.png')
        image = np.array(image)
        image = image.astype(np.float32) / 255.0
        image = image[None].transpose(0, 3, 1, 2)
        image = torch.from_numpy(image)
        return image.to(self.device)

    # The mask is expected to have the region to be inpainted
    # with alpha transparency. It converts it into a black/white
    # image with the transparent part black.
    def _image_to_mask(self, mask_image, invert=False) -> Image:
        # Obtain the mask from the transparency channel
        mask = Image.new(mode="L", size=mask_image.size, color=255)
        mask.putdata(mask_image.getdata(band=3))
        if invert:
            mask = ImageOps.invert(mask)
        return mask

    def _has_transparency(self,image):
        if image.info.get("transparency", None) is not None:
            return True
        if image.mode == "P":
            transparent = image.info.get("transparency", -1)
            for _, index in image.getcolors():
                if index == transparent:
                    return True
        elif image.mode == "RGBA":
            extrema = image.getextrema()
            if extrema[3][0] < 255:
                return True
        return False

    
    def _check_for_erasure(self,image):
        width, height = image.size
        pixdata       = image.load()
        colored       = 0
        for y in range(height):
            for x in range(width):
                if pixdata[x, y][3] == 0:
                    r, g, b, _ = pixdata[x, y]
                    if (r, g, b) != (0, 0, 0) and \
                       (r, g, b) != (255, 255, 255):
                        colored += 1
        return colored == 0

    def _squeeze_image(self,image):
        x,y,resize_needed = self._resolution_check(image.width,image.height)
        if resize_needed:
            return InitImageResizer(image).resize(x,y)
        return image


    def _fit_image(self,image,max_dimensions):
        w,h = max_dimensions
        print(
            f'>> image will be resized to fit inside a box {w}x{h} in size.'
        )
        if image.width > image.height:
            h   = None   # by setting h to none, we tell InitImageResizer to fit into the width and calculate height
        elif image.height > image.width:
            w   = None   # ditto for w
        else:
            pass
        image = InitImageResizer(image).resize(w,h)   # note that InitImageResizer does the multiple of 64 truncation internally
        print(
            f'>> after adjusting image dimensions to be multiples of 64, init image is {image.width}x{image.height}'
            )
        return image

    def _resolution_check(self, width, height, log=False):
        resize_needed = False
        w, h = map(
            lambda x: x - x % 64, (width, height)
        )  # resize to integer multiple of 64
        if h != height or w != width:
            if log:
                print(
                    f'>> Provided width and height must be multiples of 64. Auto-resizing to {w}x{h}'
                )
            height = h
            width  = w
            resize_needed = True

        if (width * height) > (self.width * self.height):
            print(">> This input is larger than your defaults. If you run out of memory, please use a smaller image.")

        return width, height, resize_needed




================================================
FILE: src/stablediffusion/ldm/gfpgan/gfpgan_tools.py
================================================
import torch
import warnings
import os
import sys
import numpy as np

from PIL import Image
from scripts.dream import create_argv_parser

arg_parser = create_argv_parser()
opt        = arg_parser.parse_args()
model_path          = os.path.join(opt.gfpgan_dir, opt.gfpgan_model_path)
gfpgan_model_exists = os.path.isfile(model_path)

def run_gfpgan(image, strength, seed, upsampler_scale=4):
    print(f'>> GFPGAN - Restoring Faces for image seed:{seed}')
    gfpgan = None
    with warnings.catch_warnings():
        warnings.filterwarnings('ignore', category=DeprecationWarning)
        warnings.filterwarnings('ignore', category=UserWarning)
        
        try:
            if not gfpgan_model_exists:
                raise Exception('GFPGAN model not found at path ' + model_path)

            sys.path.append(os.path.abspath(opt.gfpgan_dir))
            from gfpgan import GFPGANer

            bg_upsampler = _load_gfpgan_bg_upsampler(
                opt.gfpgan_bg_upsampler, upsampler_scale, opt.gfpgan_bg_tile
            )

            gfpgan = GFPGANer(
                model_path=model_path,
                upscale=upsampler_scale,
                arch='clean',
                channel_multiplier=2,
                bg_upsampler=bg_upsampler,
            )
        except Exception:
            import traceback

            print('>> Error loading GFPGAN:', file=sys.stderr)
            print(traceback.format_exc(), file=sys.stderr)

    if gfpgan is None:
        print(
            f'>> WARNING: GFPGAN not initialized.'
        )
        print(
            f'>> Download https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth to {model_path}, \nor change GFPGAN directory with --gfpgan_dir.'
        )
        return image

    image = image.convert('RGB')

    cropped_faces, restored_faces, restored_img = gfpgan.enhance(
        np.array(image, dtype=np.uint8),
        has_aligned=False,
        only_center_face=False,
        paste_back=True,
    )
    res = Image.fromarray(restored_img)

    if strength < 1.0:
        # Resize the image to the new image if the sizes have changed
        if restored_img.size != image.size:
            image = image.resize(res.size)
        res = Image.blend(image, res, strength)

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    gfpgan = None

    return res


def _load_gfpgan_bg_upsampler(bg_upsampler, upsampler_scale, bg_tile=400):
    if bg_upsampler == 'realesrgan':
        if not torch.cuda.is_available(): # CPU or MPS on M1
            use_half_precision = False
        else:
            use_half_precision = True

        model_path = {
            2: 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth',
            4: 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth',
        }

        if upsampler_scale not in model_path:
            return None

        from basicsr.archs.rrdbnet_arch import RRDBNet
        from realesrgan import RealESRGANer

        if upsampler_scale == 4:
            model = RRDBNet(
                num_in_ch=3,
                num_out_ch=3,
                num_feat=64,
                num_block=23,
                num_grow_ch=32,
                scale=4,
            )
        if upsampler_scale == 2:
            model = RRDBNet(
                num_in_ch=3,
                num_out_ch=3,
                num_feat=64,
                num_block=23,
                num_grow_ch=32,
                scale=2,
            )

        bg_upsampler = RealESRGANer(
            scale=upsampler_scale,
            model_path=model_path[upsampler_scale],
            model=model,
            tile=bg_tile,
            tile_pad=10,
            pre_pad=0,
            half=use_half_precision,
        )
    else:
        bg_upsampler = None

    return bg_upsampler


def real_esrgan_upscale(image, strength, upsampler_scale, seed):
    print(
        f'>> Real-ESRGAN Upscaling seed:{seed} : scale:{upsampler_scale}x'
    )

    with warnings.catch_warnings():
        warnings.filterwarnings('ignore', category=DeprecationWarning)
        warnings.filterwarnings('ignore', category=UserWarning)

        try:
            upsampler = _load_gfpgan_bg_upsampler(
                opt.gfpgan_bg_upsampler, upsampler_scale, opt.gfpgan_bg_tile
            )
        except Exception:
            import traceback

            print('>> Error loading Real-ESRGAN:', file=sys.stderr)
            print(traceback.format_exc(), file=sys.stderr)

    output, img_mode = upsampler.enhance(
        np.array(image, dtype=np.uint8),
        outscale=upsampler_scale,
        alpha_upsampler=opt.gfpgan_bg_upsampler,
    )

    res = Image.fromarray(output)

    if strength < 1.0:
        # Resize the image to the new image if the sizes have changed
        if output.size != image.size:
            image = image.resize(res.size)
        res = Image.blend(image, res, strength)

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    upsampler = None

    return res


================================================
FILE: src/stablediffusion/ldm/lr_scheduler.py
================================================
import numpy as np


class LambdaWarmUpCosineScheduler:
    """
    note: use with a base_lr of 1.0
    """

    def __init__(
        self,
        warm_up_steps,
        lr_min,
        lr_max,
        lr_start,
        max_decay_steps,
        verbosity_interval=0,
    ):
        self.lr_warm_up_steps = warm_up_steps
        self.lr_start = lr_start
        self.lr_min = lr_min
        self.lr_max = lr_max
        self.lr_max_decay_steps = max_decay_steps
        self.last_lr = 0.0
        self.verbosity_interval = verbosity_interval

    def schedule(self, n, **kwargs):
        if self.verbosity_interval > 0:
            if n % self.verbosity_interval == 0:
                print(
                    f'current step: {n}, recent lr-multiplier: {self.last_lr}'
                )
        if n < self.lr_warm_up_steps:
            lr = (
                self.lr_max - self.lr_start
            ) / self.lr_warm_up_steps * n + self.lr_start
            self.last_lr = lr
            return lr
        else:
            t = (n - self.lr_warm_up_steps) / (
                self.lr_max_decay_steps - self.lr_warm_up_steps
            )
            t = min(t, 1.0)
            lr = self.lr_min + 0.5 * (self.lr_max - self.lr_min) * (
                1 + np.cos(t * np.pi)
            )
            self.last_lr = lr
            return lr

    def __call__(self, n, **kwargs):
        return self.schedule(n, **kwargs)


class LambdaWarmUpCosineScheduler2:
    """
    supports repeated iterations, configurable via lists
    note: use with a base_lr of 1.0.
    """

    def __init__(
        self,
        warm_up_steps,
        f_min,
        f_max,
        f_start,
        cycle_lengths,
        verbosity_interval=0,
    ):
        assert (
            len(warm_up_steps)
            == len(f_min)
            == len(f_max)
            == len(f_start)
            == len(cycle_lengths)
        )
        self.lr_warm_up_steps = warm_up_steps
        self.f_start = f_start
        self.f_min = f_min
        self.f_max = f_max
        self.cycle_lengths = cycle_lengths
        self.cum_cycles = np.cumsum([0] + list(self.cycle_lengths))
        self.last_f = 0.0
        self.verbosity_interval = verbosity_interval

    def find_in_interval(self, n):
        interval = 0
        for cl in self.cum_cycles[1:]:
            if n <= cl:
                return interval
            interval += 1

    def schedule(self, n, **kwargs):
        cycle = self.find_in_interval(n)
        n = n - self.cum_cycles[cycle]
        if self.verbosity_interval > 0:
            if n % self.verbosity_interval == 0:
                print(
                    f'current step: {n}, recent lr-multiplier: {self.last_f}, '
                    f'current cycle {cycle}'
                )
        if n < self.lr_warm_up_steps[cycle]:
            f = (
                self.f_max[cycle] - self.f_start[cycle]
            ) / self.lr_warm_up_steps[cycle] * n + self.f_start[cycle]
            self.last_f = f
            return f
        else:
            t = (n - self.lr_warm_up_steps[cycle]) / (
                self.cycle_lengths[cycle] - self.lr_warm_up_steps[cycle]
            )
            t = min(t, 1.0)
            f = self.f_min[cycle] + 0.5 * (
                self.f_max[cycle] - self.f_min[cycle]
            ) * (1 + np.cos(t * np.pi))
            self.last_f = f
            return f

    def __call__(self, n, **kwargs):
        return self.schedule(n, **kwargs)


class LambdaLinearScheduler(LambdaWarmUpCosineScheduler2):
    def schedule(self, n, **kwargs):
        cycle = self.find_in_interval(n)
        n = n - self.cum_cycles[cycle]
        if self.verbosity_interval > 0:
            if n % self.verbosity_interval == 0:
                print(
                    f'current step: {n}, recent lr-multiplier: {self.last_f}, '
                    f'current cycle {cycle}'
                )

        if n < self.lr_warm_up_steps[cycle]:
            f = (
                self.f_max[cycle] - self.f_start[cycle]
            ) / self.lr_warm_up_steps[cycle] * n + self.f_start[cycle]
            self.last_f = f
            return f
        else:
            f = self.f_min[cycle] + (self.f_max[cycle] - self.f_min[cycle]) * (
                self.cycle_lengths[cycle] - n
            ) / (self.cycle_lengths[cycle])
            self.last_f = f
            return f


================================================
FILE: src/stablediffusion/ldm/models/autoencoder.py
================================================
import torch
import pytorch_lightning as pl
import torch.nn.functional as F
from contextlib import contextmanager

from taming.modules.vqvae.quantize import VectorQuantizer2 as VectorQuantizer

from src.stablediffusion.ldm.modules.diffusionmodules.model import Encoder, Decoder
from src.stablediffusion.ldm.modules.distributions.distributions import (
    DiagonalGaussianDistribution,
)

from src.stablediffusion.ldm.util import instantiate_from_config


class VQModel(pl.LightningModule):
    def __init__(
        self,
        ddconfig,
        lossconfig,
        n_embed,
        embed_dim,
        ckpt_path=None,
        ignore_keys=[],
        image_key='image',
        colorize_nlabels=None,
        monitor=None,
        batch_resize_range=None,
        scheduler_config=None,
        lr_g_factor=1.0,
        remap=None,
        sane_index_shape=False,  # tell vector quantizer to return indices as bhw
        use_ema=False,
    ):
        super().__init__()
        self.embed_dim = embed_dim
        self.n_embed = n_embed
        self.image_key = image_key
        self.encoder = Encoder(**ddconfig)
        self.decoder = Decoder(**ddconfig)
        self.loss = instantiate_from_config(lossconfig)
        self.quantize = VectorQuantizer(
            n_embed,
            embed_dim,
            beta=0.25,
            remap=remap,
            sane_index_shape=sane_index_shape,
        )
        self.quant_conv = torch.nn.Conv2d(ddconfig['z_channels'], embed_dim, 1)
        self.post_quant_conv = torch.nn.Conv2d(
            embed_dim, ddconfig['z_channels'], 1
        )
        if colorize_nlabels is not None:
            assert type(colorize_nlabels) == int
            self.register_buffer(
                'colorize', torch.randn(3, colorize_nlabels, 1, 1)
            )
        if monitor is not None:
            self.monitor = monitor
        self.batch_resize_range = batch_resize_range
        if self.batch_resize_range is not None:
            print(
                f'{self.__class__.__name__}: Using per-batch resizing in range {batch_resize_range}.'
            )

        self.use_ema = use_ema
        if self.use_ema:
            self.model_ema = LitEma(self)
            print(f'Keeping EMAs of {len(list(self.model_ema.buffers()))}.')

        if ckpt_path is not None:
            self.init_from_ckpt(ckpt_path, ignore_keys=ignore_keys)
        self.scheduler_config = scheduler_config
        self.lr_g_factor = lr_g_factor

    @contextmanager
    def ema_scope(self, context=None):
        if self.use_ema:
            self.model_ema.store(self.parameters())
            self.model_ema.copy_to(self)
            if context is not None:
                print(f'{context}: Switched to EMA weights')
        try:
            yield None
        finally:
            if self.use_ema:
                self.model_ema.restore(self.parameters())
                if context is not None:
                    print(f'{context}: Restored training weights')

    def init_from_ckpt(self, path, ignore_keys=list()):
        sd = torch.load(path, map_location='cpu')['state_dict']
        keys = list(sd.keys())
        for k in keys:
            for ik in ignore_keys:
                if k.startswith(ik):
                    print('Deleting key {} from state_dict.'.format(k))
                    del sd[k]
        missing, unexpected = self.load_state_dict(sd, strict=False)
        print(
            f'Restored from {path} with {len(missing)} missing and {len(unexpected)} unexpected keys'
        )
        if len(missing) > 0:
            print(f'Missing Keys: {missing}')
            print(f'Unexpected Keys: {unexpected}')

    def on_train_batch_end(self, *args, **kwargs):
        if self.use_ema:
            self.model_ema(self)

    def encode(self, x):
        h = self.encoder(x)
        h = self.quant_conv(h)
        quant, emb_loss, info = self.quantize(h)
        return quant, emb_loss, info

    def encode_to_prequant(self, x):
        h = self.encoder(x)
        h = self.quant_conv(h)
        return h

    def decode(self, quant):
        quant = self.post_quant_conv(quant)
        dec = self.decoder(quant)
        return dec

    def decode_code(self, code_b):
        quant_b = self.quantize.embed_code(code_b)
        dec = self.decode(quant_b)
        return dec

    def forward(self, input, return_pred_indices=False):
        quant, diff, (_, _, ind) = self.encode(input)
        dec = self.decode(quant)
        if return_pred_indices:
            return dec, diff, ind
        return dec, diff

    def get_input(self, batch, k):
        x = batch[k]
        if len(x.shape) == 3:
            x = x[..., None]
        x = (
            x.permute(0, 3, 1, 2)
            .to(memory_format=torch.contiguous_format)
            .float()
        )
        if self.batch_resize_range is not None:
            lower_size = self.batch_resize_range[0]
            upper_size = self.batch_resize_range[1]
            if self.global_step <= 4:
                # do the first few batches with max size to avoid later oom
                new_resize = upper_size
            else:
                new_resize = np.random.choice(
                    np.arange(lower_size, upper_size + 16, 16)
                )
            if new_resize != x.shape[2]:
                x = F.interpolate(x, size=new_resize, mode='bicubic')
            x = x.detach()
        return x

    def training_step(self, batch, batch_idx, optimizer_idx):
        # https://github.com/pytorch/pytorch/issues/37142
        # try not to fool the heuristics
        x = self.get_input(batch, self.image_key)
        xrec, qloss, ind = self(x, return_pred_indices=True)

        if optimizer_idx == 0:
            # autoencode
            aeloss, log_dict_ae = self.loss(
                qloss,
                x,
                xrec,
                optimizer_idx,
                self.global_step,
                last_layer=self.get_last_layer(),
                split='train',
                predicted_indices=ind,
            )

            self.log_dict(
                log_dict_ae,
                prog_bar=False,
                logger=True,
                on_step=True,
                on_epoch=True,
            )
            return aeloss

        if optimizer_idx == 1:
            # discriminator
            discloss, log_dict_disc = self.loss(
                qloss,
                x,
                xrec,
                optimizer_idx,
                self.global_step,
                last_layer=self.get_last_layer(),
                split='train',
            )
            self.log_dict(
                log_dict_disc,
                prog_bar=False,
                logger=True,
                on_step=True,
                on_epoch=True,
            )
            return discloss

    def validation_step(self, batch, batch_idx):
        log_dict = self._validation_step(batch, batch_idx)
        with self.ema_scope():
            log_dict_ema = self._validation_step(
                batch, batch_idx, suffix='_ema'
            )
        return log_dict

    def _validation_step(self, batch, batch_idx, suffix=''):
        x = self.get_input(batch, self.image_key)
        xrec, qloss, ind = self(x, return_pred_indices=True)
        aeloss, log_dict_ae = self.loss(
            qloss,
            x,
            xrec,
            0,
            self.global_step,
            last_layer=self.get_last_layer(),
            split='val' + suffix,
            predicted_indices=ind,
        )

        discloss, log_dict_disc = self.loss(
            qloss,
            x,
            xrec,
            1,
            self.global_step,
            last_layer=self.get_last_layer(),
            split='val' + suffix,
            predicted_indices=ind,
        )
        rec_loss = log_dict_ae[f'val{suffix}/rec_loss']
        self.log(
            f'val{suffix}/rec_loss',
            rec_loss,
            prog_bar=True,
            logger=True,
            on_step=False,
            on_epoch=True,
            sync_dist=True,
        )
        self.log(
            f'val{suffix}/aeloss',
            aeloss,
            prog_bar=True,
            logger=True,
            on_step=False,
            on_epoch=True,
            sync_dist=True,
        )
        if version.parse(pl.__version__) >= version.parse('1.4.0'):
            del log_dict_ae[f'val{suffix}/rec_loss']
        self.log_dict(log_dict_ae)
        self.log_dict(log_dict_disc)
        return self.log_dict

    def configure_optimizers(self):
        lr_d = self.learning_rate
        lr_g = self.lr_g_factor * self.learning_rate
        print('lr_d', lr_d)
        print('lr_g', lr_g)
        opt_ae = torch.optim.Adam(
            list(self.encoder.parameters())
            + list(self.decoder.parameters())
            + list(self.quantize.parameters())
            + list(self.quant_conv.parameters())
            + list(self.post_quant_conv.parameters()),
            lr=lr_g,
            betas=(0.5, 0.9),
        )
        opt_disc = torch.optim.Adam(
            self.loss.discriminator.parameters(), lr=lr_d, betas=(0.5, 0.9)
        )

        if self.scheduler_config is not None:
            scheduler = instantiate_from_config(self.scheduler_config)

            print('Setting up LambdaLR scheduler...')
            scheduler = [
                {
                    'scheduler': LambdaLR(
                        opt_ae, lr_lambda=scheduler.schedule
                    ),
                    'interval': 'step',
                    'frequency': 1,
                },
                {
                    'scheduler': LambdaLR(
                        opt_disc, lr_lambda=scheduler.schedule
                    ),
                    'interval': 'step',
                    'frequency': 1,
                },
            ]
            return [opt_ae, opt_disc], scheduler
        return [opt_ae, opt_disc], []

    def get_last_layer(self):
        return self.decoder.conv_out.weight

    def log_images(self, batch, only_inputs=False, plot_ema=False, **kwargs):
        log = dict()
        x = self.get_input(batch, self.image_key)
        x = x.to(self.device)
        if only_inputs:
            log['inputs'] = x
            return log
        xrec, _ = self(x)
        if x.shape[1] > 3:
            # colorize with random projection
            assert xrec.shape[1] > 3
            x = self.to_rgb(x)
            xrec = self.to_rgb(xrec)
        log['inputs'] = x
        log['reconstructions'] = xrec
        if plot_ema:
            with self.ema_scope():
                xrec_ema, _ = self(x)
                if x.shape[1] > 3:
                    xrec_ema = self.to_rgb(xrec_ema)
                log['reconstructions_ema'] = xrec_ema
        return log

    def to_rgb(self, x):
        assert self.image_key == 'segmentation'
        if not hasattr(self, 'colorize'):
            self.register_buffer(
                'colorize', torch.randn(3, x.shape[1], 1, 1).to(x)
            )
        x = F.conv2d(x, weight=self.colorize)
        x = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
        return x


class VQModelInterface(VQModel):
    def __init__(self, embed_dim, *args, **kwargs):
        super().__init__(embed_dim=embed_dim, *args, **kwargs)
        self.embed_dim = embed_dim

    def encode(self, x):
        h = self.encoder(x)
        h = self.quant_conv(h)
        return h

    def decode(self, h, force_not_quantize=False):
        # also go through quantization layer
        if not force_not_quantize:
            quant, emb_loss, info = self.quantize(h)
        else:
            quant = h
        quant = self.post_quant_conv(quant)
        dec = self.decoder(quant)
        return dec


class AutoencoderKL(pl.LightningModule):
    def __init__(
        self,
        ddconfig,
        lossconfig,
        embed_dim,
        ckpt_path=None,
        ignore_keys=[],
        image_key='image',
        colorize_nlabels=None,
        monitor=None,
    ):
        super().__init__()
        self.image_key = image_key
        self.encoder = Encoder(**ddconfig)
        self.decoder = Decoder(**ddconfig)
        self.loss = instantiate_from_config(lossconfig)
        assert ddconfig['double_z']
        self.quant_conv = torch.nn.Conv2d(
            2 * ddconfig['z_channels'], 2 * embed_dim, 1
        )
        self.post_quant_conv = torch.nn.Conv2d(
            embed_dim, ddconfig['z_channels'], 1
        )
        self.embed_dim = embed_dim
        if colorize_nlabels is not None:
            assert type(colorize_nlabels) == int
            self.register_buffer(
                'colorize', torch.randn(3, colorize_nlabels, 1, 1)
            )
        if monitor is not None:
            self.monitor = monitor
        if ckpt_path is not None:
            self.init_from_ckpt(ckpt_path, ignore_keys=ignore_keys)

    def init_from_ckpt(self, path, ignore_keys=list()):
        sd = torch.load(path, map_location='cpu')['state_dict']
        keys = list(sd.keys())
        for k in keys:
            for ik in ignore_keys:
                if k.startswith(ik):
                    print('Deleting key {} from state_dict.'.format(k))
                    del sd[k]
        self.load_state_dict(sd, strict=False)
        print(f'Restored from {path}')

    def encode(self, x):
        h = self.encoder(x)
        moments = self.quant_conv(h)
        posterior = DiagonalGaussianDistribution(moments)
        return posterior

    def decode(self, z):
        z = self.post_quant_conv(z)
        dec = self.decoder(z)
        return dec

    def forward(self, input, sample_posterior=True):
        posterior = self.encode(input)
        if sample_posterior:
            z = posterior.sample()
        else:
            z = posterior.mode()
        dec = self.decode(z)
        return dec, posterior

    def get_input(self, batch, k):
        x = batch[k]
        if len(x.shape) == 3:
            x = x[..., None]
        x = (
            x.permute(0, 3, 1, 2)
            .to(memory_format=torch.contiguous_format)
            .float()
        )
        return x

    def training_step(self, batch, batch_idx, optimizer_idx):
        inputs = self.get_input(batch, self.image_key)
        reconstructions, posterior = self(inputs)

        if optimizer_idx == 0:
            # train encoder+decoder+logvar
            aeloss, log_dict_ae = self.loss(
                inputs,
                reconstructions,
                posterior,
                optimizer_idx,
                self.global_step,
                last_layer=self.get_last_layer(),
                split='train',
            )
            self.log(
                'aeloss',
                aeloss,
                prog_bar=True,
                logger=True,
                on_step=True,
                on_epoch=True,
            )
            self.log_dict(
                log_dict_ae,
                prog_bar=False,
                logger=True,
                on_step=True,
                on_epoch=False,
            )
            return aeloss

        if optimizer_idx == 1:
            # train the discriminator
            discloss, log_dict_disc = self.loss(
                inputs,
                reconstructions,
                posterior,
                optimizer_idx,
                self.global_step,
                last_layer=self.get_last_layer(),
                split='train',
            )

            self.log(
                'discloss',
                discloss,
                prog_bar=True,
                logger=True,
                on_step=True,
                on_epoch=True,
            )
            self.log_dict(
                log_dict_disc,
                prog_bar=False,
                logger=True,
                on_step=True,
                on_epoch=False,
            )
            return discloss

    def validation_step(self, batch, batch_idx):
        inputs = self.get_input(batch, self.image_key)
        reconstructions, posterior = self(inputs)
        aeloss, log_dict_ae = self.loss(
            inputs,
            reconstructions,
            posterior,
            0,
            self.global_step,
            last_layer=self.get_last_layer(),
            split='val',
        )

        discloss, log_dict_disc = self.loss(
            inputs,
            reconstructions,
            posterior,
            1,
            self.global_step,
            last_layer=self.get_last_layer(),
            split='val',
        )

        self.log('val/rec_loss', log_dict_ae['val/rec_loss'])
        self.log_dict(log_dict_ae)
        self.log_dict(log_dict_disc)
        return self.log_dict

    def configure_optimizers(self):
        lr = self.learning_rate
        opt_ae = torch.optim.Adam(
            list(self.encoder.parameters())
            + list(self.decoder.parameters())
            + list(self.quant_conv.parameters())
            + list(self.post_quant_conv.parameters()),
            lr=lr,
            betas=(0.5, 0.9),
        )
        opt_disc = torch.optim.Adam(
            self.loss.discriminator.parameters(), lr=lr, betas=(0.5, 0.9)
        )
        return [opt_ae, opt_disc], []

    def get_last_layer(self):
        return self.decoder.conv_out.weight

    @torch.no_grad()
    def log_images(self, batch, only_inputs=False, **kwargs):
        log = dict()
        x = self.get_input(batch, self.image_key)
        x = x.to(self.device)
        if not only_inputs:
            xrec, posterior = self(x)
            if x.shape[1] > 3:
                # colorize with random projection
                assert xrec.shape[1] > 3
                x = self.to_rgb(x)
                xrec = self.to_rgb(xrec)
            log['samples'] = self.decode(torch.randn_like(posterior.sample()))
            log['reconstructions'] = xrec
        log['inputs'] = x
        return log

    def to_rgb(self, x):
        assert self.image_key == 'segmentation'
        if not hasattr(self, 'colorize'):
            self.register_buffer(
                'colorize', torch.randn(3, x.shape[1], 1, 1).to(x)
            )
        x = F.conv2d(x, weight=self.colorize)
        x = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
        return x


class IdentityFirstStage(torch.nn.Module):
    def __init__(self, *args, vq_interface=False, **kwargs):
        self.vq_interface = vq_interface  # TODO: Should be true by default but check to not break older stuff
        super().__init__()

    def encode(self, x, *args, **kwargs):
        return x

    def decode(self, x, *args, **kwargs):
        return x

    def quantize(self, x, *args, **kwargs):
        if self.vq_interface:
            return x, None, [None, None, None]
        return x

    def forward(self, x, *args, **kwargs):
        return x


================================================
FILE: src/stablediffusion/ldm/models/diffusion/__init__.py
================================================


================================================
FILE: src/stablediffusion/ldm/models/diffusion/classifier.py
================================================
import os
import torch
import pytorch_lightning as pl
from omegaconf import OmegaConf
from torch.nn import functional as F
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR
from copy import deepcopy
from einops import rearrange
from glob import glob
from natsort import natsorted

from src.stablediffusion.ldm.modules.diffusionmodules.openaimodel import (
    EncoderUNetModel,
    UNetModel,
)
from src.stablediffusion.ldm.util import log_txt_as_img, default, ismap, instantiate_from_config

__models__ = {'class_label': EncoderUNetModel, 'segmentation': UNetModel}


def disabled_train(self, mode=True):
    """Overwrite model.train with this function to make sure train/eval mode
    does not change anymore."""
    return self


class NoisyLatentImageClassifier(pl.LightningModule):
    def __init__(
        self,
        diffusion_path,
        num_classes,
        ckpt_path=None,
        pool='attention',
        label_key=None,
        diffusion_ckpt_path=None,
        scheduler_config=None,
        weight_decay=1.0e-2,
        log_steps=10,
        monitor='val/loss',
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.num_classes = num_classes
        # get latest config of diffusion model
        diffusion_config = natsorted(
            glob(os.path.join(diffusion_path, 'configs', '*-project.yaml'))
        )[-1]
        self.diffusion_config = OmegaConf.load(diffusion_config).model
        self.diffusion_config.params.ckpt_path = diffusion_ckpt_path
        self.load_diffusion()

        self.monitor = monitor
        self.numd = (
            self.diffusion_model.first_stage_model.encoder.num_resolutions - 1
        )
        self.log_time_interval = (
            self.diffusion_model.num_timesteps // log_steps
        )
        self.log_steps = log_steps

        self.label_key = (
            label_key
            if not hasattr(self.diffusion_model, 'cond_stage_key')
            else self.diffusion_model.cond_stage_key
        )

        assert (
            self.label_key is not None
        ), 'label_key neither in diffusion model nor in model.params'

        if self.label_key not in __models__:
            raise NotImplementedError()

        self.load_classifier(ckpt_path, pool)

        self.scheduler_config = scheduler_config
        self.use_scheduler = self.scheduler_config is not None
        self.weight_decay = weight_decay

    def init_from_ckpt(self, path, ignore_keys=list(), only_model=False):
        sd = torch.load(path, map_location='cpu')
        if 'state_dict' in list(sd.keys()):
            sd = sd['state_dict']
        keys = list(sd.keys())
        for k in keys:
            for ik in ignore_keys:
                if k.startswith(ik):
                    print('Deleting key {} from state_dict.'.format(k))
                    del sd[k]
        missing, unexpected = (
            self.load_state_dict(sd, strict=False)
            if not only_model
            else self.model.load_state_dict(sd, strict=False)
        )
        print(
            f'Restored from {path} with {len(missing)} missing and {len(unexpected)} unexpected keys'
        )
        if len(missing) > 0:
            print(f'Missing Keys: {missing}')
        if len(unexpected) > 0:
            print(f'Unexpected Keys: {unexpected}')

    def load_diffusion(self):
        model = instantiate_from_config(self.diffusion_config)
        self.diffusion_model = model.eval()
        self.diffusion_model.train = disabled_train
        for param in self.diffusion_model.parameters():
            param.requires_grad = False

    def load_classifier(self, ckpt_path, pool):
        model_config = deepcopy(
            self.diffusion_config.params.unet_config.params
        )
        model_config.in_channels = (
            self.diffusion_config.params.unet_config.params.out_channels
        )
        model_config.out_channels = self.num_classes
        if self.label_key == 'class_label':
            model_config.pool = pool

        self.model = __models__[self.label_key](**model_config)
        if ckpt_path is not None:
            print(
                '#####################################################################'
            )
            print(f'load from ckpt "{ckpt_path}"')
            print(
                '#####################################################################'
            )
            self.init_from_ckpt(ckpt_path)

    @torch.no_grad()
    def get_x_noisy(self, x, t, noise=None):
        noise = default(noise, lambda: torch.randn_like(x))
        continuous_sqrt_alpha_cumprod = None
        if self.diffusion_model.use_continuous_noise:
            continuous_sqrt_alpha_cumprod = (
                self.diffusion_model.sample_continuous_noise_level(
                    x.shape[0], t + 1
                )
            )
            # todo: make sure t+1 is correct here

        return self.diffusion_model.q_sample(
            x_start=x,
            t=t,
            noise=noise,
            continuous_sqrt_alpha_cumprod=continuous_sqrt_alpha_cumprod,
        )

    def forward(self, x_noisy, t, *args, **kwargs):
        return self.model(x_noisy, t)

    @torch.no_grad()
    def get_input(self, batch, k):
        x = batch[k]
        if len(x.shape) == 3:
            x = x[..., None]
        x = rearrange(x, 'b h w c -> b c h w')
        x = x.to(memory_format=torch.contiguous_format).float()
        return x

    @torch.no_grad()
    def get_conditioning(self, batch, k=None):
        if k is None:
            k = self.label_key
        assert k is not None, 'Needs to provide label key'

        targets = batch[k].to(self.device)

        if self.label_key == 'segmentation':
            targets = rearrange(targets, 'b h w c -> b c h w')
            for down in range(self.numd):
                h, w = targets.shape[-2:]
                targets = F.interpolate(
                    targets, size=(h // 2, w // 2), mode='nearest'
                )

            # targets = rearrange(targets,'b c h w -> b h w c')

        return targets

    def compute_top_k(self, logits, labels, k, reduction='mean'):
        _, top_ks = torch.topk(logits, k, dim=1)
        if reduction == 'mean':
            return (
                (top_ks == labels[:, None]).float().sum(dim=-1).mean().item()
            )
        elif reduction == 'none':
            return (top_ks == labels[:, None]).float().sum(dim=-1)

    def on_train_epoch_start(self):
        # save some memory
        self.diffusion_model.model.to('cpu')

    @torch.no_grad()
    def write_logs(self, loss, logits, targets):
        log_prefix = 'train' if self.training else 'val'
        log = {}
        log[f'{log_prefix}/loss'] = loss.mean()
        log[f'{log_prefix}/acc@1'] = self.compute_top_k(
            logits, targets, k=1, reduction='mean'
        )
        log[f'{log_prefix}/acc@5'] = self.compute_top_k(
            logits, targets, k=5, reduction='mean'
        )

        self.log_dict(
            log,
            prog_bar=False,
            logger=True,
            on_step=self.training,
            on_epoch=True,
        )
        self.log(
            'loss', log[f'{log_prefix}/loss'], prog_bar=True, logger=False
        )
        self.log(
            'global_step',
            self.global_step,
            logger=False,
            on_epoch=False,
            prog_bar=True,
        )
        lr = self.optimizers().param_groups[0]['lr']
        self.log(
            'lr_abs',
            lr,
            on_step=True,
            logger=True,
            on_epoch=False,
            prog_bar=True,
        )

    def shared_step(self, batch, t=None):
        x, *_ = self.diffusion_model.get_input(
            batch, k=self.diffusion_model.first_stage_key
        )
        targets = self.get_conditioning(batch)
        if targets.dim() == 4:
            targets = targets.argmax(dim=1)
        if t is None:
            t = torch.randint(
                0,
                self.diffusion_model.num_timesteps,
                (x.shape[0],),
                device=self.device,
            ).long()
        else:
            t = torch.full(
                size=(x.shape[0],), fill_value=t, device=self.device
            ).long()
        x_noisy = self.get_x_noisy(x, t)
        logits = self(x_noisy, t)

        loss = F.cross_entropy(logits, targets, reduction='none')

        self.write_logs(loss.detach(), logits.detach(), targets.detach())

        loss = loss.mean()
        return loss, logits, x_noisy, targets

    def training_step(self, batch, batch_idx):
        loss, *_ = self.shared_step(batch)
        return loss

    def reset_noise_accs(self):
        self.noisy_acc = {
            t: {'acc@1': [], 'acc@5': []}
            for t in range(
                0,
                self.diffusion_model.num_timesteps,
                self.diffusion_model.log_every_t,
            )
        }

    def on_validation_start(self):
        self.reset_noise_accs()

    @torch.no_grad()
    def validation_step(self, batch, batch_idx):
        loss, *_ = self.shared_step(batch)

        for t in self.noisy_acc:
            _, logits, _, targets = self.shared_step(batch, t)
            self.noisy_acc[t]['acc@1'].append(
                self.compute_top_k(logits, targets, k=1, reduction='mean')
            )
            self.noisy_acc[t]['acc@5'].append(
                self.compute_top_k(logits, targets, k=5, reduction='mean')
            )

        return loss

    def configure_optimizers(self):
        optimizer = AdamW(
            self.model.parameters(),
            lr=self.learning_rate,
            weight_decay=self.weight_decay,
        )

        if self.use_scheduler:
            scheduler = instantiate_from_config(self.scheduler_config)

            print('Setting up LambdaLR scheduler...')
            scheduler = [
                {
                    'scheduler': LambdaLR(
                        optimizer, lr_lambda=scheduler.schedule
                    ),
                    'interval': 'step',
                    'frequency': 1,
                }
            ]
            return [optimizer], scheduler

        return optimizer

    @torch.no_grad()
    def log_images(self, batch, N=8, *args, **kwargs):
        log = dict()
        x = self.get_input(batch, self.diffusion_model.first_stage_key)
        log['inputs'] = x

        y = self.get_conditioning(batch)

        if self.label_key == 'class_label':
            y = log_txt_as_img((x.shape[2], x.shape[3]), batch['human_label'])
            log['labels'] = y

        if ismap(y):
            log['labels'] = self.diffusion_model.to_rgb(y)

            for step in range(self.log_steps):
                current_time = step * self.log_time_interval

                _, logits, x_noisy, _ = self.shared_step(batch, t=current_time)

                log[f'inputs@t{current_time}'] = x_noisy

                pred = F.one_hot(
                    logits.argmax(dim=1), num_classes=self.num_classes
                )
                pred = rearrange(pred, 'b h w c -> b c h w')

                log[f'pred@t{current_time}'] = self.diffusion_model.to_rgb(
                    pred
                )

        for key in log:
            log[key] = log[key][:N]

        return log


================================================
FILE: src/stablediffusion/ldm/models/diffusion/ddim.py
================================================
"""SAMPLING ONLY."""

import torch
import numpy as np
from tqdm import tqdm
from functools import partial
from src.stablediffusion.ldm.dream.devices import choose_torch_device

from src.stablediffusion.ldm.modules.diffusionmodules.util import (
    make_ddim_sampling_parameters,
    make_ddim_timesteps,
    noise_like,
    extract_into_tensor,
)


class DDIMSampler(object):
    def __init__(self, model, schedule='linear', device=None, **kwargs):
        super().__init__()
        self.model = model
        self.ddpm_num_timesteps = model.num_timesteps
        self.schedule = schedule
        self.device   = device or choose_torch_device()

    def register_buffer(self, name, attr):
        if type(attr) == torch.Tensor:
            if attr.device != torch.
Download .txt
gitextract_gdf_fm52/

├── .gitignore
├── LICENSE
├── README.md
├── __main__.py
├── models/
│   ├── .keep
│   └── v1-inference.yaml
├── requirements.txt
├── run.bat
├── run.sh
├── setup.bat
├── setup.sh
├── src/
│   ├── bot/
│   │   ├── shanghai.py
│   │   └── stablecog.py
│   ├── core/
│   │   └── logging.py
│   ├── scripts/
│   │   └── win10patch.py
│   └── stablediffusion/
│       ├── dream.py
│       ├── inpaint.py
│       ├── ldm/
│       │   ├── __init__.py
│       │   ├── data/
│       │   │   ├── __init__.py
│       │   │   ├── base.py
│       │   │   ├── imagenet.py
│       │   │   ├── lsun.py
│       │   │   ├── personalized.py
│       │   │   └── personalized_style.py
│       │   ├── dream/
│       │   │   ├── conditioning.py
│       │   │   ├── devices.py
│       │   │   ├── generator/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── base.py
│       │   │   │   ├── img2img.py
│       │   │   │   ├── inpaint.py
│       │   │   │   └── txt2img.py
│       │   │   ├── image_util.py
│       │   │   ├── pngwriter.py
│       │   │   ├── readline.py
│       │   │   └── server.py
│       │   ├── generate.py
│       │   ├── gfpgan/
│       │   │   └── gfpgan_tools.py
│       │   ├── lr_scheduler.py
│       │   ├── models/
│       │   │   ├── autoencoder.py
│       │   │   └── diffusion/
│       │   │       ├── __init__.py
│       │   │       ├── classifier.py
│       │   │       ├── ddim.py
│       │   │       ├── ddpm.py
│       │   │       ├── ksampler.py
│       │   │       └── plms.py
│       │   ├── modules/
│       │   │   ├── attention.py
│       │   │   ├── diffusionmodules/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── model.py
│       │   │   │   ├── openaimodel.py
│       │   │   │   └── util.py
│       │   │   ├── distributions/
│       │   │   │   ├── __init__.py
│       │   │   │   └── distributions.py
│       │   │   ├── ema.py
│       │   │   ├── embedding_manager.py
│       │   │   ├── encoders/
│       │   │   │   ├── __init__.py
│       │   │   │   └── modules.py
│       │   │   ├── image_degradation/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── bsrgan.py
│       │   │   │   ├── bsrgan_light.py
│       │   │   │   └── utils_image.py
│       │   │   ├── losses/
│       │   │   │   ├── __init__.py
│       │   │   │   ├── contperceptual.py
│       │   │   │   └── vqperceptual.py
│       │   │   └── x_transformer.py
│       │   ├── simplet2i.py
│       │   └── util.py
│       ├── text2image_compvis.py
│       ├── text2image_diffusers.py
│       └── translation.py
├── storage/
│   ├── init/
│   │   └── .keep
│   └── outputs/
│       └── .keep
└── win10fix.bat
Download .txt
SYMBOL INDEX (739 symbols across 49 files)

FILE: __main__.py
  function parse_args (line 10) | def parse_args():
  function shutdown (line 23) | async def shutdown(bot):
  function main (line 26) | def main():

FILE: src/bot/shanghai.py
  class Shanghai (line 10) | class Shanghai(commands.Bot, ABC):
    method __init__ (line 11) | def __init__(self, args):
    method on_ready (line 19) | async def on_ready(self):
    method on_message (line 24) | async def on_message(self, message):
    method on_raw_reaction_add (line 33) | async def on_raw_reaction_add(self, ctx):

FILE: src/bot/stablecog.py
  class QueueObject (line 21) | class QueueObject:
    method __init__ (line 22) | def __init__(self, ctx, prompt, height, width, guidance_scale, steps, ...
  class StableCog (line 38) | class StableCog(commands.Cog, name='Stable Diffusion', description='Crea...
    method __init__ (line 39) | def __init__(self, bot):
    method dream_handler (line 112) | async def dream_handler(self, ctx: discord.ApplicationContext, *, prom...
    method process_dream (line 152) | async def process_dream(self, queue_object: QueueObject):
    method dream (line 157) | def dream(self, event_loop: AbstractEventLoop, queue_object: QueueObje...
  function setup (line 213) | def setup(bot):

FILE: src/core/logging.py
  function get_logger (line 7) | def get_logger(name):

FILE: src/stablediffusion/dream.py
  class StableDiffusionPipeline (line 16) | class StableDiffusionPipeline(DiffusionPipeline):
    method __init__ (line 17) | def __init__(
    method __call__ (line 36) | def __call__(

FILE: src/stablediffusion/inpaint.py
  function preprocess (line 13) | def preprocess(image):
  function preprocess_mask (line 22) | def preprocess_mask(mask):
  class StableDiffusionInpaintingPipeline (line 32) | class StableDiffusionInpaintingPipeline(DiffusionPipeline):
    method __init__ (line 33) | def __init__(
    method __call__ (line 52) | def __call__(

FILE: src/stablediffusion/ldm/data/base.py
  class Txt2ImgIterableBaseDataset (line 10) | class Txt2ImgIterableBaseDataset(IterableDataset):
    method __init__ (line 15) | def __init__(self, num_records=0, valid_ids=None, size=256):
    method __len__ (line 26) | def __len__(self):
    method __iter__ (line 30) | def __iter__(self):

FILE: src/stablediffusion/ldm/data/imagenet.py
  function synset2idx (line 28) | def synset2idx(path_to_yaml='data/index_synset.yaml'):
  class ImageNetBase (line 34) | class ImageNetBase(Dataset):
    method __init__ (line 35) | def __init__(self, config=None):
    method __len__ (line 49) | def __len__(self):
    method __getitem__ (line 52) | def __getitem__(self, i):
    method _prepare (line 55) | def _prepare(self):
    method _filter_relpaths (line 58) | def _filter_relpaths(self, relpaths):
    method _prepare_synset_to_human (line 82) | def _prepare_synset_to_human(self):
    method _prepare_idx_to_synset (line 92) | def _prepare_idx_to_synset(self):
    method _prepare_human_to_integer_label (line 98) | def _prepare_human_to_integer_label(self):
    method _load (line 113) | def _load(self):
  class ImageNetTrain (line 161) | class ImageNetTrain(ImageNetBase):
    method __init__ (line 172) | def __init__(self, process_images=True, data_root=None, **kwargs):
    method _prepare (line 177) | def _prepare(self):
  class ImageNetValidation (line 231) | class ImageNetValidation(ImageNetBase):
    method __init__ (line 245) | def __init__(self, process_images=True, data_root=None, **kwargs):
    method _prepare (line 250) | def _prepare(self):
  class ImageNetSR (line 315) | class ImageNetSR(Dataset):
    method __init__ (line 316) | def __init__(
    method __len__ (line 398) | def __len__(self):
    method __getitem__ (line 401) | def __getitem__(self, i):
  class ImageNetSRTrain (line 443) | class ImageNetSRTrain(ImageNetSR):
    method __init__ (line 444) | def __init__(self, **kwargs):
    method get_base (line 447) | def get_base(self):
  class ImageNetSRValidation (line 456) | class ImageNetSRValidation(ImageNetSR):
    method __init__ (line 457) | def __init__(self, **kwargs):
    method get_base (line 460) | def get_base(self):

FILE: src/stablediffusion/ldm/data/lsun.py
  class LSUNBase (line 9) | class LSUNBase(Dataset):
    method __init__ (line 10) | def __init__(
    method __len__ (line 39) | def __len__(self):
    method __getitem__ (line 42) | def __getitem__(self, i):
  class LSUNChurchesTrain (line 72) | class LSUNChurchesTrain(LSUNBase):
    method __init__ (line 73) | def __init__(self, **kwargs):
  class LSUNChurchesValidation (line 81) | class LSUNChurchesValidation(LSUNBase):
    method __init__ (line 82) | def __init__(self, flip_p=0.0, **kwargs):
  class LSUNBedroomsTrain (line 91) | class LSUNBedroomsTrain(LSUNBase):
    method __init__ (line 92) | def __init__(self, **kwargs):
  class LSUNBedroomsValidation (line 100) | class LSUNBedroomsValidation(LSUNBase):
    method __init__ (line 101) | def __init__(self, flip_p=0.0, **kwargs):
  class LSUNCatsTrain (line 110) | class LSUNCatsTrain(LSUNBase):
    method __init__ (line 111) | def __init__(self, **kwargs):
  class LSUNCatsValidation (line 119) | class LSUNCatsValidation(LSUNBase):
    method __init__ (line 120) | def __init__(self, flip_p=0.0, **kwargs):

FILE: src/stablediffusion/ldm/data/personalized.py
  class PersonalizedBase (line 100) | class PersonalizedBase(Dataset):
    method __init__ (line 101) | def __init__(
    method __len__ (line 152) | def __len__(self):
    method __getitem__ (line 155) | def __getitem__(self, i):

FILE: src/stablediffusion/ldm/data/personalized_style.py
  class PersonalizedBase (line 78) | class PersonalizedBase(Dataset):
    method __init__ (line 79) | def __init__(
    method __len__ (line 125) | def __len__(self):
    method __getitem__ (line 128) | def __getitem__(self, i):

FILE: src/stablediffusion/ldm/dream/conditioning.py
  function get_uc_and_c (line 15) | def get_uc_and_c(prompt, model, log_tokens=False, skip_normalize=False):
  function split_weighted_subprompts (line 39) | def split_weighted_subprompts(text, skip_normalize=False)->list:
  function log_tokenization (line 75) | def log_tokenization(text, model, log=False):

FILE: src/stablediffusion/ldm/dream/devices.py
  function choose_torch_device (line 5) | def choose_torch_device() -> str:
  function choose_autocast_device (line 13) | def choose_autocast_device(device):

FILE: src/stablediffusion/ldm/dream/generator/base.py
  class Generator (line 16) | class Generator():
    method __init__ (line 17) | def __init__(self,model):
    method get_make_image (line 26) | def get_make_image(self,prompt,**kwargs):
    method set_variation (line 33) | def set_variation(self, seed, variation_amount, with_variations):
    method generate (line 38) | def generate(self,prompt,init_image,width,height,iterations=1,seed=None,
    method sample_to_image (line 77) | def sample_to_image(self,samples):
    method generate_initial_noise (line 92) | def generate_initial_noise(self, seed, width, height):
    method get_noise (line 111) | def get_noise(self,width,height):
    method new_seed (line 118) | def new_seed(self):
    method slerp (line 122) | def slerp(self, t, v0, v1, DOT_THRESHOLD=0.9995):

FILE: src/stablediffusion/ldm/dream/generator/img2img.py
  class Img2Img (line 11) | class Img2Img(Generator):
    method __init__ (line 12) | def __init__(self,model):
    method get_make_image (line 17) | def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
    method get_noise (line 65) | def get_noise(self,width,height):

FILE: src/stablediffusion/ldm/dream/generator/inpaint.py
  class Inpaint (line 12) | class Inpaint(Img2Img):
    method __init__ (line 13) | def __init__(self,model):
    method get_make_image (line 18) | def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,

FILE: src/stablediffusion/ldm/dream/generator/txt2img.py
  class Txt2Img (line 9) | class Txt2Img(Generator):
    method __init__ (line 10) | def __init__(self,model):
    method get_make_image (line 14) | def get_make_image(self,prompt,sampler,steps,cfg_scale,ddim_eta,
    method get_noise (line 48) | def get_noise(self,width,height):

FILE: src/stablediffusion/ldm/dream/image_util.py
  class InitImageResizer (line 4) | class InitImageResizer():
    method __init__ (line 6) | def __init__(self,Image):
    method resize (line 9) | def resize(self,width=None,height=None) -> Image:
  function make_grid (line 52) | def make_grid(image_list, rows=None, cols=None):

FILE: src/stablediffusion/ldm/dream/pngwriter.py
  class PngWriter (line 16) | class PngWriter:
    method __init__ (line 17) | def __init__(self, outdir):
    method unique_prefix (line 22) | def unique_prefix(self):
    method save_image_and_prompt_to_png (line 35) | def save_image_and_prompt_to_png(self, image, prompt, name):
  class PromptFormatter (line 43) | class PromptFormatter:
    method __init__ (line 44) | def __init__(self, t2i, opt):
    method normalize_prompt (line 50) | def normalize_prompt(self):

FILE: src/stablediffusion/ldm/dream/readline.py
  class Completer (line 17) | class Completer:
    method __init__ (line 18) | def __init__(self, options):
    method complete (line 22) | def complete(self, text, state):
    method _path_completions (line 49) | def _path_completions(self, text, state, extensions):

FILE: src/stablediffusion/ldm/dream/server.py
  function build_opt (line 10) | def build_opt(post_data, seed, gfpgan_model_exists):
  class CanceledException (line 57) | class CanceledException(Exception):
  class DreamServer (line 60) | class DreamServer(BaseHTTPRequestHandler):
    method do_GET (line 65) | def do_GET(self):
    method do_POST (line 122) | def do_POST(self):
  class ThreadingDreamServer (line 241) | class ThreadingDreamServer(ThreadingHTTPServer):
    method __init__ (line 242) | def __init__(self, server_address):

FILE: src/stablediffusion/ldm/generate.py
  class Generate (line 97) | class Generate:
    method __init__ (line 102) | def __init__(
    method prompt2png (line 156) | def prompt2png(self, prompt, outdir, **kwargs):
    method txt2img (line 173) | def txt2img(self, prompt, **kwargs):
    method img2img (line 177) | def img2img(self, prompt, **kwargs):
    method prompt2image (line 184) | def prompt2image(
    method _make_images (line 377) | def _make_images(self, img_path, mask_path, width, height, fit=False):
    method _make_img2img (line 402) | def _make_img2img(self):
    method _make_txt2img (line 408) | def _make_txt2img(self):
    method _make_inpaint (line 414) | def _make_inpaint(self):
    method load_model (line 420) | def load_model(self):
    method upscale_and_reconstruct (line 447) | def upscale_and_reconstruct(self,
    method sample_to_image (line 490) | def sample_to_image(self,samples):
    method _sample_to_image (line 493) | def _sample_to_image(self,samples):
    method _set_sampler (line 499) | def _set_sampler(self):
    method _load_model_from_config (line 527) | def _load_model_from_config(self, config, ckpt):
    method _load_img (line 569) | def _load_img(self, path, width, height, fit=False):
    method _create_init_image (line 584) | def _create_init_image(self,image):
    method _create_init_mask (line 596) | def _create_init_mask(self, image):
    method _image_to_mask (line 616) | def _image_to_mask(self, mask_image, invert=False) -> Image:
    method _has_transparency (line 624) | def _has_transparency(self,image):
    method _check_for_erasure (line 639) | def _check_for_erasure(self,image):
    method _squeeze_image (line 652) | def _squeeze_image(self,image):
    method _fit_image (line 659) | def _fit_image(self,image,max_dimensions):
    method _resolution_check (line 676) | def _resolution_check(self, width, height, log=False):

FILE: src/stablediffusion/ldm/gfpgan/gfpgan_tools.py
  function run_gfpgan (line 15) | def run_gfpgan(image, strength, seed, upsampler_scale=4):
  function _load_gfpgan_bg_upsampler (line 78) | def _load_gfpgan_bg_upsampler(bg_upsampler, upsampler_scale, bg_tile=400):
  function real_esrgan_upscale (line 130) | def real_esrgan_upscale(image, strength, upsampler_scale, seed):

FILE: src/stablediffusion/ldm/lr_scheduler.py
  class LambdaWarmUpCosineScheduler (line 4) | class LambdaWarmUpCosineScheduler:
    method __init__ (line 9) | def __init__(
    method schedule (line 26) | def schedule(self, n, **kwargs):
    method __call__ (line 49) | def __call__(self, n, **kwargs):
  class LambdaWarmUpCosineScheduler2 (line 53) | class LambdaWarmUpCosineScheduler2:
    method __init__ (line 59) | def __init__(
    method find_in_interval (line 84) | def find_in_interval(self, n):
    method schedule (line 91) | def schedule(self, n, **kwargs):
    method __call__ (line 117) | def __call__(self, n, **kwargs):
  class LambdaLinearScheduler (line 121) | class LambdaLinearScheduler(LambdaWarmUpCosineScheduler2):
    method schedule (line 122) | def schedule(self, n, **kwargs):

FILE: src/stablediffusion/ldm/models/autoencoder.py
  class VQModel (line 16) | class VQModel(pl.LightningModule):
    method __init__ (line 17) | def __init__(
    method ema_scope (line 77) | def ema_scope(self, context=None):
    method init_from_ckpt (line 91) | def init_from_ckpt(self, path, ignore_keys=list()):
    method on_train_batch_end (line 107) | def on_train_batch_end(self, *args, **kwargs):
    method encode (line 111) | def encode(self, x):
    method encode_to_prequant (line 117) | def encode_to_prequant(self, x):
    method decode (line 122) | def decode(self, quant):
    method decode_code (line 127) | def decode_code(self, code_b):
    method forward (line 132) | def forward(self, input, return_pred_indices=False):
    method get_input (line 139) | def get_input(self, batch, k):
    method training_step (line 163) | def training_step(self, batch, batch_idx, optimizer_idx):
    method validation_step (line 211) | def validation_step(self, batch, batch_idx):
    method _validation_step (line 219) | def _validation_step(self, batch, batch_idx, suffix=''):
    method configure_optimizers (line 268) | def configure_optimizers(self):
    method get_last_layer (line 309) | def get_last_layer(self):
    method log_images (line 312) | def log_images(self, batch, only_inputs=False, plot_ema=False, **kwargs):
    method to_rgb (line 335) | def to_rgb(self, x):
  class VQModelInterface (line 346) | class VQModelInterface(VQModel):
    method __init__ (line 347) | def __init__(self, embed_dim, *args, **kwargs):
    method encode (line 351) | def encode(self, x):
    method decode (line 356) | def decode(self, h, force_not_quantize=False):
  class AutoencoderKL (line 367) | class AutoencoderKL(pl.LightningModule):
    method __init__ (line 368) | def __init__(
    method init_from_ckpt (line 402) | def init_from_ckpt(self, path, ignore_keys=list()):
    method encode (line 413) | def encode(self, x):
    method decode (line 419) | def decode(self, z):
    method forward (line 424) | def forward(self, input, sample_posterior=True):
    method get_input (line 433) | def get_input(self, batch, k):
    method training_step (line 444) | def training_step(self, batch, batch_idx, optimizer_idx):
    method validation_step (line 505) | def validation_step(self, batch, batch_idx):
    method configure_optimizers (line 533) | def configure_optimizers(self):
    method get_last_layer (line 548) | def get_last_layer(self):
    method log_images (line 552) | def log_images(self, batch, only_inputs=False, **kwargs):
    method to_rgb (line 568) | def to_rgb(self, x):
  class IdentityFirstStage (line 579) | class IdentityFirstStage(torch.nn.Module):
    method __init__ (line 580) | def __init__(self, *args, vq_interface=False, **kwargs):
    method encode (line 584) | def encode(self, x, *args, **kwargs):
    method decode (line 587) | def decode(self, x, *args, **kwargs):
    method quantize (line 590) | def quantize(self, x, *args, **kwargs):
    method forward (line 595) | def forward(self, x, *args, **kwargs):

FILE: src/stablediffusion/ldm/models/diffusion/classifier.py
  function disabled_train (line 22) | def disabled_train(self, mode=True):
  class NoisyLatentImageClassifier (line 28) | class NoisyLatentImageClassifier(pl.LightningModule):
    method __init__ (line 29) | def __init__(
    method init_from_ckpt (line 82) | def init_from_ckpt(self, path, ignore_keys=list(), only_model=False):
    method load_diffusion (line 105) | def load_diffusion(self):
    method load_classifier (line 112) | def load_classifier(self, ckpt_path, pool):
    method get_x_noisy (line 135) | def get_x_noisy(self, x, t, noise=None):
    method forward (line 153) | def forward(self, x_noisy, t, *args, **kwargs):
    method get_input (line 157) | def get_input(self, batch, k):
    method get_conditioning (line 166) | def get_conditioning(self, batch, k=None):
    method compute_top_k (line 185) | def compute_top_k(self, logits, labels, k, reduction='mean'):
    method on_train_epoch_start (line 194) | def on_train_epoch_start(self):
    method write_logs (line 199) | def write_logs(self, loss, logits, targets):
    method shared_step (line 237) | def shared_step(self, batch, t=None):
    method training_step (line 265) | def training_step(self, batch, batch_idx):
    method reset_noise_accs (line 269) | def reset_noise_accs(self):
    method on_validation_start (line 279) | def on_validation_start(self):
    method validation_step (line 283) | def validation_step(self, batch, batch_idx):
    method configure_optimizers (line 297) | def configure_optimizers(self):
    method log_images (line 322) | def log_images(self, batch, N=8, *args, **kwargs):

FILE: src/stablediffusion/ldm/models/diffusion/ddim.py
  class DDIMSampler (line 17) | class DDIMSampler(object):
    method __init__ (line 18) | def __init__(self, model, schedule='linear', device=None, **kwargs):
    method register_buffer (line 25) | def register_buffer(self, name, attr):
    method make_schedule (line 31) | def make_schedule(
    method sample (line 110) | def sample(
    method ddim_sampling (line 176) | def ddim_sampling(
    method p_sample_ddim (line 276) | def p_sample_ddim(
    method stochastic_encode (line 357) | def stochastic_encode(self, x0, t, use_original_steps=False, noise=None):
    method decode (line 376) | def decode(

FILE: src/stablediffusion/ldm/models/diffusion/ddpm.py
  function disabled_train (line 59) | def disabled_train(self, mode=True):
  function uniform_on_device (line 65) | def uniform_on_device(r1, r2, shape, device):
  class DDPM (line 69) | class DDPM(pl.LightningModule):
    method __init__ (line 71) | def __init__(
    method register_schedule (line 159) | def register_schedule(
    method ema_scope (line 268) | def ema_scope(self, context=None):
    method init_from_ckpt (line 282) | def init_from_ckpt(self, path, ignore_keys=list(), only_model=False):
    method q_mean_variance (line 305) | def q_mean_variance(self, x_start, t):
    method predict_start_from_noise (line 324) | def predict_start_from_noise(self, x_t, t, noise):
    method q_posterior (line 334) | def q_posterior(self, x_start, x_t, t):
    method p_mean_variance (line 353) | def p_mean_variance(self, x, t, clip_denoised: bool):
    method p_sample (line 370) | def p_sample(self, x, t, clip_denoised=True, repeat_noise=False):
    method p_sample_loop (line 386) | def p_sample_loop(self, shape, return_intermediates=False):
    method sample (line 409) | def sample(self, batch_size=16, return_intermediates=False):
    method q_sample (line 417) | def q_sample(self, x_start, t, noise=None):
    method get_loss (line 428) | def get_loss(self, pred, target, mean=True):
    method p_losses (line 445) | def p_losses(self, x_start, t, noise=None):
    method forward (line 476) | def forward(self, x, *args, **kwargs):
    method get_input (line 484) | def get_input(self, batch, k):
    method shared_step (line 492) | def shared_step(self, batch):
    method training_step (line 497) | def training_step(self, batch, batch_idx):
    method validation_step (line 527) | def validation_step(self, batch, batch_idx):
    method on_train_batch_end (line 549) | def on_train_batch_end(self, *args, **kwargs):
    method _get_rows_from_list (line 553) | def _get_rows_from_list(self, samples):
    method log_images (line 561) | def log_images(
    method configure_optimizers (line 602) | def configure_optimizers(self):
  class LatentDiffusion (line 611) | class LatentDiffusion(DDPM):
    method __init__ (line 614) | def __init__(
    method make_cond_schedule (line 689) | def make_cond_schedule(
    method on_train_batch_start (line 704) | def on_train_batch_start(self, batch, batch_idx, dataloader_idx):
    method register_schedule (line 727) | def register_schedule(
    method instantiate_first_stage (line 749) | def instantiate_first_stage(self, config):
    method instantiate_cond_stage (line 756) | def instantiate_cond_stage(self, config):
    method instantiate_embedding_manager (line 784) | def instantiate_embedding_manager(self, config, embedder):
    method _get_denoise_row_from_list (line 794) | def _get_denoise_row_from_list(
    method get_first_stage_encoding (line 812) | def get_first_stage_encoding(self, encoder_posterior):
    method get_learned_conditioning (line 823) | def get_learned_conditioning(self, c):
    method meshgrid (line 840) | def meshgrid(self, h, w):
    method delta_border (line 847) | def delta_border(self, h, w):
    method get_weighting (line 863) | def get_weighting(self, h, w, Ly, Lx, device):
    method get_fold_unfold (line 886) | def get_fold_unfold(
    method get_input (line 976) | def get_input(
    method decode_first_stage (line 1036) | def decode_first_stage(
    method differentiable_decode_first_stage (line 1120) | def differentiable_decode_first_stage(
    method encode_first_stage (line 1204) | def encode_first_stage(self, x):
    method shared_step (line 1251) | def shared_step(self, batch, **kwargs):
    method forward (line 1256) | def forward(self, x, c, *args, **kwargs):
    method _rescale_annotations (line 1272) | def _rescale_annotations(
    method apply_model (line 1284) | def apply_model(self, x_noisy, t, cond, return_ids=False):
    method _predict_eps_from_xstart (line 1447) | def _predict_eps_from_xstart(self, x_t, t, pred_xstart):
    method _prior_bpd (line 1454) | def _prior_bpd(self, x_start):
    method p_losses (line 1472) | def p_losses(self, x_start, cond, t, noise=None):
    method p_mean_variance (line 1521) | def p_mean_variance(
    method p_sample (line 1583) | def p_sample(
    method progressive_denoising (line 1643) | def progressive_denoising(
    method p_sample_loop (line 1739) | def p_sample_loop(
    method sample (line 1819) | def sample(
    method sample_log (line 1867) | def sample_log(self, cond, batch_size, ddim, ddim_steps, **kwargs):
    method log_images (line 1887) | def log_images(
    method configure_optimizers (line 2064) | def configure_optimizers(self):
    method to_rgb (line 2097) | def to_rgb(self, x):
    method on_save_checkpoint (line 2106) | def on_save_checkpoint(self, checkpoint):
  class DiffusionWrapper (line 2127) | class DiffusionWrapper(pl.LightningModule):
    method __init__ (line 2128) | def __init__(self, diff_model_config, conditioning_key):
    method forward (line 2140) | def forward(self, x, t, c_concat: list = None, c_crossattn: list = None):
  class Layout2ImgDiffusion (line 2162) | class Layout2ImgDiffusion(LatentDiffusion):
    method __init__ (line 2164) | def __init__(self, cond_stage_key, *args, **kwargs):
    method log_images (line 2170) | def log_images(self, batch, N=8, *args, **kwargs):

FILE: src/stablediffusion/ldm/models/diffusion/ksampler.py
  class CFGDenoiser (line 7) | class CFGDenoiser(nn.Module):
    method __init__ (line 8) | def __init__(self, model):
    method forward (line 12) | def forward(self, x, sigma, uncond, cond, cond_scale):
  class KSampler (line 20) | class KSampler(object):
    method __init__ (line 21) | def __init__(self, model, schedule='lms', device=None, **kwargs):
    method sample (line 39) | def sample(

FILE: src/stablediffusion/ldm/models/diffusion/plms.py
  class PLMSSampler (line 16) | class PLMSSampler(object):
    method __init__ (line 17) | def __init__(self, model, schedule='linear', device=None, **kwargs):
    method register_buffer (line 24) | def register_buffer(self, name, attr):
    method make_schedule (line 30) | def make_schedule(
    method sample (line 111) | def sample(
    method plms_sampling (line 176) | def plms_sampling(
    method p_sample_plms (line 287) | def p_sample_plms(

FILE: src/stablediffusion/ldm/modules/attention.py
  function exists (line 12) | def exists(val):
  function uniq (line 16) | def uniq(arr):
  function default (line 20) | def default(val, d):
  function max_neg_value (line 26) | def max_neg_value(t):
  function init_ (line 30) | def init_(tensor):
  class GEGLU (line 38) | class GEGLU(nn.Module):
    method __init__ (line 39) | def __init__(self, dim_in, dim_out):
    method forward (line 43) | def forward(self, x):
  class FeedForward (line 48) | class FeedForward(nn.Module):
    method __init__ (line 49) | def __init__(self, dim, dim_out=None, mult=4, glu=False, dropout=0.):
    method forward (line 64) | def forward(self, x):
  function zero_module (line 68) | def zero_module(module):
  function Normalize (line 77) | def Normalize(in_channels):
  class LinearAttention (line 81) | class LinearAttention(nn.Module):
    method __init__ (line 82) | def __init__(self, dim, heads=4, dim_head=32):
    method forward (line 89) | def forward(self, x):
  class SpatialSelfAttention (line 100) | class SpatialSelfAttention(nn.Module):
    method __init__ (line 101) | def __init__(self, in_channels):
    method forward (line 127) | def forward(self, x):
  class CrossAttention (line 153) | class CrossAttention(nn.Module):
    method __init__ (line 154) | def __init__(self, query_dim, context_dim=None, heads=8, dim_head=64, ...
    method einsum_op_v1 (line 184) | def einsum_op_v1(self, q, k, v, r1):
    method einsum_op_v2 (line 205) | def einsum_op_v2(self, q, k, v, r1):
    method einsum_op_v3 (line 217) | def einsum_op_v3(self, q, k, v, r1):
    method einsum_op_v4 (line 230) | def einsum_op_v4(self, q, k, v, r1):
    method forward (line 261) | def forward(self, x, context=None, mask=None):
  class BasicTransformerBlock (line 283) | class BasicTransformerBlock(nn.Module):
    method __init__ (line 284) | def __init__(self, dim, n_heads, d_head, dropout=0., context_dim=None,...
    method forward (line 295) | def forward(self, x, context=None):
    method _forward (line 298) | def _forward(self, x, context=None):
  class SpatialTransformer (line 306) | class SpatialTransformer(nn.Module):
    method __init__ (line 314) | def __init__(self, in_channels, n_heads, d_head,
    method forward (line 338) | def forward(self, x, context=None):

FILE: src/stablediffusion/ldm/modules/diffusionmodules/model.py
  function get_timestep_embedding (line 14) | def get_timestep_embedding(timesteps, embedding_dim):
  function nonlinearity (line 35) | def nonlinearity(x):
  function Normalize (line 40) | def Normalize(in_channels, num_groups=32):
  class Upsample (line 44) | class Upsample(nn.Module):
    method __init__ (line 45) | def __init__(self, in_channels, with_conv):
    method forward (line 55) | def forward(self, x):
  class Downsample (line 62) | class Downsample(nn.Module):
    method __init__ (line 63) | def __init__(self, in_channels, with_conv):
    method forward (line 74) | def forward(self, x):
  class ResnetBlock (line 84) | class ResnetBlock(nn.Module):
    method __init__ (line 85) | def __init__(self, *, in_channels, out_channels=None, conv_shortcut=Fa...
    method forward (line 123) | def forward(self, x, temb):
  class LinAttnBlock (line 157) | class LinAttnBlock(LinearAttention):
    method __init__ (line 159) | def __init__(self, in_channels):
  class AttnBlock (line 163) | class AttnBlock(nn.Module):
    method __init__ (line 164) | def __init__(self, in_channels):
    method forward (line 191) | def forward(self, x):
  function make_attn (line 264) | def make_attn(in_channels, attn_type="vanilla"):
  class Model (line 275) | class Model(nn.Module):
    method __init__ (line 276) | def __init__(self, *, ch, out_ch, ch_mult=(1,2,4,8), num_res_blocks,
    method forward (line 375) | def forward(self, x, t=None, context=None):
    method get_last_layer (line 423) | def get_last_layer(self):
  class Encoder (line 427) | class Encoder(nn.Module):
    method __init__ (line 428) | def __init__(self, *, ch, out_ch, ch_mult=(1,2,4,8), num_res_blocks,
    method forward (line 493) | def forward(self, x):
  class Decoder (line 521) | class Decoder(nn.Module):
    method __init__ (line 522) | def __init__(self, *, ch, out_ch, ch_mult=(1,2,4,8), num_res_blocks,
    method forward (line 594) | def forward(self, z):
  class SimpleDecoder (line 655) | class SimpleDecoder(nn.Module):
    method __init__ (line 656) | def __init__(self, in_channels, out_channels, *args, **kwargs):
    method forward (line 678) | def forward(self, x):
  class UpsampleDecoder (line 691) | class UpsampleDecoder(nn.Module):
    method __init__ (line 692) | def __init__(self, in_channels, out_channels, ch, num_res_blocks, reso...
    method forward (line 725) | def forward(self, x):
  class LatentRescaler (line 739) | class LatentRescaler(nn.Module):
    method __init__ (line 740) | def __init__(self, factor, in_channels, mid_channels, out_channels, de...
    method forward (line 764) | def forward(self, x):
  class MergedRescaleEncoder (line 776) | class MergedRescaleEncoder(nn.Module):
    method __init__ (line 777) | def __init__(self, in_channels, ch, resolution, out_ch, num_res_blocks,
    method forward (line 789) | def forward(self, x):
  class MergedRescaleDecoder (line 795) | class MergedRescaleDecoder(nn.Module):
    method __init__ (line 796) | def __init__(self, z_channels, out_ch, resolution, num_res_blocks, att...
    method forward (line 806) | def forward(self, x):
  class Upsampler (line 812) | class Upsampler(nn.Module):
    method __init__ (line 813) | def __init__(self, in_size, out_size, in_channels, out_channels, ch_mu...
    method forward (line 825) | def forward(self, x):
  class Resize (line 831) | class Resize(nn.Module):
    method __init__ (line 832) | def __init__(self, in_channels=None, learned=False, mode="bilinear"):
    method forward (line 847) | def forward(self, x, scale_factor=1.0):
  class FirstStagePostProcessor (line 854) | class FirstStagePostProcessor(nn.Module):
    method __init__ (line 856) | def __init__(self, ch_mult:list, in_channels,
    method instantiate_pretrained (line 891) | def instantiate_pretrained(self, config):
    method encode_with_pretrained (line 900) | def encode_with_pretrained(self,x):
    method forward (line 906) | def forward(self,x):

FILE: src/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py
  function convert_module_to_f16 (line 24) | def convert_module_to_f16(x):
  function convert_module_to_f32 (line 28) | def convert_module_to_f32(x):
  class AttentionPool2d (line 33) | class AttentionPool2d(nn.Module):
    method __init__ (line 38) | def __init__(
    method forward (line 54) | def forward(self, x):
  class TimestepBlock (line 65) | class TimestepBlock(nn.Module):
    method forward (line 71) | def forward(self, x, emb):
  class TimestepEmbedSequential (line 77) | class TimestepEmbedSequential(nn.Sequential, TimestepBlock):
    method forward (line 83) | def forward(self, x, emb, context=None):
  class Upsample (line 94) | class Upsample(nn.Module):
    method __init__ (line 103) | def __init__(
    method forward (line 116) | def forward(self, x):
  class TransposedUpsample (line 129) | class TransposedUpsample(nn.Module):
    method __init__ (line 132) | def __init__(self, channels, out_channels=None, ks=5):
    method forward (line 141) | def forward(self, x):
  class Downsample (line 145) | class Downsample(nn.Module):
    method __init__ (line 154) | def __init__(
    method forward (line 176) | def forward(self, x):
  class ResBlock (line 181) | class ResBlock(TimestepBlock):
    method __init__ (line 197) | def __init__(
    method forward (line 267) | def forward(self, x, emb):
    method _forward (line 278) | def _forward(self, x, emb):
  class AttentionBlock (line 301) | class AttentionBlock(nn.Module):
    method __init__ (line 308) | def __init__(
    method forward (line 337) | def forward(self, x):
    method _forward (line 343) | def _forward(self, x):
  function count_flops_attn (line 352) | def count_flops_attn(model, _x, y):
  class QKVAttentionLegacy (line 372) | class QKVAttentionLegacy(nn.Module):
    method __init__ (line 377) | def __init__(self, n_heads):
    method forward (line 381) | def forward(self, qkv):
    method count_flops (line 402) | def count_flops(model, _x, y):
  class QKVAttention (line 406) | class QKVAttention(nn.Module):
    method __init__ (line 411) | def __init__(self, n_heads):
    method forward (line 415) | def forward(self, qkv):
    method count_flops (line 438) | def count_flops(model, _x, y):
  class UNetModel (line 442) | class UNetModel(nn.Module):
    method __init__ (line 472) | def __init__(
    method convert_to_fp16 (line 766) | def convert_to_fp16(self):
    method convert_to_fp32 (line 774) | def convert_to_fp32(self):
    method forward (line 782) | def forward(self, x, timesteps=None, context=None, y=None, **kwargs):
  class EncoderUNetModel (line 819) | class EncoderUNetModel(nn.Module):
    method __init__ (line 825) | def __init__(
    method convert_to_fp16 (line 998) | def convert_to_fp16(self):
    method convert_to_fp32 (line 1005) | def convert_to_fp32(self):
    method forward (line 1012) | def forward(self, x, timesteps):

FILE: src/stablediffusion/ldm/modules/diffusionmodules/util.py
  function make_beta_schedule (line 21) | def make_beta_schedule(
  function make_ddim_timesteps (line 62) | def make_ddim_timesteps(
  function make_ddim_sampling_parameters (line 92) | def make_ddim_sampling_parameters(
  function betas_for_alpha_bar (line 116) | def betas_for_alpha_bar(num_diffusion_timesteps, alpha_bar, max_beta=0.9...
  function extract_into_tensor (line 135) | def extract_into_tensor(a, t, x_shape):
  function checkpoint (line 141) | def checkpoint(func, inputs, params, flag):
  class CheckpointFunction (line 160) | class CheckpointFunction(torch.autograd.Function):
    method forward (line 162) | def forward(ctx, run_function, length, *args):
    method backward (line 172) | def backward(ctx, *output_grads):
  function timestep_embedding (line 194) | def timestep_embedding(timesteps, dim, max_period=10000, repeat_only=Fal...
  function zero_module (line 221) | def zero_module(module):
  function scale_module (line 230) | def scale_module(module, scale):
  function mean_flat (line 239) | def mean_flat(tensor):
  function normalization (line 246) | def normalization(channels):
  class SiLU (line 256) | class SiLU(nn.Module):
    method forward (line 257) | def forward(self, x):
  class GroupNorm32 (line 261) | class GroupNorm32(nn.GroupNorm):
    method forward (line 262) | def forward(self, x):
  function conv_nd (line 266) | def conv_nd(dims, *args, **kwargs):
  function linear (line 279) | def linear(*args, **kwargs):
  function avg_pool_nd (line 286) | def avg_pool_nd(dims, *args, **kwargs):
  class HybridConditioner (line 299) | class HybridConditioner(nn.Module):
    method __init__ (line 300) | def __init__(self, c_concat_config, c_crossattn_config):
    method forward (line 307) | def forward(self, c_concat, c_crossattn):
  function noise_like (line 313) | def noise_like(shape, device, repeat=False):

FILE: src/stablediffusion/ldm/modules/distributions/distributions.py
  class AbstractDistribution (line 5) | class AbstractDistribution:
    method sample (line 6) | def sample(self):
    method mode (line 9) | def mode(self):
  class DiracDistribution (line 13) | class DiracDistribution(AbstractDistribution):
    method __init__ (line 14) | def __init__(self, value):
    method sample (line 17) | def sample(self):
    method mode (line 20) | def mode(self):
  class DiagonalGaussianDistribution (line 24) | class DiagonalGaussianDistribution(object):
    method __init__ (line 25) | def __init__(self, parameters, deterministic=False):
    method sample (line 37) | def sample(self):
    method kl (line 43) | def kl(self, other=None):
    method nll (line 62) | def nll(self, sample, dims=[1, 2, 3]):
    method mode (line 73) | def mode(self):
  function normal_kl (line 77) | def normal_kl(mean1, logvar1, mean2, logvar2):

FILE: src/stablediffusion/ldm/modules/ema.py
  class LitEma (line 5) | class LitEma(nn.Module):
    method __init__ (line 6) | def __init__(self, model, decay=0.9999, use_num_upates=True):
    method forward (line 29) | def forward(self, model):
    method copy_to (line 56) | def copy_to(self, model):
    method store (line 67) | def store(self, parameters):
    method restore (line 76) | def restore(self, parameters):

FILE: src/stablediffusion/ldm/modules/embedding_manager.py
  function get_clip_token_for_string (line 16) | def get_clip_token_for_string(tokenizer, string):
  function get_bert_token_for_string (line 34) | def get_bert_token_for_string(tokenizer, string):
  function get_embedding_for_clip_token (line 43) | def get_embedding_for_clip_token(embedder, token):
  class EmbeddingManager (line 47) | class EmbeddingManager(nn.Module):
    method __init__ (line 48) | def __init__(
    method forward (line 134) | def forward(
    method save (line 210) | def save(self, ckpt_path):
    method load (line 219) | def load(self, ckpt_path, full=True):
    method get_embedding_norms_squared (line 243) | def get_embedding_norms_squared(self):
    method embedding_parameters (line 253) | def embedding_parameters(self):
    method embedding_to_coarse_loss (line 256) | def embedding_to_coarse_loss(self):

FILE: src/stablediffusion/ldm/modules/encoders/modules.py
  function _expand_mask (line 16) | def _expand_mask(mask, dtype, tgt_len=None):
  function _build_causal_attention_mask (line 34) | def _build_causal_attention_mask(bsz, seq_len, dtype):
  class AbstractEncoder (line 44) | class AbstractEncoder(nn.Module):
    method __init__ (line 45) | def __init__(self):
    method encode (line 48) | def encode(self, *args, **kwargs):
  class ClassEmbedder (line 52) | class ClassEmbedder(nn.Module):
    method __init__ (line 53) | def __init__(self, embed_dim, n_classes=1000, key='class'):
    method forward (line 58) | def forward(self, batch, key=None):
  class TransformerEmbedder (line 67) | class TransformerEmbedder(AbstractEncoder):
    method __init__ (line 70) | def __init__(
    method forward (line 86) | def forward(self, tokens):
    method encode (line 91) | def encode(self, x):
  class BERTTokenizer (line 95) | class BERTTokenizer(AbstractEncoder):
    method __init__ (line 98) | def __init__(
    method forward (line 123) | def forward(self, text):
    method encode (line 137) | def encode(self, text):
    method decode (line 143) | def decode(self, text):
  class BERTEmbedder (line 147) | class BERTEmbedder(AbstractEncoder):
    method __init__ (line 150) | def __init__(
    method forward (line 174) | def forward(self, text, embedding_manager=None):
    method encode (line 184) | def encode(self, text, **kwargs):
  class SpatialRescaler (line 189) | class SpatialRescaler(nn.Module):
    method __init__ (line 190) | def __init__(
    method forward (line 223) | def forward(self, x):
    method encode (line 231) | def encode(self, x):
  class FrozenCLIPEmbedder (line 235) | class FrozenCLIPEmbedder(AbstractEncoder):
    method __init__ (line 238) | def __init__(
    method freeze (line 434) | def freeze(self):
    method forward (line 439) | def forward(self, text, **kwargs):
    method encode (line 454) | def encode(self, text, **kwargs):
  class FrozenCLIPTextEmbedder (line 458) | class FrozenCLIPTextEmbedder(nn.Module):
    method __init__ (line 463) | def __init__(
    method freeze (line 478) | def freeze(self):
    method forward (line 483) | def forward(self, text):
    method encode (line 490) | def encode(self, text):
  class FrozenClipImageEmbedder (line 498) | class FrozenClipImageEmbedder(nn.Module):
    method __init__ (line 503) | def __init__(
    method preprocess (line 526) | def preprocess(self, x):
    method forward (line 540) | def forward(self, x):

FILE: src/stablediffusion/ldm/modules/image_degradation/bsrgan.py
  function modcrop_np (line 29) | def modcrop_np(img, sf):
  function analytic_kernel (line 49) | def analytic_kernel(k):
  function anisotropic_Gaussian (line 67) | def anisotropic_Gaussian(ksize=15, theta=np.pi, l1=6, l2=6):
  function gm_blur_kernel (line 93) | def gm_blur_kernel(mean, cov, size=15):
  function shift_pixel (line 106) | def shift_pixel(x, sf, upper_left=True):
  function blur (line 135) | def blur(x, k):
  function gen_kernel (line 154) | def gen_kernel(
  function fspecial_gaussian (line 205) | def fspecial_gaussian(hsize, sigma):
  function fspecial_laplacian (line 221) | def fspecial_laplacian(alpha):
  function fspecial (line 230) | def fspecial(filter_type, *args, **kwargs):
  function bicubic_degradation (line 248) | def bicubic_degradation(x, sf=3):
  function srmd_degradation (line 260) | def srmd_degradation(x, k, sf=3):
  function dpsr_degradation (line 284) | def dpsr_degradation(x, k, sf=3):
  function classical_degradation (line 306) | def classical_degradation(x, k, sf=3):
  function add_sharpening (line 321) | def add_sharpening(img, weight=0.5, radius=50, threshold=10):
  function add_blur (line 347) | def add_blur(img, sf=4):
  function add_resize (line 370) | def add_resize(img, sf=4):
  function add_Gaussian_noise (line 405) | def add_Gaussian_noise(img, noise_level1=2, noise_level2=25):
  function add_speckle_noise (line 428) | def add_speckle_noise(img, noise_level1=2, noise_level2=25):
  function add_Poisson_noise (line 452) | def add_Poisson_noise(img):
  function add_JPEG_noise (line 469) | def add_JPEG_noise(img):
  function random_crop (line 480) | def random_crop(lq, hq, sf=4, lq_patchsize=64):
  function degradation_bsrgan (line 495) | def degradation_bsrgan(img, sf=4, lq_patchsize=72, isp_model=None):
  function degradation_bsrgan_variant (line 604) | def degradation_bsrgan_variant(image, sf=4, isp_model=None):
  function degradation_bsrgan_plus (line 711) | def degradation_bsrgan_plus(

FILE: src/stablediffusion/ldm/modules/image_degradation/bsrgan_light.py
  function modcrop_np (line 29) | def modcrop_np(img, sf):
  function analytic_kernel (line 49) | def analytic_kernel(k):
  function anisotropic_Gaussian (line 67) | def anisotropic_Gaussian(ksize=15, theta=np.pi, l1=6, l2=6):
  function gm_blur_kernel (line 93) | def gm_blur_kernel(mean, cov, size=15):
  function shift_pixel (line 106) | def shift_pixel(x, sf, upper_left=True):
  function blur (line 135) | def blur(x, k):
  function gen_kernel (line 154) | def gen_kernel(
  function fspecial_gaussian (line 205) | def fspecial_gaussian(hsize, sigma):
  function fspecial_laplacian (line 221) | def fspecial_laplacian(alpha):
  function fspecial (line 230) | def fspecial(filter_type, *args, **kwargs):
  function bicubic_degradation (line 248) | def bicubic_degradation(x, sf=3):
  function srmd_degradation (line 260) | def srmd_degradation(x, k, sf=3):
  function dpsr_degradation (line 284) | def dpsr_degradation(x, k, sf=3):
  function classical_degradation (line 306) | def classical_degradation(x, k, sf=3):
  function add_sharpening (line 321) | def add_sharpening(img, weight=0.5, radius=50, threshold=10):
  function add_blur (line 347) | def add_blur(img, sf=4):
  function add_resize (line 374) | def add_resize(img, sf=4):
  function add_Gaussian_noise (line 409) | def add_Gaussian_noise(img, noise_level1=2, noise_level2=25):
  function add_speckle_noise (line 432) | def add_speckle_noise(img, noise_level1=2, noise_level2=25):
  function add_Poisson_noise (line 456) | def add_Poisson_noise(img):
  function add_JPEG_noise (line 473) | def add_JPEG_noise(img):
  function random_crop (line 484) | def random_crop(lq, hq, sf=4, lq_patchsize=64):
  function degradation_bsrgan (line 499) | def degradation_bsrgan(img, sf=4, lq_patchsize=72, isp_model=None):
  function degradation_bsrgan_variant (line 608) | def degradation_bsrgan_variant(image, sf=4, isp_model=None):

FILE: src/stablediffusion/ldm/modules/image_degradation/utils_image.py
  function is_image_file (line 42) | def is_image_file(filename):
  function get_timestamp (line 46) | def get_timestamp():
  function imshow (line 50) | def imshow(x, title=None, cbar=False, figsize=None):
  function surf (line 60) | def surf(Z, cmap='rainbow', figsize=None):
  function get_image_paths (line 80) | def get_image_paths(dataroot):
  function _get_paths_from_images (line 87) | def _get_paths_from_images(path):
  function patches_from_image (line 106) | def patches_from_image(img, p_size=512, p_overlap=64, p_max=800):
  function imssave (line 125) | def imssave(imgs, img_path):
  function split_imageset (line 141) | def split_imageset(
  function mkdir (line 179) | def mkdir(path):
  function mkdirs (line 184) | def mkdirs(paths):
  function mkdir_and_rename (line 192) | def mkdir_and_rename(path):
  function imread_uint (line 211) | def imread_uint(path, n_channels=3):
  function imsave (line 229) | def imsave(img, img_path):
  function imwrite (line 236) | def imwrite(img, img_path):
  function read_img (line 246) | def read_img(path):
  function uint2single (line 275) | def uint2single(img):
  function single2uint (line 280) | def single2uint(img):
  function uint162single (line 285) | def uint162single(img):
  function single2uint16 (line 290) | def single2uint16(img):
  function uint2tensor4 (line 301) | def uint2tensor4(img):
  function uint2tensor3 (line 314) | def uint2tensor3(img):
  function tensor2uint (line 326) | def tensor2uint(img):
  function single2tensor3 (line 339) | def single2tensor3(img):
  function single2tensor4 (line 344) | def single2tensor4(img):
  function tensor2single (line 354) | def tensor2single(img):
  function tensor2single3 (line 363) | def tensor2single3(img):
  function single2tensor5 (line 372) | def single2tensor5(img):
  function single32tensor5 (line 381) | def single32tensor5(img):
  function single42tensor4 (line 390) | def single42tensor4(img):
  function tensor2img (line 397) | def tensor2img(tensor, out_type=np.uint8, min_max=(0, 1)):
  function augment_img (line 444) | def augment_img(img, mode=0):
  function augment_img_tensor4 (line 464) | def augment_img_tensor4(img, mode=0):
  function augment_img_tensor (line 484) | def augment_img_tensor(img, mode=0):
  function augment_img_np3 (line 502) | def augment_img_np3(img, mode=0):
  function augment_imgs (line 530) | def augment_imgs(img_list, hflip=True, rot=True):
  function modcrop (line 555) | def modcrop(img_in, scale):
  function shave (line 571) | def shave(img_in, border=0):
  function rgb2ycbcr (line 590) | def rgb2ycbcr(img, only_y=True):
  function ycbcr2rgb (line 620) | def ycbcr2rgb(img):
  function bgr2ycbcr (line 646) | def bgr2ycbcr(img, only_y=True):
  function channel_convert (line 676) | def channel_convert(in_c, tar_type, img_list):
  function calculate_psnr (line 700) | def calculate_psnr(img1, img2, border=0):
  function calculate_ssim (line 721) | def calculate_ssim(img1, img2, border=0):
  function ssim (line 748) | def ssim(img1, img2):
  function cubic (line 780) | def cubic(x):
  function calculate_weights_indices (line 789) | def calculate_weights_indices(
  function imresize (line 850) | def imresize(img, scale, antialiasing=True):
  function imresize_np (line 935) | def imresize_np(img, scale, antialiasing=True):

FILE: src/stablediffusion/ldm/modules/losses/contperceptual.py
  class LPIPSWithDiscriminator (line 7) | class LPIPSWithDiscriminator(nn.Module):
    method __init__ (line 8) | def __init__(
    method calculate_adaptive_weight (line 46) | def calculate_adaptive_weight(self, nll_loss, g_loss, last_layer=None):
    method forward (line 67) | def forward(

FILE: src/stablediffusion/ldm/modules/losses/vqperceptual.py
  function hinge_d_loss_with_exemplar_weights (line 14) | def hinge_d_loss_with_exemplar_weights(logits_real, logits_fake, weights):
  function adopt_weight (line 24) | def adopt_weight(weight, global_step, threshold=0, value=0.0):
  function measure_perplexity (line 30) | def measure_perplexity(predicted_indices, n_embed):
  function l1 (line 42) | def l1(x, y):
  function l2 (line 46) | def l2(x, y):
  class VQLPIPSWithDiscriminator (line 50) | class VQLPIPSWithDiscriminator(nn.Module):
    method __init__ (line 51) | def __init__(
    method calculate_adaptive_weight (line 108) | def calculate_adaptive_weight(self, nll_loss, g_loss, last_layer=None):
    method forward (line 129) | def forward(

FILE: src/stablediffusion/ldm/modules/x_transformer.py
  class AbsolutePositionalEmbedding (line 23) | class AbsolutePositionalEmbedding(nn.Module):
    method __init__ (line 24) | def __init__(self, dim, max_seq_len):
    method init_ (line 29) | def init_(self):
    method forward (line 32) | def forward(self, x):
  class FixedPositionalEmbedding (line 37) | class FixedPositionalEmbedding(nn.Module):
    method __init__ (line 38) | def __init__(self, dim):
    method forward (line 43) | def forward(self, x, seq_dim=1, offset=0):
  function exists (line 58) | def exists(val):
  function default (line 62) | def default(val, d):
  function always (line 68) | def always(val):
  function not_equals (line 75) | def not_equals(val):
  function equals (line 82) | def equals(val):
  function max_neg_value (line 89) | def max_neg_value(tensor):
  function pick_and_pop (line 96) | def pick_and_pop(keys, d):
  function group_dict_by_key (line 101) | def group_dict_by_key(cond, d):
  function string_begins_with (line 110) | def string_begins_with(prefix, str):
  function group_by_key_prefix (line 114) | def group_by_key_prefix(prefix, d):
  function groupby_prefix_and_trim (line 118) | def groupby_prefix_and_trim(prefix, d):
  class Scale (line 132) | class Scale(nn.Module):
    method __init__ (line 133) | def __init__(self, value, fn):
    method forward (line 138) | def forward(self, x, **kwargs):
  class Rezero (line 143) | class Rezero(nn.Module):
    method __init__ (line 144) | def __init__(self, fn):
    method forward (line 149) | def forward(self, x, **kwargs):
  class ScaleNorm (line 154) | class ScaleNorm(nn.Module):
    method __init__ (line 155) | def __init__(self, dim, eps=1e-5):
    method forward (line 161) | def forward(self, x):
  class RMSNorm (line 166) | class RMSNorm(nn.Module):
    method __init__ (line 167) | def __init__(self, dim, eps=1e-8):
    method forward (line 173) | def forward(self, x):
  class Residual (line 178) | class Residual(nn.Module):
    method forward (line 179) | def forward(self, x, residual):
  class GRUGating (line 183) | class GRUGating(nn.Module):
    method __init__ (line 184) | def __init__(self, dim):
    method forward (line 188) | def forward(self, x, residual):
  class GEGLU (line 200) | class GEGLU(nn.Module):
    method __init__ (line 201) | def __init__(self, dim_in, dim_out):
    method forward (line 205) | def forward(self, x):
  class FeedForward (line 210) | class FeedForward(nn.Module):
    method __init__ (line 211) | def __init__(self, dim, dim_out=None, mult=4, glu=False, dropout=0.0):
    method forward (line 225) | def forward(self, x):
  class Attention (line 230) | class Attention(nn.Module):
    method __init__ (line 231) | def __init__(
    method forward (line 289) | def forward(
  class AttentionLayers (line 414) | class AttentionLayers(nn.Module):
    method __init__ (line 415) | def __init__(
    method forward (line 539) | def forward(
  class Encoder (line 613) | class Encoder(AttentionLayers):
    method __init__ (line 614) | def __init__(self, **kwargs):
  class TransformerWrapper (line 619) | class TransformerWrapper(nn.Module):
    method __init__ (line 620) | def __init__(
    method init_ (line 679) | def init_(self):
    method forward (line 682) | def forward(

FILE: src/stablediffusion/ldm/simplet2i.py
  class T2I (line 10) | class T2I(Generate):
    method __init__ (line 11) | def __init__(self,**kwargs):

FILE: src/stablediffusion/ldm/util.py
  function log_txt_as_img (line 17) | def log_txt_as_img(wh, xc, size=10):
  function ismap (line 43) | def ismap(x):
  function isimage (line 49) | def isimage(x):
  function exists (line 55) | def exists(x):
  function default (line 59) | def default(val, d):
  function mean_flat (line 65) | def mean_flat(tensor):
  function count_params (line 73) | def count_params(model, verbose=False):
  function instantiate_from_config (line 82) | def instantiate_from_config(config, **kwargs):
  function get_obj_from_str (line 94) | def get_obj_from_str(string, reload=False):
  function _do_parallel_data_prefetch (line 102) | def _do_parallel_data_prefetch(func, Q, data, idx, idx_to_fn=False):
  function parallel_data_prefetch (line 114) | def parallel_data_prefetch(

FILE: src/stablediffusion/text2image_compvis.py
  function resize_image (line 16) | def resize_image(resize_mode, im, width, height):
  class Text2Image (line 52) | class Text2Image:
    method __init__ (line 53) | def __init__(self, model_path='models/model-epoch06-full.ckpt', use_gp...
    method dream (line 61) | def dream(self, prompt: str, ddim_steps: int, plms: bool, fixed_code: ...
    method translation (line 68) | def translation(self, prompt: str, init_img, ddim_steps: int, ddim_eta...
    method inpaint (line 78) | def inpaint(self, prompt: str, init_img, mask_img, ddim_steps: int, dd...

FILE: src/stablediffusion/text2image_diffusers.py
  function resize_image (line 18) | def resize_image(resize_mode, im, width, height):
  class Text2Image (line 54) | class Text2Image:
    method __init__ (line 55) | def __init__(self, use_gpu=True):
    method dream (line 109) | def dream(self, prompt: str, ddim_steps: int, plms: bool, fixed_code: ...
    method translation (line 117) | def translation(self, prompt: str, init_img, ddim_steps: int, ddim_eta...
    method inpaint (line 132) | def inpaint(self, prompt: str, init_img, mask_img, ddim_steps: int, dd...
    method vae_test (line 148) | def vae_test(self, image, height: int, width: int):

FILE: src/stablediffusion/translation.py
  function preprocess (line 14) | def preprocess(image):
  class StableDiffusionImg2ImgPipeline (line 24) | class StableDiffusionImg2ImgPipeline(DiffusionPipeline):
    method __init__ (line 25) | def __init__(
    method __call__ (line 44) | def __call__(
Condensed preview — 72 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (593K chars).
[
  {
    "path": ".gitignore",
    "chars": 1866,
    "preview": "storage/outputs/*.png\nstorage/init/*.png\n\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\nlog."
  },
  {
    "path": "LICENSE",
    "chars": 18092,
    "preview": "                    GNU GENERAL PUBLIC LICENSE\n                       Version 2, June 1991\n\n Copyright (C) 1989, 1991 Fr"
  },
  {
    "path": "README.md",
    "chars": 2581,
    "preview": "# Shanghai - AI Powered Art in a Discord Bot!\n\n<img src=https://cdn.discordapp.com/attachments/971549874514444358/101240"
  },
  {
    "path": "__main__.py",
    "chars": 1402,
    "preview": "import os\nimport sys\nimport argparse\nimport asyncio\nfrom src.core.logging import get_logger\nfrom src.bot.shanghai import"
  },
  {
    "path": "models/.keep",
    "chars": 33,
    "preview": "壊れたカーテンの隙間から\n壁を埋めるのは\n暴言?妄言?知りません。"
  },
  {
    "path": "models/v1-inference.yaml",
    "chars": 2271,
    "preview": "model:\n  base_learning_rate: 1.0e-04\n  target: src.stablediffusion.ldm.models.diffusion.ddpm.LatentDiffusion\n  params:\n "
  },
  {
    "path": "requirements.txt",
    "chars": 423,
    "preview": "--extra-index-url https://download.pytorch.org/whl/cu117\ntorch\ndiffusers\nnumpy\nPillow\npydantic\ngit+https://github.com/Py"
  },
  {
    "path": "run.bat",
    "chars": 52,
    "preview": "venv\\Scripts\\python.exe . --model_path \"\" --token=\"\""
  },
  {
    "path": "run.sh",
    "chars": 59,
    "preview": "venv/bin/python . --model_path \"\" --token=\"\" --hf_token=\"\"\n"
  },
  {
    "path": "setup.bat",
    "chars": 68,
    "preview": "python -m venv venv\nvenv\\Scripts\\pip.exe install -r requirements.txt"
  },
  {
    "path": "setup.sh",
    "chars": 61,
    "preview": "python -m venv venv\nvenv/bin/pip install -r requirements.txt\n"
  },
  {
    "path": "src/bot/shanghai.py",
    "chars": 1496,
    "preview": "import asyncio\nimport os\nfrom abc import ABC\n\nimport discord\nfrom discord.ext import commands\nfrom src.core.logging impo"
  },
  {
    "path": "src/bot/stablecog.py",
    "chars": 9723,
    "preview": "import traceback\nfrom asyncio import AbstractEventLoop\nfrom threading import Thread\n\nimport requests\nimport asyncio\nimpo"
  },
  {
    "path": "src/core/logging.py",
    "chars": 233,
    "preview": "import logging\n\nlogging.basicConfig(level=logging.INFO,\n                    format='[%(asctime)s] %(levelname)s: %(messa"
  },
  {
    "path": "src/scripts/win10patch.py",
    "chars": 510,
    "preview": "try:\n    file_path = 'venv\\\\lib\\\\site-packages\\\\torch\\\\distributed\\\\elastic\\\\timer\\\\file_based_local_timer.py'\n    with "
  },
  {
    "path": "src/stablediffusion/dream.py",
    "chars": 7170,
    "preview": "import inspect\nimport warnings\nfrom typing import List, Optional, Union\n\nimport torch\n\nfrom tqdm.auto import tqdm\nfrom t"
  },
  {
    "path": "src/stablediffusion/inpaint.py",
    "chars": 7151,
    "preview": "import inspect\nfrom typing import List, Optional, Union\n\nimport numpy as np\nimport torch\n\nimport PIL\nfrom diffusers impo"
  },
  {
    "path": "src/stablediffusion/ldm/__init__.py",
    "chars": 30,
    "preview": "from .generate import Generate"
  },
  {
    "path": "src/stablediffusion/ldm/data/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/stablediffusion/ldm/data/base.py",
    "chars": 738,
    "preview": "from abc import abstractmethod\nfrom torch.utils.data import (\n    Dataset,\n    ConcatDataset,\n    ChainDataset,\n    Iter"
  },
  {
    "path": "src/stablediffusion/ldm/data/imagenet.py",
    "chars": 16282,
    "preview": "import os, yaml, pickle, shutil, tarfile, glob\nimport cv2\nimport albumentations\nimport PIL\nimport numpy as np\nimport tor"
  },
  {
    "path": "src/stablediffusion/ldm/data/lsun.py",
    "chars": 3497,
    "preview": "import os\nimport numpy as np\nimport PIL\nfrom PIL import Image\nfrom torch.utils.data import Dataset\nfrom torchvision impo"
  },
  {
    "path": "src/stablediffusion/ldm/data/personalized.py",
    "chars": 5496,
    "preview": "import os\nimport numpy as np\nimport PIL\nfrom PIL import Image\nfrom torch.utils.data import Dataset\nfrom torchvision impo"
  },
  {
    "path": "src/stablediffusion/ldm/data/personalized_style.py",
    "chars": 4952,
    "preview": "import os\nimport numpy as np\nimport PIL\nfrom PIL import Image\nfrom torch.utils.data import Dataset\nfrom torchvision impo"
  },
  {
    "path": "src/stablediffusion/ldm/dream/conditioning.py",
    "chars": 3897,
    "preview": "'''\nThis module handles the generation of the conditioning tensors, including management of\nweighted subprompts.\n\nUseful"
  },
  {
    "path": "src/stablediffusion/ldm/dream/devices.py",
    "chars": 693,
    "preview": "import torch\nfrom torch import autocast\nfrom contextlib import contextmanager, nullcontext\n\ndef choose_torch_device() ->"
  },
  {
    "path": "src/stablediffusion/ldm/dream/generator/__init__.py",
    "chars": 92,
    "preview": "'''\nInitialization file for the ldm.dream.generator package\n'''\nfrom .base import Generator\n"
  },
  {
    "path": "src/stablediffusion/ldm/dream/generator/base.py",
    "chars": 6449,
    "preview": "'''\nBase class for ldm.dream.generator.*\nincluding img2img, txt2img, and inpaint\n'''\nimport torch\nimport numpy as  np\nim"
  },
  {
    "path": "src/stablediffusion/ldm/dream/generator/img2img.py",
    "chars": 2669,
    "preview": "'''\nldm.dream.generator.txt2img descends from src.stablediffusion.ldm.dream.generator\n'''\n\nimport torch\nimport numpy as "
  },
  {
    "path": "src/stablediffusion/ldm/dream/generator/inpaint.py",
    "chars": 2768,
    "preview": "'''\nldm.dream.generator.inpaint descends from src.stablediffusion.ldm.dream.generator\n'''\n\nimport torch\nimport numpy as "
  },
  {
    "path": "src/stablediffusion/ldm/dream/generator/txt2img.py",
    "chars": 2372,
    "preview": "'''\nldm.dream.generator.txt2img inherits from src.stablediffusion.ldm.dream.generator\n'''\n\nimport torch\nimport numpy as "
  },
  {
    "path": "src/stablediffusion/ldm/dream/image_util.py",
    "chars": 2501,
    "preview": "from math import sqrt, floor, ceil\nfrom PIL import Image\n\nclass InitImageResizer():\n    \"\"\"Simple class to create resize"
  },
  {
    "path": "src/stablediffusion/ldm/dream/pngwriter.py",
    "chars": 3170,
    "preview": "\"\"\"\nTwo helper classes for dealing with PNG images and their path names.\nPngWriter -- Converts Images generated by T2I i"
  },
  {
    "path": "src/stablediffusion/ldm/dream/readline.py",
    "chars": 4001,
    "preview": "\"\"\"\nReadline helper functions for dream.py (linux and mac only).\n\"\"\"\nimport os\nimport re\nimport atexit\n\n# --------------"
  },
  {
    "path": "src/stablediffusion/ldm/dream/server.py",
    "chars": 11150,
    "preview": "import argparse\nimport json\nimport base64\nimport mimetypes\nimport os\nfrom http.server import BaseHTTPRequestHandler, Thr"
  },
  {
    "path": "src/stablediffusion/ldm/generate.py",
    "chars": 29784,
    "preview": "# Copyright (c) 2022 Lincoln D. Stein (https://github.com/lstein)\n\n# Derived from source code carrying the following cop"
  },
  {
    "path": "src/stablediffusion/ldm/gfpgan/gfpgan_tools.py",
    "chars": 5089,
    "preview": "import torch\nimport warnings\nimport os\nimport sys\nimport numpy as np\n\nfrom PIL import Image\nfrom scripts.dream import cr"
  },
  {
    "path": "src/stablediffusion/ldm/lr_scheduler.py",
    "chars": 4373,
    "preview": "import numpy as np\n\n\nclass LambdaWarmUpCosineScheduler:\n    \"\"\"\n    note: use with a base_lr of 1.0\n    \"\"\"\n\n    def __i"
  },
  {
    "path": "src/stablediffusion/ldm/models/autoencoder.py",
    "chars": 18827,
    "preview": "import torch\nimport pytorch_lightning as pl\nimport torch.nn.functional as F\nfrom contextlib import contextmanager\n\nfrom "
  },
  {
    "path": "src/stablediffusion/ldm/models/diffusion/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/stablediffusion/ldm/models/diffusion/classifier.py",
    "chars": 11363,
    "preview": "import os\nimport torch\nimport pytorch_lightning as pl\nfrom omegaconf import OmegaConf\nfrom torch.nn import functional as"
  },
  {
    "path": "src/stablediffusion/ldm/models/diffusion/ddim.py",
    "chars": 14480,
    "preview": "\"\"\"SAMPLING ONLY.\"\"\"\n\nimport torch\nimport numpy as np\nfrom tqdm import tqdm\nfrom functools import partial\nfrom src.stabl"
  },
  {
    "path": "src/stablediffusion/ldm/models/diffusion/ddpm.py",
    "chars": 78019,
    "preview": "\"\"\"\nwild mixture of\nhttps://github.com/lucidrains/denoising-diffusion-pytorch/blob/7706bdfc6f527f58d33f84b7b522e61e6e316"
  },
  {
    "path": "src/stablediffusion/ldm/models/diffusion/ksampler.py",
    "chars": 2964,
    "preview": "\"\"\"wrapper around part of Katherine Crowson's k-diffusion library, making it call compatible with other Samplers\"\"\"\nimpo"
  },
  {
    "path": "src/stablediffusion/ldm/models/diffusion/plms.py",
    "chars": 13620,
    "preview": "\"\"\"SAMPLING ONLY.\"\"\"\n\nimport torch\nimport numpy as np\nfrom tqdm import tqdm\nfrom functools import partial\nfrom src.stabl"
  },
  {
    "path": "src/stablediffusion/ldm/modules/attention.py",
    "chars": 12401,
    "preview": "from inspect import isfunction\nimport math\nimport torch\nimport torch.nn.functional as F\nfrom torch import nn, einsum\nfro"
  },
  {
    "path": "src/stablediffusion/ldm/modules/diffusionmodules/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/stablediffusion/ldm/modules/diffusionmodules/model.py",
    "chars": 35497,
    "preview": "# pytorch_diffusion + derived encoder decoder\nimport gc\nimport math\nimport torch\nimport torch.nn as nn\nimport numpy as n"
  },
  {
    "path": "src/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py",
    "chars": 36253,
    "preview": "from abc import abstractmethod\nfrom functools import partial\nimport math\nfrom typing import Iterable\n\nimport numpy as np"
  },
  {
    "path": "src/stablediffusion/ldm/modules/diffusionmodules/util.py",
    "chars": 10209,
    "preview": "# adopted from\n# https://github.com/openai/improved-diffusion/blob/main/improved_diffusion/gaussian_diffusion.py\n# and\n#"
  },
  {
    "path": "src/stablediffusion/ldm/modules/distributions/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/stablediffusion/ldm/modules/distributions/distributions.py",
    "chars": 3119,
    "preview": "import torch\nimport numpy as np\n\n\nclass AbstractDistribution:\n    def sample(self):\n        raise NotImplementedError()\n"
  },
  {
    "path": "src/stablediffusion/ldm/modules/ema.py",
    "chars": 3180,
    "preview": "import torch\nfrom torch import nn\n\n\nclass LitEma(nn.Module):\n    def __init__(self, model, decay=0.9999, use_num_upates="
  },
  {
    "path": "src/stablediffusion/ldm/modules/embedding_manager.py",
    "chars": 9030,
    "preview": "from cmath import log\nimport torch\nfrom torch import nn\n\nimport sys\n\nfrom src.stablediffusion.ldm.data.personalized impo"
  },
  {
    "path": "src/stablediffusion/ldm/modules/encoders/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/stablediffusion/ldm/modules/encoders/modules.py",
    "chars": 16679,
    "preview": "import torch\nimport torch.nn as nn\nfrom functools import partial\nimport clip\nfrom einops import rearrange, repeat\nfrom t"
  },
  {
    "path": "src/stablediffusion/ldm/modules/image_degradation/__init__.py",
    "chars": 266,
    "preview": "from src.stablediffusion.ldm.modules.image_degradation.bsrgan import (\n    degradation_bsrgan_variant as degradation_fn_"
  },
  {
    "path": "src/stablediffusion/ldm/modules/image_degradation/bsrgan.py",
    "chars": 26558,
    "preview": "# -*- coding: utf-8 -*-\n\"\"\"\n# --------------------------------------------\n# Super-Resolution\n# ------------------------"
  },
  {
    "path": "src/stablediffusion/ldm/modules/image_degradation/bsrgan_light.py",
    "chars": 23359,
    "preview": "# -*- coding: utf-8 -*-\nimport numpy as np\nimport cv2\nimport torch\n\nfrom functools import partial\nimport random\nfrom sci"
  },
  {
    "path": "src/stablediffusion/ldm/modules/image_degradation/utils_image.py",
    "chars": 30056,
    "preview": "import os\nimport math\nimport random\nimport numpy as np\nimport torch\nimport cv2\nfrom torchvision.utils import make_grid\nf"
  },
  {
    "path": "src/stablediffusion/ldm/modules/losses/__init__.py",
    "chars": 89,
    "preview": "from src.stablediffusion.ldm.modules.losses.contperceptual import LPIPSWithDiscriminator\n"
  },
  {
    "path": "src/stablediffusion/ldm/modules/losses/contperceptual.py",
    "chars": 6282,
    "preview": "import torch\nimport torch.nn as nn\n\nfrom taming.modules.losses.vqperceptual import *  # TODO: taming dependency yes/no?\n"
  },
  {
    "path": "src/stablediffusion/ldm/modules/losses/vqperceptual.py",
    "chars": 8654,
    "preview": "import torch\nfrom torch import nn\nimport torch.nn.functional as F\nfrom einops import repeat\n\nfrom taming.modules.discrim"
  },
  {
    "path": "src/stablediffusion/ldm/modules/x_transformer.py",
    "chars": 21561,
    "preview": "\"\"\"shout-out to https://github.com/lucidrains/x-transformers/tree/main/x_transformers\"\"\"\nimport torch\nfrom torch import "
  },
  {
    "path": "src/stablediffusion/ldm/simplet2i.py",
    "chars": 383,
    "preview": "'''\nThis module is provided for backward compatibility with the\noriginal (hasty) API.\n\nPlease use ldm.generate instead.\n"
  },
  {
    "path": "src/stablediffusion/ldm/util.py",
    "chars": 5927,
    "preview": "import importlib\n\nimport torch\nimport numpy as np\nfrom collections import abc\nfrom einops import rearrange\nfrom functool"
  },
  {
    "path": "src/stablediffusion/text2image_compvis.py",
    "chars": 4914,
    "preview": "import os\nimport torch\nimport numpy as np\nfrom PIL import Image\nfrom pytorch_lightning import seed_everything\nfrom torch"
  },
  {
    "path": "src/stablediffusion/text2image_diffusers.py",
    "chars": 7132,
    "preview": "import os\nimport torch\nimport numpy as np\nfrom PIL import Image\nfrom pytorch_lightning import seed_everything\nfrom torch"
  },
  {
    "path": "src/stablediffusion/translation.py",
    "chars": 6444,
    "preview": "import inspect\nfrom typing import List, Optional, Union\n\nimport numpy as np\nimport torch\n\nimport PIL\nfrom diffusers impo"
  },
  {
    "path": "storage/init/.keep",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "storage/outputs/.keep",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "win10fix.bat",
    "chars": 32,
    "preview": "python src\\scripts\\win10patch.py"
  }
]

About this extraction

This page contains the full source code of the harubaru/discord-stable-diffusion GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 72 files (557.1 KB), approximately 138.1k tokens, and a symbol index with 739 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Copied to clipboard!