Repository: AvrahamRaviv/Text2All
Branch: main
Commit: cbc428166d2d
Files: 1
Total size: 6.9 KB

Directory structure:
gitextract_wtz21xxr/

└── README.md

================================================
FILE CONTENTS
================================================

================================================
FILE: README.md
================================================
# Text2All    ![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)

> A comprehensive list of resources about text-guided generative models.
<p align="center">
<img src="https://github.com/AvrahamRaviv/Text2All/blob/main/TextToAll.png" width="512" height="512">
</p>

## Table of contents
- [Works and Papers](#works-and-papers)
  - [Text-to-Image](#text-to-image)
  - [Text-to-Video](#text-to-video)
  - [Text-to-3D](#text-to-3d)
  - [Text-to-Audio](#text-to-audio)
  - [Text-to-Motion](#text-to-motion)
  - [Text-to-Style](#text-to-style)

- [Tutorials](#tutorials)
  - [Resources](#resources)
  - [Blogs and Summaries](#blogs-and-summaries)
  - [Videos](#videos)

## Works and Papers

### Text-to-Image
- [DALL-E 2 - Hierarchical Text-Conditional Image Generation with CLIP Latents](https://openai.com/dall-e-2/)

- [Imagen - Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding](https://imagen.research.google/)

- [Re-Imagen: Retrieval-Augmented Text-to-Image Generator](https://arxiv.org/pdf/2209.14491.pdf)

- [Stable Diffusion - High-Resolution Image Synthesis with Latent Diffusion Models](https://github.com/CompVis/stable-diffusion)

- [Midjourney](https://www.midjourney.com/home/)

- [GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models](https://github.com/openai/glide-text2im)

- [Parti - Pathways Autoregressive Text-to-Image](https://parti.research.google/)

- [MagicMix: Semantic Mixing with Diffusion Models](https://arxiv.org/abs/2210.16056)

- [AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks](https://github.com/taoxugit/AttnGAN)

- [Imagic: Text-Based Real Image Editing with Diffusion Models](https://arxiv.org/pdf/2210.09276.pdf)

- [DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance](https://arxiv.org/pdf/2210.11427.pdf)

- [UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image](https://arxiv.org/abs/2210.09477)

- [LAFITE: Towards Language-Free Training for Text-to-Image Generation](https://github.com/drboog/Lafite)

- [DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation](https://dreambooth.github.io/)

- [Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors](https://arxiv.org/abs/2203.13131)

- [Text2LIVE: Text-Driven Layered Image and Video Editing](https://text2live.github.io/)

- [An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion](https://textual-inversion.github.io/)

- [Prompt-to-Prompt Image Editing with Cross Attention Control](https://prompt-to-prompt.github.io/)

- [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://github.com/hanzhanggit/StackGAN)

- [clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP](https://github.com/justinpinkney/clip2latent)

- Stable Diffusion Notebooks: [Image](https://colab.research.google.com/github/deforum/stable-diffusion/blob/main/Deforum_Stable_Diffusion.ipynb), [Animation](https://colab.research.google.com/gist/costiash/d9421a1d4c0c66f8fb1d0f107b0c0bcb/stable-diffusion-animation-demo.ipynb), [Panorama](https://colab.research.google.com/drive/1RXRrkKUnpNiPCxTJg0Imq7sIM8ltYFz2?usp=sharing), [Text-Based Real Image Editing (Imagic)](https://github.com/justinpinkney/stable-diffusion/blob/main/notebooks/imagic.ipynb)

### Text-to-Video
- [Make-A-Video: Text-to-Video Generation without Text-Video Data](https://makeavideo.studio/)

- [Imagen Video: High Definition Video Generation With Diffusion Models](https://imagen.research.google/video/)

- [Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions](https://phenaki.video/)

- [GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions](https://arxiv.org/pdf/2104.14806v1.pdf)

- [CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers](https://github.com/THUDM/CogVideo)

- [NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion](https://github.com/microsoft/NUWA)

### Text-to-3D
- [DreamFusion: Text-to-3D using 2D Diffusion](https://dreamfusionpaper.github.io/)

- [Zero-Shot Text-Guided Object Generation with Dream Fields](https://ajayj.com/dreamfields)

- [CLIP-Mesh: Generating textured meshes from text using pretrained image-text models](https://www.nasir.lol/clipmesh)

### Text-to-Audio
- [AudioGen: Textually Guided Audio Generation](https://felixkreuk.github.io/text2audio_arxiv_samples/)

- [Diffsound: Discrete Diffusion Model for Text-to-sound Generation](http://dongchaoyang.top/text-to-sound-synthesis-demo/)

### Text-to-Motion
- [MotionCLIP: Exposing Human Motion Generation to CLIP Space](https://guytevet.github.io/motionclip-page/)

- [Human Motion Diffusion Model](https://guytevet.github.io/mdm-page/)

- [Language2Pose: Natural Language Grounded Pose Forecasting](https://arxiv.org/abs/1907.01108)

### Text-to-Style
- [Text-Driven Stylization of Video Objects](https://sloeschcke.github.io/Text-Driven-Stylization-of-Video-Objects/)

## Tutorials

### Resources

- [Awesome AI image synthesis - A list of awesome tools, ideas, prompt engineering tools, colabs, models, and helpers for the prompt designer playing with aiArt and image synthesis](https://github.com/altryne/awesome-ai-art-image-synthesis)

- [Tools and Resources for AI Art](https://pharmapsychotic.com/tools.html)

- [DiffusionDB - A large-scale text-to-image prompt dataset containing 2 million images](https://github.com/poloclub/diffusiondb)

### Blogs and Summaries

- [The Illustrated Stable Diffusion](https://jalammar.github.io/illustrated-stable-diffusion/)

- [Introduction to Diffusion Models](https://www.assemblyai.com/blog/diffusion-models-for-machine-learning-introduction/)

- [How DALL-E 2 works](https://www.assemblyai.com/blog/how-dall-e-2-actually-works/)

- [How Imagen works](https://www.assemblyai.com/blog/how-imagen-actually-works/)

- [Build Your Own Imagen](https://www.assemblyai.com/blog/minimagen-build-your-own-imagen-text-to-image-model/)

### Videos

- [Diffusion models from scratch in PyTorch](https://www.youtube.com/watch?v=a4Yfz2FxXiY&ab_channel=DeepFindr)

- [Yannic Kilcher - Text-to-Image models are taking over! (Imagen, DALL-E 2, Midjourney, CogView 2 & more)](https://www.youtube.com/watch?v=af6WPqvzjjk&ab_channel=YannicKilcher)

- [Yannic Kilcher - Stable Diffusion Takes Over!](https://www.youtube.com/watch?v=xbxe-x6wvRw&ab_channel=YannicKilcher)