Repository: AvrahamRaviv/Text2All
Branch: main
Commit: cbc428166d2d
Files: 1
Total size: 6.9 KB

Directory structure:
gitextract_wtz21xxr/
└── README.md

================================================
FILE CONTENTS
================================================

================================================
FILE: README.md
================================================
# Text2All

![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)

> A comprehensive list of resources about text-guided generative models.

## Table of contents

- [Works and Papers](#works-and-papers)
  - [Text-to-Image](#text-to-image)
  - [Text-to-Video](#text-to-video)
  - [Text-to-3D](#text-to-3d)
  - [Text-to-Audio](#text-to-audio)
  - [Text-to-Motion](#text-to-motion)
  - [Text-to-Style](#text-to-style)
- [Tutorials](#tutorials)
  - [Resources](#resources)
  - [Blogs and Summaries](#blogs-and-summaries)
  - [Videos](#videos)

## Works and Papers

### Text-to-Image

- [DALL-E 2 - Hierarchical Text-Conditional Image Generation with CLIP Latents](https://openai.com/dall-e-2/)
- [Imagen - Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding](https://imagen.research.google/)
- [RE-IMAGEN: Retrieval-augmented Text-to-image Generator](https://arxiv.org/pdf/2209.14491.pdf)
- [Stable Diffusion - High-Resolution Image Synthesis with Latent Diffusion Models](https://github.com/CompVis/stable-diffusion)
- [Midjourney](https://www.midjourney.com/home/)
- [GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models](https://github.com/openai/glide-text2im)
- [Parti - Pathways Autoregressive Text-to-Image](https://parti.research.google/)
- [MagicMix: Semantic Mixing with Diffusion Models](https://arxiv.org/abs/2210.16056)
- [AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks](https://github.com/taoxugit/AttnGAN)
- [Imagic: Text-Based Real Image Editing with Diffusion Models](https://arxiv.org/pdf/2210.09276.pdf)
- [DIFFEDIT: Diffusion-based Semantic Image Editing With Mask Guidance](https://arxiv.org/pdf/2210.11427.pdf?fbclid=IwAR1WanJ75VTG2NFYhZAevLyogvcbmeC4dj8Cifx4dd94SdyQGd92h8gU0Ec)
- [UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image](https://arxiv.org/abs/2210.09477)
- [LAFITE: Towards Language-Free Training for Text-to-Image Generation](https://github.com/drboog/Lafite)
- [DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation](https://dreambooth.github.io/)
- [Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors](https://arxiv.org/abs/2203.13131)
- [Text2LIVE: Text-Driven Layered Image and Video Editing](https://text2live.github.io/?utm_source=catalyzex.com)
- [An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion](https://textual-inversion.github.io/)
- [Prompt-to-Prompt: Latent Diffusion and Stable Diffusion implementation](https://prompt-to-prompt.github.io/)
- [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://github.com/hanzhanggit/StackGAN)
- [clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP](https://github.com/justinpinkney/clip2latent)
- Stable Diffusion Notebooks: [Image](https://colab.research.google.com/github/deforum/stable-diffusion/blob/main/Deforum_Stable_Diffusion.ipynb?fbclid=IwAR23pz-LB_UcXOE1vBGIf6niGL86CHlISFhr4kfqYA-qUJR_m0EVfWOpg5Y), [Animation](https://colab.research.google.com/gist/costiash/d9421a1d4c0c66f8fb1d0f107b0c0bcb/stable-diffusion-animation-demo.ipynb?fbclid=IwAR23gmrl-TRW37PRxWTYlrixdy8DA3woxEpmaE62OzsZ48-dx7raraD7UAY), [Panorama](https://colab.research.google.com/drive/1RXRrkKUnpNiPCxTJg0Imq7sIM8ltYFz2?usp=sharing), [Text-based real image editing](https://github.com/justinpinkney/stable-diffusion/blob/main/notebooks/imagic.ipynb)

### Text-to-Video

- [Make-A-Video: Text-to-Video Generation without Text-Video Data](https://makeavideo.studio/)
- [Imagen Video: High Definition Video Generation With Diffusion Models](https://imagen.research.google/video/)
- [Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions](https://phenaki.video/?fbclid=IwAR2PPsn9kT7WGbOaTrr-Fi7UBVBWd8-BZzX3bLFT9B_WISO9LBGq8mBIl6M)
- [GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions](https://arxiv.org/pdf/2104.14806v1.pdf)
- [CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers](https://github.com/THUDM/CogVideo)
- [NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion](https://github.com/microsoft/NUWA)

### Text-to-3D

- [DreamFusion: Text-to-3D using 2D Diffusion](https://dreamfusionpaper.github.io/)
- [Zero-Shot Text-Guided Object Generation with Dream Fields](https://ajayj.com/dreamfields)
- [CLIP-Mesh: Generating textured meshes from text using pretrained image-text models](https://www.nasir.lol/clipmesh)

### Text-to-Audio

- [AudioGen: Textually Guided Audio Generation](https://felixkreuk.github.io/text2audio_arxiv_samples/)
- [Diffsound: Discrete Diffusion Model for Text-to-sound Generation](http://dongchaoyang.top/text-to-sound-synthesis-demo/)

### Text-to-Motion

- [MotionCLIP: Exposing Human Motion Generation to CLIP Space](https://guytevet.github.io/motionclip-page/)
- [Human Motion Diffusion Model](https://guytevet.github.io/mdm-page/)
- [Language2Pose: Natural Language Grounded Pose Forecasting](https://arxiv.org/abs/1907.01108)

### Text-to-Style

- [Text-Driven Stylization of Video Objects](https://sloeschcke.github.io/Text-Driven-Stylization-of-Video-Objects/)

## Tutorials

### Resources

- [Awesome AI image synthesis - A list of awesome tools, ideas, prompt engineering tools, colabs, models, and helpers for the prompt designer playing with aiArt and image synthesis](https://github.com/altryne/awesome-ai-art-image-synthesis)
- [Tools and Resources for AI Art](https://pharmapsychotic.com/tools.html)
- [DiffusionDB - A large-scale text-to-image prompt dataset containing 2 million images](https://github.com/poloclub/diffusiondb)

### Blogs and Summaries

- [The Illustrated Stable Diffusion](https://jalammar.github.io/illustrated-stable-diffusion/)
- [Introduction to Diffusion Models](https://www.assemblyai.com/blog/diffusion-models-for-machine-learning-introduction/)
- [How DALL-E 2 works](https://www.assemblyai.com/blog/how-dall-e-2-actually-works/)
- [How Imagen works](https://www.assemblyai.com/blog/how-imagen-actually-works/)
- [Build Your Own Imagen](https://www.assemblyai.com/blog/minimagen-build-your-own-imagen-text-to-image-model/)

### Videos

- [Diffusion models from scratch in PyTorch](https://www.youtube.com/watch?v=a4Yfz2FxXiY&ab_channel=DeepFindr)
- [Yannic Kilcher - Text-to-Image models are taking over! (Imagen, DALL-E 2, Midjourney, CogView 2 & more)](https://www.youtube.com/watch?v=af6WPqvzjjk&ab_channel=YannicKilcher)
- [Yannic Kilcher - Stable Diffusion Takes Over!](https://www.youtube.com/watch?v=xbxe-x6wvRw&ab_channel=YannicKilcher)