Repository: AvrahamRaviv/Text2All
Branch: main
Commit: cbc428166d2d
Files: 1
Total size: 6.9 KB
Directory structure:
gitextract_wtz21xxr/
└── README.md
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
# Text2All 
> A comprehensive list of resources about text-guided generative models.
## Table of contents
- [Works and Papers](#works-and-papers)
  - [Text-to-Image](#text-to-image)
  - [Text-to-Video](#text-to-video)
  - [Text-to-3D](#text-to-3d)
  - [Text-to-Audio](#text-to-audio)
  - [Text-to-Motion](#text-to-motion)
  - [Text-to-Style](#text-to-style)
- [Tutorials](#tutorials)
  - [Resources](#resources)
  - [Blogs and Summaries](#blogs-and-summaries)
  - [Videos](#videos)
## Works and Papers
### Text-to-Image
- [DALL-E 2 - Hierarchical Text-Conditional Image Generation with CLIP Latents](https://openai.com/dall-e-2/)
- [Imagen - Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding](https://imagen.research.google/)
- [RE-IMAGEN: Retrieval-augmented Text-to-image Generator](https://arxiv.org/pdf/2209.14491.pdf)
- [Stable Diffusion - High-Resolution Image Synthesis with Latent Diffusion Models](https://github.com/CompVis/stable-diffusion)
- [Midjourney](https://www.midjourney.com/home/)
- [GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models](https://github.com/openai/glide-text2im)
- [Parti - Pathways Autoregressive Text-to-Image](https://parti.research.google/)
- [MagicMix: Semantic Mixing with Diffusion Models](https://arxiv.org/abs/2210.16056)
- [AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks](https://github.com/taoxugit/AttnGAN)
- [Imagic: Text-Based Real Image Editing with Diffusion Models](https://arxiv.org/pdf/2210.09276.pdf)
- [DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance](https://arxiv.org/pdf/2210.11427.pdf)
- [UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image](https://arxiv.org/abs/2210.09477)
- [LAFITE: Towards Language-Free Training for Text-to-Image Generation](https://github.com/drboog/Lafite)
- [DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation](https://dreambooth.github.io/)
- [Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors](https://arxiv.org/abs/2203.13131)
- [Text2LIVE: Text-Driven Layered Image and Video Editing](https://text2live.github.io/)
- [An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion](https://textual-inversion.github.io/)
- [Prompt-to-Prompt Image Editing with Cross Attention Control](https://prompt-to-prompt.github.io/)
- [StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks](https://github.com/hanzhanggit/StackGAN)
- [clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP](https://github.com/justinpinkney/clip2latent)
- Stable Diffusion Notebooks: [Image](https://colab.research.google.com/github/deforum/stable-diffusion/blob/main/Deforum_Stable_Diffusion.ipynb), [Animation](https://colab.research.google.com/gist/costiash/d9421a1d4c0c66f8fb1d0f107b0c0bcb/stable-diffusion-animation-demo.ipynb), [Panorama](https://colab.research.google.com/drive/1RXRrkKUnpNiPCxTJg0Imq7sIM8ltYFz2?usp=sharing), [Text-Based Real Image Editing](https://github.com/justinpinkney/stable-diffusion/blob/main/notebooks/imagic.ipynb)
### Text-to-Video
- [Make-A-Video: Text-to-Video Generation without Text-Video Data](https://makeavideo.studio/)
- [Imagen Video: High Definition Video Generation With Diffusion Models](https://imagen.research.google/video/)
- [Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions](https://phenaki.video/)
- [GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions](https://arxiv.org/pdf/2104.14806v1.pdf)
- [CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers](https://github.com/THUDM/CogVideo)
- [NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion](https://github.com/microsoft/NUWA)
### Text-to-3D
- [DreamFusion: Text-to-3D using 2D Diffusion](https://dreamfusionpaper.github.io/)
- [Zero-Shot Text-Guided Object Generation with Dream Fields](https://ajayj.com/dreamfields)
- [CLIP-Mesh: Generating textured meshes from text using pretrained image-text models](https://www.nasir.lol/clipmesh)
### Text-to-Audio
- [AudioGen: Textually Guided Audio Generation](https://felixkreuk.github.io/text2audio_arxiv_samples/)
- [Diffsound: Discrete Diffusion Model for Text-to-sound Generation](http://dongchaoyang.top/text-to-sound-synthesis-demo/)
### Text-to-Motion
- [MotionCLIP: Exposing Human Motion Generation to CLIP Space](https://guytevet.github.io/motionclip-page/)
- [Human Motion Diffusion Model](https://guytevet.github.io/mdm-page/)
- [Language2Pose: Natural Language Grounded Pose Forecasting](https://arxiv.org/abs/1907.01108)
### Text-to-Style
- [Text-Driven Stylization of Video Objects](https://sloeschcke.github.io/Text-Driven-Stylization-of-Video-Objects/)
## Tutorials
### Resources
- [Awesome AI image synthesis - A list of awesome tools, ideas, prompt engineering tools, colabs, models, and helpers for the prompt designer playing with aiArt and image synthesis](https://github.com/altryne/awesome-ai-art-image-synthesis)
- [Tools and Resources for AI Art](https://pharmapsychotic.com/tools.html)
- [DiffusionDB - A large-scale text-to-image prompt dataset containing 2 million images](https://github.com/poloclub/diffusiondb)
### Blogs and Summaries
- [The Illustrated Stable Diffusion](https://jalammar.github.io/illustrated-stable-diffusion/)
- [Introduction to Diffusion Models](https://www.assemblyai.com/blog/diffusion-models-for-machine-learning-introduction/)
- [How DALL-E 2 works](https://www.assemblyai.com/blog/how-dall-e-2-actually-works/)
- [How Imagen works](https://www.assemblyai.com/blog/how-imagen-actually-works/)
- [Build Your Own Imagen](https://www.assemblyai.com/blog/minimagen-build-your-own-imagen-text-to-image-model/)
### Videos
- [Diffusion models from scratch in PyTorch](https://www.youtube.com/watch?v=a4Yfz2FxXiY&ab_channel=DeepFindr)
- [Yannic Kilcher - Text-to-Image models are taking over! (Imagen, DALL-E 2, Midjourney, CogView 2 & more)](https://www.youtube.com/watch?v=af6WPqvzjjk&ab_channel=YannicKilcher)
- [Yannic Kilcher - Stable Diffusion Takes Over!](https://www.youtube.com/watch?v=xbxe-x6wvRw&ab_channel=YannicKilcher)