DALL·E: How to get access? Tips & FAQs about Dalle2
We are writing this article for a general audience and casual users of DALL-E. How do you get access to DALL-E 2? How does it work? Explore its features and example artworks. What are its limitations and restrictions? Try alternative AI text-to-image generators while you await Dalle access.
What is DALL·E 2?
OpenAI has developed an AI system called Dalle 2. It can generate high-resolution images, such as artwork and photos, from text descriptions.
When will OpenAI release DALL·E to the public? How to access DALL·E?
Dalle's full public release date is unknown. From April to July 2022, it was in trial/research mode, with very few people getting access. On 20 July 2022, OpenAI announced the Dalle Beta, in which 1 million users from the waitlist will gradually get access. You can join the Dalle Beta waitlist by submitting the form here: https://labs.openai.com/waitlist
Is DALL·E free? How to buy DALL·E credits? How much does DALL·E 2 cost?
Dalle 2 has a freemium model. You get 50 free credits in your first month, then 15 free credits every month after that. Free credits expire after 1 month. Once you have access to Dalle, you can purchase 115 credits for $15 from the website. One credit gets you 4 images for a text prompt, or 3 images with the variation or edit function. Inpainting is possible with the eraser tool inside the edit feature.
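To make the pricing concrete, here is a small Python sketch that works out the cost per credit and per image from the figures quoted above. The constant and function names are our own, chosen for illustration, and the numbers only hold for as long as OpenAI keeps this pricing.

```python
# Figures quoted in this article: 115 paid credits cost $15, and one credit
# yields 4 images for a text prompt or 3 images for a variation or edit.
PACK_PRICE_USD = 15.00
PACK_CREDITS = 115
IMAGES_PER_PROMPT_CREDIT = 4
IMAGES_PER_EDIT_CREDIT = 3


def cost_per_credit() -> float:
    """Price of a single paid credit in US dollars."""
    return PACK_PRICE_USD / PACK_CREDITS


def cost_per_image(images_per_credit: int) -> float:
    """Price of a single generated image in US dollars."""
    return cost_per_credit() / images_per_credit


if __name__ == "__main__":
    print(f"Per credit:       ${cost_per_credit():.3f}")
    print(f"Per prompt image: ${cost_per_image(IMAGES_PER_PROMPT_CREDIT):.3f}")
    print(f"Per edit image:   ${cost_per_image(IMAGES_PER_EDIT_CREDIT):.3f}")
```

That works out to roughly 13 cents per credit, a little over 3 cents per image from a text prompt, and a little over 4 cents per image from a variation or edit.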
How do people use DALL·E 2 for generating images and artworks?
The online community on Twitter, Reddit, Instagram, Discord, etc. discovers, learns, and shares different ways in which we can creatively use Dalle!
Coming up with effective descriptions is called prompt design, and it is a community-driven effort. People have found that adding information to the text prompt has a big effect on the images that Dalle generates. This information can be about the medium and style of the image, the types of objects and beings in it, the historical period it is from, emotions, actions, popular artists' names, etc. A great resource for learning prompt design is the DALL-E Prompt Book: https://dallery.gallery/the-dalle-2-prompt-book/
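One way to think about prompt design is as filling in a template: start with the subject, then add optional details such as medium, style, period, and mood. The Python sketch below illustrates that idea. The function `build_prompt` and its parameters are hypothetical, not part of any Dalle tool, and the ordering of the details is just one reasonable choice.

```python
from typing import Optional


def build_prompt(
    subject: str,
    medium: Optional[str] = None,
    style: Optional[str] = None,
    period: Optional[str] = None,
    mood: Optional[str] = None,
) -> str:
    """Join the subject with any optional details into one comma-separated prompt."""
    details = [medium, style, period, mood]
    # Keep only the details that were actually provided.
    parts = [subject] + [d for d in details if d]
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "a fox reading a newspaper in a cafe",
        medium="oil painting",
        period="1920s Paris",
        mood="melancholic",
    ))
```

Changing a single detail, for example swapping "oil painting" for "pencil sketch", is an easy way to explore how each piece of information affects the result.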
You can also give it an input image and ask it to produce variations of that image. Variations are based purely on the input image; you cannot add any text description. Dalle infers the context, elements, styles, etc. from your input image, and then produces, on its own, different images with the same context, elements, and styles. By default, Dalle produces square images of 1024 by 1024 pixels.
Or you can give it incomplete images with transparent sections, and Dalle will edit and fill in the image based on the image contents and your description. People call it inpainting when you erase parts of an image and let Dalle redraw something else there. Dalle has an eraser tool that lets you do inpainting.
Outpainting / Uncropping
We call it outpainting or uncropping when you shrink the existing image so that there is a larger transparent canvas around it, and then let Dalle expand the image with new elements, based on its own creativity and your input descriptions. Dalle does not currently offer any way to shrink the image or expand the canvas. So people download the Dalle image, edit it in an image editor, and re-upload it to Dalle for uncropping. Beautiful large murals, infinite-zoom videos, landscape and panorama-style artworks, etc. have been produced in this manner.
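The manual editing step can also be scripted. The sketch below is one possible way to prepare an image for outpainting in Python, assuming the third-party Pillow library is installed. The function names, the file paths you pass in, and the default scale of 0.5 are our own choices for illustration; only the 1024-pixel canvas size comes from Dalle's default output size.

```python
from typing import Tuple


def centered_box(canvas_size: int, scale: float) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for a square image shrunk by `scale`
    and centered on a square canvas of `canvas_size` pixels."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in the range (0, 1]")
    inner = round(canvas_size * scale)
    offset = (canvas_size - inner) // 2
    return (offset, offset, offset + inner, offset + inner)


def make_outpainting_canvas(
    src_path: str, dst_path: str, canvas_size: int = 1024, scale: float = 0.5
) -> None:
    """Shrink the source image and paste it onto a transparent square canvas."""
    from PIL import Image  # third-party: pip install Pillow

    left, top, right, bottom = centered_box(canvas_size, scale)
    with Image.open(src_path) as src:
        small = src.convert("RGBA").resize((right - left, bottom - top), Image.LANCZOS)

    # A fully transparent canvas; the area around the pasted image stays
    # transparent, which is the region left for Dalle to fill in.
    canvas = Image.new("RGBA", (canvas_size, canvas_size), (0, 0, 0, 0))
    canvas.paste(small, (left, top))
    canvas.save(dst_path, format="PNG")
```

Calling `make_outpainting_canvas("original.png", "for_outpainting.png")` would write a PNG with the shrunken image in the middle and a transparent border around it, ready to upload to the edit feature. Saving as PNG matters because formats like JPEG do not keep transparency.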
Where does DALL·E get its name from?
Dalle is named after the robot character WALL-E and the artist Salvador Dalí.
How is DALL·E related to GPT-3?
OpenAI has developed a natural language model called GPT-3. It offers GPT-3 as a paid API, and many apps have been built on its natural language processing capabilities. While GPT-3 is optimized to understand and process text, Dalle is optimized to understand text-image relationships. The original Dalle was a 12-billion-parameter version of GPT-3, trained to generate images from text. Dalle 2 takes a different approach based on CLIP and diffusion, and uses a smaller 3.5-billion-parameter model. GPT-3 itself has 175 billion parameters!
What are some DALL·E alternatives for text-to-image AI generation?
- Dalle Mini aka Craiyon: https://www.craiyon.com/
- Midjourney: https://www.midjourney.com/
- Wombo Dream: https://wombo.art/
- StarryAI: https://www.starryai.com/
- NightCafe: https://nightcafe.studio/
- Replicate's Library of Text to Image models: https://replicate.com/collections/text-to-image
- Stable Diffusion Beta: https://stability.ai/beta-signup-form
- Google Imagen: https://imagen.research.google/
- Google Parti: https://parti.research.google/
How do text-to-image AI art generators work?
- Researchers and developers try to use, mix, and evolve different approaches.
- They usually train these text-to-image models on millions of text captions-image pairs.
- GANs do well in narrow contexts, like artificial face generation. But they fail for general use cases.
- CLIP proved to be a useful ingredient to generalize the model for any text and image. CLIP stands for Contrastive Language-Image Pretraining.
- Dalle 2 uses CLIP and diffusion.
- Diffusion helps to improve the quality of the image. It produces a new image by starting from noise and gradually refining it.
- Dalle 2's diffusion model is based on GLIDE (Guided Language to Image Diffusion for Generation and Editing).
- Some AI art generators work by trying to emulate and improve upon Dalle's approach, based on what OpenAI shared in its scientific papers.
- Other AI text-to-image generators take a different approach, like using BigGAN, VQ-GAN, StyleGAN, etc. to generate images. But they all usually rely on CLIP.
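To give a feel for what CLIP contributes, here is a toy Python sketch of the matching idea behind it: text and images are turned into lists of numbers (embeddings), and a caption fits an image well when their embeddings point in a similar direction. The three-number vectors below are made up by us for illustration. Real CLIP learns its embeddings with neural networks and uses far more dimensions, so treat this as an analogy and not as how Dalle is implemented.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Similarity of two vectors: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image: List[float], captions: Dict[str, List[float]]) -> str:
    """Return the caption whose embedding is most similar to the image embedding."""
    return max(captions, key=lambda text: cosine_similarity(image, captions[text]))


# Made-up embeddings, purely for illustration.
TOY_IMAGE = [0.9, 0.1, 0.2]
TOY_CAPTIONS = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a bowl of fruit": [0.1, 0.9, 0.3],
    "a city at night": [0.2, 0.1, 0.9],
}

if __name__ == "__main__":
    for text, vector in TOY_CAPTIONS.items():
        print(f"{text}: {cosine_similarity(TOY_IMAGE, vector):.2f}")
    print("Best match:", best_caption(TOY_IMAGE, TOY_CAPTIONS))
```

With these toy numbers, "a photo of a dog" scores highest. Generators that rely on CLIP use this kind of score to steer the image toward the text prompt.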
Does DALL·E have a secret language?
There has been speculation on the internet that Dalle has a secret language. Examples were shared in which certain gibberish words produced birds or insects. However, other users debunked this theory by showing that the same gibberish words can generate different objects. Dalle always generates some image, whether you give it meaningful words, emojis, or nonsensical gibberish.
What are the restrictions and limitations of DALL-E 2?
Dalle 2 is not perfect. There are many limitations and ways in which Dalle can fail to produce the right outputs. An AI technology of this calibre can also be risky and potentially harmful in the wrong hands. So OpenAI has put intentional restrictions on how Dalle 2 can be used.
Limitations of Dalle2
- When you ask Dalle to output text with 2 or more words, it tends to produce incorrect spellings. In other words, Dalle is bad at spelling out more than a word or so at a time.
- Dalle is also bad at composition when it comes to relative positions of elements. When told to draw "a red object on top of a blue one", it can produce the opposite: a blue object on top of a red one.
- Dalle is bad at counting as well. When told to draw "one blue apple in a bowl of green oranges", it produces blue apples and green oranges, but their quantities can differ from what we asked for.
- Sometimes, Dalle can produce faces with defects. Such imperfections show up in other subjects too, but people tend to notice facial defects more easily.
Restrictions in Dalle2
- OpenAI has taken care to omit images in its training data that were violent, adult, or politically manipulative.
- Dalle2 refuses to accept names of famous people in its text prompts, or pictures of real people as image upload inputs.
- Similarly, it does not allow violent, adult, or political concepts in its text prompts.
- Accounts that submit descriptions or images that go against Dalle's policy can get permanently suspended.
- Besides these active restrictions, OpenAI has been careful about restricting the user base of Dalle 2. They are gradually testing and expanding access to Dalle 2 as they learn how people are using it.
This was an introductory article about OpenAI's DALL-E 2. We have seen how you can join the Dalle waitlist, what the features of Dalle are, how people are using Dalle etc. By answering some FAQs about Dalle, we have made it easier for you to navigate this exciting new rabbit hole on the internet. There are many improvements to Dalle coming up. Similarly, several alternative AI Text-to-Image generators are being released by other companies and groups. These new tools can expand the creativity of artists, designers, photographers, etc. Dalle 2 and its alternatives give new creative capabilities to people who always had great imagination but never had the time or the inclination to learn creative tools and techniques. Which of these tools are your favorites? Tell us in the comments!