Runway, the startup known for co-creating the popular text-to-image AI model Stable Diffusion, has unveiled a new artificial intelligence model capable of generating short videos from text prompts.

The generative neural network, called Gen-2, creates 3-second video clips from scratch based on a short text prompt. Users can also upload an image as an additional cue for the algorithm.

Gen-2 will not be open-sourced or widely available at launch. Instead, users can sign up for a waitlist via Runway's Discord.


Right now, the videos generated by Gen-1 and Gen-2 have no sound. Runway is conducting research into audio generation in hopes of creating a system that automatically produces not only images and videos, but sound that suits them.