Explore Runway Research's cutting-edge work in multimodal AI, including Gen-4 video generation, 3D Gaussian splatting, and domain generalization. Discover how Runway is shaping the future of simulation and creativity.
Runway Research is focused on developing multimodal AI systems that understand and simulate the dynamics of the real world. Their work centers on video as a core input and output, enhanced by modalities like audio and text to create more comprehensive models. These general-purpose simulators aim to power the next generation of creative and analytical tools.
The team at Runway believes that video, with its complexity and temporal structure, provides the most powerful foundation for training AI systems that approach human-like perception and understanding. By grounding models in rich video data, they aim to unlock applications in film, design, and interactive experiences.
Runway researchers introduced a method called StochasticSplats that improves on existing 3D Gaussian splatting techniques by eliminating the need for depth sorting. This stochastic rasterization approach gives explicit control over the trade-off between rendering cost and visual fidelity, improving both performance and image quality in 3D applications.
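To give a sense of the underlying idea, the sketch below shows sorting-free alpha compositing via stochastic transparency at a single pixel: each fragment survives a coin flip weighted by its opacity, a plain depth test keeps the nearest survivor, and averaging many samples converges to the sorted result. This is a toy NumPy illustration of the general technique, not Runway's implementation; the function names and sample count are illustrative.

```python
# Toy sketch: sorting-free compositing via stochastic transparency (NumPy).
# Illustrative only; not Runway's StochasticSplats implementation.
import numpy as np

def sorted_composite(colors, alphas, depths, background):
    """Reference: classic front-to-back alpha blending (requires a depth sort)."""
    order = np.argsort(depths)
    out, transmittance = np.zeros(3), 1.0
    for i in order:
        out += transmittance * alphas[i] * colors[i]
        transmittance *= 1.0 - alphas[i]
    return out + transmittance * background

def stochastic_composite(colors, alphas, depths, background, num_samples=256, rng=None):
    """Monte Carlo estimate of the same result with no sorting.

    Each sample lets every fragment survive independently with probability
    alpha; a standard z-test keeps the nearest survivor. Averaging over
    samples converges to sorted alpha blending.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(alphas)
    accum = np.zeros(3)
    for _ in range(num_samples):
        survives = rng.random(n) < alphas            # per-fragment coin flip
        if survives.any():
            nearest = np.argmin(np.where(survives, depths, np.inf))
            accum += colors[nearest]
        else:
            accum += background                      # nothing covers this sample
    return accum / num_samples

# Tiny usage example: three overlapping "splat" fragments at one pixel.
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
alphas = np.array([0.6, 0.4, 0.8])
depths = np.array([2.0, 1.0, 3.0])
bg = np.array([0.0, 0.0, 0.0])
print(sorted_composite(colors, alphas, depths, bg))
print(stochastic_composite(colors, alphas, depths, bg, num_samples=20000))
```

In this framing, the sample count is the knob that trades rendering cost against noise, which is where the explicit cost-versus-fidelity control comes from.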
The SCoPE method refines how generative models interpret complex prompts. By decomposing a prompt into coarse-to-fine stages, it yields more faithful visual detail and tighter alignment between the input description and the generated image.
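As a rough illustration of the coarse-to-fine idea (not SCoPE's actual algorithm), the snippet below conditions early, noisy denoising steps on a coarse prompt embedding and blends toward finer embeddings as generation proceeds; the embeddings, dimensions, and schedule here are stand-ins.

```python
# Hedged sketch of coarse-to-fine prompt conditioning across denoising steps.
# Illustrative only; the embeddings and denoiser are stand-ins, not SCoPE.
import numpy as np

def coarse_to_fine_schedule(prompt_embeddings, step, total_steps):
    """Blend prompt embeddings for a given denoising step.

    prompt_embeddings is ordered from coarsest ("a city street") to finest
    ("a rainy neon-lit city street at night, reflections on wet asphalt").
    Early, noisy steps see the coarse description; later steps see finer detail.
    """
    # Map the step onto a fractional position along the coarse-to-fine list.
    position = (step / max(total_steps - 1, 1)) * (len(prompt_embeddings) - 1)
    lo, hi = int(np.floor(position)), int(np.ceil(position))
    weight = position - lo
    return (1.0 - weight) * prompt_embeddings[lo] + weight * prompt_embeddings[hi]

# Toy usage with random stand-in embeddings for three refinement levels.
rng = np.random.default_rng(0)
levels = [rng.normal(size=768) for _ in range(3)]   # coarse, medium, fine
total_steps = 50
for step in (0, 25, 49):
    emb = coarse_to_fine_schedule(levels, step, total_steps)
    print(step, emb[:3])  # conditioning drifts from coarse toward fine
```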
Runway’s Gen-4 model represents a significant advancement in text-to-video generation. With more control and higher fidelity than previous iterations, Gen-4 helps users create cinematic visuals from minimal inputs, pushing creative boundaries in filmmaking and animation.
Complementing Gen-4, tools like Act-One and Frames are designed for interactive content creation. These platforms enable users to manipulate AI-generated content in real time, offering flexibility and precision in crafting visual narratives.
Runway’s research into domain generalization explores how diffusion-model features can separate samples from unseen domains without relying on labeled data. This makes AI systems more adaptable, particularly in environments with unpredictable or diverse inputs.
By identifying latent domain structures, Runway augments existing classifiers with additional representations. This helps models perform more reliably across different domains, making them useful for real-world deployment where data variability is the norm.
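A minimal sketch of this recipe, with assumed stand-ins for the diffusion features, the clustering step (here plain k-means), and the classifier, might look like the following: cluster unlabeled features to expose latent domains, turn the soft cluster assignments into an extra representation, and append it to the classifier's input.

```python
# Hedged sketch: discover latent domains from unlabeled features, then augment
# a classifier with that representation. Stand-in data and models throughout.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def soft_assignments(distances):
    """Turn distances-to-cluster-centers into a soft domain representation."""
    logits = -distances
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Stand-in for per-image diffusion features (e.g. pooled denoiser activations).
features = rng.normal(size=(1000, 64))
labels = (features[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)

# 1) Discover latent domain structure without any domain labels.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(features)
domain_repr = soft_assignments(kmeans.transform(features))

# 2) Augment the classifier's input with the extra domain representation.
augmented = np.concatenate([features, domain_repr], axis=1)
clf = LogisticRegression(max_iter=1000).fit(augmented, labels)

# Unseen samples get the same treatment before classification.
test = rng.normal(size=(5, 64))
test_aug = np.concatenate([test, soft_assignments(kmeans.transform(test))], axis=1)
print(clf.predict(test_aug))
```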
Runway extends its research impact through RNA Sessions—an ongoing series exploring the intersections of AI, art, and innovation. These events invite thought leaders to discuss breakthroughs and future directions in generative media.
Collaborations with organizations such as Lionsgate and the Tribeca Festival highlight Runway’s commitment to practical, real-world integration of AI tools. These partnerships help drive adoption of generative technologies in professional creative workflows.