In June 2023, we pulled back the curtain on GAIA-1, our ground-breaking generative model designed specifically for autonomous driving. Today, we're thrilled to announce a major upgrade. Say hello to a GAIA-1 that is more efficient, more capable, and more versatile, now with over 9 billion trainable parameters. 🎉
Our engineers have been hard at work optimizing GAIA-1 to generate higher-resolution video and training at larger scale to improve the world model's overall quality. We have now scaled GAIA-1 to a massive 9 billion parameters, pushing the boundaries of what is possible in autonomous driving technology.
The Technical Grit
GAIA-1 uses specialized encoders for video, text, and action inputs, fusing them into a shared token representation. Its world model, an autoregressive transformer, predicts the next set of image tokens in a sequence, conditioned on past image tokens as well as the context provided by text and action tokens. At 6.5 billion parameters, this world model accounts for the bulk of GAIA-1's capacity and drives its holistic scene understanding.
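To make the flow concrete, here is a minimal, purely illustrative sketch of that autoregressive loop: text and action conditioning tokens are prepended to past image tokens, and the world model predicts one frame's worth of new image tokens, feeding each prediction back into the context. All names (`encode_text`, `encode_action`, `world_model`) and the toy vocabulary sizes are hypothetical stand-ins, not GAIA-1's actual API.

```python
# Illustrative sketch of GAIA-1-style autoregressive world modelling.
# All function names and sizes are hypothetical stand-ins, not the real model.

import random

VOCAB_SIZE = 16        # toy image-token vocabulary (the real one is far larger)
TOKENS_PER_FRAME = 4   # toy token count per frame


def encode_text(prompt):
    # Stand-in text encoder: map each word to a token id.
    return [hash(w) % VOCAB_SIZE for w in prompt.split()]


def encode_action(speed, steering):
    # Stand-in action encoder: discretise the driving action into tokens.
    return [int(speed) % VOCAB_SIZE, int(steering * 10) % VOCAB_SIZE]


def world_model(context):
    # Stand-in for the autoregressive transformer: returns the next
    # image token given the full token context (here, a seeded toy RNG).
    rng = random.Random(sum(context))
    return rng.randrange(VOCAB_SIZE)


def predict_next_frame(past_image_tokens, text_tokens, action_tokens):
    """Predict one future frame, one image token at a time."""
    context = list(text_tokens) + list(action_tokens) + list(past_image_tokens)
    new_frame = []
    for _ in range(TOKENS_PER_FRAME):
        nxt = world_model(context)
        new_frame.append(nxt)
        context.append(nxt)  # feed each predicted token back into the context
    return new_frame


past = [1, 2, 3, 4]  # tokens of previously observed frames
frame = predict_next_frame(past, encode_text("rainy night"), encode_action(5, 0.2))
print(len(frame))  # one frame's worth of predicted tokens
```

In the real system the predicted image tokens would then be decoded back into video frames; the sketch only shows how conditioning and autoregression interleave.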
Beyond offering unparalleled controllability and realism in generating driving scenarios, GAIA-1 opens new avenues for training and validating autonomous driving systems. It can generate scenes with varying traffic levels, incorporate weather changes, and model interactions with other dynamic agents on the road, making it an invaluable tool in the evolution of autonomous driving.
The Road Ahead
We’re not stopping here. Our next steps include extending the model’s capabilities to provide a 360-degree perspective and improving its inference efficiency. We are dedicated to pushing the envelope further, making our technology even more applicable and effective in real-world scenarios.
Read the Full Technical Report
For those interested in diving into the nitty-gritty, our full technical report is available on arXiv.