Helm.ai, a leading provider of advanced AI software for high-end ADAS, Level 4 autonomous driving, and robotic automation, has announced the launch of VidGen-1, a groundbreaking generative AI model. VidGen-1 is designed to produce highly realistic video sequences of driving scenes, significantly enhancing autonomous driving development and validation. This innovative AI technology builds upon Helm.ai’s previous introduction of GenSim-1, which focuses on AI-generated labeled images. VidGen-1 is poised to revolutionize both prediction tasks and generative simulation in the autonomous driving industry.
The generative AI video model is trained on thousands of hours of diverse driving footage. It combines advanced deep neural network (DNN) architectures with Deep Teaching, Helm.ai's efficient unsupervised training technology, to create realistic video sequences of driving scenes. These videos are produced at a resolution of 384 x 640 pixels, at variable frame rates of up to 30 frames per second, and can be several minutes long. VidGen-1 can generate videos without any input prompt, sampling scenes at random, or can be prompted with a single image or an input video.
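Helm.ai has not published a public API for VidGen-1, so the interface below is purely a hypothetical sketch: the class and method names are invented for illustration. It simply makes the three generation modes described above concrete, which are unconditional sampling, prompting with a single image, and prompting with an input video.

```python
# Hypothetical sketch only: Helm.ai has not released a public API for
# VidGen-1. The names and signatures below are invented to illustrate
# the three generation modes described in the announcement.

from dataclasses import dataclass


@dataclass
class VideoSpec:
    height: int = 384          # output resolution quoted for VidGen-1
    width: int = 640
    fps: int = 30              # frame rate is variable, up to 30 fps
    duration_s: float = 120.0  # generated clips can run several minutes


class DrivingVideoGenerator:
    """Hypothetical wrapper around a generative driving-video model."""

    def sample(self, spec: VideoSpec, seed: int | None = None):
        """Unconditional mode: sample a driving scene at random."""
        raise NotImplementedError  # model weights are not public

    def sample_from_image(self, image, spec: VideoSpec):
        """Prompted mode: roll a scene forward from a single frame."""
        raise NotImplementedError

    def sample_from_video(self, frames, spec: VideoSpec):
        """Prompted mode: extend an existing input video clip."""
        raise NotImplementedError
```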
One of the key features of VidGen-1 is its ability to generate videos of driving scenes across different geographies, camera types, and vehicle perspectives. The model excels at producing highly realistic appearances and temporally consistent object motion. Moreover, it learns and reproduces human-like driving behaviors, generating motions of the ego-vehicle and surrounding agents that adhere to traffic rules. This allows the model to simulate realistic video footage of a wide range of scenarios across multiple international cities, encompassing urban and suburban environments, diverse vehicles, pedestrians, bicyclists, intersections, turns, weather conditions such as rain and fog, and illumination effects including glare and night driving. It even renders accurate reflections on wet road surfaces, reflective building walls, and the hood of the ego-vehicle.
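Helm.ai has not disclosed how, or whether, these scene attributes can be specified explicitly at generation time. Purely as an illustration, the sketch below enumerates the kinds of variation the announcement describes as a hypothetical scenario descriptor that one might use to organize or filter generated clips; all names here are assumptions, not part of any published interface.

```python
# Illustrative only: a hypothetical descriptor for the scene variation
# VidGen-1 is said to cover. Nothing here reflects a published Helm.ai
# interface; it just makes the attribute space concrete.

from dataclasses import dataclass, field
from enum import Enum


class Weather(Enum):
    CLEAR = "clear"
    RAIN = "rain"
    FOG = "fog"


class Illumination(Enum):
    DAY = "day"
    NIGHT = "night"
    GLARE = "glare"


@dataclass
class ScenarioTags:
    environment: str                 # "urban" or "suburban"
    weather: Weather
    illumination: Illumination
    camera_view: str                 # camera type / vehicle perspective
    agents: list[str] = field(default_factory=list)


# Example tag set for one generated clip: a rainy night drive in a city.
example = ScenarioTags(
    environment="urban",
    weather=Weather.RAIN,
    illumination=Illumination.NIGHT,
    camera_view="front-facing",
    agents=["vehicles", "pedestrians", "bicyclists"],
)
```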
Video data is considered the most information-rich sensory modality in autonomous driving, and it comes from the most cost-effective sensor: the camera. That richness cuts both ways, however. The high dimensionality of video makes AI video generation a challenging task, since every pixel in every frame must stay consistent with the rest of the scene as it evolves. Achieving high image quality while accurately modeling the dynamics of a moving scene is a well-known difficulty in video generation applications.
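To make that dimensionality concrete, here is a quick back-of-the-envelope calculation using VidGen-1's quoted output format: a single minute of video spans over 1.3 billion pixel values, all of which must remain mutually consistent across space and time for the result to look real.

```python
# Back-of-the-envelope dimensionality of video at VidGen-1's quoted
# output format: 384 x 640 RGB frames at 30 frames per second.

height, width, channels = 384, 640, 3
fps = 30

values_per_frame = height * width * channels   # 737,280
values_per_second = values_per_frame * fps     # 22,118,400
values_per_minute = values_per_second * 60     # 1,327,104,000

print(f"{values_per_frame:,} values per frame")
print(f"{values_per_second:,} values per second")
print(f"{values_per_minute:,} values per minute of video")
```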
VidGen-1 offers automakers significant scalability advantages over traditional non-AI simulations. It enables rapid asset generation and imbues the agents in the simulation with sophisticated real-life behaviors. This approach not only reduces development time and cost but also effectively bridges the “sim-to-real” gap, providing a highly realistic and efficient solution that greatly broadens the applicability of simulation-based training and validation.
VidGen-1 is set to provide the autonomous driving industry with an invaluable tool for developing and validating autonomous systems. By offering realistic and efficient video simulations, Helm.ai is paving the way for faster, more cost-effective advancements in autonomous driving technology.