Nvidia unveiled three major components for physical AI at GTC Taipei: the Cosmos 3 world model, the Alpamayo 2 Super driving model, and an open humanoid robot reference platform. The announcement signals the company's push beyond data centers into robotics and autonomous systems.
Nvidia used its GTC Taipei event to consolidate its physical AI strategy, introducing tools designed for robots, autonomous vehicles, and video systems.
The flagship offering, Cosmos 3, is a next-generation world model: a foundational AI system that can understand and predict physical environments. World models serve as a core component for training robots and autonomous systems by simulating real-world scenarios.
Alpamayo 2 Super is a substantially upgraded driving model that Nvidia positions for autonomous vehicle development. The scaled-up version builds on previous iterations to handle more complex driving scenarios and decision-making tasks.
The humanoid robot reference platform marks Nvidia's direct entry into robotics standardization. By providing an open reference design, Nvidia aims to accelerate development across the industry rather than competing as a hardware manufacturer.
The announcements reflect a broader industry shift toward physical AI, meaning systems that operate in the real world rather than in purely digital environments. Nvidia's strategy emphasizes software, models, and platforms rather than end-user robots, positioning the company as an infrastructure provider.
The timing aligns with increased investment from roboticists and autonomous vehicle developers seeking unified AI frameworks. Partnerships, including ABB Robotics' collaboration with Nvidia, underscore growing adoption of these tools across manufacturing and logistics.
These releases follow Nvidia's recent expansion into consumer PC chips, demonstrating the company's broadening portfolio beyond its dominant GPU business. The physical AI push represents a longer-term bet on industrial and commercial applications where AI systems must interact with tangible environments.
Nvidia's reference platform approach suggests the company views standardization as essential for ecosystem growth. By opening the humanoid design, Nvidia enables developers to focus on AI software rather than mechanical engineering, potentially accelerating deployment timelines across industries.