Amazon announces Nova, a new family of multimodal AI models

At its re:Invent 2024 conference, Amazon Web Services (AWS), Amazon’s cloud computing division, announced a new family of generative, multimodal AI models called Nova.

There are four text-oriented models in total: Micro, Lite, Pro and Premier. Micro, Lite and Pro are available to AWS customers today, while Premier will launch in the first quarter of 2025, Amazon CEO Andy Jassy said on stage.

In addition, there is an image generation model, Nova Canvas, and a video generation model, Nova Reel. Both are publicly available today.

“We continued to work on our own frontier models,” Jassy said, “and those frontier models have made tremendous progress in the last four to five months. And we thought if we got value from them, you probably would get value from them too.”

Micro, Lite, Pro and Premier

The text-focused Nova models differ primarily in their capabilities and sizes.

Micro, which can only take in and output text, offers the lowest latency of the group: it processes text and generates responses the fastest. Lite can process image, video and text input reasonably quickly. Amazon says Pro offers the “best combination of accuracy, speed and cost” for a range of tasks. And Premier is the most powerful and is designed for complex workloads.

Like Lite, Pro and Premier can analyze text, images or videos.

Jassy claims the Nova models are among the fastest in their class – and the cheapest to run. They are available in Amazon Bedrock, AWS’s AI development platform, where they can be fine-tuned and distilled for greater speed and lower costs.

“We optimized these models to work with proprietary systems and APIs, making it much easier for you to perform multiple orchestrated automated steps – agent behavior – with these models,” Jassy added. “That’s why I find this very convincing.”

Canvas and Reel

Canvas and Reel are Amazon’s strongest play in generative media yet.

Canvas allows users to generate and edit images using prompts and provides controls for the color scheme and layout of the generated image. Reel, the more ambitious of the two models, creates videos up to six seconds long from prompts. Reel also allows users to adjust camera movement to create videos with pans, 360-degree rotations and zooms.

Reel is currently limited to six-second videos, but a version that can generate two-minute videos is “coming soon,” according to Amazon.

Jassy emphasized that both Canvas and Reel have “built-in” controls for responsible use, including watermarking and content moderation. “(We’re trying to limit) the generation of harmful content,” he said.

So what’s next for Nova? Jassy said Amazon is working on a speech-to-speech model for the first quarter of 2025 and an “any-to-any” model to be available around mid-2025.

“You can input text, voice, images or video and output text, voice, images and video,” Jassy said of the any-to-any model. “This is the future of how frontier models are built and consumed.”
