
Baseten Expands Agentic AI Model Support and Deepens Cloud Ecosystem Ties

Baseten intensified its push into AI infrastructure this week, expanding support for advanced models and deepening ties with major ecosystem partners. The company added production deployment for Poolside’s open-weight agentic coding models, Laguna XS.2 and Laguna M.1, on its inference platform.

Baseten emphasized optimized, low-latency serving and production-grade infrastructure for these models, targeting teams building AI-enabled software development tools. This move is aimed at capturing workloads from enterprises seeking flexible hosting for code-generation and automated software development use cases.

The company also spotlighted NVIDIA’s new Nemotron 3 Nano Omni multimodal foundation model, which unifies audio, image, text, and video inputs in a single architecture. Baseten framed the model as well suited for advanced agentic workflows, including computer-use agents, document intelligence, and large-scale video and audio reasoning.

By aligning with unified multimodal architectures, Baseten appears positioned to support more complex agent orchestration while reducing pipeline complexity for customers. If integrated into its platform, models like Nemotron 3 Nano Omni could enhance Baseten’s appeal in multimodal AI deployment for enterprise clients.

In parallel, Baseten featured prominently at the Google Cloud Next conference, participating in official sessions and co-hosting an opening event with House of Kube and Google Cloud. Company leaders presented on new AI capabilities on Google Kubernetes Engine and strategies for scaling in the “agentic era,” reinforcing Baseten’s role within the Google Cloud ecosystem.

Baseten further expanded its model catalog by adding support for the Kimi K2.6 large language model with a series of performance optimizations. These included KV-aware routing, NVFP4 weights tuned for NVIDIA Blackwell GPUs, multimodal hierarchical caching, and prefill-decode disaggregation to improve efficiency and latency.
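The optimizations listed above are serving-layer techniques rather than model changes. KV-aware routing, for instance, directs each incoming request to the replica that already holds a matching prefill (KV) cache, so the shared prompt prefix does not have to be recomputed. A minimal sketch of that routing idea, with hypothetical names that are not Baseten's actual API:

```python
# Illustrative sketch of KV-aware routing (hypothetical, not Baseten's API):
# send a request to the replica whose cached prompts share the longest
# prefix with the new prompt, so the server can reuse its KV cache
# instead of recomputing the prefill from scratch.

def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared leading substring of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def route(prompt: str, replicas: dict[str, list[str]]) -> str:
    """Pick the replica whose cached prompts best match the new prompt.

    Ties break on replica name order so routing is deterministic."""
    best_name, best_len = None, -1
    for name in sorted(replicas):
        score = max(
            (common_prefix_len(prompt, cached) for cached in replicas[name]),
            default=0,
        )
        if score > best_len:
            best_name, best_len = name, score
    return best_name

caches = {
    "replica-a": ["You are a helpful assistant. Summarize:"],
    "replica-b": ["Translate the following to French:"],
}
print(route("You are a helpful assistant. Summarize: the news", caches))
# -> replica-a
```

A production router would weigh cache affinity against replica load and use token-level (not character-level) prefix matching, but the trade-off is the same: reuse cached prefill work where possible, fall back to a least-loaded replica otherwise.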

Collectively, these developments strengthen Baseten’s positioning as an infrastructure-centric provider for high-performance generative and agentic AI workloads. While no usage or revenue metrics were disclosed, the company’s focus on ecosystem partnerships and advanced inference capabilities may support future enterprise adoption and recurring platform demand.
