
VAST Data Unveils Inference Architecture to Power NVIDIA’s New Context Memory Platform for Agentic AI

VAST Data has introduced a new inference architecture that underpins the NVIDIA Inference Context Memory Storage Platform, positioning the company at the center of how long-lived, agentic AI workloads manage context at scale. By running the VAST AI Operating System (AI OS) natively on NVIDIA BlueField-4 DPUs and leveraging Spectrum-X Ethernet networking, VAST is redesigning the inference data path so that key-value (KV) cache data, the core memory of large language models and AI agents, can be stored, shared, and reused across nodes with far higher efficiency. The architecture embeds data services directly into GPU servers and complementary data nodes, removing the client-server contention and extra data hops that slow time-to-first-token as concurrency grows. Combined with VAST's Disaggregated Shared-Everything (DASE) architecture, this lets every host tap into a globally coherent context namespace, enabling direct, low-latency data flow from GPU memory to persistent NVMe storage over RDMA fabrics. That path matters as AI economics shift from pure compute performance to context continuity and utilization.
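
The core idea, a shared namespace in which KV cache entries computed for a prompt prefix are stored once and reused by any node, is easiest to see in a small sketch. The Python below is purely illustrative and is not VAST's or NVIDIA's API: ContextStore, put, get, and serve are hypothetical names, and an in-memory dict stands in for the real RDMA/NVMe data path. It shows only the reuse mechanic the article describes: a repeated prefix becomes a cache load instead of a fresh prefill.

```python
"""Illustrative sketch of prefix-keyed KV cache reuse across inference nodes.

Hypothetical names throughout; this models the concept, not a real product API.
"""

import hashlib


class ContextStore:
    """Stand-in for a globally coherent KV-cache namespace shared by all hosts."""

    def __init__(self) -> None:
        # prefix hash -> serialized KV tensors (a dict stands in for NVMe over RDMA)
        self._cache: dict[str, bytes] = {}

    @staticmethod
    def _key(token_ids: list[int]) -> str:
        # Hash the token prefix so identical conversation histories
        # resolve to the same cache entry on every node.
        return hashlib.sha256(str(token_ids).encode("utf-8")).hexdigest()

    def put(self, token_ids: list[int], kv_blob: bytes) -> None:
        self._cache[self._key(token_ids)] = kv_blob

    def get(self, token_ids: list[int]) -> bytes | None:
        return self._cache.get(self._key(token_ids))


def serve(store: ContextStore, token_ids: list[int]) -> bytes:
    """Reuse cached KV state for this prefix if present; otherwise prefill."""
    kv = store.get(token_ids)
    if kv is not None:
        return kv  # cache hit: skip prefill, load the stored KV state directly
    # Cache miss: placeholder for the expensive prefill pass over the prefix.
    kv = b"kv-tensors-for:" + str(token_ids).encode("utf-8")
    store.put(token_ids, kv)
    return kv


if __name__ == "__main__":
    store = ContextStore()
    history = list(range(1024))            # a long shared conversation prefix
    serve(store, history)                  # first turn: prefill, then store
    assert store.get(history) is not None  # any other node can now reuse it
```

In a real deployment the interesting engineering lives in what the dict hides: coherence across hosts, eviction and lifecycle policy, and moving gigabytes of KV state between GPU memory and NVMe fast enough that the hit path stays cheap.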

For VAST Data, the launch expands its role from high-performance storage into the core operating fabric of enterprise and AI-native NVIDIA AI factories, with direct implications for customers' infrastructure efficiency, GPU utilization, and total cost of ownership. The company is targeting organizations moving from experimental AI to regulated, revenue-generating services, where context must be governed with policy, isolation, auditability, lifecycle controls, and optional data protection without sacrificing KV cache speed. VAST AI OS is designed to support this by preventing costly "rebuild storms," in which evicted context must be recomputed from scratch, reducing idle GPU time, and improving infrastructure efficiency as context sizes and multi-session concurrency surge. VAST executives emphasize that inference is increasingly constrained by memory and data movement rather than raw compute, and that clusters able to move and govern context at line rate will hold a structural advantage; the rough arithmetic below shows why. NVIDIA leadership, in turn, highlights VAST's role in enabling a coherent, high-throughput data plane for multi-turn, multi-user inference as agentic workloads grow. VAST will detail its strategy, roadmap, and ecosystem at its inaugural user conference, VAST Forward, in February 2026, where customers and partners will take part in technical sessions and labs focused on scaling AI and data infrastructure on the VAST AI OS and DASE architecture.
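
To ground the memory-versus-compute claim, here is a back-of-envelope comparison in Python. Every number in it (model shape, parameter count, GPU throughput, fabric bandwidth) is an illustrative assumption, not a figure from VAST or NVIDIA; it uses only the standard transformer approximations that prefill costs roughly 2 x parameters FLOPs per token and that the KV cache holds K and V tensors for every layer and KV head. The point is the order of magnitude: reloading a stored KV cache over a fast fabric can be tens of times cheaper than recomputing a long prefix.

```python
# Back-of-envelope: recomputing a long prefix vs. reloading its KV cache.
# All model dimensions and hardware rates are illustrative assumptions.

N_LAYERS   = 80       # transformer layers (assumed 70B-class model)
N_KV_HEADS = 8        # grouped-query attention KV heads (assumed)
HEAD_DIM   = 128      # per-head dimension (assumed)
BYTES_ELEM = 2        # fp16/bf16 KV cache precision
PARAMS     = 70e9     # parameter count (assumed)

CONTEXT_TOKENS = 32_768  # length of the resumed conversation prefix
GPU_FLOPS      = 500e12  # assumed sustained prefill throughput, FLOP/s
FABRIC_BPS     = 50e9    # assumed effective RDMA bandwidth, bytes/s
                         # (roughly a 400 Gb/s Ethernet class link)

# KV cache bytes per token: K and V tensors across every layer and KV head.
kv_bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_ELEM
kv_total_bytes = kv_bytes_per_token * CONTEXT_TOKENS

# Standard approximation: prefill costs ~2 * params FLOPs per token.
prefill_flops = 2 * PARAMS * CONTEXT_TOKENS

recompute_s = prefill_flops / GPU_FLOPS
reload_s = kv_total_bytes / FABRIC_BPS

print(f"KV cache size : {kv_total_bytes / 1e9:.1f} GB")
print(f"Recompute     : {recompute_s:.1f} s of GPU time")
print(f"Reload        : {reload_s:.2f} s over the fabric")
print(f"Speedup       : {recompute_s / reload_s:.0f}x")
```

Under these assumptions the 32k-token prefix carries roughly 10.7 GB of KV state: recomputing it burns about nine seconds of GPU time, while reloading it takes a fraction of a second, which is the gap a "rebuild storm" multiplies across every concurrent session.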
