FriendliAI spent the week reinforcing its position in high-performance AI inference and infrastructure for open-weight models. The company highlighted new integrations, model deployments, and third-party recognition that together underscore its focus on speed, efficiency, and developer-centric tooling.
FriendliAI introduced a guide for integrating its platform with the OpenClaw framework, positioning itself as an integration layer for AI assistants and multi-agent systems. The guide’s setup script is designed to streamline provider configuration, credential handling, fallback behavior, and channel routing across Friendli Model APIs and Dedicated Endpoints.
The OpenClaw-focused configuration supports high-performance inference on frontier open-weight models such as Z.ai’s GLM-5.1, Kimi K2.6, NVIDIA Nemotron 3, and DeepSeek V4. FriendliAI also emphasizes specialized agents that send deeper reasoning tasks and low-latency responses to different models, with multiple fallback options to improve resilience and control costs.
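The routing-with-fallback pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not FriendliAI’s or OpenClaw’s actual configuration: the `ROUTES` table, the task labels, and the `call_model` callable are hypothetical, and the model identifiers are placeholder strings based on the names in the text rather than confirmed API model IDs.

```python
from typing import Callable, Dict, List

# Hypothetical routing table: each task type maps to an ordered list of
# model identifiers (primary first, then fallbacks). The strings are
# placeholders, not confirmed API model IDs.
ROUTES: Dict[str, List[str]] = {
    "deep_reasoning": ["glm-5.1", "deepseek-v4", "kimi-k2.6"],
    "low_latency": ["nemotron-3", "kimi-k2.6"],
}


def route_with_fallback(
    task_type: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> str:
    """Try each model configured for task_type in order; return the first success."""
    errors = []
    for model in ROUTES[task_type]:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a real inference call; simulates a primary-model outage."""
    if model == "glm-5.1":
        raise TimeoutError("simulated timeout")
    return f"[{model}] {prompt}"


if __name__ == "__main__":
    # The primary model fails, so the request falls through to the next one.
    print(route_with_fallback("deep_reasoning", "plan the refactor", fake_call))
```

Keeping the fallback order in data rather than in code is one way to adjust cost and resilience trade-offs without touching the calling logic.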
In parallel, FriendliAI expanded its model portfolio with Google DeepMind’s Gemma‑4‑31B‑it, which it highlighted on its APIs and Dedicated Endpoints. The dense, instruction-tuned multimodal model is promoted as a top performer on Artificial Analysis leaderboards for output speed, time-to-first-token, and overall response times in coding, reasoning, and document understanding workloads.
Benchmark figures cited for Gemma‑4‑31B‑it include strong scores on agentic coding, math and reasoning, and multimodal benchmarks such as LiveCodeBench, AIME 2026, GPQA Diamond, MMMU Pro, and MATH‑Vision. The model can be accessed in serverless mode or via one-click Dedicated Endpoints, reinforcing FriendliAI’s strategy of offering flexible deployment options for enterprise users.
The company also focused on AI coding agents, arguing that inference performance, rather than agent design, is often the main bottleneck. FriendliAI’s stack is presented as delivering leading latency and throughput, along with tool calling and structured outputs, for models including GLM-5.1, Kimi K2.6, Nemotron 3, and DeepSeek V4, as well as for other agent-focused workloads.
FriendliAI stresses support for context reuse, multi-step refactors, and repository-wide edits through its Model APIs and Dedicated Endpoints. Integration with Claude Code and other popular coding agents is framed as a simple environment-variable switch, suggesting a low-friction path for developers to adopt its infrastructure.
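As a rough illustration of what an environment-variable switch can look like in practice, the sketch below resolves an agent’s inference provider entirely from the environment, so changing providers requires no code change. The variable names `AGENT_BASE_URL` and `AGENT_API_KEY`, the `ProviderConfig` type, and the example URLs are all hypothetical; they are not the variables used by Claude Code or documented by FriendliAI.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Placeholder default endpoint; not a real provider URL.
DEFAULT_BASE_URL = "https://api.example.com/v1"


@dataclass(frozen=True)
class ProviderConfig:
    base_url: str
    api_key: str


def load_provider_config(env: Mapping[str, str] = os.environ) -> ProviderConfig:
    """Build the provider configuration from environment variables.

    AGENT_BASE_URL and AGENT_API_KEY are hypothetical names used only
    for this sketch.
    """
    base_url = env.get("AGENT_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    api_key = env.get("AGENT_API_KEY", "")
    if not api_key:
        raise ValueError("AGENT_API_KEY is not set")
    return ProviderConfig(base_url=base_url, api_key=api_key)


if __name__ == "__main__":
    # Pointing the agent at a different provider is only a change of environment.
    switched = load_provider_config(
        {"AGENT_BASE_URL": "https://inference.example.net/v1/", "AGENT_API_KEY": "demo"}
    )
    print(switched.base_url)
```

Passing the environment in as a mapping keeps the function easy to test without modifying the real process environment.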
Further broadening its coding portfolio, FriendliAI enabled one-click deployment of Poolside’s Laguna XS.2, a 33B-parameter Mixture-of-Experts agentic coding model, on Dedicated Endpoints. Laguna XS.2 is offered in four precision variants (BF16, FP8, INT4, and NVFP4) to match different GPU architectures such as NVIDIA Hopper and Blackwell.
Laguna XS.2 is positioned for long-horizon development tasks, with reported benchmark results on SWE-bench datasets and features such as interleaved reasoning and a 256-expert design aimed at efficient inference. These capabilities align with FriendliAI’s goal of serving high-end developer and enterprise workloads that demand both performance and cost-effective scaling.
Beyond product updates, FriendliAI gained external validation through its inclusion in CB Insights’ 10th annual AI 100 list of promising private AI firms. The recognition highlights its role in powering production-scale inference for developers, startups, and enterprises, and may enhance its visibility with customers, partners, and investors.
Taken together, FriendliAI’s new integrations, expanded model support, and third-party recognition point to a week of strategic progress in cementing its position as a specialized AI infrastructure provider. The company’s emphasis on open-weight models, low-latency inference, and flexible deployment options could support broader adoption and strengthen its future prospects in the competitive generative AI stack.

