FriendliAI – Weekly Recap

FriendliAI featured prominently this week as it expanded support for cutting-edge open-weight models and showcased its cost-efficient AI inference infrastructure. The company highlighted new benchmarks in security-focused fuzzing, long-horizon agentic workloads, and high-performance coding agents, underscoring a strategy centered on low-latency, high-throughput model serving.


In a security collaboration with Team Atlanta, winners of the 2025 DARPA AIxCC competition, GLM-5 on FriendliAI’s infrastructure reportedly found 35 bugs on a 54-vulnerability benchmark at about $392 in compute cost. This compared with 8 bugs at roughly $3,264 for traditional fuzzing and 41 bugs at $2,400–$3,100 using Gemini-2.5-Pro with Gondar.

The results position open-weight models on FriendliAI as a materially cheaper option for LLM-guided fuzzing, particularly in scenarios with structured inputs where random mutation often fails. Emphasis on cost-per-bug metrics and integration with tools like Gondar and Jazzer may bolster FriendliAI’s appeal among cybersecurity teams running intensive vulnerability discovery workloads.

FriendliAI also spotlighted support for Alibaba Cloud’s Qwen3.6 family via its Dedicated Endpoints, offering one-click deployment on reserved GPU capacity. It framed the sparse Qwen3.6-35B-A3B as a cost-efficient, high-throughput option and the dense Qwen3.6-27B as a higher-performing flagship model for coding and agentic use cases.

Benchmark data shared by the company indicate Qwen3.6-27B outperforms the A3B variant on SWE-bench Verified, Terminal-Bench 2.0, MMMU, and AIME26, with both models supporting “Thinking Preservation” for multi-step agents. FriendliAI claims 2–5x throughput improvements and 50–90% GPU cost reductions at 99.99% uptime, targeting enterprises deploying large-scale coding, multimodal, and automation workflows.

The company further expanded its Dedicated Endpoints by adding Moonshot AI’s Kimi K2.6, a 1T-parameter multimodal mixture-of-experts model with a 256K context window. Positioned for autonomous coding agents, deep research, web browsing, and rich multimodal understanding, K2.6 is marketed as competitive with leading proprietary models across software engineering and multi-agent tasks.

FriendliAI underscored that its private, autoscaling, high-throughput infrastructure abstracts away lower-level tools such as vLLM and SGLang to simplify deployment. This turnkey approach is aimed at enterprises seeking powerful open-source models without managing complex inference stacks, potentially driving higher utilization of FriendliAI’s dedicated capacity.

On the serverless side, FriendliAI promoted GLM-5.1 from Z.ai, optimized for long-horizon software engineering and agentic workloads. Third-party trackers Artificial Analysis and OpenRouter were cited as ranking FriendliAI among leaders in output speed, latency, and structured outputs for this model.

GLM-5.1 is described as outperforming Anthropic’s Claude Opus 4.6 on SWE-Bench Pro and CyberGym benchmarks while operating at lower cost, reinforcing FriendliAI’s focus on performance-sensitive inference. The company also shared technical content on running GLM-5.1 on its platform to reduce adoption friction and cultivate developer usage.

FriendliAI’s week also included sponsorship of NVIDIA Nemotron Developer Days Seoul 2026 and its associated hackathon, where it provided compute credits to teams. These efforts deepen ties with NVIDIA’s Nemotron ecosystem and the Korean AI developer community, enhancing brand visibility among early adopters and potential enterprise customers.

By centering its offering on open-weight, high-end models and emphasizing measurable performance and cost advantages, FriendliAI is strengthening its position in AI inference and security-focused workloads. Overall, the week’s updates point to a company pushing hard on technical differentiation and ecosystem engagement to support future growth in infrastructure and model-serving services.
