Mercor Expands APEX Benchmark to Measure Frontier AI Performance in High-Value Professional Roles
Mercor has announced a major expansion of APEX, its benchmark designed to evaluate whether frontier AI models can perform economically valuable work in four high-value professions: investment banking associate, management consultant, big law associate, and primary care physician. The benchmark now comprises 400 real-world cases that reflect typical professional deliverables. According to the latest results, OpenAI’s GPT-5 is currently the best-performing model overall. Investment banking remains the most challenging domain, with top models scoring around 60%. Mercor also reports measurable progress across models: Anthropic’s Claude Opus 4.5 improved by nearly 12 points over Opus 4.1, and Google’s Gemini 3 Pro is closing the gap with GPT-5. Despite these gains, the company emphasizes that a significant gap remains between current AI capabilities and full economic usefulness in these complex roles. To support broader research and development, Mercor is open-sourcing 100 cases (25 per domain) along with its evaluation harness.
For investors, this update positions Mercor as a notable infrastructure and benchmarking provider in the AI ecosystem, particularly for assessing AI’s readiness for high-value professional services work. A more rigorous and widely adopted benchmark could enhance Mercor’s strategic relevance to AI labs, enterprise users, and investors who need standardized measures of model performance and economic impact. If APEX becomes a reference standard, it could create monetization opportunities around benchmarking services, data products, and tooling, while also driving partnerships with leading model developers. Open-sourcing part of the dataset and the evaluation harness may accelerate adoption and ecosystem engagement, though it could moderate direct monetization of the benchmark itself. Overall, the expanded APEX benchmark strengthens Mercor’s position in the rapidly evolving AI evaluation space and may support future revenue growth tied to AI model assessment and enterprise deployment decisions.