According to a recent LinkedIn post from Together AI, the company’s speech-to-text models now rank first and second for transcription speed on the Artificial Analysis Speech to Text leaderboard. The post states that NVIDIA Parakeet TDT 0.6B V3, running on Together AI, processes 303 seconds of audio per second of compute time.
The post also indicates pricing of $1.50 per 1,000 minutes of audio and cites an AA-WER of 4.6% across three real-world datasets. This combination of speed, cost, and accuracy suggests Together AI is positioning its infrastructure as a competitive option for developers building real-time voice agents.
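As a rough illustration of what the quoted figures would imply in practice, the short sketch below converts them into processing time and cost for a given amount of audio. It assumes the 303x speed factor and the $1.50 per 1,000 minutes rate apply linearly at any volume, which the post does not state; the function names are hypothetical and are not part of any Together AI API.

```python
# Back-of-the-envelope arithmetic using the figures quoted in the post.
# Assumption: both figures scale linearly with audio length.

SPEED_FACTOR = 303.0            # seconds of audio processed per second of compute
PRICE_PER_1000_MIN_USD = 1.50   # quoted price per 1,000 minutes of audio


def processing_seconds(audio_seconds: float,
                       speed_factor: float = SPEED_FACTOR) -> float:
    """Estimated compute time, in seconds, to transcribe the given audio."""
    return audio_seconds / speed_factor


def transcription_cost_usd(audio_minutes: float,
                           price_per_1000_min: float = PRICE_PER_1000_MIN_USD) -> float:
    """Estimated cost, in US dollars, to transcribe the given audio."""
    return audio_minutes / 1000.0 * price_per_1000_min


if __name__ == "__main__":
    # One hour of audio: 3,600 seconds, or 60 minutes.
    print(f"Compute time for 1 hour of audio: {processing_seconds(3600):.1f} s")
    print(f"Cost for 1 hour of audio: ${transcription_cost_usd(60):.2f}")
```

Under these assumptions, an hour of audio would take roughly 12 seconds of compute and cost about nine cents, excluding network latency and any pricing minimums.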
For investors, the emphasis on “fast STT” as core infrastructure for AI-native applications points to potential volume-based usage growth if adoption scales among enterprises and startups. Strong leaderboard performance may enhance Together AI’s standing in the AI infrastructure segment, supporting its ability to attract high-value workloads and deepen ecosystem ties with partners such as NVIDIA.

