Tencent just open-sourced Hy3 preview, a model that punches above its weight on coding agents, reasoning, and search—built in ...
Morning Overview on MSN
OpenAI launches GPT-5.5, calling it its most powerful model yet
OpenAI released GPT-5.5 in May 2026, calling it the most capable AI model the company has ever built. The new model sits ...
OpenAI today detailed o3, its new flagship large language model for reasoning tasks. The model’s introduction caps off a 12-day product announcement series that started with the launch of a new ...
Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more Google has claimed the top spot in a ...
MLCommons today released AILuminate, a new benchmark test for evaluating the safety of large language models. Launched in 2020, MLCommons is an industry consortium backed by several dozen tech firms.
Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now A team of Abacus.AI, New York University, ...
Anthropic's new flagship model Claude Opus 4.7 beat every benchmark we threw at it, and eats tokens like a hungry teenager.
SAN FRANCISCO--(BUSINESS WIRE)--Today, MLCommons® announced new results from two industry-standard MLPerf™ benchmark suites: MLPerf Training v3.1 The MLPerf Training benchmark suite comprises full ...
Earlier this week, Meta landed in hot water for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on a crowdsourced benchmark, LM Arena. The incident ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results