The Register on MSN

AI models still suck at math

Just less than before, according to the ORCA test. Exclusive: Current-day LLMs are prediction engines and, as such, they can ...
Yitzi Snow makes his New York Times Crossword debut. The New York Times has launched the Midi, a daily medium-size crossword.
Microsoft Math Solver is a free tool that uses AI to recognize both printed and handwritten math. It’s particularly strong with geometric proofs and interactive graphing, and it pulls learning ...
Microsoft Research has produced a peer-reviewed study showing that the leading AI chatbots lose significant accuracy as conversations grow longer, confirming a frustration familiar to anyone who has ...
Abstract: In the realm of natural language processing, large language models (LLMs) have demonstrated superb performance in human-level reasoning and text generation, which has inspired a large number ...
Five years ago, mathematicians Dawei Chen and Quentin Gendron were trying to untangle a difficult area of algebraic geometry involving differentials, elements of calculus used to measure distance ...
During my time as a learning support math teacher, I always had a daily word problem on my board for students to work on when they first walked into my classroom. To be completely honest, this ...
Amateur mathematicians are using artificial intelligence chatbots to solve long-standing problems, in a move that has taken professionals by surprise. While the problems in question aren’t the most ...
According to Greg Brockman (@gdb) and Terence Tao, GPT-5.2 Pro has reached a significant milestone by independently solving an Erdős problem, a first for large language models (LLMs). This achievement ...
Mathematics can be a tricky subject, and many students struggle to get the hang of it, finding it difficult to solve problems and equations in class. It requires a special sort of attention that one can’t ...
When I first started working with multi-agent collaboration (MAC) systems, they felt like something out of science fiction. It’s a group of autonomous digital entities that negotiate, share context, ...