Rather, these experts say, o1 and other reasoning models might simply be using languages they find most efficient to achieve an objective (or hallucinating).
“The model doesn’t know what language is, or that languages are different,” Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch. “It’s all just text to it.”
Indeed, models don’t directly process words. They use tokens instead. Tokens can be words, such as “fantastic.” Or they can be syllables, like “fan,” “tas” and “tic.” Or they can even be individual characters in words — e.g. “f,” “a,” “n,” “t,” “a,” “s,” “t,” “i,” “c.”
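To see what that looks like in practice, here is a minimal sketch using OpenAI's open-source tiktoken tokenizer, chosen purely for illustration; it is not necessarily the tokenizer o1 uses, and the exact way a word like "fantastic" gets split depends on the tokenizer's vocabulary.

```python
# Illustrative only: shows that a model receives integer token IDs, not words.
# The splits below depend on the chosen encoding; other tokenizers will differ.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "fantastic"
token_ids = enc.encode(text)  # a list of integer IDs -- this is all the model sees
pieces = [enc.decode_single_token_bytes(t).decode("utf-8") for t in token_ids]

print(token_ids)  # the numeric sequence the model actually processes
print(pieces)     # the text fragments each ID stands for
```

Whether the word comes back as a single token or several pieces, the model is only ever manipulating those numbers, which is why researchers like Guzdial argue it has no built-in notion of "English" or "Chinese" as distinct languages.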