
ChatGPT knows a lot, but does it have real knowledge?
Ask ChatGPT what the capital of France is, and it will answer “Paris”. But does that mean it actually possesses knowledge? And if so, what kind of knowledge is it exactly? In her PhD dissertation, TU/e researcher Céline Budding lays out a theoretical framework for examining what large language models really learn from the enormous amounts of data they’re trained on.
Budding never planned to pursue a PhD. “I had done a technical master’s and spent all my time programming — I was pretty much done with it after that,” she says. Until she stumbled upon this project. Rather than building yet another model, the research focuses on understanding and explaining what existing models do. Less computation, more reflection — with clear ties to philosophy. Because how do you explain what a language model actually learns?
Large Language Models
Large Language Models, or LLMs, are computer programs trained to process and generate language. They can write texts, answer questions, or produce summaries. The best-known example is ChatGPT. What they fundamentally do is recognize patterns in language: by observing which words frequently appear together and how sentences are usually structured, they learn which word is likely to come next.
During training, the model processes massive amounts of text: books, articles, websites. It repeatedly tries to predict the most likely next word in a sequence, and by doing this millions of times it becomes increasingly skilled at detecting linguistic patterns. Sometimes LLMs receive additional fine-tuning for specific tasks, such as translation or summarization.
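To make that idea concrete, here is a purely illustrative sketch (not taken from the dissertation) of next-word prediction stripped down to its simplest possible form: counting which word tends to follow which in a toy corpus and always guessing the most frequent continuation. Real LLMs do this with neural networks containing billions of parameters, but the prediction objective is of the same kind.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the books, articles, and websites an LLM is trained on.
corpus = "paris is the capital of france . the capital of france is paris ."

# Count how often each word follows each other word.
follow_counts = defaultdict(Counter)
words = corpus.split()
for current_word, next_word in zip(words, words[1:]):
    follow_counts[current_word][next_word] += 1

def predict_next(word):
    """Return the continuation seen most often in the training text."""
    return follow_counts[word].most_common(1)[0][0]

print(predict_next("capital"))  # -> "of"
print(predict_next("of"))       # -> "france"
```

Even this tiny predictor answers "of" after "capital" and "france" after "of", simply because those continuations occurred most often in its training text; whether anything like knowledge is involved is exactly the question the dissertation addresses.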
Tacit knowledge
“These models can do so much that people sometimes assume they have knowledge — or even intelligence. But is that really the case? And if we do say they have knowledge, what exactly do we mean by that?” Budding explains. She examined the role knowledge plays in human linguistic behavior, diving into philosophical literature. That led her to the concept of tacit knowledge: knowledge we possess but find difficult to articulate. Humans, for example, rely on grammatical knowledge to speak a language, but they apply it unconsciously and can rarely explain its logic.
Language models aren’t given explicit rules or facts; everything they “know” comes from the patterns embedded in their training data. So the real question is: what, exactly, have they learned? And could language models also possess some form of tacit knowledge? If you ask ChatGPT for the capital of France, it replies “Paris.” But does it understand that Paris is located in France and that the two concepts are connected? Or is it simply reproducing a frequently occurring question–answer pattern?
Parrots or thinkers?
The question of whether AI models truly understand anything — or merely repeat patterns — has been around for a long time. Some researchers call AI language models stochastic parrots: parrots that mimic everything without true comprehension. Others attribute qualities like “smart” or “intelligent” to them. Budding argues against buying into the hype and instead urges careful examination of what the models actually do. “We need to study this thoroughly and justify our claims before we start putting labels on anything,” she says.
That’s no small task: these models contain millions of parameters, and the first challenge is identifying what to look for when talking about knowledge. “In my dissertation, I make a proposal for that,” Budding says. She presents a theoretical framework that sets out clear criteria for tacit knowledge. If a language model meets all of these criteria, we can say it has tacit knowledge.
According to her, we should focus on whether models form connections between different inputs — the way humans do. “That way, we can determine whether a model truly knows something about Paris, or whether it treats every question about Paris independently and simply predicts the correct answer.”
Interventions
To determine whether a model truly possesses knowledge, Budding argues we need to look for internal representations of concepts — such as “Paris” or a grammatical rule — inside the model itself. In her thesis, she proposes doing this through interventions. The idea is to modify a model in a controlled way and then observe how its predictions change.
She offers an example: if a model predicts that Paris is the capital of France, you could adjust the model so that it begins predicting that Paris is located in Italy. “The question then becomes: does the model apply this new information consistently? If you ask about the Louvre or the Eiffel Tower, does it treat those locations as Italian too? If the model can make these connections, you might argue it has some form of knowledge.”
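As a rough illustration of what such an intervention could look like in code (a minimal sketch, not Budding's actual method), one can hook into a layer of an open model such as GPT-2, shift its hidden representations by some edit vector, and then check whether related prompts change together. The layer index and the edit vector below are placeholders; deriving an edit that genuinely encodes "Paris is in Italy" is precisely the hard part.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Placeholder edit: in real work this vector would be derived from the model's
# own representations (e.g., a "France -> Italy" direction); zeros change nothing.
edit_direction = torch.zeros(model.config.n_embd)

def intervene(module, inputs, output):
    # GPT-2 blocks return a tuple; the first element holds the hidden states.
    hidden = output[0] + edit_direction
    return (hidden,) + output[1:]

# Attach the intervention to one transformer block (layer 6 is arbitrary).
handle = model.transformer.h[6].register_forward_hook(intervene)

def top_next_token(prompt):
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return tok.decode([logits.argmax().item()])

# Does the edit propagate consistently to facts connected to Paris?
for prompt in ["Paris is the capital of", "The Eiffel Tower is located in"]:
    print(prompt, "->", top_next_token(prompt))

handle.remove()
```

The point of such a setup is the consistency check in the final loop: if an edit only flips the answer to one memorized question but leaves related facts untouched, that speaks against the model having a connected representation of the concept.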
A starting point
But even then, you’re not done. How do you perform an intervention that ensures a truly causal relationship — meaning the effect is genuinely due to your modification? “I discuss these conditions in my dissertation as well. You could write an entire dissertation just on that,” she says.
And there are many more complications. We know, for example, that these models often exhibit bias — can interventions help uncover that? Budding also acknowledges that tacit knowledge is a limited form of knowledge; it leaves open, for instance, whether a model can recognize that certain things are equivalent and treat them as such.
Her work is not a final answer but a beginning — a first step toward methods that test what language models can actually learn, and when we might be justified in calling it “knowledge.” That requires looking not only at the output, she argues, but also at what happens inside the model. That internal structure and its representations offer crucial insights. “In my dissertation, I provide tools to investigate this further, so we can increasingly understand how these models work.”
PhD in the picture
What’s on the cover of your dissertation?
“I drew it myself. I wanted to avoid all the stereotypical AI imagery — robots, strings of zeros and ones. The idea is that we’re trying to find something meaningful inside a complex network, like the concept of a city such as Paris and everything associated with it — the lines and points of light represent that.”
You’re at a birthday party. How do you explain your research in one sentence?
“I investigate whether language models like ChatGPT possess any form of knowledge.”
How do you unwind outside of your research?
“I love knitting; I’ve made lots of clothes while traveling to conferences,” she says, glancing at the hand-knitted vest she’s wearing. “My grandmother taught me, and I picked it back up during my master’s because I spent all my time programming and writing. It was a much-needed change of pace.”
What tip do you wish you had received as a starting PhD candidate?
“It’s completely normal to feel a bit lost in your first year — that’s part of the process. And finding community among other PhD students is really important. The PhD journey can feel lonely at times. Reaching out, talking to people, making connections really helps, even if everyone is working on their own research.”
What’s your next chapter?
“I now work as an AI governance consultant for the Dutch national government. I help organizations deploy AI responsibly, including in the context of the EU AI Act. The question is: if you’re using AI, how do you do it well — avoid risks, follow the rules? Where my PhD was very theoretical, this work is much more focused on practical application.”


