Engineering Junkies

How Does AI Think? The Science Behind Large Language Models

How does AI think? Well, it predicts the next word, over and over, from patterns it absorbed reading the entire internet.

Engineering Junkies by Engineering Junkies
02/06/2026
in AI

Quick Answer

How does AI think? It actually doesn’t think the way humans do. Models like ChatGPT and Claude generate responses by predicting the most likely next word based on patterns learned from vast amounts of text. They use a system called the transformer, which looks at your entire message, figures out how the words relate to each other, and then builds a response one token (small piece of text) at a time.

Illustration showing human thinking compared with AI, explaining how ChatGPT uses tokens, embeddings, attention, and transformers to predict words instead of thinking like a human.

Every large language model — ChatGPT, Claude, Gemini — works by predicting the next word over and over again, using patterns absorbed from trillions of examples.

Understanding how AI thinks, or more precisely what it does instead of thinking, changes the way you use it, trust it, and know when to question it. This is that explanation, without the jargon, without the hype, and without skipping the parts that actually matter.

You type a question. Half a second later, an AI gives you a confident, fluent answer on anything from quantum physics to tax law. It does this without a brain, without lived experience, and without any genuine understanding of the world in the way you and I have it. So what on earth is going on inside it?

The answer surprises most people. And once you understand it, you will never use AI the same way again.

It All Starts With One Deceptively Simple Idea

At its core, every large language model, ChatGPT, Claude, Gemini and the rest, is doing one thing: predicting the next word.

That sounds almost embarrassingly simple for technology that can write a legal contract or explain quantum mechanics. But here is the trick: to predict the next word reliably, across billions of sentences on every topic humans have ever written about, a model must develop something that looks, from the outside, a great deal like understanding.

Think about what it takes to correctly complete this sentence:

“The surgeon scrubbed her hands before entering the ___.”

You knew the answer without thinking. An AI had to earn that knowledge by reading millions of similar sentences.

To fill that blank correctly, you need to know what surgeons do, what an operating theatre looks like, and what sterile protocol means in a medical context. No single fact gets you there. A web of connected knowledge does.

When a language model learns to predict text at scale, it builds that web, encoding billions of relationships between ideas as patterns in mathematics rather than facts in a database.

Before It Can Read, It Has to Break Language into Pieces

The first thing a language model does with any text is break it into pieces called tokens. A token is roughly a word, though sometimes it is a fragment. The word “thinking” might be one token. The word “unbelievable” might be split into “un,” “believ,” and “able.”

Diagram showing how AI breaks the word “unbelievable” into tokens, converts them into numbers, and places words like king, queen, monarch, and banana into vector space where similar words cluster together.
The word “unbelievable” becomes three separate tokens, each gets a number, and that number maps to a position in space. King and queen end up neighbors. Banana ends up alone. This is how AI reads.

This is important because AI does not read language the way humans do. It does not understand letters, sounds, or words directly. Instead, it converts everything into numbers.

Each word, or piece of a word called a token, is assigned a number. That number is then transformed into a vector, which is a long list of numbers that represents the word’s meaning and its relationship to other words.
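To make the token step concrete, here is a minimal sketch in Python. The vocabulary and the greedy longest-match rule are assumptions chosen for illustration; real models learn vocabularies with tens of thousands of entries and use more sophisticated splitting schemes.

```python
# Hypothetical toy vocabulary mapping each token to its number (ID).
VOCAB = {"un": 0, "believ": 1, "able": 2, "thinking": 3, "think": 4, "ing": 5}


def tokenize(word: str) -> list[str]:
    """Split a word into vocabulary pieces, always taking the longest match."""
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in VOCAB:
                tokens.append(piece)
                start = end
                break
        else:
            raise ValueError(f"no vocabulary piece matches {word[start:]!r}")
    return tokens


def encode(word: str) -> list[int]:
    """Turn a word into the list of numbers the model actually sees."""
    return [VOCAB[token] for token in tokenize(word)]


print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
print(encode("unbelievable"))    # [0, 1, 2]
print(tokenize("thinking"))      # ['thinking']
```

Because the whole word “thinking” is in this toy vocabulary, it stays as one token, while “unbelievable” has to be assembled from three smaller pieces.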

For example, “king” and “queen” end up close together because they often appear in similar contexts. “Monarch” is nearby as well, while “banana” sits much farther away because it belongs to a completely different category.

You can think of it as a map of language. Words with similar meanings are grouped close together, while unrelated words are placed farther apart. In this way, AI turns language into mathematics, allowing it to recognize patterns and relationships that help it generate meaningful responses.

Word Embeddings: This is a way of representing meaning using math. Each word is turned into a vector, which is a long list of numbers. Words with similar meanings end up with similar vectors in this high-dimensional space. This is what lets the model treat “happy” and “joyful” as belonging in similar contexts, even if they were never seen together in training.
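The idea of “close together” can be measured. The sketch below uses cosine similarity, a common way to compare vectors. The three-number vectors are hand-picked assumptions for illustration; real embeddings are learned during training and have hundreds or thousands of dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
EMBEDDINGS = {
    "king":    [0.90, 0.80, 0.10],
    "queen":   [0.88, 0.82, 0.12],
    "monarch": [0.85, 0.75, 0.15],
    "banana":  [0.05, 0.10, 0.95],
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return a score near 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word: str) -> str:
    """Find the other word whose vector is most similar to this one."""
    others = [w for w in EMBEDDINGS if w != word]
    return max(others, key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]))


for other in ("queen", "monarch", "banana"):
    score = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS[other])
    print(f"king vs {other}: {score:.3f}")
```

With these toy numbers, “queen” and “monarch” score close to 1.0 against “king”, while “banana” scores far lower.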

The Eight People Who Changed Everything

For many years, AI processed language one word at a time. It was a bit like reading a book while slowly forgetting what was written on the previous pages. By the time the AI reached the end of a long paragraph, it often struggled to remember the beginning. This made long documents difficult to understand.

That changed on June 12, 2017, when eight researchers, most of them at Google, published a paper called “Attention Is All You Need.” The paper introduced a new system known as the transformer architecture, and it quickly transformed the entire AI industry.

Instead of reading words one by one, transformers can look at all the words in a sentence or passage at the same time. This helps the model understand context, relationships, and meaning much more effectively.

The original transformer architecture diagram from the 2017 Attention Is All You Need paper, showing encoder and decoder blocks with multi-head attention and positional encoding
Variants of this architecture sit underneath every major AI language model today. Eight researchers drew this diagram in 2017. Nothing in the field has been the same since.

Every major AI model built since that paper, including GPT, Claude and Gemini, runs on this foundation. It was not just a small upgrade. It was a breakthrough that completely changed how AI works with language.

“We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.”

Vaswani et al., 2017 — Original paper, arXiv

Self-Attention: The Core Idea, Made Visual

Instead of reading one word at a time, a transformer looks at the entire input at once and calculates how every word in the sentence relates to every other word. This process is called self-attention.

Take this sentence:
“The trophy did not fit in the suitcase because it was too big.”

You can quickly tell that “it” refers to the trophy. A transformer works this out by comparing “it” with every other word in the sentence.

The connection between “it” and “trophy” becomes strong. The connection between “it” and small words like “the” becomes weak. From these patterns, the model chooses the correct meaning of the sentence. It does not use rules like a human would. It relies on patterns learned from data.

This is not a single calculation. The model repeats this process in many different ways at the same time. Each part focuses on something different such as grammar, meaning, cause and effect, or context across the sentence. Together these signals help the model build a clear understanding of the text.
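A minimal sketch of one such calculation is shown here, using the scaled dot-product formula from the original transformer paper. The two-number word vectors are assumptions hand-picked so that “trophy” wins; a real model learns its vectors, transforms them with learned weights first, and runs many of these calculations in parallel.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into positive weights that add up to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query: list[float], keys: list[list[float]]) -> list[float]:
    """Scaled dot-product attention of one word (query) over all words (keys)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical 2-dimensional vectors, chosen by hand for illustration.
words = ["the", "trophy", "suitcase", "it"]
vectors = {
    "the": [0.1, 0.1],
    "trophy": [2.0, 0.5],
    "suitcase": [0.5, 1.0],
    "it": [1.5, 0.2],
}

weights = attention_weights(vectors["it"], [vectors[w] for w in words])
for word, weight in zip(words, weights):
    print(f"it -> {word}: {weight:.2f}")
```

The weights always add up to 1, so a strong link to one word necessarily means weaker links to the rest.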

Figure: Self-attention, showing how the model resolves what “it” refers to in “The trophy did not fit in the suitcase because it was too big.” The model assigns an attention score between “it” and every other word. High scores (teal) are most relevant.

The Scale of Training Is Genuinely Hard to Imagine

The transformer architecture is the core system behind modern AI. What gives it knowledge is training. The scale of this training is very hard to imagine.

To understand how AI learns at this scale, look at GPT-3, which was released by OpenAI in 2020. It had 175 billion parameters and was trained on 570 gigabytes of text from the internet, books, and Wikipedia.

Its earlier version, GPT-2, had only 1.5 billion parameters. That jump from 1.5 billion to 175 billion brought a huge improvement that surprised even the researchers who built it.
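A quick back-of-the-envelope calculation puts that jump in perspective. The parameter counts come from the figures above; the two bytes per parameter is an assumption (16-bit numbers), used only to give a rough sense of storage size.

```python
GPT2_PARAMS = 1.5e9   # 1.5 billion
GPT3_PARAMS = 175e9   # 175 billion
BYTES_PER_PARAM = 2   # assumption: each parameter stored as a 16-bit number

growth = GPT3_PARAMS / GPT2_PARAMS
gpt3_weights_gb = GPT3_PARAMS * BYTES_PER_PARAM / 1e9

print(f"GPT-3 has about {growth:.0f}x the parameters of GPT-2")
print(f"Storing its weights would take about {gpt3_weights_gb:.0f} GB")
```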

GPT-3 Figures

Parameters: 175 billion, confirmed in OpenAI’s published paper and by NVIDIA’s technical blog

Training data: 570 GB including CommonCrawl, Wikipedia, and digitised books

Compute equivalent: Estimated at 355 years of single-GPU processing time at commercial pricing

Sources: NVIDIA Technical Blog · Stanford HAI

GPT-4, released in 2023, was reportedly trained on around 13 trillion tokens of text and code, though OpenAI has not published the figure itself. Parameters are like internal settings that control how the model responds.

Training means adjusting all of these settings again and again across massive amounts of data until the model becomes accurate enough to produce useful answers.
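The same “adjust, check, adjust again” loop can be shown with a model that has just one setting instead of billions. This is a sketch of gradient descent on a made-up dataset, not a description of how any particular AI model was trained.

```python
def train(data, learning_rate=0.01, steps=200):
    """Fit y = w * x by nudging w to reduce the average squared error."""
    w = 0.0  # the single parameter, starting from a blank slate
    for _ in range(steps):
        # How the error changes as w changes (the gradient).
        gradient = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        # Move w a small step in the direction that lowers the error.
        w -= learning_rate * gradient
    return w


# Made-up examples where the hidden rule is y = 3 * x.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
w = train(data)
print(f"learned w = {w:.4f}")
```

After enough small adjustments the parameter settles very close to 3, the rule hidden in the data. A language model does the same thing with billions of parameters and next-token predictions instead of a single number.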

GPT-4 Figures

Training data: Approximately 13 trillion tokens from Common Crawl, code repositories, books, and academic sources

Training cost: Estimated at $63 million to $78 million in compute time

Bar exam: OpenAI originally claimed the 90th percentile. A 2024 peer-reviewed study in Artificial Intelligence and Law by MIT researcher Eric Martínez found this was overstated because the comparison pool was skewed toward repeat test-takers. A fairer comparison places GPT-4 below the 69th percentile overall and around the 48th percentile on essays.

Sources: Originality.ai · Illinois Institute of Technology

So How Does AI Think When You Ask It Something?

By the time you open ChatGPT or Claude and type a question, the training is already finished. The model is not learning anymore. Its knowledge is fixed. What it does instead is called inference. It uses everything it learned during training to create an answer to your question in real time.

Here is what happens in a split second after you ask something:

1) Tokenize: Your text is broken into small pieces called tokens. Each token is turned into a number the model can work with.

2) Embed: Each token becomes a vector, which is a long list of numbers. This places the word in a kind of meaning space where similar ideas sit closer together.

3) Attend: The model passes these vectors through many transformer layers. Each layer checks how words relate to each other and builds a clearer picture of the full question.

4) Predict: The model calculates a probability for every possible next token. It creates a ranked list of what could come next, from most likely to least likely.

5) Generate: One token is chosen and added to the response. Then the whole process repeats again and again until the answer is complete.

Figure: The inference pipeline, from your question to the first word of the answer: Tokenize › Embed › Attend › Predict › Generate.

This is why AI writes one word at a time instead of producing everything at once. It does not know the full answer in advance. It builds it step by step with each new token becoming part of the next prediction.
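The predict-and-generate loop can be sketched in a few lines. The score table below is a hypothetical stand-in for a trained model, and it looks only at the last token, whereas a real transformer considers everything written so far. Always picking the top token is also an assumption; real systems often choose with some randomness.

```python
import math

# Hypothetical scores: for each token, how strongly each next token is favored.
NEXT_TOKEN_SCORES = {
    "<start>": {"the": 2.0, "a": 1.0},
    "the": {"surgeon": 2.5, "patient": 1.0},
    "a": {"surgeon": 1.5, "patient": 1.5},
    "surgeon": {"operates": 2.0, "<end>": 0.5},
    "patient": {"waits": 2.0, "<end>": 0.5},
    "operates": {"<end>": 3.0, "the": 0.2},
    "waits": {"<end>": 3.0, "the": 0.2},
}


def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Convert raw scores into probabilities that add up to 1."""
    highest = max(scores.values())
    exps = {token: math.exp(s - highest) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}


def generate(max_tokens: int = 10) -> list[str]:
    """Build a response one token at a time until <end> or the limit."""
    tokens = ["<start>"]
    while len(tokens) <= max_tokens:
        probabilities = softmax(NEXT_TOKEN_SCORES[tokens[-1]])  # Predict
        next_token = max(probabilities, key=probabilities.get)  # Generate
        if next_token == "<end>":
            break
        tokens.append(next_token)  # the new token feeds the next prediction
    return tokens[1:]


print(generate())  # ['the', 'surgeon', 'operates']
```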

The Human Layer That Shapes the AI You Actually Use

The raw pre-trained model is not the AI you use every day. If it were left on its own, it would reflect everything found on the internet, including useful information and harmful or unwanted content.

After pre-training there is a step called Reinforcement Learning from Human Feedback, or RLHF. In this stage, human trainers review the model’s answers and rank them. Responses that are helpful, accurate, and appropriate get higher scores.

Responses that are harmful, misleading, or low quality get lower scores. These rankings are used to train a reward model that guides the AI toward better answers.

Verified — RLHF at OpenAI

OpenAI formalized RLHF in their 2022 InstructGPT paper. The process used approximately 10,000 contractor-written prompts and 40,000 drawn from real users. Each prompt was given between 4 and 9 model responses for human raters to rank in order of quality.

Source: OpenAI — Aligning language models to follow instructions (2022)
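One common way to turn human rankings into a training signal is a pairwise comparison loss, in the style of the one the InstructGPT paper describes: every ranked list is broken into pairs, and the reward model is penalized whenever it scores the lower-ranked answer above the higher-ranked one. The sketch below assumes that formulation; the reward values are made up.

```python
import math
from itertools import combinations


def pairwise_loss(reward_preferred: float, reward_rejected: float) -> float:
    """-log(sigmoid(difference)): small when the preferred answer scores higher."""
    return math.log(1 + math.exp(-(reward_preferred - reward_rejected)))


def ranking_loss(rewards_best_to_worst: list[float]) -> float:
    """Average pairwise loss over every pair in one human ranking."""
    # combinations() keeps list order, so the first item of each pair
    # is always the answer the human ranked higher.
    pairs = list(combinations(rewards_best_to_worst, 2))
    return sum(pairwise_loss(better, worse) for better, worse in pairs) / len(pairs)


# Hypothetical reward-model scores for 4 answers, listed in the human's order.
agrees_with_humans = [2.0, 1.0, 0.0, -1.0]
disagrees_with_humans = [-1.0, 0.0, 1.0, 2.0]

print(f"loss when scores match the ranking: {ranking_loss(agrees_with_humans):.3f}")
print(f"loss when scores contradict it:     {ranking_loss(disagrees_with_humans):.3f}")
```

Four ranked answers produce six pairs and nine produce thirty-six, which is one reason a single ranking task yields a lot of training signal.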

This is why ChatGPT and Claude can feel different even though they are built on similar foundations. The underlying architecture is similar, but the human feedback shaping each system is different. The AI you interact with is shaped by thousands of human decisions about what good responses should look like.

Where AI Goes Wrong and Why It Is Not a Surprise

Failure Mode | What It Looks Like | The Real Cause | Where It Stands
Hallucination | States a false fact with complete confidence | Training rewards plausible-sounding text, not verified truth | Ongoing
Sycophancy | Agrees with you even when you are wrong | Human raters scored agreeable responses higher during RLHF | Improving
Knowledge cutoff | Knows nothing after its training date | Training data is fixed. The model cannot update itself. | Structural
Reasoning errors | Fails at multi-step logic or basic maths | Pattern-matching is not the same as causal reasoning | Improving

On Hallucination

The model does not have direct access to ground truth. It cannot check whether what it says is actually correct. It only predicts what words are most likely to come next based on patterns in data. This means it can sound confident while still being wrong because it does not truly understand truth the way humans do.

A 2025 OpenAI research paper found that the next-token prediction objective, the very goal used to train these models, inadvertently rewards confident guessing over honest uncertainty. Models are not designed to bluff. They are trained in a way that makes bluffing the statistically safer strategy. (Source: Lakera AI, citing OpenAI 2025 research)
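A little arithmetic shows why bluffing can be the safer strategy. The grading rule below is a toy assumption for illustration, not the setup from the OpenAI research: a correct answer earns one point, “I don’t know” earns zero, and a wrong answer costs whatever penalty is chosen.

```python
def expected_score(p_correct: float, answers: bool, wrong_penalty: float = 0.0) -> float:
    """Average score for guessing versus abstaining under a toy grading rule."""
    if not answers:
        return 0.0  # saying "I don't know" neither gains nor loses points
    return p_correct * 1.0 - (1 - p_correct) * wrong_penalty


p = 0.2  # hypothetical: the model is only 20% likely to be right

# With no penalty for being wrong, guessing beats admitting uncertainty.
print(expected_score(p, answers=True), "vs", expected_score(p, answers=False))

# With a penalty for wrong answers, the honest "I don't know" comes out ahead.
print(expected_score(p, answers=True, wrong_penalty=1.0), "vs", expected_score(p, answers=False))
```

Whenever wrong answers cost nothing, even a long-shot guess has a higher expected score than honesty.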

On Sycophancy

If you tell the model the answer is 42 when it is 44, the model may still agree with you. This happens because, during RLHF training, human reviewers often gave higher scores to answers that matched their expectations or agreed with them.

Over time the model learned that agreeing can be treated as helpful behavior. From the model’s perspective there is no sense of right or wrong, just patterns that were rewarded more often.

The Part Nobody Talks About: How AI Thinking Affects Your Own Thinking

Here is something most explainers on how AI thinks leave out entirely. These systems do not just respond to your reasoning. They shape it.

AI produces smooth and confident text whether it is correct or not. Over time people start to link that confidence with trust. A well written answer can feel more reliable even when it is not.

AI interface showing high fluency but low accuracy, illustrating how confident AI responses can mislead users
Fluency meter full. Accuracy meter empty. The most dangerous AI response is not the one that sounds wrong. It is the one that sounds completely right.

Understanding that AI works through prediction instead of real reasoning gives you something useful. It is a reason to be careful when the answer sounds very certain. The most fluent responses are not always the most accurate. Sometimes they are just strong pattern matching based on limited or uneven data.

A calculator is a helpful comparison, but it is not perfect. If you enter the wrong formula, you still get a confident wrong result. With AI the formula is hidden, so you cannot see how the output was produced. You only see the final text in a clear and natural form. That is exactly when you need to think more critically.

The people who use AI well are not the ones who trust it blindly. They are the ones who understand when to question it.

So Does AI Actually Think?

Comparison of human thinking and artificial intelligence showing emotion based human thought and pattern based AI processing

There is no inner experience, no curiosity, and no awareness of what it means to be right, honest, or to care about the question being asked. When people ask how AI thinks, the most accurate answer is that it does not think in the philosophical sense. What it does instead is still remarkable.

It is a system trained on vast amounts of human writing across centuries. It learns patterns from that data and uses them to generate responses to new situations. From the outside this can sometimes look like thinking even though it works in a very different way.

AI does not reason from first principles like a scientist. It works by finding and combining patterns at an enormous scale. That scale is so large that the results can feel surprisingly intelligent compared to anything we had before.

The machine is not thinking like a human. It is doing something else that is still real and useful. When you understand AI as prediction instead of reasoning you become much better at using it. Concepts like tokens, embeddings, attention, inference, and RLHF give you a clearer picture of what is actually happening under the surface.

This kind of informed skepticism is not about distrusting AI. It is about using it with a clearer understanding of its limits and strengths.

Frequently Asked Questions

How does AI actually think?
It predicts the next word using patterns learned from large amounts of text. It does not reason or look up facts, and it does not understand language like humans. It works by finding patterns and choosing the most likely next word based on the context.
What is a large language model (LLM)?
A large language model is an AI system trained on vast amounts of text to predict the next token. Models such as ChatGPT, Claude, and Gemini are built on the transformer architecture and generate their responses one token at a time.
Why does AI make things up (hallucinate)?
AI hallucination happens because of how these systems are trained. They are built to produce text that sounds likely and natural rather than checking if it is actually true. Research from OpenAI in 2025 found that the training process tends to reward confident answers more than careful ones. This means the model can sound certain even when the information is not reliable.
What is the transformer architecture?
The transformer is the main design behind modern AI language models. It was introduced in the 2017 paper Attention Is All You Need by researchers at Google. Its key idea, called self-attention, lets the model look at a whole sentence at once and work out how each word relates to the others instead of reading word by word.
What is RLHF and why does it matter?
RLHF stands for Reinforcement Learning from Human Feedback. It is the step that turns a raw AI model into a helpful assistant. People review and rank the model’s answers, and those rankings are used to train a system that guides the AI toward better responses. OpenAI explained this process in their 2022 InstructGPT paper.
Is AI actually intelligent?
Researchers and philosophers do not fully agree on this. What is clear is that large language models can now do tasks that once needed human intelligence. They do this by finding patterns in data rather than truly understanding or reasoning. Whether this is called intelligence depends on how you define it.

© Copyright 2023 -All Rights Reserved by Engineering Junkies.
