Copilot Explainers · 19 April 2026 · 4 min read
What AI Models Does Microsoft Copilot Use? A Technical Look at the Stack
What model does Copilot use? Microsoft 365 Copilot routes between the GPT-4 family, reasoning models and other models through an orchestrator grounded in Microsoft Graph.
TL;DR
- Microsoft 365 Copilot is an orchestration system, not a single model. The GPT-4 family handles most everyday tasks, with reasoning models and smaller models appearing in specific surfaces.
- The orchestrator decides which model runs, while Microsoft Graph and the Semantic Index determine what context that model sees. This is why the same prompt can produce different answers in different apps.
- Track model changes for capability awareness, but design your tenant around the orchestrator, Graph and content readiness, because those are the parts you can influence.
People asking “what model does Copilot use” usually want a single answer. The honest reply is that there is not one. Microsoft 365 Copilot is an orchestration system that uses several models depending on the task, the surface and the licence.
For technical and architecture decisions, the model picture still matters. If you understand which models do what, you can predict where Copilot will be strongest and where its limits show up.
The GPT-4 family does most of the work
Most of the everyday Copilot experience in Word, Outlook, Teams and the Microsoft 365 chat sidebar is powered by the GPT-4 family of models, delivered through Azure OpenAI Service. That has included GPT-4, GPT-4 Turbo and GPT-4o at different points, and Microsoft updates the underlying mix without fanfare. The version you used last quarter may not be the version you are using now.
This is why benchmarks run against the public ChatGPT product do not always carry over to Copilot. Microsoft tunes how the model is called, what context it gets and which guardrails apply. The model can be very similar while the experience is quite different.
Reasoning models handle harder analysis
For tasks that need step-by-step reasoning rather than fluent writing, Copilot can call OpenAI reasoning models. These appear most clearly in the newer Copilot agents like Researcher and Analyst, which were built around longer chains of thought, multi-step planning and code execution against tenant data.
Reasoning models are slower and more expensive, so they are not used for every prompt. They appear when the task warrants the cost, and they are typically gated to specific licences and agents rather than the general chat surface.
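As a rough illustration of that gating, the sketch below routes a request to a model tier based on the task and the user's entitlements. Everything in it is hypothetical: the tier labels, the `Request` fields and the rules in `pick_tier` are assumptions made for illustration, not Microsoft's actual routing logic.

```python
from dataclasses import dataclass

# Hypothetical tier labels; these are not real Copilot model identifiers.
GENERAL = "general-purpose"
REASONING = "reasoning"


@dataclass(frozen=True)
class Request:
    is_agent_surface: bool    # True for an agent, False for general chat (assumed flag)
    needs_multistep: bool     # does the task need planning or code execution?
    reasoning_licensed: bool  # is the user entitled to reasoning-backed agents?


def pick_tier(req: Request) -> str:
    """Return the tier a request would be routed to under these assumed rules."""
    # Assumption: the slower, costlier tier is used only when the task warrants it,
    # the user is licensed for it, and the request comes from an agent surface.
    if req.needs_multistep and req.reasoning_licensed and req.is_agent_surface:
        return REASONING
    return GENERAL
```

Under these assumed rules, a multi-step task sent from the general chat surface still lands on the general-purpose tier, which mirrors the licence and agent gating described above.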
Smaller models cover lightweight work
Microsoft also runs its own small model family, Phi, alongside the OpenAI stack. These are used in scenarios where latency matters more than raw capability, including some on-device experiences and routing decisions inside the orchestrator itself.
You will not usually see these advertised in the user interface. They do work behind the scenes, particularly where a smaller model can answer faster and cheaper than a frontier one.
Copilot Studio opens the door to other model providers
Copilot Studio is the platform Microsoft offers for building custom agents that sit on top of Microsoft 365. In late 2025 it added support for Anthropic models, meaning agent builders can now choose between OpenAI and Anthropic when designing how an agent reasons.
This does not change the main Copilot product, and most users never see Copilot Studio. It does change what is possible for organisations building their own agents, especially where a particular model is a better fit for a specific task.
The orchestrator is doing more than people realise
The model layer is only one part of Copilot. The orchestrator decides what the request is, what context to fetch, which model to call and how to combine the result with grounded information from your tenant.
That orchestrator handles prompt rewriting, retrieval, safety checks and the final response formatting. It is the reason a Copilot answer in Outlook feels different from the same prompt in Word, even when the underlying language model is identical.
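The stages named above can be sketched as a small pipeline. This is a toy model under stated assumptions, not the real orchestrator: `SURFACE_CONTEXT`, `BLOCKED_TERMS` and every function name here are invented for illustration, and the "model" is any callable that maps a string to a string.

```python
from typing import Callable

# Hypothetical per-surface context; a real system would retrieve this from the tenant.
SURFACE_CONTEXT = {
    "outlook": ["Thread: Q3 budget review"],
    "word": ["Document: Q3 budget draft.docx"],
}

# Placeholder safety rule, purely illustrative.
BLOCKED_TERMS = ("password dump",)


def build_model_input(prompt: str, surface: str) -> str:
    """Rewrite the prompt for the surface and attach whatever context was retrieved."""
    rewritten = f"Task for {surface}: {prompt.strip()}"
    context = SURFACE_CONTEXT.get(surface, [])
    return "\n".join([rewritten, *("Context: " + item for item in context)])


def passes_safety(text: str) -> bool:
    """Return False if the text contains any blocked term."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def orchestrate(prompt: str, surface: str, model: Callable[[str], str]) -> str:
    """Run rewrite, retrieval, safety checks, the model call and final formatting."""
    model_input = build_model_input(prompt, surface)
    if not passes_safety(model_input):
        return "Request declined."
    answer = model(model_input)
    if not passes_safety(answer):
        return "Response withheld."
    return answer.strip()
```

Passing the same prompt and the same `model` callable with `surface="outlook"` and then `surface="word"` produces different model inputs, which is the effect described above: the language model is identical, but what it is asked and shown is not.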
Grounding through Semantic Index and Microsoft Graph
Models on their own do not know about your tenant. Microsoft 365 Copilot grounds responses using two layers working together:
- Microsoft Graph holds the relationships between people, files, meetings, chats and emails the user is allowed to see.
- The Semantic Index sits on top of that, providing fast retrieval over the content so the orchestrator can pull the most relevant snippets into the prompt.
The model is only ever shown a small, permission-trimmed slice of your tenant per request. The quality of that slice depends heavily on how the Microsoft 365 estate is organised and how permissions are managed.
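A minimal sketch of permission-trimmed retrieval makes the idea concrete. It assumes a toy in-memory document list and a word-overlap relevance score. The `Doc` type, `score` and `grounded_snippets` are hypothetical names, and a production index would likely use a far richer relevance signal than shared words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    title: str
    text: str
    allowed_users: frozenset[str]  # who may see this document (assumed ACL model)


def score(query: str, doc: Doc) -> int:
    """Toy relevance: number of distinct lowercase words shared by query and document."""
    return len(set(query.lower().split()) & set(doc.text.lower().split()))


def grounded_snippets(query: str, user: str, docs: list[Doc], k: int = 2) -> list[str]:
    """Return titles of the top-k relevant documents the user is allowed to see."""
    # Trim by permission first, so content the user cannot see is never ranked.
    visible = [d for d in docs if user in d.allowed_users]
    ranked = sorted(visible, key=lambda d: score(query, d), reverse=True)
    return [d.title for d in ranked[:k] if score(query, d) > 0]
```

In this sketch two users issuing the same query can receive different snippets purely because their permissions differ, so overshared or poorly organised content changes what the model is shown.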
What this means in practice
Understanding the model layer is useful for two reasons.
First, it explains why Copilot can feel inconsistent across apps and over time. Different surfaces use different model and context combinations, and the brand name stays the same even as the components behind it shift.
Second, it helps you avoid building a strategy around one model name. The model mix changes. The orchestrator, Graph and Semantic Index are the more durable parts of the system, and they are the parts your tenant directly affects through permissions, content quality and labelling.
If you want the leadership-level view rather than the technical one, see the plain-English guide to the AI models behind Microsoft 365 Copilot.