Run AI models locally on Windows — free and private
Ollama is a free, open-source tool that lets you run large language models (LLMs) like Llama 3, Mistral and Qwen 2.5 directly on your Windows PC. No cloud subscription, no API key, no data leaving your machine. After the initial model download, everything works completely offline.
On Windows, Ollama installs as a background service that exposes a local API at localhost:11434. You interact with it via the command line or through any compatible chat UI.
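To make the local API concrete, here is a minimal Python sketch that sends one prompt to the server at localhost:11434 using only the standard library. It assumes the default port, a model named `llama3` that has already been pulled, and Ollama's `/api/generate` endpoint with streaming turned off. The helper names `build_generate_request` and `generate` are illustrative and not part of Ollama.

```python
import json
import urllib.request

# Default local address mentioned above; change it if you moved the port.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(prompt, model="llama3"):
    """Build a POST request for the /api/generate endpoint (nothing is sent yet)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3", timeout=120):
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False the server replies with a single JSON object.
        return json.loads(response.read())["response"]
```

With Ollama running, `print(generate("Say hello in one sentence."))` should print the model's reply. If the server is not running, `urlopen` raises a connection error instead.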
Get Ollama running on Windows in 4 steps
1. Download OllamaSetup.exe
   Use the download button above. The installer is ~5 MB and has no dependencies. Works on Windows 10 and 11 (64-bit). Verify the file is genuine before running.
2. Run the installer
   Double-click the .exe and follow the one-page wizard. Ollama installs in under a minute and starts a background service automatically. See the full install guide for tips on GPU detection and firewall prompts.
3. Pull your first model

       C:\> ollama pull llama3
       pulling manifest...
       pulling 8934d96d3f08... 100% ████████████ 4.7 GB
       success

   The first pull downloads the model weights (~4–8 GB depending on the model). After that it runs fully offline.
4. Start chatting

       C:\> ollama run llama3
       >>> Send a message (/? for help)
       >>> Hello! What can you do?
       I can help with writing, coding, analysis, Q&A...
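The pull and run steps can also be scripted. The sketch below is one possible way to do it from Python, assuming `ollama` is on your PATH after installation and that passing the prompt as an argument to `ollama run` returns a single answer and exits. The function names are hypothetical helpers, not part of Ollama.

```python
import shutil
import subprocess


def pull_command(model="llama3"):
    """Command line equivalent to step 3 (download the model weights)."""
    return ["ollama", "pull", model]


def run_command(model, prompt):
    """Command line for a one-shot answer instead of the interactive chat."""
    return ["ollama", "run", model, prompt]


def ask(model, prompt):
    """Pull the model if needed, then return the answer to a single prompt."""
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama not found on PATH; is it installed?")
    # Pulling an already-downloaded model is quick, so this is safe to repeat.
    subprocess.run(pull_command(model), check=True)
    result = subprocess.run(
        run_command(model, prompt),
        check=True,
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    return result.stdout.strip()
```

For example, `ask("llama3", "Hello! What can you do?")` mirrors the chat in step 4 without opening the interactive prompt.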
Everything you need for Ollama on Windows
Install on Windows
Step-by-step install guide with GPU tips and common pitfalls.
→ install guide
GPU Acceleration
Enable NVIDIA CUDA or AMD ROCm for roughly 5-10x faster inference than CPU-only.
→ GPU guide
Models Hub
Llama 3, Mistral, Qwen 2.5, Gemma 2 — copy the pull command.
→ browse models
Offline & Privacy
Is Ollama fully offline? Where is data stored? How to block network access.
→ privacy guide
Troubleshooting
Fix CLI not recognized, port conflicts, slow downloads and GPU issues.
→ fix issues
FAQ
All common Ollama Windows questions answered in one place.
→ see FAQ
Minimum requirements for Ollama on Windows
| Component | Minimum | Recommended |
|---|---|---|
| OS | Windows 10 64-bit | Windows 11 64-bit |
| RAM | 8 GB | 16 GB or more |
| Disk | 10 GB free (for one model) | 50+ GB SSD |
| CPU | Any x64 with AVX2 | Modern multi-core (Ryzen 5/i5+) |
| GPU | Optional | NVIDIA (8+ GB VRAM) or AMD (ROCm) |
Without a GPU, Ollama runs models on the CPU — functional but slower. A GPU with 8+ GB VRAM dramatically improves speed. See GPU Acceleration.
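As a rough illustration of how to read the table, the following Python sketch classifies a machine by its RAM and free disk space only. The thresholds are copied from the Minimum and Recommended columns above. The function names are hypothetical, and RAM is passed in by hand because the Python standard library has no portable way to query it.

```python
import shutil

# Thresholds taken from the requirements table above (in GB).
MIN_RAM_GB, REC_RAM_GB = 8, 16
MIN_DISK_GB, REC_DISK_GB = 10, 50


def classify(ram_gb, free_disk_gb):
    """Return which column of the table a machine falls into."""
    if ram_gb < MIN_RAM_GB or free_disk_gb < MIN_DISK_GB:
        return "below minimum"
    if ram_gb >= REC_RAM_GB and free_disk_gb >= REC_DISK_GB:
        return "recommended"
    return "minimum"


def free_disk_gb(path="."):
    """Free space, in GB, on the drive that contains `path`."""
    return shutil.disk_usage(path).free / 1e9
```

For instance, `classify(16, free_disk_gb("C:\\"))` checks a 16 GB machine against the free space on the C: drive. CPU and GPU checks are deliberately left out of this sketch.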