ollama pull command below and paste it into Command Prompt or PowerShell. The model downloads once and then runs fully offline.Llama 3 — Meta
Llama 3 is Meta's flagship open-source model. It delivers an excellent balance of quality, speed and hardware compatibility, making it the best starting point for most Windows users.
| Variant | Size | VRAM needed | Pull command |
|---|---|---|---|
| 8B (default) | 4.7 GB | 6 GB+ | ollama pull llama3 |
| 8B Q4_K_M | 4.4 GB | 5 GB+ | ollama pull llama3:8b-instruct-q4_K_M |
| 70B | 40 GB | 48 GB+ | ollama pull llama3:70b |
Full guide: Llama 3 on Windows
Mistral — Mistral AI
Mistral 7B is one of the fastest models per token on consumer hardware. It excels at coding, summarisation and quick Q&A. Runs well even on GPUs with 6 GB VRAM or CPU-only.
| Variant | Size | VRAM needed | Pull command |
|---|---|---|---|
| 7B (default) | 4.1 GB | 5 GB+ | ollama pull mistral |
| Mistral Nemo 12B | 7.1 GB | 8 GB+ | ollama pull mistral-nemo |
Full guide: Mistral on Windows
Qwen 2.5 — Alibaba
Qwen 2.5 leads open-source benchmarks in reasoning, math and multilingual tasks. Available in sizes from 0.5B to 72B. Strong choice for non-English languages and technical work.
| Variant | Size | VRAM needed | Pull command |
|---|---|---|---|
| 7B | 4.4 GB | 6 GB+ | ollama pull qwen2.5 |
| 14B | 8.7 GB | 10 GB+ | ollama pull qwen2.5:14b |
| Coder 7B | 4.4 GB | 6 GB+ | ollama pull qwen2.5-coder |
Full guide: Qwen 2.5 on Windows
Gemma 2 — Google
Gemma 2 from Google punches above its weight for its size. The 2B variant runs on almost any hardware including CPU-only. Good for embedded use cases and machines with limited RAM.
| Variant | Size | VRAM needed | Pull command |
|---|---|---|---|
| 2B | 1.6 GB | 3 GB+ | ollama pull gemma2:2b |
| 9B | 5.4 GB | 7 GB+ | ollama pull gemma2 |
| 27B | 16 GB | 20 GB+ | ollama pull gemma2:27b |
Phi-4 — Microsoft
Phi-4 is Microsoft's small language model optimised for instruction following and coding. At 14B parameters it delivers quality comparable to much larger models on many benchmarks, while fitting in 8 GB VRAM.
| Variant | Size | VRAM needed | Pull command |
|---|---|---|---|
| Phi-4 14B | 8.2 GB | 9 GB+ | ollama pull phi4 |
| Phi-4 Mini 3.8B | 2.5 GB | 4 GB+ | ollama pull phi4-mini |
Full guide: Phi-4 & Gemma 2 on Windows
Browse all available models
Ollama supports hundreds of models. Browse the full library at ollama.com/library. Common additional picks: