Ollama models hub — popular local LLMs for Windows

The most popular Ollama models for Windows, with pull commands, download sizes and VRAM requirements. Copy any ollama pull command below and paste it into Command Prompt or PowerShell. The model downloads once and then runs fully offline.

Llama 3 — Meta

Llama 3 is Meta's flagship open-weight model family. The 8B variant balances quality, speed and hardware requirements well, which makes it a good starting point for most Windows users.

Variant | Size | VRAM needed | Pull command
8B (default) | 4.7 GB | 6 GB+ | ollama pull llama3
8B Q4_K_M | 4.4 GB | 5 GB+ | ollama pull llama3:8b-instruct-q4_K_M
70B | 40 GB | 48 GB+ | ollama pull llama3:70b
cmd.exe
C:\> ollama pull llama3
C:\> ollama run llama3
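
For scripted setups, the pull step can also be driven from Python. The following is a minimal sketch rather than an official Ollama client: it assumes the ollama executable is on PATH, and the helper names build_pull_command and pull_model are made up for this illustration.

```python
import subprocess


def build_pull_command(model: str) -> list[str]:
    """Return the argument list for `ollama pull <model>`."""
    name = model.strip()
    if not name:
        raise ValueError("model name must not be empty")
    return ["ollama", "pull", name]


def pull_model(model: str) -> None:
    """Download a model; raises CalledProcessError if the pull fails.

    Assumption: the `ollama` executable is installed and on PATH.
    """
    subprocess.run(build_pull_command(model), check=True)
```

Calling pull_model("llama3") would run the same command as the first line of the transcript above. Passing the arguments as a list avoids shell quoting issues with tags that contain colons.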

Full guide: Llama 3 on Windows

Mistral — Mistral AI

Mistral 7B is one of the fastest models per token on consumer hardware. It excels at coding, summarisation and quick Q&A, and it runs well on GPUs with as little as 6 GB of VRAM or on the CPU alone.

Variant | Size | VRAM needed | Pull command
7B (default) | 4.1 GB | 5 GB+ | ollama pull mistral
Mistral Nemo 12B | 7.1 GB | 8 GB+ | ollama pull mistral-nemo
cmd.exe
C:\> ollama pull mistral
C:\> ollama run mistral
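
Once a model is pulled, it can be queried from code as well as from the interactive prompt, through the HTTP API that Ollama serves locally. The sketch below assumes the server is listening on its default address, http://localhost:11434, and that the model has already been downloaded. The helper names build_generate_request and ask are illustrative only.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

For a quick summarisation task, ask("mistral", "Summarise this paragraph: ...") would return the model's reply as a string. Setting "stream" to False asks for a single JSON object instead of a sequence of partial responses.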

Full guide: Mistral on Windows

Qwen 2.5 — Alibaba

Qwen 2.5 ranks among the strongest open-weight models on reasoning, math and multilingual benchmarks. It is available in sizes from 0.5B to 72B and is a strong choice for non-English languages and technical work.

Variant | Size | VRAM needed | Pull command
7B | 4.4 GB | 6 GB+ | ollama pull qwen2.5
14B | 8.7 GB | 10 GB+ | ollama pull qwen2.5:14b
Coder 7B | 4.4 GB | 6 GB+ | ollama pull qwen2.5-coder
cmd.exe
C:\> ollama pull qwen2.5
C:\> ollama run qwen2.5

Full guide: Qwen 2.5 on Windows

Gemma 2 — Google

Gemma 2 from Google performs well for its size. The 2B variant runs on almost any hardware, including CPU-only machines, which makes it a good fit for embedded use cases and systems with limited RAM.

Variant | Size | VRAM needed | Pull command
2B | 1.6 GB | 3 GB+ | ollama pull gemma2:2b
9B | 5.4 GB | 7 GB+ | ollama pull gemma2
27B | 16 GB | 20 GB+ | ollama pull gemma2:27b
cmd.exe
C:\> ollama pull gemma2:2b
C:\> ollama run gemma2:2b

Phi-4 — Microsoft

Phi-4 is Microsoft's small language model optimised for instruction following and coding. At 14B parameters it delivers quality comparable to much larger models on many benchmarks, while needing only about 9 GB of VRAM.

Variant | Size | VRAM needed | Pull command
Phi-4 14B | 8.2 GB | 9 GB+ | ollama pull phi4
Phi-4 Mini 3.8B | 2.5 GB | 4 GB+ | ollama pull phi4-mini
cmd.exe
C:\> ollama pull phi4
C:\> ollama run phi4

Full guide: Phi-4 & Gemma 2 on Windows

Browse all available models

Ollama supports hundreds of models. Browse the full library at ollama.com/library. Common additional picks:

cmd.exe
# Code generation:
C:\> ollama pull codellama
C:\> ollama pull deepseek-coder
# List your installed models:
C:\> ollama list
# Remove a model to free disk space (replace modelname with the model's name):
C:\> ollama rm modelname
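
To decide which models to remove, it can help to see the installed models sorted by disk usage. The sketch below reads the same information as ollama list from the local API's /api/tags endpoint. It assumes the server is on its default address and that each entry in the reply carries a name and a size in bytes. The helper names fetch_tags and summarise_models are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
TAGS_URL = "http://localhost:11434/api/tags"


def summarise_models(tags: dict) -> list[tuple[str, float]]:
    """Return (name, size in GB) pairs, largest first, from an /api/tags reply."""
    rows = [(m["name"], round(m["size"] / 1e9, 1)) for m in tags.get("models", [])]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def fetch_tags() -> dict:
    """Fetch the list of installed models from the local server."""
    with urllib.request.urlopen(TAGS_URL) as resp:
        return json.loads(resp.read())
```

Calling summarise_models(fetch_tags()) puts the largest model first, which is usually the best candidate for ollama rm when disk space runs low.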

Not sure which model to pick?

Llama 3 8B is the best starting point for most Windows users.
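
If the choice depends mainly on GPU memory, the VRAM figures from the tables above can be turned into a simple filter. This is only a sketch: the MODELS list copies the pull tags and minimum VRAM values stated on this page, and the function name models_that_fit is made up for the example.

```python
# (pull tag, minimum VRAM in GB), copied from the tables on this page
MODELS = [
    ("llama3", 6), ("llama3:8b-instruct-q4_K_M", 5), ("llama3:70b", 48),
    ("mistral", 5), ("mistral-nemo", 8),
    ("qwen2.5", 6), ("qwen2.5:14b", 10), ("qwen2.5-coder", 6),
    ("gemma2:2b", 3), ("gemma2", 7), ("gemma2:27b", 20),
    ("phi4", 9), ("phi4-mini", 4),
]


def models_that_fit(vram_gb: float) -> list[str]:
    """Return pull tags whose stated VRAM minimum fits the budget.

    Results are ordered from the most to the least demanding model;
    ties keep the order of the MODELS list.
    """
    fitting = [(need, tag) for tag, need in MODELS if need <= vram_gb]
    return [tag for need, tag in sorted(fitting, key=lambda pair: -pair[0])]
```

With a 6 GB card, models_that_fit(6) starts with llama3, in line with the recommendation above. The filter treats the stated minimums as hard limits and ignores CPU-only operation, so it is a rough guide rather than a guarantee.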