can-i-run-this-llm

LEADERBOARD

Fastest rigs, ranked by real benchmarks

Ranked from community-measured tokens/sec, not our estimates. Numbers from one or two reports are tagged prelim until a third confirms them.
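The prelim rule above can be sketched in a few lines. This is only an illustration of the stated rule: the function name `report_tag` and the `confirm_threshold` parameter are hypothetical, not taken from the site's code.

```python
def report_tag(report_count: int, confirm_threshold: int = 3) -> str:
    """Return "prelim" while a number has fewer than `confirm_threshold` reports.

    Assumption: the threshold of 3 comes from the rule "one or two reports
    are tagged prelim until a third confirms them".
    """
    if report_count < 1:
        raise ValueError("a ranked number needs at least one report")
    return "prelim" if report_count < confirm_threshold else ""


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, repr(report_tag(n)))
```

Under this sketch, a number backed by one or two reports carries the tag, and the tag drops off once a third report arrives.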

Most tokens per dollar

Measured decode tok/s per $1,000 of device cost.

  1. 81.4 tok/s per $1k · NVIDIA RTX 4090 (24GB) · Llama 3.1 8B Q4_K_M · 130.2 tok/s · $1,599 · prelim
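As a rough check of how this metric appears to be derived, the sketch below divides measured decode speed by device cost in thousands of dollars. The function name `tokens_per_kilodollar` is hypothetical. The inputs are the RTX 4090 figures listed above.

```python
def tokens_per_kilodollar(decode_tok_s: float, device_cost_usd: float) -> float:
    """Decode tokens/sec per $1,000 of device cost.

    Assumption: the leaderboard computes tok/s divided by (cost / 1000).
    """
    if device_cost_usd <= 0:
        raise ValueError("device cost must be positive")
    return decode_tok_s / (device_cost_usd / 1000.0)


if __name__ == "__main__":
    # RTX 4090 entry: 130.2 tok/s at $1,599
    score = tokens_per_kilodollar(130.2, 1599)
    print(f"{score:.1f} tok/s per $1k")  # 81.4 tok/s per $1k
```

The result matches the 81.4 shown for the entry, which suggests the listed cost is the only price input.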

Fastest rig per model

Llama 3.1 8B

  1. 130.2 tok/s · NVIDIA RTX 4090 (24GB) · Q4_K_M · prelim
  2. 49.3 tok/s · MacBook Pro M2 Max (32GB) · Q4_K_M · prelim

3 benchmarks · 2 devices · 1 model · full benchmark table →