Show HN: Find the best local LLM for your hardware, ranked by benchmarks

9.5 relevance

A tool that ranks local LLMs for your hardware by published benchmark scores; directly actionable and highly relevant.

2026-05-15 ai/ml Hacker News (100+)

Find the local LLM that actually runs — and performs best — on your hardware. Ranked by real, recency-aware benchmarks, not parameter count. One command, run it instantly. - Andyyyy64/whichllm

Summary

whichllm is an open-source CLI that auto-detects your hardware and ranks local LLMs from HuggingFace by real benchmark scores (LiveBench, Aider, Chatbot Arena Elo, Open LLM Leaderboard) rather than by VRAM fit alone. It demotes stale models based on recency, grades each score by confidence (direct/variant/base/interpolated/self-reported), and uses architecture-aware estimates, including the active-vs-total parameter split for MoE models. The tool outputs JSON for scripting, supports one-command chat via uv, and can simulate any GPU for hardware planning.
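
To make the scoring ideas in the summary concrete, here is a minimal Python sketch of what a confidence-graded, recency-aware ranking with an MoE-aware memory check might look like. It is not whichllm's actual code or formula: the confidence weights, the one-year half-life, the 4-bit quantization, the fixed overhead, the model entries, and all names (Candidate, recency_factor, weights_gb, adjusted_score, rank) are assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical weights, one per confidence grade named in the summary.
CONFIDENCE_WEIGHT = {
    "direct": 1.0,
    "variant": 0.9,
    "base": 0.8,
    "interpolated": 0.6,
    "self-reported": 0.5,
}


@dataclass
class Candidate:
    name: str
    benchmark_score: float  # assumed to be normalized to 0..100
    confidence: str         # a key of CONFIDENCE_WEIGHT
    released: date
    total_params_b: float   # billions of parameters, all experts included
    active_params_b: float  # billions used per token (equals total if dense)


def recency_factor(released: date, today: date, half_life_days: float = 365.0) -> float:
    """Exponential decay: the factor halves every half_life_days (assumed)."""
    age_days = max((today - released).days, 0)
    return 0.5 ** (age_days / half_life_days)


def weights_gb(c: Candidate, bits_per_weight: float = 4.0) -> float:
    """Approximate weight memory. Uses total parameters, not active ones."""
    return c.total_params_b * bits_per_weight / 8.0


def adjusted_score(c: Candidate, today: date) -> float:
    """Benchmark score scaled by confidence grade and by recency."""
    return (
        c.benchmark_score
        * CONFIDENCE_WEIGHT[c.confidence]
        * recency_factor(c.released, today)
    )


def rank(candidates, vram_gb: float, today: date, overhead_gb: float = 1.5):
    """Drop models that do not fit, then sort the rest by adjusted score."""
    fitting = [c for c in candidates if weights_gb(c) + overhead_gb <= vram_gb]
    return sorted(fitting, key=lambda c: adjusted_score(c, today), reverse=True)


# Made-up models, purely to exercise the functions above.
today = date(2026, 5, 15)
models = [
    Candidate("dense-old", 80.0, "direct", date(2024, 5, 15), 8.0, 8.0),
    Candidate("dense-new", 70.0, "direct", date(2026, 5, 15), 8.0, 8.0),
    Candidate("moe-big", 90.0, "self-reported", date(2026, 5, 15), 100.0, 10.0),
]
print([c.name for c in rank(models, vram_gb=24.0, today=today)])
# prints ['dense-new', 'dense-old']
```

In this sketch the older model loses to the newer one despite a higher raw score, and the MoE model is excluded because memory fit follows total parameters (in the usual case where all experts are held in memory), even though only a fraction of them is active per token. Active parameters mainly affect speed, which the sketch does not model.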

Key Takeaway

Use whichllm to select the best-performing local LLM that fits your hardware, based on real benchmark data and confidence-graded scores.

Why it matters

whichllm addresses the problem of choosing a local LLM for specific hardware without falling back on size heuristics. That saves time and helps get the best available performance for agent orchestration and development workflows.