JetBrains open-sources Mellum2 to go where Claude Code can’t
JetBrains open-sources Mellum2, a new 12B MoE coding model, directly relevant to AI coding tools.
JetBrains has open-sourced Mellum2, a 12B-parameter mixture-of-experts (MoE) model that activates 2.5B parameters per token. It targets agentic infrastructure tasks (routing, retrieval, sub-agent coordination) and private on-premises deployment, the settings where Claude Code can't go. The successor to Mellum, a 4B code-completion model, Mellum2 reaches 192 tokens/sec on a single H100, runs 21% ahead of Qwen2.5-7B under concurrent load, and scores 78.4% on EvalPlus function-level code generation, though it concedes broader reasoning benchmarks (GPQA, MMLU-Redux) to frontier models. Two variants ship: "instruct" for direct answers and "thinking" for explicit reasoning traces in multi-step agentic tasks.
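As a rough illustration of what the MoE figures above imply, the following sketch computes the share of parameters that are active per token. The function name `active_fraction` is hypothetical, and the only inputs are the 12B total and 2.5B active figures quoted above; it says nothing about measured memory use or speed.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the fraction of a model's parameters used per token.

    Both arguments are parameter counts in billions.
    """
    if total_params_b <= 0:
        raise ValueError("total parameter count must be positive")
    if not 0 <= active_params_b <= total_params_b:
        raise ValueError("active parameters must be between 0 and the total")
    return active_params_b / total_params_b


# Figures quoted in the summary: 12B total, 2.5B active per token.
fraction = active_fraction(12.0, 2.5)
print(f"{fraction:.1%} of parameters active per token")  # prints "20.8% ..."
```

Roughly a fifth of the weights take part in each token's computation, which is the usual reason an MoE model can be fast for its size. The full 12B parameters typically still have to be held in memory, so this ratio is a guide to compute per token, not to hardware footprint.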
- Evaluate Mellum2 as a specialized component for private on-premises agentic pipelines where latency and control over code intelligence matter more than general reasoning breadth.
For architects building AI-augmented SDLC pipelines, Mellum2 offers a cost-effective, high-throughput specialized model that can run on your own infrastructure for agentic sub-tasks without giving up control over inference or latency.