Research

Tracking model releases, stack updates, benchmarks, and infrastructure relevant to private AI deployment.

Model releases

MiniMax M2.5 — 230B-parameter model achieves 74% on SWE-bench

MiniMax releases its M2.5 model with strong coding performance; it now serves as the coding model for VAULTLINE AI's Pro tier.

Qwen 3.5 — 397B-parameter model with 256K context window

Alibaba releases Qwen 3.5, a 397B-parameter model supporting a 256K-token context window and strong reasoning.

Stack updates

vLLM 0.8 — throughput improvements and FP4 quantisation support

vLLM 0.8 ships with significant throughput improvements and native FP4 quantisation support for NVIDIA GB-series GPUs.
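To illustrate why FP4 quantisation matters for private deployment, here is a back-of-the-envelope weight-memory estimate. This is a sketch of the bit-width arithmetic only, not of any vLLM internals; it counts model weights alone and ignores KV cache and activation overhead.

```python
def weight_gb(params: float, bits: int) -> float:
    """Approximate weight memory in GB: params * (bits / 8) bytes, with 1 GB = 1e9 bytes."""
    return params * bits / 8 / 1e9

# A 230B-parameter model (e.g. MiniMax M2.5's reported size):
fp16 = weight_gb(230e9, 16)  # 460.0 GB in FP16
fp4 = weight_gb(230e9, 4)    # 115.0 GB in FP4 -- a 4x reduction from bit-width alone
```

The 4x saving follows directly from 16 bits vs 4 bits per weight; actual serving memory will be somewhat higher once the runtime's KV cache and activations are included.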

Open WebUI 0.6 — multi-user RBAC and MCP connector framework

Open WebUI 0.6 introduces a comprehensive role-based access control system and a standardised MCP connector framework.

Benchmarks

SWE-bench Verified 2026 results — open-source models close the gap

Updated SWE-bench Verified results show open-source coding models within 5–8 percentage points of the leading proprietary models.

MMLU-Pro 2026 — Qwen 3.5 tops open-source leaderboard

Qwen 3.5 achieves a new open-source record on MMLU-Pro, demonstrating strong general reasoning across professional domains.

Infrastructure

NVIDIA DGX Spark — compact AI compute for SME offices

NVIDIA announces DGX Spark, a desktop AI system designed for local deployment with up to 128GB of unified memory.

Apple M5 Ultra — Mac Studio reaches 192GB unified memory

Apple's M5 Ultra chip brings 192GB of unified memory to the Mac Studio form factor, enabling large-model inference in an office environment.
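Combining the unified-memory figures above with the model sizes tracked earlier, a quick fit check shows what these machines can plausibly host. This is a rough sketch: it counts quantised weights only, so KV cache, activations, and OS overhead mean real deployments need extra headroom.

```python
def fits(params: float, bits: int, memory_gb: float) -> bool:
    """True if the quantised weights alone fit the memory budget (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9 <= memory_gb

# A 230B-parameter model at 4-bit (~115 GB of weights) fits DGX Spark's 128GB budget.
print(fits(230e9, 4, 128))  # True
# A 397B-parameter model at 4-bit (~198.5 GB of weights) exceeds even 192GB.
print(fits(397e9, 4, 192))  # False
```

In practice the second case would need a lower bit-width, weight offloading, or a multi-machine setup.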