Built on Groq + NVIDIA open models

The right model for every job, picked automatically.

LLM Switchboard catalogs every open model on Groq and NVIDIA's build.nvidia.com (plus 303+ models you can run on your own hardware), scores them on what matters for your task, and routes each request to the best fit. Cloud or local. One API. No lock-in. Cut your AI bill 40–80% without your customers noticing.

45 cloud models

303+ run locally (< 25 GB)

$0 to start (free tiers)

15 benchmarks graded

One control room for your whole AI stack

⚡

Smart router

Classifies each job, filters by your constraints, and ranks models with a transparent score. Callable as a REST API or an importable module.

⬇

Run locally

303+ open models under 25GB for reasoning, coding, vision, STT, TTS & embeddings — copy-paste Ollama/Docker commands, picked by benchmark, with a built-in sandbox to test them.

◑

Business recipes

MEDDIC & BANT call analysis, website SDR chat, AI voice SDR, compliance triage — each with the routed model, an example result, and a live test.

▤

Benchmark intelligence

We grade the benchmarks themselves with the Benchmark² framework, so you trust the right signal — not just whoever topped a leaderboard.

⟳

Always current

A freshness pipeline pulls new open-source models from free feeds — staying current for under $100/mo per source.

⬡

Harnesses & skills

A buyer's guide to LangGraph, CrewAI, DeerFlow and the SDKs, plus the portable SKILL.md ecosystem (NVIDIA-verified, cybersecurity, gstack).

◇

Vector databases

From pgvector to Milvus to Alibaba's zvec — pick the right memory layer for RAG and agents.

How it works

Describe the job

Send a prompt or pick a recipe. LLM Switchboard classifies it into one of 18 job types and reads your constraints (budget, latency, context, modality).
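As a rough illustration of this step, the sketch below maps a prompt to a job type and carries the four constraint kinds named above. The page does not say how classification is done or list the 18 job types, so the keyword rules, the three job-type names (borrowed from the recipes on this page), the `general` fallback, and every field name here are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword rules standing in for the real classifier.
# Plain substring matching is crude ("iso" also matches "isolation");
# it is used here only to keep the example short.
RULES = {
    "call_analysis": ("transcript", "meddic", "bant"),
    "lead_qualification": ("lead", "qualify", "demo"),
    "compliance_triage": ("soc 2", "iso", "evidence"),
}


@dataclass(frozen=True)
class Constraints:
    """The four constraint kinds the page mentions; field names are invented."""
    max_usd_per_mtok: Optional[float] = None   # budget
    max_latency_ms: Optional[int] = None       # latency
    min_context_tokens: Optional[int] = None   # context
    modality: str = "text"                     # modality


def classify(prompt: str) -> str:
    """Return the job type whose keywords appear most often in the prompt."""
    text = prompt.lower()
    hits = {job: sum(kw in text for kw in kws) for job, kws in RULES.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "general"
```

A prompt that mentions a transcript and MEDDIC lands in `call_analysis`; a prompt with no matching keyword falls through to `general`.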

It picks the model

The engine filters the catalog and scores every candidate on task-fit, cost and speed — then explains the choice with capability scores and the relevant benchmarks.
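One way to picture the filter-then-score step is the sketch below. It is not the product's scoring formula: the weights, the normalization, the field names, and the three catalog entries (`model-a`, `model-b`, `model-c`) are all invented for illustration. It returns the per-factor breakdown alongside the total so the choice can be explained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    task_fit: float        # 0..1, assumed fit for the classified job type
    usd_per_mtok: float    # price per million tokens
    tokens_per_sec: float  # throughput
    context_tokens: int    # context window


@dataclass(frozen=True)
class Constraints:
    max_usd_per_mtok: float
    min_context_tokens: int


# Assumed weights; the real engine's weighting is not published on this page.
WEIGHTS = {"task_fit": 0.6, "cost": 0.2, "speed": 0.2}


def score(model, pool):
    """Weighted sum of task-fit, cost and speed, normalized within the pool."""
    priciest = max(m.usd_per_mtok for m in pool)
    fastest = max(m.tokens_per_sec for m in pool)
    parts = {
        "task_fit": model.task_fit,
        "cost": 1.0 - model.usd_per_mtok / priciest if priciest else 1.0,
        "speed": model.tokens_per_sec / fastest if fastest else 0.0,
    }
    return sum(WEIGHTS[k] * v for k, v in parts.items()), parts


def pick(catalog, constraints):
    """Drop models that violate a constraint, then return the top scorer."""
    pool = [
        m for m in catalog
        if m.usd_per_mtok <= constraints.max_usd_per_mtok
        and m.context_tokens >= constraints.min_context_tokens
    ]
    if not pool:
        raise ValueError("no model satisfies the constraints")
    scored = [(m, *score(m, pool)) for m in pool]
    return max(scored, key=lambda item: item[1])  # (model, total, parts)


# Made-up catalog entries, not real models or prices.
CATALOG = [
    Model("model-a", 0.90, 2.0, 100.0, 128_000),
    Model("model-b", 0.60, 0.2, 500.0, 32_000),
    Model("model-c", 0.95, 10.0, 50.0, 200_000),
]
```

With a budget cap of 5.0 and a small context requirement, the cheaper and faster `model-b` wins despite its lower task-fit; raising the context requirement to 100,000 tokens leaves only `model-a` in the pool.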

You route to it

Get the decision over REST, or let LLM Switchboard execute the call on Groq/NVIDIA with an automatic cross-provider fallback chain.
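A fallback chain of this kind can be sketched as an ordered list of providers tried in turn. The version below is a minimal illustration under stated assumptions: the provider labels and the two stub functions are placeholders, and no real Groq or NVIDIA client call is shown, since the page does not document them.

```python
from typing import Callable, List, Sequence, Tuple

# (label, callable) pairs; each callable takes a prompt and returns text.
Provider = Tuple[str, Callable[[str], str]]


def call_with_fallback(prompt: str, chain: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; return (label, reply) from the first success."""
    errors: List[str] = []
    for label, call in chain:
        try:
            return label, call(prompt)
        except Exception as exc:  # a real router would catch narrower errors
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers for the example only.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def steady_secondary(prompt: str) -> str:
    return f"echo: {prompt}"
```

Calling `call_with_fallback("hi", [("primary", flaky_primary), ("secondary", steady_secondary)])` skips the failing stub and returns the reply from the second one, along with its label so the caller knows which provider actually served the request.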

Real jobs, real picks

◎ MEDDIC deal scorer

Turn a sales-call transcript into a MEDDIC scorecard with a deal-health score. → routes to a frontier reasoning model with 128k+ context.

✸ Website SDR chat

Real-time lead qualification on your homepage. → routes to the fastest Groq model — sub-second, nearly free at scale.

☎ AI voice SDR

Outbound voice that books meetings. → a Whisper → LLM → Orpheus pipeline, all on low-latency infra.

⛉ Compliance triage

Classify SOC 2 / ISO evidence at scale. → routes to a cheap fast classifier; pennies for the whole pile.
