Sources & freshness · LLM Switchboard

How LLM Switchboard stays current

Tiered pull+watch architecture. Use a single lightweight ingestion service (Python + httpx + a scheduler such as GitHub Actions cron, Cloud Run Jobs, or a small VM cron) writing to a normalized `models` table (Postgres/SQLite) keyed by a canonical model id (vendor/family/variant), with a `source_provenance` JSONB column so each field traces back to its origin. Layer the sources by role:

(1) Pricing + context core: OpenRouter /api/v1/models is the spine. One unauthenticated GET returns ~350 models with prompt/completion price, context_length, modality, and supported params; it is free and needs no key, so it can run hourly.
(2) Discovery of new OSS drops: Hugging Face list-models (sorted by trendingScore/createdAt/downloads), the arXiv Atom feed (cat:cs.CL/cs.AI), and GitHub Releases atom feeds for the ~20 labs you care about (meta-llama, Qwen, deepseek-ai, mistralai, google, etc.). These surface a model before it has pricing anywhere.
(3) Provider-specific availability: Groq, NVIDIA build.nvidia.com, Together, Fireworks, and DeepInfra each expose an OpenAI-compatible GET /models (or /models/list). Poll these to know "who serves what, and at what host." DeepInfra's /models/list is public and even carries per-token pricing; Groq, NVIDIA, Together, and Fireworks need a (free) key for the list.
(4) Quality/rank signals: LMArena (no official API; use the official lmarena-ai/leaderboard-dataset on HF Hub or the community GitHub Actions JSON snapshot at api.wulong.dev), Epoch AI notable_ai_models.csv (3500+ models, training compute, params, release date), and the Artificial Analysis Data API free tier (headline intelligence index, median speed, input/output price) as a cross-check on pricing/benchmarks.
(5) Pricing cross-check / sanity: simonw/llm-prices current-v1.json (clean per-vendor input/output/cached JSON, MIT-licensed, free) as a second pricing oracle to flag disagreements.
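
The normalized `models` table with per-field provenance described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual schema: it uses SQLite (so the JSON columns are plain TEXT rather than Postgres JSONB), the `fields` column and the helper names `canonical_id` and `upsert_model` are assumptions, and the write is a read-then-write that behaves idempotently rather than a single `ON CONFLICT` statement.

```python
import json
import sqlite3
from datetime import datetime, timezone

# SQLite stand-in for the `models` table. The `fields` column is an
# illustrative catch-all; a real schema would likely use typed columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS models (
    id                TEXT PRIMARY KEY,  -- canonical id: vendor/family/variant
    fields            TEXT NOT NULL,     -- JSON object of normalized values
    source_provenance TEXT NOT NULL,     -- JSON object: field name -> source
    first_seen        TEXT NOT NULL,
    last_seen         TEXT NOT NULL,
    last_changed      TEXT NOT NULL
)
"""


def canonical_id(vendor: str, family: str, variant: str) -> str:
    """Build a lowercase vendor/family/variant key (hypothetical normalization)."""
    return "/".join(part.strip().lower() for part in (vendor, family, variant))


def upsert_model(conn, model_id, new_fields, source, now=None):
    """Merge new_fields into the row for model_id; return True if data changed.

    Re-running with identical input only bumps last_seen, so the
    operation is idempotent with respect to the stored values.
    """
    now = now or datetime.now(timezone.utc).isoformat()
    row = conn.execute(
        "SELECT fields, source_provenance FROM models WHERE id = ?", (model_id,)
    ).fetchone()
    if row is None:
        provenance = {name: source for name in new_fields}
        conn.execute(
            "INSERT INTO models VALUES (?, ?, ?, ?, ?, ?)",
            (model_id, json.dumps(new_fields), json.dumps(provenance), now, now, now),
        )
        return True

    fields, provenance = json.loads(row[0]), json.loads(row[1])
    changed = False
    for name, value in new_fields.items():
        if name not in fields or fields[name] != value:
            fields[name] = value
            provenance[name] = source  # record which source set this field
            changed = True
    if changed:
        conn.execute(
            "UPDATE models SET fields = ?, source_provenance = ?, "
            "last_seen = ?, last_changed = ? WHERE id = ?",
            (json.dumps(fields), json.dumps(provenance), now, now, model_id),
        )
    else:
        conn.execute("UPDATE models SET last_seen = ? WHERE id = ?", (now, model_id))
    return changed
```

Calling `upsert_model` once per source per pull keeps each source's contribution separate, so a later price or context change can be traced to the feed that reported it.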

Merge logic. Discovery sources create a stub row (id, vendor, release_date, hf_repo, arxiv_id); provider /models endpoints attach availability and host; OpenRouter, llm-prices, DeepInfra, and Artificial Analysis attach pricing and context (prefer OpenRouter, fall back to provider-native, and flag rows where two sources disagree by more than 10%); LMArena, Epoch, and AA attach rank, intelligence, and compute. Every write is an idempotent UPSERT on the canonical id; keep a `first_seen`/`last_seen`/`last_changed` audit trail and emit a daily diff (new models, price changes, deprecations) to Slack or a webhook. Every source is individually free or under $100/mo, so total cost is dominated by your own compute (a $5-20/mo VM or free GitHub Actions minutes).

Polling schedule (cron):
OpenRouter /api/v1/models: hourly (free, no auth; it is the pricing+context spine and cheap to poll often, so price/context drift is caught fast).
DeepInfra /models/list: every 6h (public, carries pricing).
Groq, NVIDIA build.nvidia.com, Together, Fireworks /models: every 6h (catalog/availability changes are infrequent; free keys).
Hugging Face list-models sorted by createdAt + trendingScore: every 3h for trending/new (1,000 API req/5min on a free token is ample), plus a nightly full sweep filtered to text-generation with likes/downloads thresholds.
arXiv Atom feed (cs.CL, cs.AI): every 6h (respect 1 request / 3 sec, page politely).
GitHub Releases atom feeds for tracked labs: hourly (cheap, catches the actual weight drop).
LMArena (HF leaderboard-dataset / community JSON snapshot): daily.
Epoch AI notable_ai_models.csv: weekly (slow-moving research dataset).
Artificial Analysis Data API (free tier): daily, hard-capped at 100 req/day; one batched call is enough.
simonw/llm-prices current-v1.json: daily (it updates roughly daily).

Emit a consolidated diff/changelog once per day at a fixed time (e.g., 14:00 UTC), after the morning pulls have completed.
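
The pricing rule in the merge logic (prefer OpenRouter, fall back to provider-native, flag disagreements above 10%) might look like the sketch below. Several details are assumptions rather than anything the source specifies: the source labels, the ordering of the fallbacks after provider-native, the function name `reconcile_price`, and the choice to measure disagreement relative to the lower of the two prices.

```python
from itertools import combinations

# Preference order for the stored price. Only the first two positions
# come from the text; the rest of the ordering is an assumption.
SOURCE_PRIORITY = ["openrouter", "provider_native", "llm_prices", "artificial_analysis"]


def reconcile_price(readings, threshold=0.10):
    """Pick one price and flag disagreement between any two sources.

    readings maps a source label to a price (USD per 1M tokens).
    Returns None when no known source reported a price.
    """
    chosen = next((s for s in SOURCE_PRIORITY if s in readings), None)
    if chosen is None:
        return None

    disagreements = []
    for a, b in combinations(sorted(readings), 2):
        low, high = sorted((readings[a], readings[b]))
        if low <= 0:
            # A zero price cannot anchor a ratio; flag if the other is non-zero.
            differs = high > 0
        else:
            differs = (high - low) / low > threshold
        if differs:
            disagreements.append((a, b))

    return {
        "price": readings[chosen],
        "source": chosen,
        "flagged": bool(disagreements),
        "disagreements": disagreements,
    }
```

Flagged rows keep the preferred price but carry the list of disagreeing source pairs, which gives a reviewer enough context to decide which feed is stale.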

Run the freshness pipeline

Pulls live data from the free feeds below and reconciles the catalog.
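
Reconciling the catalog after a pull comes down to comparing the previous snapshot with the new one and reporting new models, price changes, and deprecations. A minimal sketch of that comparison, assuming each snapshot is a dict keyed by canonical id with hypothetical `input_price`, `output_price`, and `deprecated` fields (posting the result to Slack or a webhook is left out):

```python
def diff_catalog(previous, current):
    """Compare two catalog snapshots keyed by canonical model id."""
    new_models = sorted(set(current) - set(previous))
    price_changes = []
    deprecations = []

    for model_id in sorted(set(current) & set(previous)):
        old, new = previous[model_id], current[model_id]
        for field in ("input_price", "output_price"):
            if old.get(field) != new.get(field):
                price_changes.append((model_id, field, old.get(field), new.get(field)))
        # Report only the transition into the deprecated state.
        if new.get("deprecated") and not old.get("deprecated"):
            deprecations.append(model_id)

    return {
        "new_models": new_models,
        "price_changes": price_changes,
        "deprecations": deprecations,
    }


def format_changelog(diff):
    """Render the diff as plain text suitable for a chat message."""
    lines = []
    for model_id in diff["new_models"]:
        lines.append(f"NEW {model_id}")
    for model_id, field, old, new in diff["price_changes"]:
        lines.append(f"PRICE {model_id} {field}: {old} -> {new}")
    for model_id in diff["deprecations"]:
        lines.append(f"DEPRECATED {model_id}")
    return "\n".join(lines) if lines else "No changes."
```

Running the comparison against the snapshot from the previous daily report, rather than the previous pull, keeps hourly polls from splitting one day's changes across several messages.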

New-model radar: Hugging Face trending

Open models gaining traction that aren't curated yet.

1. zai-org/GLM-5.2 (in catalog, 40k dl)
2. yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF (new, 456k dl)
3. WeiboAI/VibeThinker-3B (new, 41k dl)
4. baidu/Unlimited-OCR (new, 8k dl)
5. yuxinlu1/gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF (new, 96k dl)
6. unsloth/GLM-5.2-GGUF (in catalog, 56k dl)
7. HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive (in catalog, 3955k dl)
8. empero-ai/Qwythos-9B-Claude-Mythos-5-1M (new, 2k dl)
9. nvidia/LocateAnything-3B (new, 274k dl)
10. MiniMaxAI/MiniMax-M3 (new, 131k dl)
11. empero-ai/Qwythos-9B-Claude-Mythos-5-1M-GGUF (new, 27k dl)
12. microsoft/FastContext-1.0-4B-SFT (new, 4k dl)
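
A radar like the one above can be derived from a Hugging Face list-models response by ranking on `trendingScore` and checking each repo id against the catalog. The sketch below is one plausible way to do it, not the page's actual implementation: it assumes the response has already been fetched and parsed into dicts with `id`, `downloads`, and `trendingScore` keys, and that the catalog exposes a set of known `hf_repo` values. The function names are hypothetical.

```python
def format_downloads(count):
    """Render a download count in thousands, e.g. 40212 -> '40k dl'."""
    return f"{round(count / 1000)}k dl"


def build_radar(trending, catalog_repos, limit=12):
    """Rank trending repos and label each as 'in catalog' or 'new'.

    trending: list of dicts parsed from the list-models response.
    catalog_repos: set of hf_repo values already curated in the catalog.
    """
    ranked = sorted(
        trending, key=lambda model: model.get("trendingScore", 0), reverse=True
    )[:limit]

    lines = []
    for rank, model in enumerate(ranked, start=1):
        status = "in catalog" if model["id"] in catalog_repos else "new"
        downloads = format_downloads(model.get("downloads", 0))
        lines.append(f"{rank}. {model['id']} ({status}, {downloads})")
    return lines
```

Entries labeled "new" are the candidates for stub rows; matching on the exact repo id means quantized re-uploads (for example GGUF variants) show up as separate entries unless the catalog lists them too.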

Data sources & cost

Every source stays under $100/month. The two free feeds alone (the OpenRouter catalog and Hugging Face trending) keep pricing and new-model discovery current.

Source	Provides	Free tier	Paid cost	≤ $100/mo	Ingestion
OpenRouter /api/v1/models ↗	Unified catalog of ~350+ models across all providers with per-million prompt/completion pricing, context_length, input/output modalities, supported params, instruction type, and embedded Artificial Analysis benchmark data. The single best normalized pricing+context spine for a catalog.	free	$0 for catalog reads (you only pay for actual inference, which you are not doing). No subscription needed to read /models.	✓	Primary spine. Hourly GET, UPSERT on canonical id. Treat as authoritative for price+context unless a provider-native endpoint disagrees >10% (then flag). Parse architecture.input_modalities/output_modalities and supported_parameters for capability columns.
Hugging Face Hub API (list models / trending) ↗	Authoritative source for new open-weight model DISCOVERY: repo id, author, createdAt, lastModified, downloads, likes, trendingScore, tags/pipeline_tag, gated status, config. Sort by trendingScore or createdAt to catch drops the moment they are published.	free	$0 on Free user token. PRO is $9/mo if higher limits ever needed (2,500 API req/5min vs 1,000), but Free tier is sufficient.	✓	Discovery feed. Poll createdAt+trendingScore every 3h to create stub rows (id, vendor, release_date, hf_repo); nightly full sweep filtered to text-generation with downloads/likes thresholds to avoid noise. Stubs get enriched with pricing later from provider/OpenRouter sources.
Artificial Analysis Data API ↗	Independent benchmark + pricing data: headline intelligence index, median output speed (tokens/s), time-to-first-token, and input/output (and blended) pricing per model. Pro adds per-model V2 detail/percentiles; Commercial adds per-provider data and 7/30/90-day performance-over-time.	free	Free tier $0 (just an account + key). Pro and Commercial dollar pricing is NOT published — Commercial is manually provisioned via contacting their team. Use free tier to stay at $0; if Pro is ever needed, confirm it is under $100/mo before committing.	✓	Quality/price cross-check. Daily single batched call on free tier (hard 100/day cap) to attach intelligence_index, median_speed, and a second independent price reading. Flag rows where AA price disagrees with OpenRouter to catch stale data.
LMArena (Chatbot Arena) leaderboard ↗	Human-preference Elo rankings, vote counts, and confidence intervals across arena categories (overall, coding, vision, etc.) — the headline crowd-ranking signal users expect next to each model.	free	$0 via the official HF dataset or the community JSON snapshot. Apify scraper would be usage-priced (avoid; not needed).	✓	Rank signal. Pull daily from the HF dataset (primary, official) with the community JSON as fallback. Fuzzy-match arena model names to canonical ids; attach elo_rating, rank, vote_count. Name-matching is the main effort — maintain an alias map.
Groq /openai/v1/models ↗	Active model list for Groq's ultra-fast inference (model id, owned_by, context_window, active status, max_completion_tokens). Tells you which models Groq currently serves and their host-specific limits.	free	$0 — a free Groq account/key reads /models at no cost (you pay only for inference, which you don't do here).	✓	Availability layer. Poll every 6h with free key to mark which canonical models are 'served_by: groq' with context_window; join Groq's published per-token pricing (small manual/scraped map) for the Groq price column.
NVIDIA build.nvidia.com (NIM API catalog) ↗	Catalog of 100+ hosted NIM models (DeepSeek, Qwen, Mistral, Llama, Gemma, Nemotron, plus embedding/vision/speech/specialty) with OpenAI-compatible endpoints — availability + which open models NVIDIA hosts.	free	$0 for catalog access — free Developer Program account, nvapi- key with 1,000 inference credits, 40 req/min. Reading /models doesn't consume inference credits meaningfully.	✓	Availability layer. Poll /v1/models every 6h with free nvapi- key to mark 'served_by: nvidia_nim'. No public price, so do not use NVIDIA as a price source — availability/host only. Optionally scrape build.nvidia.com cards for capability tags.
llm-prices (simonw/llm-prices) JSON ↗	Curated, human-verified per-vendor pricing: id, vendor, name, input, output, input_cached (USD per 1M tokens) for current and historical prices. Clean canonical-name pricing and a price-history audit trail.	free	$0 — fully free static JSON.	✓	Pricing cross-check oracle. Daily GET current-v1.json; compare against OpenRouter and provider-native prices. Use as the tie-breaker / disagreement flag for closed-model pricing (OpenAI/Anthropic/Google) where OpenRouter may lag. Use historical-v1.json to backfill price_history.
Together AI /v1/models ↗	Catalog of ~186 serverless models (Llama all sizes, Mistral, Qwen, DeepSeek, Kimi, plus long-tail specialty) with model id, type, context length, and per-token pricing fields — availability + native pricing for a broad OSS catalog.	free	$0 to read /models with a free account/key (inference is pay-per-use and separate). Example serverless prices for reference: DeepSeek V4 Pro ~$2.10 in, Llama 3.3 70B ~$0.88.	✓	Availability + native-price layer for OSS breadth. Poll /v1/models every 6h; mark 'served_by: together' and attach Together's per-token price. Good native source for long-tail OSS models not yet on OpenRouter.
Fireworks AI /v1/models ↗	Catalog of ~201 serverless models with ids, context windows, and per-token pricing — availability + native pricing, often the cheapest on high-traffic models (e.g., DeepSeek V4 Pro ~$1.74 in, Kimi K2.6 ~$0.95).	free	$0 to read /models with a free account/key; inference billed per-use separately.	✓	Availability + native-price layer. Poll every 6h; mark 'served_by: fireworks' and attach Fireworks price. Use alongside Together/DeepInfra to compute a min-price-across-hosts column per OSS model.
DeepInfra /models/list ↗	100+ models with model_name, type, pricing (cents_per_sec or per-token with short/full descriptions), max_tokens, quantization, mmlu score, tags, deprecated/replaced_by, is_partner — availability + native pricing AND a built-in deprecation signal.	free	$0 — public unauthenticated catalog; inference is pay-per-use separately. Prices range ~$0.02/1M (small) up to ~$1.50/1M (large).	✓	Availability + native-price + DEPRECATION layer. Poll /models/list every 6h (no auth). Uniquely valuable for deprecated/replaced_by — use it to mark models as deprecated/superseded in your catalog and trigger changelog entries.
Epoch AI (notable AI models + benchmarking hub) ↗	Database of 3,500+ notable models with training compute (FLOP), parameter counts, training dataset size, release date, organization, country, and benchmark results — the authoritative source for 'hard facts' (release date, size, compute) that no inference API provides.	free	$0 — free CSV downloads and free Python client.	✓	Metadata enrichment. Weekly CSV pull to attach release_date, params, training_compute, organization, and benchmark scores to canonical rows. Slow-moving — weekly is plenty. Great for backfilling 'model facts' columns for models you discovered via HF/arXiv.
arXiv API (Atom feed) for new model papers ↗	Earliest signal of new models/architectures via research papers — title, abstract, authors, arXiv id, publication date in cs.CL/cs.AI/cs.LG. Often surfaces a model days before weights or pricing exist.	free	$0 — fully free.	✓	Earliest-discovery feed. Poll every 6h for new cs.CL/cs.AI submissions matching model keywords; create 'announced/research' stub rows with arxiv_id and link them to HF repos / provider availability as they appear. Filter aggressively (keyword + org allowlist) to control noise.
GitHub Releases (atom feeds) for model labs ↗	Real-time signal of the actual weight/code drop from labs (meta-llama, Qwen, deepseek-ai, mistralai, google-deepmind, etc.) — release tag, date, notes. Catches OSS launches at the moment of the GitHub release.	free	$0 — atom feeds and the REST API free tier cover this.	✓	Real-time OSS-drop trigger. Hourly poll of the curated releases.atom set; on a new release, create/flag a stub row and kick an enrichment pass (HF lookup + provider /models checks) so the model appears in the catalog within the hour of launch.
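
The min-price-across-hosts column mentioned for Together, Fireworks, and DeepInfra can be computed once each host's native prices are keyed by canonical id. A minimal sketch, assuming prices are already normalized to the same unit per host; the function name and the canonical id in the usage example are hypothetical, while the two input prices are the Together and Fireworks figures quoted in the table.

```python
def min_price_across_hosts(host_prices):
    """Return the cheapest host per model.

    host_prices maps host name -> {canonical model id: price}.
    The result maps canonical model id -> (host, price). On a tie,
    the host encountered first is kept.
    """
    best = {}
    for host, prices in host_prices.items():
        for model_id, price in prices.items():
            if price is None:
                continue  # host lists the model without a usable price
            if model_id not in best or price < best[model_id][1]:
                best[model_id] = (host, price)
    return best


# Usage with the input prices quoted above for DeepSeek V4 Pro.
cheapest = min_price_across_hosts(
    {
        "together": {"deepseek/deepseek-v4/pro": 2.10},
        "fireworks": {"deepseek/deepseek-v4/pro": 1.74},
    }
)
```

Hosts without a public price, such as NVIDIA in the table above, can simply be left out of `host_prices` so they contribute availability only.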