What is the difference between an LLM router and an aggregator?

A router decides which model to use for each request. An aggregator like OpenRouter is one API to many models, where routing is an optional add-on. Inferbase is routing-first and also serves the model, so the decision and the inference come from one place.

Do I still need an inference provider?

With a routing brain like NotDiamond or a framework like RouteLLM, yes, you bring your own providers and keys and run the inference yourself. With Inferbase, no, it routes and serves through one OpenAI-compatible API.

Is model routing actually worth it?

When your traffic is mixed, yes. Simple prompts do not need a frontier model, so routing each request to the smallest model that clears the bar cuts cost without dropping quality. A single hard-wired model leaves both quality and cost on the table.

Can I switch between these later?

They are OpenAI-compatible to varying degrees, so switching is mostly a base URL and key change. Moving to Inferbase needs no provider keys and no routing SDK in your code, you just set model="auto".

Inferbase

How Inferbase compares

Inferbase routes each request to the best model and serves it through one OpenAI-compatible API. Here is how that differs from the routers and aggregators you might be weighing.

The LLM routing landscape

Most tools sit in one of three buckets. Inferbase is the one that both decides and delivers.

OpenRouter

Aggregator

One API to hundreds of models. Routing is an add-on, and its Auto Router is outsourced to NotDiamond.

NotDiamond

Routing brain

Recommends the best model per prompt, then leaves you to run the inference with your own providers.

RouteLLM

OSS framework

A free, self-hosted strong-vs-weak router you install, calibrate, and operate yourself.

Inferbase

Routes and serves

Picks the best model per request across a curated catalog and runs it, through one OpenAI-compatible API, with a decision you can audit. First-party, end to end.

Why route at all

No single model is the right choice for every request. Routing turns model selection from a standing decision into a per-request one.

The right model is not the same from one request to the next:

A frontier model is overkill for easy work. Classification, short summaries, and simple Q&A do not need a top-tier model, so paying for one wastes money on the bulk of your traffic.
A small model falls short on hard work. Complex reasoning and long-context analysis need a stronger model, so a lean default quietly loses quality where it matters most.
A single default leaves both on the table. Routing picks a model per request instead, so spend and quality each track the difficulty of the work.

This page compares Inferbase against the routers and aggregators teams usually weigh, OpenRouter, NotDiamond, and RouteLLM, and the criteria that distinguish them.

How to choose an LLM router

Four questions that separate the categories, and where the real quality and cost gains hide.

Does it pick the model, or just the host?

Optimizing the provider for a model you already chose is not the same as choosing the right model per request, which is where the quality and cost gains are.

Does it run the inference, or just decide?

A router that only recommends leaves you to operate providers, keys, and fallback. Routing plus serving is one system instead of two.

Can you see why it chose?

Without a per-request decision trail, routing is a black box. An auditable decision matters for trust and for debugging what ran.

Managed, or yours to operate?

Self-hosted frameworks are free, but you own the server, the threshold calibration, and the upkeep as frontier models change.

At a glance

The capability split, in one view. Honest cells, the full, sourced breakdowns are linked below.

	Inferbase	OpenRouter	NotDiamond	RouteLLM
Picks the best model per request	First-party	Via NotDiamond add-on	Yes	Strong vs weak only
Routes and serves in one API	Yes	Yes	No, you run it	No, self-hosted
Per-request decision audit	Yes	No	Recommend-side	Build your own
Nothing to self-host or calibrate	Yes	Yes	Yes	No
Model breadth	Curated catalog, plus your own models	Hundreds of models	Your chosen pool	Two models

How Inferbase approaches it

Model selection as a per-request decision, with execution and an audit trail in the same place.

Inferbase treats model selection as a per-request decision: each prompt is classified, scored on the objective you set, and routed to the best model, with a fallback if one fails. What sets it apart is what happens around that decision.

It serves, not just decides. Unlike a routing brain, Inferbase runs the chosen model, so you get one endpoint, one bill, and one record per request.
Routing is first-party. Unlike an aggregator, model selection is benchmark-grounded rather than outsourced to a third-party router.
Nothing to operate. Unlike a self-hosted framework, there is no router server to run and no cost-quality threshold to calibrate by hand.

The same call that picks the model also serves it and records why. See how the routing works in more detail.

Read the full comparison

Each one leads with the routing-philosophy difference, then a side-by-side table and where each tool fits.

Inferbase vs OpenRouterAggregator

Routing is a NotDiamond-powered add-on.

Read the comparison

Inferbase vs NotDiamondRouting brain

Recommends a model; you run it.

Read the comparison

Inferbase vs RouteLLMOSS framework

Self-host and calibrate it yourself.

Read the comparison

Frequently asked questions

The routing landscape, in plain terms.

Start building with the right model.

Automatically route workloads to the right model for every task, every time.

Start Building Read the docs