About Us

(the version that matches reality)

What we are

  • A real-time router that hunts the lowest-cost GPU seats we can rent, then relays your request to whichever provider is cheapest at that moment.
  • Up-front about the trade-offs: FP8 by default, zero certifications, zero SLAs, and some upstream providers may log or train on your data.
  • One-click OpenAI-compatible endpoint that works as soon as you top up your account with USD.
  • Extremely fast about adding new frontier open source models. Usually same day.

What we are not

  • SOC 2, GDPR, HIPAA, or ISO-anything.
  • A walled garden that logs everything you send. (We ourselves log only your raw tokens usage count, but we can't speak for every provider we route to.)
  • A forever-stable service—prices, routes, and even upstream providers can change overnight.

How to use us responsibly

  1. 1.Great for hobby projects, MVPs, internal tools, and any workload that can tolerate occasional hiccups.
  2. 2.Not recommended for regulated data or customer-facing production where uptime guarantees matter.
  3. 3.If you need a locked-down provider or full-precision FP16, flip the "pin-provider" header or look elsewhere.

Roadmap (rough order)

  • Reserved-capacity lanes for heavy users (no SLA, just better odds).
  • Batch/async endpoints for even deeper discounts.
  • Maybe—big maybe—an opt-in "enterprise tier" once we can afford audits.

Talk to the humans who run the boxes

Ready to cut your LLM bill?

Get your API key in 60 seconds

Get API Key