Can I self-host for free?
Yes. The full source code is available on GitHub under the Apache 2.0 license. You can deploy it anywhere (your own servers, Docker, Kubernetes, or any cloud provider) at no cost, and the license permits commercial use.
Does the Professional plan include a HIPAA BAA?
The Professional plan does not include a HIPAA BAA by default. You need the Enterprise plan or the Healthcare Compliance Add-On for a signed BAA. If you are a covered entity handling PHI, contact us to discuss your specific requirements.
What counts as a "request"?
One request is one call to the gateway's /v1/chat/completions endpoint. Streaming responses count as one request. Cache hits count as one request (but save the LLM cost entirely). There is no per-token billing from us — you pay your LLM provider directly.
Can I switch from self-hosted to managed without downtime?
Yes. Because the gateway is a drop-in proxy, you simply point your base_url to the new managed endpoint. All your API keys, configuration, and provider integrations migrate with a config file import — no code changes in your applications.
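As a rough sketch of what "point your base_url" can look like in application code, the example below reads the gateway address from an environment variable, so moving from self-hosted to managed becomes a configuration change. The variable name GATEWAY_BASE_URL, the default address, the Bearer authorization header, and the helper names are assumptions made for illustration; only the /v1/chat/completions path comes from this FAQ.

```python
import json
import os
import urllib.request

# Hypothetical default: wherever your self-hosted gateway happens to listen.
DEFAULT_BASE_URL = "http://localhost:8080"


def gateway_base_url():
    """Read the gateway address from the environment (variable name is an assumption)."""
    return os.environ.get("GATEWAY_BASE_URL", DEFAULT_BASE_URL)


def build_chat_request(base_url, api_key, model, messages):
    """Build (but do not send) a POST to the gateway's chat completions endpoint."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Bearer-style auth is assumed here; check your gateway's configuration.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

With this shape, switching endpoints means setting GATEWAY_BASE_URL to the managed address and restarting the application; the request-building code stays the same.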
Do you store my LLM responses?
In the managed plans, audit logs are stored (encrypted at rest) to provide the immutable trail needed for compliance. The semantic cache stores a hash of the request and the response — never the raw PHI. You can disable caching for PHI traffic in your config. On self-hosted, you control all storage.
What LLM providers do you support?
20+ providers including: OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Gemini (Vertex AI), Groq, Mistral, DeepSeek, Cohere, Fireworks AI, Together AI, HuggingFace, Ollama (local/on-prem), Perplexity, Replicate, xAI (Grok), AI21, and more. New providers are added regularly — open a GitHub issue to request one.
Does it support Microsoft Entra ID / Azure AD?
Yes — native Entra ID OIDC is fully implemented. The gateway validates RS256 and ES256 JWTs against Microsoft's JWKS endpoint with full claim verification (issuer, audience, expiration, tenant). App Role enforcement and multi-tenant support are included. Three auth modes are available: api_key (default), entra_id (Entra only), and any (accept both — useful for zero-downtime migration from API keys to SSO). Available in all editions including Open Source.
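To make the three auth modes and the claim checks concrete, here is a minimal sketch and not the gateway's actual implementation. The function names are hypothetical. It assumes the JWT signature has already been verified against the JWKS keys, and that tokens use the Entra ID v2.0 issuer format with the standard iss, aud, exp, and tid claims.

```python
import time


def accepted_credentials(auth_mode):
    """Return the credential kinds a given auth mode accepts (illustrative only)."""
    if auth_mode == "api_key":
        return {"api_key"}
    if auth_mode == "entra_id":
        return {"jwt"}
    if auth_mode == "any":
        # Accept both, e.g. while migrating clients from API keys to SSO.
        return {"api_key", "jwt"}
    raise ValueError(f"unknown auth mode: {auth_mode!r}")


def verify_claims(claims, *, tenant_id, audience, now=None):
    """Check issuer, audience, expiration, and tenant on already-decoded claims.

    Signature verification (RS256/ES256 against the JWKS keys) is assumed to
    have happened before this function is called and is not shown here.
    """
    now = time.time() if now is None else now
    # Assumes v2.0 tokens; v1.0 tokens use a different issuer format.
    expected_issuer = f"https://login.microsoftonline.com/{tenant_id}/v2.0"
    if claims.get("iss") != expected_issuer:
        return False
    if claims.get("aud") != audience:
        return False
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return False
    if claims.get("tid") != tenant_id:
        return False
    return True
```

A multi-tenant deployment would need to compare the issuer and tid against a list of allowed tenants instead of a single value; that variation is left out to keep the sketch short.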
Is there a self-hosted admin portal?
Yes. The Enterprise On-Premises Edition bundles the full governance portal (cost dashboards, request logging, governance rules, A/B experiments, team management) together with the gateway engine. It's a separate open-source repository designed for regulated industries that need everything self-hosted: github.com/aaditya-b/ai-control-plane-enterprise. Deploy with docker compose up in under 10 minutes.