Apache 2.0 · Self-hostable · OpenAI-compatible · Entra ID SSO

One Gateway.
Complete AI Control.

Drop-in proxy that sits between your applications and every LLM provider. HIPAA PHI redaction, prompt injection blocking, cost optimization, and full agentic governance — zero code changes required.

Live Demo View on GitHub
One line change. All 13 pipeline stages active.
# Before
base_url = "https://api.openai.com/v1"

# After — full governance, zero other changes
base_url = "https://gateway.aicontrolplanegateway.com/v1"
20+LLM Providers
13Pipeline Stages
15PHI Entity Types
40%Cost Reduction

The 13-Stage Pipeline

Every AI request passes through four governance layers before reaching any LLM.

INGRESS
01Auth & API Keys
02Rate Limiting
03Policy Engine
COMPLIANCE
04Budget Enforcement
05Guardrails
06PHI/PII Redaction
07A/B Testing
OPTIMIZATION
08Prompt Compression
09Semantic Cache
EXECUTION & AUDIT
10Intelligent Routing
11LLM Execution
12Output Guardrails
13Immutable Audit

Everything You Need. Nothing You Don't.

Built for enterprises that need governance, not just a proxy.

🔒

HIPAA PHI Redaction

15 entity types including MRN, NPI, ICD-10, CPT codes, drug names, and provider identities. Complete round-trip: redact before LLM, restore in response. Zero PHI exposure.

Healthcare-Grade
INPUT Patient John Smith, MRN 123456789, prescribed Lisinopril by Dr. Sarah Johnson NPI 1234567890
TO LLM Patient [NAME_1], MRN [MRN_1], prescribed [DRUG_1] by [PROVIDER_1] NPI [NPI_1]
🛡️

Prompt Injection Blocking

35+ patterns covering jailbreaks, token smuggling, indirect injection, DAN attacks, role reassignment, and system prompt extraction attempts.

Security
🌐

20+ LLM Providers

OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Gemini, Groq, Mistral, DeepSeek, Cohere, Fireworks, Together AI, Ollama, xAI (Grok), Perplexity, Replicate, AI21, HuggingFace, and more. Switch providers with zero code changes.

Multi-Provider

Semantic Caching

Content-aware cache with LRU eviction and configurable TTL. Semantically similar queries hit cache — 40%+ cost reduction on repetitive workloads.

Cost Optimization
📊

Per-Team Budget Caps

Daily and monthly spend limits per user and team. Real-time budget enforcement with configurable alert thresholds before limits are hit.

FinOps
🤖

Agentic Session Governance

Sub-agent depth limiting, per-run cost caps, infinite loop detection via fingerprinting, and consistent PHI redaction across all 80 turns of an agent session.

AI Agents
📡

Anomaly Detection & Observability

Prometheus metrics, OpenTelemetry traces, and immutable audit logs. EMA-based anomaly detection fires on rpm_spike, spend_spike, and error_rate. Proactive quota warnings at 80%, 90%, and 95% — delivered via Slack Block Kit, MS Teams, PagerDuty Events API, or SMTP email.

Observability
🔀

6 Routing Strategies

Round-robin, weighted, least-latency, cost-optimized, quality-scored, and fallback. Circuit breaker per provider. Data residency enforcement for PHI traffic.

Reliability
🗜️

Prompt Compression

Whitespace trimming, deduplication, conversation history trimming, and filler removal. Reduce token costs by 15-30% without losing semantic content.

Cost Optimization
🏢

Azure Entra ID SSO

Authenticate enterprise users via Microsoft Azure Entra ID. RS256/ES256 JWT validation against Microsoft's JWKS endpoint, App Role enforcement, multi-tenant support, and brute-force lockout — no separate API keys required for org members.

Enterprise Auth
🧪

A/B Model Experiments

Run statistically rigorous experiments across LLM models. Deterministic SHA-256 group assignment ensures consistent user experience. Track per-group latency, cost, and quality metrics in real time — built into Stage 7 of the pipeline.

Experimentation

System Architecture

Designed to run inside your perimeter. Your data never leaves your infrastructure.

Client Applications
📱 OpenAI SDK
🌐 REST / HTTP
🤖 MCP Agents
🖥️ Admin Console
⚙️ CI/CD Pipeline
Any OpenAI-compatible SDK
AI Control Plane Gateway
Auth · Rate Limit · Policy
Budget · Guardrails · PHI/PII · A/B
Compression · Semantic Cache
Routing · LLM Call · Output · Audit
Standard HTTPS
LLM Providers
OpenAI
Anthropic
Azure
Bedrock
Google
Groq
Mistral
DeepSeek
+ 9 more
Infrastructure
📈 Prometheus
🔍 OpenTelemetry
📝 Audit Log
💾 Semantic Cache
💰 Budget Store
🔔 Alerting

Built for Every Industry

🏥

Healthcare

HIPAA-compliant AI workflows with PHI redaction across all 15 entity types. Output guardrails scan LLM responses for PHI leakage and policy violations before they reach your application. Immutable audit trail for compliance.

  • PHI never reaches the LLM
  • Output content guardrails
  • Data residency enforcement
  • HIPAA audit trail
🏦

Financial Services

PII redaction for customer data, budget controls per team and product line, policy engine to block regulated content, and full audit trail for SOC 2 and regulatory compliance.

  • PII/PCI data redaction
  • Per-product budget caps
  • Regulatory policy enforcement
  • Audit logs for SOC 2
⚖️

Legal & Professional Services

Prevent confidential client data from reaching public LLM APIs. Route sensitive matters to on-premise models while using cloud LLMs for non-sensitive work.

  • Data classification routing
  • On-premise model support
  • Client confidentiality
  • Matter-level budget caps
🔬

Research & Education

Cost controls that prevent budget overruns from students or researchers. A/B testing to compare model quality. Semantic caching to reduce API costs on repetitive queries.

  • Per-user budget enforcement
  • Model A/B testing
  • 40%+ cost reduction via cache
  • Multi-provider comparison
NEW

Enterprise On-Premises Edition

Full governance portal, Entra ID SSO, Docker Compose & Kubernetes — one open-source package. For regulated industries that cannot use cloud SaaS.

🖥️
Full Admin Portal Included

Cost dashboards, request logging, governance rules, alert management, team controls, and A/B experiments — all self-hosted, zero cloud dependency.

🔑
Entra ID OIDC on Day 1

Authenticate your entire organization via Azure Active Directory. Existing Entra ID groups and App Roles control access — no separate user management required.

📜
Apache 2.0 — Own the Code, Own the Data

No vendor lock-in, no per-seat pricing, no data egress fees. Deploy on your own infrastructure: Docker Compose for single-node or Kubernetes for scale.

⬡ AI Control Plane — Enterprise
Dashboard Costs Governance Experiments Team
$2,847Monthly Spend
98.7%Cache Hit Rate
143msP95 Latency
🔐 Signed in via Entra ID
✓ Budget: 67% of monthly limit
✓ No anomalies detected
ℹ 3 active A/B experiments

See It Live

Watch a request flow through all 13 pipeline stages in real time. PHI gets redacted at Stage 6, restored in the response. The LLM never sees real patient data.

Launch Interactive Demo →
AI Control Plane — Live Demo
✓ Auth
✓ Rate Limit
✓ Policy
✓ Budget
⟳ PHI Redaction
— Routing
— LLM Call
— Audit

Simple, Transparent Pricing

Start free with our open-source edition. Scale to enterprise when you're ready.

Open Source
Free forever
Self-hosted, Apache 2.0 license. Full feature set.
  • ✓ All 13 pipeline stages
  • ✓ 20+ LLM providers
  • ✓ PHI/PII redaction
  • ✓ Azure Entra ID SSO
  • ✓ Agentic governance
  • ✓ Full source code
Get on GitHub
Enterprise
Custom
Unlimited scale, HIPAA BAA, dedicated support.
  • ✓ Unlimited requests
  • ✓ HIPAA BAA included
  • ✓ On-premise deployment
  • ✓ SLA 99.9% uptime
  • ✓ Dedicated engineer
Contact Sales