the problem

Building AI infrastructure
shouldn't be a nightmare

Manual routing, leaked API keys, no tenant quotas, missing RBAC, raw logs everywhere. Kimss collapses the tangle into one control plane that is secure and multi-tenant from day one.


platform

Why Kimss?

Kimss is a single control plane for model and agent traffic: operators govern workspaces, billing, and audit sinks while developers keep one API key from prototype through production. The same Python SDK and browser SPA you use for integration are backed by FastAPI, PostgreSQL row-level tenant isolation, optional Azure API Management for gateway telemetry, and Azure AI Foundry for model and agent execution. This section summarizes the three primary surface areas (SDK access, unified gateway routing, and enterprise controls) that most architecture reviews ask about first.

sdk

Zero-Friction SDK

The Kimss Python SDK exposes chat and agent calls against Kimss-hosted endpoints so your application does not embed long-lived provider keys for Azure AI Foundry. Kimss performs tenant-aware routing, attaches metering metadata, and enforces subscription and credit state before traffic reaches model workers. Install once, configure a workspace API key, and call models and tools through the same client in CI, local development, and production; gateway policy and optional APIM diagnostics apply uniformly across environments.

import asyncio
from kimss import KimssClient
async def main():  # await is only valid inside a coroutine
    client = KimssClient(api_key="kimss_live_…")
    await client.chat.create(model="meta-llama-3-70b")
asyncio.run(main())

gateway

One Gateway. Two Playgrounds.

Kimss multiplexes fast text-inference workloads and full agentic flows (tools, code interpreter, retrieval) through one authenticated edge. Clients select modality with parameters rather than provisioning separate vendor stacks. Routing targets Azure AI Foundry projects mapped per workspace; the gateway enforces model allow lists, spend limits, and trace identifiers used in usage exports. Switching between chat and agents is a configuration change on the same key, which reduces secret sprawl and keeps compliance boundaries identical for both patterns.

# Same key — swap modality
mode: "chat" | "agents"
route: "kimss-gateway" → foundry
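The modality switch described above could be sketched as a small request validator. This is a minimal illustration under stated assumptions, not the Kimss gateway's code: `GatewayRequest`, `validate_request`, the `ws_demo` workspace, and the sample allow list are hypothetical names chosen for this example.

```python
import uuid
from dataclasses import dataclass, field

# Hypothetical values for illustration; real allow lists are configured per workspace.
ALLOWED_MODES = {"chat", "agents"}
MODEL_ALLOW_LIST = {"ws_demo": {"meta-llama-3-70b"}}


@dataclass
class GatewayRequest:
    workspace_id: str
    mode: str
    model: str
    # One trace identifier per request so logs and usage exports can be correlated.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def validate_request(req: GatewayRequest) -> GatewayRequest:
    """Reject unknown modalities and models outside the workspace allow list."""
    if req.mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {req.mode!r}")
    allowed = MODEL_ALLOW_LIST.get(req.workspace_id, set())
    if req.model not in allowed:
        raise PermissionError(
            f"model {req.model!r} is not allowed for workspace {req.workspace_id!r}"
        )
    return req


if __name__ == "__main__":
    # Same workspace and model; only the mode changes between the two calls.
    chat = validate_request(GatewayRequest("ws_demo", "chat", "meta-llama-3-70b"))
    agents = validate_request(GatewayRequest("ws_demo", "agents", "meta-llama-3-70b"))
    print(chat.mode, agents.mode)
```

Because both modes pass through the same check, the allow list and trace identifier behave identically for chat and agent traffic in this sketch.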

control_plane

Enterprise Control, Already Wired

Workspaces carry billing profiles, Redis-backed credit pools, execution logs for SDK integrations, and optional export of gateway events to Log Analytics for immutable audit trails. Managed Identity is used for service-to-service access to Foundry and supporting Azure resources, reducing static secrets in application configuration. Operators can segment usage by tenant, enforce soft and hard spend caps, and correlate gateway logs with application-level request identifiers without building a separate metering plane.

billing: "per-tenant"
audit_sink: "log_analytics"
identity: "SystemAssignedManagedIdentity"

security

Enterprise standards

Kimss maps product security claims to concrete Azure and data-layer mechanisms: Entra-issued tokens for the SPA and APIs, PostgreSQL schemas with tenant-scoped row access, Redis for credit enforcement, and optional Azure API Management for gateway logs and token metrics. The items below are the three primitives most security questionnaires isolate first (identity to Azure services, isolation at persistence, and prompt lifecycle), so each entry leads with a factual summary before shorthand configuration examples.

azure_security

Azure-Native Security

Application and worker roles use Azure Managed Identity to obtain tokens for Azure AI Foundry, Key Vault references, and related control-plane APIs where configured. This removes static API keys from Kimss service configuration for the data plane path and aligns with zero-trust guidance for cloud-hosted AI gateways. Human access to the product uses Microsoft Entra ID; personal sign-in flows may use Entra External ID (CIAM) when enabled for a deployment.

identity: "SystemAssignedManagedIdentity"
key_vault_refs: "optional"

tenancy

Strict Tenant Isolation

Workspace identifiers and Entra tenant or subject keys partition rows in PostgreSQL so one customer cannot read another’s agents, sessions, or billing artifacts through the API. ORM and SQL access paths are expected to carry tenant context from authenticated principals; pooled individual subscribers can be further partitioned by object id for wallets and telemetry when configured. Isolation is enforced server-side; browser storage holds only session UI state.

tenant_isolation: "enforced"
postgres_rls: "on"
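One way row-level isolation of this kind is commonly set up in PostgreSQL is a policy keyed on a per-transaction setting. The sketch below only builds the SQL; the `agents` table, its text `workspace_id` column, the `app.workspace_id` setting, and the helper name are assumptions for illustration, since the actual Kimss schema is not shown here.

```python
from typing import Tuple

# Hypothetical table and setting names; assumes workspace_id is a text column.
ENABLE_RLS = "ALTER TABLE agents ENABLE ROW LEVEL SECURITY;"

CREATE_POLICY = """
CREATE POLICY tenant_isolation ON agents
    USING (workspace_id = current_setting('app.workspace_id'));
"""


def tenant_context_statement(workspace_id: str) -> Tuple[str, Tuple[str]]:
    """Return a parameterized statement that scopes the current transaction.

    The third argument to set_config (true) makes the value transaction-local,
    so a pooled connection does not carry one tenant's context into the next
    transaction.
    """
    if not workspace_id:
        raise ValueError("workspace_id is required")
    return ("SELECT set_config('app.workspace_id', %s, true);", (workspace_id,))


if __name__ == "__main__":
    sql, params = tenant_context_statement("ws_demo")
    print(sql, params)
```

The returned tuple is meant to be passed to a database driver's execute call at the start of each transaction. Note that PostgreSQL table owners bypass policies unless `FORCE ROW LEVEL SECURITY` is also set on the table.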

data_plane

Zero-Retention Execution

Kimss routes prompts and tool traffic to customer-scoped Foundry projects; it does not maintain a separate long-lived store of end-user prompt bodies for model inference beyond what Azure AI and your workspace configuration retain. Operational logs focus on metadata—tokens, latency, identifiers—suitable for billing and audit, not on retraining Kimss-owned foundation models from customer content. Exact retention for provider-side logs follows Microsoft Azure AI data handling for the regions you select.

prompt_retention: "none"
routing_only: "true"
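A metadata-only log record of the kind described above might look like the following. This is a sketch, not the Kimss log schema: `UsageRecord`, `build_usage_record`, and the field names are hypothetical, and the `usage` dictionary shape is assumed for the example.

```python
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Metadata kept for billing and audit; no prompt or completion text."""
    trace_id: str
    workspace_id: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float


def build_usage_record(trace_id, workspace_id, model, usage, started_at, finished_at):
    """Build a log record from token counts and timing only.

    The prompt body is never passed in, so it cannot end up in the record.
    """
    return UsageRecord(
        trace_id=trace_id,
        workspace_id=workspace_id,
        model=model,
        prompt_tokens=int(usage["prompt_tokens"]),
        completion_tokens=int(usage["completion_tokens"]),
        latency_ms=round((finished_at - started_at) * 1000, 2),
    )


if __name__ == "__main__":
    start = time.monotonic()
    record = build_usage_record(
        "trace-123", "ws_demo", "meta-llama-3-70b",
        {"prompt_tokens": 42, "completion_tokens": 128}, start, time.monotonic(),
    )
    print(asdict(record))
```

Keeping the prompt out of the function signature, rather than filtering it afterwards, is the design choice that makes the record safe to export.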

customer success

Building a sovereign digital workforce on Kimss.

Modern teams want AI that ships work — reviews code, fixes incidents, drafts content, triages support — with governance: one model gateway, auditable actions, and human approval where it matters.

worksfusion runs a production multi-agent fleet headless on Azure: specialized workers orchestrated by Apache Airflow, cognition through KimssClient, and tools through a self-hosted MCP server — not IDE-bound assistants or scattered provider SDK keys.

Sovereign by design: All model and agent calls flow through Kimss workspaces, with no shadow OpenAI or Azure AI clients.

Safe GitOps: Workers open PRs to staging; only the Guardian merges. Production branches are protected at the tool layer.

Workforce that scales: HR hiring packages define new workers as JSON; dynamic workers execute without redeploying per role.

Enterprise-ready ops: Microsoft Entra ID, Key Vault, private PostgreSQL, and human-in-the-loop approval in Slack.

Read the full story

Digital worker fleet

Slack ingress, Kimss cognition, MCP tools, Airflow orchestration.

Production

1. Slack → agent router: One channel delegates natural language to the right digital worker DAG.

2. Workers → Kimss + MCP: Reasoning via KimssClient; Git, GitHub, and telemetry via Entra-authenticated MCP.

3. State + human approval: PostgreSQL holds missions and packages; sensitive actions wait for Slack sign-off.

digital workforce

Meet the AI Digital Team

Specialized agents that orchestrate, scale, secure, measure, and self-heal your Kimss workspace — coordinated through one gateway.

product

Run the platform from one dashboard

Agents, models, cookbook recipes, plans and billing, and usage — governed in a single control plane.

architecture

How Kimss works

Requests originate in browsers or SDK clients, authenticate to Kimss, and traverse an optional Azure API Management layer for policy and telemetry before reaching the FastAPI core. The core resolves workspace and tenant context in PostgreSQL, enforces Redis-backed credit pools, and forwards eligible traffic to the Azure AI Foundry project mapped to that workspace. Responses return through the same path so gateway logs, token metrics, and application audit hooks share identifiers. The stages listed below name each component; this paragraph states the default production topology assumed by the security documentation.
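The ordering of the core's checks can be sketched with in-memory stand-ins. Everything named here is hypothetical: the dictionaries replace PostgreSQL and Redis, and `handle`, `Routed`, `Rejected`, the sample key, and the project name are illustration only, not Kimss internals.

```python
from dataclasses import dataclass

# In-memory stand-ins for PostgreSQL (workspace mapping) and Redis (credit pools).
# All keys, workspace names, and project names are hypothetical.
API_KEYS = {"kimss_test_key": "ws_demo"}
FOUNDRY_PROJECTS = {"ws_demo": "foundry-project-demo"}
CREDITS = {"ws_demo": 100}


class Rejected(Exception):
    """Raised when a request fails before it reaches the model backend."""


@dataclass
class Routed:
    workspace_id: str
    foundry_project: str
    credits_left: int


def handle(api_key: str, estimated_cost: int) -> Routed:
    # 1. Authenticate: map the API key to a workspace.
    workspace_id = API_KEYS.get(api_key)
    if workspace_id is None:
        raise Rejected("unknown API key")
    # 2. Resolve the Foundry project mapped to that workspace.
    project = FOUNDRY_PROJECTS[workspace_id]
    # 3. Enforce the credit pool before any traffic is forwarded.
    if CREDITS[workspace_id] < estimated_cost:
        raise Rejected("insufficient credits")
    CREDITS[workspace_id] -= estimated_cost
    # 4. Forward (represented here by returning the routing decision).
    return Routed(workspace_id, project, CREDITS[workspace_id])
```

In a real deployment the credit check and decrement would need to be atomic in Redis; the sketch only shows that credits are checked before anything is forwarded.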

Pipeline stages: Client Layer (App) → Azure APIM (Gateway) → FastAPI Core (Python App) → Data Layer (PG & Redis) → AI Foundry (Agents)


billing

Transparent multi-tenant billing

Azure Monitor and Log Analytics provide immutable, regulation-ready audit trails at the API gateway. Credit pools enforce spend limits per tenant in real time via Redis.

Live Tenant Usage Tracking

Tenant ID: Acme_Corp
Current token usage: 0 / 5,000,000
Soft limit: 80%
Status: PostgreSQL logged, Redis cache enforced
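The soft-limit figure in this section (80% of a 5,000,000-token quota) suggests a simple threshold check. The sketch below is an assumption about how such a check could work, not Kimss's enforcement code: `CapState`, `cap_state`, and the meaning attached to each state are hypothetical.

```python
from enum import Enum


class CapState(Enum):
    OK = "ok"
    SOFT_LIMIT = "soft_limit"  # assumed behavior: warn, but keep serving traffic
    HARD_LIMIT = "hard_limit"  # assumed behavior: reject further requests


def cap_state(used_tokens: int, quota_tokens: int, soft_percent: int = 80) -> CapState:
    """Classify usage against a quota with a soft threshold and a hard ceiling."""
    if quota_tokens <= 0:
        raise ValueError("quota_tokens must be positive")
    if used_tokens >= quota_tokens:
        return CapState.HARD_LIMIT
    # Integer arithmetic avoids float rounding at the exact threshold.
    if used_tokens * 100 >= soft_percent * quota_tokens:
        return CapState.SOFT_LIMIT
    return CapState.OK


if __name__ == "__main__":
    for used in (0, 4_000_000, 5_000_000):
        print(used, cap_state(used, 5_000_000).value)
```

With the quota shown here, the soft limit is reached at 4,000,000 tokens and the hard limit at 5,000,000.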

compliance

Built for regulated industries

Kimss is architected on Azure-native compliance primitives you can cite in procurement questionnaires. This is our technical design posture — not legal advice; your counsel validates fit for your sector.

EU AI Act — Article 12

Automatic, immutable AI logs

API Management diagnostic settings feed Log Analytics for gateway-level records. Token metrics use azure-openai-emit-token-metric with per-tenant dimensions for cost and governance dashboards.

GDPR — Data residency

Regional AI processing

Tenant slug maps to the correct Azure AI Foundry region via APIM backends and Named Values — no client-supplied region header. Project paths stay under /api/projects/{tenant}/… for a consistent data model.
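The routing rule above (region chosen from the tenant slug, never from a client header) can be illustrated in a few lines. This is a sketch only: in the described design the mapping lives in APIM backends and Named Values, and `TENANT_REGIONS`, `resolve_backend`, the tenant slugs, and the `x-region` header name are hypothetical.

```python
# Hypothetical tenant-to-region table standing in for APIM Named Values.
TENANT_REGIONS = {"acme": "westeurope", "globex": "eastus"}


def resolve_backend(tenant_slug: str, path_suffix: str, headers: dict) -> dict:
    """Pick the region from the tenant slug and ignore any client-sent region."""
    region = TENANT_REGIONS.get(tenant_slug)
    if region is None:
        raise KeyError(f"no region configured for tenant {tenant_slug!r}")
    # The headers argument is deliberately not consulted for region selection.
    suffix = path_suffix.lstrip("/")
    return {
        "region": region,
        "path": f"/api/projects/{tenant_slug}/{suffix}",
    }


if __name__ == "__main__":
    # A client-supplied region header has no effect on the result.
    print(resolve_backend("acme", "/chat", {"x-region": "eastus"}))
```

Because the region is derived server-side, a client cannot move its traffic to another region by editing request headers.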

Zero-trust

Managed identity to models

Gateway backends authenticate to Foundry with Managed Identity — no long-lived API keys in APIM policies for model traffic. Optional SDK-side PII scrubbing before traffic reaches the gateway.

Request our compliance pack

tooling

Dynamic tool registry

Equip your agents with custom, secure functions. Kimss strictly enforces access control, ensuring agents only call registered backend tools.

get_project_quote

Fetches dynamic pricing for clients

{
  "agent_id": "agt_19283",
  "action": "execute_tool",
  "parameters": { "scope": "enterprise" }
}

get_order_status

Queries live DB for logistics tracking

{
  "agent_id": "agt_55421",
  "action": "execute_tool",
  "parameters": { "order_id": "ORD-882" }
}

fetch_audit_logs

Admin tool for compliance reporting

{
  "role": "admin",
  "action": "telemetry_recent",
  "parameters": { "limit": 100 }
}
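A registry that only lets agents call registered tools could be sketched as follows. This is a minimal illustration, not the Kimss registry: `ToolRegistry` and its methods are hypothetical, per-agent grants are an assumption beyond what the text states, and the tool body returns a placeholder instead of querying a database.

```python
from typing import Any, Callable, Dict, Iterable, Set


class ToolRegistry:
    """Maps tool names to callables and to the agents allowed to call them."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}
        self._grants: Dict[str, Set[str]] = {}

    def register(self, name: str, func: Callable[..., Any],
                 allowed_agents: Iterable[str]) -> None:
        self._tools[name] = func
        self._grants[name] = set(allowed_agents)

    def execute(self, agent_id: str, name: str, parameters: Dict[str, Any]) -> Any:
        # Unregistered tools are rejected before any access check.
        if name not in self._tools:
            raise LookupError(f"tool {name!r} is not registered")
        if agent_id not in self._grants[name]:
            raise PermissionError(f"agent {agent_id!r} may not call {name!r}")
        return self._tools[name](**parameters)


def get_order_status(order_id: str) -> dict:
    # Placeholder result; a real tool would query the logistics database.
    return {"order_id": order_id, "status": "unknown"}


registry = ToolRegistry()
registry.register("get_order_status", get_order_status, {"agt_55421"})

if __name__ == "__main__":
    print(registry.execute("agt_55421", "get_order_status", {"order_id": "ORD-882"}))
```

The call mirrors the `get_order_status` payload shown above: the agent identifier, the tool name, and the parameters object are the only inputs the registry needs.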

Ship agents your auditors approve.

One Kimss key from your IDE to production — tenant isolation, audit-ready telemetry, and Managed Identity included from the first call.


Powered by secure Azure infrastructure.

Every agent.
One secure plane.


Your AI agent fleet

Developer cookbook

Centralized agent management

Model catalog and routing

Plans, pools, and limits

Granular usage reporting
