Starling EX

One DevOps platform for your Kubernetes cluster — explained in plain terms.

Part 1

What is it, really?

The problem

Running apps takes a pile of tools

To run software for a team you normally bolt together five separate products: a login system, an alarm system, a way to run code, a record of who did what, and ops tooling.

The idea

Starling EX is all five, in one install

install.sh

curl -sSL https://starling-ex.frakma.io/install.sh | bash

In one sentence

Think of it like a "home router" for your cluster

A home router bundles wifi, firewall, and DHCP into one box you plug in once. Starling EX bundles the team-software essentials into one thing you install once.

What you get

Five things, one install

🔐 Single sign-on

Everyone logs in with their existing Google or GitHub account. No new passwords.

⚡ Run code on demand

Describe a job in a file; the cluster runs it. "Serverless," git-friendly.

📜 Audit log

Every action recorded: who, when, what, how long, outcome.

🚨 Alerts

Problems route to Slack & PagerDuty automatically.

🛠 Ops tooling

Your AI assistant can safely inspect what's running.

🧾 One dashboard

See it all in a read-only web console behind sign-in.

Part 2

How it actually works

The shape of it

install.sh → a Helm chart → a few pods

install.sh: checks kubectl + helm, takes your license key, installs the chart

↓

Helm chart: oci://ghcr.io/…/starling-ex (Deployments, Service, Ingress, RBAC)

↓

Pods: Operator · Admin UI · Dex SSO

Everything lands in the starling-ex namespace on Kubernetes 1.27+.
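
The first stage of that flow, the tool check, could look roughly like the sketch below. This is not the real install.sh: the helper name missing_tools is invented for illustration, and it checks only the two tools named above.

```shell
# Hypothetical preflight helper; not taken from the real install.sh.
# Prints each named tool that is not on PATH; returns non-zero if any is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage: report what is missing before attempting an install.
if missing="$(missing_tools kubectl helm)"; then
  echo "preflight ok: kubectl and helm found"
else
  echo "preflight failed, missing:" $missing
fi
```

Checking up front means a missing tool is reported by name instead of surfacing as a confusing error halfway through the install.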

The engine

You declare what you want — Starling makes it happen

Describe a job in a small file: which image to run and how it's triggered. Apply it, and Starling keeps the cluster matching that description — you never run the steps by hand.

function.yaml

cat <<'EOF' | kubectl apply -f -
apiVersion: starling.frakma.io/v1
kind: StarlingFunction
metadata:
  name: resize-images
spec:
  image: ghcr.io/acme/resize:1.4
  trigger:
    http:
      path: /resize
EOF

Now POST to /resize and your image runs. Edit the file, re-apply — Starling reconciles the difference.
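
Because the description is just text, it can be generated. The sketch below is one way to stamp out manifests for several functions; render_function is a hypothetical helper, and it emits only the fields that appear in function.yaml above.

```shell
# Hypothetical helper: prints a StarlingFunction manifest using only the
# fields shown in function.yaml above (name, image, HTTP trigger path).
render_function() {
  fn_name=$1
  fn_image=$2
  fn_route=$3
  cat <<EOF
apiVersion: starling.frakma.io/v1
kind: StarlingFunction
metadata:
  name: $fn_name
spec:
  image: $fn_image
  trigger:
    http:
      path: $fn_route
EOF
}

# Preview the manifest; pipe the output to `kubectl apply -f -` to apply it.
render_function resize-images ghcr.io/acme/resize:1.4 /resize
```

Re-running it with a new image tag and piping the result to kubectl apply is the "edit, re-apply" step in script form.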

Make it yours

Customize Starling to your requirements

Everything is declarative CRDs and Helm values — no forking. Tune sign-in, scaling, triggers, alerts and tooling to fit your stack.

Helm values

Override replicas, resources, ingress host, image tags, namespace via values.yaml or --set.

Auth connectors

Plug Dex into Google, GitHub, OIDC, LDAP or SAML — bring your own identity provider.

Function triggers

HTTP, cron, or event triggers per StarlingFunction; set autoscaling and concurrency.

Alert routing & tools

Define your own AlertRule / AlertChannel and register custom MCP tools for your agents.

helm upgrade

helm upgrade starling-ex oci://ghcr.io/.../starling-ex \
  --set replicas=3 \
  --set ingress.host=starling.acme.dev \
  --set 'dex.connectors[0].type=oidc'
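
The same overrides can live in a values file, which is easier to review and keep in git. A minimal sketch, assuming only the three keys used in the command above; nothing else about the chart's schema is implied.

```shell
# Write the three overrides from the --set flags above into a values file.
# Keys are copied from that command; no other chart keys are assumed.
cat > values.yaml <<'EOF'
replicas: 3
ingress:
  host: starling.acme.dev
dex:
  connectors:
    - type: oidc
EOF

# Then pass the file instead of repeating --set flags
# (chart reference left elided, as in the original):
#   helm upgrade starling-ex oci://ghcr.io/.../starling-ex -f values.yaml
```

The list entry under connectors is the file form of the [0] index in the --set flag.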

Know what's happening

Observability: three pillars

📊 Metrics & observability

The umbrella: can you tell what the system is doing, and why, from the outside, without shipping new code?

🧵 Distributed tracing

Follow one request as it hops across services — see where time is spent and what failed in the chain.

📝 Logging

Time-stamped records of discrete events — the detail you read when something breaks.

How Starling EX solves it

Built in, not bolted on

Metrics

Operator & functions expose Prometheus endpoints out of the box — scrape into Grafana, no extra agents.

Tracing

Requests carry OpenTelemetry trace context end-to-end (gateway → function → tool call) to any OTLP backend.

Logging + audit

Structured JSON logs, plus the Firestore audit log: every tool call with key, cost, latency & outcome — queryable.
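
To make "trace context" concrete: in the W3C Trace Context format that OpenTelemetry uses by default, the context travels as a traceparent HTTP header. The sketch below builds one. Whether the Starling gateway continues a trace started by the caller is an assumption here, and the host in the commented curl line is a placeholder.

```shell
# Build a W3C Trace Context `traceparent` header value:
#   version(00)-trace-id(32 hex)-parent-id(16 hex)-flags(01 = sampled)
rand_hex() {
  # $1 = number of random bytes; prints 2*$1 lowercase hex characters
  od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n'
}

make_traceparent() {
  printf '00-%s-%s-01\n' "$(rand_hex 16)" "$(rand_hex 8)"
}

traceparent="$(make_traceparent)"
echo "traceparent: $traceparent"

# Hypothetical call: the host is a placeholder, and whether the gateway
# continues an incoming trace is an assumption.
#   curl -X POST -H "traceparent: $traceparent" https://<your-host>/resize
```

Every hop that forwards this header with the same trace-id lets an OTLP backend stitch the hops into one trace.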

Commercial Tiers

Enterprise & Agent-Ready Features

Firestore Audit Log

Every API and CLI tool call logged: user, tool, cost, latency, outcome. Queryable and SIEM-exportable.

Helm-aware MCP Tools

AI-agent accessible tools: list_helm_releases, describe_helm_release, and helm_release_history.

Slack & PagerDuty

Declarative CRDs route critical cluster and compliance events to Slack channels and PagerDuty services.

Per-key Rate Limits

Redis-backed rate limiting per scope bounds the blast radius of any leaked keys or runaway AI agents.
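
How a per-key limit bounds the damage is easiest to see in a toy version. The sketch below is a fixed-window counter that keeps state in temp files so it runs anywhere. It is not Starling's implementation: the product is described as Redis-backed, its algorithm is not stated, and the allow function and its arguments are invented for illustration.

```shell
# Fixed-window limiter sketch. State is kept in files so the example runs
# anywhere; the product is described as Redis-backed, which this is not.
RL_DIR="$(mktemp -d)"

# allow KEY LIMIT WINDOW_SECONDS [NOW]
# Returns 0 if the call is allowed, 1 once KEY has used LIMIT calls in the
# current window. NOW (epoch seconds) is optional and exists for testing.
# KEY must be filename-safe (no slashes). Not safe for concurrent callers.
allow() {
  rl_key=$1
  rl_limit=$2
  rl_window=$3
  rl_now=${4:-$(date +%s)}
  rl_file="$RL_DIR/$rl_key.$((rl_now / rl_window))"
  rl_count=0
  if [ -f "$rl_file" ]; then
    rl_count=$(cat "$rl_file")
  fi
  if [ "$rl_count" -ge "$rl_limit" ]; then
    return 1
  fi
  echo $((rl_count + 1)) > "$rl_file"
}

# Usage: a runaway caller is cut off after 3 calls in the same minute.
for i in 1 2 3 4 5; do
  if allow agent-key 3 60 1000; then
    echo "call $i: allowed"
  else
    echo "call $i: throttled"
  fi
done
```

With the fixed timestamp, the loop prints three allowed calls followed by two throttled ones. A different key gets its own counter, so one leaked key cannot exhaust another key's budget.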

AI-Native Stack

Built for SRE Agents

Traditional DevOps stacks are designed for humans. Starling EX includes the structured rails, tool registries, and safeguards that autonomous AI SREs require to operate safely.

🤖 Local MCP Server

Exposes cluster state to LLMs using standard Model Context Protocol.

🛡️ Rate Limiting

Throttles AI actions to prevent loops or API abuse.

📝 Agent Audits

Trace agent tool calls with precise cost/latency analysis.

Market Comparison

The Open-Source Advantage

Feature	Starling EX (Self-Hosted)	Proprietary SaaS Alternatives
Login SSO	Dex OIDC (Free/Self-hosted)	Auth0 / Okta (high per-seat costs)
Serverless	StarlingFunction (No cold starts/egress)	AWS Lambda / Vercel (egress lock-in)
Observability	Prometheus + OTel (zero ingestion fee)	Datadog / NewRelic (heavy volume charges)
Lock-in	None (Apache-2.0 base)	Full vendor dependency

Plans & Pricing

Sovereign Plans

Community

Free

Up to 3 users, Dex SSO, Function CRDs, email alerts, community support.

Startup

Up to 10 users, Slack & PagerDuty alerts, per-key rate limits, basic audit log.

Growth

Up to 50 users, full Helm-aware MCP tools, Firestore audit log, OTel tracing.

Enterprise plan: Unlimited seats, SAML, SIEM export, 99.9% SLA (Talk to us).

How gating works

Offline key validation

The operator and Helm charts are Apache-2.0 and run in Community mode by default. To activate Startup, Growth, or Enterprise features, you only need a JWT license key, which can be verified offline.
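
A JWT is three base64url segments separated by dots (header, payload, signature), so the claims inside a key can be inspected without any network call. The sketch below decodes the payload only. It does not check the signature, which is the part real offline verification depends on, and the "tier" claim in the demo token is made up. It assumes a base64 command that accepts -d (GNU coreutils and recent macOS).

```shell
# Print the claims (payload) of a JWT. This only decodes; it does NOT check
# the signature, which is what real offline verification depends on.
jwt_payload() {
  # Take the middle segment and map base64url characters to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that JWTs strip.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Demo with a made-up, unsigned token; the "tier" claim is hypothetical.
claims='{"tier":"growth"}'
body=$(printf '%s' "$claims" | base64 | tr '/+' '_-' | tr -d '=\n')
jwt_payload "header.$body.signature"
echo
```

Offline verification adds one more step on top of this: checking the signature segment against a public key that ships with the software, so no license server needs to be reachable.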

Try it

Three steps

kubectl & installer

# 1 — pick a cluster
kubectl config use-context my-cluster

# 2 — run the installer (or `helm install` directly)
curl -sSL https://starling-ex.frakma.io/install.sh | bash

# 3 — sign in through Dex
open https://starling-ex.your-cluster.example/  # macOS; on Linux use xdg-open

Starling EX

One platform. One command. Your cluster.