Architecture¶
Source Layout¶
skipper/
├── Dockerfile                          GraalVM multi-stage build (native binary → debian:12-slim)
├── pom.xml
├── helm/naftiko-skipper/               Helm chart
│   ├── crds/                           CRD YAML installed by Helm
│   └── templates/                      Operator Deployment, RBAC, ServiceAccount
├── config/
│   ├── crds/                           CRD definitions, managed by ArgoCD (wave -1)
│   │   ├── application.yaml
│   │   └── manifests/
│   │       ├── naftiko-capability-crd.yaml
│   │       └── capability-class-crd.yaml
│   ├── defaults/                       Default CapabilityClass instances, managed by ArgoCD (wave 0)
│   │   ├── application.yaml
│   │   └── manifests/
│   │       ├── capability-class-standard.yaml
│   │       ├── capability-class-premium.yaml
│   │       └── capability-class-dev.yaml
│   ├── operator/                       Operator ArgoCD Application (wave 1)
│   │   └── application.yaml
│   ├── capabilities/                   ApplicationSet template for user capabilities
│   │   └── applicationset.template.yaml
│   └── samples/                        Sample Capability CRs for testing and CI
└── src/main/java/io/naftiko/operator/
    ├── NaftikoOperator.java            Main entry point
    ├── CapabilityReconciler.java       Core reconciliation logic
    └── crd/
        ├── CapabilityResource.java
        ├── CapabilitySpec.java
        ├── CapabilityStatus.java
        ├── CapabilityClassResource.java
        └── CapabilityClassSpec.java
Reconciliation Flow¶

Core Components¶
Capability CRD¶
The Capability resource is the primary user-facing API.
Users submit a capability specification through:
- a referenced ConfigMap (recommended — specRef pattern)
- labels describing tier/domain metadata
- exposed REST, MCP, skill, or control endpoints
Example:
apiVersion: naftiko.io/v1alpha3
kind: Capability
metadata:
  name: hello-world
spec:
  specRef:
    configMap: hello-world-spec
CapabilityClass¶
CapabilityClass defines operational defaults per resource tier:
- CPU requests/limits
- memory requests/limits
- HPA autoscaling configuration
- Resilience4j defaults (circuit breaker, retry, bulkhead, rate limiter)
Selection is driven by:
Three tiers ship by default: standard, premium, dev.
If no matching class is found, the operator falls back to built-in standard defaults.
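The fallback rule can be sketched in plain Java. This is a minimal illustration, not the operator's code: it assumes the tier name has already been determined, and the class `ClassResolutionSketch`, the setting keys, and the placeholder values are all hypothetical.

```java
import java.util.Map;

/** Hypothetical sketch of tier lookup with a built-in fallback. */
class ClassResolutionSketch {

    /** Placeholder built-in defaults; the keys and values are assumptions. */
    static final Map<String, String> BUILT_IN_STANDARD =
            Map.of("cpuRequest", "100m", "memoryRequest", "128Mi");

    /** Returns the settings registered for the tier, or the built-in standard defaults. */
    static Map<String, String> resolve(Map<String, Map<String, String>> classes, String tier) {
        Map<String, String> match = (tier == null) ? null : classes.get(tier);
        return (match != null) ? match : BUILT_IN_STANDARD;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> classes = Map.of(
                "premium", Map.of("cpuRequest", "500m", "memoryRequest", "512Mi"));
        System.out.println(resolve(classes, "premium"));
        System.out.println(resolve(classes, "gold")); // unknown tier: falls back
    }
}
```

Returning a shared default object instead of failing keeps a Capability deployable even when its tier is misspelled or its class has not been installed yet.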
CapabilityReconciler¶
The reconciler is the core controller loop.
Responsibilities:
- watch Capability resources
- resolve the CapabilityClass
- resolve bind secrets and import mounts
- generate all child resources
- maintain drift correction
- patch status
Reconciliation flow:
Generated Resources¶
For each Capability CR, Skipper creates and continuously reconciles:
ConfigMap¶
Stores the full capability specification verbatim at key capability.yaml.
Mounted into the engine pod at /data/capability.yaml using subPath so that
import file mounts under /data/ can coexist without conflict.
Deployment¶
Skipper generates a Deployment running:
The Deployment:
- mounts the spec ConfigMap at /data/capability.yaml
- mounts bind secrets at the declared file:// path (e.g. /app/shared/)
- mounts import ConfigMaps as individual files under /data/shared/
- injects OpenTelemetry env vars (OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT, etc.)
- exposes one named container port per expose entry (mcp, rest, skill, control)
- adds Prometheus scrape annotations when a type: control expose is declared
Service¶
A ClusterIP Service is created with one named port per expose entry:
ports:
- name: mcp # port 3001
- name: rest # port 3002
- name: skill # port 3003
- name: control # port 9090
Ingress¶
Ingress creation is conditional. Generated only when:
ServiceMonitor¶
A Prometheus Operator ServiceMonitor is created automatically when a
type: control expose is declared. It targets the control named port at
/metrics with a 15-second scrape interval.
If the Prometheus Operator CRDs are not installed, the operator logs a warning
and skips the ServiceMonitor. Scraping still works through the Prometheus pod
annotations (prometheus.io/scrape, prometheus.io/port, prometheus.io/path),
which are written on the pod template whether or not the ServiceMonitor could
be created.
Runtime Flow¶
User applies Capability CR
        │
        ▼
Operator receives event
        │
        ▼
Reconciler resolves spec
    ├─ reads specRef ConfigMap (verbatim, no re-serialization)
    ├─ resolves CapabilityClass tier
    ├─ resolves bind secrets → /app/shared/secrets.yaml
    └─ resolves import mounts → /data/shared/*.yaml
        │
        ▼
Reconciler generates child resources
    ├─ create/update ConfigMap
    ├─ create/update Deployment (all ports + OTEL env vars)
    ├─ create/update Service (all named ports)
    ├─ reconcile Ingress (only if "public" tag)
    └─ reconcile ServiceMonitor (only if control port)
        │
        ▼
Engine pod starts
    ├─ reads /data/capability.yaml
    ├─ loads imports from /data/shared/
    ├─ resolves secrets from /app/shared/
    └─ starts adapters on declared ports
        │
        ▼
Capability endpoints become available
Multi-Port Support¶
A capability can expose multiple adapters simultaneously, each on its own port.
Skipper generates a named Service port and container port for every entry in exposes[].
capability:
  exposes:
    - type: mcp       # → Service port "mcp":3001, container port 3001
      port: 3001
    - type: rest      # → Service port "rest":3002, container port 3002
      port: 3002
    - type: skill     # → Service port "skill":3003, container port 3003
      port: 3003
    - type: control   # → Service port "control":9090, container port 9090
      port: 9090      # + ServiceMonitor + pod annotations
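The mapping from expose entries to named ports can be sketched as follows. This is an illustration only: `NamedPortsSketch` and its methods are hypothetical, and rejecting duplicate expose types is an assumption (Kubernetes requires unique port names within a Service, but how Skipper handles duplicates is not described here).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one named port per expose entry. */
class NamedPortsSketch {

    /** Maps each expose entry (type, port) to a named port, keeping declaration order. */
    static Map<String, Integer> namedPorts(List<Map.Entry<String, Integer>> exposes) {
        Map<String, Integer> ports = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> expose : exposes) {
            // Assumption: a repeated type is treated as an error.
            if (ports.putIfAbsent(expose.getKey(), expose.getValue()) != null) {
                throw new IllegalArgumentException("duplicate expose type: " + expose.getKey());
            }
        }
        return ports;
    }

    /** A ServiceMonitor is only relevant when a control port is declared. */
    static boolean needsServiceMonitor(Map<String, Integer> ports) {
        return ports.containsKey("control");
    }

    public static void main(String[] args) {
        Map<String, Integer> ports = namedPorts(List.of(
                Map.entry("mcp", 3001), Map.entry("rest", 3002),
                Map.entry("skill", 3003), Map.entry("control", 9090)));
        System.out.println(ports + " serviceMonitor=" + needsServiceMonitor(ports));
    }
}
```

The same name-to-port map can feed both the Service ports and the container ports, which keeps the two in step.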
Bind Secret Convention¶
Multiple bind namespaces that share the same location path are backed by
one Kubernetes Secret containing all keys combined.
Secret naming: {capability-name}-bind-{parent-directory}
This avoids projected-volume conflicts when two namespaces point to the same file: Kubernetes projected volumes cannot merge two Secrets that share a key without one silently overwriting the other.
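A minimal sketch of the naming and grouping rule, under stated assumptions: `{parent-directory}` is taken to be the last segment of the file's parent directory (so `file:///app/shared/secrets.yaml` yields `shared`), and the class `BindSecretSketch`, its methods, and the `github`/`slack` namespaces are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the bind Secret naming convention. */
class BindSecretSketch {

    /** Builds {capability-name}-bind-{parent-directory} from a file:// location. */
    static String secretName(String capability, String location) {
        String path = URI.create(location).getPath();               // /app/shared/secrets.yaml
        String parent = path.substring(0, path.lastIndexOf('/'));   // /app/shared
        String dir = parent.substring(parent.lastIndexOf('/') + 1); // shared (assumed rule)
        return capability + "-bind-" + dir;
    }

    /** Groups bind namespaces (namespace -> location) by the Secret that backs them. */
    static Map<String, List<String>> groupBySecret(String capability, Map<String, String> binds) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        binds.forEach((namespace, location) ->
                grouped.computeIfAbsent(secretName(capability, location), k -> new ArrayList<>())
                       .add(namespace));
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, String> binds = new LinkedHashMap<>();
        binds.put("github", "file:///app/shared/secrets.yaml");
        binds.put("slack", "file:///app/shared/secrets.yaml");
        System.out.println(groupBySecret("hello-world", binds));
    }
}
```

Both namespaces land in a single Secret, so only one volume is mounted at the shared path.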
Import Consumes Convention¶
capability.consumes entries using the import + location pattern are
backed by a dedicated ConfigMap per import.
ConfigMap naming: {capability-name}-import-{import-namespace}
For example, an import namespace named registry expects a ConfigMap {capability-name}-import-registry with key step7-registry-consumes.yml.
Each import file is mounted individually via subPath so that multiple
imports sharing the same parent directory do not conflict.
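The naming and per-file mount can be sketched like this. The class `ImportMountSketch` and the returned map layout are hypothetical; the `/data/shared/` directory and the `subPath` approach come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the import ConfigMap naming and subPath mount. */
class ImportMountSketch {

    static final String MOUNT_DIR = "/data/shared/";

    /** Builds {capability-name}-import-{import-namespace}. */
    static String configMapName(String capability, String importNamespace) {
        return capability + "-import-" + importNamespace;
    }

    /** Describes a single-file mount: mountPath is the full file path, subPath is the key. */
    static Map<String, String> volumeMount(String capability, String importNamespace, String fileKey) {
        Map<String, String> mount = new LinkedHashMap<>();
        mount.put("configMap", configMapName(capability, importNamespace));
        mount.put("mountPath", MOUNT_DIR + fileKey);
        mount.put("subPath", fileKey);
        return mount;
    }

    public static void main(String[] args) {
        System.out.println(volumeMount("hello-world", "registry", "step7-registry-consumes.yml"));
    }
}
```

Because each mount targets one file rather than the whole directory, several imports can place files side by side under the same parent without shadowing each other.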
Reconciliation Model¶
Skipper follows the Kubernetes controller pattern.
The operator continuously ensures:
If resources drift — deleted Service, modified Deployment, outdated ConfigMap — the reconcile loop restores consistency automatically.
Resources created by Skipper carry OwnerReference pointing to the Capability CR.
When the CR is deleted, Kubernetes garbage-collects all child resources.
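Drift detection in the controller pattern comes down to comparing desired state with observed state. The sketch below is a simplified illustration, not Skipper's implementation: `DriftSketch` is hypothetical, and each child resource is reduced to a string fingerprint of its spec.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of drift detection between desired and observed children. */
class DriftSketch {

    /** Returns the kinds that are missing or whose observed fingerprint differs from desired. */
    static List<String> driftedResources(Map<String, String> desired, Map<String, String> observed) {
        List<String> drifted = new ArrayList<>();
        desired.forEach((kind, fingerprint) -> {
            // A missing resource yields null, which never equals the desired fingerprint.
            if (!fingerprint.equals(observed.get(kind))) {
                drifted.add(kind);
            }
        });
        return drifted;
    }

    public static void main(String[] args) {
        Map<String, String> desired = new LinkedHashMap<>();
        desired.put("ConfigMap", "v2");
        desired.put("Deployment", "v1");
        desired.put("Service", "v1");

        Map<String, String> observed = new LinkedHashMap<>();
        observed.put("ConfigMap", "v1");  // outdated
        observed.put("Deployment", "v1"); // in sync; Service was deleted

        System.out.println(driftedResources(desired, observed));
    }
}
```

Each drifted kind would then be created or updated again on the next reconcile pass.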
OTEL Env Var Injection¶
The operator injects OpenTelemetry environment variables into every engine pod. Configuration is driven by env vars on the operator Deployment itself:
| Operator env var | Injected into pod as | Purpose |
|---|---|---|
| NAFTIKO_ENGINE_IMAGE | (image field) | ikanos image to run |
| NAFTIKO_OTEL_ENDPOINT | OTEL_EXPORTER_OTLP_ENDPOINT | OTLP collector endpoint |
| NAFTIKO_OTEL_PROTOCOL | OTEL_EXPORTER_OTLP_PROTOCOL | grpc or http/protobuf |
| NAFTIKO_OTEL_HEADERS | OTEL_EXPORTER_OTLP_HEADERS | e.g. DD-API-KEY=xxx |
| NAFTIKO_OTEL_SAMPLING_RATE | OTEL_TRACES_SAMPLER_ARG | sampling rate 0.0–1.0 |
OTEL_SERVICE_NAME is always set to naftiko-{capability-name} regardless of operator config.
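The mapping can be sketched as a small translation step. This is an illustration under assumptions: `OtelEnvSketch` is hypothetical, unset or empty operator variables are assumed to be skipped, and the collector URL in the demo is a made-up value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the operator-to-pod OTEL env var translation. */
class OtelEnvSketch {

    /** Operator env var -> pod env var. NAFTIKO_ENGINE_IMAGE is omitted: it sets the image field. */
    static final Map<String, String> MAPPING = new LinkedHashMap<>();
    static {
        MAPPING.put("NAFTIKO_OTEL_ENDPOINT", "OTEL_EXPORTER_OTLP_ENDPOINT");
        MAPPING.put("NAFTIKO_OTEL_PROTOCOL", "OTEL_EXPORTER_OTLP_PROTOCOL");
        MAPPING.put("NAFTIKO_OTEL_HEADERS", "OTEL_EXPORTER_OTLP_HEADERS");
        MAPPING.put("NAFTIKO_OTEL_SAMPLING_RATE", "OTEL_TRACES_SAMPLER_ARG");
    }

    /** Builds the env vars for an engine pod from the operator's own environment. */
    static Map<String, String> podEnv(String capabilityName, Map<String, String> operatorEnv) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("OTEL_SERVICE_NAME", "naftiko-" + capabilityName); // always set
        MAPPING.forEach((from, to) -> {
            String value = operatorEnv.get(from);
            if (value != null && !value.isEmpty()) { // assumption: unset values are skipped
                env.put(to, value);
            }
        });
        return env;
    }

    public static void main(String[] args) {
        // Hypothetical operator environment; in a real process this could be System.getenv().
        Map<String, String> operatorEnv = Map.of(
                "NAFTIKO_OTEL_ENDPOINT", "http://collector:4317",
                "NAFTIKO_OTEL_PROTOCOL", "grpc");
        System.out.println(podEnv("hello-world", operatorEnv));
    }
}
```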
Native Binary Architecture¶
The operator is compiled as a GraalVM native executable.
Benefits:
- low memory footprint
- fast startup
- reduced JVM overhead
- improved Kubernetes density
Container strategy:
Failure Handling¶
If reconciliation fails:
- status.phase becomes Failed
- conditions are updated with the error message
- the reconcile loop retries automatically with exponential backoff
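Exponential backoff doubles the wait after each failed attempt, up to a cap. The sketch below shows the arithmetic only; `BackoffSketch`, the one-second base, and the 60-second cap are assumptions, since the operator's actual retry settings are not stated here.

```java
import java.time.Duration;

/** Hypothetical sketch of a capped exponential backoff schedule. */
class BackoffSketch {

    /** Delay before retry number `attempt` (0-based): base * 2^attempt, capped at max. */
    static Duration delay(int attempt, Duration base, Duration max) {
        // Clamp the exponent so the shift cannot overflow or go negative.
        long factor = 1L << Math.max(0, Math.min(attempt, 30));
        Duration candidate = base.multipliedBy(factor);
        return candidate.compareTo(max) > 0 ? max : candidate;
    }

    public static void main(String[] args) {
        Duration base = Duration.ofSeconds(1); // assumed base delay
        Duration max = Duration.ofSeconds(60); // assumed cap
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt + " -> " + delay(attempt, base, max));
        }
    }
}
```

The cap keeps a persistently failing Capability from waiting ever longer between retries, so it recovers within a bounded time once the underlying error is fixed.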
Successful reconciliation sets:
Design Principles¶
Kubernetes-native¶
Skipper delegates runtime lifecycle management to Kubernetes primitives. The operator only manages desired state — it does not interpret capability business logic.
Declarative Model¶
Users declare what a capability is and how it should be exposed. Skipper determines how resources are materialized.
Loose Coupling¶
The engine container (ikanos) is responsible for parsing the spec, serving
APIs, and executing runtime behaviour. Skipper only orchestrates Kubernetes
resources and is agnostic of the ikanos version.
Spec Fidelity¶
When specRef is used, the raw YAML from the referenced ConfigMap is written
verbatim into the generated ConfigMap — no re-serialization through the Java
model. Unknown fields (MCP tools, descriptions, prompts, aggregates) are
preserved exactly as authored.