Observability
Metrics, logs, and traces — the three pillars of observability
k8s-provisioner includes a complete stack of production-ready Kubernetes components.
The cluster includes Metrics Server for resource monitoring.
# View node resources
kubectl top nodes
# View pod resources
kubectl top pods
# View pods in all namespaces
kubectl top pods -A
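To find the heaviest consumers quickly, the output of kubectl top can be sorted. The sketch below wraps the standard --sort-by flag in a small function; the name top_memory_pods and the default of five rows are assumptions made for illustration, not part of k8s-provisioner.

```shell
# top_memory_pods [N]  (hypothetical helper)
# Prints the header line plus the N pods using the most memory, cluster-wide.
top_memory_pods() {
  kubectl top pods -A --sort-by=memory | head -n "$(( ${1:-5} + 1 ))"
}

# Usage:
#   top_memory_pods      # top 5
#   top_memory_pods 10   # top 10
```

Swapping memory for cpu in --sort-by gives the CPU ranking instead.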
The cluster includes three complementary autoscalers:
| Autoscaler | What it scales | Trigger | Config |
|---|---|---|---|
| HPA | Replicas (horizontal) | CPU, memory, custom metrics | Native Kubernetes |
| VPA | CPU/Memory requests per pod | Historical usage | components.vpa: "enabled" |
| KEDA | Replicas to zero and back | Prometheus, queues, cron, etc. | components.keda: "enabled" |
HPA is native to Kubernetes. Create one for a deployment, then list it:
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10
kubectl get hpa
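The replica count an HPA settles on follows the formula given in the Kubernetes HorizontalPodAutoscaler documentation: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min and max. The sketch below evaluates it with integer arithmetic for the --cpu-percent=50 --min=1 --max=10 example above. The function name hpa_desired_replicas is hypothetical, and the real controller also applies a tolerance band and stabilization windows that are not modelled here.

```shell
# hpa_desired_replicas CURRENT_REPLICAS CURRENT_CPU_PCT TARGET_CPU_PCT MIN MAX
# Hypothetical helper: integer ceil(current * metric / target), clamped to [MIN, MAX].
hpa_desired_replicas() {
  current=$1 metric=$2 target=$3 min=$4 max=$5
  desired=$(( (current * metric + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}

# 4 replicas averaging 90% CPU against a 50% target: ceil(4 * 90 / 50) = 8
hpa_desired_replicas 4 90 50 1 10
```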
VPA automatically adjusts CPU and memory requests based on observed usage.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
| Mode | Behaviour |
|---|---|
| Off | Only shows recommendations |
| Initial | Sets requests at pod creation only |
| Recreate | Evicts pods to apply new requests |
| Auto | Same as Recreate (recommended) |
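In every mode, including Off, the recommendation is published in the status of the VPA object, so it can be read before letting VPA evict anything. A minimal sketch, assuming the my-app-vpa object from the example above lives in the default namespace; the function name vpa_recommendation is hypothetical.

```shell
# vpa_recommendation NAME [NAMESPACE]  (hypothetical helper)
# Prints one line per container with the target CPU and memory requests
# taken from .status.recommendation.containerRecommendations.
vpa_recommendation() {
  kubectl get vpa "$1" -n "${2:-default}" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{" cpu="}{.target.cpu}{" memory="}{.target.memory}{"\n"}{end}'
}

# Usage:
#   vpa_recommendation my-app-vpa
```

The output stays empty until the recommender has collected enough usage history for the target.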
Do not point VPA and HPA at the same metric (CPU or memory) for the same workload at the same time; the two controllers would work against each other.
KEDA scales deployments based on external sources, including Prometheus, and can scale them to zero.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        metricName: http_requests_total
        threshold: "100"
        query: sum(rate(http_requests_total{app="my-app"}[2m]))
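To make the threshold concrete: KEDA passes the query result to an HPA as an external metric, and with the default AverageValue metric type the target works out to roughly ceil(queryValue / threshold) replicas, bounded by minReplicaCount and maxReplicaCount. The sketch below is a simplified model under those assumptions (integer inputs, activation threshold left at its default of 0, cooldown and polling intervals ignored); keda_replicas is a hypothetical name.

```shell
# keda_replicas QUERY_VALUE THRESHOLD MIN MAX  (hypothetical helper, integer inputs)
keda_replicas() {
  value=$1 threshold=$2 min=$3 max=$4
  if [ "$value" -le 0 ]; then
    echo "$min"   # trigger inactive: fall back to minReplicaCount
    return
  fi
  desired=$(( (value + threshold - 1) / threshold ))   # ceil(value / threshold)
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}

keda_replicas 450 100 0 10   # 450 req/s against threshold 100: 5 replicas
keda_replicas 0 100 0 10     # no traffic: scale to zero
```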
Scale on cron schedule:
triggers:
  - type: cron
    metadata:
      timezone: America/Sao_Paulo
      start: "0 8 * * 1-5"
      end: "0 18 * * 1-5"
      desiredReplicas: "3"
| Component | Description |
|---|---|
| Prometheus Operator | Manages Prometheus instances |
| Prometheus | Metrics collection and storage |
| Grafana | Visualization and dashboards |
| Node Exporter | Host metrics (CPU, memory, disk) |
| kube-state-metrics | Kubernetes object metrics |
# Get Istio Ingress IP
INGRESS_IP=$(kubectl get svc -n istio-system istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# Add to /etc/hosts
echo "$INGRESS_IP grafana.local prometheus.local alertmanager.local" | sudo tee -a /etc/hosts
# Access
open https://grafana.local
Credentials:
Username: admin
Password: vault kv get -field=grafana_admin_password secret/k8s-provisioner/api-keys
If Vault is disabled, the password is admin123.
open https://prometheus.local
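One caveat with the /etc/hosts step above: the jsonpath lookup returns an empty string while the ingress gateway has no load balancer IP yet, and the echo would then append a line without an address. A small guard can catch that. This is a sketch; hosts_entry is a hypothetical helper and 192.0.2.10 is a documentation-only placeholder address.

```shell
# hosts_entry IP HOST...  (hypothetical helper)
# Prints an /etc/hosts line, or fails with a message if the IP is empty.
hosts_entry() {
  ip=$1
  if [ -z "$ip" ]; then
    echo "ingress IP is empty; is the LoadBalancer ready?" >&2
    return 1
  fi
  shift
  echo "$ip $*"
}

# Example with a placeholder address
hosts_entry 192.0.2.10 grafana.local prometheus.local alertmanager.local

# Real use, reusing INGRESS_IP from the step above:
#   hosts_entry "$INGRESS_IP" grafana.local prometheus.local alertmanager.local | sudo tee -a /etc/hosts
```

When the IP is empty the function prints nothing to stdout, so tee appends nothing.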
Import dashboards from grafana.com: Dashboards → Import → Enter ID → Load
| ID | Dashboard | Description |
|---|---|---|
| 15760 | Kubernetes / Views / Global | Cluster overview |
| 15757 | Kubernetes / Views / Namespaces | Per namespace metrics |
| 15759 | Kubernetes / Views / Pods | Pod details |
| 10000 | Kubernetes Cluster | Complete cluster view |
| 12740 | Kubernetes Monitoring | General monitoring |
| ID | Dashboard | Description |
|---|---|---|
| 1860 | Node Exporter Full | Detailed host metrics |
| ID | Dashboard | Description |
|---|---|---|
| 4701 | JVM Micrometer | Spring Boot with Micrometer |
| 8563 | JVM Dashboard | JMX Exporter metrics |
| 11955 | JVM Metrics | Heap, GC, Threads |
| 14430 | JVM Overview | Complete JVM view |
Note: Java apps need to expose metrics via Micrometer or JMX Exporter
| ID | Dashboard | Description |
|---|---|---|
| 10826 | Go Processes | Go runtime metrics |
| 6671 | Go Metrics | Goroutines, GC, Memory |
| 14061 | Go Runtime | Detailed runtime |
Note: Go apps need to expose metrics via prometheus/client_golang
| Component | Description |
|---|---|
| Loki 3.x | Log aggregation and storage (TSDB schema v13) |
| Grafana Alloy | Log collector DaemonSet — replaces Promtail, collects pod logs and Kubernetes events |
Logs are accessed via Grafana:
# All logs from a namespace
{namespace="default"}
# Logs from kube-system
{namespace="kube-system"}
# Logs from specific pods
{pod=~"nginx.*"}
# Filter by container
{container="app"}
# Search for errors
{namespace="default"} |= "error"
# Search for errors (case insensitive)
{namespace="default"} |~ "(?i)error"
# Exclude patterns
{namespace="default"} != "health"
# Multiple filters
{namespace="default", container="app"} |= "error" != "timeout"
# Parse JSON logs
{namespace="default"} | json | level="error"
# Count errors per pod
sum by (pod) (count_over_time({namespace="default"} |= "error" [5m]))
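The same LogQL queries can also be run outside Grafana against Loki's HTTP API, which is handy in scripts. The sketch below assumes Loki is reachable at LOKI_URL, for example through a port-forward to the Loki service; the address, the function name loki_query and the default limit of 20 are assumptions, while /loki/api/v1/query_range with its query and limit parameters is Loki's standard endpoint.

```shell
# Assumption: Loki is reachable here, e.g. after a port-forward to the
# Loki service (service name and namespace depend on the install).
LOKI_URL=${LOKI_URL:-http://localhost:3100}

# loki_query LOGQL [LIMIT]  (hypothetical helper)
# Runs a LogQL query over Loki's default look-back window and prints raw JSON.
loki_query() {
  curl -G -s "$LOKI_URL/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=${2:-20}"
}

# Usage:
#   loki_query '{namespace="default"} |= "error"'
#   loki_query 'sum by (pod) (count_over_time({namespace="default"} |= "error" [5m]))'
```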
| ID | Dashboard | Description |
|---|---|---|
| 13639 | Loki Dashboard | General log overview |
| 12611 | Loki & Alloy | Logs with Alloy stats |
| 15141 | Loki Logs | Simple log viewer |
Two storage classes are available:
nfs-dynamic - Automatic PV creation (recommended)
nfs-static - Manual PV/PVC management
With nfs-dynamic, just create a PVC and the PV is created automatically:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  storageClassName: nfs-dynamic
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
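To check that the claim is actually usable, a throwaway pod can mount it and write a file. The sketch below only prints such a manifest; the function name pvc_test_pod, the busybox image, the /data mount path and the probe file are all illustrative assumptions.

```shell
# pvc_test_pod CLAIM_NAME  (hypothetical helper)
# Prints a Pod manifest that mounts the claim at /data and writes a probe file.
pvc_test_pod() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1-test
spec:
  containers:
    - name: shell
      image: busybox
      command: ["sh", "-c", "echo ok > /data/probe && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: $1
EOF
}

pvc_test_pod my-data
# Apply it with:  pvc_test_pod my-data | kubectl apply -f -
```

Because the claim is ReadWriteMany, a second pod generated the same way under a different name could mount the volume concurrently.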
For specific NFS paths, create the PV first, then a PVC that binds to it:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  storageClassName: nfs-static
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.56.20
    path: /exports/k8s-volumes/my-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: nfs-static
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
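With static provisioning the claim only becomes usable once it binds to the matching PV, so it can help to wait for the Bound phase before deploying workloads. A minimal sketch, assuming a kubectl version that supports --for=jsonpath in kubectl wait (1.23 or later); wait_pvc_bound and the 60s default timeout are hypothetical.

```shell
# wait_pvc_bound NAME [NAMESPACE] [TIMEOUT]  (hypothetical helper)
# Blocks until the PVC reports phase Bound, or fails after TIMEOUT.
wait_pvc_bound() {
  kubectl wait --for=jsonpath='{.status.phase}'=Bound \
    "pvc/$1" -n "${2:-default}" --timeout="${3:-60s}"
}

# Usage:
#   wait_pvc_bound my-pvc
```

If the wait times out, kubectl describe pvc my-pvc usually shows whether the storage class, capacity, or access modes failed to match.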
Note: Karpor requires extra resources (~1.5 CPU, ~2GB RAM). To disable it, set components.karpor: "none" in config.yaml.
# Get Istio Ingress IP
INGRESS_IP=$(kubectl get svc -n istio-system istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# Add to /etc/hosts
echo "$INGRESS_IP karpor.local" | sudo tee -a /etc/hosts
# Access
open https://karpor.local
kind:Deployment # All deployments
namespace:monitoring kind:Pod # Pods in monitoring namespace
label:app=nginx # Resources with label
name:*api* # Resources with "api" in name
kubectl get pods -n karpor
kubectl logs -n karpor -l app.kubernetes.io/component=karpor-server
Note: Ollama is only installed when karpor_ai.enabled: true.
Local models run inside the cluster on node01:
| Model | RAM | Quality | Speed |
|---|---|---|---|
| llama3.2:1b | ~2GB | Basic | Very fast |
| llama3.2:3b | ~4GB | Good | Fast |
| qwen2.5-coder:7b | ~8GB | Excellent | Moderate |
| llama3.1:8b | ~10GB | Excellent | Slower |
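The RAM column can be read as a simple sizing rule. The sketch below encodes the table as thresholds and returns the largest listed model that fits a given amount of free memory on node01. It is only an illustration: pick_local_model is a hypothetical helper, the figures are the approximate values from the table, and headroom for other workloads is not accounted for.

```shell
# pick_local_model FREE_RAM_GB  (hypothetical helper, integer GB)
# Returns the largest model from the table whose approximate RAM need fits.
pick_local_model() {
  ram=$1
  if   [ "$ram" -ge 10 ]; then echo "llama3.1:8b"
  elif [ "$ram" -ge 8 ];  then echo "qwen2.5-coder:7b"
  elif [ "$ram" -ge 4 ];  then echo "llama3.2:3b"
  elif [ "$ram" -ge 2 ];  then echo "llama3.2:1b"
  else
    echo "not enough RAM for a local model" >&2
    return 1
  fi
}

pick_local_model 6   # fits llama3.2:3b (~4GB) but not qwen2.5-coder:7b (~8GB)
```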
Cloud models offer better performance without GPU requirements:
Set ollama.api_key in config.yaml.
| Model | Description |
|---|---|
| minimax-m2.5:cloud | Top performer, comparable to Claude Opus |
| qwen3-coder:480b-cloud | Excellent for code analysis |
| glm-4.7:cloud | Good general purpose |
# Check pods
kubectl get pods -n ollama
# Check if model is loaded
kubectl exec -n ollama deployment/ollama -- ollama list
# Check logs
kubectl logs -n ollama deployment/ollama
# Test AI endpoint
kubectl exec -n ollama deployment/ollama -- curl -s localhost:11434/api/tags
To change the model after installation:
# Pull new model
kubectl exec -n ollama deployment/ollama -- ollama pull llama3.1:8b
# Restart Karpor to use new model
kubectl rollout restart deployment/karpor-server -n karpor
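The two steps above can be chained so that the restart only happens if the pull succeeds, followed by a wait for the rollout to finish. A sketch using the same commands as above plus kubectl rollout status; the function name switch_model and the 180s timeout are assumptions.

```shell
# switch_model MODEL  (hypothetical helper)
# Pulls the model, restarts Karpor, then waits for the rollout to complete.
# Each step runs only if the previous one succeeded.
switch_model() {
  kubectl exec -n ollama deployment/ollama -- ollama pull "$1" &&
  kubectl rollout restart deployment/karpor-server -n karpor &&
  kubectl rollout status deployment/karpor-server -n karpor --timeout=180s
}

# Usage:
#   switch_model llama3.1:8b
```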