Components

Deep dive into the components included in k8s-provisioner

k8s-provisioner includes a complete stack of production-ready Kubernetes components.

Networking

  • Calico - CNI plugin for pod networking and network policies
  • MetalLB - LoadBalancer implementation for bare metal

Service Mesh

  • Istio - Service mesh for traffic management, security, and observability

Metrics Server

The cluster includes Metrics Server, which supplies the CPU and memory usage data behind kubectl top and the Horizontal Pod Autoscaler.

Usage

# View node resources
kubectl top nodes

# View pod resources
kubectl top pods

# View pod resources in all namespaces
kubectl top pods -A
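To see which namespaces use the most CPU, the kubectl top output can be aggregated with standard tools. The following is a minimal sketch, assuming the default column layout of kubectl top pods -A --no-headers (namespace, pod, CPU in millicores, memory); cpu_by_namespace is a hypothetical helper name, not part of k8s-provisioner.

```shell
# Sum CPU usage (millicores) per namespace.
# Reads `kubectl top pods -A --no-headers` output on stdin and assumes the
# columns NAMESPACE NAME CPU(cores) MEMORY(bytes), with CPU values like "12m".
cpu_by_namespace() {
  awk '{ sub(/m$/, "", $3); cpu[$1] += $3 }
       END { for (ns in cpu) printf "%s %dm\n", ns, cpu[ns] }' | sort
}

# Usage (requires a running cluster with Metrics Server):
#   kubectl top pods -A --no-headers | cpu_by_namespace
```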

Autoscaling

The cluster includes three complementary autoscalers:

Autoscaler  What it scales               Trigger                         Config
HPA         Replicas (horizontal)        CPU, memory, custom metrics     Native Kubernetes
VPA         CPU/memory requests per pod  Historical usage                components.vpa: "enabled"
KEDA        Replicas to zero and back    Prometheus, queues, cron, etc.  components.keda: "enabled"

HPA (Horizontal Pod Autoscaler)

kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10
kubectl get hpa
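To get a feel for how the 50% CPU target translates into replica counts, the scaling rule documented for the HPA can be worked through by hand: desired = ceil(currentReplicas × currentValue / targetValue), clamped to the min/max bounds. The sketch below uses integer arithmetic only and ignores the controller's tolerance band and stabilization window; hpa_desired_replicas is a hypothetical helper name.

```shell
# Desired replicas per the HPA formula:
#   desired = ceil(current_replicas * current_value / target_value)
# clamped to [min, max]. Integer inputs only.
hpa_desired_replicas() {
  current=$1 value=$2 target=$3 min=$4 max=$5
  desired=$(( (current * value + target - 1) / target ))   # integer ceiling
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}

# 4 replicas averaging 90% CPU against the 50% target used above
hpa_desired_replicas 4 90 50 1 10   # prints 8
```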

VPA (Vertical Pod Autoscaler)

Automatically adjusts CPU and memory requests based on observed usage.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"

Mode      Behaviour
Off       Only shows recommendations
Initial   Sets requests at pod creation only
Recreate  Evicts pods to apply new requests
Auto      Same as Recreate (recommended)

Do not let VPA and HPA act on the same metric (CPU or memory) for the same workload; the two controllers would work against each other.
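Before letting VPA evict pods, it can be useful to look at what it would recommend. A minimal sketch, assuming the VPA object from the manifest above lives in the default namespace; vpa_targets is a hypothetical helper that reads the recommendation the VPA recommender writes into the object's status.

```shell
# Print the recommended target requests per container for a VPA object.
# Reads .status.recommendation.containerRecommendations, which is only
# populated after the recommender has observed the workload for a while.
vpa_targets() {
  name=$1 namespace=${2:-default}
  kubectl get vpa "$name" -n "$namespace" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{" cpu="}{.target.cpu}{" memory="}{.target.memory}{"\n"}{end}'
}

# Usage:
#   vpa_targets my-app-vpa
```

With updateMode set to "Off", this is a way to review recommendations without any pod being changed.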

KEDA (Kubernetes Event-Driven Autoscaler)

Scales workloads based on external event sources, including Prometheus, and can scale them down to zero.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090
      metricName: http_requests_total
      threshold: "100"
      query: sum(rate(http_requests_total{app="my-app"}[2m]))
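The threshold of "100" is easier to reason about with numbers. Assuming the trigger uses KEDA's default metric type (AverageValue), the threshold behaves as a per-replica target, so the replica count is roughly ceil(query result / threshold), bounded by minReplicaCount and maxReplicaCount. The sketch below is illustrative arithmetic only: it uses integers, treats a result of zero as "inactive", and ignores the cooldown period; keda_prometheus_replicas is a hypothetical helper name.

```shell
# Approximate replica count for a Prometheus trigger:
#   replicas = ceil(query_value / threshold), clamped to [min, max]
# A value of 0 is treated as "inactive" and falls back to min (scale to zero
# when minReplicaCount is 0).
keda_prometheus_replicas() {
  value=$1 threshold=$2 min=$3 max=$4
  if [ "$value" -le 0 ]; then
    echo "$min"
    return
  fi
  desired=$(( (value + threshold - 1) / threshold ))   # integer ceiling
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}

# 450 requests/s against the threshold of 100 from the ScaledObject above
keda_prometheus_replicas 450 100 0 10   # prints 5
```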

Scale on cron schedule:

triggers:
- type: cron
  metadata:
    timezone: America/Sao_Paulo
    start: "0 8 * * 1-5"
    end: "0 18 * * 1-5"
    desiredReplicas: "3"

Monitoring (Prometheus + Grafana)

Components

Component            Description
Prometheus Operator  Manages Prometheus instances
Prometheus           Metrics collection and storage
Grafana              Visualization and dashboards
Node Exporter        Host metrics (CPU, memory, disk)
kube-state-metrics   Kubernetes object metrics

Accessing Grafana

# Get Istio Ingress IP
INGRESS_IP=$(kubectl get svc -n istio-system istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Add to /etc/hosts
echo "$INGRESS_IP grafana.local prometheus.local alertmanager.local" | sudo tee -a /etc/hosts

# Access
open https://grafana.local

Credentials:

  • Username: admin
  • Password: from Vault → vault kv get -field=grafana_admin_password secret/k8s-provisioner/api-keys

    If Vault is disabled: admin123
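The tee -a command above appends a new line to /etc/hosts every time it runs. A minimal sketch of a guard that only emits the line when the first hostname is not yet present; hosts_line_if_missing is a hypothetical helper, and grep -w is assumed to be available (it is in GNU and BSD grep).

```shell
# Print "<ip> <hostnames...>" only if the first hostname is not already
# listed in the given hosts file, so repeated runs do not add duplicates.
hosts_line_if_missing() {
  hosts_file=$1 ip=$2
  shift 2
  grep -qwF -- "$1" "$hosts_file" 2>/dev/null || echo "$ip $*"
}

# Usage (nothing is appended when the entry already exists):
#   hosts_line_if_missing /etc/hosts "$INGRESS_IP" grafana.local prometheus.local alertmanager.local \
#     | sudo tee -a /etc/hosts
```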

Accessing Prometheus

open https://prometheus.local

To import dashboards from grafana.com, open Grafana and go to Dashboards → Import → Enter ID → Load.

Kubernetes Dashboards

ID     Dashboard                        Description
15760  Kubernetes / Views / Global      Cluster overview
15757  Kubernetes / Views / Namespaces  Per-namespace metrics
15759  Kubernetes / Views / Pods        Pod details
10000  Kubernetes Cluster               Complete cluster view
12740  Kubernetes Monitoring            General monitoring

Node Dashboards

ID    Dashboard           Description
1860  Node Exporter Full  Detailed host metrics

Java/JVM Dashboards

ID     Dashboard       Description
4701   JVM Micrometer  Spring Boot with Micrometer
8563   JVM Dashboard   JMX Exporter metrics
11955  JVM Metrics     Heap, GC, threads
14430  JVM Overview    Complete JVM view

Note: Java apps need to expose metrics via Micrometer or JMX Exporter.

Go Dashboards

ID     Dashboard     Description
10826  Go Processes  Go runtime metrics
6671   Go Metrics    Goroutines, GC, memory
14061  Go Runtime    Detailed runtime

Note: Go apps need to expose metrics via prometheus/client_golang.


Logging (Loki + Grafana Alloy)

Components

Component      Description
Loki 3.x       Log aggregation and storage (TSDB schema v13)
Grafana Alloy  Log collector DaemonSet; replaces Promtail and collects pod logs and Kubernetes events

Accessing Logs

Logs are accessed via Grafana:

  1. Open https://grafana.local
  2. Go to Explore (left sidebar)
  3. Select Loki as datasource

LogQL Query Examples

# All logs from a namespace
{namespace="default"}

# Logs from kube-system
{namespace="kube-system"}

# Logs from specific pods
{pod=~"nginx.*"}

# Filter by container
{container="app"}

# Search for errors
{namespace="default"} |= "error"

# Search for errors (case insensitive)
{namespace="default"} |~ "(?i)error"

# Exclude patterns
{namespace="default"} != "health"

# Multiple filters
{namespace="default", container="app"} |= "error" != "timeout"

# Parse JSON logs
{namespace="default"} | json | level="error"

# Count errors per pod
sum by (pod) (count_over_time({namespace="default"} |= "error" [5m]))

Loki Dashboards

ID     Dashboard       Description
13639  Loki Dashboard  General log overview
12611  Loki & Alloy    Logs with Alloy stats
15141  Loki Logs       Simple log viewer
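The LogQL queries in this section can also be run outside Grafana through Loki's HTTP API (query_range endpoint). This is a minimal sketch: loki_query is a hypothetical helper, and the Loki URL is an assumption, since the service name and namespace used by k8s-provisioner are not stated here. When start and end are omitted, Loki queries the last hour.

```shell
# Run a LogQL range query against Loki's HTTP API.
# $1 = base URL of Loki (assumption, e.g. a local port-forward)
# $2 = LogQL query, $3 = max number of entries (default 100)
loki_query() {
  loki_url=$1 query=$2 limit=${3:-100}
  curl -sG "$loki_url/loki/api/v1/query_range" \
    --data-urlencode "query=$query" \
    --data-urlencode "limit=$limit"
}

# Usage (after port-forwarding the Loki service to localhost:3100):
#   loki_query http://localhost:3100 '{namespace="default"} |= "error"' 20
```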

Storage

  • NFS Server - Dedicated VM for persistent storage
  • NFS Dynamic Provisioner - Automatic PV provisioning via StorageClasses:
    • nfs-dynamic - Automatic PV creation (recommended)
    • nfs-static - Manual PV/PVC management

Using Dynamic Storage

Create a PVC; the provisioner creates the matching PV automatically:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  storageClassName: nfs-dynamic
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
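To confirm that the claim is usable, it can be mounted from a throwaway pod. The helper below is a sketch that only prints a manifest; the pod name, the busybox image, and the /data mount path are illustrative choices, not something k8s-provisioner prescribes.

```shell
# Emit a minimal Pod manifest that mounts the given PVC at /data and
# writes a probe file to it.
pvc_test_pod() {
  claim=$1
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${claim}-test
spec:
  containers:
    - name: shell
      image: busybox:1.36
      command: ["sh", "-c", "echo ok > /data/probe && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: ${claim}
EOF
}

# Usage:
#   pvc_test_pod my-data | kubectl apply -f -
```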

Using Static Storage

To use a specific NFS path, create the PV first, then a PVC with the same storage class:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  storageClassName: nfs-static
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.56.20
    path: /exports/k8s-volumes/my-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: nfs-static
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
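With static provisioning the claim only becomes usable once it binds to the PV. A minimal sketch for waiting on that, assuming kubectl 1.23 or newer (where kubectl wait accepts a jsonpath condition); wait_pvc_bound is a hypothetical helper name.

```shell
# Block until the PVC reports phase "Bound", or fail after the timeout.
wait_pvc_bound() {
  claim=$1 timeout=${2:-60s}
  kubectl wait --for=jsonpath='{.status.phase}'=Bound "pvc/$claim" --timeout="$timeout"
}

# Usage:
#   wait_pvc_bound my-pvc && kubectl get pv my-pv
```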

Kubernetes Explorer (Karpor)

  • Karpor - Kubernetes Explorer with intelligent search and AI-powered analysis

Note: Karpor requires extra resources (~1.5 CPU, ~2GB RAM). To disable, set components.karpor: "none" in config.yaml.

Features

  • Resource Search - Find resources across the cluster with powerful queries
  • AI Analysis - Natural language insights about your resources
  • Dependency View - Visualize relationships between resources

Accessing Karpor

# Get Istio Ingress IP
INGRESS_IP=$(kubectl get svc -n istio-system istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Add to /etc/hosts
echo "$INGRESS_IP karpor.local" | sudo tee -a /etc/hosts

# Access
open https://karpor.local

Search Examples

kind:Deployment                    # All deployments
namespace:monitoring kind:Pod      # Pods in monitoring namespace
label:app=nginx                    # Resources with label
name:*api*                         # Resources with "api" in name

Check Karpor Status

kubectl get pods -n karpor
kubectl logs -n karpor -l app.kubernetes.io/component=karpor-server

AI Backend (Ollama)

  • Ollama - Local and cloud LLM backend for Karpor AI features

Note: Ollama is only installed when karpor_ai.enabled: true.

Local Models (Default)

Local models run inside the cluster on node01:

Model             RAM    Quality    Speed
llama3.2:1b       ~2GB   Basic      Very fast
llama3.2:3b       ~4GB   Good       Fast
qwen2.5-coder:7b  ~8GB   Excellent  Moderate
llama3.1:8b       ~10GB  Excellent  Slower
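As a rough way to read the table, the sketch below picks the largest listed model whose approximate RAM figure fits a given budget. The "largest that fits" policy is an assumption made for illustration, the RAM values are the approximate ones from the table, and pick_local_model is a hypothetical helper.

```shell
# Pick the largest local model from the table that fits in the given RAM (GB).
pick_local_model() {
  ram_gb=$1
  if   [ "$ram_gb" -ge 10 ]; then echo "llama3.1:8b"
  elif [ "$ram_gb" -ge 8 ];  then echo "qwen2.5-coder:7b"
  elif [ "$ram_gb" -ge 4 ];  then echo "llama3.2:3b"
  elif [ "$ram_gb" -ge 2 ];  then echo "llama3.2:1b"
  else
    echo "no listed model fits in ${ram_gb}GB" >&2
    return 1
  fi
}

pick_local_model 6   # prints llama3.2:3b
```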

Cloud Models (Optional)

Cloud models offer better performance without GPU requirements:

  1. Create an account at https://ollama.com/signup
  2. Generate an API key at https://ollama.com/settings/keys
  3. Set ollama.api_key in config.yaml

Model                   Description
minimax-m2.5:cloud      Top performer, comparable to Claude Opus
qwen3-coder:480b-cloud  Excellent for code analysis
glm-4.7:cloud           Good general purpose

Check Ollama Status

# Check pods
kubectl get pods -n ollama

# Check if model is loaded
kubectl exec -n ollama deployment/ollama -- ollama list

# Check logs
kubectl logs -n ollama deployment/ollama

# Test AI endpoint
kubectl exec -n ollama deployment/ollama -- curl -s localhost:11434/api/tags
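Beyond listing models, a quick end-to-end check is to send a single prompt through the Ollama CLI inside the pod. A minimal sketch, assuming the model has already been pulled; ollama_prompt is a hypothetical helper name.

```shell
# Send a one-off prompt to a model via the Ollama CLI in the ollama deployment.
ollama_prompt() {
  model=$1 prompt=$2
  kubectl exec -n ollama deployment/ollama -- ollama run "$model" "$prompt"
}

# Usage:
#   ollama_prompt llama3.2:3b "Reply with the single word: ready"
```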

Switching Models

To change the model after installation:

# Pull new model
kubectl exec -n ollama deployment/ollama -- ollama pull llama3.1:8b

# Restart Karpor to use new model
kubectl rollout restart deployment/karpor-server -n karpor
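The two steps above can be wrapped so the restart only happens if the pull succeeds, followed by a wait for the rollout to finish. This is a sketch: switch_model is a hypothetical helper and the 120s timeout is an arbitrary choice.

```shell
# Pull a model, restart Karpor, and wait until the new pods are ready.
# Each step runs only if the previous one succeeded.
switch_model() {
  model=$1
  kubectl exec -n ollama deployment/ollama -- ollama pull "$model" &&
  kubectl rollout restart deployment/karpor-server -n karpor &&
  kubectl rollout status deployment/karpor-server -n karpor --timeout=120s
}

# Usage:
#   switch_model llama3.1:8b
```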
