Components

Deep dive into the components included in k8s-provisioner

k8s-provisioner includes a complete stack of production-ready Kubernetes components.

Networking

Calico - CNI plugin for pod networking and network policies
MetalLB - LoadBalancer implementation for bare metal

Service Mesh

Istio - Service mesh for traffic management, security, and observability

Metrics Server

The cluster includes Metrics Server, which supplies the CPU and memory usage data behind kubectl top and the Horizontal Pod Autoscaler.

Usage

# View node resources
kubectl top nodes

# View pod resources
kubectl top pods

# View pod resources in all namespaces
kubectl top pods -A

Autoscaling

The cluster includes three complementary autoscalers:

Autoscaler	What it scales	Trigger	Config
HPA	Replicas (horizontal)	CPU, memory, custom metrics	Native Kubernetes
VPA	CPU/Memory requests per pod	Historical usage	`components.vpa: "enabled"`
KEDA	Replicas to zero and back	Prometheus, queues, cron, etc.	`components.keda: "enabled"`
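The Config column refers to keys in the provisioner's configuration file. A minimal sketch of enabling both optional autoscalers, assuming the dotted names in the table map to nested YAML keys in config.yaml (the nesting is an assumption, not something this page confirms):

```yaml
# config.yaml (fragment; key nesting is assumed from the dotted names above)
components:
  vpa: "enabled"    # installs the Vertical Pod Autoscaler
  keda: "enabled"   # installs KEDA
```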

HPA (Horizontal Pod Autoscaler)

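# Keep my-app between 1 and 10 replicas, targeting 50% average CPU utilization
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10
# List HPAs with their current and target metrics
kubectl get hpa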
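The same autoscaler can also be declared as a manifest, which is easier to keep in version control. A minimal sketch, assuming a Deployment named my-app whose containers set CPU requests (utilization targets are computed against requests):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # same target as --cpu-percent=50
```

Apply it with kubectl apply -f and check the result with kubectl get hpa.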

VPA (Vertical Pod Autoscaler)

Automatically adjusts CPU and memory requests based on observed usage.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"

Mode	Behaviour
`Off`	Only shows recommendations
`Initial`	Sets requests at pod creation only
`Recreate`	Evicts pods to apply new requests
`Auto`	Same as Recreate (recommended)

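Do not let VPA and HPA act on the same resource metric (CPU or memory) for the same workload: HPA changes the replica count while VPA changes the requests that the utilization is measured against, so the two work against each other. If VPA manages CPU and memory, drive HPA from custom or external metrics instead.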
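One way to combine the two safely is to restrict VPA to memory and leave CPU-based scaling to HPA. A minimal sketch, assuming the my-app Deployment from the example above and an HPA that scales only on CPU:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"              # apply to every container in the pod
      controlledResources: ["memory"] # VPA leaves CPU requests untouched
```

With this policy, VPA only rewrites memory requests, so an HPA targeting CPU utilization keeps a stable baseline.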

KEDA (Kubernetes Event-Driven Autoscaler)

Scales workloads based on external event sources, including Prometheus queries, and can scale a workload down to zero replicas when there is no demand.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090
      metricName: http_requests_total  # deprecated since KEDA v2.10; can be omitted on newer versions
      threshold: "100"
      query: sum(rate(http_requests_total{app="my-app"}[2m]))

Scale on a cron schedule (here, 3 replicas on weekdays between 08:00 and 18:00):

triggers:
- type: cron
  metadata:
    timezone: America/Sao_Paulo
    start: "0 8 * * 1-5"
    end: "0 18 * * 1-5"
    desiredReplicas: "3"
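For context, here is how the cron trigger fits into a complete ScaledObject. A minimal sketch, assuming the same my-app Deployment; the object name my-app-business-hours is hypothetical:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-business-hours   # hypothetical name
spec:
  scaleTargetRef:
    name: my-app
  minReplicaCount: 0            # outside the window, scale down to zero
  maxReplicaCount: 3
  triggers:
  - type: cron
    metadata:
      timezone: America/Sao_Paulo
      start: "0 8 * * 1-5"      # 08:00, Monday to Friday
      end: "0 18 * * 1-5"       # 18:00, Monday to Friday
      desiredReplicas: "3"
```

Inside the window KEDA holds the workload at desiredReplicas; outside it, the workload falls back to minReplicaCount.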

Monitoring (Prometheus + Grafana)

Components

Component	Description
Prometheus Operator	Manages Prometheus instances
Prometheus	Metrics collection and storage
Grafana	Visualization and dashboards
Node Exporter	Host metrics (CPU, memory, disk)
kube-state-metrics	Kubernetes object metrics

Accessing Grafana

# Get Istio Ingress IP
INGRESS_IP=$(kubectl get svc -n istio-system istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Add to /etc/hosts
echo "$INGRESS_IP grafana.local prometheus.local alertmanager.local" | sudo tee -a /etc/hosts

# Open in a browser (open is the macOS command; on Linux use xdg-open)
open https://grafana.local

Credentials:

Username: admin
Password: stored in Vault; retrieve it with `vault kv get -field=grafana_admin_password secret/k8s-provisioner/api-keys`
If Vault is disabled, the password is admin123

Accessing Prometheus

open https://prometheus.local

Recommended Dashboards

Import dashboards from grafana.com: Dashboards → Import → Enter ID → Load

Kubernetes Dashboards

ID	Dashboard	Description
`15760`	Kubernetes / Views / Global	Cluster overview
`15757`	Kubernetes / Views / Namespaces	Per namespace metrics
`15759`	Kubernetes / Views / Pods	Pod details
`10000`	Kubernetes Cluster	Complete cluster view
`12740`	Kubernetes Monitoring	General monitoring

Node Dashboards

ID	Dashboard	Description
`1860`	Node Exporter Full	Detailed host metrics

Java/JVM Dashboards

ID	Dashboard	Description
`4701`	JVM Micrometer	Spring Boot with Micrometer
`8563`	JVM Dashboard	JMX Exporter metrics
`11955`	JVM Metrics	Heap, GC, Threads
`14430`	JVM Overview	Complete JVM view

Note: These dashboards only show data if the Java application exposes metrics via Micrometer or the JMX Exporter.
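Since the stack includes the Prometheus Operator, an application's metrics endpoint is typically registered with a ServiceMonitor. A minimal sketch for a Spring Boot app using Micrometer, where the names my-java-app, the port name http, and the release: prometheus label are all assumptions (the label must match whatever selector this cluster's Prometheus instance uses):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-java-app            # hypothetical name
  namespace: default
  labels:
    release: prometheus        # assumption: must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-java-app         # labels on the Service to scrape (hypothetical)
  endpoints:
  - port: http                 # name of the Service port (hypothetical)
    path: /actuator/prometheus # Spring Boot Actuator's Prometheus endpoint
    interval: 30s
```

The same pattern applies to other runtimes; only the path changes (for example /metrics for a typical Go service).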

Go Dashboards

ID	Dashboard	Description
`10826`	Go Processes	Go runtime metrics
`6671`	Go Metrics	Goroutines, GC, Memory
`14061`	Go Runtime	Detailed runtime

Note: These dashboards only show data if the Go application exposes metrics via prometheus/client_golang.

Logging (Loki + Grafana Alloy)

Components

Component	Description
Loki 3.x	Log aggregation and storage (TSDB schema v13)
Grafana Alloy	Log collector DaemonSet — replaces Promtail, collects pod logs and Kubernetes events

Accessing Logs

Logs are accessed via Grafana:

1. Open https://grafana.local
2. Go to Explore (left sidebar)
3. Select Loki as datasource

LogQL Query Examples

# All logs from a namespace
{namespace="default"}

# Logs from kube-system
{namespace="kube-system"}

# Logs from specific pods
{pod=~"nginx.*"}

# Filter by container
{container="app"}

# Search for errors
{namespace="default"} |= "error"

# Search for errors (case insensitive)
{namespace="default"} |~ "(?i)error"

# Exclude patterns
{namespace="default"} != "health"

# Multiple filters
{namespace="default", container="app"} |= "error" != "timeout"

# Parse JSON logs
{namespace="default"} | json | level="error"

# Count errors per pod
sum by (pod) (count_over_time({namespace="default"} |= "error" [5m]))

Recommended Log Dashboards

ID	Dashboard	Description
`13639`	Loki Dashboard	General log overview
`12611`	Loki & Alloy	Logs with Alloy stats
`15141`	Loki Logs	Simple log viewer

Storage

NFS Server - Dedicated VM for persistent storage
NFS Dynamic Provisioner - Automatic PV provisioning via StorageClasses:
- nfs-dynamic - Automatic PV creation (recommended)
- nfs-static - Manual PV/PVC management

Using Dynamic Storage

Create a PVC that references the nfs-dynamic StorageClass; the provisioner creates the matching PV automatically:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  storageClassName: nfs-dynamic
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
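To use the claim, reference it from a pod. A minimal sketch, assuming the my-data PVC above; the pod name, image, and mount path are hypothetical choices for illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-data-demo           # hypothetical name
spec:
  containers:
  - name: app
    image: busybox:1.36
    # Write a file to the NFS-backed volume, then stay alive for inspection
    command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-data       # the PVC created above
```

Because the claim uses ReadWriteMany, several pods on different nodes can mount the same volume at once.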

Using Static Storage

To mount a specific NFS path, create the PV first and then a PVC with the same StorageClass, access mode, and size so that it binds to it:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  storageClassName: nfs-static
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.56.20
    path: /exports/k8s-volumes/my-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: nfs-static
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi

Kubernetes Explorer (Karpor)

Karpor - Kubernetes Explorer with intelligent search and AI-powered analysis

Note: Karpor requires extra resources (~1.5 CPU, ~2GB RAM). To disable, set components.karpor: "none" in config.yaml.

Features

Resource Search - Find resources across the cluster with powerful queries
AI Analysis - Natural language insights about your resources
Dependency View - Visualize relationships between resources

Accessing Karpor

# Get Istio Ingress IP
INGRESS_IP=$(kubectl get svc -n istio-system istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Add to /etc/hosts
echo "$INGRESS_IP karpor.local" | sudo tee -a /etc/hosts

# Open in a browser (open is the macOS command; on Linux use xdg-open)
open https://karpor.local

Search Examples

kind:Deployment                    # All deployments
namespace:monitoring kind:Pod      # Pods in monitoring namespace
label:app=nginx                    # Resources with label
name:*api*                         # Resources with "api" in name

Check Karpor Status

kubectl get pods -n karpor
kubectl logs -n karpor -l app.kubernetes.io/component=karpor-server

AI Backend (Ollama)

Ollama - Local and cloud LLM backend for Karpor AI features

Note: Ollama is installed only when karpor_ai.enabled is set to true.

Local Models (Default)

Local models run inside the cluster on node01:

Model	RAM	Quality	Speed
`llama3.2:1b`	~2GB	Basic	Very fast
`llama3.2:3b`	~4GB	Good	Fast
`qwen2.5-coder:7b`	~8GB	Excellent	Moderate
`llama3.1:8b`	~10GB	Excellent	Slower

Cloud Models (Optional)

Cloud models offer better performance without GPU requirements:

1. Create account at https://ollama.com/signup
2. Generate API key at https://ollama.com/settings/keys
3. Configure ollama.api_key in config.yaml

Model	Description
`minimax-m2.5:cloud`	Top performer, comparable to Claude Opus
`qwen3-coder:480b-cloud`	Excellent for code analysis
`glm-4.7:cloud`	Good general purpose
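The configuration step above might look like the following in config.yaml. This is only a sketch: it assumes the dotted names karpor_ai.enabled and ollama.api_key map to nested YAML keys, and the key value is a placeholder to be replaced with your own:

```yaml
# config.yaml (fragment; key nesting is assumed from the dotted names on this page)
karpor_ai:
  enabled: true                      # required for Ollama to be installed
ollama:
  api_key: "<your-ollama-api-key>"   # placeholder; generate at ollama.com/settings/keys
```

Avoid committing a real API key to version control.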

Check Ollama Status

# Check pods
kubectl get pods -n ollama

# List the models that have been downloaded
kubectl exec -n ollama deployment/ollama -- ollama list

# Check logs
kubectl logs -n ollama deployment/ollama

# Test the Ollama API (returns the list of available models as JSON)
kubectl exec -n ollama deployment/ollama -- curl -s localhost:11434/api/tags

Switching Models

To change the model after installation:

# Pull new model
kubectl exec -n ollama deployment/ollama -- ollama pull llama3.1:8b

# Restart Karpor to use new model
kubectl rollout restart deployment/karpor-server -n karpor

Observability

Metrics, logs, and traces — the three pillars of observability

Keycloak (Identity Provider / SSO)

OIDC authentication for kubectl and Single Sign-On for Grafana