docs: rewrite READMEs to reflect Kubernetes migration

2026-03-12 20:23:33 +02:00
parent 33d0df77bf
commit bae73611c9
3 changed files with 272 additions and 58 deletions

150
README.md

@@ -1,81 +1,115 @@
# Homelab Infrastructure
A collection of self-hosted services running on Docker containers, orchestrated through Portainer and exposed via Traefik reverse proxy.
Self-hosted services running on a single-node Talos Kubernetes cluster, provisioned via Terraform on Proxmox and managed through Flux CD GitOps.
## Architecture
This homelab uses a stack-based approach where each service is containerized and deployed as a complete stack with its dependencies. All services integrate with a centralized Traefik instance for SSL termination and domain routing.
### Stack Structure
```
docker/stacks/<service>/
- docker-compose.yaml # Service definition
- stack.env # Environment template (tracked)
- stack.env.real # Actual values with secrets (gitignored)
Proxmox (hypervisor)
└── Talos Linux VM (Kubernetes node)
└── Flux CD (GitOps)
├── config → cluster-wide variables & secrets
├── infrastructure → Traefik, cert-manager, Authelia, MetalLB, NFS, ...
└── apps → application workloads
```
### Repository Layout
```
homelab-v2/
├── terraform/ # Proxmox VM + Talos cluster provisioning
└── kubernetes/ # Flux CD manifests (Kustomize + Helm)
├── config/
├── flux-system/
├── infrastructure/
│ ├── controllers/ # Traefik, cert-manager, Authelia, MetalLB, ...
│ └── configs/ # ClusterIssuer, MetalLB config
├── app/
│ ├── archmirror/
│ ├── external/ # External service vars (e.g. Home Assistant)
│ ├── grocy/
│ ├── homepage/
│ ├── immich/
│ ├── jellyfin/
│ ├── lubelogger/
│ ├── media/
│ ├── paperless/
│ ├── pihole/
│ └── podsync/
└── docs/
└── k8s-service-spec.md
```
## Services
| Service | Description | Purpose |
|---------|-------------|---------|
| **Immich** | Self-hosted photo and video management | Personal media library with ML features |
| **Paperless-ngx** | Document management system with OCR | Digital document archive and search |
| **Media Stack** | Sonarr, Radarr, Prowlarr, qBittorrent | Automated media acquisition and management |
| **Pi-hole** | DNS sinkhole with ad blocking and dnscrypt-proxy | Network-wide ad blocking and encrypted DNS |
| **Arch Mirror** | Local Arch Linux package repository mirror | Local package cache for faster updates |
| Service | Description |
|---------|-------------|
| **Immich** | Photo and video management with face recognition |
| **Jellyfin** | Media streaming with Intel GPU hardware transcoding |
| **Media Stack** | Sonarr, Radarr, Prowlarr, qBittorrent — automated media acquisition |
| **Paperless-ngx** | Document management with OCR |
| **Pi-hole** | DNS sinkhole with ad blocking and encrypted DNS via dnscrypt-proxy |
| **Grocy** | Pantry and grocery management |
| **LubeLogger** | Vehicle maintenance tracker |
| **Homepage** | Dashboard aggregator |
| **Podsync** | Podcast downloader |
| **Archmirror** | Local Arch Linux package repository mirror |
## Infrastructure Stack
| Component | Role |
|-----------|------|
| **Flux CD** | GitOps controller — reconciles this repo to the cluster |
| **Traefik** | Ingress controller with Let's Encrypt TLS |
| **cert-manager** | TLS certificate provisioning (Cloudflare DNS-01) |
| **Authelia** | SSO / OIDC provider for protected services |
| **MetalLB** | Bare-metal load balancer |
| **NFS Provisioner** | Dynamic PVC provisioning backed by Synology NAS |
| **Intel GPU Plugin** | Hardware transcoding device plugin (Jellyfin) |
| **SOPS + age** | Secret encryption at rest |
### Storage
- **Synology NAS** — primary storage backend for all services
- Dynamic NFS PVCs via `nfs-synology-ssd` storage class
- Static NFS PVs for media library and document archives
- **local-path-provisioner** — node-local storage for SQLite databases
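As an illustration of the dynamic NFS path, a workload could request a volume from the `nfs-synology-ssd` storage class with a claim like the one below. This is a minimal sketch: the claim name, namespace, access mode, and size are hypothetical and not taken from the repository.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grocy-data        # hypothetical claim name
  namespace: grocy        # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce       # assumed; NFS-backed classes can also serve ReadWriteMany
  storageClassName: nfs-synology-ssd
  resources:
    requests:
      storage: 5Gi        # assumed size
```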
### Backups
Unified strategy using **restic + resticprofile**:
- **Primary**: Synology NAS via `rest-server` container (`${BACKUP_LOCAL_HOST}:8000`)
- **Secondary**: Backblaze B2 (offsite), synced via `resticprofile copy`
- PostgreSQL: pg_dump init container → restic
- SQLite: online backup API → restic
- Files/media: NFS mount → restic
## Deployment
Services are deployed through **Portainer WebUI**:
All changes are deployed by pushing to this repository. Flux CD reconciles on every commit.
1. Access Portainer dashboard
2. Navigate to Stacks section
3. Create new stack or update existing
4. Copy content from `docker-compose.yaml`
5. Configure environment variables from `stack.env.real`
6. Deploy stack
```sh
# Check reconciliation status
flux get kustomizations
### Environment Setup
# Force reconciliation
flux reconcile source git flux-system
For each stack:
```bash
cd docker/stacks/<service>/
cp stack.env stack.env.real
# Edit stack.env.real with actual values
# Check application status
kubectl get helmreleases -A
kubectl get pods -A
```
## Common Operations
### Stack Management
- Stack status and logs monitored through Portainer WebUI dashboard
- Updates performed by pulling new images and recreating containers
### Backup Operations
Each stack includes automated backup services:
- **Database backups**: Hourly PostgreSQL dumps using postgres-backup-local
- **File backups**: Scheduled Restic backups to AWS S3 backend
## Network Architecture
- **traefik** (external): Reverse proxy network for SSL termination and routing
- **service-specific**: Internal networks for each stack (immich, paperless, sonarr, radarr)
- Services primarily accessed through Traefik with minimal direct port exposure
For initial cluster bootstrap, see [`kubernetes/README.md`](kubernetes/README.md).
## Security
- All services behind Traefik reverse proxy with Let's Encrypt SSL certificates
- Environment variables with secrets stored in `*.env.real` files (gitignored)
- API endpoints protected with HTTP basic authentication where applicable
- Internal service communication isolated over Docker networks
- All ingress through Traefik with Let's Encrypt TLS
- Secrets encrypted with SOPS + age (decrypted at runtime by Flux)
- SSO via Authelia (OIDC) for user-facing services
- Per-namespace NetworkPolicies with default-deny + explicit Traefik ingress allow
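The default-deny plus Traefik-allow pattern in the last item can be sketched as a pair of policies per namespace. The namespace names below are assumptions (in particular, that Traefik runs in a namespace called `traefik`), and the sketch covers ingress only; the actual policies in the repository may differ.

```yaml
# Deny all ingress to every pod in the namespace by default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: grocy              # hypothetical application namespace
spec:
  podSelector: {}               # empty selector matches all pods
  policyTypes:
    - Ingress
---
# Explicitly allow ingress from pods in the Traefik namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-traefik-ingress
  namespace: grocy              # hypothetical application namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Label set automatically by Kubernetes on every namespace
              kubernetes.io/metadata.name: traefik   # assumed namespace name
```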
## Requirements
## Provisioning
- Docker and Docker Compose
- Portainer CE for stack management
- Traefik reverse proxy (external dependency)
- Valid domain names for SSL certificate generation
## Notes
- This repository contains infrastructure definitions only
- Actual deployment and management handled through Portainer WebUI
The cluster is provisioned with Terraform (Proxmox + Talos). See [`terraform/README.md`](terraform/README.md).

112
kubernetes/README.md Normal file

@@ -0,0 +1,112 @@
# Kubernetes Cluster Bootstrap
This document covers **phase 2** of the full cluster setup. The two phases are:
1. **Terraform** (`terraform/`) — provisions the Talos VM on Proxmox and bootstraps the Kubernetes control plane. Outputs `kubeconfig` and `talosconfig`.
2. **Flux CD** (this file) — installs the GitOps controller into the running cluster and points it at this repository. From that point on, everything in `kubernetes/` is reconciled automatically.
If you haven't run Terraform yet, start with [`terraform/README.md`](../terraform/README.md).
## Prerequisites
- `flux` CLI installed
- AGE private key for SOPS decryption
- `kubectl` configured with the cluster kubeconfig from Terraform:
```sh
cd ../terraform
# Extract the kubeconfig for the `homelab` cluster (overwrites any existing ~/.kube/config)
terraform output -json kubeconfig | jq -r '.homelab' > ~/.kube/config
```
## Bootstrap Steps
### 1. Verify cluster access
```sh
kubectl get nodes
```
### 2. Bootstrap Flux CD
```sh
flux bootstrap github \
--owner=berezovskyi-oleksandr \
--repository=homelab \
--branch=homelab-v2 \
--path=./kubernetes \
--token-auth \
--personal
```
You will be prompted for a GitHub personal access token (PAT). Alternatively, export it beforehand:
```sh
export GITHUB_TOKEN=<your-pat>
```
Create a fine-grained PAT scoped to the `homelab` repository with:
- **Contents**: Read and write
- **Metadata**: Read-only (granted automatically)
The `flux bootstrap` command above installs the Flux controllers and creates the `flux-system` namespace.
### 3. Create the SOPS AGE secret
Flux needs the AGE private key to decrypt SOPS-encrypted secrets.
```sh
kubectl create secret generic sops-age \
--namespace=flux-system \
--from-file=age.agekey=<path-to-age.key>
```
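For context, a Flux `Kustomization` consumes this secret through its `decryption` block. The following is a minimal sketch: the secret name `sops-age` comes from the command above, while the resource name, path, and interval are assumptions based on the repository layout rather than copied from the repo.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: config                  # hypothetical name
  namespace: flux-system
spec:
  interval: 10m                 # assumed interval
  path: ./kubernetes/config     # assumed path
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  decryption:
    provider: sops
    secretRef:
      name: sops-age            # the secret created in this step
```

Flux reads age keys only from secret entries whose key name ends in `.agekey`, which is why the command above stores the file under `age.agekey`.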
### 4. Verify Flux is reconciling
```sh
flux get kustomizations --watch
```
All kustomizations should eventually show as `Ready`.
### 5. Troubleshooting
Check Flux controller logs:
```sh
flux logs
```
Force a reconciliation:
```sh
flux reconcile source git flux-system
flux reconcile kustomization flux-system
```
## Changing the Target Branch
To point Flux at a different branch (e.g. after merging `homelab-v2` into `master`):
1. Merge the branch as usual via a PR.
2. Re-run `flux bootstrap` with the new `--branch` value:
```sh
flux bootstrap github \
--owner=berezovskyi-oleksandr \
--repository=homelab \
--branch=master \
--path=./kubernetes \
--token-auth \
--personal
```
This updates both the `GitRepository` resource in the cluster and the `flux-system/gotk-sync.yaml` file committed to the repo. No manual `kubectl patch` is needed.
## Reconciliation Order
Flux applies resources in dependency order:
1. **config** — Cluster-wide variables and encrypted secrets
2. **infrastructure-controllers** — Traefik, cert-manager, Authelia, MetalLB, NFS provisioner, Intel GPU plugin (depends on config)
3. **infrastructure-configs** — ClusterIssuer, MetalLB config (depends on infrastructure-controllers)
4. **external-vars** — External service variables (e.g. Home Assistant)
5. **apps** — All application workloads (depends on config + infrastructure-configs + external-vars)
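In Flux, this kind of ordering is usually expressed with `dependsOn` on each `Kustomization`. A minimal sketch for the `apps` layer, assuming the dependency names listed above; the path and interval are assumptions and may not match the actual manifests:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m                 # assumed interval
  path: ./kubernetes/app        # assumed path, based on the repository layout
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  # Reconciliation waits until each of these Kustomizations is Ready.
  dependsOn:
    - name: config
    - name: infrastructure-configs
    - name: external-vars
```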

68
terraform/README.md Normal file

@@ -0,0 +1,68 @@
# Terraform — Cluster Provisioning
Provisions a Talos Linux VM on Proxmox and bootstraps the Kubernetes control plane.
## What It Does
1. Downloads the Talos ISO to Proxmox local storage
2. Creates a VM per entry in `var.clusters` (UEFI, SCSI disk, host CPU passthrough)
3. Generates Talos machine secrets and applies the machine configuration
4. Bootstraps the Talos cluster and waits for the health check to pass
5. Outputs `kubeconfig` and `talosconfig` for cluster access
## Providers
| Provider | Version |
|----------|---------|
| `bpg/proxmox` | 0.95.0 |
| `siderolabs/talos` | 0.10.1 |
## Variables
Configured via `terraform.tfvars` (gitignored):
| Variable | Description |
|----------|-------------|
| `proxmox_endpoint` | Proxmox API URL (e.g. `https://pve:8006`) |
| `proxmox_api_token` | Proxmox API token (`user@realm!token=secret`) |
| `clusters` | Map of cluster definitions (see below) |
Each entry in `clusters`:
```hcl
clusters = {
homelab = {
cores = 8
memory = 16384
disk_size_gb = 100
hostname = "talos.example.com"
mac_address = "BC:24:11:xx:xx:xx"
ip_address = "192.168.1.x"
datastore_id = "local-lvm"
}
}
```
## Usage
```sh
terraform init
terraform apply
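# Write kubeconfig (overwrites any existing ~/.kube/config)
mkdir -p ~/.kube
terraform output -json kubeconfig | jq -r '.homelab' > ~/.kube/config
# Write talosconfig (overwrites any existing ~/.talos/config)
mkdir -p ~/.talos
terraform output -json talosconfig | jq -r '.homelab' > ~/.talos/config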
```
## Notes
- The Talos ISO resource has `prevent_destroy = true` to guard against accidental deletion and re-download
- The control plane node has `allowSchedulingOnControlPlanes = true` (single-node cluster)
- State files (`terraform.tfstate`, `terraform.tfstate.backup`) and files containing secrets (`terraform.tfvars`, `talosconfig`) are gitignored
## Next Steps
Once `terraform apply` completes and you have a working kubeconfig, proceed to
[`kubernetes/README.md`](../kubernetes/README.md) to bootstrap Flux CD onto the cluster.