docs: rewrite READMEs to reflect Kubernetes migration

2026-03-12 20:23:33 +02:00
parent 33d0df77bf
commit bae73611c9
3 changed files with 272 additions and 58 deletions

150
README.md

@@ -1,81 +1,115 @@
 # Homelab Infrastructure
-A collection of self-hosted services running on Docker containers, orchestrated through Portainer and exposed via Traefik reverse proxy.
+Self-hosted services running on a single-node Talos Kubernetes cluster, provisioned via Terraform on Proxmox and managed through Flux CD GitOps.
 ## Architecture
-This homelab uses a stack-based approach where each service is containerized and deployed as a complete stack with its dependencies. All services integrate with a centralized Traefik instance for SSL termination and domain routing.
-### Stack Structure
 ```
-docker/stacks/<service>/
-- docker-compose.yaml # Service definition
-- stack.env # Environment template (tracked)
-- stack.env.real # Actual values with secrets (gitignored)
+Proxmox (hypervisor)
+└── Talos Linux VM (Kubernetes node)
+    └── Flux CD (GitOps)
+        ├── config → cluster-wide variables & secrets
+        ├── infrastructure → Traefik, cert-manager, Authelia, MetalLB, NFS, ...
+        └── apps → application workloads
+```
+### Repository Layout
+```
+homelab-v2/
+├── terraform/ # Proxmox VM + Talos cluster provisioning
+└── kubernetes/ # Flux CD manifests (Kustomize + Helm)
+    ├── config/
+    ├── flux-system/
+    ├── infrastructure/
+    │   ├── controllers/ # Traefik, cert-manager, Authelia, MetalLB, ...
+    │   └── configs/ # ClusterIssuer, MetalLB config
+    ├── app/
+    │   ├── archmirror/
+    │   ├── external/ # External service vars (e.g. Home Assistant)
+    │   ├── grocy/
+    │   ├── homepage/
+    │   ├── immich/
+    │   ├── jellyfin/
+    │   ├── lubelogger/
+    │   ├── media/
+    │   ├── paperless/
+    │   ├── pihole/
+    │   └── podsync/
+    └── docs/
+        └── k8s-service-spec.md
 ```
 ## Services
-| Service | Description | Purpose |
-|---------|-------------|---------|
-| **Immich** | Self-hosted photo and video management | Personal media library with ML features |
-| **Paperless-ngx** | Document management system with OCR | Digital document archive and search |
-| **Media Stack** | Sonarr, Radarr, Prowlarr, qBittorrent | Automated media acquisition and management |
-| **Pi-hole** | DNS sinkhole with ad blocking and dnscrypt-proxy | Network-wide ad blocking and encrypted DNS |
-| **Arch Mirror** | Local Arch Linux package repository mirror | Local package cache for faster updates |
+| Service | Description |
+|---------|-------------|
+| **Immich** | Photo and video management with face recognition |
+| **Jellyfin** | Media streaming with Intel GPU hardware transcoding |
+| **Media Stack** | Sonarr, Radarr, Prowlarr, qBittorrent — automated media acquisition |
+| **Paperless-ngx** | Document management with OCR |
+| **Pi-hole** | DNS sinkhole with ad blocking and encrypted DNS via dnscrypt-proxy |
+| **Grocy** | Pantry and grocery management |
+| **LubeLogger** | Vehicle maintenance tracker |
+| **Homepage** | Dashboard aggregator |
+| **Podsync** | Podcast downloader |
+| **Archmirror** | Local Arch Linux package repository mirror |
+## Infrastructure Stack
+| Component | Role |
+|-----------|------|
+| **Flux CD** | GitOps controller — reconciles this repo to the cluster |
+| **Traefik** | Ingress controller with Let's Encrypt TLS |
+| **cert-manager** | TLS certificate provisioning (Cloudflare DNS-01) |
+| **Authelia** | SSO / OIDC provider for protected services |
+| **MetalLB** | Bare-metal load balancer |
+| **NFS Provisioner** | Dynamic PVC provisioning backed by Synology NAS |
+| **Intel GPU Plugin** | Hardware transcoding device plugin (Jellyfin) |
+| **SOPS + age** | Secret encryption at rest |
+### Storage
+- **Synology NAS** — primary storage backend for all services
+- Dynamic NFS PVCs via `nfs-synology-ssd` storage class
+- Static NFS PVs for media library and document archives
+- **local-path-provisioner** — node-local storage for SQLite databases
+### Backups
+Unified strategy using **restic + resticprofile**:
+- **Primary**: Synology NAS via `rest-server` container (`${BACKUP_LOCAL_HOST}:8000`)
+- **Secondary**: Backblaze B2 (offsite), synced via `resticprofile copy`
+- PostgreSQL: pg_dump init container → restic
+- SQLite: online backup API → restic
+- Files/media: NFS mount → restic
 ## Deployment
-Services are deployed through **Portainer WebUI**:
-1. Access Portainer dashboard
-2. Navigate to Stacks section
-3. Create new stack or update existing
-4. Copy content from `docker-compose.yaml`
-5. Configure environment variables from `stack.env.real`
-6. Deploy stack
-### Environment Setup
-For each stack:
-```bash
-cd docker/stacks/<service>/
-cp stack.env stack.env.real
-# Edit stack.env.real with actual values
+All changes are deployed by pushing to this repository. Flux CD reconciles on every commit.
+```sh
+# Check reconciliation status
+flux get kustomizations
+# Force reconciliation
+flux reconcile source git flux-system
+# Check application status
+kubectl get helmreleases -A
+kubectl get pods -A
 ```
-## Common Operations
-### Stack Management
-- Stack status and logs monitored through Portainer WebUI dashboard
-- Updates performed by pulling new images and recreating containers
-### Backup Operations
-Each stack includes automated backup services:
-- **Database backups**: Hourly PostgreSQL dumps using postgres-backup-local
-- **File backups**: Scheduled Restic backups to AWS S3 backend
-## Network Architecture
-- **traefik** (external): Reverse proxy network for SSL termination and routing
-- **service-specific**: Internal networks for each stack (immich, paperless, sonarr, radarr)
-- Services primarily accessed through Traefik with minimal direct port exposure
+For initial cluster bootstrap, see [`kubernetes/README.md`](kubernetes/README.md).
 ## Security
-- All services behind Traefik reverse proxy with Let's Encrypt SSL certificates
-- Environment variables with secrets stored in `*.env.real` files (gitignored)
-- API endpoints protected with HTTP basic authentication where applicable
-- Internal service communication isolated over Docker networks
+- All ingress through Traefik with Let's Encrypt TLS
+- Secrets encrypted with SOPS + age (decrypted at runtime by Flux)
+- SSO via Authelia (OIDC) for user-facing services
+- Per-namespace NetworkPolicies with default-deny + explicit Traefik ingress allow
-## Requirements
-- Docker and Docker Compose
-- Portainer CE for stack management
-- Traefik reverse proxy (external dependency)
-- Valid domain names for SSL certificate generation
-## Notes
-- This repository contains infrastructure definitions only
-- Actual deployment and management handled through Portainer WebUI
+## Provisioning
+The cluster is provisioned with Terraform (Proxmox + Talos). See [`terraform/README.md`](terraform/README.md).

112
kubernetes/README.md Normal file

@@ -0,0 +1,112 @@
# Kubernetes Cluster Bootstrap
This document covers **phase 2** of the full cluster setup. The two phases are:
1. **Terraform** (`terraform/`) — provisions the Talos VM on Proxmox and bootstraps the Kubernetes control plane. Outputs `kubeconfig` and `talosconfig`.
2. **Flux CD** (this file) — installs the GitOps controller into the running cluster and points it at this repository. From that point on, everything in `kubernetes/` is reconciled automatically.
If you haven't run Terraform yet, start with [`terraform/README.md`](../terraform/README.md).
## Prerequisites
- `flux` CLI installed
- AGE private key for SOPS decryption
- `kubectl` configured with the cluster kubeconfig from Terraform:
```sh
cd ../terraform
terraform output -json kubeconfig | jq -r '.homelab' > ~/.kube/config
```
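Before starting, it can help to confirm that the command-line tools used in this README are actually on `PATH`. The following is a minimal sketch, not part of the repository; `missing_tools` is a hypothetical helper name, and the tool list simply mirrors the commands that appear in this document.

```sh
# Print the name of every given tool that is not found on PATH, one per line.
# Hypothetical helper; the tool list is supplied by the caller.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools this README relies on (terraform and jq are used for the kubeconfig step).
missing=$(missing_tools flux kubectl jq terraform)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing >&2
fi
```

If nothing is printed, all four tools were found.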
## Bootstrap Steps
### 1. Verify cluster access
```sh
kubectl get nodes
```
### 2. Bootstrap Flux CD
```sh
flux bootstrap github \
--owner=berezovskyi-oleksandr \
--repository=homelab \
--branch=homelab-v2 \
--path=./kubernetes \
--token-auth \
--personal
```
You will be prompted for a GitHub personal access token (PAT). To skip the prompt, export it beforehand:
```sh
export GITHUB_TOKEN=<your-pat>
```
Create a fine-grained PAT scoped to the `homelab` repository with:
- **Contents**: Read and write
- **Metadata**: Read-only (granted automatically)
The `flux bootstrap` command installs the Flux controllers and creates the `flux-system` namespace.
### 3. Create the SOPS AGE secret
Flux needs the AGE private key to decrypt SOPS-encrypted secrets.
```sh
kubectl create secret generic sops-age \
--namespace=flux-system \
--from-file=age.agekey=<path-to-age.key>
```
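A quick sanity check on the key file before creating the secret can save a debugging round trip. The sketch below assumes the file was produced by `age-keygen`, whose private keys start with the `AGE-SECRET-KEY-` prefix. `looks_like_age_key` and the `AGE_KEY_FILE` variable are hypothetical names, and the default path is only the usual SOPS location on Linux, not something this repository prescribes.

```sh
# Succeed if the file is readable and contains a line starting with the
# AGE-SECRET-KEY- prefix that age-keygen writes for private keys.
looks_like_age_key() {
  [ -r "$1" ] && grep -q '^AGE-SECRET-KEY-' "$1"
}

# AGE_KEY_FILE is a hypothetical variable; point it at your own key file.
AGE_KEY_FILE="${AGE_KEY_FILE:-$HOME/.config/sops/age/keys.txt}"
if looks_like_age_key "$AGE_KEY_FILE"; then
  echo "key file looks valid: $AGE_KEY_FILE"
else
  echo "no age private key found in: $AGE_KEY_FILE" >&2
fi
```

This only checks the file's shape. It does not verify that the key matches the recipients the repository's secrets were encrypted for.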
### 4. Verify Flux is reconciling
```sh
flux get kustomizations --watch
```
All kustomizations should eventually show as `Ready`.
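For scripted use, where watching the output interactively is not practical, one option is to block until every Kustomization is Ready. This is a sketch under the assumption that all Flux Kustomizations live in the `flux-system` namespace; `wait_for_flux` is a hypothetical helper name and the 10-minute default is arbitrary.

```sh
# Block until every Flux Kustomization in flux-system reports Ready.
# Optional first argument: timeout (defaults to 10m). kubectl exits
# non-zero if the timeout expires first.
wait_for_flux() {
  kubectl wait kustomizations.kustomize.toolkit.fluxcd.io \
    --all \
    --namespace=flux-system \
    --for=condition=Ready \
    --timeout="${1:-10m}"
}
```

Calling `wait_for_flux 15m` waits up to fifteen minutes, which may suit a first reconciliation that has to pull many images.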
### 5. Troubleshooting
Check Flux controller logs:
```sh
flux logs
```
Force a reconciliation:
```sh
flux reconcile source git flux-system
flux reconcile kustomization flux-system
```
## Changing the Target Branch
To point Flux at a different branch (e.g. after merging `homelab-v2` into `master`):
1. Merge the branch as usual via a PR.
2. Re-run `flux bootstrap` with the new `--branch` value:
```sh
flux bootstrap github \
--owner=berezovskyi-oleksandr \
--repository=homelab \
--branch=master \
--path=./kubernetes \
--token-auth \
--personal
```
This updates both the `GitRepository` resource in the cluster and the `flux-system/gotk-sync.yaml` file committed to the repo. No manual `kubectl patch` is needed.
## Reconciliation Order
Flux applies resources in dependency order:
1. **config** — Cluster-wide variables and encrypted secrets
2. **infrastructure-controllers** — Traefik, cert-manager, Authelia, MetalLB, NFS provisioner, Intel GPU plugin (depends on config)
3. **infrastructure-configs** — ClusterIssuer, MetalLB config (depends on infrastructure-controllers)
4. **external-vars** — External service variables (e.g. Home Assistant)
5. **apps** — All application workloads (depends on config + infrastructure-configs + external-vars)
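To see the dependency chain as the cluster records it, something like the following may help. It is a sketch that assumes the ordering above is declared through `spec.dependsOn` on each Flux `Kustomization` in the `flux-system` namespace; `list_flux_dependencies` is a hypothetical helper name.

```sh
# Print each Flux Kustomization next to the names it depends on.
# Assumes the dependencies are declared via spec.dependsOn.
list_flux_dependencies() {
  kubectl get kustomizations.kustomize.toolkit.fluxcd.io \
    --namespace=flux-system \
    --output=custom-columns='NAME:.metadata.name,DEPENDS_ON:.spec.dependsOn[*].name'
}
```

A Kustomization with no dependencies shows `<none>` in the second column.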

68
terraform/README.md Normal file

@@ -0,0 +1,68 @@
# Terraform — Cluster Provisioning
Provisions a Talos Linux VM on Proxmox and bootstraps the Kubernetes control plane.
## What It Does
1. Downloads the Talos ISO to Proxmox local storage
2. Creates a VM per entry in `var.clusters` (UEFI, SCSI disk, host CPU passthrough)
3. Generates Talos machine secrets and applies the machine configuration
4. Bootstraps the Talos cluster and waits for the health check to pass
5. Outputs `kubeconfig` and `talosconfig` for cluster access
## Providers
| Provider | Version |
|----------|---------|
| `bpg/proxmox` | 0.95.0 |
| `siderolabs/talos` | 0.10.1 |
## Variables
Configured via `terraform.tfvars` (gitignored):
| Variable | Description |
|----------|-------------|
| `proxmox_endpoint` | Proxmox API URL (e.g. `https://pve:8006`) |
| `proxmox_api_token` | Proxmox API token (`user@realm!token=secret`) |
| `clusters` | Map of cluster definitions (see below) |
Each entry in `clusters`:
```hcl
clusters = {
homelab = {
cores = 8
memory = 16384
disk_size_gb = 100
hostname = "talos.example.com"
mac_address = "BC:24:11:xx:xx:xx"
ip_address = "192.168.1.x"
datastore_id = "local-lvm"
}
}
```
## Usage
```sh
terraform init
terraform apply
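# Make sure the target directories exist (the redirects below overwrite any existing configs)
mkdir -p ~/.kube ~/.talos
# Write kubeconfig
terraform output -json kubeconfig | jq -r '.homelab' > ~/.kube/config
# Write talosconfig
terraform output -json talosconfig | jq -r '.homelab' > ~/.talos/config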
```
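Both outputs contain cluster credentials, so a slightly more defensive variant is to write them to dedicated files with restrictive permissions instead of overwriting the default configs. The helper below is a sketch; `write_private_file` and the `~/.kube/homelab.yaml` path are hypothetical names chosen for illustration.

```sh
# Write stdin to the given path, creating the parent directory and
# restricting the file to the current user.
write_private_file() {
  mkdir -p "$(dirname "$1")"
  ( umask 077 && cat > "$1" )
  chmod 600 "$1"   # also covers the case where the file already existed
}

# Example usage (commented out because it needs Terraform state):
# terraform output -json kubeconfig | jq -r '.homelab' | write_private_file ~/.kube/homelab.yaml
# export KUBECONFIG=~/.kube/homelab.yaml
```

Pointing `KUBECONFIG` at a separate file leaves any existing `~/.kube/config` untouched.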
## Notes
- The Talos ISO resource has `prevent_destroy = true` to avoid accidental re-download
- Control plane node has `allowSchedulingOnControlPlanes = true` (single-node cluster)
- State and credential files (`terraform.tfstate`, `terraform.tfstate.backup`, `terraform.tfvars`, `talosconfig`) are gitignored
## Next Steps
Once `terraform apply` completes and you have a working kubeconfig, proceed to
[`kubernetes/README.md`](../kubernetes/README.md) to bootstrap Flux CD onto the cluster.