docs: rewrite READMEs to reflect Kubernetes migration
150
README.md
@@ -1,81 +1,115 @@
# Homelab Infrastructure

A collection of self-hosted services running on Docker containers, orchestrated through Portainer and exposed via Traefik reverse proxy.
Self-hosted services running on a single-node Talos Kubernetes cluster, provisioned via Terraform on Proxmox and managed through Flux CD GitOps.

## Architecture

This homelab uses a stack-based approach where each service is containerized and deployed as a complete stack with its dependencies. All services integrate with a centralized Traefik instance for SSL termination and domain routing.

### Stack Structure
```
docker/stacks/<service>/
- docker-compose.yaml # Service definition
- stack.env # Environment template (tracked)
- stack.env.real # Actual values with secrets (gitignored)
Proxmox (hypervisor)
└── Talos Linux VM (Kubernetes node)
    └── Flux CD (GitOps)
        ├── config         → cluster-wide variables & secrets
        ├── infrastructure → Traefik, cert-manager, Authelia, MetalLB, NFS, ...
        └── apps           → application workloads
```

### Repository Layout

```
homelab-v2/
├── terraform/              # Proxmox VM + Talos cluster provisioning
└── kubernetes/             # Flux CD manifests (Kustomize + Helm)
    ├── config/
    ├── flux-system/
    ├── infrastructure/
    │   ├── controllers/    # Traefik, cert-manager, Authelia, MetalLB, ...
    │   └── configs/        # ClusterIssuer, MetalLB config
    ├── app/
    │   ├── archmirror/
    │   ├── external/       # External service vars (e.g. Home Assistant)
    │   ├── grocy/
    │   ├── homepage/
    │   ├── immich/
    │   ├── jellyfin/
    │   ├── lubelogger/
    │   ├── media/
    │   ├── paperless/
    │   ├── pihole/
    │   └── podsync/
    └── docs/
        └── k8s-service-spec.md
```

## Services

| Service | Description | Purpose |
|---------|-------------|---------|
| **Immich** | Self-hosted photo and video management | Personal media library with ML features |
| **Paperless-ngx** | Document management system with OCR | Digital document archive and search |
| **Media Stack** | Sonarr, Radarr, Prowlarr, qBittorrent | Automated media acquisition and management |
| **Pi-hole** | DNS sinkhole with ad blocking and dnscrypt-proxy | Network-wide ad blocking and encrypted DNS |
| **Arch Mirror** | Local Arch Linux package repository mirror | Local package cache for faster updates |
| Service | Description |
|---------|-------------|
| **Immich** | Photo and video management with face recognition |
| **Jellyfin** | Media streaming with Intel GPU hardware transcoding |
| **Media Stack** | Sonarr, Radarr, Prowlarr, qBittorrent — automated media acquisition |
| **Paperless-ngx** | Document management with OCR |
| **Pi-hole** | DNS sinkhole with ad blocking and encrypted DNS via dnscrypt-proxy |
| **Grocy** | Pantry and grocery management |
| **LubeLogger** | Vehicle maintenance tracker |
| **Homepage** | Dashboard aggregator |
| **Podsync** | Podcast downloader |
| **Archmirror** | Local Arch Linux package repository mirror |

## Infrastructure Stack

| Component | Role |
|-----------|------|
| **Flux CD** | GitOps controller — reconciles this repo to the cluster |
| **Traefik** | Ingress controller with Let's Encrypt TLS |
| **cert-manager** | TLS certificate provisioning (Cloudflare DNS-01) |
| **Authelia** | SSO / OIDC provider for protected services |
| **MetalLB** | Bare-metal load balancer |
| **NFS Provisioner** | Dynamic PVC provisioning backed by Synology NAS |
| **Intel GPU Plugin** | Hardware transcoding device plugin (Jellyfin) |
| **SOPS + age** | Secret encryption at rest |

### Storage

- **Synology NAS** — primary storage backend for all services
  - Dynamic NFS PVCs via `nfs-synology-ssd` storage class
  - Static NFS PVs for media library and document archives
- **local-path-provisioner** — node-local storage for SQLite databases

### Backups

Unified strategy using **restic + resticprofile**:

- **Primary**: Synology NAS via `rest-server` container (`${BACKUP_LOCAL_HOST}:8000`)
- **Secondary**: Backblaze B2 (offsite), synced via `resticprofile copy`
- PostgreSQL: pg_dump init container → restic
- SQLite: online backup API → restic
- Files/media: NFS mount → restic

## Deployment

Services are deployed through **Portainer WebUI**:
All changes are deployed by pushing to this repository. Flux CD picks up each new commit and reconciles the cluster to match it.

1. Access Portainer dashboard
2. Navigate to Stacks section
3. Create new stack or update existing
4. Copy content from `docker-compose.yaml`
5. Configure environment variables from `stack.env.real`
6. Deploy stack
```sh
# Check reconciliation status
flux get kustomizations

### Environment Setup
# Force reconciliation
flux reconcile source git flux-system

For each stack:
```bash
cd docker/stacks/<service>/
cp stack.env stack.env.real
# Edit stack.env.real with actual values
# Check application status
kubectl get helmreleases -A
kubectl get pods -A
```

## Common Operations

### Stack Management
- Stack status and logs monitored through Portainer WebUI dashboard
- Updates performed by pulling new images and recreating containers

### Backup Operations
Each stack includes automated backup services:
- **Database backups**: Hourly PostgreSQL dumps using postgres-backup-local
- **File backups**: Scheduled Restic backups to AWS S3 backend

## Network Architecture

- **traefik** (external): Reverse proxy network for SSL termination and routing
- **service-specific**: Internal networks for each stack (immich, paperless, sonarr, radarr)
- Services primarily accessed through Traefik with minimal direct port exposure
For initial cluster bootstrap, see [`kubernetes/README.md`](kubernetes/README.md).

## Security

- All services behind Traefik reverse proxy with Let's Encrypt SSL certificates
- Environment variables with secrets stored in `*.env.real` files (gitignored)
- API endpoints protected with HTTP basic authentication where applicable
- Internal service communication isolated over Docker networks
- All ingress through Traefik with Let's Encrypt TLS
- Secrets encrypted with SOPS + age (decrypted at runtime by Flux)
- SSO via Authelia (OIDC) for user-facing services
- Per-namespace NetworkPolicies with default-deny + explicit Traefik ingress allow

## Requirements
## Provisioning

- Docker and Docker Compose
- Portainer CE for stack management
- Traefik reverse proxy (external dependency)
- Valid domain names for SSL certificate generation

## Notes

- This repository contains infrastructure definitions only
- Actual deployment and management handled through Portainer WebUI
The cluster is provisioned with Terraform (Proxmox + Talos). See [`terraform/README.md`](terraform/README.md).

112
kubernetes/README.md
Normal file
@@ -0,0 +1,112 @@
# Kubernetes Cluster Bootstrap

This document covers **phase 2** of the full cluster setup. The two phases are:

1. **Terraform** (`terraform/`) — provisions the Talos VM on Proxmox and bootstraps the Kubernetes control plane. Outputs `kubeconfig` and `talosconfig`.
2. **Flux CD** (this file) — installs the GitOps controller into the running cluster and points it at this repository. From that point on, everything in `kubernetes/` is reconciled automatically.

If you haven't run Terraform yet, start with [`terraform/README.md`](../terraform/README.md).

## Prerequisites

- `flux` CLI installed
- AGE private key for SOPS decryption
- `kubectl` configured with the cluster kubeconfig from Terraform:
  ```sh
  cd ../terraform
  # Note: this overwrites any existing ~/.kube/config
  terraform output -json kubeconfig | jq -r '.homelab' > ~/.kube/config
  ```

## Bootstrap Steps

### 1. Verify cluster access

```sh
kubectl get nodes
```

### 2. Bootstrap Flux CD

```sh
flux bootstrap github \
  --owner=berezovskyi-oleksandr \
  --repository=homelab \
  --branch=homelab-v2 \
  --path=./kubernetes \
  --token-auth \
  --personal
```

You will be prompted for a GitHub personal access token (PAT) unless you export it beforehand:

```sh
export GITHUB_TOKEN=<your-pat>
```

Create a fine-grained PAT scoped to the `homelab` repository with:
- **Contents**: Read and write
- **Metadata**: Read-only (granted automatically)

The bootstrap command installs the Flux controllers and creates the `flux-system` namespace.
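Before bootstrapping, it can help to confirm that the token is exported and that the cluster meets Flux's prerequisites. The following is a minimal sketch, not part of the original setup: the function name `flux_preflight` is hypothetical, and it assumes the PAT is supplied through `GITHUB_TOKEN`. `flux check --pre` is the Flux CLI's own pre-installation check.

```sh
# Hypothetical helper: verify the token and the cluster before `flux bootstrap`.
flux_preflight() {
  # Assumption: the PAT is passed via GITHUB_TOKEN rather than typed at the prompt.
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN is not set; flux bootstrap would prompt for it" >&2
    return 1
  fi
  # Runs Flux's pre-installation checks without installing anything.
  flux check --pre
}
```

It could be chained in front of the bootstrap command, for example `flux_preflight && flux bootstrap github ...`, so that a missing token or an unreachable cluster stops the run early.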

### 3. Create the SOPS AGE secret

Flux needs the AGE private key to decrypt SOPS-encrypted secrets.

```sh
kubectl create secret generic sops-age \
  --namespace=flux-system \
  --from-file=age.agekey=<path-to-age.key>
```
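`kubectl create secret` fails if the secret already exists, which matters when the bootstrap is repeated. One possible way to make this step repeatable is sketched below; the function name `create_sops_age_secret` and the key-file sanity check are illustrative assumptions, not part of the original procedure.

```sh
# Hypothetical helper: create or update the sops-age secret from an age key file.
create_sops_age_secret() {
  key_file="$1"
  # An age identity file contains a line starting with AGE-SECRET-KEY-.
  if ! grep -q '^AGE-SECRET-KEY-' "$key_file" 2>/dev/null; then
    echo "no age private key found in: $key_file" >&2
    return 1
  fi
  # --dry-run=client renders the manifest locally; piping it to apply
  # creates the secret on the first run and updates it on later runs.
  kubectl create secret generic sops-age \
    --namespace=flux-system \
    --from-file=age.agekey="$key_file" \
    --dry-run=client -o yaml |
    kubectl apply -f -
}
```

Usage would look like `create_sops_age_secret <path-to-age.key>`, with the same key path as in the command above.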

### 4. Verify Flux is reconciling

```sh
flux get kustomizations --watch
```

All kustomizations should eventually report `True` in the `READY` column.

### 5. Troubleshooting

Check Flux controller logs:

```sh
flux logs
```

Force a reconciliation:

```sh
flux reconcile source git flux-system
flux reconcile kustomization flux-system
```
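When many objects are healthy, the full listings can bury the one that is failing. The sketch below narrows both views to problems only; the function name `flux_failures` is hypothetical, and the flags are standard `flux get` and `flux logs` options.

```sh
# Hypothetical helper: show only Flux objects that are not Ready, then error logs.
flux_failures() {
  # --status-selector filters `flux get` output by condition state.
  flux get all --all-namespaces --status-selector ready=false
  # Restrict controller logs to error level across all namespaces.
  flux logs --all-namespaces --level=error
}
```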

## Changing the Target Branch

To point Flux at a different branch (e.g. after merging `homelab-v2` into `master`):

1. Merge the branch as usual via a PR.
2. Re-run `flux bootstrap` with the new `--branch` value:

   ```sh
   flux bootstrap github \
     --owner=berezovskyi-oleksandr \
     --repository=homelab \
     --branch=master \
     --path=./kubernetes \
     --token-auth \
     --personal
   ```

This updates both the `GitRepository` resource in the cluster and the `flux-system/gotk-sync.yaml` file committed to the repo. No manual `kubectl patch` is needed.

## Reconciliation Order

Flux applies resources in dependency order:

1. **config** — Cluster-wide variables and encrypted secrets
2. **infrastructure-controllers** — Traefik, cert-manager, Authelia, MetalLB, NFS provisioner, Intel GPU plugin (depends on config)
3. **infrastructure-configs** — ClusterIssuer, MetalLB config (depends on infrastructure-controllers)
4. **external-vars** — External service variables (e.g. Home Assistant)
5. **apps** — All application workloads (depends on config + infrastructure-configs + external-vars)
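
The order above can also be followed from the command line by waiting for each layer in turn. This is a sketch under two assumptions: that the Flux `Kustomization` objects are named exactly as in the list, and that they live in the `flux-system` namespace. The function name `wait_for_layers` is hypothetical.

```sh
# Hypothetical helper: wait for each Flux Kustomization layer in dependency order.
wait_for_layers() {
  for name in config infrastructure-controllers infrastructure-configs external-vars apps; do
    # Assumes object names match the list above and the flux-system namespace.
    kubectl wait "kustomizations.kustomize.toolkit.fluxcd.io/$name" \
      --namespace=flux-system \
      --for=condition=Ready \
      --timeout="${1:-10m}" || return 1
  done
}
```

Calling `wait_for_layers 5m` stops at the first layer that does not become Ready within the timeout, which points at the layer to inspect.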
68
terraform/README.md
Normal file
@@ -0,0 +1,68 @@
# Terraform — Cluster Provisioning

Provisions a Talos Linux VM on Proxmox and bootstraps the Kubernetes control plane.

## What It Does

1. Downloads the Talos ISO to Proxmox local storage
2. Creates a VM per entry in `var.clusters` (UEFI, SCSI disk, host CPU passthrough)
3. Generates Talos machine secrets and applies the machine configuration
4. Bootstraps the Talos cluster and waits for the health check to pass
5. Outputs `kubeconfig` and `talosconfig` for cluster access

## Providers

| Provider | Version |
|----------|---------|
| `bpg/proxmox` | 0.95.0 |
| `siderolabs/talos` | 0.10.1 |

## Variables

Configured via `terraform.tfvars` (gitignored):

| Variable | Description |
|----------|-------------|
| `proxmox_endpoint` | Proxmox API URL (e.g. `https://pve:8006`) |
| `proxmox_api_token` | Proxmox API token (`user@realm!token=secret`) |
| `clusters` | Map of cluster definitions (see below) |

Each entry in `clusters`:

```hcl
clusters = {
  homelab = {
    cores        = 8
    memory       = 16384
    disk_size_gb = 100
    hostname     = "talos.example.com"
    mac_address  = "BC:24:11:xx:xx:xx"
    ip_address   = "192.168.1.x"
    datastore_id = "local-lvm"
  }
}
```

## Usage

```sh
terraform init
terraform apply

# Write kubeconfig (overwrites any existing ~/.kube/config)
terraform output -json kubeconfig | jq -r '.homelab' > ~/.kube/config

# Write talosconfig (overwrites any existing ~/.talos/config)
terraform output -json talosconfig | jq -r '.homelab' > ~/.talos/config
```

## Notes

- The Talos ISO resource has `prevent_destroy = true` to avoid accidental re-download
- The control plane node has `allowSchedulingOnControlPlanes = true` (single-node cluster)
- State and secret-bearing files (`terraform.tfstate`, `terraform.tfstate.backup`, `terraform.tfvars`, `talosconfig`) are gitignored

## Next Steps

Once `terraform apply` completes and you have a working kubeconfig, proceed to
[`kubernetes/README.md`](../kubernetes/README.md) to bootstrap Flux CD onto the cluster.