1.0.0 Release IaaS

2026-03-15 04:41:02 +09:00
commit a7365da431
292 changed files with 36059 additions and 0 deletions
@@ -0,0 +1,72 @@
+# ADR 001 - Architecture
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+- Mar/4/2026
+    - Refining sentences
+
+## Status
+
+- Accepted
+
+## Context
+
+- Maintaining multiple nodes requires substantial resources: hardware, electricity, and administrative effort.
+- Every unit responsible for a single role should follow the Principle of Least Privilege \(PoLP\).
+- All units should be built on open standards so they stay interchangeable and avoid vendor lock-in.
+
+## Consideration
+
+### Hypervisor
+
+- Proxmox Virtual Environment \(PVE\)
+    - Based on Debian.
+    - PVE manages VMs with its own `qm` command, which is not a standard interface.
+- VMware ESXi
+    - Proprietary hypervisor developed by VMware \(Licence is not free\)
+- Hyper-V
+    - Based on Microsoft Windows \(Licence is not free\)
+- Debian Stable
+    - Based on standard Linux \(conservative\)
+    - Standard virtualization stack: Libvirt, QEMU, KVM
+
+### Container
+
+- Docker
+    - Daemon is used to run containers
+    - Root authority required
+    - Socket and network handling is complex \(Docker bridge\)
+    - docker-compose is an orchestration tool
+- Rootless Podman
+    - Daemonless design
+    - Root authority not required
+    - Orchestration is integrated into systemd
+    - pasta forwards packets via host-gateway
+- K8S, K3S
+    - HA is based on reprovisioning
+    - Guarantees availability by creating and destroying nodes dynamically
+
+### IaC
+
+- Terraform
+    - Strong at provisioning low-level, dynamic multi-node environments
+- Ansible
+    - Declarative, easy YAML syntax
+    - Configuration is applied over SSH
+
+## Decisions
+
+- Use Libvirt/KVM/QEMU on plain Linux \(Debian stable\).
+- Separate all services by VM and by rootless Podman containers, without K3S.
+    - An orchestration stack is not needed in a single-node system
+    - Services will be defined with Quadlet so that systemd manages them declaratively
+    - IaC will be implemented declaratively, with Ansible only
+- All VMs and services are isolated logically by VLAN and nftables
+
+## Consequences
+
+- Every VM has its own boundary, enforced by VLAN and nftables
+- Every service runs in its own namespace through Podman subuid mapping, without a daemon
+- Ansible can manage all configurations of services and VMs declaratively
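
Reviewer note: the decision to define services with Quadlet can be illustrated with a small generator. This is a minimal sketch, not the repository's actual tooling: the function name `render_quadlet`, the service name, the image, and the port are hypothetical. The keys used (`ContainerName`, `Image`, `PublishPort`) are standard Quadlet `[Container]` keys, and rootless units are normally read from `~/.config/containers/systemd/`.

```python
def render_quadlet(name: str, image: str, ports: list[str]) -> str:
    """Render a minimal Quadlet .container unit as text.

    All argument values are illustrative; nothing here is taken from
    the repository's real service definitions.
    """
    lines = [
        "[Unit]",
        f"Description={name} container",
        "",
        "[Container]",
        f"ContainerName={name}",
        f"Image={image}",
    ]
    # One PublishPort= line per "host:container" mapping.
    lines += [f"PublishPort={port}" for port in ports]
    lines += [
        "",
        "[Service]",
        "Restart=always",
        "",
        "[Install]",
        # default.target is the usual target for rootless (user) units.
        "WantedBy=default.target",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical service; write the result to <name>.container.
    print(render_quadlet("example", "registry.example/app:1.0", ["8080:8080"]))
```

An Ansible template task could place the rendered text on the VM and reload the user's systemd instance, which keeps the whole service definition declarative.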
@@ -0,0 +1,63 @@
+# ADR 002 - Network
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+
+
+## Status
+
+- Accepted
+
+## Context
+
+- All L3 communications should be controlled by a central firewall node.
+- Every firewall rule should be managed by code, not clicks.
+- Every edge node takes charge of L2 communication rules.
+- IPv4 and IPv6 dual stack should be supported for future network environment.
+
+## Consideration
+
+### Firewall
+
+- OPNsense/pfSense
+    - Vendor lock-in
+    - GUI environment \(WebGUI\) can contain vulnerabilities
+    - It is hard to manage configurations by IaC
+- iptables
+    - Previous standard of Linux
+    - IPv4 and IPv6 configuration is separated \(no inet\)
+- nftables
+    - New standard of Linux
+    - Rule syntax reads close to English
+    - IPv4 and IPv6 configuration can be set on the same table \(inet\)
+
+### Flat network structure
+- LAN only
+    - L2 communication doesn't need to pass through the gateway
+    - Nodes reach each other by MAC address via ARP, so unicast communication is hard to manage.
+    - It is hard to manage and apply policy centrally
+
+## Decisions
+
+- Categorize all nodes into 4 roles: 'client', 'server', 'user', and 'wg0' for VPN connections
+- Implement role separation with VLAN tagging on L2 switch (systemd-networkd bridge)
+    - VLAN 1: client (vmm, console, nas)
+    - VLAN 10: server (vmm, infra, auth, app)
+    - VLAN 20: user (DHCP allocated devices)
+    - wg0: VPN connections
+- Manage rules primarily by role, and additionally by IP and port when needed
+- All L3 communication that must pass the gateway should be under the control of the firewall \(fw\)
+- All nodes, including the firewall, use nftables \(modern standard\) to manage packets based on the zone concept
+- IPv6 follows a two-track strategy
+    - Client, server, and wg nodes have static ULA addresses and use NAT66 for permanence
+    - User nodes get GUA SLAAC addresses from the ISP for compatibility
+
+
+## Consequences
+
+- The firewall takes charge of L3 communications
+- Each node takes charge of L2 communications and of communication arriving from the firewall
+- All nodes can communicate over both IPv4 and IPv6
+- All policies can be managed as code
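
Reviewer note: the two-track IPv6 strategy can be checked mechanically. The sketch below classifies an address as ULA (`fc00::/7`) or GUA (`2000::/3`) with the standard `ipaddress` module. The function name `ipv6_track` and the returned labels are hypothetical, chosen only for this illustration.

```python
import ipaddress

# Well-known IPv6 ranges: unique local addresses and global unicast.
ULA = ipaddress.ip_network("fc00::/7")
GUA = ipaddress.ip_network("2000::/3")


def ipv6_track(addr: str) -> str:
    """Return which track of the two-track strategy an address falls under."""
    ip = ipaddress.ip_address(addr)
    if ip.version != 6:
        raise ValueError(f"{addr} is not an IPv6 address")
    if ip in ULA:
        return "static-ula"   # client, server, and wg nodes (NAT66)
    if ip in GUA:
        return "slaac-gua"    # user nodes (address from the ISP)
    return "other"            # link-local, multicast, and so on


if __name__ == "__main__":
    for sample in ("fd00::10", "2001:db8::1", "fe80::1"):
        print(sample, ipv6_track(sample))
```

A check like this could run against the inventory before nftables rules are rendered, so that a node assigned to the wrong track is caught early.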
@@ -0,0 +1,57 @@
+# ADR 003 - PKI
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+- Mar/06/2026
+    - Add expiry date monitoring method
+
+## Status
+
+- Accepted
+
+## Context
+
+- All communications except loopback should be encrypted
+- SSH and TLS communications need keys and certificates
+- Public CAs never issue certificates for the private domain '.internal'
+- Automate issuing and renewing certificates
+- Revocation is not needed in this small, single-node environment.
+
+## Consideration
+
+### Automation protocol
+
+- JWK/JWT provisioner
+    - Pre-shared secret values are harder to manage than ACME \(especially with nsupdate\)
+- authorized_keys
+    - As the number of nodes grows, authorized_keys becomes hard to manage.
+    - Trusting the SSH CA public key \(ca.pub\) accepts every certificate signed by the CA key, so authorized_keys does not need to be managed on each host.
+
+### Revocation
+
+- CRL/OCSP/OCSP stapling
+    - All long-term certificates are managed manually
+    - All short-term certificates are managed by ACME
+    - If certificates leak, it is easier to replace the intermediate CA itself
+
+## Decisions
+
+- Operate private CA
+    - Root CA \(stored on cold storage\) - 10 years
+    - Intermediate CA \(online server as Step-CA\) - 5 years
+    - SSH CA - no expiry
+- Manage certificates on two tracks
+    - ACME with nsupdate \(using private DNS\) for web services via Caddy - 90 days
+    - Manually issued and managed leaf certificates for infra services, for independence - 2.5 years
+    - Expiry dates of all manually issued leaf certificates are monitored by x509-exporter on the infra VM
+- Manage SSH certificates
+    - *-cert.pub for host \(with the -h option\)
+    - *-cert.pub for client \(without the -h option\)
+
+## Consequences
+
+- Private PKI is operated
+- Private SSH CA is operated
+- All external and internal communication is encrypted through TLS re-encryption \(E2EE\).
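
Reviewer note: the lifetimes above translate into a simple expiry check of the kind an exporter-based alert would encode. This is a sketch under stated assumptions: the 30-day warning window, the approximation of 2.5 years as 912 days, and the names `LIFETIME_DAYS`, `days_left`, and `needs_renewal` are all hypothetical and not taken from the repository or from x509-exporter.

```python
from datetime import date

# Lifetimes from the decisions above, in days (2.5 years approximated as 912).
LIFETIME_DAYS = {
    "root_ca": 3650,
    "intermediate_ca": 1825,
    "acme_leaf": 90,
    "manual_leaf": 912,
}


def days_left(not_after: date, today: date) -> int:
    """Days remaining until the certificate expires (negative if expired)."""
    return (not_after - today).days


def needs_renewal(not_after: date, today: date, warn_days: int = 30) -> bool:
    """True when the certificate is inside the (assumed) warning window."""
    return days_left(not_after, today) <= warn_days


if __name__ == "__main__":
    today = date(2026, 3, 15)
    print(needs_renewal(date(2026, 4, 1), today))   # inside the window
    print(needs_renewal(date(2027, 1, 1), today))   # far from expiry
```

For the manually issued leaf certificates this is the only safety net, since no ACME client renews them automatically.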
@@ -0,0 +1,52 @@
+# ADR 004 - DNS
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+
+
+## Status
+
+- Accepted
+
+## Context
+
+- A private authoritative DNS is required to use the reserved private top-level domain \(.internal\)
+- Split horizon DNS needs a DNS resolver, because an authoritative DNS must not send queries to other DNS servers.
+- Automatic certificate issuance needs a private authoritative DNS that supports nsupdate \(RFC 2136\)
+
+## Consideration
+
+### Resolver DNS
+- AdGuard Home
+    - More powerful query routing than Blocky
+    - Web UI dependency
+    - Extra features that are not needed \(DHCP, etc.\)
+- Unbound DNS
+    - Cache and forward zone management is powerful
+    - More complex than Blocky
+    - Caching is not really needed in this environment
+        - The internal authoritative DNS only takes charge of internal communication
+        - All security functions are delegated to public DNS such as Cloudflare \(DNSSEC, etc.\)
+
+## Decisions
+
+- Operate BIND9 as authoritative DNS
+    - BIND9, developed by ISC, is the de facto standard authoritative DNS server
+    - It fully supports nsupdate
+    - Use 2 forward zones
+        - ilnmors.com for split horizon DNS
+        - ilnmors.internal for internal DNS
+    - Use 4 PTR zones
+        - Client VLAN IPv4 and IPv6 PTR zones
+        - Server VLAN IPv4 and IPv6 PTR zones
+- Operate Blocky as resolver and cache DNS
+    - Blocky is configured through a single file
+    - It supports query routing based on the domain, which enables split horizon DNS
+
+## Consequences
+
+- Split horizon DNS is implemented
+- ACME is available via nsupdate
+- Malicious DNS queries are blocked at the DNS level
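
Reviewer note: domain-based query routing, which is what makes the split horizon work, reduces to a suffix match on the query name. The sketch below is not Blocky's implementation. The zone names come from the decisions above, while the function name `pick_upstream` and the labels `authoritative` and `public` are hypothetical.

```python
# Zones served by the private authoritative DNS (from the decisions above).
INTERNAL_ZONES = ("ilnmors.internal", "ilnmors.com")


def pick_upstream(qname: str) -> str:
    """Route a query name to the internal authoritative DNS or a public resolver."""
    name = qname.rstrip(".").lower()
    for zone in INTERNAL_ZONES:
        # Match the zone apex or any name below it, but not lookalike
        # suffixes such as "notilnmors.com".
        if name == zone or name.endswith("." + zone):
            return "authoritative"
    return "public"


if __name__ == "__main__":
    for query in ("git.ilnmors.internal.", "ilnmors.com", "example.org"):
        print(query, "->", pick_upstream(query))
```

The leading-dot check matters: a plain `endswith(zone)` would send unrelated domains that merely share the suffix to the internal server.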
@@ -0,0 +1,54 @@
+# ADR 005 - IDS/IPS
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+
+
+## Status
+
+- Accepted
+
+## Context
+
+- Automated detection and prevention of network threats
+
+## Considerations
+
+### IPS
+- Operate Suricata as IDS and IPS
+    - Suricata IPS mode blocks packets by itself, bypassing nftables integration
+    - Suricata IPS mode has much higher overhead than IDS mode.
+    - Suricata IPS mode cannot inspect TLS-encrypted traffic, so it cannot detect or prevent threats inside it.
+    - Homelab server resources are not enough to absorb that overhead.
+
+- fail2ban
+    - Single node only, no centralized decision sharing
+    - No community-based threat intelligence (CAPI)
+    - Regex based log parsing is less structured than CrowdSec's parser/scenario model
+
+- CrowdSec
+    - Community-based rules and scenarios \(CAPI\)
+    - Prevention based on local machines and parsers \(LAPI\)
+    - Bouncers can use nftables to prevent threats
+    - Parsers can detect even L7 attacks carried over TLS
+
+## Decisions
+
+- Operate Suricata as IDS
+    - Suricata IDS mode mirrors all packets from the interfaces
+    - It matches packets against its rules and writes logs to fast.log and eve.json
+
+- Operate CrowdSec as IPS
+    - CrowdSec uses two API servers, CAPI and LAPI.
+        - CAPI updates malicious IPs based on community decisions
+        - LAPI identifies malicious attacks based on logs processed by its parsers and scenarios \(Suricata, Caddy, etc.\)
+        - When CAPI and LAPI decide to block an IP based on logs processed by parsers and scenarios, the bouncer blocks the malicious access.
+    - CrowdSec registers the blocklist in nftables or iptables.
+
+
+## Consequences
+
+- Malicious traffic from the WAN, and even from the LAN, is handled by CrowdSec and Suricata
+- The firewall maintains high network throughput because blocking is performed efficiently at the OS network level (`nftables` sets) rather than through deep inline packet inspection.
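
Reviewer note: the log-driven flow (Suricata writes `eve.json`, a decision engine reads it, a bouncer blocks) can be sketched in a few lines. This is a deliberately simplified stand-in, not CrowdSec's parser or scenario logic: the threshold of 3 alerts and the function name `offenders` are hypothetical. The `event_type` and `src_ip` fields are standard in Suricata's EVE JSON output.

```python
import json
from collections import Counter
from typing import Iterable


def offenders(eve_lines: Iterable[str], threshold: int = 3) -> list[str]:
    """Return source IPs that triggered at least `threshold` alert events."""
    hits: Counter[str] = Counter()
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if not isinstance(event, dict):
            continue
        if event.get("event_type") == "alert" and "src_ip" in event:
            hits[event["src_ip"]] += 1
    return sorted(ip for ip, count in hits.items() if count >= threshold)


if __name__ == "__main__":
    sample = [
        '{"event_type": "alert", "src_ip": "203.0.113.7"}',
        '{"event_type": "alert", "src_ip": "203.0.113.7"}',
        '{"event_type": "alert", "src_ip": "203.0.113.7"}',
        '{"event_type": "flow", "src_ip": "198.51.100.2"}',
    ]
    print(offenders(sample))
```

The resulting list corresponds to what a bouncer would load into an nftables set, which is why blocking stays cheap compared with inline inspection.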
@@ -0,0 +1,60 @@
+# ADR 006 - Secrets
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+
+## Status
+
+- Accepted
+
+## Context
+
+- Secret values must not be uploaded anywhere in plaintext.
+- Manage secret values in Git without exposing their real values.
+
+## Considerations
+
+### External KMS
+
+- HashiCorp Vault or Infisical
+    - Very powerful, but introduces significant compute/memory overhead.
+    - Creates a "Secret Zero" problem for a single-node homelab environment because of its dependencies \(DB, etc.\).
+    - It is hard to operate key servers on separate hardware.
+
+### Systemd-credential
+
+- It is hard to use a TPM for systemd-credential in a VM environment
+    - It is hard to guarantee the idempotency of a TPM in a virtual environment.
+
+### Ansible vault only
+
+- Ansible Vault is a powerful option, but it is not convenient.
+    - Secrets must be encrypted separately, outside the host_vars or group_vars files.
+    - It is hard to add or modify secret values in the inventory file.
+
+## Decisions
+
+- All secret data in YAML format is encrypted by sops with an age key in `secret.yaml`.
+- The age key is encrypted by GPG and Ansible Vault with a master key of more than 40 characters \(including upper case, lower case, numbers, and special characters\).
+    - Secret data is always decrypted from secrets.yaml by the `edit_secret.sh` script or by Ansible tasks, using the age key encrypted by ansible-vault.
+    - Decrypted secret data is always processed on ramfs and is never saved on disk.
+- The master key is never saved on disk, only in cold storage \(USB, M-DISC, operator's memory\)
+- Secret data is stored in a specific directory on each server or as a podman secret.
+    - OS:
+        - path: /etc/secrets
+          owner: root:root
+          mode: 0711
+        - path: /etc/secrets/\$UID
+          owner: \$UID:root
+          mode: 0500
+    - Containers:
+        - podman secret:
+          path: /run/secret/\$SECRET_NAME
+    - This data is never backed up by kopia or uploaded to Git.
+
+## Consequences
+
+- Secret values exist in plaintext only where they are needed.
+- Encrypted secret data can be managed with Git.
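
Reviewer note: the directory modes in the decisions (`0711` for `/etc/secrets`, `0500` for the per-UID directories) are easy to misread. The sketch below decodes them with the standard `stat` module. The helper names are hypothetical. The takeaway is that `0711` lets other users traverse into the directory without listing it, while `0500` gives only the owner read and traverse access.

```python
import stat


def describe(mode: int) -> str:
    """Render directory permission bits the way `ls -l` shows them."""
    return stat.filemode(stat.S_IFDIR | mode)


def others_can_list(mode: int) -> bool:
    """True when users outside owner and group may list directory entries."""
    return bool(mode & stat.S_IROTH)


def others_can_traverse(mode: int) -> bool:
    """True when users outside owner and group may enter the directory."""
    return bool(mode & stat.S_IXOTH)


if __name__ == "__main__":
    for mode in (0o711, 0o500):
        print(oct(mode), describe(mode),
              "list:", others_can_list(mode),
              "traverse:", others_can_traverse(mode))
```

With this layout a service account can reach its own `/etc/secrets/$UID` directory by path, but cannot enumerate which other UIDs hold secrets.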
@@ -0,0 +1,61 @@
+# ADR 007 - Backup
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+
+- Feb/27/2026
+    - Status changed from Deferred to Accepted
+    
+## Status
+
+- Accepted
+
+## Context
+
+- All configuration files are managed by Git \(IaC\)
+- All data files should be backed up by Kopia
+- All backups should follow the 3-2-1 backup rule
+
+## Considerations
+
+### Backup Tool
+- Restic / BorgBackup
+    - Also excellent deduplicating backup tools.
+    - However, Kopia provides a highly efficient native server mode (API) and cross-platform compatibility, making it easier to integrate with Synology DSM.
+
+### Database Backup Method
+- Physical Backup (Raw data folder backup / File system snapshots)
+    - Backing up the `/var/lib/postgresql` directory directly while the DB is running can lead to severe data corruption and inconsistency.
+    - Logical dumps (`pg_dump`) are much safer, database-agnostic, and easier to restore in a homelab environment.
+
+
+## Decisions
+
+- All configuration files are managed by Git
+    - Configuration files are based on text
+    - Versioning and history management are necessary.
+    - Local Git -> private Gitea -> GitHub private project \(mirrored\)
+    - This fulfills the 3-2-1 backup rule
+- Data files are managed by Kopia and DSM
+    - Local storage - kopia -> DSM's Kopia repository server - CloudSync -> Cloud server such as OneDrive or Google Drive
+    - This fulfills the 3-2-1 backup rule
+- Data files that need backup
+    - DB data files: dump
+        - DB data files are located on infra:/home/infra/containers/postgresql/backups/\{cluster,$service\}/
+    - App data files: Photos, Media, etc ..
+        - App data files are located on app:/home/app/data/
+    - Backed up files: kopia
+        - DSM:/kopia/{infra,app}/
+- The Kopia-over-DSM configuration is managed by a runbook with equivalent CLI commands, due to vendor limitations
+- Restores are performed manually
+    - DB data files
+        - From kopia server to console:$HOMELAB_PATH/data/volume/infra/postgresql/\{cluster,data\}
+    - APP data files
+        - From the Kopia server to the APP VM, after initializing the VM and before deploying services
+- Automated backups do not guarantee the integrity of the data, so before resetting the system, run a manual backup after making sure all services are shut down.
+
+## Consequences
+
+- All files, including configuration and data backups, fulfill the 3-2-1 backup rule \(3 copies, 2 different media, 1 offsite\)
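
Reviewer note: the 3-2-1 rule can be stated as a predicate over the list of copies. This is a minimal sketch. The `Copy` dataclass, the function name `satisfies_321`, and the medium labels (`ssd`, `nas`, `cloud`) are hypothetical and used only to model the data chain described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str   # where the copy lives
    medium: str     # kind of storage medium (illustrative label)
    offsite: bool   # True if stored outside the home site


def satisfies_321(copies: list[Copy]) -> bool:
    """3 copies, on at least 2 different media, with at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    data_chain = [
        Copy("local storage", "ssd", offsite=False),
        Copy("DSM Kopia repository", "nas", offsite=False),
        Copy("cloud via CloudSync", "cloud", offsite=True),
    ]
    print(satisfies_321(data_chain))
```

Dropping the cloud copy from the list makes the predicate fail on both the copy count and the offsite requirement, which shows why the CloudSync step is not optional.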
@@ -0,0 +1,43 @@
+# ADR 008 - Passthrough
+
+## Date
+
+- Feb/23/2026
+    - First documentation
+
+## Status
+
+- Accepted
+
+## Context
+
+- App VM needs GPU for heavy workloads like Immich \(hardware transcoding and machine learning\)
+- App VM needs huge data storage for its own services 
+
+## Considerations
+
+### iGPU
+
+- SR-IOV
+    - SR-IOV is a technology that shares one PCIe device among several VMs.
+    - The current stable Linux kernel doesn't support SR-IOV for the iGPU
+    - It is necessary to use DKMS for SR-IOV
+        - DKMS modules can break on kernel upgrades, and stability is the top priority for a server.
+    - When the iGPU itself is passed through, the hypervisor cannot use graphics output.
+        - All nodes are managed over SSH sessions, so this is not a problem.
+
+### Storage
+
+- Each HDD
+    - The Aoostar WTR Pro has a dedicated SATA controller for its HDDs.
+    - It is more effective to pass through the SATA controller itself, so the VM can manage btrfs RAID10 and check HDD health via S.M.A.R.T. values.
+
+## Decisions
+
+- Pass through the N150's iGPU to the APP VM
+- Pass through the SATA controller to the APP VM
+
+## Consequences
+
+- The iGPU itself is passed through to the APP VM.
+- The SATA controller is passed through to the APP VM.
@@ -0,0 +1,49 @@
+# ADR 009 - Isolation
+
+## Date
+
+- Mar/06/2026
+    - First documentation
+
+## Status
+
+- Accepted
+
+## Context
+
+- Define the boundary of each service unit: hypervisor, VM, and container
+
+## Considerations
+
+### Hypervisor
+
+- As a pure hypervisor, it should only run virtualization for VMs.
+- The hypervisor only provides resources and a dumb hub \(br\)
+
+### VM
+
+- VM should be distinguished based on their logical role.
+    - Firewall is responsible for networking
+    - Infra is responsible for infrastructure services such as DB, Monitoring, CA server
+    - Auth is responsible for authentication and authorization for services
+    - App is responsible for applications
+
+### Services
+
+- Services should be distinguished based on their needs \(privilege\)
+    - The network stack and backup stack need special privileges for low-level ACLs or networking.
+    - The application stack usually doesn't need low-level privileges
+
+## Decisions
+
+- Hypervisor: only supplies pure virtualization for VMs
+- VM: isolated by the hypervisor from the other VMs based on its role
+- Services:
+    - Services that need privileges: run natively on the VM, without adding virtualization overhead.
+    - Services that don't need privileges: isolate them from the host as containers.
+
+## Consequences
+
+- Security integrity is guaranteed
+- Operational rules stay simple
+- Limited resources are used efficiently
@@ -0,0 +1,35 @@
+# ADR 010 - Provisioning
+
+## Date
+
+- Mar/06/2026
+    - First documentation
+    
+## Status
+
+- Accepted
+
+## Context
+
+- Every sensitive process should be controlled and managed.
+
+## Considerations
+
+### Automate destroying process
+
+- Destroying is not a frequent process, so there is no reason to build complex logic for it
+- Sensitive processes should be double-checked manually by a human
+
+## Decisions
+
+- Automate the provisioning process
+- Keep sensitive processes manual
+    - Removing
+    - Formatting
+    - Destroying
+    - Certificates and CA \([ADR-003](./003-pki.md)\)
+    - Anything else the operator decides is sensitive
+
+## Consequences
+
+- All processes stay under the control of the operator
@@ -0,0 +1,33 @@
+# ADR 011 - TLS communication
+
+## Date
+
+- Mar/06/2026
+    - First documentation
+    
+## Status
+
+- Accepted
+
+## Context
+
+- Keep the administrative policy simple
+- Set the principle for the TLS communication boundary
+
+## Considerations
+
+### Apply mTLS
+
+- Implementing mTLS needs both a client certificate and a server certificate
+- Managing many certificates creates a huge operational burden \(expiry dates, revocation, etc.\)
+
+## Decisions
+
+- Use TLS for all communication except on the 'lo' interface
+- Where TLS can be enabled, apply it even on the 'lo' interface
+
+## Consequences
+
+- The policy stays simple
+- Overhead increases slightly
+- Operational exceptions are eliminated \(for the administrator\)
@@ -0,0 +1,45 @@
+# ADR 012 - Alerting
+
+## Date
+
+- Mar/08/2026
+    - First documentation
+    
+## Status
+
+- Accepted
+
+## Context
+
+- Observability is necessary
+- It is difficult to know the current status of services
+- A stable restore process already exists
+
+## Considerations
+
+### Mail based
+
+- An MTA is hard to manage, even when the operator uses it only as a relay host
+- The mail protocol is too complex to implement just for an internal mail system with a single operator
+
+### Chat based
+
+- Discord or Telegram makes it easy to receive announcements automatically
+- Dependency on external services
+
+## Decisions
+
+- Do not operate an alerting system
+    - A single-node system for a small group doesn't need HA
+    - When the single-node system is down, the alerting system is also down.
+- If an alerting system becomes necessary, implement it on a free instance of an external IaaS such as AWS or Azure
+
+## Consequences
+
+- Simple management and stable restoring
+    - Check service availability
+    - Check Grafana
+    - Access the node over VPN with SSH
+    - Access the node via the physical VLAN
+    - Reprovision the node
+- Extension with cloud services remains possible.