Compare commits
20 Commits
aea6e58f43
...
main
| Author | SHA1 | Date |
|---|---|---|
| | 382d08e056 | |
| | 91aab1fce6 | |
| | 492cc8abbc | |
| | 3f191d8f93 | |
| | 258d1ecc60 | |
| | a927b69aad | |
| | 443380b224 | |
| | b351696017 | |
| | a88f347777 | |
| | eaaacc2f68 | |
| | b7a52c0c37 | |
| | a9b18b5821 | |
| | b2c1cc6577 | |
| | 95f543b4f4 | |
| | d7d0098a5c | |
| | 8e9a90bfc3 | |
| | a4fe05e26a | |
| | adc92a61b4 | |
| | 902f00e2b9 | |
| | 18f0f637bd | |
3
.gitignore
vendored
@@ -5,7 +5,8 @@ images/
 # Boot file artifacts (but keep boot.ipxe configuration)
 http/*
 !http/boot.ipxe
-tftp/
+tftp/*
+!tftp/ipxe.efi
 
 # Generated metadata
 *.img
3
.sops.yaml
Normal file
@@ -0,0 +1,3 @@
creation_rules:
  - path_regex: secrets/.*\.yaml$
    age: age1gausnystsln7fpenw7arw7x79xe22z697jnauj38npy0usayqqxqc7td2y
169
CLAUDE.md
Normal file
@@ -0,0 +1,169 @@
# AGENTS.md

This file provides guidance to AI coding assistants when working with code in this repository.

## Project Overview

This is a netboot system for diskless Ubuntu Noble (24.04) nodes designed for K3s clusters. It builds bootable images (kernel, initramfs, squashfs) that are served via HTTP and loaded using iPXE for network booting.

**Boot Flow:**
1. Machine PXE boots and loads iPXE from network
2. iPXE fetches and executes `boot.ipxe` configuration
3. iPXE downloads kernel (`vmlinuz`) and custom initramfs (`initrd-netboot.img`) over HTTP
4. Kernel boots with custom initramfs that downloads the squashfs root filesystem over HTTP
5. Root filesystem is mounted as read-only squashfs with writable overlay (tmpfs)

## Build Commands

```bash
# Build netboot image (15-30 minutes, requires sudo)
make build
# Builds: http/vmlinuz, http/initrd-netboot.img, http/filesystem.squashfs

# Deploy to NAS server
make deploy
# Syncs http/ directory to phoenix:/srv/netboot/http/

# Build and deploy in one step
make all

# Clean build artifacts (unmounts any stray filesystem mounts first)
make clean

# Check NAS connectivity
make check-nas
```

**Configuration:**
- `NAS_HOST=phoenix` - target server for deployment
- `NAS_PATH=/srv/netboot` - deployment path on NAS
- Edit these in `Makefile` if deployment target changes

## Architecture

### Build System

**build-image.sh** - Main build script that:
1. Creates Ubuntu Noble base system using `debootstrap`
2. Chroots into rootfs and installs packages (kernel, K3s prerequisites, container runtime, tools)
3. Configures system (networking via netplan, SSH keys, tmpfs mounts, services)
4. Builds custom initramfs using `mkinitramfs` with customizations from `initramfs/`
5. Creates compressed squashfs image of entire rootfs
6. Copies artifacts to `images/<VERSION>/` and `http/` directories
7. Sets proper file permissions (644) for HTTP serving

**Path handling:** All scripts use `SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"` to work from any location, not just hardcoded paths.

### Custom Initramfs

Located in `initramfs/` directory, passed to `mkinitramfs` with `-d` flag:

- **initramfs.conf** - Configuration (MODULES=most, COMPRESS=gzip, RESUME=none)
- **modules** - Extra kernel modules to include (squashfs, overlay, r8125 network driver for 2.5GbE)
- **hooks/netboot** - Copies binaries into initramfs (wget, curl, unsquashfs, switch_root)
- **scripts/netboot** - Provides `mountroot()` function that:
  - Parses kernel cmdline for `root=http://...` URL and `overlayroot=tmpfs`
  - Configures networking via `configure_networking`
  - Downloads squashfs over HTTP using wget (with retries)
  - Validates downloaded file is squashfs
  - Mounts squashfs read-only
  - If `overlayroot=tmpfs`, creates overlay with tmpfs upper layer for writes
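
The command-line parsing step in `mountroot()` could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `initramfs/scripts/netboot`: the function name `parse_netboot_cmdline` and the variables `NETBOOT_URL` and `NETBOOT_OVERLAY` are hypothetical, the real script would read `/proc/cmdline` rather than take an argument, and it uses bash arrays, which the initramfs shell may not provide.

```bash
#!/bin/bash
# Sketch only: extract root=http://... and overlayroot=... from a kernel
# command line string. All names here are illustrative assumptions.
parse_netboot_cmdline() {
    local -a args
    local arg
    NETBOOT_URL=""
    NETBOOT_OVERLAY=""

    # Split the command line on whitespace without glob expansion.
    read -r -a args <<< "$1"

    for arg in "${args[@]}"; do
        case "$arg" in
            root=http://*|root=https://*) NETBOOT_URL="${arg#root=}" ;;
            overlayroot=*)                NETBOOT_OVERLAY="${arg#overlayroot=}" ;;
        esac
    done

    # Succeed only when an HTTP root URL was found; the overlay is optional.
    [ -n "$NETBOOT_URL" ]
}
```

Calling it with the kernel arguments from `http/boot.ipxe` should leave the squashfs URL in `NETBOOT_URL` and `tmpfs` in `NETBOOT_OVERLAY`; a command line without an HTTP root makes the function return non-zero.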

### Dracut Module (Alternative)

Located in `dracut-module/90netboot/`, an alternative initramfs approach using dracut:

- **module-setup.sh** - Dracut module setup and dependencies
- **parse-netboot.sh** - Parses kernel command line for netboot parameters
- **mount-netboot.sh** - Handles HTTP squashfs download and mounting

### iPXE Boot Configuration

**http/boot.ipxe** - iPXE script that:
- Loads kernel from `http://192.168.100.1:8800/vmlinuz`
- Loads initramfs from `http://192.168.100.1:8800/initrd-netboot.img`
- Sets kernel args: `boot=netboot root=http://192.168.100.1:8800/filesystem.squashfs rootfstype=squashfs overlayroot=tmpfs ip=dhcp console=tty0 console=ttyS0,115200 earlyprintk=serial,ttyS0,115200 loglevel=7`
- Boots the kernel

**IMPORTANT:** The HTTP server IP (192.168.100.1:8800) is hardcoded in boot.ipxe. Update this if the boot server changes.

### System Configuration

Built systems are configured with:
- Norwegian keyboard layout (nb_NO.UTF-8 + en_US.UTF-8 locales)
- Root SSH access with specific authorized keys (see build-image.sh around line 160)
- Password auth disabled, pubkey only
- Network via netplan with DHCP (systemd-networkd)
- Ephemeral tmpfs mounts: /tmp (2G), /var/tmp (1G), /var/log (1G), /run (512M)
- systemd-journald configured for volatile storage (tmpfs, 256M max)
- K3s dependencies: apparmor, iptables, conntrack, containerd, runc
- No hibernation/resume support

### Utility Scripts

**chroot-rootfs.sh** - Enter chroot for manual tweaking
- Mounts proc/sys/dev into existing rootfs
- Cleanup trap unmounts on exit
- **Hardcoded path:** `/srv/netboot/build/rootfs` - update if repo moves

**rebuild-squashfs.sh** - Rebuild squashfs after manual changes
- Creates new versioned image from existing rootfs
- Skips full debootstrap/package installation
- **Hardcoded paths:** `/srv/netboot/*` - update if repo moves

**verify-image.sh** - Validate built image completeness
- Checks all required files exist (vmlinuz, initrd, squashfs, boot.ipxe)
- Validates file types (kernel, cpio archive, squashfs)
- Verifies file permissions (644 for HTTP serving)
- Inspects initramfs for custom netboot script and required binaries
- Checks squashfs for critical directories and configurations
- Validates iPXE configuration references correct files
- Run with `./verify-image.sh` after `make build`
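
Two of the checks listed for `verify-image.sh` can be illustrated with small standalone helpers. This is a sketch, not the script's actual code: the function names `is_squashfs` and `has_mode_644` are hypothetical, and `stat -c` assumes GNU coreutils, as found on the Ubuntu/Debian build host.

```bash
#!/bin/bash
# Sketch only: helper names are illustrative, not taken from verify-image.sh.

# A squashfs image begins with the 4-byte magic "hsqs".
# NUL bytes are dropped so bash does not warn when capturing binary data.
is_squashfs() {
    [ "$(head -c 4 -- "$1" 2>/dev/null | tr -d '\0')" = "hsqs" ]
}

# True when the file's permission bits are exactly 644 (GNU stat syntax).
has_mode_644() {
    [ "$(stat -c '%a' -- "$1" 2>/dev/null)" = "644" ]
}
```

A caller could combine them, for example `is_squashfs http/filesystem.squashfs && has_mode_644 http/filesystem.squashfs`, and treat a non-zero exit status as a failed check.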

## File Structure

```
.
├── build-image.sh          # Main build script
├── Makefile                # Build/deploy automation
├── verify-image.sh         # Image validation script
├── chroot-rootfs.sh        # Chroot helper (hardcoded paths)
├── rebuild-squashfs.sh     # Rebuild helper (hardcoded paths)
├── AGENTS.md               # AI assistant guidance (this file)
├── CLAUDE.md               # Claude-specific guidance
├── initramfs/              # Custom initramfs configuration (mkinitramfs)
│   ├── initramfs.conf      # mkinitramfs config
│   ├── modules             # Extra kernel modules
│   ├── hooks/netboot       # Binary copying hook
│   └── scripts/netboot     # HTTP root mounting logic
├── dracut-module/          # Alternative initramfs (dracut)
│   └── 90netboot/
│       ├── module-setup.sh
│       ├── parse-netboot.sh
│       └── mount-netboot.sh
├── build/                  # Build artifacts (gitignored)
│   └── rootfs/             # debootstrap rootfs
├── images/                 # Versioned builds (gitignored)
│   ├── <YYYYMMDD-HHMM>/    # Timestamped builds
│   └── latest -> <VERSION> # Symlink to latest
└── http/                   # HTTP serving directory (gitignored except boot.ipxe)
    ├── boot.ipxe           # iPXE config (tracked)
    ├── vmlinuz             # Kernel (generated)
    ├── initrd-netboot.img  # Custom initramfs (generated)
    ├── filesystem.squashfs # Root filesystem (generated)
    └── version.txt         # Build metadata (generated)
```

## Development Notes

**Hardcoded paths issue:** `chroot-rootfs.sh` and `rebuild-squashfs.sh` use hardcoded `/srv/netboot/` paths instead of dynamic path detection like `build-image.sh`. They need updating if repo is cloned elsewhere.
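
One possible way to remove the hardcoded path is to reuse the `SCRIPT_DIR` idiom that `build-image.sh` already relies on. The sketch below assumes the helper script sits in the repository root next to `build/`; `ROOTFS_OVERRIDE` is a hypothetical environment variable introduced here for illustration and does not exist in the repository.

```bash
#!/bin/bash
# Sketch only: derive the rootfs location from the script's own path
# instead of hardcoding /srv/netboot. ROOTFS_OVERRIDE is an assumed,
# optional override and is not part of the existing scripts.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ROOTFS="${ROOTFS_OVERRIDE:-$SCRIPT_DIR/build/rootfs}"

if [ ! -d "$ROOTFS" ]; then
    # Warn rather than exit so the snippet is safe to try anywhere.
    echo "rootfs not found at $ROOTFS (has the image been built?)" >&2
fi
```

With this in place the helper would follow the repository wherever it is cloned, while still allowing a one-off override such as `ROOTFS_OVERRIDE=/srv/netboot/build/rootfs ./chroot-rootfs.sh`.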

**Build requirements:**
- Ubuntu/Debian host with debootstrap, mkinitramfs, mksquashfs
- Sudo access for chroot operations and filesystem mounting
- 15-30 minute build time
- ~1GB disk space for build artifacts

**SSH key management:** Root SSH keys are embedded in build-image.sh around line 160. Update these before building images for new environments.

**Network driver:** RTL8125 (r8125) driver is explicitly loaded in initramfs for 2.5GbE NICs. If different NICs are used, update `initramfs/modules` and `initramfs/scripts/netboot`.
29
Makefile
@@ -1,9 +1,19 @@
-.PHONY: build deploy clean help
+.PHONY: deploy clean help check-nas all
 
 NAS_HOST=phoenix
 NAS_PATH=/srv/netboot
 SCRIPT_DIR=$(shell dirname $(realpath $(firstword $(MAKEFILE_LIST))))
 
+# Source files that trigger a rebuild
+BUILD_SOURCES := $(SCRIPT_DIR)/build-image.sh \
+	$(wildcard $(SCRIPT_DIR)/initramfs/*) \
+	$(wildcard $(SCRIPT_DIR)/initramfs/*/*) \
+	$(wildcard $(SCRIPT_DIR)/files/*) \
+	$(wildcard $(SCRIPT_DIR)/secrets/*.yaml)
+
+# Build artifact (used as target for dependency tracking)
+BUILD_ARTIFACT := $(SCRIPT_DIR)/http/filesystem.squashfs
+
 help:
 	@echo "Netboot image build and deployment"
 	@echo ""
@@ -23,7 +33,8 @@ check-nas:
 	@echo "Checking NAS connectivity..."
 	@ping -c 1 $(NAS_HOST) > /dev/null 2>&1 && echo "✓ NAS is reachable" || (echo "✗ Cannot reach $(NAS_HOST)"; exit 1)
 
-build:
+# Build depends on source files - only rebuilds if sources changed
+$(BUILD_ARTIFACT): $(BUILD_SOURCES)
 	@echo "Building netboot image..."
 	@echo "This will take 15-30 minutes..."
 	sudo $(SCRIPT_DIR)/build-image.sh
@@ -32,10 +43,14 @@ build:
 	@echo "Artifacts ready in $(SCRIPT_DIR)/http/"
 	@du -sh $(SCRIPT_DIR)/http/*
 
+build: $(BUILD_ARTIFACT)
+
 deploy: check-nas
 	@echo "Deploying to NAS ($(NAS_HOST):$(NAS_PATH))..."
 	@echo "Syncing http/ directory..."
 	rsync -avz --delete $(SCRIPT_DIR)/http/ $(NAS_HOST):$(NAS_PATH)/http/
+	@echo "Syncing tftp/ directory (iPXE bootloader)..."
+	rsync -avz $(SCRIPT_DIR)/tftp/ $(NAS_HOST):$(NAS_PATH)/tftp/
 	@echo ""
 	@echo "✓ Deployment complete!"
 	@echo "Images are now live on $(NAS_HOST)"
@@ -46,7 +61,15 @@ all: build deploy
 	@echo "✓ Build and deployment complete!"
 
 clean:
-	@echo "Removing build artifacts..."
+	@echo "Cleaning build artifacts..."
+	@if [ -d "$(SCRIPT_DIR)/build/rootfs" ]; then \
+		echo "Unmounting any stray mounts from $(SCRIPT_DIR)/build/rootfs..."; \
+		mount | grep "$(SCRIPT_DIR)/build/rootfs" | awk '{print $$3}' | while read mount; do \
+			echo " Unmounting $$mount"; \
+			sudo umount -l "$$mount" 2>/dev/null || true; \
+		done; \
+	fi
+	@echo "Removing build directories..."
 	sudo rm -rf $(SCRIPT_DIR)/build/rootfs
 	sudo rm -rf $(SCRIPT_DIR)/images
 	@echo "✓ Cleaned!"
333
OPERATIONS.md
Normal file
@@ -0,0 +1,333 @@
# Netboot Operations Guide

This document covers day-to-day operations for the netboot K3s cluster system.

## Quick Reference

```bash
# Build new image (15-30 min, requires sudo)
cd /home/lindahl/git/netboot
sudo ./build-image.sh
make deploy

# Rebuild initramfs only (faster, ~2 min)
sudo ./rebuild-initramfs.sh
make deploy

# SSH to a node
ssh root@192.168.100.51

# Check node storage
ssh root@192.168.100.51 "lsblk && df -h /var/lib/containerd /var/lib/longhorn"
```

## Architecture Overview

```
┌─────────────────┐     HTTP (8800)      ┌──────────────────┐
│   Phoenix NAS   │◄────────────────────►│    K3s Nodes     │
│  192.168.100.1  │                      │  192.168.100.5x  │
├─────────────────┤                      ├──────────────────┤
│ /srv/netboot/   │                      │ RAM (overlay)    │
│   http/         │                      │   └─ / (root)    │
│     vmlinuz     │                      │ NVMe (persistent)│
│     initrd-netboot.img                 │   ├─ containerd  │
│     filesystem.squashfs                │   └─ longhorn    │
│     boot.ipxe   │                      └──────────────────┘
└─────────────────┘
```

**Boot sequence:**
1. Node PXE boots → loads iPXE
2. iPXE fetches `boot.ipxe` from phoenix
3. Downloads kernel + initramfs
4. Initramfs downloads squashfs root over HTTP
5. Mounts squashfs read-only with tmpfs overlay
6. `setup-node-storage.service` partitions/mounts local NVMe
7. System starts, K3s joins cluster

## Building Images

### Full Build

Builds everything from scratch: debootstrap, packages, initramfs, squashfs.

```bash
cd /home/lindahl/git/netboot
sudo ./build-image.sh
make deploy
```

**Time:** 15-30 minutes
**When to use:** Package changes, kernel updates, major configuration changes

### Initramfs-Only Rebuild

Faster rebuild when only changing boot/network logic.

```bash
sudo ./rebuild-initramfs.sh
make deploy
```

**Time:** ~2 minutes
**When to use:** Changes to `initramfs/` scripts or hooks

### Verify Build

Check that all components are present and valid:

```bash
./verify-image.sh
```

## Secret Management

Secrets are encrypted with [sops](https://github.com/getsops/sops) using age encryption. The age key lives on phoenix.

### Encrypted Files

| File | Contents |
|------|----------|
| `secrets/netboot.sops.yaml` | Root password hash for console login |

### Viewing Secrets

```bash
# From any machine with SSH access to phoenix
cat secrets/netboot.sops.yaml | ssh phoenix "sops -d --input-type yaml --output-type yaml /dev/stdin"
```

### Updating Root Password

1. Generate new password hash:
   ```bash
   ssh phoenix "echo 'newpassword' | openssl passwd -6 -stdin"
   ```

2. Update the encrypted file:
   ```bash
   ssh phoenix "cd /path/to/netboot && sops secrets/netboot.sops.yaml"
   # Edit root_password_hash value, save
   ```

   Or recreate entirely:
   ```bash
   NEW_HASH=$(ssh phoenix "echo 'newpassword' | openssl passwd -6 -stdin")
   ssh phoenix "echo 'root_password_hash: \"$NEW_HASH\"' | sops --input-type yaml --output-type yaml -e --age age1gausnystsln7fpenw7arw7x79xe22z697jnauj38npy0usayqqxqc7td2y /dev/stdin" > secrets/netboot.sops.yaml
   ```

3. Rebuild and deploy:
   ```bash
   sudo ./build-image.sh
   make deploy
   ```

4. Reboot nodes to pick up new password

### Adding New Secrets

Edit `.sops.yaml` to add new file patterns, then create encrypted files on phoenix:

```bash
ssh phoenix "sops secrets/newfile.sops.yaml"
```
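
Once a secrets file has been decrypted as shown under Viewing Secrets, a consumer still has to pull the `root_password_hash` value out of the YAML. A minimal sketch of that step follows; the helper name `extract_root_pw_hash` is hypothetical, and it assumes the key sits at the top level of the file on a single line.

```bash
#!/bin/bash
# Sketch only: read decrypted YAML on stdin and print the value of the
# top-level root_password_hash key with surrounding quotes removed.
# The function name is an illustrative assumption.
extract_root_pw_hash() {
    sed -n 's/^root_password_hash:[[:space:]]*//p' | head -n 1 | tr -d "\"'"
}
```

It could be fed from the decrypt command, for example `cat secrets/netboot.sops.yaml | ssh phoenix "sops -d --input-type yaml --output-type yaml /dev/stdin" | extract_root_pw_hash`. An empty result means the key was not found, which a caller should treat as a failure.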

## Node Storage Setup

Local NVMe is automatically partitioned on first boot by `setup-node-storage.service`.

### Partition Layout

| Partition | Size | Label | Mount Point | Purpose |
|-----------|------|-------|-------------|---------|
| nvme0n1p1 | 75GB | containerd | /var/lib/containerd | Container images |
| nvme0n1p2 | Remaining | longhorn | /var/lib/longhorn | Distributed storage |

### Automatic Behavior

| Drive State | Action |
|-------------|--------|
| No partition table | Auto-format (no prompt) |
| Has our labels (containerd/longhorn) | Mount silently |
| Has unknown partitions | Prompt on tty1, 120s timeout, skip if no response |
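
The decision table above can be read as a small classification function. The sketch below is an illustration only and is not taken from `files/setup-node-storage`: the function name `storage_action`, its argument convention, and the assumption that both labels must be present before mounting silently are all hypothetical.

```bash
#!/bin/bash
# Sketch only: map a drive's state to the action from the table above.
# Arguments: $1 = "yes" if a partition table exists, anything else if not;
#            remaining arguments = filesystem labels found on the drive.
# Assumption: both the containerd and longhorn labels are required.
storage_action() {
    local has_table="$1"
    shift

    if [ "$has_table" != "yes" ]; then
        echo "format"      # blank drive: auto-format without prompting
        return
    fi

    local label seen_containerd=no seen_longhorn=no
    for label in "$@"; do
        case "$label" in
            containerd) seen_containerd=yes ;;
            longhorn)   seen_longhorn=yes ;;
        esac
    done

    if [ "$seen_containerd" = yes ] && [ "$seen_longhorn" = yes ]; then
        echo "mount"       # our labels: mount silently
    else
        echo "prompt"      # unknown partitions: ask on tty1 with a timeout
    fi
}
```

For example, `storage_action yes containerd longhorn` prints `mount`, while `storage_action yes data` prints `prompt`.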

### Manual Intervention

If a node has an unknown drive and you want to format it:

1. Connect to physical console (tty1)
2. Reboot the node
3. Press ENTER when prompted (within 120 seconds)
4. Wait 5 seconds (abort window)
5. Drive is formatted and mounted

### Checking Storage Status

```bash
# On node
journalctl -u setup-node-storage
cat /var/lib/containerd/.netboot-storage  # marker file with metadata
lsblk /dev/nvme0n1
df -h /var/lib/containerd /var/lib/longhorn
```

## SSH Access

### Authorized Keys

Keys are baked into the image at build time. Current keys:

| Key | Source |
|-----|--------|
| `ssh-ed25519 AAAAC3...y1J` | lindahl@lindahl-Legion-5-Pro-16ACH6H |
| `ssh-ed25519 AAAA...0tX` | lindahl@phoenix.home |

To add/remove keys, edit `build-image.sh` around line 164-167.

### Console Access

Root password is set for physical console login only. SSH remains pubkey-only.

```bash
# Physical console or IPMI
login: root
Password: <from secrets/netboot.sops.yaml>
```

## Troubleshooting

### Node Won't Boot

1. Check phoenix HTTP server:
   ```bash
   ssh phoenix "curl -I http://localhost:8800/boot.ipxe"
   ssh phoenix "ls -lh /srv/netboot/http/"
   ```

2. Check nginx is running:
   ```bash
   ssh phoenix "systemctl status nginx"
   ```

3. Verify image integrity:
   ```bash
   ./verify-image.sh
   ```

### Node Boots But No Network

1. Check if initramfs has network driver:
   ```bash
   lsinitramfs http/initrd-netboot.img | grep -E "r8169|r8125"
   ```

2. Check kernel cmdline includes `ip=dhcp`:
   ```bash
   cat http/boot.ipxe
   ```

### Storage Not Mounting

1. Check service status:
   ```bash
   ssh root@node "systemctl status setup-node-storage"
   ssh root@node "journalctl -u setup-node-storage"
   ```

2. Check if NVMe exists:
   ```bash
   ssh root@node "lsblk"
   ```

3. Check labels:
   ```bash
   ssh root@node "blkid -L containerd && blkid -L longhorn"
   ```

### Overlay Filling Up

The root overlay is only 2GB. If it fills:

```bash
# Check what's using space
ssh root@node "du -sh /var/* | sort -h"

# Temporary files should go to NVMe or tmpfs mounts
# /tmp, /var/tmp, /var/log are separate tmpfs
```

## File Reference

| File | Purpose |
|------|---------|
| `build-image.sh` | Main build script |
| `rebuild-initramfs.sh` | Quick initramfs rebuild |
| `verify-image.sh` | Validate built image |
| `Makefile` | Build/deploy automation |
| `initramfs/` | Custom initramfs config for mkinitramfs |
| `initramfs/scripts/netboot` | HTTP root download and overlay mount |
| `files/setup-node-storage` | NVMe partitioning script |
| `files/setup-node-storage.service` | Systemd unit for storage setup |
| `secrets/netboot.sops.yaml` | Encrypted root password |
| `.sops.yaml` | Sops encryption config |
| `http/boot.ipxe` | iPXE boot configuration |

## Network Configuration

### IP Address Layout

| Range | Purpose |
|-------|---------|
| .1 | phoenix (gateway, DHCP, HTTP) |
| .2-.19 | Reserved (future infrastructure) |
| .20-.29 | Infrastructure devices |
| .50-.59 | Static K3s nodes |
| .60-.100 | Dynamic DHCP pool |

### Static Assignments

| Host | IP | MAC | Role |
|------|-----|-----|------|
| phoenix | 192.168.100.1 | - | NAS, HTTP server, DHCP |
| usw-flex-2 | 192.168.100.21 | 94:2a:6f:4c:fc:72 | Managed switch |
| k3s-node-01 | 192.168.100.51 | 78:55:36:04:e7:c8 | K3s worker |
| k3s-node-02 | 192.168.100.52 | 78:55:36:04:e7:1d | K3s worker |

HTTP server: `http://192.168.100.1:8800/`

### DHCP Reservations

Static IP assignments are configured in `/etc/dnsmasq.d/pxe-netboot.conf` on phoenix:

```
dhcp-range=192.168.100.60,192.168.100.100,12h

# Static DHCP reservations for K3s nodes
dhcp-host=78:55:36:04:e7:c8,192.168.100.51,k3s-node-01
dhcp-host=78:55:36:04:e7:1d,192.168.100.52,k3s-node-02

# Infrastructure
dhcp-host=94:2a:6f:4c:fc:72,192.168.100.21,usw-flex-2
```

To add a new node:

1. Boot the node once to get its MAC (check leases):
   ```bash
   ssh phoenix "cat /var/lib/misc/dnsmasq.leases"
   ```

2. Add reservation:
   ```bash
   ssh phoenix "sudo tee -a /etc/dnsmasq.d/pxe-netboot.conf << EOF
   dhcp-host=XX:XX:XX:XX:XX:XX,192.168.100.5X,k3s-node-0X
   EOF"
   ```

3. Restart dnsmasq:
   ```bash
   ssh phoenix "sudo systemctl restart dnsmasq"
   ```
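
To avoid typos when filling in the `XX` placeholders, the reservation line can be generated from a MAC address and a node number. The helper below is a sketch under stated assumptions: the name `dhcp_host_line` is hypothetical, and it assumes node N maps to `192.168.100.5N` and `k3s-node-0N`, following the pattern of the two existing nodes.

```bash
#!/bin/bash
# Sketch only: build a dnsmasq dhcp-host line for a K3s node.
# Arguments: $1 = MAC address, $2 = node number (1-9).
# The numbering scheme is inferred from the existing reservations.
dhcp_host_line() {
    local mac="${1,,}" index="$2"   # ${1,,} lowercases the MAC (bash 4+)

    if ! [[ "$mac" =~ ^([0-9a-f]{2}:){5}[0-9a-f]{2}$ ]]; then
        echo "invalid MAC address: $1" >&2
        return 1
    fi
    if ! [[ "$index" =~ ^[1-9]$ ]]; then
        echo "node number must be between 1 and 9" >&2
        return 1
    fi

    printf 'dhcp-host=%s,192.168.100.5%s,k3s-node-0%s\n' "$mac" "$index" "$index"
}
```

The output can then be appended on phoenix, for example `dhcp_host_line 78:55:36:04:E7:C8 1 | ssh phoenix "sudo tee -a /etc/dnsmasq.d/pxe-netboot.conf"`.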

To change the boot server IP, edit `http/boot.ipxe` and `initramfs/scripts/netboot`.
166
build-image.sh
@@ -14,6 +14,22 @@ VERSION=$(date +%Y%m%d-%H%M)
 
 echo "Building netboot image version $VERSION"
 
+# Decrypt secrets from phoenix (requires SSH access as the invoking user, not root)
+echo "Decrypting secrets from phoenix..."
+SECRETS_FILE="$SCRIPT_DIR/secrets/netboot.sops.yaml"
+SUDO_USER_HOME=$(getent passwd "${SUDO_USER:-$USER}" | cut -d: -f6)
+if [ -f "$SECRETS_FILE" ]; then
+    # Run SSH as the original user (not root) to use their SSH keys
+    ROOT_PW_HASH=$(sudo -u "${SUDO_USER:-$USER}" bash -c "cat '$SECRETS_FILE' | ssh phoenix 'sops -d --input-type yaml --output-type yaml /dev/stdin'" | grep root_password_hash | cut -d' ' -f2)
+    if [ -z "$ROOT_PW_HASH" ]; then
+        echo "WARNING: Failed to decrypt root password, console login will be disabled"
+        ROOT_PW_HASH="*"
+    fi
+else
+    echo "WARNING: No secrets file found at $SECRETS_FILE, console login will be disabled"
+    ROOT_PW_HASH="*"
+fi
+
 # Clean previous build - unmount any stray mounts first
 if [ -d "$BUILD_DIR/rootfs" ]; then
     echo "Cleaning up previous build mounts..."
@@ -25,12 +41,32 @@ fi
 rm -rf $BUILD_DIR/rootfs
 mkdir -p $BUILD_DIR/rootfs
 
+# Copy custom initramfs config into rootfs BEFORE debootstrap, so it's available during chroot
+echo "Setting up custom initramfs configuration..."
+INITRAMFS_CONFIG="$SCRIPT_DIR/initramfs"
+if [ ! -d "$INITRAMFS_CONFIG" ]; then
+    echo "ERROR: Custom initramfs config not found at $INITRAMFS_CONFIG"
+    exit 1
+fi
+# These dirs will be created by debootstrap, so we'll copy after that
+
 # Create base Ubuntu system
 echo "Running debootstrap (this will take several minutes)..."
 debootstrap --arch=amd64 --variant=minbase --components=main,universe,multiverse \
     noble $BUILD_DIR/rootfs \
     http://archive.ubuntu.com/ubuntu
 
+# Write root password hash to temp file for chroot to read
+# Use /root/ not /tmp/ because systemd installation may mount tmpfs over /tmp
+mkdir -p "$BUILD_DIR/rootfs/root"
+if [ -n "$ROOT_PW_HASH" ] && [ "$ROOT_PW_HASH" != "*" ]; then
+    echo "$ROOT_PW_HASH" > "$BUILD_DIR/rootfs/root/.pw_hash"
+    echo "Root password hash written to rootfs"
+else
+    echo "*" > "$BUILD_DIR/rootfs/root/.pw_hash"
+    echo "WARNING: No valid password hash, console login will be disabled"
+fi
+
 # Chroot and configure
 cat << 'CHROOT_SCRIPT' > $BUILD_DIR/rootfs/setup.sh
 #!/bin/bash
@@ -44,15 +80,28 @@ echo "keyboard-configuration keyboard-configuration/variant select Norwegian" |
 echo "locales locales/default_environment_locale select en_US.UTF-8" | debconf-set-selections
 echo "locales locales/locales_to_be_generated multiselect en_US.UTF-8 UTF-8, nb_NO.UTF-8 UTF-8" | debconf-set-selections
 
-# Update and upgrade
+# Enable noble-updates and noble-security for HWE kernel
+cat > /etc/apt/sources.list.d/ubuntu.sources << 'SOURCES'
+Types: deb
+URIs: http://archive.ubuntu.com/ubuntu
+Suites: noble noble-updates noble-backports
+Components: main universe multiverse
+Signed-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg
+
+Types: deb
+URIs: http://security.ubuntu.com/ubuntu
+Suites: noble-security
+Components: main universe multiverse
+Signed-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg
+SOURCES
+
 apt-get update
 apt-get upgrade -y
 
-# Install essential packages
+# Install HWE kernel (6.14+) for RTL8125BP XID 689 support
 apt-get install -y \
-    linux-image-generic \
+    linux-image-generic-hwe-24.04 \
     linux-firmware \
-    cloud-initramfs-rooturl \
     busybox-initramfs \
     initramfs-tools \
     keyboard-configuration \
@@ -81,13 +130,20 @@ apt-get install -y \
     conntrack \
     socat \
     ethtool \
-    nfs-common
+    nfs-common \
+    open-iscsi
 
 # Container runtime prerequisites
 apt-get install -y \
     containerd \
     runc
 
+# Vulkan drivers for GPU compute workloads (ollama, llama.cpp)
+apt-get install -y \
+    mesa-vulkan-drivers \
+    libvulkan1 \
+    vulkan-tools
+
 # Useful tools
 apt-get install -y \
     htop \
@@ -95,7 +151,11 @@ apt-get install -y \
     vim \
     less \
     rsync \
-    git
+    git \
+    squashfs-tools \
+    parted \
+    fdisk \
+    gdisk
 
 # Clean up
 apt-get clean
@@ -103,8 +163,9 @@ rm -rf /var/lib/apt/lists/*
 rm -rf /tmp/*
 rm -rf /var/tmp/*
 
-# Configure hostname (will be overridden by netplan)
-echo "k3s-node" > /etc/hostname
+# Don't set static hostname - let DHCP provide it via networkd
+# Empty /etc/hostname allows transient hostname from DHCP
+echo "" > /etc/hostname
 
 # Configure network with netplan
 cat > /etc/netplan/01-netcfg.yaml <<EOF
@@ -125,11 +186,19 @@ EOF
|
|||||||
systemctl enable systemd-networkd
|
systemctl enable systemd-networkd
|
||||||
systemctl enable systemd-resolved
|
systemctl enable systemd-resolved
|
||||||
|
|
||||||
# Configure SSH
|
# Configure SSH - disable socket activation, use traditional daemon
|
||||||
sed -i 's/#PermitRootLogin.*/PermitRootLogin prohibit-password/' /etc/ssh/sshd_config
|
sed -i 's/#PermitRootLogin.*/PermitRootLogin prohibit-password/' /etc/ssh/sshd_config
|
||||||
sed -i 's/#PubkeyAuthentication.*/PubkeyAuthentication yes/' /etc/ssh/sshd_config
|
sed -i 's/#PubkeyAuthentication.*/PubkeyAuthentication yes/' /etc/ssh/sshd_config
|
||||||
|
# Disable socket activation (Ubuntu 24.04 default) and use traditional sshd
|
||||||
|
systemctl disable ssh.socket 2>/dev/null || true
|
||||||
|
rm -f /etc/systemd/system/ssh.service.requires/ssh.socket 2>/dev/null || true
|
||||||
|
rm -f /etc/systemd/system/sockets.target.wants/ssh.socket 2>/dev/null || true
|
||||||
systemctl enable ssh
|
systemctl enable ssh
|
||||||
|
|
||||||
|
# Fix SSH host key permissions (must be 0600 for private keys, sshd refuses otherwise)
|
||||||
|
chmod 600 /etc/ssh/ssh_host_*_key
|
||||||
|
chmod 644 /etc/ssh/ssh_host_*_key.pub
|
||||||
|
|
||||||
# Create SSH directory for root
|
# Create SSH directory for root
|
||||||
mkdir -p /root/.ssh
|
mkdir -p /root/.ssh
|
||||||
chmod 700 /root/.ssh
|
chmod 700 /root/.ssh
|
||||||
@@ -142,8 +211,10 @@ SSHKEY
|
|||||||
|
|
||||||
chmod 600 /root/.ssh/authorized_keys
|
chmod 600 /root/.ssh/authorized_keys
|
||||||
|
|
||||||
# Disable password authentication completely
|
# Set root password from decrypted hash (for console login only)
|
||||||
echo "root:*" | chpasswd -e
|
ROOT_PW_HASH=$(cat /root/.pw_hash)
|
||||||
|
echo "root:$ROOT_PW_HASH" | chpasswd -e
|
||||||
|
rm -f /root/.pw_hash
|
||||||
|
|
||||||
# Configure tmpfs mounts for ephemeral data
|
# Configure tmpfs mounts for ephemeral data
|
||||||
cat >> /etc/fstab <<FSTAB
|
cat >> /etc/fstab <<FSTAB
|
||||||
@@ -202,21 +273,59 @@ umount -l $BUILD_DIR/rootfs/proc
|
|||||||
|
|
||||||
rm $BUILD_DIR/rootfs/setup.sh
|
rm $BUILD_DIR/rootfs/setup.sh
|
||||||
|
|
||||||
# Build initramfs with custom netboot hooks/scripts
|
# Copy custom initramfs config into rootfs while /proc is mounted
|
||||||
|
echo "Installing custom initramfs hooks and scripts..."
|
||||||
|
INITRAMFS_CONFIG="$SCRIPT_DIR/initramfs"
|
||||||
|
cp "$INITRAMFS_CONFIG/initramfs.conf" "$BUILD_DIR/rootfs/etc/initramfs-tools/"
|
||||||
|
cp "$INITRAMFS_CONFIG/modules" "$BUILD_DIR/rootfs/etc/initramfs-tools/"
|
||||||
|
cp -r "$INITRAMFS_CONFIG/hooks/"* "$BUILD_DIR/rootfs/usr/share/initramfs-tools/hooks/"
|
||||||
|
cp -r "$INITRAMFS_CONFIG/scripts/"* "$BUILD_DIR/rootfs/usr/share/initramfs-tools/scripts/"
|
||||||
|
|
||||||
|
# Install node storage setup service
|
||||||
|
echo "Installing node storage setup service..."
|
||||||
|
FILES_DIR="$SCRIPT_DIR/files"
|
||||||
|
cp "$FILES_DIR/setup-node-storage" "$BUILD_DIR/rootfs/usr/local/bin/"
|
||||||
|
chmod +x "$BUILD_DIR/rootfs/usr/local/bin/setup-node-storage"
|
||||||
|
cp "$FILES_DIR/setup-node-storage.service" "$BUILD_DIR/rootfs/etc/systemd/system/"
|
||||||
|
# Enable the service (create symlink manually since we can't run systemctl)
|
||||||
|
mkdir -p "$BUILD_DIR/rootfs/etc/systemd/system/multi-user.target.wants"
|
||||||
|
ln -sf /etc/systemd/system/setup-node-storage.service \
|
||||||
|
"$BUILD_DIR/rootfs/etc/systemd/system/multi-user.target.wants/setup-node-storage.service"
|
||||||
|
|
||||||
|
# Install DHCP hostname service
|
||||||
|
echo "Installing DHCP hostname service..."
|
||||||
|
cp "$FILES_DIR/set-hostname-from-dhcp" "$BUILD_DIR/rootfs/usr/local/bin/"
|
||||||
|
chmod +x "$BUILD_DIR/rootfs/usr/local/bin/set-hostname-from-dhcp"
|
||||||
|
cp "$FILES_DIR/set-hostname-from-dhcp.service" "$BUILD_DIR/rootfs/etc/systemd/system/"
|
||||||
|
ln -sf /etc/systemd/system/set-hostname-from-dhcp.service \
|
||||||
|
"$BUILD_DIR/rootfs/etc/systemd/system/multi-user.target.wants/set-hostname-from-dhcp.service"
|
||||||
|
|
||||||
|
# Download and install K3s binary
|
||||||
|
echo "Downloading K3s binary..."
|
||||||
|
K3S_VERSION="v1.34.3+k3s1"
|
||||||
|
curl -sfL "https://github.com/k3s-io/k3s/releases/download/${K3S_VERSION}/k3s" \
|
||||||
|
-o "$BUILD_DIR/rootfs/usr/local/bin/k3s"
|
||||||
|
chmod +x "$BUILD_DIR/rootfs/usr/local/bin/k3s"
|
||||||
|
echo "K3s $K3S_VERSION installed"
|
||||||
|
|
||||||
|
# Install K3s agent service
|
||||||
|
echo "Installing K3s agent service..."
|
||||||
|
# Create K3s directories first (will be bind-mounted from NVMe at runtime)
|
||||||
|
mkdir -p "$BUILD_DIR/rootfs/etc/rancher/k3s"
|
||||||
|
mkdir -p "$BUILD_DIR/rootfs/etc/rancher/node"
|
||||||
|
mkdir -p "$BUILD_DIR/rootfs/var/lib/rancher/k3s/agent"
|
||||||
|
cp "$FILES_DIR/k3s-agent.service" "$BUILD_DIR/rootfs/etc/systemd/system/"
|
||||||
|
cp "$FILES_DIR/k3s-agent.env" "$BUILD_DIR/rootfs/etc/rancher/k3s/"
|
||||||
|
# Enable the service
|
||||||
|
ln -sf /etc/systemd/system/k3s-agent.service \
|
||||||
|
"$BUILD_DIR/rootfs/etc/systemd/system/multi-user.target.wants/k3s-agent.service"
|
||||||
|
|
||||||
|
# Build initramfs while /proc/sys/dev are still mounted
|
||||||
echo "Building custom netboot initramfs..."
|
echo "Building custom netboot initramfs..."
|
||||||
KERNEL_VERSION=$(ls -1 $BUILD_DIR/rootfs/boot/vmlinuz-* | sed 's|.*/vmlinuz-||' | head -1)
|
KERNEL_VERSION=$(ls -1 $BUILD_DIR/rootfs/boot/vmlinuz-* | sed 's|.*/vmlinuz-||' | head -1)
|
||||||
|
chroot $BUILD_DIR/rootfs mkinitramfs -v -o /boot/initrd-netboot.img $KERNEL_VERSION
|
||||||
# Use mkinitramfs with custom initramfs directory
|
INITRAMFS_OUTPUT="$BUILD_DIR/rootfs/boot/initrd-netboot.img"
|
||||||
INITRAMFS_CONFIG="$SCRIPT_DIR/initramfs"
|
echo "Initramfs build complete. Size: $(du -h $INITRAMFS_OUTPUT | cut -f1)"
|
||||||
if [ -d "$INITRAMFS_CONFIG" ]; then
|
|
||||||
mkinitramfs -d "$INITRAMFS_CONFIG" \
|
|
||||||
-k "$KERNEL_VERSION" \
|
|
||||||
-o $BUILD_DIR/rootfs/boot/initrd-netboot.img
|
|
||||||
echo "Initramfs build complete. Size: $(du -h $BUILD_DIR/rootfs/boot/initrd-netboot.img | cut -f1)"
|
|
||||||
else
|
|
||||||
echo "ERROR: Custom initramfs config not found at $INITRAMFS_CONFIG"
|
|
||||||
exit 1
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Copy kernel and netboot initramfs
|
# Copy kernel and netboot initramfs
|
||||||
mkdir -p $IMAGE_DIR/$VERSION
|
mkdir -p $IMAGE_DIR/$VERSION
|
||||||
@@ -249,7 +358,14 @@ ln -sfn $VERSION $IMAGE_DIR/latest
|
|||||||
# Copy to HTTP directory
|
# Copy to HTTP directory
|
||||||
echo "Deploying to HTTP directory..."
|
echo "Deploying to HTTP directory..."
|
||||||
rsync -av $IMAGE_DIR/$VERSION/ $HTTP_DIR/
|
rsync -av $IMAGE_DIR/$VERSION/ $HTTP_DIR/
|
||||||
ln -sfn $VERSION $HTTP_DIR/latest
|
|
||||||
|
# Fix permissions for web server access
|
||||||
|
echo "Setting permissions for HTTP serving..."
|
||||||
|
chmod 644 $HTTP_DIR/vmlinuz
|
||||||
|
chmod 644 $HTTP_DIR/initrd-netboot.img
|
||||||
|
chmod 644 $HTTP_DIR/filesystem.squashfs
|
||||||
|
chmod 644 $HTTP_DIR/version.txt
|
||||||
|
|
||||||
echo "Build complete! Version: $VERSION"
|
echo "Build complete! Version: $VERSION"
|
||||||
echo "Files available at: $HTTP_DIR/"
|
echo "Files available at: $HTTP_DIR/"
|
||||||
|
ls -lh $HTTP_DIR/vmlinuz $HTTP_DIR/initrd-netboot.img $HTTP_DIR/filesystem.squashfs
|
||||||
|
|||||||
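The build script above enables each service by creating the `multi-user.target.wants` symlink by hand, because `systemctl` cannot be run against the offline rootfs. A minimal sketch of that pattern as one reusable function follows. The helper name `enable_unit` and the existence check are assumptions for illustration and are not part of `build-image.sh`; the sketch also assumes every unit is wanted by `multi-user.target`, as the three units in this diff are.

```shell
#!/bin/bash
# Hypothetical helper (not in build-image.sh): enable a systemd unit inside
# an offline rootfs by creating the wants/ symlink manually.
# Assumes the unit file was already copied to etc/systemd/system and that
# it is WantedBy=multi-user.target.
enable_unit() {
    local rootfs="$1" unit="$2"
    local unit_file="$rootfs/etc/systemd/system/$unit"
    local wants_dir="$rootfs/etc/systemd/system/multi-user.target.wants"

    if [ ! -f "$unit_file" ]; then
        echo "enable_unit: $unit_file not found" >&2
        return 1
    fi

    mkdir -p "$wants_dir"
    # The link target is the path as seen from inside the booted system,
    # not the path on the build host.
    ln -sf "/etc/systemd/system/$unit" "$wants_dir/$unit"
}
```

With such a helper, the three hand-written `ln -sf` calls could become `enable_unit "$BUILD_DIR/rootfs" k3s-agent.service` and so on, and a unit that was never copied would fail the build instead of leaving a dangling link.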
4
files/k3s-agent.env
Normal file
@@ -0,0 +1,4 @@
|
|||||||
|
# K3s agent configuration
|
||||||
|
# Server URL and token for cluster join
|
||||||
|
K3S_URL="https://192.168.100.1:6443"
|
||||||
|
K3S_TOKEN="K106e2ea6914f7a019d1222c1fdd19c5065978377364701f60eb1f2a585e8c3924b::server:0a15c4d7a13df65b066f5b8eff710ecd"
|
||||||
25
files/k3s-agent.service
Normal file
@@ -0,0 +1,25 @@
|
|||||||
|
[Unit]
|
||||||
|
Description=Lightweight Kubernetes (K3s Agent)
|
||||||
|
Documentation=https://k3s.io
|
||||||
|
After=network-online.target setup-node-storage.service set-hostname-from-dhcp.service
|
||||||
|
Wants=network-online.target
|
||||||
|
Requires=setup-node-storage.service set-hostname-from-dhcp.service
|
||||||
|
|
||||||
|
[Service]
|
||||||
|
Type=notify
|
||||||
|
EnvironmentFile=-/etc/rancher/k3s/k3s-agent.env
|
||||||
|
ExecStartPre=/sbin/modprobe br_netfilter
|
||||||
|
ExecStartPre=/sbin/modprobe overlay
|
||||||
|
ExecStart=/usr/local/bin/k3s agent
|
||||||
|
KillMode=process
|
||||||
|
Delegate=yes
|
||||||
|
LimitNOFILE=1048576
|
||||||
|
LimitNPROC=infinity
|
||||||
|
LimitCORE=infinity
|
||||||
|
TasksMax=infinity
|
||||||
|
TimeoutStartSec=0
|
||||||
|
Restart=always
|
||||||
|
RestartSec=5s
|
||||||
|
|
||||||
|
[Install]
|
||||||
|
WantedBy=multi-user.target
|
||||||
28
files/set-hostname-from-dhcp
Normal file
@@ -0,0 +1,28 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
# Set hostname from DHCP lease
|
||||||
|
# Runs before k3s-agent to ensure proper node name
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
log() { echo "[hostname] $*"; logger -t set-hostname "$*"; }
|
||||||
|
|
||||||
|
# Wait for DHCP lease
|
||||||
|
MAX_WAIT=60
|
||||||
|
for i in $(seq 1 $MAX_WAIT); do
|
||||||
|
# Check for lease files from systemd-networkd
|
||||||
|
for lease in /run/systemd/netif/leases/*; do
|
||||||
|
if [ -f "$lease" ]; then
|
||||||
|
HOSTNAME=$(grep -oP '^HOSTNAME=\K.*' "$lease" 2>/dev/null || true)
|
||||||
|
if [ -n "$HOSTNAME" ]; then
|
||||||
|
log "Found hostname in DHCP lease: $HOSTNAME"
|
||||||
|
hostnamectl set-hostname "$HOSTNAME"
|
||||||
|
log "Hostname set to: $(hostname)"
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
sleep 1
|
||||||
|
done
|
||||||
|
|
||||||
|
log "Warning: No DHCP hostname found after ${MAX_WAIT}s, using default"
|
||||||
|
exit 0
|
||||||
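The script above reads the `HOSTNAME=` key out of the systemd-networkd lease files. A small sketch of that extraction step in isolation may help when testing it away from a booted node. The function name `lease_hostname` is hypothetical, and `sed` is swapped in for the script's `grep -oP` so the sketch does not depend on PCRE support.

```shell
#!/bin/bash
# Hypothetical helper (not in set-hostname-from-dhcp): print the value of
# the HOSTNAME= key from one lease file, or nothing if the key is absent.
lease_hostname() {
    local lease="$1"
    # A missing file is treated the same as a lease without a hostname.
    [ -f "$lease" ] || return 0
    # Strip the key and print the value; keep only the first match.
    sed -n 's/^HOSTNAME=//p' "$lease" | head -n 1
}
```

It can be exercised against a hand-written file, for example one containing the line `HOSTNAME=node-a`, before relying on it under `/run/systemd/netif/leases/`.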
15
files/set-hostname-from-dhcp.service
Normal file
@@ -0,0 +1,15 @@
|
|||||||
|
[Unit]
|
||||||
|
Description=Set hostname from DHCP lease
|
||||||
|
Documentation=file:///usr/local/bin/set-hostname-from-dhcp
|
||||||
|
After=network-online.target systemd-networkd.service
|
||||||
|
Wants=network-online.target
|
||||||
|
Before=k3s-agent.service
|
||||||
|
|
||||||
|
[Service]
|
||||||
|
Type=oneshot
|
||||||
|
ExecStart=/usr/local/bin/set-hostname-from-dhcp
|
||||||
|
RemainAfterExit=yes
|
||||||
|
TimeoutStartSec=90
|
||||||
|
|
||||||
|
[Install]
|
||||||
|
WantedBy=multi-user.target
|
||||||
236
files/setup-node-storage
Normal file
@@ -0,0 +1,236 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
# Setup local NVMe storage for K3s node
|
||||||
|
# Runs at boot via systemd service
|
||||||
|
#
|
||||||
|
# Logic:
|
||||||
|
# - No NVMe: exit cleanly
|
||||||
|
# - No partition table: auto-format (new drive)
|
||||||
|
# - Has our labels: mount and exit (already configured)
|
||||||
|
# - Has other partitions: prompt with 120s timeout (safety)
|
||||||
|
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
DEVICE="/dev/nvme0n1"
|
||||||
|
CONTAINERD_SIZE="75GiB"
|
||||||
|
CONTAINERD_LABEL="containerd"
|
||||||
|
LONGHORN_LABEL="longhorn"
|
||||||
|
CONTAINERD_MOUNT="/var/lib/containerd"
|
||||||
|
LONGHORN_MOUNT="/var/lib/longhorn"
|
||||||
|
MARKER_FILE=".netboot-storage"
|
||||||
|
PROMPT_TIMEOUT=120
|
||||||
|
|
||||||
|
# Colors for console output
|
||||||
|
RED='\033[0;31m'
|
||||||
|
GREEN='\033[0;32m'
|
||||||
|
YELLOW='\033[1;33m'
|
||||||
|
CYAN='\033[0;36m'
|
||||||
|
NC='\033[0m'
|
||||||
|
|
||||||
|
# Log to both console and journald
|
||||||
|
log() { echo -e "${GREEN}[storage]${NC} $*"; logger -t setup-node-storage "$*"; }
|
||||||
|
warn() { echo -e "${YELLOW}[storage]${NC} $*"; logger -t setup-node-storage -p warning "$*"; }
|
||||||
|
error() { echo -e "${RED}[storage]${NC} $*"; logger -t setup-node-storage -p err "$*"; }
|
||||||
|
|
||||||
|
# Check if NVMe exists
|
||||||
|
if [ ! -b "$DEVICE" ]; then
|
||||||
|
log "No NVMe device found at $DEVICE - skipping storage setup"
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
DEVICE_SIZE=$(lsblk -b -d -n -o SIZE "$DEVICE" | awk '{printf "%.0fGB", $1/1000000000}')
|
||||||
|
log "Found NVMe: $DEVICE ($DEVICE_SIZE)"
|
||||||
|
|
||||||
|
# Get partition names (handles nvme naming with 'p' prefix)
|
||||||
|
if [[ "$DEVICE" == *"nvme"* ]]; then
|
||||||
|
PART1="${DEVICE}p1"
|
||||||
|
PART2="${DEVICE}p2"
|
||||||
|
else
|
||||||
|
PART1="${DEVICE}1"
|
||||||
|
PART2="${DEVICE}2"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Function to mount existing storage
|
||||||
|
mount_storage() {
|
||||||
|
log "Mounting existing storage..."
|
||||||
|
|
||||||
|
mkdir -p "$CONTAINERD_MOUNT" "$LONGHORN_MOUNT"
|
||||||
|
|
||||||
|
if ! mountpoint -q "$CONTAINERD_MOUNT"; then
|
||||||
|
mount -L "$CONTAINERD_LABEL" "$CONTAINERD_MOUNT" || {
|
||||||
|
error "Failed to mount containerd partition"
|
||||||
|
return 1
|
||||||
|
}
|
||||||
|
fi
|
||||||
|
|
||||||
|
if ! mountpoint -q "$LONGHORN_MOUNT"; then
|
||||||
|
mount -L "$LONGHORN_LABEL" "$LONGHORN_MOUNT" || {
|
||||||
|
error "Failed to mount longhorn partition"
|
||||||
|
return 1
|
||||||
|
}
|
||||||
|
fi
|
||||||
|
|
||||||
|
# K3s persistence: bind mount agent data and node identity from NVMe
|
||||||
|
# This allows the node to survive reboots without re-registering
|
||||||
|
setup_k3s_persistence
|
||||||
|
|
||||||
|
log "Storage mounted:"
|
||||||
|
log " $CONTAINERD_MOUNT: $(df -h "$CONTAINERD_MOUNT" | tail -1 | awk '{print $2}')"
|
||||||
|
log " $LONGHORN_MOUNT: $(df -h "$LONGHORN_MOUNT" | tail -1 | awk '{print $2}')"
|
||||||
|
return 0
|
||||||
|
}
|
||||||
|
|
||||||
|
# Setup K3s persistence directories
|
||||||
|
# Bind mounts NVMe directories to k3s paths so node identity survives reboots
|
||||||
|
setup_k3s_persistence() {
|
||||||
|
# K3s agent data (containerd, kubelet certs, etc.)
|
||||||
|
# Uses overlayfs internally, so must be on real filesystem, not overlay
|
||||||
|
K3S_AGENT="/var/lib/rancher/k3s/agent"
|
||||||
|
K3S_AGENT_DATA="$CONTAINERD_MOUNT/k3s-agent"
|
||||||
|
mkdir -p "$K3S_AGENT_DATA" "$K3S_AGENT"
|
||||||
|
if ! mountpoint -q "$K3S_AGENT"; then
|
||||||
|
mount --bind "$K3S_AGENT_DATA" "$K3S_AGENT"
|
||||||
|
log " $K3S_AGENT: bind mount to NVMe"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# K3s node identity (password file)
|
||||||
|
# Must persist across reboots or node will be rejected
|
||||||
|
K3S_NODE="/etc/rancher/node"
|
||||||
|
K3S_NODE_DATA="$CONTAINERD_MOUNT/k3s-node"
|
||||||
|
mkdir -p "$K3S_NODE_DATA" "$K3S_NODE"
|
||||||
|
if ! mountpoint -q "$K3S_NODE"; then
|
||||||
|
mount --bind "$K3S_NODE_DATA" "$K3S_NODE"
|
||||||
|
log " $K3S_NODE: bind mount to NVMe"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Kubelet data (pod volumes, projected tokens, etc.)
|
||||||
|
# Must be on NVMe so kubelet reports real disk capacity, not the 2G tmpfs overlay
|
||||||
|
KUBELET_DIR="/var/lib/kubelet"
|
||||||
|
KUBELET_DATA="$CONTAINERD_MOUNT/kubelet"
|
||||||
|
mkdir -p "$KUBELET_DATA" "$KUBELET_DIR"
|
||||||
|
if ! mountpoint -q "$KUBELET_DIR"; then
|
||||||
|
mount --bind "$KUBELET_DATA" "$KUBELET_DIR"
|
||||||
|
log " $KUBELET_DIR: bind mount to NVMe"
|
||||||
|
fi
|
||||||
|
}
|
||||||
|
|
||||||
|
# Function to format the drive
|
||||||
|
format_storage() {
|
||||||
|
log "Partitioning $DEVICE..."
|
||||||
|
|
||||||
|
wipefs -af "$DEVICE"
|
||||||
|
parted -s "$DEVICE" mklabel gpt
|
||||||
|
parted -s "$DEVICE" mkpart primary ext4 1MiB "$CONTAINERD_SIZE"
|
||||||
|
parted -s "$DEVICE" mkpart primary ext4 "$CONTAINERD_SIZE" 100%
|
||||||
|
|
||||||
|
# Tell kernel to re-read partition table and wait for udev
|
||||||
|
partprobe "$DEVICE"
|
||||||
|
udevadm settle --timeout=10
|
||||||
|
|
||||||
|
# Verify partitions appeared
|
||||||
|
if [ ! -b "$PART1" ] || [ ! -b "$PART2" ]; then
|
||||||
|
error "Partitions not found after partprobe: $PART1, $PART2"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
log "Formatting ${PART1} as ext4 (containerd, ${CONTAINERD_SIZE})..."
|
||||||
|
mkfs.ext4 -L "$CONTAINERD_LABEL" -q "$PART1"
|
||||||
|
|
||||||
|
log "Formatting ${PART2} as ext4 (longhorn, remaining)..."
|
||||||
|
mkfs.ext4 -L "$LONGHORN_LABEL" -q "$PART2"
|
||||||
|
|
||||||
|
# Mount the new partitions
|
||||||
|
mkdir -p "$CONTAINERD_MOUNT" "$LONGHORN_MOUNT"
|
||||||
|
mount "$PART1" "$CONTAINERD_MOUNT"
|
||||||
|
mount "$PART2" "$LONGHORN_MOUNT"
|
||||||
|
|
||||||
|
# Create marker files with metadata
|
||||||
|
for mount_point in "$CONTAINERD_MOUNT" "$LONGHORN_MOUNT"; do
|
||||||
|
cat > "${mount_point}/${MARKER_FILE}" <<EOF
|
||||||
|
# Netboot storage marker - DO NOT DELETE
|
||||||
|
formatted_date=$(date -Iseconds)
|
||||||
|
formatted_by=setup-node-storage
|
||||||
|
hostname=$(hostname)
|
||||||
|
device=$DEVICE
|
||||||
|
EOF
|
||||||
|
done
|
||||||
|
|
||||||
|
# K3s persistence: bind mount agent data and node identity from NVMe
|
||||||
|
setup_k3s_persistence
|
||||||
|
|
||||||
|
log "Storage formatted and mounted successfully"
|
||||||
|
log " $CONTAINERD_MOUNT: $(df -h "$CONTAINERD_MOUNT" | tail -1 | awk '{print $2}')"
|
||||||
|
log " $LONGHORN_MOUNT: $(df -h "$LONGHORN_MOUNT" | tail -1 | awk '{print $2}')"
|
||||||
|
}
|
||||||
|
|
||||||
|
# Check for partition table
|
||||||
|
# Method 1: blkid returns empty PTTYPE for unpartitioned drives
|
||||||
|
# Method 2: parted exit status (fallback when blkid reports no PTTYPE)
|
||||||
|
has_partition_table() {
|
||||||
|
local pttype
|
||||||
|
pttype=$(blkid -o value -s PTTYPE "$DEVICE" 2>/dev/null)
|
||||||
|
if [ -n "$pttype" ]; then
|
||||||
|
return 0 # has partition table
|
||||||
|
fi
|
||||||
|
# Fallback: check if parted can read it
|
||||||
|
if parted -s "$DEVICE" print &>/dev/null; then
|
||||||
|
return 0 # has partition table
|
||||||
|
fi
|
||||||
|
return 1 # no partition table
|
||||||
|
}
|
||||||
|
|
||||||
|
if ! has_partition_table; then
|
||||||
|
# No partition table - this is a fresh drive, auto-format
|
||||||
|
log "Empty drive detected (no partition table) - auto-formatting..."
|
||||||
|
format_storage
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Has partition table - check if it's ours
|
||||||
|
if blkid -L "$CONTAINERD_LABEL" &>/dev/null && blkid -L "$LONGHORN_LABEL" &>/dev/null; then
|
||||||
|
# Check for marker file (belt and suspenders)
|
||||||
|
# Create temp mount to check marker without leaving dangling mount
|
||||||
|
TEMP_MOUNT=$(mktemp -d)
|
||||||
|
if mount -L "$CONTAINERD_LABEL" "$TEMP_MOUNT" 2>/dev/null; then
|
||||||
|
if [ -f "${TEMP_MOUNT}/${MARKER_FILE}" ]; then
|
||||||
|
umount "$TEMP_MOUNT"
|
||||||
|
rmdir "$TEMP_MOUNT"
|
||||||
|
log "Storage already configured (found labels and marker)"
|
||||||
|
mount_storage
|
||||||
|
exit 0
|
||||||
|
else
|
||||||
|
umount "$TEMP_MOUNT"
|
||||||
|
rmdir "$TEMP_MOUNT"
|
||||||
|
# Has our labels but no marker - probably ours, mount it
|
||||||
|
warn "Found labels but no marker file - assuming configured"
|
||||||
|
mount_storage
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
rmdir "$TEMP_MOUNT" 2>/dev/null || true
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Has partitions but not ours - this could contain data!
|
||||||
|
warn "NVMe has existing partitions but no netboot labels."
|
||||||
|
warn "This drive may contain important data!"
|
||||||
|
echo ""
|
||||||
|
lsblk "$DEVICE"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Prompt on console with timeout
|
||||||
|
echo -e "${CYAN}========================================${NC}"
|
||||||
|
echo -e "${CYAN} Press ENTER within ${PROMPT_TIMEOUT}s to format ${NC}"
|
||||||
|
echo -e "${CYAN} Or wait to skip (safe default) ${NC}"
|
||||||
|
echo -e "${CYAN}========================================${NC}"
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
if read -t "$PROMPT_TIMEOUT" -p "Format $DEVICE? [press ENTER to confirm] " response; then
|
||||||
|
echo ""
|
||||||
|
warn "Formatting in 5 seconds... Ctrl+C to abort"
|
||||||
|
sleep 5
|
||||||
|
format_storage
|
||||||
|
else
|
||||||
|
echo ""
|
||||||
|
warn "Timeout - skipping storage setup (drive left untouched)"
|
||||||
|
warn "To format manually, reboot and press ENTER when prompted"
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
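The header comment of `setup-node-storage` lists four outcomes (skip, auto-format, mount, prompt). The sketch below restates that decision table as a pure function so it can be checked without touching a disk. The name `storage_action` and the `yes`/`no` arguments are assumptions for illustration. It also simplifies one edge of the real script: there, a drive with both labels whose temporary mount fails falls through to the prompt.

```shell
#!/bin/bash
# Hypothetical pure function (not in setup-node-storage): map the three
# probes the script performs onto the action it takes.
# Each argument is the literal string "yes" or "no".
storage_action() {
    local has_device="$1" has_table="$2" has_labels="$3"

    if [ "$has_device" != yes ]; then
        echo skip      # no NVMe block device: exit cleanly
    elif [ "$has_table" != yes ]; then
        echo format    # no partition table: treat as a new drive
    elif [ "$has_labels" = yes ]; then
        echo mount     # both labels found: already configured
    else
        echo prompt    # foreign partitions: ask, default is to leave untouched
    fi
}
```

For example, `storage_action yes yes no` prints `prompt`, which is the only path where existing data could be at risk and the only one that waits for a person at the console.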
26
files/setup-node-storage.service
Normal file
@@ -0,0 +1,26 @@
|
|||||||
|
[Unit]
|
||||||
|
Description=Setup local NVMe storage for K3s
|
||||||
|
Documentation=file:///usr/local/bin/setup-node-storage
|
||||||
|
|
||||||
|
# Run early, after devices are available but before container services
|
||||||
|
After=local-fs.target systemd-udevd.service
|
||||||
|
Before=containerd.service
|
||||||
|
|
||||||
|
# Only run if not already mounted
|
||||||
|
ConditionPathIsMountPoint=!/var/lib/containerd
|
||||||
|
|
||||||
|
[Service]
|
||||||
|
Type=oneshot
|
||||||
|
ExecStart=/usr/local/bin/setup-node-storage
|
||||||
|
RemainAfterExit=yes
|
||||||
|
|
||||||
|
# Console access for interactive prompt
|
||||||
|
StandardInput=tty
|
||||||
|
TTYPath=/dev/tty1
|
||||||
|
TTYReset=yes
|
||||||
|
|
||||||
|
# Generous timeout for user interaction (3 minutes)
|
||||||
|
TimeoutStartSec=180
|
||||||
|
|
||||||
|
[Install]
|
||||||
|
WantedBy=multi-user.target
|
||||||
@@ -1,10 +1,5 @@
|
|||||||
#!ipxe
|
#!ipxe
|
||||||
echo Booting Ubuntu Noble K3s Node via iPXE
|
echo Configuring network via DHCP...
|
||||||
echo Loading kernel from http://192.168.100.1:8800/vmlinuz
|
dhcp
|
||||||
kernel --name vmlinuz http://192.168.100.1:8800/vmlinuz
|
echo Chaining to dynamic boot script...
|
||||||
echo Loading initramfs from http://192.168.100.1:8800/initrd-netboot.img
|
chain http://192.168.100.1:8800/netboot.ipxe
|
||||||
initrd --name initrd http://192.168.100.1:8800/initrd-netboot.img
|
|
||||||
echo Setting kernel arguments for HTTP root mounting
|
|
||||||
imgargs vmlinuz root=http://192.168.100.1:8800/filesystem.squashfs rootfstype=squashfs overlayroot=tmpfs ip=dhcp console=ttyS0,115200 earlyprintk=ttyS0,115200 loglevel=7
|
|
||||||
echo Booting system...
|
|
||||||
boot vmlinuz
|
|
||||||
|
|||||||
@@ -68,3 +68,6 @@ FSTYPE=auto
|
|||||||
|
|
||||||
# Disable hibernation
|
# Disable hibernation
|
||||||
RESUME=none
|
RESUME=none
|
||||||
|
|
||||||
|
# Use custom netboot script for HTTP root mounting
|
||||||
|
BOOT=netboot
|
||||||
|
|||||||
@@ -1,147 +1,85 @@
|
|||||||
#!/bin/sh
|
#!/bin/sh
|
||||||
|
# Netboot HTTP root mounting - HARDCODED VALUES - no cmdline parsing
|
||||||
# Import standard initramfs functions
|
|
||||||
. /scripts/functions
|
|
||||||
|
|
||||||
export PATH=/usr/bin:/usr/sbin:/bin:/sbin
|
export PATH=/usr/bin:/usr/sbin:/bin:/sbin
|
||||||
MOUNTPOINT=/root
|
|
||||||
TMPFS_MOUNT=/mnt
|
|
||||||
|
|
||||||
# Parse kernel command line for HTTP root
|
# HARDCODED CONFIGURATION
|
||||||
parse_cmdline() {
|
ROOT_URL="http://192.168.100.1:8800/filesystem.squashfs"
|
||||||
for x in $(cat /proc/cmdline); do
|
OVERLAYROOT="tmpfs"
|
||||||
case $x in
|
MOUNTPOINT=/root
|
||||||
root=http://*)
|
SQUASHFS_MOUNT=/mnt/squashfs
|
||||||
export ROOT_URL="${x#root=}" ;;
|
OVERLAY_TMPFS=/mnt/overlay
|
||||||
rootfstype=*)
|
|
||||||
export ROOTFSTYPE="${x#rootfstype=}" ;;
|
# Debug logging to console
|
||||||
overlayroot=*)
|
log() {
|
||||||
export OVERLAYROOT="${x#overlayroot=}" ;;
|
echo "$@" > /dev/console 2>&1
|
||||||
ip=*)
|
|
||||||
export BOOTIP="${x#ip=}" ;;
|
|
||||||
*)
|
|
||||||
: ;;
|
|
||||||
esac
|
|
||||||
done
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
# Minimal hook functions
|
||||||
|
netboot_top() { :; }
|
||||||
|
netboot_premount() { :; }
|
||||||
|
netboot_bottom() { :; }
|
||||||
|
mount_top() { :; }
|
||||||
|
mount_premount() { :; }
|
||||||
|
mount_bottom() { :; }
|
||||||
|
|
||||||
mountroot() {
|
mountroot() {
|
||||||
rc=1
|
log "NETBOOT: ========================================"
|
||||||
parse_cmdline
|
log "NETBOOT: mountroot() HARDCODED VERSION"
|
||||||
|
log "NETBOOT: ROOT_URL=${ROOT_URL}"
|
||||||
|
log "NETBOOT: ========================================"
|
||||||
|
|
||||||
if test -z "${ROOT_URL}"; then
|
# Load network module
|
||||||
log_failure_msg "No root URL defined (root=http://... not found)"
|
/sbin/modprobe af_packet
|
||||||
return ${rc}
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Configure networking before attempting downloads
|
# Wait for udev
|
||||||
log_begin_msg "Configuring network"
|
wait_for_udev 10
|
||||||
modprobe af_packet || log_warning_msg "af_packet load failed"
|
|
||||||
|
|
||||||
# Load RTL8125 driver (already in module list but explicit load for debugging)
|
|
||||||
modprobe r8125 || log_warning_msg "r8125 driver load failed, may use generic driver"
|
|
||||||
|
|
||||||
|
# Configure networking via DHCP
|
||||||
|
log "NETBOOT: Calling configure_networking..."
|
||||||
configure_networking
|
configure_networking
|
||||||
udevadm trigger
|
|
||||||
timeout 30 udevadm settle || log_warning_msg "udevadm settle timed out"
|
|
||||||
export DEVICE
|
|
||||||
log_end_msg
|
|
||||||
|
|
||||||
# Validate networking is up
|
# Check we got an IP
|
||||||
INTERFACE_UP=0
|
log "NETBOOT: Checking for IP address..."
|
||||||
for iface in $(ip link show | grep "^[0-9]" | awk -F: '{print $2}' | tr -d ' '); do
|
if ! ip addr show | grep -q "inet "; then
|
||||||
if ip addr show "$iface" | grep -q "inet "; then
|
log "NETBOOT: FATAL - no IP address"
|
||||||
INTERFACE_UP=1
|
return 1
|
||||||
log_begin_msg "Interface $iface has IP address"
|
|
||||||
ip addr show "$iface" | grep "inet " | awk '{print $2}'
|
|
||||||
break
|
|
||||||
fi
|
fi
|
||||||
done
|
log "NETBOOT: Network is up"
|
||||||
|
|
||||||
if [ $INTERFACE_UP -eq 0 ]; then
|
# Download squashfs
|
||||||
log_failure_msg "No network interface obtained an IP address"
|
log "NETBOOT: Downloading ${ROOT_URL}..."
|
||||||
return ${rc}
|
if ! wget -O /filesystem.squashfs "${ROOT_URL}"; then
|
||||||
|
log "NETBOOT: FATAL - wget failed"
|
||||||
|
return 1
|
||||||
|
fi
|
||||||
|
log "NETBOOT: Download complete"
|
||||||
|
|
||||||
|
# Create mount points
|
||||||
|
mkdir -p "${SQUASHFS_MOUNT}" "${OVERLAY_TMPFS}"
|
||||||
|
|
||||||
|
# Mount squashfs
|
||||||
|
log "NETBOOT: Mounting squashfs..."
|
||||||
|
if ! mount -t squashfs /filesystem.squashfs "${SQUASHFS_MOUNT}" -o ro; then
|
||||||
|
log "NETBOOT: FATAL - squashfs mount failed"
|
||||||
|
return 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Extract filename from URL
|
# Mount tmpfs for overlay
|
||||||
FILE_NAME=$(basename "${ROOT_URL}")
|
log "NETBOOT: Mounting tmpfs..."
|
||||||
FILE_PATH="/${FILE_NAME}"
|
if ! mount -t tmpfs -o size=2G tmpfs "${OVERLAY_TMPFS}"; then
|
||||||
|
log "NETBOOT: FATAL - tmpfs mount failed"
|
||||||
|
return 1
|
||||||
|
fi
|
||||||
|
mkdir -p "${OVERLAY_TMPFS}/upper" "${OVERLAY_TMPFS}/work"
|
||||||
|
|
||||||
# Download the root filesystem with retries and timeouts
|
# Mount overlay
|
||||||
log_begin_msg "Downloading root filesystem from ${ROOT_URL}"
|
log "NETBOOT: Mounting overlay..."
|
||||||
if wget --timeout=30 --tries=3 --waitretry=5 \
|
if ! mount -t overlay -o "lowerdir=${SQUASHFS_MOUNT},upperdir=${OVERLAY_TMPFS}/upper,workdir=${OVERLAY_TMPFS}/work" overlay "${MOUNTPOINT}"; then
|
||||||
--progress=dot:mega \
|
log "NETBOOT: FATAL - overlay mount failed"
|
||||||
"${ROOT_URL}" -O "${FILE_PATH}"; then
|
return 1
|
||||||
log_end_msg
|
|
||||||
else
|
|
||||||
log_failure_msg "Failed to download from ${ROOT_URL} after retries"
|
|
||||||
rm -f "${FILE_PATH}"
|
|
||||||
return ${rc}
|
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Verify the downloaded file is a valid SquashFS
|
log "NETBOOT: SUCCESS - root mounted at ${MOUNTPOINT}"
|
||||||
if ! file "${FILE_PATH}" | grep -q "Squash"; then
|
return 0
|
||||||
log_failure_msg "Downloaded file is not a valid SquashFS image"
|
|
||||||
rm -f "${FILE_PATH}"
|
|
||||||
return ${rc}
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Handle SquashFS images with overlay
|
|
||||||
if echo "${FILE_NAME}" | grep -q squashfs; then
|
|
||||||
log_begin_msg "Setting up SquashFS with overlay"
|
|
||||||
|
|
||||||
# Mount read-only SquashFS
|
|
||||||
if ! mount -t squashfs "${FILE_PATH}" "${MOUNTPOINT}" -o ro; then
|
|
||||||
log_failure_msg "Failed to mount SquashFS at ${MOUNTPOINT}"
|
|
||||||
rm -f "${FILE_PATH}"
|
|
||||||
return ${rc}
|
|
||||||
fi
|
|
||||||
log_begin_msg "SquashFS mounted at ${MOUNTPOINT}"
|
|
||||||
log_end_msg
|
|
||||||
|
|
||||||
# Setup overlay if requested
|
|
||||||
if [ -n "${OVERLAYROOT}" ]; then
|
|
||||||
log_begin_msg "Mounting ${OVERLAYROOT} for overlay"
|
|
||||||
|
|
||||||
# Create tmpfs for upper and work directories
|
|
||||||
if ! mount -o size=2G -t "${OVERLAYROOT}" tmpfs_overlay "${TMPFS_MOUNT}"; then
|
|
||||||
log_failure_msg "Failed to mount tmpfs for overlay"
|
|
||||||
umount "${MOUNTPOINT}"
|
|
||||||
rm -f "${FILE_PATH}"
|
|
||||||
return ${rc}
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Create overlay structure
|
|
||||||
mkdir -p "${TMPFS_MOUNT}/upper" "${TMPFS_MOUNT}/work"
|
|
||||||
|
|
||||||
# Mount overlay combining read-only lower + writable upper
|
|
||||||
if ! mount -t overlay \
|
|
||||||
-o "lowerdir=${MOUNTPOINT},upperdir=${TMPFS_MOUNT}/upper,workdir=${TMPFS_MOUNT}/work" \
|
|
||||||
overlay_root "${MOUNTPOINT}"; then
|
|
||||||
log_failure_msg "Failed to mount overlay filesystem"
|
|
||||||
umount "${TMPFS_MOUNT}"
|
|
||||||
umount "${MOUNTPOINT}"
|
|
||||||
rm -f "${FILE_PATH}"
|
|
||||||
return ${rc}
|
|
||||||
fi
|
|
||||||
|
|
||||||
log_end_msg
|
|
||||||
log_begin_msg "Overlay mounted successfully"
|
|
||||||
log_end_msg
|
|
||||||
|
|
||||||
# Clean up downloaded image as it's now mounted
|
|
||||||
rm -f "${FILE_PATH}"
|
|
||||||
rc=0
|
|
||||||
else
|
|
||||||
# Direct SquashFS mount without overlay
|
|
||||||
log_begin_msg "Mounted SquashFS without overlay"
|
|
||||||
log_end_msg
|
|
||||||
rc=0
|
|
||||||
fi
|
|
||||||
else
|
|
||||||
log_failure_msg "Unknown filesystem type: ${FILE_NAME}"
|
|
||||||
rm -f "${FILE_PATH}"
|
|
||||||
fi
|
|
||||||
|
|
||||||
return ${rc}
|
|
||||||
}
|
}
|
||||||
|
|||||||
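The rewritten `mountroot()` above calls `wget` once, where the previous version passed `--tries=3 --waitretry=5`. If the `wget` inside the initramfs is the BusyBox applet, which to my knowledge does not accept those GNU options, retries would have to be done in shell. A minimal sketch under that assumption follows. The function name `retry` is hypothetical, and the variables are global because plain POSIX `sh` has no `local`.

```shell
#!/bin/sh
# Hypothetical helper (not in the netboot script): run a command up to
# $1 times, sleeping $2 seconds between failed attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Possible use inside mountroot(), with the names from the script above:
#   retry 3 5 wget -O /filesystem.squashfs "${ROOT_URL}" || return 1
```

Three attempts with a five-second pause mirror the values of the removed GNU `wget` options; they are not tuned for this network.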
51
rebuild-initramfs.sh
Executable file
@@ -0,0 +1,51 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
# Rebuild initramfs with updated netboot script
|
||||||
|
set -e
|
||||||
|
|
||||||
|
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||||
|
BUILD_DIR="$SCRIPT_DIR/build"
|
||||||
|
ROOTFS="$BUILD_DIR/rootfs"
|
||||||
|
HTTP_DIR="$SCRIPT_DIR/http"
|
||||||
|
|
||||||
|
if [ ! -d "$ROOTFS" ]; then
|
||||||
|
echo "ERROR: Rootfs not found at $ROOTFS"
|
||||||
|
echo "Run build-image.sh first"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
echo "=== Copying updated initramfs scripts ==="
|
||||||
|
cp "$SCRIPT_DIR/initramfs/scripts/netboot" "$ROOTFS/usr/share/initramfs-tools/scripts/netboot"
|
||||||
|
cp "$SCRIPT_DIR/initramfs/hooks/netboot" "$ROOTFS/usr/share/initramfs-tools/hooks/netboot"
|
||||||
|
cp "$SCRIPT_DIR/initramfs/initramfs.conf" "$ROOTFS/etc/initramfs-tools/initramfs.conf"
|
||||||
|
cp "$SCRIPT_DIR/initramfs/modules" "$ROOTFS/etc/initramfs-tools/modules"
|
||||||
|
|
||||||
|
echo "=== Getting kernel version ==="
|
||||||
|
KVER=$(ls "$ROOTFS/lib/modules/" | head -1)
|
||||||
|
echo "Kernel version: $KVER"
|
||||||
|
|
||||||
|
cleanup() {
|
||||||
|
echo "=== Cleaning up mounts ==="
|
||||||
|
umount "$ROOTFS/proc" 2>/dev/null || true
|
||||||
|
umount "$ROOTFS/sys" 2>/dev/null || true
|
||||||
|
umount "$ROOTFS/dev" 2>/dev/null || true
|
||||||
|
}
|
||||||
|
# Install the trap before mounting so a failed mount still triggers cleanup
|
||||||
|
trap cleanup EXIT
|
||||||
|
|
||||||
|
echo "=== Mounting filesystems for chroot ==="
|
||||||
|
mount --bind /proc "$ROOTFS/proc"
|
||||||
|
mount --bind /sys "$ROOTFS/sys"
|
||||||
|
mount --bind /dev "$ROOTFS/dev"
|
||||||
|
|
||||||
|
echo "=== Rebuilding initramfs ==="
|
||||||
|
chroot "$ROOTFS" mkinitramfs -v -o /boot/initrd-netboot.img "$KVER"
|
||||||
|
|
||||||
|
echo "=== Copying to http directory ==="
|
||||||
|
cp "$ROOTFS/boot/initrd-netboot.img" "$HTTP_DIR/"
|
||||||
|
chmod 644 "$HTTP_DIR/initrd-netboot.img"
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "=== Done! ==="
|
||||||
|
echo "New initramfs: $HTTP_DIR/initrd-netboot.img"
|
||||||
|
ls -lh "$HTTP_DIR/initrd-netboot.img"
|
||||||
|
echo ""
|
||||||
|
echo "Run 'make deploy' to sync to NAS"
|
||||||
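The script picks the kernel version with `ls ... | head -1`, which returns the first name in lexical order. If the rootfs ever contains more than one kernel, that may not be the newest one. A minimal sketch of a version-aware selection, assuming GNU `find` and `sort` on the build host (the helper name `newest_kernel` is hypothetical):

```shell
#!/bin/bash
# Hypothetical helper: print the highest kernel version under a modules directory.
newest_kernel() {
    local modules_dir="$1"
    # List directory names only, sort them as versions, keep the last (highest).
    find "$modules_dir" -mindepth 1 -maxdepth 1 -type d -printf '%f\n' | sort -V | tail -n 1
}

# Example:
#   KVER=$(newest_kernel "$ROOTFS/lib/modules")
```

With a single installed kernel both approaches give the same result, so this only matters when old kernels are left behind in the rootfs.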
16
secrets/netboot.sops.yaml
Normal file
@@ -0,0 +1,16 @@
|
|||||||
|
root_password_hash: ENC[AES256_GCM,data:Oc1Kpg1S3NSG4dDoe0AiDmdWe4wdz9zSMn/WlTvURz3u62HcF9ddZh3yKbsXdc19WbGj/ZJa+MFzucgCg6ChT5OG2k4S+JuAVvRaNmB54XSjyIL2vDkambq8Pt4rg5rVxfv5H6uEd5IWUg==,iv:fO72qW/8JIWGubbfjZYsfhjL3XUq/7RbohGPd1avS+8=,tag:nXP7w2b49iYAcnWxM4WFlA==,type:str]
|
||||||
|
sops:
|
||||||
|
age:
|
||||||
|
- recipient: age1gausnystsln7fpenw7arw7x79xe22z697jnauj38npy0usayqqxqc7td2y
|
||||||
|
enc: |
|
||||||
|
-----BEGIN AGE ENCRYPTED FILE-----
|
||||||
|
YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBDS1VrWkNZTmswYlRrVXAv
|
||||||
|
ZC9FemRVWkc2bzlVL1BuQm9FaDlPVmVuVFZvCnUyb2xnaDdwQ3BsVkNmY0NxZktp
|
||||||
|
Zk9qSlZVZk16UUhhOHdGRFN1Zno1V3cKLS0tIHV6YXE1bFBHZjMyVVdMbVZEMXlW
|
||||||
|
YTN1RnJ3SjRkN21MYmhQK0hZZFB5Sk0KfxfMPUdJjZq/JDOE87oD2XBpQebvy0a5
|
||||||
|
IAI5tdpEzNP6tF4oqunmh15fPc61Q0C/5ev+uz0QyHhTlTI13lYpGg==
|
||||||
|
-----END AGE ENCRYPTED FILE-----
|
||||||
|
lastmodified: "2026-02-05T20:16:15Z"
|
||||||
|
mac: ENC[AES256_GCM,data:mTCLM3t35mMv9nLQHba65Gq3yAWnY4UKUDHEncMF22RnZKiVDaTMAV6tiaKGu7hHXdDu9fU/E7wPomR8pirGf6pJBUWxCflCe3Q3ZGK9/Aw3guz5ZD34H9nMaCjXME59r1rQdQdQlWP5aW4o+kqfD/bukFpW1HUY0YT8g8fqCpw=,iv:bG1M8Ghuc8JkMNQfODZ1FkMI/8Qs217xlN5ihDnz7hs=,tag:gCScQi1YYXFH4Xo/8Wq5+g==,type:str]
|
||||||
|
unencrypted_suffix: _unencrypted
|
||||||
|
version: 3.11.0
|
||||||
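The file above stores only `root_password_hash` in encrypted form. How the build consumes it is not shown here; one possible way to read the value, assuming the `sops` CLI is installed and the matching age private key is available, is sketched below. The helper name and the `SOPS_BIN` override are assumptions for illustration.

```shell
#!/bin/bash
# SOPS_BIN is an assumption that lets a different binary (or a test stub) be swapped in.
SOPS_BIN="${SOPS_BIN:-sops}"

# Hypothetical helper: print the decrypted root_password_hash from a SOPS file.
read_root_password_hash() {
    local secrets_file="$1"
    "$SOPS_BIN" --decrypt --extract '["root_password_hash"]' "$secrets_file"
}

# Example (requires the age private key, e.g. via SOPS_AGE_KEY_FILE):
#   HASH=$(read_root_password_hash secrets/netboot.sops.yaml)
```

Extracting a single key keeps the rest of the decrypted document out of shell variables and logs.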
BIN
tftp/ipxe.efi
Normal file
Binary file not shown.
266
verify-image.sh
Executable file
@@ -0,0 +1,266 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
# Verify netboot image build completeness and correctness
|
||||||
|
# Run this after 'make build' to validate the generated image
|
||||||
|
|
||||||
|
set -e
|
||||||
|
|
||||||
|
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||||
|
HTTP_DIR="$SCRIPT_DIR/http"
|
||||||
|
TMPDIR=$(mktemp -d)
|
||||||
|
EXIT_CODE=0
|
||||||
|
|
||||||
|
# Colors for output
|
||||||
|
RED='\033[0;31m'
|
||||||
|
GREEN='\033[0;32m'
|
||||||
|
YELLOW='\033[1;33m'
|
||||||
|
NC='\033[0m' # No Color
|
||||||
|
|
||||||
|
error() {
|
||||||
|
echo -e "${RED}✗${NC} $1"
|
||||||
|
EXIT_CODE=1
|
||||||
|
}
|
||||||
|
|
||||||
|
success() {
|
||||||
|
echo -e "${GREEN}✓${NC} $1"
|
||||||
|
}
|
||||||
|
|
||||||
|
warning() {
|
||||||
|
echo -e "${YELLOW}!${NC} $1"
|
||||||
|
}
|
||||||
|
|
||||||
|
info() {
|
||||||
|
echo " $1"
|
||||||
|
}
|
||||||
|
|
||||||
|
cleanup() {
|
||||||
|
rm -rf "$TMPDIR"
|
||||||
|
}
|
||||||
|
trap cleanup EXIT
|
||||||
|
|
||||||
|
echo "Verifying netboot image build..."
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 1: Required files exist
|
||||||
|
echo "Checking required files..."
|
||||||
|
for file in vmlinuz initrd-netboot.img filesystem.squashfs version.txt boot.ipxe; do
|
||||||
|
if [ -f "$HTTP_DIR/$file" ]; then
|
||||||
|
success "$file exists"
|
||||||
|
else
|
||||||
|
error "$file missing"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 2: File types are correct
|
||||||
|
echo "Checking file types..."
|
||||||
|
if file "$HTTP_DIR/vmlinuz" | grep -q "Linux kernel"; then
|
||||||
|
success "vmlinuz is a valid Linux kernel"
|
||||||
|
else
|
||||||
|
error "vmlinuz is not a valid kernel"
|
||||||
|
fi
|
||||||
|
|
||||||
|
if file "$HTTP_DIR/initrd-netboot.img" | grep -q "cpio archive"; then
|
||||||
|
success "initrd-netboot.img is a valid initramfs"
|
||||||
|
else
|
||||||
|
error "initrd-netboot.img is not a valid initramfs"
|
||||||
|
fi
|
||||||
|
|
||||||
|
if file "$HTTP_DIR/filesystem.squashfs" | grep -q "Squashfs filesystem"; then
|
||||||
|
success "filesystem.squashfs is a valid squashfs image"
|
||||||
|
else
|
||||||
|
error "filesystem.squashfs is not a valid squashfs"
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 3: File permissions for HTTP serving
|
||||||
|
echo "Checking file permissions..."
|
||||||
|
for file in vmlinuz initrd-netboot.img filesystem.squashfs; do
|
||||||
|
PERMS=$(stat -c "%a" "$HTTP_DIR/$file" 2>/dev/null || echo "n/a")
|
||||||
|
if [ "$PERMS" = "644" ]; then
|
||||||
|
success "$file has correct permissions (644)"
|
||||||
|
else
|
||||||
|
warning "$file has permissions $PERMS (expected 644)"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 4: File sizes are reasonable
|
||||||
|
echo "Checking file sizes..."
|
||||||
|
KERNEL_SIZE=$(stat -c %s "$HTTP_DIR/vmlinuz" 2>/dev/null || echo 0)
|
||||||
|
INITRD_SIZE=$(stat -c %s "$HTTP_DIR/initrd-netboot.img" 2>/dev/null || echo 0)
|
||||||
|
SQUASHFS_SIZE=$(stat -c %s "$HTTP_DIR/filesystem.squashfs" 2>/dev/null || echo 0)
|
||||||
|
|
||||||
|
if [ "$KERNEL_SIZE" -gt 5000000 ] && [ "$KERNEL_SIZE" -lt 50000000 ]; then
|
||||||
|
success "vmlinuz size reasonable: $(numfmt --to=iec-i --suffix=B $KERNEL_SIZE)"
|
||||||
|
else
|
||||||
|
warning "vmlinuz size unusual: $(numfmt --to=iec-i --suffix=B $KERNEL_SIZE)"
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [ "$INITRD_SIZE" -gt 10000000 ] && [ "$INITRD_SIZE" -lt 100000000 ]; then
|
||||||
|
success "initrd size reasonable: $(numfmt --to=iec-i --suffix=B $INITRD_SIZE)"
|
||||||
|
else
|
||||||
|
warning "initrd size unusual: $(numfmt --to=iec-i --suffix=B $INITRD_SIZE)"
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [ "$SQUASHFS_SIZE" -gt 500000000 ] && [ "$SQUASHFS_SIZE" -lt 2000000000 ]; then
|
||||||
|
success "squashfs size reasonable: $(numfmt --to=iec-i --suffix=B $SQUASHFS_SIZE)"
|
||||||
|
else
|
||||||
|
warning "squashfs size unusual: $(numfmt --to=iec-i --suffix=B $SQUASHFS_SIZE)"
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 5: Initramfs contains custom netboot script and binaries
|
||||||
|
echo "Checking initramfs contents..."
|
||||||
|
unmkinitramfs "$HTTP_DIR/initrd-netboot.img" "$TMPDIR/initramfs" 2>/dev/null || error "Failed to unpack initrd-netboot.img"
|
||||||
|
|
||||||
|
if [ -f "$TMPDIR/initramfs/main/scripts/netboot" ]; then
|
||||||
|
success "Custom netboot script present in initramfs"
|
||||||
|
|
||||||
|
# Verify it matches source
|
||||||
|
if diff -q "$SCRIPT_DIR/initramfs/scripts/netboot" "$TMPDIR/initramfs/main/scripts/netboot" >/dev/null 2>&1; then
|
||||||
|
success "netboot script matches source"
|
||||||
|
else
|
||||||
|
warning "netboot script differs from source"
|
||||||
|
fi
|
||||||
|
else
|
||||||
|
error "Custom netboot script missing from initramfs"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check required binaries
|
||||||
|
for binary in wget curl awk; do
|
||||||
|
if [ -f "$TMPDIR/initramfs/main/usr/bin/$binary" ]; then
|
||||||
|
success "$binary binary present"
|
||||||
|
else
|
||||||
|
error "$binary binary missing from initramfs"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
|
||||||
|
# Check unsquashfs in both possible locations
|
||||||
|
if [ -f "$TMPDIR/initramfs/main/usr/bin/unsquashfs" ] || [ -f "$TMPDIR/initramfs/main/usr/sbin/unsquashfs" ]; then
|
||||||
|
success "unsquashfs binary present"
|
||||||
|
else
|
||||||
|
error "unsquashfs binary missing from initramfs"
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [ -f "$TMPDIR/initramfs/main/usr/sbin/switch_root" ]; then
|
||||||
|
success "switch_root binary present"
|
||||||
|
else
|
||||||
|
error "switch_root binary missing from initramfs"
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 6: Required kernel modules configured
|
||||||
|
echo "Checking kernel modules configuration..."
|
||||||
|
if [ -f "$TMPDIR/initramfs/main/conf/modules" ]; then
|
||||||
|
MODULES_FILE="$TMPDIR/initramfs/main/conf/modules"
|
||||||
|
|
||||||
|
for module in squashfs overlay r8125 ext4 isofs af_packet nls_iso8859-1; do
|
||||||
|
if grep -q "^$module" "$MODULES_FILE"; then
|
||||||
|
success "Module $module configured"
|
||||||
|
else
|
||||||
|
error "Module $module not configured"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
else
|
||||||
|
warning "modules configuration file not found"
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 7: Squashfs root filesystem contents
|
||||||
|
echo "Checking squashfs root filesystem..."
|
||||||
|
|
||||||
|
# Check critical paths exist (handle usr-merge symlinks)
|
||||||
|
for path in boot etc root usr var; do
|
||||||
|
if unsquashfs -ll "$HTTP_DIR/filesystem.squashfs" | grep -q "squashfs-root/$path$"; then
|
||||||
|
success "/$path directory exists"
|
||||||
|
else
|
||||||
|
error "/$path directory missing"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
|
||||||
|
# Check bin and lib (may be symlinks due to usr-merge)
|
||||||
|
for path in bin lib; do
|
||||||
|
if unsquashfs -ll "$HTTP_DIR/filesystem.squashfs" | grep -qE "squashfs-root/$path( |->|$)"; then
|
||||||
|
success "/$path exists (directory or symlink)"
|
||||||
|
else
|
||||||
|
error "/$path missing"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
|
||||||
|
# Check SSH authorized_keys
|
||||||
|
if unsquashfs -cat "$HTTP_DIR/filesystem.squashfs" root/.ssh/authorized_keys >/dev/null 2>&1; then
|
||||||
|
KEY_COUNT=$(unsquashfs -cat "$HTTP_DIR/filesystem.squashfs" root/.ssh/authorized_keys 2>/dev/null | grep -cE "^(ssh-|ecdsa-|sk-)" || true)
|
||||||
|
success "SSH authorized_keys present ($KEY_COUNT keys)"
|
||||||
|
else
|
||||||
|
error "SSH authorized_keys missing or invalid"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check netplan config
|
||||||
|
if unsquashfs -cat "$HTTP_DIR/filesystem.squashfs" etc/netplan/01-netcfg.yaml >/dev/null 2>&1; then
|
||||||
|
success "Netplan configuration present"
|
||||||
|
else
|
||||||
|
error "Netplan configuration missing"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check K3s dependencies
|
||||||
|
for pkg in containerd runc; do
|
||||||
|
if unsquashfs -ll "$HTTP_DIR/filesystem.squashfs" | grep -q "usr/bin/$pkg"; then
|
||||||
|
success "K3s dependency: $pkg"
|
||||||
|
else
|
||||||
|
warning "K3s dependency missing: $pkg (may be in /usr/sbin)"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 8: Version info
|
||||||
|
echo "Build information:"
|
||||||
|
if [ -f "$HTTP_DIR/version.txt" ]; then
|
||||||
|
sed 's/^/ /' "$HTTP_DIR/version.txt"
|
||||||
|
else
|
||||||
|
warning "version.txt not found"
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Check 9: iPXE boot configuration
|
||||||
|
echo "Checking iPXE configuration..."
|
||||||
|
if [ -f "$HTTP_DIR/boot.ipxe" ]; then
|
||||||
|
success "boot.ipxe present"
|
||||||
|
|
||||||
|
# Extract server URL from boot.ipxe
|
||||||
|
SERVER_URL=$(grep "http://" "$HTTP_DIR/boot.ipxe" | head -1 | grep -oP 'http://[^/]+' || true)
|
||||||
|
if [ -n "$SERVER_URL" ]; then
|
||||||
|
info "Boot server configured: $SERVER_URL"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check if all required files are referenced
|
||||||
|
for file in vmlinuz initrd-netboot.img filesystem.squashfs; do
|
||||||
|
if grep -q "$file" "$HTTP_DIR/boot.ipxe"; then
|
||||||
|
success "boot.ipxe references $file"
|
||||||
|
else
|
||||||
|
error "boot.ipxe missing reference to $file"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
|
||||||
|
# Check for required kernel parameters
|
||||||
|
if grep -q "overlayroot=tmpfs" "$HTTP_DIR/boot.ipxe"; then
|
||||||
|
success "boot.ipxe configures overlayroot"
|
||||||
|
else
|
||||||
|
warning "boot.ipxe missing overlayroot parameter"
|
||||||
|
fi
|
||||||
|
else
|
||||||
|
error "boot.ipxe missing"
|
||||||
|
fi
|
||||||
|
echo ""
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
echo "========================================="
|
||||||
|
if [ $EXIT_CODE -eq 0 ]; then
|
||||||
|
echo -e "${GREEN}✓ All checks passed!${NC}"
|
||||||
|
echo "Image is ready for deployment."
|
||||||
|
else
|
||||||
|
echo -e "${RED}✗ Some checks failed${NC}"
|
||||||
|
echo "Review errors above before deploying."
|
||||||
|
fi
|
||||||
|
echo "========================================="
|
||||||
|
|
||||||
|
exit $EXIT_CODE
|
||||||
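Check 7 runs `unsquashfs -ll` once per path, and each run walks the whole image. A possible refinement is to save the listing once and match every path against the saved file. The sketch below assumes the default `squashfs-root/` prefix in the listing; the helper name `path_in_listing` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a top-level path appears in a saved `unsquashfs -ll` listing.
# Matches a directory entry (line ends with the path) or a symlink entry (path followed by " -> ").
path_in_listing() {
    local listing_file="$1" path="$2"
    grep -qE "squashfs-root/${path}( -> |\$)" "$listing_file"
}

# Example: list the image once, then run every check against the saved listing.
#   unsquashfs -ll "$HTTP_DIR/filesystem.squashfs" > "$TMPDIR/squashfs.lst"
#   path_in_listing "$TMPDIR/squashfs.lst" etc && success "/etc directory exists"
```

Because the listing is written under `$TMPDIR`, the existing `cleanup` trap would remove it along with the unpacked initramfs.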