Fix netboot initialization and add documentation tooling

- Add CLAUDE.md with project architecture and build documentation
- Add verify-image.sh script to validate generated netboot images
- Fix boot.ipxe kernel parameters:
  - Add boot=netboot to invoke custom initramfs script
  - Add console=tty0 for VGA output alongside serial console
  - Fix earlyprintk serial specification
- Remove dead symlink creation in build-image.sh (http/latest pointed to non-existent directory)

The boot=netboot parameter is critical: without it, the initramfs falls back to local boot
and fails with /dev/root errors. The console changes make boot messages visible on an
attached monitor as well as on the serial port.
2026-01-31 09:56:21 +01:00
parent adc92a61b4
commit a4fe05e26a
4 changed files with 405 additions and 2 deletions

CLAUDE.md Normal file

@@ -0,0 +1,145 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
This is a netboot system for diskless Ubuntu Noble (24.04) nodes designed for K3s clusters. It builds bootable images (kernel, initramfs, squashfs) that are served via HTTP and loaded using iPXE for network booting.
**Boot Flow:**
1. Machine PXE boots and loads iPXE from network
2. iPXE fetches and executes `boot.ipxe` configuration
3. iPXE downloads kernel (`vmlinuz`) and custom initramfs (`initrd-netboot.img`) over HTTP
4. Kernel boots with custom initramfs that downloads the squashfs root filesystem over HTTP
5. Root filesystem is mounted as read-only squashfs with writable overlay (tmpfs)
## Build Commands
```bash
# Build netboot image (15-30 minutes, requires sudo)
make build
# Builds: http/vmlinuz, http/initrd-netboot.img, http/filesystem.squashfs
# Deploy to NAS server
make deploy
# Syncs http/ directory to phoenix:/srv/netboot/http/
# Build and deploy in one step
make all
# Clean build artifacts (unmounts any stray filesystem mounts first)
make clean
# Check NAS connectivity
make check-nas
```
**Configuration:**
- `NAS_HOST=phoenix` - target server for deployment
- `NAS_PATH=/srv/netboot` - deployment path on NAS
- Edit these in `Makefile` if deployment target changes
## Architecture
### Build System
**build-image.sh** - Main build script that:
1. Creates Ubuntu Noble base system using `debootstrap`
2. Chroots into rootfs and installs packages (kernel, K3s prerequisites, container runtime, tools)
3. Configures system (networking via netplan, SSH keys, tmpfs mounts, services)
4. Builds custom initramfs using `mkinitramfs` with customizations from `initramfs/`
5. Creates compressed squashfs image of entire rootfs
6. Copies artifacts to `images/<VERSION>/` and `http/` directories
7. Sets proper file permissions (644) for HTTP serving
**Path handling:** All scripts use `SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"` to work from any location, not just hardcoded paths.
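A minimal sketch of how the `SCRIPT_DIR` idiom can anchor other paths. `HTTP_DIR` mirrors the variable used in `verify-image.sh`; `ROOTFS_DIR` is a hypothetical name used only for illustration, and the snippet assumes it runs under bash, since `BASH_SOURCE` is bash-specific.

```bash
#!/bin/bash
# Resolve the directory that contains this script, independent of the caller's cwd.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

# Derive project paths from SCRIPT_DIR instead of hardcoding them.
HTTP_DIR="$SCRIPT_DIR/http"
ROOTFS_DIR="$SCRIPT_DIR/build/rootfs"   # hypothetical variable name

echo "http dir:   $HTTP_DIR"
echo "rootfs dir: $ROOTFS_DIR"
```

Because `pwd` runs after the `cd`, `SCRIPT_DIR` is always an absolute path, so the derived paths stay valid even if the script later changes directory.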
### Custom Initramfs
Located in `initramfs/` directory, passed to `mkinitramfs` with `-d` flag:
- **initramfs.conf** - Configuration (MODULES=most, COMPRESS=gzip, RESUME=none)
- **modules** - Extra kernel modules to include (squashfs, overlay, r8125 network driver for 2.5GbE)
- **hooks/netboot** - Copies binaries into initramfs (wget, curl, unsquashfs, switch_root)
- **scripts/netboot** - Provides `mountroot()` function that:
  - Parses kernel cmdline for `root=http://...` URL and `overlayroot=tmpfs`
  - Configures networking via `configure_networking`
  - Downloads squashfs over HTTP using wget (with retries)
  - Validates downloaded file is squashfs
  - Mounts squashfs read-only
  - If `overlayroot=tmpfs`, creates overlay with tmpfs upper layer for writes
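The parsing and validation steps above can be sketched as follows. This is an illustration, not the contents of `initramfs/scripts/netboot`: `cmdline_value` and `is_squashfs` are hypothetical helper names, and the command line is passed in as a string here, whereas a real initramfs script would typically read `/proc/cmdline`.

```bash
#!/bin/sh
# Hypothetical helpers; the real logic lives in initramfs/scripts/netboot.

# Print the value of KEY=value from a kernel command line string.
# If the key appears more than once, the last occurrence wins.
# Usage: cmdline_value KEY "CMDLINE"
cmdline_value() {
    key="$1"
    for arg in $2; do
        case "$arg" in
            "$key"=*) printf '%s\n' "${arg#*=}" ;;
        esac
    done | tail -n 1
}

# Succeed if FILE starts with the squashfs magic bytes "hsqs".
is_squashfs() {
    [ "$(head -c 4 "$1" 2>/dev/null | tr -d '\0')" = "hsqs" ]
}

CMDLINE="boot=netboot root=http://192.168.100.1:8800/filesystem.squashfs rootfstype=squashfs overlayroot=tmpfs ip=dhcp"
ROOT_URL="$(cmdline_value root "$CMDLINE")"
OVERLAY="$(cmdline_value overlayroot "$CMDLINE")"
echo "root URL: $ROOT_URL"
echo "overlay:  $OVERLAY"
```

Matching on `root=` as a whole-word prefix keeps `rootfstype=squashfs` from being mistaken for the root URL.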
### iPXE Boot Configuration
**http/boot.ipxe** - iPXE script that:
- Loads kernel from `http://192.168.100.1:8800/vmlinuz`
- Loads initramfs from `http://192.168.100.1:8800/initrd-netboot.img`
- Sets kernel args: `boot=netboot root=http://192.168.100.1:8800/filesystem.squashfs rootfstype=squashfs overlayroot=tmpfs ip=dhcp console=tty0 console=ttyS0,115200`
- Boots the kernel
**IMPORTANT:** The HTTP server IP (192.168.100.1:8800) is hardcoded in boot.ipxe. Update this if the boot server changes.
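One possible way to update the address in a single pass is a small `sed` wrapper. `set_boot_server` is a hypothetical helper that does not exist in the repository, and `10.0.0.5:8080` is a made-up replacement address; the example works on a scratch copy rather than the tracked `http/boot.ipxe`.

```bash
#!/bin/sh
# Rewrite the boot server host:port in an iPXE script.
# set_boot_server is a hypothetical helper, not part of the repository.
# Usage: set_boot_server FILE OLD_HOSTPORT NEW_HOSTPORT
set_boot_server() {
    file="$1" old="$2" new="$3"
    # Escape dots so sed matches the old address literally.
    old_re="$(printf '%s' "$old" | sed 's/\./\\./g')"
    sed "s#http://${old_re}/#http://${new}/#g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example against a scratch copy (addresses below are illustrative).
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
kernel --name vmlinuz http://192.168.100.1:8800/vmlinuz
initrd --name initrd http://192.168.100.1:8800/initrd-netboot.img
imgargs vmlinuz boot=netboot root=http://192.168.100.1:8800/filesystem.squashfs ip=dhcp
EOF
set_boot_server "$tmp" 192.168.100.1:8800 10.0.0.5:8080
cat "$tmp"
```

Rewriting all three URLs together avoids the case where the kernel and initramfs load from the new server while `root=` still points at the old one.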
### System Configuration
Built systems are configured with:
- Norwegian keyboard layout, with nb_NO.UTF-8 and en_US.UTF-8 locales
- Root SSH access with specific authorized keys (see build-image.sh:138-141)
- Password auth disabled, pubkey only
- Network via netplan with DHCP (systemd-networkd)
- Ephemeral tmpfs mounts: /tmp (2G), /var/tmp (1G), /var/log (1G), /run (512M)
- systemd-journald configured for volatile storage (tmpfs, 256M max)
- K3s dependencies: apparmor, iptables, conntrack, containerd, runc
- No hibernation/resume support
### Utility Scripts
**chroot-rootfs.sh** - Enter chroot for manual tweaking
- Mounts proc/sys/dev into existing rootfs
- Cleanup trap unmounts on exit
- **Hardcoded path:** `/srv/netboot/build/rootfs` - update if repo moves
**rebuild-squashfs.sh** - Rebuild squashfs after manual changes
- Creates new versioned image from existing rootfs
- Skips full debootstrap/package installation
- **Hardcoded paths:** `/srv/netboot/*` - update if repo moves
## File Structure
```
.
├── build-image.sh           # Main build script
├── Makefile                 # Build/deploy automation
├── boot.ipxe                # iPXE boot configuration (in http/)
├── initramfs/               # Custom initramfs configuration
│   ├── initramfs.conf       # mkinitramfs config
│   ├── modules              # Extra kernel modules
│   ├── hooks/netboot        # Binary copying hook
│   └── scripts/netboot      # HTTP root mounting logic
├── chroot-rootfs.sh         # Chroot helper (hardcoded paths)
├── rebuild-squashfs.sh      # Rebuild helper (hardcoded paths)
├── build/                   # Build artifacts (gitignored)
│   └── rootfs/              # debootstrap rootfs
├── images/                  # Versioned builds (gitignored)
│   ├── <YYYYMMDD-HHMM>/     # Timestamped builds
│   └── latest -> <VERSION>  # Symlink to latest
└── http/                    # HTTP serving directory (gitignored except boot.ipxe)
    ├── boot.ipxe            # iPXE config (tracked)
    ├── vmlinuz              # Kernel (generated)
    ├── initrd-netboot.img   # Custom initramfs (generated)
    ├── filesystem.squashfs  # Root filesystem (generated)
    └── version.txt          # Build metadata (generated)
```
## Development Notes
**Hardcoded paths issue:** `chroot-rootfs.sh` and `rebuild-squashfs.sh` use hardcoded `/srv/netboot/` paths instead of dynamic path detection like `build-image.sh`. They need updating if repo is cloned elsewhere.
**Build requirements:**
- Ubuntu/Debian host with debootstrap, mkinitramfs, mksquashfs
- Sudo access for chroot operations and filesystem mounting
- 15-30 minute build time
- ~1GB disk space for build artifacts
**SSH key management:** Root SSH keys are embedded in build-image.sh:138-141. Update these before building images for new environments.
**Network driver:** RTL8125 (r8125) driver is explicitly loaded in initramfs for 2.5GbE NICs. If different NICs are used, update `initramfs/modules` and `initramfs/scripts/netboot`.

build-image.sh

@@ -257,7 +257,6 @@ ln -sfn $VERSION $IMAGE_DIR/latest
# Copy to HTTP directory
echo "Deploying to HTTP directory..."
rsync -av $IMAGE_DIR/$VERSION/ $HTTP_DIR/
-ln -sfn $VERSION $HTTP_DIR/latest
# Fix permissions for web server access
echo "Setting permissions for HTTP serving..."

http/boot.ipxe

@@ -5,6 +5,6 @@ kernel --name vmlinuz http://192.168.100.1:8800/vmlinuz
echo Loading initramfs from http://192.168.100.1:8800/initrd-netboot.img
initrd --name initrd http://192.168.100.1:8800/initrd-netboot.img
echo Setting kernel arguments for HTTP root mounting
-imgargs vmlinuz root=http://192.168.100.1:8800/filesystem.squashfs rootfstype=squashfs overlayroot=tmpfs ip=dhcp console=ttyS0,115200 earlyprintk=ttyS0,115200 loglevel=7
+imgargs vmlinuz boot=netboot root=http://192.168.100.1:8800/filesystem.squashfs rootfstype=squashfs overlayroot=tmpfs ip=dhcp console=tty0 console=ttyS0,115200 earlyprintk=serial,ttyS0,115200 loglevel=7
echo Booting system...
boot vmlinuz

verify-image.sh Executable file

@@ -0,0 +1,259 @@
#!/bin/bash
# Verify netboot image build completeness and correctness
# Run this after 'make build' to validate the generated image
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HTTP_DIR="$SCRIPT_DIR/http"
TMPDIR=$(mktemp -d)
EXIT_CODE=0
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
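# Report a failed check and mark the overall run as failed
error() {
    echo -e "${RED}✗${NC} $1"
    EXIT_CODE=1
}
success() {
    echo -e "${GREEN}✓${NC} $1"
}
warning() {
    echo -e "${YELLOW}!${NC} $1"
}
info() {
    echo "  $1"
}
# Remove the temporary extraction directory on exit
cleanup() {
    rm -rf "$TMPDIR"
}
trap cleanup EXIT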
echo "Verifying netboot image build..."
echo ""
# Check 1: Required files exist
echo "Checking required files..."
for file in vmlinuz initrd-netboot.img filesystem.squashfs version.txt boot.ipxe; do
    if [ -f "$HTTP_DIR/$file" ]; then
        success "$file exists"
    else
        error "$file missing"
    fi
done
echo ""
# Check 2: File types are correct
echo "Checking file types..."
if file "$HTTP_DIR/vmlinuz" | grep -q "Linux kernel"; then
    success "vmlinuz is a valid Linux kernel"
else
    error "vmlinuz is not a valid kernel"
fi
if file "$HTTP_DIR/initrd-netboot.img" | grep -q "cpio archive"; then
    success "initrd-netboot.img is a valid initramfs"
else
    error "initrd-netboot.img is not a valid initramfs"
fi
if file "$HTTP_DIR/filesystem.squashfs" | grep -q "Squashfs filesystem"; then
    success "filesystem.squashfs is a valid squashfs image"
else
    error "filesystem.squashfs is not a valid squashfs"
fi
echo ""
# Check 3: File permissions for HTTP serving
echo "Checking file permissions..."
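for file in vmlinuz initrd-netboot.img filesystem.squashfs; do
    # Skip files already reported missing; a failing stat would abort the script under set -e
    [ -f "$HTTP_DIR/$file" ] || continue
    PERMS=$(stat -c "%a" "$HTTP_DIR/$file")
    if [ "$PERMS" = "644" ]; then
        success "$file has correct permissions (644)"
    else
        warning "$file has permissions $PERMS (expected 644)"
    fi
done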
echo ""
# Check 4: File sizes are reasonable
echo "Checking file sizes..."
# Fall back to 0 for missing files so a failing stat does not abort the script under set -e
KERNEL_SIZE=$(stat -c %s "$HTTP_DIR/vmlinuz" 2>/dev/null || echo 0)
INITRD_SIZE=$(stat -c %s "$HTTP_DIR/initrd-netboot.img" 2>/dev/null || echo 0)
SQUASHFS_SIZE=$(stat -c %s "$HTTP_DIR/filesystem.squashfs" 2>/dev/null || echo 0)
if [ "$KERNEL_SIZE" -gt 5000000 ] && [ "$KERNEL_SIZE" -lt 50000000 ]; then
    success "vmlinuz size reasonable: $(numfmt --to=iec-i --suffix=B "$KERNEL_SIZE")"
else
    warning "vmlinuz size unusual: $(numfmt --to=iec-i --suffix=B "$KERNEL_SIZE")"
fi
if [ "$INITRD_SIZE" -gt 10000000 ] && [ "$INITRD_SIZE" -lt 100000000 ]; then
    success "initrd size reasonable: $(numfmt --to=iec-i --suffix=B "$INITRD_SIZE")"
else
    warning "initrd size unusual: $(numfmt --to=iec-i --suffix=B "$INITRD_SIZE")"
fi
if [ "$SQUASHFS_SIZE" -gt 500000000 ] && [ "$SQUASHFS_SIZE" -lt 2000000000 ]; then
    success "squashfs size reasonable: $(numfmt --to=iec-i --suffix=B "$SQUASHFS_SIZE")"
else
    warning "squashfs size unusual: $(numfmt --to=iec-i --suffix=B "$SQUASHFS_SIZE")"
fi
echo ""
# Check 5: Initramfs contains custom netboot script and binaries
echo "Checking initramfs contents..."
# Record an error instead of aborting under set -e if extraction fails
unmkinitramfs "$HTTP_DIR/initrd-netboot.img" "$TMPDIR/initramfs" 2>/dev/null || error "Failed to unpack initrd-netboot.img"
if [ -f "$TMPDIR/initramfs/main/scripts/netboot" ]; then
    success "Custom netboot script present in initramfs"
    # Verify it matches source
    if diff -q "$SCRIPT_DIR/initramfs/scripts/netboot" "$TMPDIR/initramfs/main/scripts/netboot" >/dev/null 2>&1; then
        success "netboot script matches source"
    else
        warning "netboot script differs from source"
    fi
else
    error "Custom netboot script missing from initramfs"
fi
# Check required binaries
for binary in wget curl unsquashfs awk; do
    if [ -f "$TMPDIR/initramfs/main/usr/bin/$binary" ]; then
        success "$binary binary present"
    else
        error "$binary binary missing from initramfs"
    fi
done
if [ -f "$TMPDIR/initramfs/main/usr/sbin/switch_root" ]; then
    success "switch_root binary present"
else
    error "switch_root binary missing from initramfs"
fi
echo ""
# Check 6: Required kernel modules configured
echo "Checking kernel modules configuration..."
if [ -f "$TMPDIR/initramfs/main/conf/modules" ]; then
    MODULES_FILE="$TMPDIR/initramfs/main/conf/modules"
    for module in squashfs overlay r8125 ext4 isofs af_packet nls_iso8859-1; do
        if grep -q "^$module" "$MODULES_FILE"; then
            success "Module $module configured"
        else
            error "Module $module not configured"
        fi
    done
else
    warning "modules configuration file not found"
fi
echo ""
# Check 7: Squashfs root filesystem contents
echo "Checking squashfs root filesystem..."
# Check critical paths exist (handle usr-merge symlinks)
for path in boot etc root usr var; do
    if unsquashfs -ll "$HTTP_DIR/filesystem.squashfs" | grep -q "squashfs-root/$path$"; then
        success "/$path directory exists"
    else
        error "/$path directory missing"
    fi
done
# Check bin and lib (may be symlinks due to usr-merge)
for path in bin lib; do
    if unsquashfs -ll "$HTTP_DIR/filesystem.squashfs" | grep -qE "squashfs-root/$path( |->|$)"; then
        success "/$path exists (directory or symlink)"
    else
        error "/$path missing"
    fi
done
# Check SSH authorized_keys
if unsquashfs -cat "$HTTP_DIR/filesystem.squashfs" root/.ssh/authorized_keys >/dev/null 2>&1; then
    # grep -c exits non-zero on zero matches; "|| true" keeps set -e from aborting the script
    KEY_COUNT=$(unsquashfs -cat "$HTTP_DIR/filesystem.squashfs" root/.ssh/authorized_keys 2>/dev/null | grep -c "^ssh-" || true)
    success "SSH authorized_keys present ($KEY_COUNT keys)"
else
    error "SSH authorized_keys missing or invalid"
fi
# Check netplan config
if unsquashfs -cat "$HTTP_DIR/filesystem.squashfs" etc/netplan/01-netcfg.yaml >/dev/null 2>&1; then
    success "Netplan configuration present"
else
    error "Netplan configuration missing"
fi
# Check K3s dependencies
for pkg in containerd runc; do
    if unsquashfs -ll "$HTTP_DIR/filesystem.squashfs" | grep -q "usr/bin/$pkg"; then
        success "K3s dependency: $pkg"
    else
        warning "K3s dependency missing: $pkg (may be in /usr/sbin)"
    fi
done
echo ""
# Check 8: Version info
echo "Build information:"
if [ -f "$HTTP_DIR/version.txt" ]; then
    sed 's/^/ /' "$HTTP_DIR/version.txt"
else
    warning "version.txt not found"
fi
echo ""
# Check 9: iPXE boot configuration
echo "Checking iPXE configuration..."
if [ -f "$HTTP_DIR/boot.ipxe" ]; then
    success "boot.ipxe present"
    # Extract server URL from boot.ipxe ("|| true" so a missing URL does not abort under set -e)
    SERVER_URL=$(grep "http://" "$HTTP_DIR/boot.ipxe" | head -1 | grep -oP 'http://[^/]+' || true)
    if [ -n "$SERVER_URL" ]; then
        info "Boot server configured: $SERVER_URL"
    fi
    # Check if all required files are referenced
    for file in vmlinuz initrd-netboot.img filesystem.squashfs; do
        if grep -q "$file" "$HTTP_DIR/boot.ipxe"; then
            success "boot.ipxe references $file"
        else
            error "boot.ipxe missing reference to $file"
        fi
    done
    # Check for required kernel parameters
    if grep -q "overlayroot=tmpfs" "$HTTP_DIR/boot.ipxe"; then
        success "boot.ipxe configures overlayroot"
    else
        warning "boot.ipxe missing overlayroot parameter"
    fi
else
    error "boot.ipxe missing"
fi
echo ""
# Summary
echo "========================================="
if [ "$EXIT_CODE" -eq 0 ]; then
    echo -e "${GREEN}✓ All checks passed!${NC}"
    echo "Image is ready for deployment."
else
    echo -e "${RED}✗ Some checks failed${NC}"
    echo "Review errors above before deploying."
fi
echo "========================================="
exit $EXIT_CODE