!! BETTER SOLUTION IN COMMENTS BELOW !!
Foreword
Recent improvements in the Raspberry Pi EEPROM have made it possible to boot Pis completely online, but that method currently requires access to the Pi and human interaction. What I’m describing here is a way to image Pis with zero access or interaction required. It uses the network booting mechanism, which has been available in the EEPROM for a while. You’ll be right at home if you know PXE. It is meant for tightly controlled local networks, not for distributing images online.
In the interest of clarity, I’ll focus on the bare minimum necessary to the imaging process. Where I use this professionally, it is wrapped with more logic to have fleets of Pis detect when new images are published, and report to Slack when they go for reimaging.
What you Need
Simply 2 Pis on the same network. One is the NetBoot Server, which provides images to other Pis on the network. The other is the Pi to image, which will ask the NetBoot Server for an image and write it to its own SD card. The NetBoot Server does not have to be another Pi, since it requires no special Pi sauce, but I like to throw Pis at all my hosting needs. The Pi to image, on the other hand, does need to be a Pi: we are imaging Pis here.
Setting up the NetBoot Server
The server can be any recent vanilla RaspiOS. Copy the following script to it as /home/pi/script.sh:
#!/bin/bash
# section banners in the original were ASCII art from https://patorjk.com/software/taag/#p=display&f=Standard

if [ -z "$1" ]
then
  echo "ERROR: I need one argument: the raspi image file"
  exit 1
fi

if [ ! -f "$1" ]; then
  echo "ERROR: Image file $1 doesn't exist"
  exit 1
fi

echo "> installing needed packages"
apt-get update > /dev/null
apt-get install -y bc nfs-kernel-server tftpd-hpa apache2 php php-curl php-xml libapache2-mod-php ipcalc curl wget screen lsof > /dev/null

myip=`ifconfig | grep -Eo 'inet (addr:)?([0-9]*\.){3}[0-9]*' | grep -Eo '([0-9]*\.){3}[0-9]*' | grep -v '127.0.0.1'`
mymask=`ifconfig | grep $myip | grep -Eo 'netmask (addr:)?([0-9]*\.){3}[0-9]*' | grep -Eo '([0-9]*\.){3}[0-9]*' | grep -v '127.0.0.1'`
mynetwork=`ipcalc $myip/$mymask | grep "Network:" | sed 's/\s\{1,\}/ /g' | cut -d' ' -f2`

echo "> my ip: $myip"
echo "> my network: $mynetwork"

echo "> starting services"
echo "> tftp"
service tftpd-hpa start
echo "> nfs"
service nfs-kernel-server start
echo "> web"
service apache2 start

echo "> clean slate"
rm -rf /srv/tftp/*
rm -rf /srv/nfs/*
rm -rf /var/www/html/*

echo "> installing web components"

# --- bootstrap.sh ---
echo "> bootstrap.sh"
cat << EOF > /var/www/html/bootstrap.sh
disk_count=\`lsblk -d -l -p -n | cut -d' ' -f1 | wc -l\`
if [ \$disk_count -ne 1 ]
then
  echo "ERROR: I need to have exactly 1 disk to write to"
  exit 1
fi
echo "> writing image to disk"
disk=\`lsblk -d -l -p -n | cut -d' ' -f1\`
dd if=/tmp/img of=\$disk bs=1M status=progress
if [ \$? -ne 0 ]
then
  echo "> ERROR: writing image to disk failed"
  echo "> rebooting in 300 seconds"
  sleep 300
  reboot
  exit 1
fi
echo "> rebooting in 10 seconds"
sleep 10
wget -qO- http://${myip}/reboot.php &> /dev/null
reboot
EOF

# --- index.php ---
echo "> index.php"
cat << EOF > /var/www/html/index.php
<?php
echo "Hi!" ;
exit( 0 ) ;
?>
EOF

# --- img.php ---
echo "> img.php"
cat << EOF > /var/www/html/img.php
<?php
header( "Cache-Control: no-store, no-cache, must-revalidate" ) ;
header( "Cache-Control: post-check=0, pre-check=0", false ) ;
header( "Pragma: no-cache" ) ;
header( "Expires: ".gmdate("D, d M Y H:i:s", mktime(date("H")+2, date("i"), date("s"), date("m"), date("d"), date("Y")))." GMT" ) ;
header( "Last-Modified: ".gmdate("D, d M Y H:i:s")." GMT" ) ;
header( "Content-Type: application/octet-stream" ) ;
header( "Content-Length: ".(string)(filesize("/var/img")) ) ;
header( "Content-Transfer-Encoding: binary\n" ) ;
\$handle = fopen( "/var/img", "rb" ) ;
if( \$handle===false ) {
  echo "can't read image" ;
  exit( 1 ) ;
}
while( !feof(\$handle) ) {
  echo fread( \$handle, 8192 ) ;
}
fclose( \$handle ) ;
?>
EOF

# --- reboot.php ---
echo "> reboot.php"
cat << EOF > /var/www/html/reboot.php
<?php
echo shell_exec( "sudo /usr/local/bin/enable_eeprom_sdboot.sh 2>&1; screen -S disable_eeprom_sdboot -dm sh -c \"sleep 240; sudo /usr/local/bin/disable_eeprom_sdboot.sh\"" ) ;
?>
EOF

echo "> dissecting image file"
offset=`fdisk -l $1 | grep Linux | tr -s ' ' | tr '\t' ' ' | cut -d' ' -f2`
real_linux_offset=`echo "$offset*512" | bc`
offset=`fdisk -l $1 | grep W95 | tr -s ' ' | tr '\t' ' ' | cut -d' ' -f2`
real_boot_offset=`echo "$offset*512" | bc`

mkdir /tmp/boot 2>/dev/null
mount -o loop,offset=$real_boot_offset $1 /tmp/boot
rm -rf /srv/tftp/*
cp -rpf /tmp/boot/* /srv/tftp/
umount /tmp/boot

mkdir /tmp/root 2>/dev/null
mount -o loop,offset=$real_linux_offset $1 /tmp/root
mkdir /srv/nfs 2>/dev/null
cp -rpf /tmp/root/* /srv/nfs/

echo "> getting sdboot eeprom ready"
cat << EOF > /tmp/eeprom_config.sdboot
[all]
DHCP_TIMEOUT=45000
DHCP_REQ_TIMEOUT=4000
TFTP_FILE_TIMEOUT=30000
TFTP_IP=${myip}
TFTP_PREFIX=1
TFTP_PREFIX_STR=
BOOT_ORDER=0xf21
ENABLE_SELF_UPDATE=1
SD_BOOT_MAX_RETRIES=3
NET_BOOT_MAX_RETRIES=2
EOF
latest_eeprom=`ls -1 /tmp/root/lib/firmware/raspberrypi/bootloader/stable/pieeprom-*.bin | sort -u | tail -1`
cp $latest_eeprom /tmp/pieeprom.bin
rpi-eeprom-config --out /tmp/pieeprom-out.bin --config /tmp/eeprom_config.sdboot /tmp/pieeprom.bin && rpi-eeprom-update -d -f /tmp/pieeprom-out.bin
mv /boot/pieeprom.sig /srv/tftp/pieeprom.sig.inert
mv /boot/pieeprom.upd /srv/tftp/pieeprom.upd.inert
mv /boot/recovery.bin /srv/tftp/recovery.bin.inert

cat << EOF > /usr/local/bin/enable_eeprom_sdboot.sh
sed -i "s/ts:.*/ts: \`date +%s\`/g" /srv/tftp/pieeprom.sig.inert
mv /srv/tftp/pieeprom.sig.inert /srv/tftp/pieeprom.sig
mv /srv/tftp/pieeprom.upd.inert /srv/tftp/pieeprom.upd
mv /srv/tftp/recovery.bin.inert /srv/tftp/recovery.bin
EOF
chmod 755 /usr/local/bin/enable_eeprom_sdboot.sh

cat << EOF > /usr/local/bin/disable_eeprom_sdboot.sh
mv /srv/tftp/pieeprom.sig /srv/tftp/pieeprom.sig.inert
mv /srv/tftp/pieeprom.upd /srv/tftp/pieeprom.upd.inert
mv /srv/tftp/recovery.bin /srv/tftp/recovery.bin.inert
EOF
chmod 755 /usr/local/bin/disable_eeprom_sdboot.sh

cat << EOF > /etc/sudoers.d/010_www-data_eeprom
www-data ALL=(ALL) NOPASSWD: /usr/local/bin/enable_eeprom_sdboot.sh
www-data ALL=(ALL) NOPASSWD: /usr/local/bin/disable_eeprom_sdboot.sh
EOF

umount /tmp/root

echo "/srv/nfs $mynetwork(rw,sync,no_subtree_check,no_root_squash)" > /etc/exports
exportfs -rav
echo "console=serial0,115200 console=tty1 root=/dev/nfs nfsroot="${myip}":/srv/nfs,nfsvers=3 ip=dhcp rw elevator=deadline fsck.repair=yes rootwait" > /srv/tftp/cmdline.txt
echo "proc /proc proc defaults 0 0" > /srv/nfs/etc/fstab

echo "> deploying image"
if [ ! -f /var/img ]; then
  cp -f $1 /var/img
else
  cp -f $1 /var/img.new
  if [ "`lsof | grep "/var/img" | wc -l`" -ne 0 ]; then
    echo "> waiting for file handle release on /var/img"
    while [ "`lsof | grep "/var/img" | wc -l`" -ne 0 ]; do
      echo -n "."
      sleep 1
    done
    echo ""
  fi
  mv /var/img /var/img.old
  mv /var/img.new /var/img
  rm /var/img.old
fi

echo "> adding web bootstrap"
echo '#!/bin/sh -e' > /srv/nfs/etc/rc.local
echo "echo \"> disabling cron\"" >> /srv/nfs/etc/rc.local
# I've seen this command hang several times before
echo "/usr/bin/timeout --kill-after=10s 5s /usr/sbin/service cron stop || :" >> /srv/nfs/etc/rc.local
echo "echo \"> sleeping until we have connectivity\"" >> /srv/nfs/etc/rc.local
echo "bash -c 'count=0; res=1; while [ \$res -ne 0 -a \$count -lt 120 ]; do echo "."; wget -q --spider http://${myip}; res=\$?; sleep 1; count=\$((count+1)); done'" >> /srv/nfs/etc/rc.local
echo "echo \"> retrieving imaging image\"" >> /srv/nfs/etc/rc.local
echo "curl -s http://${myip}/img.php --output /tmp/img" >> /srv/nfs/etc/rc.local
echo "echo \"> launching bootstraping process\"" >> /srv/nfs/etc/rc.local
echo "curl -s http://${myip}/bootstrap.sh --output /tmp/bootstrap.sh" >> /srv/nfs/etc/rc.local
echo "bash /tmp/bootstrap.sh" >> /srv/nfs/etc/rc.local

echo "> all done"
Make it executable with:
chmod 755 /home/pi/script.sh
Download the RaspiOS image you want to serve out to the Pi to image, and copy it to the server under /home/pi.
Run the script with the RaspiOS image as the argument:
sudo /home/pi/script.sh /home/pi/2022-01-28-raspios-bullseye-armhf-lite.img
This can take a while depending on the speed of your SD card. Note that in this example the NetBoot Server’s IP is 192.168.1.116.
Sending the Pi to image for reimaging
With your server ready, you can instruct the Pi to image to go reimage itself by asking it to network boot from 192.168.1.116. The following script will update its EEPROM to this effect:
#!/bin/bash

cat << EOF > /tmp/eeprom_config.netboot
[all]
DHCP_TIMEOUT=45000
DHCP_REQ_TIMEOUT=4000
TFTP_FILE_TIMEOUT=30000
TFTP_IP=192.168.1.116
TFTP_PREFIX=1
TFTP_PREFIX_STR=
BOOT_ORDER=0xf12
ENABLE_SELF_UPDATE=1
SD_BOOT_MAX_RETRIES=3
NET_BOOT_MAX_RETRIES=2
EOF

/usr/bin/rpi-eeprom-config --apply /tmp/eeprom_config.netboot
sleep 1
/usr/sbin/reboot
The 192.168.1.116 IP is the only thing you need to change. Copy the script to the Pi to image, as reimage.sh for example, make it executable and run it.
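If you would rather not edit the file for every server, the config can be generated from an argument. This is only a sketch: netboot_config is a hypothetical helper name, and the values are copied as-is from the script above.

```shell
#!/bin/sh
# Sketch: print the netboot EEPROM config for a given NetBoot Server IP.
# netboot_config is a hypothetical helper, not part of the original script.
netboot_config() {
    server_ip="$1"
    if [ -z "$server_ip" ]; then
        echo "usage: netboot_config <netboot-server-ip>" >&2
        return 1
    fi
    # Same settings as the reimage script, with only TFTP_IP made variable.
    cat << EOF
[all]
DHCP_TIMEOUT=45000
DHCP_REQ_TIMEOUT=4000
TFTP_FILE_TIMEOUT=30000
TFTP_IP=${server_ip}
TFTP_PREFIX=1
TFTP_PREFIX_STR=
BOOT_ORDER=0xf12
ENABLE_SELF_UPDATE=1
SD_BOOT_MAX_RETRIES=3
NET_BOOT_MAX_RETRIES=2
EOF
}

# Example use on the Pi to image (as root):
#   netboot_config 192.168.1.116 > /tmp/eeprom_config.netboot
#   /usr/bin/rpi-eeprom-config --apply /tmp/eeprom_config.netboot
#   sleep 1
#   /usr/sbin/reboot
```

The function only prints the config, so you can inspect the output before applying it.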
Working Principle
The script on the NetBoot Server grabs a RaspiOS image and serves its root filesystem via NFS. It also runs a TFTP server that serves the image’s boot files, with cmdline.txt pointing to the NFS share. You could run your Pis entirely off the network and disregard the SD card, but I think it makes more sense for most purposes to simply burn that image locally on the Pi’s SD card and run off of it. The image has added instructions in /etc/rc.local for doing just that: retrieving the original RaspiOS image via HTTP and burning it to the SD card. Having burnt it, the Pi reboots, but right before that it tells the NetBoot Server that it is rebooting. This is because, for a brief moment, the NetBoot Server has to catch that Pi rebooting and tell it to go back to booting from its newly burnt SD card.
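The switch between network boot and SD boot comes down to the BOOT_ORDER value: 0xf12 in the script run on the Pi to image, 0xf21 in the config the NetBoot Server prepares. According to the Raspberry Pi bootloader documentation, the value is read one hex digit at a time starting from the rightmost digit, with 1 meaning SD card, 2 meaning network and f meaning restart. The sketch below decodes a value into that order; decode_boot_order is a hypothetical helper and only names the digits used in this post.

```shell
#!/bin/sh
# Sketch: list the boot modes of a BOOT_ORDER value in the order they are tried.
# decode_boot_order is a hypothetical helper; digits other than 1, 2 and f
# are printed as "other(<digit>)" rather than guessed at.
decode_boot_order() {
    value="${1#0x}"                      # strip the 0x prefix
    order=""
    while [ -n "$value" ]; do
        digit="${value#"${value%?}"}"    # rightmost remaining digit
        value="${value%?}"               # drop that digit
        case "$digit" in
            1) name="sd" ;;
            2) name="network" ;;
            f|F) name="restart" ;;
            *) name="other($digit)" ;;
        esac
        order="${order:+$order }$name"
    done
    echo "$order"
}

# decode_boot_order 0xf12   prints: network sd restart
# decode_boot_order 0xf21   prints: sd network restart
```

So 0xf12 tries the network first and falls back to the SD card, while 0xf21 does the opposite.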
You might see errors while the Pi is reimaging. This is because we are serving via NFS a full OS that doesn’t like some of its filesystems being read-only. That is fine: we only need enough of a system to run wget, dd and reboot. A better alternative might be to serve a more purpose-built OS via NFS, but in my case I prefer serving the very OS we are imaging, because it’s more expedient and because it gives me access to the very EEPROM binaries which are critical to the process.
There you Have it
The building blocks to set up a more seamless and better integrated reimaging process :).
I currently use this for a fleet of 70 Pis to great effect. Of course, I image them with a custom RaspiOS build which, on top of serving professional needs, is set up to check with the NetBoot Server every 15 minutes. I’ll talk about building custom RaspiOS images in a different post. I can simply publish a new image there and the whole fleet will go in for reimaging. When it does so, I have each Pi sleep a random amount of time between 0 and 48 hours to avoid a huge wave of traffic, but also to avoid breaking everything at the same time. This gives me time to halt the process if the Pis are coming out of reimaging wrong. It’s also particularly relevant because the NetBoot Server can have concurrency issues with Pis imaging at the same time. None that will create real problems, but there is a chicken-and-egg issue with the way Pis update their EEPROM to be instructed to boot from their SD card once they have booted from the network and reimaged themselves. They need to be told by the NetBoot Server itself, and so it needs to switch to serving instructions for SD booting for a few minutes (240 seconds in the script above) while a newly imaged Pi reboots. If another Pi shows up to be imaged during that window, it’ll reboot into its SD card having done nothing. This issue seems to exist in other scenarios such as USB booting; it seems to be a shortcoming of how EEPROMs are updated.
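The random stagger can be sketched in a few lines of shell. This is not the code from my fleet image, just a minimal illustration: random_delay_seconds is a hypothetical helper, and it assumes /dev/urandom and od are available on the Pi.

```shell
#!/bin/sh
# Sketch: print a random whole number of seconds in the range [0, MAX).
# random_delay_seconds is a hypothetical helper name.
random_delay_seconds() {
    max="$1"
    # Read 4 random bytes as one unsigned integer and strip od's padding.
    n="$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')"
    echo $(( $n % $max ))
}

# Example: wait up to 48 hours (48 * 3600 = 172800 seconds) before reimaging.
#   sleep "$(random_delay_seconds 172800)"
#   /home/pi/reimage.sh
```

The modulo introduces a very slight bias toward small values, which does not matter for spreading out a fleet.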
Hello Ben,
After a few hours looking online to find out whether such a setup was possible without relying on NFS, I am really glad to see that it is!
I have not tested it yet, but in my context I do not want to use an SD card but rather a USB-attached SSD for the image to be copied to. Do you see any issue with doing so using your scripts?
Thanks !
There’s definitely some NFS here sorry :\
as far as where the image gets written to though, that’s a simple tweak to bootstrap.sh
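As a rough idea of what that tweak could look like: bootstrap.sh currently insists on exactly one disk, so with an SD card and a USB SSD both attached it needs another way to choose. The sketch below picks the single disk whose transport is reported as usb. pick_usb_disk is a hypothetical helper, and it assumes the lsblk on the image supports the TRAN column.

```shell
#!/bin/sh
# Sketch: read "NAME TRAN" lines on stdin, as printed by
#   lsblk -d -n -p -o NAME,TRAN
# and print the one disk whose transport is "usb".
# Fails if there is not exactly one such disk.
pick_usb_disk() {
    awk '$2 == "usb" { n++; name = $1 }
         END {
             if (n != 1) {
                 printf "ERROR: expected exactly 1 USB disk, found %d\n", n > "/dev/stderr"
                 exit 1
             }
             print name
         }'
}

# Example use in place of the disk detection in bootstrap.sh:
#   disk="$(lsblk -d -n -p -o NAME,TRAN | pick_usb_disk)" || exit 1
#   dd if=/tmp/img of="$disk" bs=1M status=progress
```

Keeping the "exactly one" check preserves the safety of the original script, which refuses to guess when the target is ambiguous.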
Yes, sorry, I was not clear enough. I am fine with NFS being part of the image upgrade process, but I prefer to rely on a local disk afterwards, which is exactly what you are proposing here.
I will first try your scripts on an SD card, and then try to adapt them to my USB SSD use case. I will let you know if that works well. Thanks again!
Ah yes ok! Indeed that’s the whole point of that script, it leverages the NFS link just long enough to write a new SD card :). I’ve actually been using this in production on a fleet of Pi 4s for a few years so I’m optimistic it’ll work for you.
Hello Ben,
Like you said, it worked out of the box and allowed me to tweak it easily for usb booting !
I just had to fix a few things related to bookworm compatibility in the firmware paths, and also disable the default timeout in img.php (“set_time_limit(0);”), which caused download interruptions for large images (> 4GB).
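For anyone wanting to apply the same timeout fix to an already generated img.php, here is a minimal sketch. add_time_limit is a hypothetical helper, and it assumes the opening “<?php” tag sits on its own line, as it does in the file written by the server script.

```shell
#!/bin/sh
# Sketch: insert "set_time_limit(0);" right after the first "<?php" line
# of the given file, unless the file already calls set_time_limit.
# add_time_limit is a hypothetical helper name.
add_time_limit() {
    file="$1"
    if grep -q 'set_time_limit' "$file"; then
        return 0    # already patched, nothing to do
    fi
    tmp="$(mktemp)" || return 1
    awk '!done && index($0, "<?php") > 0 {
             print
             print "set_time_limit(0);"
             done = 1
             next
         }
         { print }' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the original file to keep its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example use on the NetBoot Server (as root):
#   add_time_limit /var/www/html/img.php
```

Note that rerunning the server script regenerates img.php, so the patch would have to be reapplied or added to the heredoc itself.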
I have one architecture question regarding your system: can you elaborate on why you preferred to manage the EEPROM update from network booting to SD card booting on the TFTP side, rather than applying an EEPROM update directly on the “Pi to image” at the end of the bootstrap.sh script? Doing it on the latter would bypass any concurrency issue, but I may be missing something.
Thanks again !
Nice! I’m glad it worked. To my knowledge, you’re the first person to actually try this and it’s been a few years, it’s good you had the chops to make the needed tweaks :).
To answer your question, it’s not ideal but there is a chicken-and-egg issue with EEPROM updates. They get checked for and applied early in the boot process, even if you are booting from the network. When we boot from the network, the EEPROM provided by the network boot needs to say that we’re booting from the network, or the Pi will immediately reboot from disk and never get a chance to truly boot from the network. And so how do we go back? Well, the network needs to provide an EEPROM that says to boot from disk, but that can’t be there all the time, or the Pis will never truly boot from the network. So what I devised here is something where the network is set to temporarily have an EEPROM that boots from disk. It’s not perfect and there are potential synchronization issues with a fleet of Pis all reimaging at the same time, but they were easy to work around in my case. I don’t want a reimage flood anyway, so I stagger them.
I believe this is a shortcoming of the Pi’s boot process, I don’t think they envisioned network booting to be a temporary state just for the purpose of reimaging.
I hope this makes sense, let me know if you find something better to do please 🙂 I haven’t looked deeply at this in a while.
Hello Ben,
Thanks for clarifying this, you are right, no way to avoid the eeprom update on the tftp side on your architecture.
For your information, since I have to manage remote Pis where I cannot control anything on the LAN, I wanted to take a deep dive into the HTTP boot mode and see if it could be a viable alternative to your TFTP-based system, and it seems it can! (cf. discussion here: https://forums.raspberrypi.com/viewtopic.php?p=2267650)
The principle is much the same. The main difference is that you do not have to touch the EEPROM except for the first configuration, and the second is that it lets you manage a full reinstall remotely (provided you can trigger a power off and power on remotely) if you somehow lose control of the device.
Darko, that’s incredible! Definitely better as protocols like TFTP become less and less supported in enterprise networks while HTTP becomes the de facto protocol for everything. I came up with this before the announcement for http based net booting, I took a quick look without looking under the hood at the time and was left with the impression that one wouldn’t be able to set up their own web server to point to. That’s obviously wrong and your solution is far better. I’ve updated my post to point to it, and I encourage you to document it somewhere with some nice scripts and commands like I did here :). I’ll likely be using it in the near future. Thank you very much for so carefully dissecting every aspect of this issue in 2024, and following up with your findings.
As a side note, the network boot solution you propose here works starting with Pi 3 if I am not mistaken, while the HTTP boot mode is implemented only starting with Pi 4, so your solution is still the best with retro-compatibility in mind.
I will definitely try to share a more complete, out-of-the-box solution like the one you provided as soon as I have fully implemented it. So far, this only proves that the concept works.
I would never have dug deeper into this topic without finding your solution first, so thanks again 🙂
Hey! Does this expect the pi that you want to image already has an OS loaded?
Not necessarily. The “Sending the Pi to image for reimaging” part does assume you can hop on a Pi to run that script. But if it was already set up to network boot, that wouldn’t be needed.
Thanks for the great write-up. I would love to know if a similar process exists (or could be used) for the Compute Module 4 (CM4)
I’ve never played with one but I’ll wager it all depends on what EEPROM it comes loaded with. Do you have one you can try stuff with?