When a diskless device boots Linux over PXE / tftpboot / NFS, we sometimes need to understand or fix problems in the boot process, and the initramfs is part of that process.

The procedure to unpack an initrd.img is simple:

# create a new directory
root@mainserver:/tftpboot/system# mkdir test
# copy the image into it
root@mainserver:/tftpboot/system# cp initrd_12.04 test/
# change to that directory
root@mainserver:/tftpboot/system# cd test/
# list its content
root@mainserver:/tftpboot/system/test# ls -ltr
total 11136
-rw-r--r-- 1 root root 11383089 jul  5 16:24 initrd_12.04
# extract the image
root@mainserver:/tftpboot/system/test# zcat initrd_12.04 | cpio --extract
47645 blocks
# and list again
root@mainserver:/tftpboot/system/test# ls -ltr
total 11172
-rw-r--r-- 1 root root 11383089 jul  5 16:24 initrd_12.04
-rwxr-xr-x 1 root root     7237 jul  5 16:25 init
drwxr-xr-x 8 root root     4096 jul  5 16:25 scripts
drwxr-xr-x 2 root root     4096 jul  5 16:25 sbin
drwxr-xr-x 2 root root     4096 jul  5 16:25 run
drwxr-xr-x 6 root root     4096 jul  5 16:25 lib
drwxr-xr-x 7 root root     4096 jul  5 16:25 etc
drwxr-xr-x 3 root root     4096 jul  5 16:25 conf
drwxr-xr-x 2 root root     4096 jul  5 16:25 bin
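The same extraction can also be done into an empty directory, leaving the original image outside the tree so that it is not picked up again when repacking. This is only a minimal sketch, assuming a gzip-compressed image and GNU cpio; the function name unpack_initrd is hypothetical and not part of any package:

```shell
# unpack_initrd IMAGE DEST: extract a gzip-compressed cpio image into DEST.
# Hypothetical helper, not part of initramfs-tools.
unpack_initrd() {
    image=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # The redirection opens IMAGE before the subshell changes directory,
    # so a relative IMAGE path still resolves correctly.
    gzip -dc < "$image" | (cd "$dest" && cpio --extract --make-directories --quiet)
}
```

With the paths used above it could be called as: unpack_initrd initrd_12.04 test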

Here we have access to the initramfs scripts and configuration. I modified the init script, adding echo and sleep wherever necessary to display the different stages of the execution. I also added the "debug" parameter to the kernel command line when booting Linux, so that this branch of the case statement in the init script is executed:

                exec >/run/initramfs/initramfs.debug 2>&1
                set -x

As you can see, both standard output and standard error are redirected to initramfs.debug. Since I wanted to see the whole set -x trace on the console, I commented out the exec line.
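The kind of change described above can be sketched as follows. This is an illustration, not the actual Ubuntu init script: the functions stage and parse_cmdline and the variable STAGE_DELAY are hypothetical names, and the loop is only modelled on the way init walks over the words of the kernel command line.

```shell
# stage MESSAGE: print a marker and pause so it can be read on the console.
# Hypothetical helper; STAGE_DELAY is an assumed knob, in seconds.
stage() {
    echo "initramfs stage: $1"
    sleep "${STAGE_DELAY:-3}"
}

# parse_cmdline "WORDS": sketch of the option loop. In the real init the
# words would come from /proc/cmdline rather than from an argument.
parse_cmdline() {
    debug=n
    for x in $1; do
        case $x in
        debug)
            debug=y
            # Redirection commented out so the trace stays on the console:
            # exec >/run/initramfs/initramfs.debug 2>&1
            set -x
            ;;
        esac
    done
}
```

A call such as stage "mounting root" placed before each block of interest in init then shows how far the script got before a failure.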

Once the changes are made, it’s time to pack everything again:

root@mainserver:/tftpboot/system/test# find . 2>/dev/null | cpio --quiet --dereference -o -H newc | gzip -9 > initrd_12.04
cpio: ./etc/ld.so.conf.d/i386-linux-gnu_GL.conf: Cannot stat: No such file or directory
cpio: ./etc/modprobe.d/blacklist-oss.conf: Cannot stat: No such file or directory
cpio: File ./initrd_12.04 grew, 11763712 new bytes not copied

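I safely ignored those messages. The two "Cannot stat" errors correspond to broken symlinks, which --dereference tries to follow. The "grew" message is a different matter: the output file initrd_12.04 sits inside the directory being archived, so cpio reads it while gzip is still writing to it. Writing the new image outside that directory avoids this. The modified initrd_12.04 was generated and it worked perfectly.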
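A minimal sketch of the same repacking step with the output written outside the tree, so the new image never ends up among the files listed by find. It keeps the cpio options used above, so broken symlinks would still be reported; the function name repack_initrd is hypothetical:

```shell
# repack_initrd DIR OUT: archive DIR as a gzip-compressed newc cpio image.
# Hypothetical helper; OUT must live outside DIR.
repack_initrd() {
    dir=$1
    out=$2
    # The redirection to OUT is resolved in the caller's directory,
    # not in DIR, because only the subshell changes directory.
    (cd "$dir" && find . 2>/dev/null | cpio --quiet --dereference -o -H newc) | gzip -9 > "$out"
}
```

For example, run from /tftpboot/system after removing the copied image from test/: repack_initrd test initrd_12.04.new (the .new name is only an example).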

It’s important to note that the last step of init is to execute:

exec run-init ${rootmnt} ${init} "$@" ${recovery:+--startup-event=recovery} <${rootmnt}/dev/console >${rootmnt}/dev/console 2>&1

Here run-init, which is executed from /usr/lib/klibc/bin/run-init, won’t display errors so easily because of the exec: the shell running init is replaced by run-init, so any echo added after that line never runs. We therefore have to split the debugging into "before run-init" and "after run-init". If there is no kernel panic, the startup scripts, certain udev rules and other processes are then usually executed through SystemV or Upstart.
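The effect of exec is easy to reproduce in any shell. A small demonstration, unrelated to run-init itself (the function name demo_exec is made up for this example):

```shell
# demo_exec: show that nothing after a successful exec is run.
demo_exec() {
    sh -c 'echo "before exec"; exec echo "replaced by exec"; echo "never printed"'
}

demo_exec
# prints:
#   before exec
#   replaced by exec
```

The third echo is never reached, which is why debug output placed after the exec run-init line in init would never appear.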

If the system hangs at "Stopping Userspace bootsplash" and there is no text console to interact with, run-init has somehow finished, and we should look for issues in the SystemV or Upstart startup jobs, located in /etc/init.d and /etc/init respectively. Disabling heavy services such as X or pulseaudio is a shot in the dark, but one worth trying.

In my case I was stuck at "Stopping Userspace bootsplash" with no X server or text console to interact with the device; the system just hung there. Using another machine to disable X (in init.d) led me to a text console. Then, looking through the services that used X, I found the one causing the problem: a custom binary that displayed a splash image during boot. After disabling it I had X back and everything worked.
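For jobs managed by Upstart, one way to disable a service from another machine is an override file containing the "manual" stanza, placed next to the job definition in the device’s root filesystem. This is a sketch under assumptions: the function name disable_upstart_job is hypothetical, the root filesystem path depends on how the NFS export is laid out, and it does not cover SystemV scripts in /etc/init.d.

```shell
# disable_upstart_job ROOTFS JOB: stop JOB from starting automatically.
# Hypothetical helper; ROOTFS is the root filesystem exported to the device.
disable_upstart_job() {
    rootfs=$1
    job=$2
    # Only act on jobs that actually exist in this root filesystem.
    [ -f "$rootfs/etc/init/$job.conf" ] || return 1
    echo manual > "$rootfs/etc/init/$job.override"
}
```

Deleting the .override file restores the original behaviour, which makes it easy to switch services off one at a time until the culprit is found.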