#####################################################################################
#                                    Background                                     #
#####################################################################################

We would like to enable network boot, NFS root and channel bonding at the
same time. For the clients to boot from the network, the server (node1 in
our case) must not be channel bonded when the clients are powered on. The
clients' Ethernet channels cannot be bonded until /etc, /sbin, ... are
mounted through NFS. A problem arises if you bond a client machine's
Ethernet channels at that point: once they are bonded, the NFS root is lost
and all of the clients hang.

To solve this problem, we use the initial ramdisk (initrd) mechanism,
available since Linux 2.0. We prepare a kernel and an initrd image, boot
each client machine into this ramdisk, and perform channel bonding from
within it. We then tell this environment that the real root device should
be mounted via NFS. The client machines are then able to pick up their NFS
root file systems once the server machine is bonded.

#####################################################################################
#                                    Quick start                                    #
#####################################################################################

1. Prepare the /linuxrc program:

   Essentially, what we want is to turn off eth0, then turn on bond0, eth0
   and eth1. "0x00ff" is the magic number that tells the Linux kernel that
   the real root device is going to be NFSROOT.

     #!/bin/bash
     /sbin/insmod bonding
     /sbin/ifconfig bond0 192.168.1.105 hw ether 00:90:27:E0:30:51 netmask 255.255.255.0 broadcast 192.168.1.255 up
     /sbin/ifconfig eth0 down
     /sbin/ifconfig eth1 down
     /sbin/ifenslave bond0 eth0
     /sbin/ifenslave bond0 eth1
     mount -n -t proc proc /proc
     # 0x00ff = NFS root
     echo 0x00ff > /proc/sys/kernel/real-root-dev

   In a more general form, use the following linuxrc (this assumes your
   initrd ramdisk has /bin/grep and /bin/sed but no /bin/awk):

     #!/bin/bash
     #hwadds=`/sbin/ifconfig | grep eth0 | awk '{print $5}'`
     # The " *" in the sed patterns absorbs the variable spacing in
     # ifconfig output.
     hwadds=`/sbin/ifconfig | /bin/grep eth0 | /bin/sed -e 's/eth0 *Link encap:Ethernet *HWaddr *//'`
     echo $hwadds
     ipadds=`/sbin/ifconfig | /bin/grep "inet addr" | /bin/sed -e 's/ *inet addr://' | /bin/sed -e 's/ *Bcast:192.168.1.255 *Mask:255.255.255.0//'`
     echo $ipadds
     /sbin/insmod bonding
     /sbin/ifconfig bond0 $ipadds hw ether $hwadds netmask 255.255.255.0 broadcast 192.168.1.255 up
     /sbin/ifconfig eth0 down
     /sbin/ifconfig eth1 down
     /sbin/ifenslave bond0 eth0
     /sbin/ifenslave bond0 eth1
     mount -n -t proc proc /proc
     echo 0x00ff > /proc/sys/kernel/real-root-dev

2. Create the required initrd image (become root):

     mke2fs -m0 /dev/ram 4096
     mount -t ext2 /dev/ram /mnt/ram
     mount -t ext2 -o loop /home/li/faq/cluster/initrd/rescue /mnt/rescue
     cp -a /mnt/rescue/* /mnt/ram
     cp -a /home/li/faq/cluster/initrd/ifenslave /mnt/ram/sbin
     mkdir -p /mnt/ram/lib/modules/2.2.14/net
     cd /mnt/ram
     cp -a /home/li/faq/cluster/initrd/bonding.o /mnt/ram/lib/modules/2.2.14/net
     cd /mnt/ram
     #ln -s bin/bash linuxrc
     cp /home/li/faq/cluster/initrd/linuxrc /mnt/ram
     cd /; umount /dev/ram
     umount /mnt/rescue
     dd if=/dev/ram bs=1k count=4096 of=/home/li/faq/cluster/initrd/initrd-kbl
     gzip -f /home/li/faq/cluster/initrd/initrd-kbl
     freeramdisk /dev/ram

3. On the server (node1), /tftpboot/bpbatch.bpb:

     set CacheNever="ON"
     #
     # For channel bonded clients
     linuxboot "/tftpboot/bzImage-initrd-kbl" "" "initrd-kbl.gz"
     #
     # For non-bonded clients
     #linuxboot "/tftpboot/bzImage" "nfsroot=192.168.1.101"

4. Turn on all clients.
   When all of them are booted, enable node1's channel bonding:

     sh /tftpboot/node1_bonding.sh

   The script /tftpboot/node1_bonding.sh:

     #!/bin/bash
     /sbin/ifconfig bond0 192.168.1.101 hw ether 00:90:27:E5:AB:A6 netmask 255.255.255.0 broadcast 192.168.1.255 up
     /sbin/ifconfig eth0 down
     /sbin/ifconfig eth1 down
     /sbin/ifenslave bond0 eth0
     /sbin/ifenslave bond0 eth1

   Everything should be all right.
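
A note on the general linuxrc in step 1: its sed patterns hard-code the
broadcast address and netmask, so they only strip the right text on the
192.168.1.0/255.255.255.0 network. The following is a minimal sketch of a
sed-only alternative, not a tested replacement. It assumes the old
net-tools ifconfig output format that the original patterns rely on (an
"HWaddr" field on the interface line and an "inet addr:" field on the next
line). The function names get_hwaddr and get_ipaddr are hypothetical, and
plain "sed" is used where an initrd may need /bin/sed.

```shell
#!/bin/bash
# Sketch only: extract eth0's hardware and IP address from old-style
# net-tools ifconfig output read on stdin, using sed alone.
# get_hwaddr and get_ipaddr are illustrative names, not part of any tool.

# Print the HWaddr field found on the line that starts with "eth0 ".
get_hwaddr() {
    sed -n 's/^eth0 .*HWaddr *\([0-9A-Fa-f:]*\).*/\1/p'
}

# Print the "inet addr:" value inside the eth0 stanza. The stanza runs
# from the "eth0 " line to the next blank line, so addresses of other
# interfaces (lo, eth1, ...) are ignored.
get_ipaddr() {
    sed -n '/^eth0 /,/^$/ s/.*inet addr:\([0-9.]*\).*/\1/p'
}

# Possible use inside linuxrc (assumption: eth0 is already configured):
#   hwadds=`/sbin/ifconfig eth0 | get_hwaddr`
#   ipadds=`/sbin/ifconfig eth0 | get_ipaddr`
```

Because neither pattern mentions a specific broadcast address or netmask,
the same linuxrc could be reused on a differently numbered network. The
netmask and broadcast arguments passed to "ifconfig bond0" would still
need adjusting by hand.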