# Fork-Lifting VMs on vm1.sea.rg.net from ESXI to Debian with Ganeti

With the lessons learned from [Fork-Lifting VMs on vm0.sea.rg.net from ESXI to Debian with Ganeti](./ForkLift.md), the intrepid crew foolishly embarks on doing the same on vm1.sea.rg.net.

------

## vm1.sea.rg.net Hardware Platform

**[Cisco R210-2121605W - part 74-7341-02 - serial QCI1549A9AY](http://www.cisco.com/en/US/prod/collateral/ps10265/ps10493/data_sheet_c78-587522.html)**

- Effectively: 72GB RAM, 4TB datastore1, 8 cores
- UCS C210 M2 Svr, 2x E5640, 2x4GB, SAS Expand, 1PS
- 2 x 2.66GHz Xeon E5640 80W CPU/12MB cache/DDR3 1066MHz
- 4 x 16GB DDR3-1066MHz RDIMM/PC3-8500/quad rank/Low-Dual Volt
- 2 x 4GB DDR3-1333MHz RDIMM/PC3-10600/dual rank 1Gb DRAMs
- LSI 6G MegaRAID 9261-8i card (RAID 0,1,5,6,10,60) - 512WC
- 8 x 500GB 6Gb SATA 7.2K RPM SFF hot plug/drive sled mounted
- Intel Quad port GbE Controller (E1G44ETG1P20)

------

First, record the disk and memory allocation for each VM: the configured size, not the utilization.

| Hostname            | RAM  | Disk | IP Address   | Owner         |
| :------------------ | ---- | ---- | ------------ | ------------- |
| build-u.rpki.net    | 1G   | 100G | 147.28.0.28  | Rob Austein   |
| ca0.rpki.net        | 2G   | 100G | 147.28.0.85  | Randy Bush    |
| cache0.sea.rpki.net | 2G   | 100G | 147.28.0.84  | Randy Bush    |
| hans.rg.net         | 3G   | 256G | 147.28.0.42  | Hans Kuhn     |
| nic0.net.lb         | 2G   | 100G | 147.28.0.44  | Samer Khalil  |
| nlring.sea.rg.net   | 2G   | 32G  | 147.28.0.89  | Randy Bush    |
| proto0.sea.rpki.net | 2G   | 100G | 147.28.0.100 | Iain Phillips |
| xmpp.rg.net         | 2G   | 100G | 147.28.0.6   | Randy Bush    |

## Users May or May Not Need to Pre-Configure

### FreeBSD Users Hack Configuration Aspects Which Will Change

FreeBSD disk and network interface naming may change from the ESXI guest environment to the Ganeti/KVM environment. Owners of FreeBSD guests should either

- Make config changes just before shutting down their machines, so that when they come back up in the new environment they will boot usefully.

Under Ganeti/KVM, FreeBSD guests seem to use /dev/ada for the disk drives.

```
root@fbsd0:~ # more /etc/fstab
# Device        Mountpoint      FStype  Options Dump    Pass#
/dev/ada0p2     /               ufs     rw      1       1
/dev/ada0p3     none            swap    sw      0       0
```

FreeBSD drives on ESXI seem to be /dev/da. So users will have to change their /etc/fstab just before the fork-lift (s/da/ada/g), or tell the VM SysAdmins so we can hack the Ganeti configs so you can keep your old disk and NIC names.

### Linux Guests Should Need No Modification

Linux guests should be able to find their disks as UUIDs and mount as /dev/sdaN. And Ethernet seems to be a pretty constant eth0.

## Copy VMs to an NFS Mounted Filesystem

Create a /data/nfs directory on raid1.psg.com and NFS export it to vm1.sea using hacks/advice from:

- [Exporting NFS from FreeBSD](http://myitnotes.info/doku.php?id=en:jobs:freebsd_zfs_nfs_for_vmware)
- [A really disgusting and unsafe hack to disable syncs to speed up NFS writes](https://www.ateamsystems.com/tech-blog/solved-performance-issues-with-freebsd-zfs-backed-esxi-storage-over-nfs/)

Mount raid1.psg.com:/data/nfs on vm1.sea.rg.net as an NFS datastore in Configuration / Storage / Add Storage.

Stop and power off all guest VMs on vm1.sea.rg.net. We can actually do this one by one.

Record the md5 checksum of each and every guest VM .vmdk file.

Use VMware vSphere Client on my laptop to move each guest VM from vm1.sea.rg.net:datastore1 to the NFS datastore.

Take the md5 checksum of each and every .vmdk file on the NFS datastore and compare it to that of the original from vm1.sea.rg.net:datastore1.

It is now safe to destroy and rebuild vm1.sea.rg.net.

## Build a Debian/Ganeti System on vm1.sea.rg.net

Boot into Adaptec BIOS and configure the drives as one big RAID5.

The hack to get an INSert key on the MacBook is Windows, Accessories, Ease of Access, On-Screen Keyboard.

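
Before the RAID rebuild wipes datastore1, the checksum comparison from the copy step above is worth scripting. What follows is only a minimal sketch. It assumes both copies of a guest are reachable as directories from one shell, that `md5sum` is available there (on some systems the tool is called `md5` instead), and that each .vmdk keeps its file name across the copy. `verify_vmdks` is a hypothetical helper name, not part of any tool mentioned here.

```shell
#!/bin/sh
# verify_vmdks SRC_DIR DST_DIR  (hypothetical helper, for illustration only)
# Compare the md5 of every .vmdk in SRC_DIR with the same-named file in DST_DIR.
# Prints one line per file and returns non-zero if anything is missing or differs.
verify_vmdks() {
    src=$1
    dst=$2
    bad=0
    for f in "$src"/*.vmdk; do
        [ -e "$f" ] || continue                 # glob matched nothing
        name=$(basename "$f")
        if [ ! -f "$dst/$name" ]; then
            echo "MISSING $name"
            bad=1
            continue
        fi
        # Read via stdin so md5sum prints only the hash (plus "-")
        a=$(md5sum < "$f" | cut -d' ' -f1)
        b=$(md5sum < "$dst/$name" | cut -d' ' -f1)
        if [ "$a" = "$b" ]; then
            echo "OK $name"
        else
            echo "MISMATCH $name"
            bad=1
        fi
    done
    return $bad
}
```

Run it once per guest directory, and only proceed to the rebuild when every line says OK.
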
### Install Debian

- Boot Debian CD/ISO
- Choose Install
- Choose English, UK (so you can get UTC)
- Choose American English
- Name the host
- Choose root password
- Choose user name and password
- Partition
- Choose Manual Partitioning
- Select the drive
- Create new empty partition table
- Select Free Space
- Create new partition, primary, 1GB, beginning, bios, no use, bios
- Done
- Select Free Space
- Create new partition, primary, 1GB, beginning, /boot, ext4, bootable
- Done
- Select Free Space again
- Create a new partition
- Accept whatever size is shown (the rest of the disk)
- Primary, physical volume for LVM
- Done
- Configure LVM
- Configure LVM accepting write changes to disks
- Create volume group
- Volume group name: ganeti
- Devices for the new volume group: select only the LVM partition
- Create Logical Volume: on ganeti, root, 16G
- Create Logical Volume: on ganeti, swap, 16G
- Create Logical Volume: on ganeti, var, 16G
- Edit the Logical Volumes to be ext4 /, swap, and ext4 /var
- Finish partitioning and write changes
- Finish partitioning and write changes to disk
- Be sure it will not boot CD-ROM, and Reboot from the installed system

### Finish Debian Installation

Clean up from CDROM sources

```
vi /etc/apt/sources.list
```

and delete the two CDROM entries at the top.

Install homey things (it's not a computer without emacs:)

```
apt-get update
apt-get upgrade
apt-get install emacs23-nox
apt-get install rsync
apt-get install gcc
apt-get install bridge-utils vlan
apt-get install sudo
apt-get install unbound
usermod -G sudo -a randy
```

Fix hostname

```
echo vm1.sea.rg.net > /etc/hostname
hostname `cat /etc/hostname`
```

Fix /etc/unbound/unbound.conf

```
access-control: 127.0.0.0/8 allow
access-control: 147.28.0.0/16 allow
access-control: 198.180.150.0/24 allow
access-control: 198.180.152.0/24 allow
access-control: 0.0.0.0/0 refuse
access-control: ::1 allow
access-control: ::ffff:127.0.0.1 allow
access-control: 2001:418:1::0/48 allow
access-control: 2001:418:3807::0/48 allow
access-control: 2001:418:8006::0/48 allow
access-control: ::0/0 refuse
```

[Install Unattended Upgrading](http://www.howtoforge.com/how-to-configure-automatic-updates-on-debian-squeeze)

## Debian Ganeti Specific Configuration

Edit /etc/hosts to have the real address of the host, e.g.

```
127.0.0.1       localhost
147.28.0.3      vm0.sea.rg.net  vm0
147.28.0.15     vm1.sea.rg.net  vm1
147.28.0.100    gnt0.sea.rg.net gnt0
```

### Fix /etc/network/interfaces

Make eth0 hang off of whatever your bridge will be called

```
# The loopback network interface
auto lo
iface lo inet loopback

# Management interface
auto eth0
iface eth0 inet manual

auto br-lan
iface br-lan inet static
    address 147.28.0.15
    netmask 255.255.255.0
    gateway 147.28.0.1
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0

# VLAN 100
auto eth0.100
iface eth0.100 inet manual

auto br-rep
iface br-rep inet static
    address 147.28.0.101
    netmask 255.255.255.0
    bridge_ports eth0.100
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0

auto eth0.255
iface eth0.255 inet manual

# VLAN 255
auto br-svc
iface br-svc inet manual
    bridge_ports eth0.255
    bridge_stp off
    bridge_fd 0
    bridge_maxwait 0
```

Check /etc/resolv.conf

In theory, this looks like

```
      -------------+--------------
                   |
                 br-lan
                   |          this host
         +---------+---------+
         |        eth0       |
         |                   |
         |eth0.255   eth0.100|
         +--+-----------+----+
            |           |
          br-svc      br-rep
            |           |
VMs --------+           +------> to other ganeti hosts
```

Also, put the following in /etc/sysctl.conf:

```
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
```

## Install Ganeti

Set up to get Ganeti from backports

```
echo "deb http://cdn.debian.net/debian/ wheezy-backports main" >> /etc/apt/sources.list.d/wheezy-backports.list
```

And then install it

```
apt-get update
apt-get install ganeti/wheezy-backports
```

Fix up drbd

```
echo "options drbd minor_count=128 usermode_helper=/bin/true" > /etc/modprobe.d/drbd.conf
rmmod drbd # ignore any error
modprobe drbd
```

## Add vm1 to the Ganeti Cluster

On vm0.sea.rg.net, the existing ganeti single-node cluster, run

```
gnt-node add vm1.sea.rg.net
```

This will SSH as root to vm1, set up SSH keys, and do all the right things to make vm1 part of the cluster.

Then set "PermitRootLogin" to "without-password" in vm1's /etc/ssh/sshd_config.

Fix VNC passwording

```
echo 'clusture' > /etc/ganeti/vnc-cluster-password
gnt-cluster modify -H kvm:vnc_password_file=/etc/ganeti/vnc-cluster-password
```

As vm0 was pretty loaded, make vm1 the master. So, on vm1, the new master, run

```
gnt-cluster master-failover
```

## Load the ESXI Images

Mount the NFS system that has the guest VMs. On vm1, add the following line to /etc/fstab

```
147.28.0.64:/data/nfs /nfs-data nfs defaults 0 0
```

and then

```
mkdir /nfs-data
mount /nfs-data
```

### Install Ganeti Instance Management

Install ganeti-instance-image

```
wget https://code.osuosl.org/attachments/download/2169/ganeti-instance-image_0.5.1-1_all.deb
dpkg -i ganeti-instance-image_0.5.1-1_all.deb
```

Install qemu utilities (though they likely came in with other installs)

```
apt-get install qemu-utils
```

And force the latest version of qemu-img

```
apt-get install qemu-utils/wheezy-backports
```

Aside: if you also want ganeti-instance-debootstrap, version 0.14 is now in wheezy-backports, so you don't need to install from source. You'll only want ganeti-instance-debootstrap to create images from scratch, where it installs Debian or a Debian-related OS automatically.

### Create the Guest VM Instances

For each VM, run the following:

```
#!/bin/sh
# makeVM diskGB ramGB nameFQDN

DISK=$1
RAM=$2
NAME=$3
NODE=vm1.sea.rg.net

# maxmem is set to twice minmem; the instance is created but neither
# installed nor started, since the disk is filled from the old image later
gnt-instance add \
    -t plain \
    -o image+default \
    -s ${DISK}G \
    -B minmem=${RAM}G,maxmem=$((${RAM}*2))G \
    -n $NODE \
    -H kvm:vnc_bind_address=0.0.0.0 \
    --no-install \
    --no-start \
    --no-ip-check \
    --no-name-check \
    ${NAME}
```

This produces

```
vm1.sea.rg.net:/root# ./do-add 200 4 
Tue Apr 22 23:15:35 2014 * disk 0, size 200.0G
Tue Apr 22 23:15:35 2014 * creating instance disks...
Tue Apr 22 23:15:38 2014 adding instance to cluster config
Tue Apr 22 23:15:38 2014 - INFO: Waiting for instance to sync disks
Tue Apr 22 23:15:39 2014 - INFO: Instance 's disks are in sync
```

If it is a FreeBSD VM, also do

```
gnt-instance modify -H disk_type=scsi 
```

so that da devices still work at boot.

Get the UUIDs of all VMs, and fill out the table.

```
gnt-instance list -o name,disk.uuid/0
```

## Load the Stored VM VMDK Files into the Ganeti Images

As root, mount the raid1 NFS filesystem.

### Convert the ESXI Images to Ganeti Guest Images

```
gnt-instance info --all | egrep 'Instance name|on primary'
```

This will show the primary device for each ganeti VM

```
on primary: /dev/xenvg/95d2bb8f-063f-498d-b98a-9c03acea991f.disk0 (252:2)
```

which we use as the output UUID.

Check the type of image we have

```
qemu-img info 
```

### If it is a Flat Raw Image

For -flat.vmdk files, you should be able to

```
dd bs=4096k if= of=/dev/ganeti/
```

### If it is a Real VMDK

For -s001.vmdk files, you should be able to run, for each VMDK

```
qemu-img convert -f vmdk -O raw 
```

## Try the VMs!

You can use the built-in console or (of course) come in over VNC over SSH.

Start the image

```
gnt-instance start 
```

And come in over the text console or VNC

### Direct Text Console

```
gnt-instance console 
```

### Over VNC for Graphics

```
gnt-instance list -o +network_port
```

To get

```
Instance     Hypervisor OS            Primary_node  Status  Memory Network_port
minibsd-test kvm        image+default deb64.psg.com running   256M        11001
```

Then point VNC at the base system, on the port number in that report (11001 here), e.g. through an SSH tunnel

```
ssh -N -L 5900:127.0.0.1:11001 vm1.sea.rg.net
```

And get ready to start your VNC session (in this case, I would be using Chicken of the VNC to VNC display localhost:0, aka localhost port 5900).

To give each user a different password, do it at the instance level:

```
echo 'wombat' >/etc/ganeti/vnc-password-
gnt-instance modify -H vnc_password_file=/etc/ganeti/vnc-password- foobar
```

Or make a directory /etc/ganeti/passwords and stash them there.

## If FreeBSD Does Not Mount Root

If the system boots but does not mount the root filesystem, it leaves you at the mountroot prompt.

It seems as if FreeBSD

> /dev/da0p2

may become

> /dev/vtbd0p2

If you do the mountroot to

```
ufs:/dev/vtbd0p2
```

the root mounts and the system comes up.

sra reminds us that it is a good idea to do an fsck of / at single user, before enabling writes to the / filesystem.

Of course, the filesystem will be image dependent.

## Converting a FreeBSD Guest to Paravirtual I/O

FreeBSD systems will run better and be kinder to the underlying virtualization system if they run paravirtual I/O for both disk and network.

To hack this, add to /boot/loader.conf.local

```
virtio_load=YES
virtio_pci_load=YES
```

As advised in [http://freebsd.1045724.n5.nabble.com/kvm-vlan-virtio-problem-tp5757713p5757788.html](http://freebsd.1045724.n5.nabble.com/kvm-vlan-virtio-problem-tp5757713p5757788.html), in /etc/sysctl.conf add

```
net.inet.tcp.tso=0
```

Hack the config in /etc/rc.conf, changing the interface name and disabling TSO

```
ifconfig_vtnet0="147.28.0.8/24 -tso"
ifconfig_vtnet0_ipv6="inet6 2001:418:1::8/64"
```

And hack /etc/fstab to

```
# Device        Mountpoint      FStype  Options Dump    Pass#
/dev/vtbd0s1a   /               ufs     rw      1       1
/dev/vtbd0s1b   none            swap    sw      0       0
/dev/vtbd0s1d   /root           ufs     rw      2       2
/dev/vtbd0s1e   /var            ufs     rw      2       2
/dev/vtbd0s1f   /var/spool      ufs     rw      2       2
/dev/vtbd0s1g   /usr            ufs     rw      2       2
```

Then the VM Admin has to

```
gnt-instance shutdown 
gnt-instance modify -H nic_type=paravirtual,disk_type=paravirtual 
gnt-instance start 
```

To revert, the VM Admin can

```
gnt-instance shutdown 
gnt-instance modify -H nic_type=e1000,disk_type=scsi 
gnt-instance start 
```

It would also be helpful to enable the 9600 baud serial console so that admins can see your VM boot.

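
The /etc/fstab change above is mechanical, so it can be previewed before committing to it. This is a minimal sketch, assuming the device names sit at the start of each fstab line and are of the /dev/daN or /dev/adaN form discussed earlier. `fstab_to_virtio` is a hypothetical helper name. It reads an fstab on stdin and writes the rewritten copy to stdout, and it never edits a file in place.

```shell
#!/bin/sh
# fstab_to_virtio  (hypothetical helper, for illustration only)
# Rewrite leading /dev/daN or /dev/adaN device names to /dev/vtbdN.
# Lines that do not start with such a device (comments, other devices)
# pass through unchanged.
fstab_to_virtio() {
    sed 's|^/dev/a\{0,1\}da\([0-9]\)|/dev/vtbd\1|'
}
```

For example, `fstab_to_virtio < /etc/fstab` prints the proposed file so it can be compared with the current one before anything is overwritten.
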
## Optionally Convert plain to drbd

For each instance

```
$ gnt-instance stop 
$ gnt-instance modify \
    -t drbd \
    --no-wait-for-sync \
    -n \
$ gnt-instance start 
```

To watch the paint dry,

```
cat /proc/drbd
```

------

```
Node           DTotal DFree MTotal MNode MFree Pinst Sinst
vm0.sea.rg.net   5.4T  3.2T  31.5G 24.3G  8.2G    14     0
vm1.sea.rg.net   5.9T  5.3T  70.9G 26.1G 64.9G     6     0

Instance                     Primary_node   ConfigMaxMem DiskUsage
adrilankha.hactrn.net        vm0.sea.rg.net         4.0G    260.0G
archive.psg.com              vm0.sea.rg.net         1.0G    100.0G
build-u.rpki.net             vm1.sea.rg.net         1.0G    100.0G
ca0.rpki.net                 vm1.sea.rg.net         2.0G    100.0G
cache0.sea.rpki.net          vm1.sea.rg.net         2.0G    100.0G
chezrandy.x0.dk              vm0.sea.rg.net         768M    100.0G
hans.rg.net                  vm0.sea.rg.net         3.0G    250.0G
hiroshima.bogus.com          vm0.sea.rg.net         4.0G    256.0G
linear.algebras.org          vm0.sea.rg.net         1.0G    100.0G
nagasaki.bogus.com           vm0.sea.rg.net         4.0G    258.0G
nic0.net.lb                  vm1.sea.rg.net         2.0G    100.0G
nlring.sea.rg.net            vm1.sea.rg.net         2.0G     32.0G
okui.psg.com                 vm0.sea.rg.net         1.0G    100.0G
proto0.sea.rpki.net          vm1.sea.rg.net         2.0G    100.0G
r1.securerouting.org         vm0.sea.rg.net         2.0G    100.0G
rip1.psg.com                 vm0.sea.rg.net         2.0G     36.0G
turing.worldpowersystems.com vm0.sea.rg.net         2.0G    256.0G
xmpp.rg.net                  vm0.sea.rg.net         2.0G    100.0G
zoe.dns.gh                   vm0.sea.rg.net         1.0G    200.0G
zzyzx.sigpipe.org            vm0.sea.rg.net         2.0G    100.0G
```

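
Converting to drbd mirrors each instance's disk onto a second node, so it helps to know how much disk each primary node currently carries. This is a minimal sketch, assuming the four-column instance listing above (Instance, Primary_node, ConfigMaxMem, DiskUsage) is piped in as text and that disk sizes carry a G suffix. `sum_disk_by_node` is a hypothetical helper name, and lines of any other shape (headers, the node listing) are ignored.

```shell
#!/bin/sh
# sum_disk_by_node  (hypothetical helper, for illustration only)
# Read instance-listing text on stdin and print total DiskUsage per
# primary node, one "node totalG" line per node, sorted by node name.
sum_disk_by_node() {
    awk '
        # Only rows with exactly four fields whose last field looks like "100.0G"
        NF == 4 && $4 ~ /^[0-9.]+G$/ {
            size = $4
            sub(/G$/, "", size)
            total[$2] += size
        }
        END {
            for (node in total)
                printf "%s %.1fG\n", node, total[node]
        }
    ' | sort
}
```

Comparing each total against the DFree column of the other node gives a rough idea of whether the mirrors would fit.
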