Debug spinning PHP script on a WHM/cPanel Server

Getting A Backtrace

Sometimes a PHP script will spin at 100% CPU and it is difficult to figure out why. The `strace` output is too noisy, and without knowing where in the code the problem is, you cannot insert your own backtrace. Newer versions of WHM support multiple PHP versions, so make sure you run the commands below with the PHP version the site is actually using (ea-php72 in this example). In our case, the site runs under php-fpm.

First, install xdebug:

/opt/cpanel/ea-php72/root/usr/bin/pecl install xdebug

After that, follow the instructions here: https://stackoverflow.com/questions/14261821/get-a-stack-trace-of-a-hung-php-script#53056294

Basically, you just need to run the following, where $pid is the process ID of the spinning PHP process:

gdb --batch --readnever --pid=$pid --command=/tmp/dumpstack.gdbscript

And the content of dumpstack.gdbscript is:

set $xstack = ((long**)&xdebug_globals)[2]
if ($xstack != 0) && ($xstack[0] != 0)
    set $pcurrent = (long*)$xstack[0]
    while $pcurrent
        set $xptr = (long*)$pcurrent[0]
        set $xptr_s = (char**)$xptr
        set $xptr_i = (int*)$xptr
        set $filename = $xptr_s[4]
        set $funcname = $xptr_s[1]
        set $linenum = $xptr_i[10]
        if ($funcname != 0)
            printf "%s@%s:%d\n", $funcname, $filename, $linenum
        else
            printf "global@%s:%d\n", $filename, $linenum
        end
        set $pnext = (long*)$pcurrent[2]
        set $pcurrent = $pnext
    end
else
    printf "no stack\n"
end
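
To fill in $pid, one option is to pick the php-fpm worker with the highest CPU usage from ps. The helper below is a minimal sketch, not part of the original instructions: busiest_pid is a hypothetical name, and it assumes the procps `ps` found on CentOS. Note that ps reports CPU usage averaged over the life of the process, so `top` may be a better guide for a worker that only recently started spinning.

```shell
# busiest_pid PATTERN
# Reads "PID %CPU COMMAND" lines on stdin and prints the PID of the process
# with the highest %CPU whose command name matches PATTERN (an awk regex).
busiest_pid() {
    awk -v pat="$1" '
        BEGIN { max = -1 }
        $3 ~ pat && ($2 + 0) > max { max = $2 + 0; pid = $1 }
        END { if (pid != "") print pid }
    '
}

# Example use on the server (as root):
#   pid=$(ps -eo pid=,pcpu=,comm= | busiest_pid 'php-fpm')
#   gdb --batch --readnever --pid="$pid" --command=/tmp/dumpstack.gdbscript
```

Nothing is printed when no process matches, so check that $pid is non-empty before handing it to gdb.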

Fix LVM Thin Can’t create snapshot, Failed to suspend vg01/pool0 with queued messages

Fix LVM Thin Snapshot Creation Errors

From time to time you might see errors like the following:

~]# lvcreate -s -n foo-snap vg01/foo
Can’t create snapshot bar-snap as origin bar is not suspended.
Failed to suspend vg01/pool0 with queued messages.

You will note that foo and bar have nothing to do with each other, but the error prevents creating any additional thin volumes in the pool. While the root cause is unknown, the fix is easy. At some point LVM tried to create a volume (bar-snap here) and was unable to complete the operation, which left a queued message like this in its metadata:

message1 {
create = "bar-snap"
}

The Fix

  1. deactivate the thinpool
  2. dump the VG metadata
  3. backup the file
  4. remove the message1 section
  5. restore the metadata.

The Procedure

  • vgcfgbackup -f /tmp/pool0-current vg01
  • cp /tmp/pool0-current /tmp/pool0-current-orig # backup the file before making changes
  • vim /tmp/pool0-current # remove the message1 section in vg01 -> logical_volumes -> pool0
  • vgcfgrestore -f /tmp/pool0-current vg01 --force
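
If you would rather not edit the dump by hand, the message1 block can be filtered out with sed. This is only a sketch: strip_queued_messages is a hypothetical helper, and it assumes each queued message sits in its own "messageN { ... }" block with no nested braces, as in the example above. Diff the result against the backup before restoring it.

```shell
# strip_queued_messages FILE
# Print an LVM metadata dump with every "messageN { ... }" block removed.
# Assumes the block contains no nested braces.
strip_queued_messages() {
    sed '/^[[:space:]]*message[0-9][0-9]*[[:space:]]*{[[:space:]]*$/,/^[[:space:]]*}[[:space:]]*$/d' "$1"
}

# Example use, following the file names from the procedure above:
#   strip_queued_messages /tmp/pool0-current-orig > /tmp/pool0-current
#   diff -u /tmp/pool0-current-orig /tmp/pool0-current
```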

Hopefully this works for you, and hopefully whatever causes this gets fixed upstream.

LSI Megaraid Storage Manager Does Nothing

Installing Broadcom MSM for LSI Megaraid Cards

On a minimal CentOS install, I found that MSM would refuse to load when I ran “/usr/local/MegaRAID\ Storage\ Manager/startupui.sh”. It would just exit without an error. If you cat the script, you will notice that the output of the java command is redirected to /dev/null, which hides useful errors, so remove the redirect! At least then we can see the error.
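
A quick way to find the redirect is to list every line of the script that sends output to /dev/null. A minimal sketch (null_redirects is a hypothetical helper name; the exact contents of startupui.sh may differ between MSM releases, so review each hit before editing):

```shell
# null_redirects FILE
# Print, with line numbers, each line of FILE that redirects output to /dev/null.
null_redirects() {
    grep -n '>[[:space:]]*/dev/null' "$1"
}

# Example use:
#   null_redirects "/usr/local/MegaRAID Storage Manager/startupui.sh"
```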

Since this was a minimal install, I was missing some of the X libraries that MSM wanted.  This fixed it:

yum install libXrender libXtst

-Eric

 

Redirect Directory Trailing Slash (/) with Restricted Access

Securing Apache and Maintaining Usability

First, you should generally avoid .htaccess and use it only as a last resort. Still, this example holds whether or not you are using .htaccess.

Let’s say you have a directory you wish to secure so that only the index and one file (test.txt) are available. All other content in the directory should be denied. For example:

These links should load:

  • www.example.com/foo
  • www.example.com/foo/
  • www.example.com/foo/test.txt

In addition, the link without the trailing / should redirect to the link with the trailing / (from /foo to /foo/) for ease of access for your users.

These links should give a 403:

  • www.example.com/foo/bar
  • www.example.com/foo/letmein.txt

To accomplish this, you might write a .htaccess as follows:

Apache 2.2

Order allow,deny
<Files ~ "^$|^index\.html$|^test\.txt$">
     Order deny,allow
</Files>

Apache 2.4

Require all denied
<Files ~ "^$|^index\.html$|^test\.txt$">
     Require all granted
</Files>

However, you will run into a problem: the link without a trailing / (www.example.com/foo) will not work, because permissions are evaluated before the mod_dir module’s DirectorySlash functionality determines whether or not the request is for a directory. At that point Apache is still matching the name “foo” against the Files rules. While not intuitive, we must therefore also allow the directory name as a file name, as follows:

Apache 2.2

Order allow,deny
<Files ~ "^foo$|^$|^index\.html$|^test\.txt$">
     Order deny,allow
</Files>

Apache 2.4

Require all denied
<Files ~ "^foo$|^$|^index\.html$|^test\.txt$">
     Require all granted
</Files>
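
To see why the extra ^foo$ alternative matters, the pattern can be exercised outside Apache. Files rules are matched against the last path component of the request, so for www.example.com/foo the name being tested is “foo”. The sketch below uses grep -E as a stand-in for Apache's regex engine (an assumption that should hold for a simple pattern like this one) and writes the dots escaped; files_rule_allows is a hypothetical helper name.

```shell
# files_rule_allows NAME
# Succeeds if NAME (the last path component) matches the Files pattern.
files_rule_allows() {
    printf '%s\n' "$1" | grep -Eq '^foo$|^$|^index\.html$|^test\.txt$'
}

for name in foo '' index.html test.txt bar letmein.txt; do
    if files_rule_allows "$name"; then
        echo "allow '$name'"
    else
        echo "deny  '$name'"
    fi
done
```

The first four names are allowed and the last two are denied. Dropping ^foo$ from the pattern flips “foo” to denied, which is the 403 seen before the trailing-slash redirect can happen.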

Hopefully this will help anyone else dealing with a similar issue because it took us a lot of troubleshooting to pin this down. Here are some search terms you might try to find this post:

  • Apache 403 does not add trailing /
  • Apache does not add trailing slash
  • .htaccess deny all breaks trailing directory slash
  • .htaccess Require all denied breaks trailing directory slash

-Eric

 

Check Authorize.net TLS 1.2 Support: tlsv1 alert protocol version

TLS v1.0 and v1.1 to be Disabled on February 28th, 2018

As you may be aware, Authorize.net is disabling TLS v1.0 and v1.1 at the end of this month.  Authorize.net has published more information about the disablement schedule.

You may begin to see errors like the following if you have not already updated your system:

error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

We can help you solve this issue as well as provide security hardening or PCI compliance service for your server. Please call or email if we may be of service!

Checking for TLS v1.2 Support

Most modern Linux releases support TLS v1.2; however, it is best to check to avoid a surprise. These tests should work on almost any Linux distribution, including SUSE, Red Hat, CentOS, Debian, Ubuntu, and many others.

PHP

To check your server, you can use this simple PHP script. Make sure you are running this PHP code from the same PHP executable that runs your website. For example, you might have PHP compiled from source and also have it installed as a package. In some cases, one will work and the other will not:

<?php
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://apitest.authorize.net');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

if (($response = curl_exec($ch)) === false) {
 $error = curl_error($ch);
 print "$error\n";
}
else {
 $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
 print "TLS OK: " . strlen($response) . " bytes received ($httpcode).\n";
}

curl_close($ch);
?>

Perl

As above, make sure that you are using the same Perl interpreter that your production site uses, or you can end up with a false positive or false negative. If you get output saying “403 – Forbidden: Access is denied”, then it is working because TLS connected successfully. The output below shows a failed negotiation:

# perl -MLWP::UserAgent -e 'print LWP::UserAgent->new->get("https://apitest.authorize.net")->decoded_content'
Can't connect to apitest.authorize.net:443

LWP::Protocol::https::Socket: SSL connect attempt failed with unknown errorerror:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version at /usr/lib/perl5/vendor_perl/5.10.0/LWP/Protocol/http.pm line 57.

OpenSSL/Generic

To check from the command line without PHP, you can use the following which shows a failed TLS negotiation:

# openssl s_client -connect apitest.authorize.net:443
CONNECTED(00000003)
30371:error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version:s23_clnt.c:605
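
A failure like the one above usually means the local OpenSSL library is too old: TLS v1.2 support first appeared in OpenSSL 1.0.1. The sketch below checks a version string as printed by `openssl version`. The function name is hypothetical, and it only understands plain "OpenSSL x.y.z" version strings; anything else is reported as unknown.

```shell
# openssl_version_has_tls12 "VERSION STRING"
# Returns 0 if the string reports OpenSSL 1.0.1 or newer, 1 if it is older,
# and 2 if the string is not recognised as an OpenSSL version.
openssl_version_has_tls12() {
    ver=$(printf '%s\n' "$1" |
        sed -n 's/^OpenSSL \([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2 \3/p')
    [ -n "$ver" ] || return 2
    set -- $ver                      # $1=major $2=minor $3=patch
    [ "$1" -gt 1 ] && return 0
    [ "$1" -eq 1 ] && [ "$2" -ge 1 ] && return 0
    [ "$1" -eq 1 ] && [ "$2" -eq 0 ] && [ "$3" -ge 1 ] && return 0
    return 1
}

# Example use:
#   openssl_version_has_tls12 "$(openssl version)" && echo "TLS v1.2 capable"
#   openssl s_client -connect apitest.authorize.net:443 -tls1_2 </dev/null
```

The version check only describes the openssl binary on the PATH. PHP or Perl may be linked against a different library, which is why the language-specific tests above are still worth running.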

Other Languages

If you use any other language, we can help verify that your application is set up to work correctly.  Just let us know and we can work with you directly.  I hope this post helps; please comment below!

-Eric

Netboot CentOS 7 on ATAoE

This writeup covers the process of configuring CentOS 7 to boot seamlessly from an ATAoE target.

This is our setup:

  • TFTP server with a MAC-specific boot menu presenting the kernel and initrd. See http://www.syslinux.org/wiki/index.php/PXELINUX
  • Server exporting an LVM Logical Volume with a CentOS 7 install on it using vblade.
  • Computer to boot from the network (This computer has no hard drive and is using the exported LV as its disk).
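
For the MAC-specific boot menu, PXELINUX looks under pxelinux.cfg/ for a file named after the client's hardware address: the ARP type code (01 for Ethernet) followed by the MAC in lowercase hex, separated by dashes. A minimal sketch of generating that name and a menu entry follows. The function names, the centos7 label, and the root= device are placeholders rather than details of our setup, and the kernel and initrd file names follow the vmlinuz-KERNEL_VER and initramfs-KERNEL_VER.img pattern used in this guide.

```shell
# pxelinux_cfg_name MAC
# Print the per-client config file name, e.g. 01-88-99-aa-bb-cc-dd.
pxelinux_cfg_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# pxelinux_cfg_body KERNEL_VER ROOT_DEVICE
# Print a minimal menu entry that boots the given kernel and initrd.
# The label and the root= argument are illustrative placeholders.
pxelinux_cfg_body() {
    cat <<EOF
DEFAULT centos7
LABEL centos7
    KERNEL vmlinuz-$1
    APPEND initrd=initramfs-$1.img root=$2 ro
EOF
}

# Example use on the TFTP server (paths and device are placeholders):
#   pxelinux_cfg_body KERNEL_VER /dev/etherd/e0.1p1 \
#       > "/path/to/tftp/pxelinux.cfg/$(pxelinux_cfg_name 88:99:AA:BB:CC:DD)"
```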

Required components:

  • A kernel that supports AoE and your network card. We are using a custom-built 3.17.4 kernel. If you are building your own kernel, the driver for AoE is in Device Drivers -> Block Devices and is called ATA over Ethernet Support. The driver for your network card is in Device Drivers -> Network device support -> Ethernet driver support, under the appropriate manufacturer.
  • A custom dracut module to bring up the network and discover AoE targets at boot time.
  • Packages:
    • vblade (This is only required on the server exporting the LV).
    • Dracut (dracut, dracut-network, dracut-tools)

Required Steps:

1. Install CentOS 7 onto an LV on the boot server. Since this is not in the scope of this tutorial, I will leave this part up to you. There is plenty of documentation out there (we installed it under KVM first, and exported the resulting disk as an AoE target).
2. Once you have it installed, you need to enter the environment that you deployed in step 1 to install a dracut AoE module and generate an initrd.
3. Next, make sure that you have a kernel that supports aoe and your network card. It is important that both of those kernel modules get built into your initrd or this will not work.
4. At this point (hopefully) we are in the CentOS 7 environment, whether chrooted, virtual machine, or something else; we can now install packages:

1. yum install dracut-network dracut-tools dracut
2. cd /usr/lib/dracut/modules.d/ && ls

You’ll notice that there are a bunch of folders in here, all starting with two digits. The digits signify the order in which the modules are loaded. To make sure all of the prerequisite modules are loaded, we went with 95aoe for our module directory.

3. mkdir 95aoe && cd 95aoe
4. There are 3 files that we need to create — module-setup.sh, parse-aoe.sh, and aoe-up.sh

1. Let’s start with module-setup.sh, since this is the script that will pull into the initrd all of the pieces we need for AoE to work.

#!/bin/bash
# -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
# ex: ts=8 sw=4 sts=4 et filetype=sh

check() {
        for i in mknod ip rm bash grep sed awk seq echo mkdir; do
                type -P $i >/dev/null || return 1
        done

        return 0
}

depends() {
        echo network
        return 0
}

installkernel() {
        instmods aoe
}

install() {
        inst_multiple mknod ip rm bash grep sed awk seq echo mkdir

        inst "$moddir/aoe-up.sh" "/sbin/aoe-up"
        inst_hook cmdline 98 "$moddir/parse-aoe.sh"
        dracut_need_initqueue
}

2. Next is parse-aoe.sh. This script will load the modules and queue up our main script to be run by init.

#!/bin/sh

modprobe aoe
udevadm settle --timeout=30
/sbin/initqueue --settled --unique /sbin/aoe-up

3. Finally, we need aoe-up.sh, which is what is actually run by init to bring up the network interfaces and discover the AoE device being exported. It’s worth noting that I am setting the MTU to 9000 on each interface, which won’t necessarily be supported on your system. If you are unsure, remove the line “ip link set dev ${INTERFACES[$i]} mtu 9000”:

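#!/bin/bash

PATH=/usr/sbin:/usr/bin:/sbin:/bin

# wait_for_if_up is provided by dracut's network module. Load it if present
# (assumption: the module installs it as /lib/net-lib.sh in the initrd).
[ -r /lib/net-lib.sh ] && . /lib/net-lib.sh

# Send all output to the initramfs log.
exec >>/run/initramfs/loginit.pipe 2>>/run/initramfs/loginit.pipe

# Recreate the AoE discover device node (char major 152, minor 3).
mkdir -p /dev/etherd
rm -f /dev/etherd/discover
mknod /dev/etherd/discover c 152 3

# Collect the names of all broadcast-capable interfaces.
INTERFACES=(`ip link |grep BROADCAST |awk -F ":" '{print $2}' |sed 's/ //g' |sed 's/\n/ /g'`)
TOTAL_INTERFACES=${#INTERFACES[@]}

for i in `seq 0 $(($TOTAL_INTERFACES-1))`; do
    ip link set dev ${INTERFACES[$i]} mtu 9000
    ip link set ${INTERFACES[$i]} up
    wait_for_if_up ${INTERFACES[$i]}
done

ip link show

# Writing to the discover node makes the aoe driver probe for targets.
echo > /dev/etherd/discover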

5. Now that we have written our AoE dracut module, we need to rebuild the initrd. Before we do this however, we need to make sure that dracut will pull in the modules we need (aoe and your NIC module). There are a few different ways to do this, but here are two options:

1. modprobe aoe && modprobe NIC_MODULE && dracut --force /boot/initramfs-KERNEL_VER.img KERNEL_VER
2. dracut --force --add-drivers aoe --add-drivers NIC_MODULE /boot/initramfs-KERNEL_VER.img KERNEL_VER

6. Now that we have our new initrd, we need to transfer the kernel and initrd to our tftp server

1. scp /boot/initramfs-KERNEL_VER.img /boot/vmlinuz-KERNEL_VER TFTP_SERVER:/path/to/tftp
2. Make sure that permissions are set correctly (should be chmod 644)

7. Before we shut down our CentOS 7 VM, there is one last bit of configuration to adjust. Since the network must be active to do anything with our file system, we need to make sure that, on shutdown, the network isn’t brought down before the file system is unmounted. In CentOS 6, this was as easy as running `chkconfig --level 0123456 on`. CentOS 7 uses systemd, so this solution will not work. Fortunately, the solution is simple (and probably works in CentOS 6 too): modify /etc/fstab, adding the option _netdev to each mount point that is being exported with AoE (e.g. change defaults to defaults,_netdev).
8. Now we need to get vblade (http://sourceforge.net/projects/aoetools/files/vblade/) and put it on the system exporting the LV with CentOS 7 on it. If you don’t want to install it onto your system, you can just make it and run it from where you extracted it. To export the disk, just run:

1. vbladed -b 1024 -m MAC 0 1 INTERFACE /path/to/disk (for information on what each argument does, check out the vblade man page, which is included with the package)
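
The /etc/fstab change from step 7 can be scripted. The sketch below appends _netdev to the options field of one mount point and prints the result so it can be reviewed first. add_netdev is a hypothetical helper; note that awk rewrites the modified line with single spaces between fields.

```shell
# add_netdev FSTAB MOUNTPOINT
# Print FSTAB with ",_netdev" appended to the options (4th field) of
# MOUNTPOINT, unless the option is already present.
add_netdev() {
    awk -v mp="$2" '
        /^[[:space:]]*#/ || NF < 4 { print; next }   # keep comments and blanks
        $2 == mp && $4 !~ /(^|,)_netdev(,|$)/ { $4 = $4 ",_netdev" }
        { print }
    ' "$1"
}

# Example use inside the CentOS 7 environment:
#   add_netdev /etc/fstab / > /tmp/fstab.new
#   diff -u /etc/fstab /tmp/fstab.new
```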

And that’s it! I recommend at this point to run dracut on your netbooted hardware to make sure everything still loads (at this point you shouldn’t have to specifically install the AoE module or your NIC module, so make sure that this is true). Don’t forget to copy the newly created initrd to the TFTP server, and I recommend not overwriting your working initrd so that you can easily go back to a known working state.

What took me days to get working should now only take you an hour or so! I tried to be as detailed as I could, and I don’t think I left anything out, but if you have any issues, please let me know and I’ll update this guide accordingly.

Show the virtual machine name in dstat instead of showing qemu

Do you run dstat to watch Linux KVM hypervisors, but wish process names showed virtual machine names?  Me too.

This patch does just that:

--- a/usr/bin/dstat	2009-11-24 01:30:11.000000000 -0800
+++ b/usr/bin/dstat	2014-11-07 10:20:09.719148833 -0800
@@ -1946,6 +1946,12 @@
         return os.path.basename(name)
     return name

+def index_containing_substring(the_list, substring):
+	for i, s in enumerate(the_list):
+		if substring in s:
+			return i
+	return -1
+
 def getnamebypid(pid, name):
     ret = None
     try:
@@ -1956,6 +1962,10 @@
         if ret.startswith('-'):
             ret = basename(cmdline[-2])
             if ret.startswith('-'): raise
+        if any("qemu" in s for s in cmdline):
+            idx = index_containing_substring(cmdline, '-name')
+            if idx >= 0:
+                ret = cmdline[idx+1]
         if not ret: raise
     except:
         ret = basename(name)
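
The same lookup can be tried from a shell to check what name the patch will pick up for a given guest. This sketch reads a NUL-separated cmdline file such as /proc/PID/cmdline and prints the argument that follows -name. vm_name_from_cmdline is a hypothetical helper, and unlike the patch it matches -name exactly rather than as a substring. Newer libvirt versions may pass a value such as guest=NAME,... here, which both the patch and this sketch would print as-is.

```shell
# vm_name_from_cmdline FILE
# Print the argument following "-name" in a NUL-separated cmdline file.
vm_name_from_cmdline() {
    tr '\0' '\n' < "$1" |
        awk 'prev == "-name" && !done { print; done = 1 } { prev = $0 }'
}

# Example use (PID is the process ID of a qemu process):
#   vm_name_from_cmdline /proc/PID/cmdline
```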

Forcing insserv to start sshd early

Many distributions use `insserv`-based dependency ordering at boot time.  After a bit of searching, I found very little actual documentation on the subject.  Here’s the process:

  1. Add override files to /etc/insserv/overrides/
  2. The files must contain ‘### BEGIN INIT INFO’ and ‘### END INIT INFO’, else insserv will ignore them.
  3. Some have indicated that you can override just the missing LSB fields with this method; however, insserv does require the Default-Start and Default-Stop fields even though you wouldn’t expect to need to override those.
  4. The name of the file in /etc/insserv/overrides must be equal to the name in /etc/init.d, *not* the name it “Provides:”.  In an ideal world, the two names would be the same, but that isn’t always so.

For my purpose, I created overrides for all of my services in rc2.d with this script.  Note that the overrides are just copies of the content  from the /etc/init.d/ scripts:

cd /etc/rc2.d
# This is one long line; $f is filename, $p is the Provides value.
grep Provides * | cut -f1,3 -d: | tr -d : | while read f p; do perl -lne '$a++ if /BEGIN INIT INFO/; print if $a; $a-- if /END INIT INFO/' $f > /etc/insserv/overrides/$p;done

Note that the script writes the filename from the “Provides” field so you may need to change the filename if you have initscripts where /etc/init.d/script doesn’t match the Provides field.  Notably, Debian Wheezy does not follow this for ssh.  Provides is sshd, but the script is named ssh.

Next, I append sshd to the Required-Start line of all of my overrides:

cd /etc/insserv/overrides/
perl -i -lne 's/(Required-Start.*)$/$1 sshd/; print' *

This of course creates a cyclic dependency for ssh, so fix that one up by hand.  Feel free to make any other boot-order preferences while you’re in the overrides directory.  For this case, ssh  was made dependent on netplug.
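
Both steps can also be done one script at a time, which makes it easy to name each override after the init.d script rather than its Provides value. The function below is a sketch (lsb_header_with_sshd is a hypothetical name) that prints a script's LSB header with sshd appended to Required-Start; the loop in the comment shows one way it might be used.

```shell
# lsb_header_with_sshd INIT_SCRIPT
# Print the LSB header block of INIT_SCRIPT, adding sshd to Required-Start.
lsb_header_with_sshd() {
    sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/{
        s/^\(# Required-Start:.*\)$/\1 sshd/
        p
    }' "$1"
}

# Example use (skips ssh itself to avoid the cyclic dependency):
#   for link in /etc/rc2.d/S*; do
#       script=$(readlink -f "$link"); name=$(basename "$script")
#       [ "$name" = ssh ] && continue
#       lsb_header_with_sshd "$script" > "/etc/insserv/overrides/$name"
#   done
```

Scripts without an LSB header produce an empty override, so check for zero-length files before running insserv.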

Finally, run `insserv` and double-check that it did what you expected:

# cat /etc/init.d/.depend.start
TARGETS = rsyslog munin-node killprocs motd sysfsutils sudo netplug rsync ssh mysql openvpn ntp wd_keepalive apache2 bootlogs cron stop-readahead-fedora watchdog single rc.local rmnologin
INTERACTIVE =
netplug: rsyslog
rsync: rsyslog
ssh: rsyslog netplug
mysql: rsyslog ssh
ntp: rsyslog ssh
[...snip...]

Voilà!  Now I can ssh to the host far earlier in the boot, before services that can take a long time to start, which makes it possible to troubleshoot if there is a problem.  In my opinion, ssh should always run directly after the network starts.

-Eric

 

Tightening CentOS/RHEL Security

While there is far more to hardening a server than this single example, this is an often overlooked security issue in many default installations of RHEL and RHEL-based distributions (CentOS, Scientific Linux, etc.)

CentOS and RHEL come with the isdn4k-utils and coolkey packages installed by default for graphical workstations.  Unfortunately, these packages create world-writable directories from which binaries and scripts may be executed.  While it is common to tighten /tmp, /var/tmp and /usr/tmp against execution attacks, these directories often go unnoticed.

If you do not use these utilities (and few servers do), they can be easily removed:

yum remove isdn4k-utils coolkey
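
To see which world-writable directories exist on a system, before or after removing these packages, find can list them. A minimal sketch (world_writable_dirs is a hypothetical helper; -xdev keeps the search on one filesystem, so run it once per mount point of interest):

```shell
# world_writable_dirs DIR
# List directories under DIR (same filesystem only) that any user can write to.
world_writable_dirs() {
    find "$1" -xdev -type d -perm -0002 2>/dev/null
}

# Example use:
#   world_writable_dirs /
#   world_writable_dirs /var
```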

Of course if you are using these, then you should find a way to secure these mountpoints with the noexec mount option.  This can be done with a loopback filesystem mounted atop the offending mountpoints or with separate LVM volumes for each.

Traditionally, /var does not run executable code, so you could mount the entire /var mountpoint as noexec.  It’s a great security practice if you can support it; however, some packages expect to run their update scripts out of /var/tmp/, so be prepared to fix some broken package updates or installations.  When you do hit a package error, simply remount /var as executable:

mount -o remount,rw,exec /var

install the package, and then disable execution on the mountpoint:

mount -o remount,rw,noexec /var

I recommend nosuid and nodev mount options for these types of mount points as well to restrict less common attack vectors.

-Eric

Bypassing the link-local routing table

Linux can use multiple routing tables, which is convenient for providing different routes for specific networks based on many different criteria, such as the source address.  For example, if you want to route traffic from 192.168.99.0/24 out the 172.17.22.1 default gateway, you could create a new table and route it as such:

# ip route add default via 172.17.22.1 dev eth7 table 100
# ip rule add from 192.168.99.0/24 lookup 100

Now imagine another scenario, where you wish to route traffic from 192.168.99.0/24 to an external network (the Internet), but 1.2.3.0/24 is (for some reason) link-local on your host.  That is, an address like 1.2.3.4 is directly assigned to an adapter on your host.  Linux tracks these locally assigned addresses through its ‘local’ routing table, and `ip rule show` lists the preference order as:

# ip rule show
0:    from all lookup local 
32766:    from all lookup main 
32767:    from all lookup default

You might think deleting and adding the ‘local’ rule above with a higher preference and placing your new rule above it would fix the problem, but I’ve tried it—and it doesn’t.  Searching around shows that others have had the same problem.

So what to do?  Use fwmark.

First, change local’s preference from 0 to 100:

ip rule del from all pref 0 lookup local
ip rule add from all pref 100 lookup local

Next, mark all traffic from 192.168.99.0/24 with some mark; we are using “1”.  Note that I am using OUTPUT because 192.168.99.0/24 is my local address.  You might want PREROUTING if this is a forwarding host.

iptables -t mangle -A OUTPUT -s 192.168.99.0/24 -j MARK --set-mark 1

And finally add the rule that routes it through table 100:

# ip rule add fwmark 1 pref 10 lookup 100
# ip rule show
10:    from all fwmark 0x1 lookup 100
100:    from all lookup local
32766:    from all lookup main
32767:    from all lookup default

# ip route flush cache

Now all locally generated traffic to 1.2.3.0/24 from 192.168.99.0/24 will head out 172.17.22.1 on eth7 through table 100, instead of being looked up in the ‘local’ table.
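
A small check can confirm the rule order after a reboot or a configuration change. The sketch below reads `ip rule show` output and succeeds only when a fwmark rule is listed with a lower preference number than the ‘local’ lookup. fwmark_before_local is a hypothetical helper name.

```shell
# fwmark_before_local
# Reads "ip rule show" output on stdin. Succeeds only if a fwmark rule has a
# lower preference number than the rule that looks up the 'local' table.
fwmark_before_local() {
    awk -F: '
        / fwmark /      && fw == ""  { fw  = $1 + 0 }
        / lookup local/ && loc == "" { loc = $1 + 0 }
        END { exit !(fw != "" && loc != "" && fw < loc) }
    '
}

# Example use:
#   ip rule show | fwmark_before_local && echo "fwmark rule precedes local"
```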

Yay!

-Eric