Recover Deleted MegaRAID Volume with Linux

Recently a customer with a 4-disk RAID5 array backed by a MegaRAID controller came to us because their RAID volume was missing. The virtual disk (VD) exported by the RAID controller had disappeared!

The logs indicated that it was deleted, but as far as we can tell no one was on the system at the time. Maybe this was a hardware/firmware error, or user error, but either way the information that defines the RAID volume was no longer available.

We were able to use the Linux RAID 5 module to recover the missing data!  Read on:

Hardware RAID Volumes

When a volume is deleted, the RAID controller removes the volume descriptor on the physical disks but it does not destroy the data itself; the data is still there. There are ways to recover the data using the controller itself. MegaRAID volume recovery documentation suggests that you to re-create the volume using the same parameters and disk order as the original RAID. For example, if you know that it was a 256k stripe then you can recreate the array with the same disk ordering and same stripe size. However, for a raid volume that has been in service for years, how can this possibly be known unless someone wrote it down?

While the controller would then stamp the disks with a new header and leave the data alone, theoretically there will be no data loss and the array will continue as it had originally. This procedure comes with quite a bit of risk. if the parameters are off then you can introduce data corruption. Thus, you should back up each disk individually as a raw full disk image. Unfortunately this takes a long time and as far from convenient. If you get the parameters wrong then the data should be restored before trying again to guarantee consistency, and that takes even longer.

Guarantee Recovery Without Data Loss

Recovery in place was the only option because it would take too long to do full backups during each iteration. However, we also had to guarantee that the recovery process would not introduce failures that could lead to data corruption. For example, if we had chosen a 64k stripe size but it was formatted with a 256k stripe size, then the data would be incorrect. While it is probably safe to try multiple stripe sizes without initialization by the raid controller, there is the risk of technician error causing data loss during this process.

You should always avoid single-shot processes with the risk of corrupting data. In this case terabytes of scientific data were at riskand I certainly did not trust the opaque behavior of a hardware raid controller card that might try to initialize an array with invalid data when I did not want it to. Using a procedure provided by the RAID controller making several attempts, each with an additional risk of losing data, is unnerving to say the least!

I would much prefer a method that is more likely to succeed, and certainly with the guarantee that it is impossible to lose data. When working with customer data it is imperative that every test we make along the way during the process of data recovery is guaranteed not to make things worse.

How to use Linux for RAID Recovery

This is where Linux comes in: using the Linux loopback driver we can require read-only access to the disks. By using the Linux dm-raid5 module we can attempt to reconstruct the array by guessing it’s parameters. This allows us a limitless number of tries and the guarantee that the process of trying recover the data never cause corruption, whether or not it succeeds.

First we started by exporting the disks on the RAID controller as jbod volumes. This allows Linux to see the raw volumes as they were without being modified by the RAID controller firmware:

storcli64 /c0 set jbod=on

Once the drives are available to the operating system as raw disks, we used the Linux loopback driver to configure them as read only volumes.  For example:

losetup --read-only /dev/loop0 /dev/sdX
losetup --read-only /dev/loop1 /dev/sdY
losetup --read-only /dev/loop2 /dev/sdZ
losetup --read-only /dev/loop3 /dev/sdW

Now that the volumes are read only we can attempt to construct them using the Linux RAID 5 device mapper target. There are several major problems:

  1. We do not know the stripe size
  2. We do not know the on-disk format used by the RAID controller.
  3. Worse than that, we do not know the disk ordering that the RAID controller selected when it built the array.

This is makes for quite a few unknown variables–and only one combination will be correct. In our case there were only 4 disks so the possible disk ordering is 24 (4! = 4*3*2*1 = 24).

Now it is a matter of trial and error and we can use the computer to help us a bit to minimize the amount of typing we have to do, but there is still manual inspection to review and make sure that the ordering it found is useful and correct:

[0,1,2,3], [0,1,3,2], [0,2,1,3], [0,2,3,1], [0,3,1,2], [0,3,2,1],
[1,0,2,3], [1,0,3,2], [1,2,0,3], [1,2,3,0], [1,3,0,2], [1,3,2,0],
[2,0,1,3], [2,0,3,1], [2,1,0,3], [2,1,3,0], [2,3,0,1], [2,3,1,0],
[3,0,1,2], [3,0,2,1], [3,1,0,2], [3,1,2,0], [3,2,0,1], [3,2,1,0]

We permuted all possible disk orderings and passed them through to the `dm-raid` module to create the device mapper target:

dmsetup create foo --table '0 23438817282 raid raid5_la 1 128 4 - 7:0 - 7:1 - 7:2 - 7:3'

For each possible permutation, we used `gfdisk` to determine if the partition table was valid and found that disk-3 and disk-1 presented a valid partition table if they were the first drive in the list; thus, we were able to rule out half of the permutations in a short amount of time:

gdisk -l /dev/foo
...
Number Start (sector) End (sector) Size Code Name
1 2048 23438817279 10.9 TiB 0700 primary

we did not know the exact sector count of the original volume, so we had to estimate based on the ending sector reported by gdisk. We knew that there were 3 data disks so the sector count had to be a multiple of three.

The next step was to use the file system checker (e2fsck / fsck.ext4) to determine which permutation has the least number of errors. Many of the permutations that we tested fail the minute immediately the file system checker did not recognize the data at all. However, in a few of the permutations the file system checker understood the file system enough to spew 1000s of lines of errors on the screen. We knew we were getting closer, but none of the file system checks seemed to complete with a reasonable number of errors. This caused us to speculate that our initial guess of a 64k stripe size was incorrect. The next stripe size we tried was 256k and we began to see better results. Again many of the file system checks failed altogether but the file system checker seems to be doing better on some of the permutations. however, it still was not quite right. We had only been trying the default raid5_la module format, but the dm-raid module has the following possible formats:

  • raid5_la RAID5 left asymmetric – rotating parity 0 with data continuation
  • raid5_ra RAID5 right asymmetric – rotating parity N with data continuation
  • raid5_ls RAID5 left symmetric – rotating parity 0 with data restart
  • raid5_rs RAID5 right symmetric – rotating parity N with data restartwe added a 2nd loop to test every raid format for every permutation and when it reached raid5_ls on the 23rd permutation, the file system checker became silent and took a very long time. Only rarely did it spit out a benign warning about some structure problem that it found which was probably already in the valid RAID array to begin with. We had found the correct configuration to recover this raid volume!While we had to figure this out initially by trial and error using a simple Perl script to discover the configuration, you know that the MegaRAID controller uses the raid5_ls RAID type. This was the correct configuration for our drive:
  • RAID disk ordering: 3,2,0,1
  • RAID Stripe size: 256k
  • RAID on-disk format: raid5_ls – left symmetric: rotating parity 0 with data restart

Now that the raid volume was constructed we wanted to test to make sure it would mount and see if we can access the original data. Modern file systems have journals that will replay at mt time and we needed to keep this read only because a journal replay of invalid data could cause corruption. Thus, we used the “noload” option while mounting to prevent replay:

mount -o ro,noload /dev/foo /mnt/tmp

The volume was read only because we used a read-only loopback device, so it was safe, but when trying to mount without the noload option it would refuse to mount because the journey journal replay failed.

Automating the Process

whenever there is a lot of work to do, we always do our best to automate the process to save time and minimize user error. Below you can see the Perl script that we used to inspect the disk.

The script does not have any specific intelligence, it just spits out the result of each test for human inspection; of course it needs to be tuned to the specific environment. All tests are done in a read-only mode and loopback devices were configured before running the script.

When the file system checker for a particular permutation would display 1000s of lines of errors and we would have to kill that process from another window so it would proceed and try the next permutation. In these cases there was so much text displayed on the screen that we would pipe the output through less or directed into a file to inspect it after the run.

This script is for informational use only, use it at your own risk! If you are in need of RAID volume disk recovery on a Linux system than we may be able to help with that. Let us know if we can be of service!

#!/bin/perl

#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2 of the License, or
#   (at your option) any later version.
# 
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU Library General Public License for more details.
# 
#   You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#   Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
#   Copyright (c) 2023 Linux Global, all rights reserved.

use strict;

my @p = (
	[0,1,2,3], [0,1,3,2], [0,2,1,3], [0,2,3,1], [0,3,1,2], [0,3,2,1],
	[1,0,2,3], [1,0,3,2], [1,2,0,3], [1,2,3,0], [1,3,0,2], [1,3,2,0],
	[2,0,1,3], [2,0,3,1], [2,1,0,3], [2,1,3,0], [2,3,0,1], [2,3,1,0],
	[3,0,1,2], [3,0,2,1], [3,1,0,2], [3,1,2,0], [3,2,0,1], [3,2,1,0]);

my $stripe = 256*1024/512;
my $n = 0;

foreach my $p (@p)
{
        next unless $p->[0] =~ /3|1/;

        for my $fmt (qw/raid5_la raid5_ra raid5_ls raid5_rs raid5_zr raid5_nr raid5_nc/)
        {
                activate($p, $fmt);
        }
        $n++
}

sub activate
{
        my ($p, $fmt) = @_;

        system("losetup -d /dev/loop7; dmsetup remove foo");
        print "\n\n========= $n $fmt: @$p\n";

        my $dmsetup = "dmsetup create foo --table '0 23438817282 raid $fmt 1 $stripe "
               . "4 - 7:$p->[0] - 7:$p->[1] - 7:$p->[2] - 7:$p->[3]'";
        
        print "$dmsetup\n";
        system($dmsetup);
        system("gdisk -l /dev/mapper/foo |grep -A1 ^Num");
        system("losetup -r -o 1048576 /dev/loop7 /dev/mapper/foo");
        system("file -s /dev/loop7");
        system("e2fsck -fn /dev/loop7 2>&1");
}

 

How to reset your root password!

From time to time I am asked how to set reset the root password on a Linux server when you get locked out. Obvious for security reasons you can’t to do this remotely, but it is pretty easy to do if you have physical access.

First, reboot the system and interrupt the grub bootloader that looks like this by pressing ‘e’. In this example we are booting from CentOS, but it works for pretty much any modern Linux distribution (SuSE, Debian, Ubuntu, etc), because they all use grub:

 grub bootloader interface

The when you press ‘e’ it will display a edit screen like this.  If you are prompted for a password when you press ‘e’ then your system administrator has disabled editing the bootloader configuration.   It is still possible to reset the root password if your bootloader his password it, but you will need to boot off of a rescue disk to do it.

 grub edit menu

Arrow down to the first section/line that starts with ‘linux16’ or ‘linuxefi’ or ‘linux’. it usually looks something like this (yes, it is a big long single line):

linux16 /vmlinuz-3.10.0-1127.19.1.el7.x86_64 root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=auto vconsole.keymap=us biosdevname=0 net.ifnames=0 rd.auto=1 LANG=en_US.UTF-8

Add this to the end of that line; this is a temporary change to the bootloader that will exist only for this reboot:

init=/bin/bash

It should look something like this:

linux16 ... init=/bin/bash

Then press control-x to boot and eventually it will give you a bash prompt. When it does, run the following commands:

# mount -o remount,rw /
# passwd root
<change the password>
# mount -o remount,ro /
# echo b > /proc/sysrq-trigger

The last line will reboot the system. That should do it! Now you should be able to log in with your new root password.

-Eric

RedHat/CentOS/RHEL 7 does not copy mdadm.conf into Dracut

Force MD and LUKS Auto-Detection

There is a bug in RedHat 7 releases for some systems when md is used that prevents booting. For some reason it does not copy mdadm.conf into the initrd generated by dracut. The fix recommended on the bug page (https://bugzilla.redhat.com/show_bug.cgi?id=1015204) recommends adding rd.md.uuid=<UUID> but that can be alot of work if you have many volumes. In addition, if you cannot paste the UUID then it is hard to type.

To automatically enable md and luks detection, add “rd.auto=1” to the kernel command line. You can see other command line options in the dracut documentation here: https://www.man7.org/linux/man-pages/man7/dracut.cmdline.7.html

Force Docker to Boot Container into One Program

Specifying the Program that Docker Should Launch at Start Time

Sometimes if you are working with a container that gets stuck in a boot loop you need to force it to start a specific application for debugging purposes. We had this problem using the Dropbox container that was build by  janeczku.

It works great most of the time, but during one of the Dropbox updates, the container would not start so it would start and stop rapidly. This is a persistent container that comes up each time the server boots so we need to start it into something like `sleep 30m` so that we can run `docker -it exec dropbox /bin/bash` and inspect to see what the problem is.

Modifying the Configuration

In /var/lib/dropbox/<hash>/config.v2.json you will see an XML file similar to the one below, but notably, it has not been pretty printed as we have done for you. Somewhere in the list you will find the “Entrypoint” setting. Our system came with this set as “Entrypoint” : [ “/root/run” ], but /root/run is the script that is exiting and causing the container to boot loop. For our example, we have moved the “Entrypoint” line in our config at the top so you can see it, though yours can be anywhere in the file. Note that the array is the command to run followed by its list of arguments and we set it to /bin/sleep 30m as you can see below so that we can log into the container and get a bash prompt.

Note that the Docker service needs to be stopped before modifying the config file. In our case there was only one container so stopping the service was not an issue; if you have multiple containers then you may need to find a way to modify the config so that Docker will accept it without a restart—and if you do find out how to do that please post it in the comments below.

I hope this works for you, we searched all over the place and had trouble finding these configuration details.

{
   "Entrypoint" : [
      "/bin/sleep",
      "30m"
   ],
   "NetworkSettings" : {
      "SecondaryIPAddresses" : null,
      "SandboxID" : "cdb37ab22aecc2d6bbde325c515c034f2aa3a2677f39df75c77712ebe2e9c545",
      "SecondaryIPv6Addresses" : null,
      "SandboxKey" : "/var/run/docker/netns/cdb37ab22aec",
      "LinkLocalIPv6Address" : "",
      "Bridge" : "",
      "HasSwarmEndpoint" : false,
      "Service" : null,
      "Ports" : null,
      "Networks" : {
         "bridge" : {
            "IPAMOperational" : false,
            "EndpointID" : "",
            "GlobalIPv6PrefixLen" : 0,
            "IPPrefixLen" : 0,
            "IPAMConfig" : null,
            "IPAddress" : "",
            "IPv6Gateway" : "",
            "Aliases" : null,
            "Links" : null,
            "NetworkID" : "237c33c976ab7863c1a869b2f19492ecdd4d820763b7033cc50397147b26d324",
            "MacAddress" : "",
            "GlobalIPv6Address" : "",
            "Gateway" : ""
         }
      },
      "HairpinMode" : false,
      "LinkLocalIPv6PrefixLen" : 0,
      "IsAnonymousEndpoint" : false
   },
   "SeccompProfile" : "",
   "RestartCount" : 0,
   "HostsPath" : "/var/lib/docker/containers/18ea58191394f07d288383cc66c5ea58d99c0dac8c9c5007158b9a2378d6b66e/hosts",
   "ExposedPorts" : {
      "17500/tcp" : {}
   },
   "HostnamePath" : "/var/lib/docker/containers/18ea58191394f07d288383cc66c5ea58d99c0dac8c9c5007158b9a2378d6b66e/hostname",
   "MountLabel" : "system_u:object_r:svirt_sandbox_file_t:s0:c57,c401",
   "HasBeenStartedBefore" : true,
   "Labels" : {},
   "NoNewPrivileges" : false,
   "OpenStdin" : false,
   "Hostname" : "18ea58191394",
   "Volumes" : {
      "/dbox/Dropbox" : {},
      "/dbox/.dropbox" : {}
   },
   "MountPoints" : {
      "/dbox/Dropbox" : {
         "Spec" : {
            "Target" : "/dbox/Dropbox/",
            "Type" : "bind",
            "Source" : "/data/Dropbox (GPI)/"
         },
         "RW" : true,
         "Destination" : "/dbox/Dropbox",
         "Type" : "bind",
         "Propagation" : "rprivate",
         "Source" : "/data/Dropbox (GPI)",
         "Driver" : "",
         "Name" : ""
      },
      "/dbox/.dropbox" : {
         "Spec" : {
            "Target" : "/dbox/.dropbox",
            "Type" : "volume"
         },
         "Type" : "volume",
         "RW" : true,
         "Driver" : "local",
         "Destination" : "/dbox/.dropbox",
         "Source" : "",
         "Name" : "eb96b3bab31838496f2ca3ce0b6db476aaba9c16d9b6bc0f7da3d05f1f964120"
      }
   },
   "Env" : [
      "DBOX_UID=501",
      "DBOX_GID=1011",
      "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "DEBIAN_FRONTEND=noninteractive"
   ],
   "StdinOnce" : false,
   "ArgsEscaped" : true,
   "ResolvConfPath" : "/var/lib/docker/containers/18ea58191394f07d288383cc66c5ea58d99c0dac8c9c5007158b9a2378d6b66e/resolv.conf",
   "HasBeenManuallyStopped" : false,
   "Driver" : "overlay2",
   "AttachStdout" : false,
   "User" : "root",
   "ProcessLabel" : "system_u:system_r:svirt_lxc_net_t:s0:c57,c401",
   "Cmd" : null,
   "AttachStdin" : false,
   "AttachStderr" : false,
   "SecretReferences" : null,
   "AppArmorProfile" : "",
   "ShmPath" : "/var/lib/docker/containers/18ea58191394f07d288383cc66c5ea58d99c0dac8c9c5007158b9a2378d6b66e/shm",
   "Image" : "sha256:a8964074d4f6eac2dfdbf03200c4c73d571b1ea7ad8fcb8d99b918642de2f8d2",
   "Tty" : false,
   "LogPath" : "",
   "Domainname" : "",
   "WorkingDir" : "/dbox/Dropbox",
   "OnBuild" : null,
   "Name" : "/dropbox"
}

-Eric

Debug spinning PHP script on a WHM/cPanel Server

Getting A Backtrace

Sometimes PHP will spin at 100% CPU and it is difficult to figure out why. The `strace` command is too noisy, and without knowing where in the code there is a problem, you cannot insert your own backtrace. The newer version of WHM has support for multiple PHP versions, so make sure you run this for whatever PHP version the site is using. In our case, this is using php-fpm.

First, install xdebug:

/opt/cpanel/ea-php72/root/usr/bin/pecl install xdebug

After that, follow the instructions here: https://stackoverflow.com/questions/14261821/get-a-stack-trace-of-a-hung-php-script#53056294

Basically you just need to run the following:

gdb --batch --readnever --pid=$pid --command=/tmp/dumpstack.gdbscript

And the content of dumpstack.gdbscript is:

set $xstack = ((long**)&xdebug_globals)[2]
if ($xstack !=0) && ($xstack[0]!=0)
set $pcurrent = (long*)$xstack[0]
while $pcurrent
set $xptr = (long*)$pcurrent[0]
set $xptr_s = (char**)$xptr
set $xptr_i = (int*)$xptr
set $filename = $xptr_s[4]
set $funcname = $xptr_s[1]
set $linenum = $xptr_i[10]
if ($funcname!=0)
printf "%s@%s:%d\\n", $funcname, $filename, $linenum
else
printf "global@%s:%d\\n", $filename, $linenum
end
set $pnext = (long*)$pcurrent[2]
set $pcurrent = $pnext
end
else
printf "no stack"
end

Fix LVM Thin Can’t create snapshot, Failed to suspend vg01/pool0 with queued messages

Fix LVM Thin Snapshot Creation Errors

From time to time you might see errors like the following:

~]# lvcreate -s -n foo-snap data/foo
Can’t create snapshot bar-snap as origin bar is not suspended.
Failed to suspend vg01/pool0 with queued messages.

You will note that foo and bar have nothing to do with each other, but the error message prevents creating additional thin volumes. While the cause is unknown, the fix is easy. Something caused LVM to try to create an LVM that it was unable to complete, so it generates this in its metadata:

message1 {
create = "bar-snap"
}

The Fix

  1. deactivate the thinpool
  2. dump the VG metadata
  3. backup the file
  4. remove the message1 section
  5. restore the metadata.

The Procedure

  • vgcfgbackup -f /tmp/pool0-current vg01
  • cp /tmp/pool0-current /tmp/pool0-current-orig # backup the file before making changes
  • vim /tmp/pool0-current # remove the message1 section in vg01 -> logical_volumes -> pool0
  • vgcfgrestore -f /tmp/pool0-current vg01 –force

Hopefully this works for you, and hopefully whatever causes this gets fixed upstream.

LSI Megaraid Storage Manager Does Nothing

Installing Broadcom MSM for LSI Megaraid Cards

On a minimal CentOS install I found that MSM would refuse to load when I ran “/usr/local/MegaRAID\ Storage\ Manager/startupui.sh”.  It would just exit without an error.  If you cat the script you will notice java running into /dev/null, thus hiding useful errors—so remove the redirect!  At least then we can see the error.

Since this was a minimal install, I was missing some of the X libraries that MSM wanted.  This fixed it:

yum install libXrender libXtst

-Eric

 

Redirect Directory Trailing Slash (/) with Restricted Access

Securing Apache and Maintaining Usability

First, you should always avoid .htaccess and use it as a last resort. Still, this example holds whether or not you are using .htaccess.

Let’s say you have a directory you wish to secure so that only the index and some file (test.txt) is available. Other other content in the directory should be denied. For example:

These links should load:

  • www.example.com/foo
  • www.example.com/foo/
  • www.example.com/foo/test.txt

In addition, the link without the trailing / should redirect to the link with the trailing / (from /foo to /foo/) for ease of access for your users.

These links should give a 403:

  • www.example.com/foo/bar
  • www.example.com/foo/letmein.txt

To accomplish this, you might write a .htaccess as follows:

Apache 2.2

Order allow,deny
<Files ~ ^$|^index.html$|^test.txt$>
     Order deny,allow
</Files>

Apache 2.4

Require all denied
<Files ~ ^$|^index.html$|^test.txt$>
     Require all granted
</Files>

However, you will run into a problem: The link without a trailing / will not work (www.example.com/foo) because permissions are evaluated before the mod_dir module’s DirectorySlash functionality evaluates whether or not this is a directory. While not intuitive, we also must add the directory as a file name to be allowed as follows:

Apache 2.2

Order allow,deny
<Files ~ ^foo$|^$|^index.html$|^test.txt$>
     Order deny,allow
</Files>

Apache 2.4

Require all denied
<Files ~ ^foo$|^$|^index.html$|^test.txt$>
     Require all granted
</Files>

Hopefully this will help anyone else dealing with a similar issue because it took us a lot of troubleshooting to pin this down. Here are some search terms you might try to find this post:

  • Apache 403 does not add trailing /
  • Apache does not add trailing slash
  • .htaccess deny all breaks trailing directory slash
  • .htaccess Require all denied breaks trailing directory slash

-Eric

 

Check Authorize.net TLS 1.2 Support: tlsv1 alert protocol version

TLS v1.0 and v1.1 to be Disabled on February 28th, 2018

As you may be aware, Authorize.net is disabling TLS v1.0 and v1.1 at the end of this month.  More information about the disablement schedule is available here.

You may begin to see errors like the following if you have not already updated your system:

error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

We can help you solve this issue as well as provide security hardening or PCI compliance service for your server. Please call or email if we may be of service!

Checking for TLS v1.2 Support

Most modern Linux releases support TLS v1.2, however, it would be best to check to avoid a surprise. These tests should work on most any Linux version including SUSE, Red Hat, CentOS, Debian, Ubuntu, and many others.

PHP

To check your server, you can use this simple PHP script. Make sure you are running this PHP code from the same PHP executable that runs your website. For example, you might have PHP compiled from source and also have it installed as a package. In some cases, one will work and the other will not:

<?php
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://apitest.authorize.net');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

if (($response = curl_exec($ch)) === false) {
 $error = curl_error($ch);
 print "$error\n";
}
else {
 $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
 print "TLS OK: " . strlen($response) . " bytes received ($httpcode).\n";
}

curl_close($ch);
?>

Perl

As above, make sure that you are using the same Perl interpreter that your production site is using or you can end up with a false positive/false negative test. If you get output saying “403 – Forbidden: Access is denied” then it is working because TLS connected successfully.

# perl -MLWP::UserAgent -e 'print LWP::UserAgent->new->get("https://apitest.authorize.net")->decoded_content'
Can't connect to apitest.authorize.net:443

LWP::Protocol::https::Socket: SSL connect attempt failed with unknown errorerror:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version at /usr/lib/perl5/vendor_perl/5.10.0/LWP/Protocol/http.pm line 57.

OpenSSL/Generic

To check from the command line without PHP, you can use the following which shows a failed TLS negotiation:

# openssl s_client -connect apitest.authorize.net:443
CONNECTED(00000003)
30371:error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version:s23_clnt.c:605

Other Languages

If you use any language, we can help verify that your application is set up to work correctly.  Just let us know and we can work with you directly.  I hope this post helps, please comment below!

-Eric

Netboot CentOS 7 on ATAoE

This writeup covers the process of configuring CentOS 7 to boot seamlessly from an ATAoE target.

This is our setup:

  • TFTP server with a MAC-specific boot menu presenting the kernel and initrd. See http://www.syslinux.org/wiki/index.php/PXELINUX
  • Server exporting an LVM Logical Volume with a CentOS 7 install on it using vblade.
  • Computer to boot from the network (This computer has no hard drive and is using the exported LV as it’s disk).

Required components:

  • A kernel that supports AoE and your network card (We are using a custom build 3.17.4 kernel. If you are building your own kernel, the driver for AoE is in Device Drivers —> Block Devices and is called ATA over Ethernet Support. The driver for your network card is in Device Drivers —> Network device support —> Ethernet driver support. The driver for your NIC will be in there somewhere under the appropriate manufacturer).
  • A custom dracut module to bring up the network and discover AoE targets at boot time.
  • Packages:
    • vblade (This is only required on the server exporting the LV).
    • Dracut (dracut, dracut-network, dracut-tools)

Required Steps:

1. Install CentOS 7 onto an LV on the boot server. Since this is not in the scope of this tutorial, I will leave this part up to you. There is plenty of documentation out there (we installed it under KVM first, and exported the resulting disk as an AoE target).
2. Once you have it installed, you need to enter the environment that you deployed in step 1 to install a dracut AoE module and generate an initrd.
3. Next, make sure that you have a kernel that supports aoe and your network card. It is important that both of those kernel modules get built into your initrd or this will not work.
4. At this point (hopefully) we are in the CentOS 7 environment, whether chrooted, virtual machine, or something else; we can now install packages:

1. yum install dracut-network dracut-tools dracut
2. cd /usr/lib/dracut/modules.d/ && ls

You’ll notice that there are a bunch of folders in here, all starting with two digits. The digits signify the order in which the modules are loaded. To make sure all of the prerequisite modules are loaded, we went with 95aoe for our module directory.

3. mkdir 95aoe && cd 95aoe
4. There are 3 files that we need to create — module-setup.sh, parse-aoe.sh, and aoe-up.sh

1. Let’s start with module-setup.sh, since this is the script that will pull into the initrd all of the pieces we need for AoE to work.

#!/bin/bash
# -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
# ex: ts=8 sw=4 sts=4 et filetype=sh

check() {
        for i in mknod ip rm bash grep sed awk seq echo mkdir; do
                type -P $i >/dev/null || return 1
        done

        return 0
}

depends() {
        echo network
        return 0
}

installkernel() {
        instmods aoe
}

install() {
        inst_multiple mknod ip rm bash grep sed awk seq echo mkdir

        inst "$moddir/aoe-up.sh" "/sbin/aoe-up"
        inst_hook cmdline 98 "$moddir/parse-aoe.sh"
        dracut_need_initqueue
}

2. Next is parse-aoe.sh. This script will load the modules and queue up our main script to be run by init.

#!/bin/sh

modprobe aoe
udevadm settle --timeout=30
/sbin/initqueue --settled --unique /sbin/aoe-up

3. Finally, we need aoe-up.sh, which is what is actually run by init to bring up the network interfaces and discover the AoE device being exported. It’s worth noting that I am setting the MTU to 9000 on each interface, which won’t necessarily be supported on your system. if you are unsure, remove the line “ip link set dev ${INTERFACES[$i]} mtu 9000”:

#!/bin/bash

PATH=/usr/sbin:/usr/bin:/sbin:/bin

exec >>/run/initramfs/loginit.pipe 2>>/run/initramfs/loginit.pipe

mkdir -p /dev/etherd
rm -f /dev/etherd/discover
mknod /dev/etherd/discover c 152 3

INTERFACES=(`ip link |grep BROADCAST |awk -F ":" '{print $2}' |sed 's/ //g' |sed 's/\n/ /g'`)
TOTAL_INTERFACES=${#INTERFACES[@]}

for i in `seq 0 $(($TOTAL_INTERFACES-1))`; do
    ip link set dev ${INTERFACES[$i]} mtu 9000
    ip link set ${INTERFACES[$i]} up
    wait_for_if_up ${INTERFACES[$i]}
done

ip link show

echo > /dev/etherd/discover

5. Now that we have written our AoE dracut module, we need to rebuild the initrd. Before we do this however, we need to make sure that dracut will pull in the modules we need (aoe and your NIC module). There are a few different ways to do this, but here are two options:

1. modprobe aoe && modprobe NIC_MODULE && dracut –force /boot/initramfs-KERNEL_VER.img KERNEL_VER
2. dracut –force –add-drivers aoe –add-drivers NIC_MODULE /boot/initramfs-KERNEL_VER.img KERNEL_VER

6. Now that we have our new initrd, we need to transfer the kernel and initrd to our tftp server

1. scp /boot/initramfs-KERNEL_VER.img /boot/vmlinuz-KERNEL_VER TFTP_SERVER:/path/to/tftp
2. Make sure that permissions are set correctly (should be chmod 644)

7. Before we shut down our CentOS 7 VM, there is one last bit of configuration we need to adjust. Since we require the network to be active to do anything with our FS, we need to make sure that on shutdown, the network isn’t brought down before the FS is unmounted. In CentOS 6, this was as easy as running `chkconfig –level 0123456 on`. CentOS 7 uses systemd, and so this solution will not work. Fortunately the solution is simple (and probably works in CentOS 6 too): Modify /etc/fstab, adding the option _netdev to each mount point that is being exported with AoE (e.g. change defaults to defaults,_netdev).
8. Now we need to get vblade (http://sourceforge.net/projects/aoetools/files/vblade/) and put it on the system exporting the LV with CentOS 7 on it. If you don’t want to install it onto your system, you can just make it and run it from where you extracted it. To export the disk, just run:

1. vbladed -b 1024 -m MAC 0 1 INTERFACE /path/to/disk (for information on what each command does, check out the vblade man page, which is included with the package)

And that’s it! I recommend at this point to run dracut on your netbooted hardware to make sure everything still loads (at this point you shouldn’t have to specifically install the AoE module or your NIC module, so make sure that this is true). Don’t forget to copy the newly created initrd to the TFTP server, and I recommend not overwriting your working initrd so that you can easily go back to a known working state.

What took me days to get working should now only take you an hour or so! I tried to be as detailed as I could, and I don’t think I left anything out, but if you have any issues, please let me know and I’ll update this guide accordingly.