When an encrypted medium is prepared for use, it is best practice to fill the disk end-to-end with random bits. If the disk is not prepared with random bits, then an attacker could see which blocks have and have not been written, simply by running a block-by-block statistical analysis: if the average 1/0 ratio is near 50%, it's probably encrypted. It may be simpler than this for new disks, since they tend to ship filled with zeros.
This is a well-known problem, and many will encourage you to use /dev/urandom to fill the disk. Unfortunately, /dev/urandom is much slower than even rotational disks, let alone GB/sec RAID on SSDs:
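As a rough illustration of how little effort the simple case takes, here is a minimal sketch that counts the all-zero blocks in a disk image. The function name `count_zero_blocks` and the 4096-byte default block size are assumptions for this example, and it only covers the "fresh disk full of zeros" case, not the 1/0 ratio analysis.

```shell
# Count how many fixed-size blocks of an image file contain only zero bytes.
# usage: count_zero_blocks IMAGE [BLOCK_SIZE]
# (hypothetical helper; intended for regular image files, and slow on large ones)
count_zero_blocks() {
    img=$1
    bs=${2:-4096}
    size=$(wc -c < "$img")
    blocks=$(( (size + bs - 1) / bs ))
    zero=0
    i=0
    while [ "$i" -lt "$blocks" ]; do
        # Strip NUL bytes from the block; if nothing is left, it was all zeros.
        nonzero=$(dd if="$img" bs="$bs" skip="$i" count=1 2>/dev/null \
            | LC_ALL=C tr -d '\000' | wc -c)
        if [ "$nonzero" -eq 0 ]; then
            zero=$((zero + 1))
        fi
        i=$((i + 1))
    done
    echo "$zero"
}
```

On an image of a factory-fresh disk this would typically report nearly every block as untouched, while a disk filled with random bits should report none.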
root@geekdesk:~# dd if=/dev/urandom of=/dev/null bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 7.24238 s, 14.5 MB/s
So how can we fill a block device with random bits, quickly? The answer might be surprising: we use /dev/zero—but write to the encrypted device. Once the encrypted device is full, we erase the LUKS header with /dev/urandom. The second step is of course slower, but we need only overwrite the first 1MB so it takes a fraction of a second.
Note that the password we are using (below) needn’t be remembered—in fact, you shouldn’t be able to remember it. Use something long and random for a password, and keep it just long enough to erase the volume. I use base64 from /dev/urandom for password generation:
# 256 random bits
dd if=/dev/urandom bs=1 count=32 | base64
Now format the volume and map it with luksOpen. Note that we are not using a filesystem—this is all at block-layer:
root@geekdesk:~# cryptsetup luksFormat /dev/loop3
This will overwrite data on /dev/loop3 irrevocably.
Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: <random one-time-use password>
Verify passphrase:
root@geekdesk:~# cryptsetup luksOpen /dev/loop3 testdev
Enter passphrase for /dev/loop3: <same password as above>
root@geekdesk:~# dd if=/dev/zero of=/dev/mapper/testdev bs=1M
dd: writing `/dev/mapper/testdev': No space left on device
99+0 records in
98+0 records out
103804928 bytes (104 MB) copied, 1.21521 s, 85.4 MB/s
See, nearly 6x faster (85.4 MB/s versus 14.5 MB/s, and the disk is most likely the bottleneck at 85 MB/s)! This will save hours (or days) when preparing multi-terabyte volumes. Now remove the device mapping, and urandom the first 1MB of the underlying device:
# This line is the same as "cryptsetup luksClose testdev"
root@geekdesk:~# dmsetup remove /dev/mapper/testdev
root@geekdesk:~# dd if=/dev/urandom of=/dev/loop3 bs=512 count=2056
2056+0 records in
2056+0 records out
1052672 bytes (1.1 MB) copied, 0.0952705 s, 11.0 MB/s
Note that we overwrote the first 2056 blocks (512 bytes each) from /dev/urandom. 2056 is the default LUKS payload offset on this system; other cryptsetup versions may use a different value, so verify that you’ve overwritten the correct number of blocks using luksDump:
root@geekdesk:~# cryptsetup luksDump /dev/loop3
LUKS header information for /dev/loop3

Version: 1
Cipher name: aes
Cipher mode: cbc-essiv:sha256
Hash spec: sha1
Payload offset: 2056
[...snip...]
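If you would rather not read the offset by eye, a small filter can turn the luksDump output into the number of bytes to overwrite. This is only a sketch: the function name `luks1_header_bytes` is made up for this example, and it assumes the LUKS version 1 output format shown above, with the offset given in 512-byte sectors.

```shell
# Read "cryptsetup luksDump" output (LUKS1 format assumed) on stdin and
# print the size in bytes of the region before the payload.
# (hypothetical helper name; the offset is in 512-byte sectors)
luks1_header_bytes() {
    awk -F: '/Payload offset/ { print $2 * 512 }'
}
```

For example, `cryptsetup luksDump /dev/loop3 | luks1_header_bytes` should print 1052672 for the payload offset of 2056 shown above, which matches the byte count dd reported.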
Now your volume is prepared with random bits, and you may format it with any cryptographic block-device mechanism you prefer, safe in the knowledge that an attacker cannot tell which blocks are empty and which are in use (assuming the attacker has a single point-in-time copy of the block device).
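To put the two dd measurements above in perspective, here is a back-of-the-envelope sketch that converts a device size and a throughput into hours. The function name `fill_hours` is an assumption for this example, and MB means 10^6 bytes, as dd reports it.

```shell
# Estimate how many hours it takes to write SIZE_BYTES at RATE_MB_PER_S.
# usage: fill_hours SIZE_BYTES RATE_MB_PER_S   (MB = 10^6 bytes, as in dd output)
# (hypothetical helper; ignores any slowdown over the course of a long write)
fill_hours() {
    awk -v bytes="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", bytes / (rate * 1000000) / 3600 }'
}
```

For a 3 TB disk of 3000592982016 bytes (an example size, not part of the measurements above), `fill_hours 3000592982016 14.5` gives about 57.5 hours, while `fill_hours 3000592982016 85.4` gives about 9.8 hours.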
I like LUKS since its passphrase handling is based on PKCS#5 (PBKDF2) and it includes features such as multiple passphrase slots and passphrase changes (it never reveals the actual device key; your passphrase unlocks the real key), but other volume encryption mechanisms exist, or you might export the volume via iSCSI/ATAoE/FCoE and use a proprietary block-layer encryption mechanism.
If someone can explain an attack against this mechanism, I would be glad to hear about it. In this example we used AES in CBC mode so we are spreading the IV bits across the entire volume. Conceivably one could write an AES-CTR mode tool with a random key to do the same thing and this may be a stronger mechanism. (To my knowledge, the dm-crypt toolchain does not have a CTR mode, nor would you want one for general use).
The method above fails when an attacker can tell the difference between the original AES-CBC wipe with random bits (where all plaintext bits are set to zero) and the new encryption mechanism, with a different key, that will be used in production atop the prepared disk volume. While there may be an attack for AES-CBC with all zeros (though I don’t think there is), AES-CTR mode by its definition would make this method more effective since each block is independent of the next. One might be able to argue that AES-CBC creates an AES-CTR mode implementation where the counter is a permutation of AES itself. If this can be proven, then both methods are equally secure.
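The AES-CTR idea can be tried without dm-crypt at all. The following is a minimal sketch, assuming the openssl command-line tool is installed; the function name `ctr_stream` is made up for this example, and this is not something the procedure above relies on. It encrypts zeros with AES-256 in CTR mode under a throwaway random key that is never stored.

```shell
# Emit NBYTES of AES-256-CTR keystream under a random, discarded key.
# usage: ctr_stream NBYTES
# (hypothetical helper; assumes the openssl CLI is available)
ctr_stream() {
    nbytes=$1
    # 32 random bytes as 64 hex digits; the key only lives in this variable.
    key=$(openssl rand -hex 32)
    # A fixed all-zero IV is acceptable here only because the key is used once.
    iv=00000000000000000000000000000000
    # CTR is a stream mode, so the output is exactly as long as the input.
    head -c "$nbytes" /dev/zero | openssl enc -aes-256-ctr -K "$key" -iv "$iv"
}
```

Something like `ctr_stream 1048576 > sample.bin` (sample.bin being a placeholder name) produces 1 MiB of random-looking data. Pointing the output at a block device would overwrite it, so treat this strictly as an experiment.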
Either way, this is likely better (and definitely faster) than /dev/urandom for filling a disk, since /dev/urandom is a pseudo-random number generator. Using /dev/urandom for terabytes of data may begin to develop a pattern once its effective entropy pool is spread too thin. Even with seed-help from /dev/random, /dev/urandom might run out of steam.
In the end, random bits XORed with random bits still look like random bits when placed next to other random bits—but you’re welcome to debate this. Yay for crypto!
ok, now back to work 🙂
Edit: Mon Sep 17 19:43:22 PDT 2012
Come to think of it, you don’t even need a password at the luksFormat stage. LUKS generates its own strong random bits for the actual block-cipher key. The passphrase just unlocks that key. For the purposes of wiping the disk with random bits, you can use “<enter><enter>” as your passphrase… just make sure you wipe the LUKS header in step 2 from /dev/urandom.
8 thoughts on “Quickly fill a disk with random bits (without /dev/urandom)”
Two questions that could appear dumb 😀
Can this method work even with aes-xts?
After erasing the luks header with /dev/urandom, i can create a new luks container right?
> Can this method work even with aes-xts?
Sure, aes-xts is fine if you like that better. Random bits are random bits. Once the key is lost when you overwrite the LUKS header, the zeros on the volume are meaningless. There might be a known-plaintext attack knowing that the bits were written as all zeros, but I think it would be hard to discern which is which.
> After erasing the luks header with /dev/urandom, i can create a new luks container right?
Yep, that’s the idea.
I don’t see where you used base64 to “enter” the passphrase when formatting the new LUKS device. I see where you generated a random string of numbers, but how do you then enter it into the input during the luksFormat command?
Notice the text near cryptsetup luksFormat where it prompts for the password.
Also, read the update at the bottom of the post. LUKS generates its own key separate from the passphrase used to encrypt the volume. Since you will be overwriting the LUKS header which contains that key, the passphrase that you choose could very well be empty (assuming that your adversary cannot reverse the header bits that you overwrote in the final 2056-block dd command).
(yes, this seems to be 3-4 years later 🙂)
Hello, I had a similar use case as I was setting up my new hard drives which will be combined in a RAID1.
I understand the features and advantages of luks to some degree but I prefer plain block device encryption – at least in this case.
Inspired by your idea of not using /dev/urandom I did this:
2 HDDs with each (somewhat) 3TB capacity,
# cryptsetup create sda-encrypt /dev/sda
Enter passphrase: (you don’t need to remember this!)
# cryptsetup status sda-encrypt
/dev/mapper/sda-encrypt is active.
keysize: 256 bits
offset: 0 sectors
size: 5860533168 sectors
# pv -s 3000592982016 /dev/zero > /dev/mapper/sda-encrypt
68MiB 0:00:02 [ 189MiB/s] [> ] 0% ETA 2:47:54
# cryptsetup remove sda-encrypt
I didn’t use
# dd if=/dev/zero of=/dev/mapper/sda-encrypt status=progress
even though it would be sufficient, as I prefer more details on the transfer.
I use “pv” because of its semi-graphical interface with “percentage remaining” and an ETA.
As /dev/zero has no size I tell “pv” the size of the target drive with “-s ” so “pv” can adjust the progress bar and the ETA.
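The size passed to “pv” can be derived from the sector count that “cryptsetup status” prints. A minimal sketch, assuming 512-byte sectors as reported above; the function name `sectors_to_bytes` is made up for this example:

```shell
# Convert a count of 512-byte sectors (as printed by "cryptsetup status")
# into bytes, suitable for "pv -s".
# usage: sectors_to_bytes SECTORS
# (hypothetical helper; "blockdev --getsize64 DEVICE" reports the byte size directly)
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}
```

With the value above, `sectors_to_bytes 5860533168` prints 3000592982016, the same number given to “pv -s”.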
I hope I didn’t make a major mistake with “cryptsetup”, such as not encrypting at all or the like.
My point is not to use LUKS, as we want to discard its settings anyway. Maybe setting a stronger password would be more advisable – I hacked on my keyboard for up to a minute to generate a pseudo-random password. I also have no need to overwrite any LUKS header afterwards and check for success.
I’m curious what you think about my method. I am not that experienced with cryptsetup and encryption on Linux etc., so I might also have made major mistakes compromising the encryption.
(BTW, with “dd” I added “obs=4096”:
# dd if=/dev/zero of=/dev/sda
# dd if=/dev/zero of=/dev/sda obs=4096
The latter command wrote almost 4 times quicker than the first command (on the unencrypted disk).
Reason(?): setting obs (the output block size) to 4096 matches the sector size of most(?) modern hard drives.)
Thank you for writing! Yes, I agree that creating a raw volume without the LUKS header is probably easier, and possibly even more secure since it doesn’t store the master key of any form on the disk. Using “pv” is neat, too. It’s always nice to see progress!
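For reference, the plain-mode variant could be collected into a few commands. The sketch below only prints them and does not run anything. The function name `wipe_plan`, the mapping name `wipe_tmp`, and the choice of aes-xts-plain64 with a 256-bit key read from /dev/urandom are assumptions for this example; it uses the newer “cryptsetup open/close” syntax rather than the “create/remove” commands shown in the comment above, so check it against your cryptsetup version before using any of it.

```shell
# Print (do not execute) the commands for a plain-mode random fill of DEVICE.
# usage: wipe_plan DEVICE [MAPPING_NAME]
# (hypothetical helper; the cipher and key size are assumptions, not from the post)
wipe_plan() {
    dev=$1
    name=${2:-wipe_tmp}
    cat <<EOF
cryptsetup open --type plain --cipher aes-xts-plain64 --key-size 256 --key-file /dev/urandom $dev $name
dd if=/dev/zero of=/dev/mapper/$name bs=1M status=progress
cryptsetup close $name
EOF
}
```

Reading the key straight from /dev/urandom means there is no passphrase to invent or forget, and nothing key-related is ever written to the disk.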
Would this approach also be good to erase a drive?
I find many (paranoid?) tutorials which overwrite data many times which is very time consuming.
I very much doubt that even with sophisticated methods it is really possible to recover larger pieces of data from a randomly overwritten disk.
Even if someone can read single bits of data that does not mean that whole data files can be recovered successfully.
I wonder if anyone has ever done a recovery test with a simple one-pass zeroed drive at one of the professional data recovery companies.
You raise an interesting question. Data is stored on magnetic media as small magnetic fields that are flipped north or south to indicate 1 or 0. The head reads the field and, based on the field strength being above or below a threshold, decides whether the bit is 1 or 0. If you simply wipe a disk with all zeros, then all of the bits will be below the 0 threshold, but it is likely that the field strength of the bits that were previously a 1 is higher than those that were 0. Thus, a sensitive reader could still pick up the preimage. Multi-pass wipes usually write the bit patterns 1010, 0101, 1111, and 0000 in turn. This helps flatten the preimage by yanking the fields back and forth.
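For the curious, those pass patterns correspond to the repeated bytes 0xAA (10101010), 0x55 (01010101), 0xFF, and 0x00. Here is a minimal sketch of generating such a stream; the function name `pattern_stream` is made up for this example, and the byte is given in octal because that is what tr understands.

```shell
# Emit NBYTES copies of a single byte value, given in octal.
# usage: pattern_stream OCTAL_BYTE NBYTES
# 252 = 0xAA (1010...), 125 = 0x55 (0101...), 377 = 0xFF, 000 = 0x00
# (hypothetical helper, for illustration only)
pattern_stream() {
    head -c "$2" /dev/zero | LC_ALL=C tr '\000' "\\$1"
}
```

For example, `pattern_stream 252 4 | od -An -tx1` shows four aa bytes.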
So to try to answer your question: Does a single pass of random noise on the disk prevent the preimage from being detected? If this can be equated to a 1-time pad, where the field is effectively being XOR’d with a pad of field strength, then maybe! It would be interesting to see if others could come up with a reason that this would be insufficient. It would certainly be more difficult to recover than wiping it with all zeros.