I recently ran into the problem of having a very large file and no ‘fast’ way of transferring it to another computer. Over wireless the transfer was estimated to take upwards of four hours (weak signal and so on), but I had a 4 GB USB key that I could use. The problem was that my file was over 4 gigabytes and my key held only 4 GB, in fact only 3.9 GB of usable space. What to do?! My initial thought was some trickery with dd: write the first part of the file, up to the 3.9 GB barrier, to one file, write the remainder to a second file, then piece the two back together. This is a feasible option. But then I came across the split command in Linux.
Split allows you to, you guessed it, split a file into pieces, and it automatically increments the suffix of each output file name so you can keep track of the pieces. This was a perfect solution for my needs. I was transferring the file from a Linux machine to a Windows machine, so I also had to figure out how to piece the file back together in Windows. In Linux it's just a matter of “catting” the files together, and in Windows you can do it with copy /b, where /b stands for binary.
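As an aside, the dd idea mentioned at the start might look something like the sketch below. It is only an illustration under a few assumptions: the file names (big.bin, part1, part2, joined.bin) are made up, and a tiny 5 KiB stand-in file is used so the commands run instantly. For a real 3.9 GB boundary you would pick a larger bs and count.

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the big file: 5 blocks of 1024 random bytes (hypothetical name)
head -c 5120 /dev/urandom > big.bin

# First piece: only the first 3 blocks of the input
dd if=big.bin of=part1 bs=1024 count=3 2>/dev/null

# Second piece: skip those 3 blocks and copy everything that remains
dd if=big.bin of=part2 bs=1024 skip=3 2>/dev/null

# Putting the pieces back together is plain concatenation
cat part1 part2 > joined.bin
cmp big.bin joined.bin && echo "dd pieces rejoin cleanly"
```

This works, but you have to work out count and skip yourself for every piece, which is exactly the bookkeeping split does for you.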
Here I will outline the steps for you to split a file into chunks and reassemble it at the destination.
Split A File Into Pieces
The split program is quite handy and has various flags to play around with. The important ones are:
- -a, --suffix-length=N – use suffixes of length N (default 2)
- -b, --bytes=SIZE – put SIZE bytes per output file
- -C, --line-bytes=SIZE – put at most SIZE bytes of lines per output file
- -d, --numeric-suffixes – use numeric suffixes instead of alphabetic
- -l, --lines=NUMBER – put NUMBER lines per output file
- --verbose – print a diagnostic just before each output file is opened
SIZE may have a multiplier suffix: b 512, kB 1000, K 1024, MB 1000*1000, M 1024*1024, GB 1000*1000*1000, G 1024*1024*1024, and so on for T, P, E, Z, Y
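To make the multipliers concrete, here is a small sketch of the arithmetic in shell. It assumes a shell with 64-bit integer arithmetic (Bash on a 64-bit system, for instance) and borrows the byte size of the ISO from the example that follows. The variable names are only for illustration.

```shell
# Binary (M) versus decimal (MB) multipliers, in bytes
echo "500M  = $((500 * 1024 * 1024))"
echo "500MB = $((500 * 1000 * 1000))"

# Size in bytes of the ISO used in the example that follows
size=3224731648
chunk=$((500 * 1024 * 1024))

# Number of pieces (ceiling division) and size of the final, shorter piece
pieces=$(( (size + chunk - 1) / chunk ))
last=$(( size % chunk ))
echo "pieces = $pieces, last piece = $last bytes"
```

So 500M means 524288000 bytes per piece while 500MB would mean 500000000, and a 3224731648-byte file cut into 500M pieces gives 7 files, the last one 79003648 bytes long.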
For example, I have a Debian 5 ISO that’s 3.2 GB, which I decided to split into 500 MiB pieces (that is 500M in split's notation, not 500MB).
$ ls
debian5.iso
$ split -b 500M -d debian5.iso
$ ls -l
total 6304488
-rw-r--r-- 1 erik erik 3224731648 2011-03-16 11:45 debian5.iso
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:52 x00
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:52 x01
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:53 x02
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:53 x03
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:54 x04
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:54 x05
-rw-r--r-- 1 erik erik 79003648 2011-03-16 11:54 x06
So now I can transfer the file a few chunks at a time.
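The default x00, x01, ... names are not very descriptive. split also accepts an output prefix as a last argument, and -a widens the suffix. The sketch below tries this on a small stand-in file. The name demo.bin and the prefix demo.bin.part. are made up for the example, and -d assumes GNU split, as in the command above.

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for a large file: 2500 zero bytes (hypothetical name)
head -c 2500 /dev/zero > demo.bin

# 1000-byte pieces, numeric suffixes three digits wide,
# written as demo.bin.part.000, demo.bin.part.001, ...
split -b 1000 -d -a 3 demo.bin demo.bin.part.

ls -l demo.bin.part.*
```

With a named prefix, the pieces sort together in a directory listing and the pattern to use when reassembling becomes demo.bin.part.* rather than x0*.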
Okay, now the files are on my second machine, but how do I put them all back together again? Poor Humpty Dumpty!
Reassemble Split File In Linux
Reassembling the file in Linux is quite easy:
$ cat x0* > debian5.iso
$ ls -l
total 6304488
-rw-r--r-- 1 erik erik 3224731648 2011-03-16 11:45 debian5.iso
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:52 x00
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:52 x01
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:53 x02
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:53 x03
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:54 x04
-rw-r--r-- 1 erik erik 524288000 2011-03-16 11:54 x05
-rw-r--r-- 1 erik erik 79003648 2011-03-16 11:54 x06
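cat writes the contents of each file, one after another, to its standard output, and the Bash redirection symbol > sends that output into a single new file. The shell expands x0* in sorted order, so the pieces are joined as x00, x01, and so on. Note that x0* only matches the first ten pieces; with more than ten you would need a wider pattern such as x[0-9]*.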
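After a transfer like this it is worth checking that the rebuilt file really is identical to the original. Here is a minimal sketch, assuming GNU coreutils (for sha256sum and split -d) and using a small random stand-in file. The names original.bin and rebuilt.bin are hypothetical.

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the original file: random bytes, so a bad join would show up
head -c 2500 /dev/urandom > original.bin

# Split into 1000-byte pieces (x00, x01, x02) and join them again
split -b 1000 -d original.bin
cat x0* > rebuilt.bin

# cmp prints nothing and exits 0 only if the files are byte-for-byte identical
if cmp original.bin rebuilt.bin; then
    echo "files match"
fi

# When the original is on another machine, compare checksums instead
sha256sum original.bin rebuilt.bin
```

When both files are on the same machine cmp is enough. Across machines, run sha256sum on each side and compare the two hashes by eye.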
Reassemble Split File In Windows
Great, but my other machine is running Windows, and I don't have cygwin on it, so I had to find an alternate method. It turns out you can use the copy command with the /b flag to copy a set of files into one. The /b tells copy that we are dealing with binary files rather than ASCII text.
Using the command prompt in Windows you can rebuild the file with:
E:\>copy /b x00 + x01 + x02 + x03 + x04 + x05 + x06 debian5.iso
x00
x01
x02
x03
x04
x05
x06
The computer chugged along, printing each piece's name on its own line, until finally my file was pieced back together. The plus signs tell copy which files to concatenate, in the order given, and the last argument is the output file.
There you have it. This works on any file, and could even be used to get around an e-mail attachment limit by sending yourself the pieces in several e-mails.