Lessons learned from transferring roughly 14TB of data onto four 4TB HDs.
- Avoid direct data transfers across mixed platforms using hardware solutions
- Use NTFS as the common file system across a mixed-OS environment
- Use Windows to format exFAT volumes
- Don’t let Windows touch a Mac or Linux formatted exFAT volume
- It takes roughly 3 hours to transfer 1TB over USB 3.0 from HFS+ to exFAT using a HD caddy
- Create a data transfer plan before you start the process
Use software solutions like shared volumes on NAS to transfer data across different platforms if you can; hardware solutions are problem prone. But such is life; we’ve gotta play the hand we’re dealt.
Mac OS X can read NTFS out of the box (writing needs a third-party driver), and many Linux distros support it read-write, typically through ntfs-3g. So NTFS is a reasonable choice for a common file system across operating systems. However, use Windows to format NTFS volumes if possible. NTFS drivers are more reliable than, say, exFAT drivers, but formatting on Windows still avoids complications.
Microsoft designed the exFAT file system. In my experience, corroborated by Google search results, exFAT volumes created by Windows can be read by Windows, Linux, and Mac OS X with appropriate drivers, whereas exFAT volumes created by another OS often cannot be read by Windows. This is yet another example (like CSV and Unicode) where Microsoft follows the Standard when no one else does. That is, Microsoft did the theoretically right thing but the de facto wrong thing.
Windows will corrupt Linux-formatted exFAT volumes. Initially the exFAT volume mounts fine on Linux. After unmounting it, the attempt to mount it on Windows fails, and from then on it can no longer be mounted on Linux either; the mount fails with an “invalid VBR checksum” error. The only HD that still mounts on Linux is also the only one untouched by Windows.
The solution (to be verified when we complete data transfer) is to exploit the first 24 sectors of the exFAT volume. Sectors 0 to 11 form the main boot region, with Sector 11 holding a checksum of Sectors 0 to 10, and Sectors 12 to 23 hold a backup copy of Sectors 0 to 11. Somehow Windows corrupted the first 12 sectors of each affected exFAT volume, but it’s rare that all 24 sectors are corrupted.
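For the curious, the checksum in Sector 11 can be recomputed by hand. The sketch below follows my understanding of the boot checksum in Microsoft's exFAT specification: a 32-bit rotate-right-and-add over Sectors 0 to 10, skipping bytes 106, 107 and 112 (the VolumeFlags and PercentInUse fields). It assumes 512-byte sectors and works on a dump file rather than the device itself; the function names are made up for illustration.

```shell
# Recompute the exFAT boot checksum over sectors 0-10 of a dump file.
# Assumption: 512-byte sectors, so the covered range is 11 * 512 = 5632 bytes.
exfat_boot_checksum() {
    dd if="$1" bs=512 count=11 2>/dev/null | od -An -v -tu1 |
    awk 'BEGIN { c = 0; idx = 0 }
    {
        for (i = 1; i <= NF; i++) {
            # Bytes 106, 107 (VolumeFlags) and 112 (PercentInUse) are excluded.
            if (idx != 106 && idx != 107 && idx != 112) {
                # Rotate the 32-bit value right by one bit, then add the byte.
                c = (c % 2) * 2147483648 + int(c / 2) + $i
                c = c % 4294967296
            }
            idx++
        }
    }
    # Print as two 16-bit halves to stay clear of awk integer-format limits.
    END { printf "%04x%04x\n", int(c / 65536), c % 65536 }'
}

# Read the checksum stored in the first 4 bytes of sector 11 (little-endian).
exfat_stored_checksum() {
    dd if="$1" bs=1 skip=5632 count=4 2>/dev/null | od -An -v -tu1 |
    awk '{ printf "%02x%02x%02x%02x\n", $4, $3, $2, $1 }'
}
```

If the two functions print different values for the same dump, the main boot region fails its own checksum, which would be consistent with the “invalid VBR checksum” error.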
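First, make a copy of the first 24 sectors of the volume. Here the volume starts at the very beginning of /dev/sdb and uses 512-byte sectors; if your drive has a partition table, point dd at the partition (for example /dev/sdb1) instead of the whole disk: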
$ sudo dd if=/dev/sdb of=exfat_sec_24 bs=512 count=24
Examining the copy in a hex editor shows that the first 12 sectors differ from the second 12 sectors. For unaffected exFAT volumes, the first 12 sectors are identical to the second 12 sectors.
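If you'd rather not eyeball a hex dump, the same comparison can be scripted. This is a minimal sketch, assuming 512-byte sectors and a 24-sector dump like the one made above; boot_regions_match is a hypothetical helper name, not a standard tool.

```shell
# boot_regions_match DUMP
# Succeeds (exit 0) if sectors 0-11 of DUMP are byte-identical to sectors 12-23.
# Assumption: 512-byte sectors, matching the dd command used to make the dump.
boot_regions_match() {
    dump=$1
    tmp=$(mktemp) || return 2
    # Extract the main boot region (sectors 0-11) into a temporary file.
    dd if="$dump" of="$tmp" bs=512 count=12 2>/dev/null
    # Stream the backup boot region (sectors 12-23) and compare byte for byte.
    dd if="$dump" bs=512 skip=12 count=12 2>/dev/null | cmp -s - "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (not run here):
#   boot_regions_match exfat_sec_24 && echo "regions match" || echo "regions differ"
```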
Overwrite the first 12 sectors with the second 12 sectors:
$ sudo dd of=/dev/sdb if=exfat_sec_24 bs=512 count=12 skip=12
After overwriting the first 12 sectors with data from the second 12 sectors, the corrupted exFAT volume is “repaired” and mounted automatically. Unmount and then mount the affected volume if necessary. No more errors.
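Since dd will happily overwrite the wrong thing, it may be worth rehearsing the repair on a disk image before touching a real drive. Below is a minimal sketch of the two dd steps wrapped in one function; restore_main_boot_region and its arguments are hypothetical names, and 512-byte sectors are assumed. The one real difference from the commands above is conv=notrunc, which matters when the target is a regular file: without it dd truncates the image after the 12 sectors it writes.

```shell
# restore_main_boot_region TARGET SAVED
# TARGET: a disk image or block device holding the exFAT volume.
# SAVED:  file that receives a copy of the original first 24 sectors.
# Assumption: 512-byte sectors.
restore_main_boot_region() {
    target=$1
    saved=$2
    # Save all 24 sectors first so the change can be undone.
    dd if="$target" of="$saved" bs=512 count=24 2>/dev/null || return 1
    # Copy sectors 12-23 of the saved dump over sectors 0-11 of the target.
    # conv=notrunc keeps dd from truncating TARGET when it is a regular file.
    dd if="$saved" of="$target" bs=512 skip=12 count=12 conv=notrunc 2>/dev/null
}

# Example on an image file (not run here):
#   restore_main_boot_region test.img test.img.first24
```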
The theoretical throughput of USB 3.0 is 5 Gb/s; SATA II is 3 Gb/s and SATA III is 6 Gb/s. 3 hours for 1 TB works out to roughly 93 MB/s, or about 0.74 Gb/s, which is between one-sixth and one-seventh of 5 Gb/s. You’d expect to get at least half of the ly’n cheat’n claims of the Standard. No such luck. Your mileage will vary with HD caddy design and quality assurance.
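The arithmetic is easy to redo for your own numbers. Here is a small sketch using awk; the throughput function name and its argument order are my own invention, and it simply divides bytes by seconds and compares the result with the nominal link speed.

```shell
# throughput BYTES SECONDS LINK_GBPS
# Prints the achieved rate in MB/s and Gb/s, and as a percentage of the link speed.
throughput() {
    awk -v bytes="$1" -v secs="$2" -v link="$3" 'BEGIN {
        mbps = bytes / secs / 1e6          # megabytes per second
        gbps = bytes * 8 / secs / 1e9      # gigabits per second
        printf "%.1f MB/s, %.2f Gb/s, %.1f%% of link\n", mbps, gbps, 100 * gbps / link
    }'
}

# 1 TB (10^12 bytes) in 3 hours (10800 s) over a nominal 5 Gb/s link.
throughput 1000000000000 10800 5
```

For the figures above this prints about 93 MB/s, roughly 15% of the nominal 5 Gb/s. Counting a TB as 2^40 bytes instead nudges it up to about 16%.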
As a rule of thumb, transfer larger files first and then fill up the remaining space with small files later; do not try to minimize the number of times you swap HDs in and out. Although we have four 4TB HDs and “only” 14 TB of data, I realized upon copying the second HD that it might be possible to run out of storage space if I were not careful allocating files to HDs. Two reasons for this. First, each HD has 4 ly’n cheat’n TBs (HD manufacturers define KB as 1000 bytes and not 1024 bytes) and after formatting it’s only 3.7 TB. Second, among the files transferred there are three 2TB files. Do the math: any two of them add up to 4 TB, more than the 3.7 TB available on one HD, so these three files must go on three separate HDs. Again, plan ahead before you start.
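The “largest first” rule is essentially the first-fit decreasing heuristic for bin packing, and a rough plan can be sketched in a few lines of shell. This is an illustration rather than the procedure I actually followed: plan_drives is a hypothetical helper, sizes and capacity just need to share a unit, and file names are assumed to contain no whitespace.

```shell
# plan_drives CAPACITY NDRIVES < list
# Each input line: "<size> <name>", with sizes in the same unit as CAPACITY.
# Sorts files largest first and puts each on the first drive with enough room.
# Exits non-zero if any file does not fit anywhere.
plan_drives() {
    sort -k1,1nr | awk -v cap="$1" -v n="$2" '
    BEGIN { failed = 0 }
    {
        placed = 0
        for (d = 1; d <= n; d++) {
            if (used[d] + $1 <= cap) {
                used[d] += $1
                print "drive " d ": " $2 " (" $1 ")"
                placed = 1
                break
            }
        }
        if (!placed) { print "no room: " $2 " (" $1 ")"; failed = 1 }
    }
    END { exit failed + 0 }'
}

# Example with made-up file names, sizes in GB, 3700 GB usable per drive:
#   printf '2000 a.img\n2000 b.img\n2000 c.img\n' | plan_drives 3700 4
```

With three 2000 GB files and 3700 GB per drive, each file lands on its own drive, which is exactly the constraint described above. First-fit decreasing is only a heuristic, so treat the output as a starting point for the plan, not a guarantee that everything fits.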