Beware that these notes were primarily written for personal use, to keep track of my tests with various imaging tools. They are also pretty unorganised and need cleaning up at some point. All tools mentioned were tested under Linux (Mint); many of them can also be used under Windows via Cygwin.
mount|grep ^'/dev'
Result:
/dev/sda1 on / type ext4 (rw,errors=remount-ro)
/dev/sr0 on /media/johan/REBELS_0 type iso9660
(ro,nosuid,nodev,uid=1000,gid=1000,iocharset=utf8,mode=0400,dmode=0500,uhelper=udisks2)
So the CD drive is /dev/sr0.
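A minimal Python sketch of how the device lookup above could be automated. It assumes the output of mount has the "device on mount-point type fstype (options)" shape shown above; the function name optical_devices and the choice of iso9660 and udf as "optical" file systems are illustrative assumptions.

```python
import re

# One line of `mount` output: "<device> on <mount point> type <fstype> (<options>)"
_MOUNT_LINE = re.compile(r"^(/dev/\S+) on (.+) type (\S+)")

# File system types assumed to indicate an optical disc (assumption, not exhaustive)
_OPTICAL_FS_TYPES = ("iso9660", "udf")


def optical_devices(mount_output):
    """Return the device paths of mounted file systems that look like optical discs."""
    devices = []
    for line in mount_output.splitlines():
        match = _MOUNT_LINE.match(line)
        if match and match.group(3) in _OPTICAL_FS_TYPES:
            devices.append(match.group(1))
    return devices
```

In practice the text would come from something like subprocess.run(["mount"], capture_output=True, text=True).stdout. Note that this only finds discs that are currently mounted.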
In my case /dev/scd0 worked fine for the internal drive, and /dev/scd1 for an external USB drive.
Use lsblk. E.g. to get only the label name:
lsblk /dev/sr0 -n -o LABEL
Result:
SRMS41A
Get size in bytes:
lsblk /dev/sr0 -n -o SIZE -b
Result:
660850688
This is the simplest tool, mainly suitable for CDs/DVDs that are not damaged. Command-line:
dd if=/dev/sr0 of=mydisk.iso
Output:
1236416+0 records in
1236416+0 records out
633044992 bytes (633 MB) copied, 255,874 s, 2,5 MB/s
Recommended here and here as it does some error checking (unlike dd).
More info:
http://linux.die.net/man/1/readom
Command line:
readom dev=/dev/sr0 f=mydisk.iso
Result:
Error trying to open /dev/sr0 exclusively (Device or resource busy)... retrying in 1 second.
Unmount first:
umount /dev/sr0
Then repeat. Works now! Output:
Read speed: 4234 kB/s (CD 24x, DVD 3x).
Write speed: 0 kB/s (CD 0x, DVD 0x).
Capacity: 309104 Blocks = 618208 kBytes = 603 MBytes = 633 prMB
Sectorsize: 2048 Bytes
Copy from SCSI (10,0,0) disk to file 'gordi_virussen_readom.iso'
end: 309104
addr: 309104 cnt: 44
Time total: 259.287sec
Read 618208.00 kB at 2384.3 kB/sec.
Option retries may be needed to avoid excessive processing times in case of damaged blocks - default is 128.
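The dd and readom figures above can be cross-checked with a little arithmetic. This is a small sketch, assuming dd was run with its default 512-byte block size (no bs= argument was given) and using the 2048-byte sector size that readom reports; the function names are made up for illustration.

```python
DD_DEFAULT_BLOCK_SIZE = 512   # bytes per record when dd is run without bs=
CD_DATA_SECTOR_SIZE = 2048    # "Sectorsize" as reported by readom


def dd_bytes(records_out, block_size=DD_DEFAULT_BLOCK_SIZE):
    """Total bytes written by dd, from its 'records out' count."""
    return records_out * block_size


def readom_bytes(blocks, sector_size=CD_DATA_SECTOR_SIZE):
    """Total bytes corresponding to the capacity in blocks reported by readom."""
    return blocks * sector_size


print(dd_bytes(1236416))     # 633044992, the byte count dd printed
print(readom_bytes(309104))  # 633044992, i.e. 618208 kBytes
```

Both tools report the same total here, which is a cheap sanity check before comparing checksums.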
Following Blood Report (page 11):
ddrescue -b 2048 -r4 -v /dev/sr0 REBELS_0.iso REBELS_0.log
Arguments:
- -b 2048: block size 2048 bytes
- -r4: retry bad sectors up to four times
- -v: verbose output mode
From the online help:
Exit status 0 for a normal exit, 1 for environmental problems (file not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid input file, 3 for an internal consistency error (eg, bug) which caused ddrescue to panic.
In a production environment, the command line should be added to the metadata (PREMIS event).
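A rough Python sketch of how the command line and exit status could be captured together for later inclusion in metadata. The exit status meanings are copied from the help text quoted above; the function names and the dictionary keys are hypothetical and do not follow any actual PREMIS schema.

```python
import shlex
import subprocess

# Meanings taken from the ddrescue help text quoted above
EXIT_MEANINGS = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid flags, I/O error, ...)",
    2: "corrupt or invalid input file",
    3: "internal consistency error (bug)",
}


def describe_exit_status(code):
    """Translate a ddrescue exit status into a short description."""
    return EXIT_MEANINGS.get(code, "unknown exit status")


def run_and_record(cmd):
    """Run cmd (a list of arguments) and return a small record of the event.

    The keys used here are illustrative only, not an actual PREMIS structure.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "commandLine": shlex.join(cmd),
        "exitStatus": completed.returncode,
        "outcome": describe_exit_status(completed.returncode),
        "stderr": completed.stderr,
    }
```

It could be called as run_and_record(["ddrescue", "-b", "2048", "-r4", "-v", "/dev/sr0", "REBELS_0.iso", "REBELS_0.log"]); since -v produces a lot of output, capturing it rather than showing it live is a design choice that may not suit interactive use.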
# Rescue Logfile. Created by GNU ddrescue version 1.17
# Command line: ddrescue -b 2048 -r4 -v /dev/sr0 REBELS_0.iso REBELS_0.log
# current_pos current_status
0x28AF0000 +
# pos size status
0x00000000 0x28AFA800 +
The log file structure is explained here. So in this case everything went fine. In a production environment, the contents of the log file should be added to the metadata.
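A minimal Python sketch for checking a log file like the one above automatically. It assumes the layout shown: comment lines starting with #, one status line (current position and status), then one line per block with position, size and status, where + marks a rescued block. The function names are illustrative.

```python
def parse_logfile(text):
    """Parse a ddrescue log file into (current_pos, current_status, blocks).

    blocks is a list of (position, size, status) tuples; positions and
    sizes are converted from hexadecimal to integers.
    """
    rows = [
        line.split()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    if not rows:
        raise ValueError("no status line found in log file")
    current_pos = int(rows[0][0], 16)
    current_status = rows[0][1]
    blocks = [(int(row[0], 16), int(row[1], 16), row[2]) for row in rows[1:]]
    return current_pos, current_status, blocks


def fully_rescued(blocks):
    """True if there is at least one block and every block has status '+'."""
    return bool(blocks) and all(status == "+" for _, _, status in blocks)
```

For the log above this gives a single block of 682600448 bytes (0x28AFA800) with status +, so fully_rescued returns True.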
Alternatively use the dcfldd utility:
dcfldd bs=2048 if=/dev/sr0 of=Handbook_dcfldd.iso errlog=Handbook_dcfldd.log hashlog=Handbook_dcfldd.md5
In my tests, activating the errlog option gave a segmentation fault at the end of the imaging process (the tool does write an intact ISO file, although both the log file and the hash log are empty). Looks like a bug. (Using version 1.3.4-1, which is from 2006; apparently no more recent versions exist?!)
Also, from the Precautions section of forensicswiki.org (link here):
- dcfldd is based on an extremely old version of dd: it's known that dcfldd will misalign the data in the image after a faulty sector is encountered on the source drive (see the NIST report), and this kind of bug (wrong offset calculation when seeking over a bad block) was fixed for dd in 2003 (see the fix in the mailing list);
- similarly, dcfldd can enter an infinite loop when a faulty sector is encountered on the source drive, thus writing to the image over and over again until there is no free space left.
So this may not be such a great option after all ...
All of the above tools are usable, but readom is probably the best option here, followed by ddrescue. The potential advantage of readom is that it uses a library that was specifically written for dealing with optical media, so it might be a bit "smarter" in some respects than ddrescue, which is more generic. On the other hand, ddrescue may often be a better choice for CD-ROMs that are damaged (in which case readom gives up easily).
Run checksums on both CD and image, e.g.:
md5sum REBELS_0.iso
md5sum /dev/sr0
Note that readom already does this (dd doesn't). Actually I don't think readom does this either! Needs reference.
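The same comparison could be scripted. Below is a small Python sketch using hashlib; it reads in chunks so that large images do not have to fit in memory. The function names are illustrative, and reading /dev/sr0 this way assumes the user has read permission on the device.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file (or device) at path."""
    digest = hashlib.md5()
    with open(path, "rb") as stream:
        # Read fixed-size chunks until end of file
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(image_path, source_path):
    """True if both paths yield the same MD5 digest."""
    return md5_of(image_path) == md5_of(source_path)
```

Usage would be along the lines of same_content("REBELS_0.iso", "/dev/sr0"). A mismatch does not say which side is wrong, only that the two byte streams differ.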
Part of isoinfo. Command line:
isovfy REBELS_0.iso
Result:
Root at extent 13, 2048 bytes
[0 0]
No errors found
So everything OK. Here's an example where the verification fails on an incomplete ISO file:
isovfy gordi_virussen_failed.iso
Result:
Root at extent 16, 2048 bytes
[0 0]
isovfy: Short read on old image
In a production environment, both the isovfy command line and its output should be added to the metadata (PREMIS event).
The isovfy documentation isn't very clear about which specific checks it performs. In some of my tests I encountered broken (incomplete) ISO images that were not detected by isovfy. More info here:
http://qanda.digipres.org/1076/incomplete-image-after-imaging-rom-prevent-and-detect-this
This prompted me to write a simple verification script that calculates the expected file size of an ISO from its Primary Volume Descriptor, and compares this against the actual size of the file. It is available here:
https://github.com/KBNLresearch/verifyISOSize
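To illustrate the idea (this is not the code of the script linked above), here is a minimal Python sketch. It assumes the Primary Volume Descriptor is the first descriptor at sector 16, which is the usual case, and reads the little-endian copies of the volume space size (byte offset 80) and logical block size (byte offset 128) defined by ISO 9660. The function names are made up.

```python
import os
import struct

SECTOR_SIZE = 2048
PVD_OFFSET = 16 * SECTOR_SIZE  # volume descriptors start at sector 16


def expected_iso_size(stream):
    """Return the image size in bytes implied by the Primary Volume Descriptor.

    stream is a binary file object positioned anywhere; it must be seekable.
    """
    stream.seek(PVD_OFFSET)
    pvd = stream.read(SECTOR_SIZE)
    # Descriptor type 1 plus the "CD001" identifier marks a Primary Volume Descriptor
    if len(pvd) < SECTOR_SIZE or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no Primary Volume Descriptor found at sector 16")
    (volume_space_size,) = struct.unpack_from("<I", pvd, 80)
    (logical_block_size,) = struct.unpack_from("<H", pvd, 128)
    return volume_space_size * logical_block_size


def looks_truncated(path):
    """True if the file at path is smaller than its descriptor says it should be."""
    with open(path, "rb") as stream:
        return os.path.getsize(path) < expected_iso_size(stream)
```

Only a file that is smaller than expected is flagged here; whether a file that is larger than expected should also be reported is a design choice left open in this sketch.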
Isoinfo has quite a number of options; here are the ones that look most useful.
isoinfo -d -i REBELS_0.iso
Result:
CD-ROM is in ISO 9660 format
System id:
Volume id: REBELS_0
Volume set id:
Publisher id:
Data preparer id:
Application id: NERO - BURNING ROM
Copyright File id:
Abstract File id:
Bibliographic File id:
Volume set size is: 1
Volume set sequence number is: 1
Logical block size is: 2048
Volume size is: 333151
Joliet with UCS level 3 found
NO Rock Ridge present
The same information can also be extracted directly from the CD-ROM (i.e. prior to the imaging process) using:
isoinfo -d -i /dev/sr0
This could be useful for automating some of the ddrescue input, such as block size and volume id (which can be used as a base name for the iso file).
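A possible way to do this, sketched in Python: parse the "key: value" lines of the isoinfo -d output and build the ddrescue argument list used earlier in these notes. The function names and the fallback values ("unnamed", 2048) are assumptions for illustration.

```python
def parse_isoinfo_d(text):
    """Turn the 'key: value' lines of `isoinfo -d` output into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:  # lines without a colon carry no key/value pair
            fields[key.strip()] = value.strip()
    return fields


def ddrescue_command(fields, device="/dev/sr0"):
    """Build a ddrescue argument list from parsed isoinfo fields."""
    # Fallbacks are assumptions for discs without a volume id or block size
    volume_id = fields.get("Volume id") or "unnamed"
    block_size = fields.get("Logical block size is") or "2048"
    return [
        "ddrescue", "-b", block_size, "-r4", "-v",
        device, volume_id + ".iso", volume_id + ".log",
    ]
```

For the REBELS_0 output above this reproduces the ddrescue command line used earlier. Volume ids can contain characters that are awkward in file names, so a real workflow would probably need to sanitise them first.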
blkid -o value -s LABEL /dev/sr0
Result:
SRMS41A
(Note that this will also output the device name, but this is sent to stderr).
isoinfo -f -i REBELS_0.iso
Result:
/AUTORUN.EXE;1
/AUTORUN.INF;1
/DISK0
/LICENSE2.TXT;1
/LICENSEF.TXT;1
/LICENSEU.TXT;1
/SETUP.EXE;1
/DISK0/CONTROLS.CFG;1
/DISK0/DISK0;1
::
::
etc
It looks like all items that are followed by ;1 are files, and those that aren't are directories. Also, the -l option gives a detailed dir listing that includes additional file attributes (size, date, etc.).
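If that observation holds, the listing can be split into files and directories with a few lines of Python. This sketch relies entirely on the assumption that files carry a ";version" suffix and directories do not; the function name is illustrative.

```python
import re

# A trailing ";<number>" is taken to be the ISO 9660 file version suffix
_VERSION_SUFFIX = re.compile(r";\d+$")


def split_listing(listing):
    """Split `isoinfo -f` output into (files, directories)."""
    files, directories = [], []
    for entry in listing.splitlines():
        entry = entry.strip()
        if not entry:
            continue
        if _VERSION_SUFFIX.search(entry):
            # Drop the version suffix so the path looks like a normal file name
            files.append(_VERSION_SUFFIX.sub("", entry))
        else:
            directories.append(entry)
    return files, directories
```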
How to verify integrity of image file (comparison against source medium)?
Command line:
readom dev=/dev/sr0 f=hooked.bin -clone
The -clone option "retrieve(s) the full TOC and all data".
Output:
Read speed: 4234 kB/s (CD 24x, DVD 3x).
Write speed: 0 kB/s (CD 0x, DVD 0x).
TOC len: 180. First Session: 1 Last Session: 1.
01 10 00 A0 00 00 00 00 01 00 00
01 10 00 A1 00 00 00 00 0D 00 00
01 10 00 A2 00 00 00 00 47 13 21
01 10 00 01 00 00 00 00 00 02 00
01 10 00 02 00 00 00 00 05 08 0A
01 10 00 03 00 00 00 00 0B 32 2D
01 10 00 04 00 00 00 00 11 35 0A
01 10 00 05 00 00 00 00 17 33 2D
01 10 00 06 00 00 00 00 1D 15 05
01 10 00 07 00 00 00 00 22 21 0F
01 10 00 08 00 00 00 00 26 2A 3F
01 10 00 09 00 00 00 00 2B 08 28
01 10 00 0A 00 00 00 00 30 06 46
01 10 00 0B 00 00 00 00 35 38 0A
01 10 00 0C 00 00 00 00 3A 05 21
01 10 00 0D 00 00 00 00 40 2F 46
Lead out 1: 320808
Capacity: 320808 Blocks = 641616 kBytes = 626 MBytes = 657 prMB
Sectorsize: 2048 Bytes
Errno: 5 (Input/output error), mode select g1 scsi sendcmd: no error
CDB: 55 10 00 00 00 00 00 00 14 00
status: 0x2 (CHECK CONDITION)
Sense Bytes: 70 00 05 00 00 00 00 0A 00 00 00 00 26 00 00 00
Sense Key: 0x5 Illegal Request, Segment 0
Sense Code: 0x26 Qual 0x00 (invalid field in parameter list) Fru 0x0
Sense flags: Blk 0 (not valid)
cmd finished after 0.002s timeout 40s
Copy from SCSI (10,0,0) disk to file 'hooked.bin'
end: 320808
addr: 320808 cnt: 80
Time total: 264.775sec
Read 766931.62 kB at 2896.5 kB/sec.
Result: 1 toc file + 1 bin file. Optionally convert toc to cue format:
cueconvert hooked.bin.toc hooked.bin.cue
Doesn't work, apparently no standard toc file?
Bin file seems to work:
cdrdao read-cd --read-raw --datafile histopathologie_cd_i.bin --device /dev/sr0 --driver generic-mmc-raw histopathologie_cd_i.toc
Output:
Reading toc and track data...
Track Mode Flags Start Length
------------------------------------------------------------
1 DATA 4 00:00:00( 0) 48:39:00(218925)
Leadout DATA 4 48:39:00(218925)
PQ sub-channel reading (data track) is supported, data format is BCD.
Raw P-W sub-channel reading (data track) is supported.
Cooked R-W sub-channel reading (data track) is supported.
Copying data track 1 (MODE2_RAW): start 00:00:00, length 48:39:00 to "histopathologie_cd_i.bin"...
Reading of toc and track data finished successfully
The resulting image is slightly larger than the size of the CD according to the lsblk command. But how to use it? Readom and ddrescue both fail to image these discs at all.
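The track length in the cdrdao output is given as minutes:seconds:frames, with 75 frames (sectors) per second. The Python sketch below converts such a value to a sector count and to an expected file size, assuming that a raw read stores 2352 bytes per sector instead of the 2048 bytes of user data in a plain data sector. That difference may account for at least part of the size discrepancy, but this has not been verified against the actual file.

```python
FRAMES_PER_SECOND = 75     # one frame corresponds to one CD sector
RAW_SECTOR_BYTES = 2352    # full sector as stored by a raw read (assumption)
USER_DATA_BYTES = 2048     # user data in a plain data sector


def msf_to_sectors(msf):
    """Convert a 'MM:SS:FF' time code to a number of sectors."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def image_bytes(msf, bytes_per_sector=RAW_SECTOR_BYTES):
    """Expected size in bytes of a track of the given length."""
    return msf_to_sectors(msf) * bytes_per_sector


print(msf_to_sectors("48:39:00"))                # 218925, as shown by cdrdao
print(image_bytes("48:39:00"))                   # 514911600 bytes at 2352 bytes/sector
print(image_bytes("48:39:00", USER_DATA_BYTES))  # 448358400 bytes at 2048 bytes/sector
```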
These are CDs with one data track and multiple audio tracks.
Tried to use cdrdao to create bin / cue files, following these instructions. Before doing anything with cdrdao we first need to unmount the disc (see footnote 1):
umount /dev/sr0
Then run cdrdao with the following command line:
cdrdao read-cd --read-raw --datafile no.bin --device /dev/sr0 --driver generic-mmc-raw no.toc
Result: image successfully created; output is a .toc and a .bin file. The .toc file looks like this:
CD_DA
// Track 1
TRACK AUDIO
NO COPY
NO PRE_EMPHASIS
TWO_CHANNEL_AUDIO
ISRC "USIR70200001"
FILE "no.bin" 0 02:10:53
// Track 2
TRACK AUDIO
NO COPY
NO PRE_EMPHASIS
TWO_CHANNEL_AUDIO
ISRC "USIR70200002"
FILE "no.bin" 02:10:53 02:17:34
::
etc
But closer inspection shows that only the audio tracks were copied, not the data track! E.g. compare the above output with the example below from the cdrdao documentation:
CD_ROM
TRACK MODE1
DATAFILE "data_1"
ZERO 00:02:00 // post-gap
TRACK AUDIO
SILENCE 00:02:00 // pre-gap
START
FILE "data_2.wav" 0
TRACK AUDIO
FILE "data_3.wav" 0
In particular I would expect the .toc file to start with CD_ROM, and I would also expect one TRACK MODE1 item. So why is this?
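A check like this could be automated. The Python sketch below pulls the session type and the track modes out of a .toc file and reports whether any non-audio track is present. The list of session type keywords is an assumption based on the examples above (CD_DA, CD_ROM) plus CD_ROM_XA, and the function names are illustrative.

```python
# Session type keywords assumed to appear in a TOC header (possibly incomplete)
_SESSION_TYPES = ("CD_DA", "CD_ROM", "CD_ROM_XA")


def toc_summary(toc_text):
    """Return (session_type, track_modes) for the text of a cdrdao .toc file."""
    session_type = None
    track_modes = []
    for line in toc_text.splitlines():
        line = line.split("//", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        words = line.split()
        if session_type is None and words[0] in _SESSION_TYPES:
            session_type = words[0]
        elif words[0] == "TRACK" and len(words) > 1:
            track_modes.append(words[1])
    return session_type, track_modes


def has_data_track(track_modes):
    """True if at least one track is not an audio track."""
    return any(mode != "AUDIO" for mode in track_modes)
```

For the no.toc excerpt above this yields CD_DA with only AUDIO tracks, so has_data_track returns False; for the documentation example it yields CD_ROM with a MODE1 track.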
Also, cdrdao's output included this message (all output sent to stderr, not stdout):
Found 61 Q sub-channels with CRC errors.
Reading of toc and track data finished successfully.
Assuming that this refers to cyclic redundancy check errors (which are similar to checksum errors), this implies some data errors. I'm not quite sure how serious this is. In any case, for (non-mixed) audio CDs cdparanoia looks like a better choice.
The cdrdao man page mentions that it can read CUE files, whereas it actually writes TOC files. Most emulation-related refs only mention CUE, which appears to be more common. Conversion between the two can be done with the cuetools package. To install it:
sudo apt-get install cuetools
Then convert with the cueconvert tool. To convert a TOC file to CUE:
cueconvert no.toc no.cue
The resulting CUE file looks like this:
FILE "no.bin" WAVE
TRACK 01 AUDIO
ISRC USIR70200001
INDEX 01 00:00:00
TRACK 02 AUDIO
ISRC USIR70200002
INDEX 01 02:10:53
::
etc
This is also useful (again, it only works after unmounting the disc first):
cdrdao disk-info --device /dev/sr0
Result:
CD-RW : no
Total Capacity : n/a
CD-R medium : n/a
Recording Speed : n/a
CD-R empty : no
Toc Type : CD-DA or CD-ROM
Sessions : 2
Last Track : 18
Appendable : no
Result in case of CD-I:
CD-RW : no
Total Capacity : n/a
CD-R medium : n/a
Recording Speed : n/a
CD-R empty : no
Toc Type : CD-I
Sessions : 1
Last Track : 1
Appendable : no
Question: what happens if we do this with a DVD?
Finally I tried to image the same CD using ddrescue, just to see what happens. I had to interrupt ddrescue manually because it stopped making any progress after some time. Interestingly, isovfy didn't find any errors in the resulting image, whereas isoinfo -d resulted in:
CD-ROM is NOT in ISO 9660 format
Which makes me wonder how thorough the check by isovfy really is!
- How to make imaging of mixed content CDs work correctly with cdrdao?
- How to render a BIN/CUE image in e.g. an emulated environment?
Isoinfo on audio CD:
isoinfo -d -i /dev/sr0
Result:
isoinfo: Input/output error. Read error on old image
Cdrdao disk-info:
cdrdao disk-info --device /dev/sr0
Result:
CD-RW : no
Total Capacity : n/a
CD-R medium : n/a
Recording Speed : n/a
CD-R empty : no
Toc Type : CD-DA or CD-ROM
Sessions : 1
Last Track : 13
Appendable : no
Install cdparanoia:
sudo apt-get install cdparanoia
Rip CD in batch mode, each track to a separate WAV file:
cdparanoia -B -L
or:
cdparanoia -B -l
The -L switch results in the generation of a detailed log file; -l produces a summary log (named cdparanoia.log). Strangely, a parse error occurs when I specify user-defined file names here. Also, it seems the summary file gives more detailed info than the detailed one? Needs a more in-depth look!
- How are ripped files meant to be rendered (play CD as a whole vs. separate tracks)? This also needs metadata on track order, perhaps media player playlists.
From StackOverflow - Determine optical media type (Audio CD, DVD, blu-ray) by using UDEV and scripts.
First install package cdtool:
sudo apt-get install cdtool
Then:
cdir -d /dev/sr0
unknown cd - 34:32 in 1 tracks
34:30.35 1 [DATA]
unknown cd - 71:19 in 13 tracks
5:06.10 1
6:42.35 2
::
etc
unknown cd - 37:52 in 18 tracks
2:10.53 1
2:17.56 2
::
::
5:12.36 17
1:29.63 18 [DATA]
cdir: no_disc
So the output of the tool makes it possible to differentiate between these 4 disc types. This suggests that it would be feasible to build a workflow that automatically picks the best imaging sub-workflow, based on whether the inserted disc is a CD-ROM, audio CD, mixed content CD or DVD.
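As a starting point for such a workflow, here is a Python sketch that classifies the cdir output shown above. It assumes that track lines start with a digit, that data tracks end in [DATA], and that an empty drive produces a line containing no_disc. DVDs are not covered, because no cdir output for a DVD is shown here. The function name and the returned labels are made up.

```python
def classify_cdir(output):
    """Classify `cdir -d <device>` output as 'data', 'audio', 'mixed' or 'no disc'."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines or any("no_disc" in line for line in lines):
        return "no disc"
    # The first line is the disc summary; track lines start with the track length
    track_lines = [line for line in lines[1:] if line[0].isdigit()]
    data_tracks = sum(1 for line in track_lines if line.endswith("[DATA]"))
    audio_tracks = len(track_lines) - data_tracks
    if data_tracks and audio_tracks:
        return "mixed"
    if data_tracks:
        return "data"
    if audio_tracks:
        return "audio"
    return "unknown"
```

The result could then be used to pick a tool, e.g. ddrescue or readom for "data", cdparanoia for "audio", and cdrdao for "mixed", in line with the findings in these notes.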
In this case use ddrescue as it is much more resilient in case of read errors. Also, even if a CD-ROM gives read errors, it may be possible to recover additional data by re-running ddrescue using multiple CD drives. The ddrescue manual gives an example:
Example 3: Rescue a CD-ROM in /dev/cdrom using two CD drives from two different computers, writing the image into an USB drive mounted on /mnt/mem.
ddrescue -n -b2048 /dev/cdrom /mnt/mem/cdimage /mnt/mem/mapfile
ddrescue -d -r1 -b2048 /dev/cdrom /mnt/mem/cdimage /mnt/mem/mapfile
(umount the USB drive and move both USB drive and CD-ROM to second computer)
ddrescue -d -r1 -b2048 /dev/cdrom /mnt/mem/cdimage /mnt/mem/mapfile
(if errsize is zero, /mnt/mem/cdimage now contains a complete image of the CD-ROM and you can write it to a blank CD-ROM)
Footnotes
1. If you don't do this you will get the error message ERROR: Unable to open SCSI device /dev/sr0: Device or resource busy.