March, 2013

The problem with recovering text files

This is a question I get a lot, so I thought I would expand on it more formally: “Why can’t DiskDigger recover plain text (.txt) files in ‘deeper’ mode?”

To begin, let me remind everyone that DiskDigger can recover text files in “deep” mode, where it scans the structure of the file system and recovers files based on clues provided by the file system. This makes “deep” mode capable of recovering any type of file, including text files, but only when the file system itself is still healthy.

However, in “deeper” mode, things are very different. Since DiskDigger no longer relies on the file system to parse the structure of files on the disk, it can only detect files based on byte sequences known to exist in certain types of files. For example, all PNG image files begin with the byte sequence 89 50 4E 47. Therefore, DiskDigger can look at every sector of the disk, and if a sector begins with this sequence of bytes (files must be sector-aligned), it knows that there’s a PNG file at that location. The same is true for many other types of files, like .JPG, .DOC, .MOV, etc. It’s also true for files that are built from text, but have a consistent structure, such as .HTML, .XML, .RTF, etc.

So now we come to the problem with pure text files. Unlike other types of files, text files do not contain any identifiable sequence of bytes. They only contain… well… text! There’s no underlying binary structure. This makes it nearly impossible for DiskDigger to “pick out” a text file from all the other random content on a disk.

Despite all this, there are a few remote possibilities for recovering text files, and these are an active area of development in DiskDigger. None of these are perfect, but they may eventually lead to a solution for recovering some text files:

Some text files may be encoded in Unicode (specifically UTF-16). In this case, the text file will have a starting byte signature, which is either FE FF or FF FE. Unfortunately this signature is only two bytes long, which is far too short to meaningfully identify UTF-16 files, and it will produce too many false positives.

Since many text files are encoded in ASCII or UTF-8, and written in English, we can expect them to contain only characters between 0x20 and 0x7E (along with \n and \t). We can then...
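The two detection ideas described above, matching a known signature at the start of each sector and checking whether a sector holds nothing but plain text, can be sketched roughly as follows. This is only an illustration and not DiskDigger's actual code: the 512-byte sector size, the function names, and the choice to also accept carriage returns and to treat 0x7F (DEL) as non-printable are all assumptions made for the example.

```python
SECTOR_SIZE = 512  # assumed sector size; many modern disks use 4096

# First four bytes of every PNG file, as given in the text.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47])

# Printable ASCII (0x20 through 0x7E) plus tab and line feed.
# Carriage return is an extra assumption, added for Windows line endings.
TEXT_BYTES = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def find_signature(data: bytes, signature: bytes,
                   sector_size: int = SECTOR_SIZE) -> list:
    """Return the offsets of all sectors that begin with `signature`."""
    return [
        offset
        for offset in range(0, len(data), sector_size)
        if data.startswith(signature, offset)
    ]


def looks_like_text(sector: bytes) -> bool:
    """Return True if the sector is non-empty and every byte is plain text."""
    return len(sector) > 0 and all(b in TEXT_BYTES for b in sector)


# Hypothetical disk image: one empty sector, one sector that starts a PNG,
# then some English text.
image = bytes(512) + PNG_SIGNATURE + bytes(508) + b"Hello, world!\n" * 40

png_offsets = find_signature(image, PNG_SIGNATURE)
text_sector = looks_like_text(image[1024:1536])
empty_sector = looks_like_text(image[0:512])
```

Note that a check like `looks_like_text` only says that a sector could be text. It says nothing about where a text file starts or ends, and the final sector of a real text file is usually padded with non-text bytes, which is part of why this approach is so much weaker than a true signature.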


Recovering QIC-150 tapes: a case study

A friend of mine recently found something very intriguing: several backup tapes from his old Amiga 500 computer from around 1995. Since he no longer has the original Amiga machine, he no longer has a way of reading the tapes, and thus no way of rediscovering the old documents, letters, or any other memories that were lost when the old computer was thrown away.

Without hesitation, I offered to help recover the data from the tapes, not only for the challenge of it, but also because I have never dealt with Amiga computers or Amiga-formatted media, and this would be a great opportunity to familiarize myself with a significant part of computing history, even if it has already come and gone.

This is a brief chronicle of the steps I took to recover the data, just in case someone else in the future (including myself) needs to perform a similar task. One of the most pleasurable aspects of this experience was how easily everything came together, owing mostly to the abundance of information on the web regarding every step of these kinds of processes. Hopefully the information in this article will contribute to that abundance.

The tapes were QIC-150 cartridges (specifically Sony D6150). After rummaging a bit through my attic, I realized that I actually own a tape drive that’s capable of reading QIC-150 tapes. The drive is an Archive Viper 2150S, which uses a SCSI interface.

Since the drive is SCSI, I would need a SCSI host adapter card to interface with it. Luckily, after rummaging a bit more, I found an Adaptec AHA 2940UW adapter which looked like it would be perfect for the job.

I installed the adapter into an old PC that “still” has a PCI bus (how far we’ve come!), connected the drive to it with a 50-pin SCSI ribbon cable, and put a terminator on the end of the cable (I could have also used 8-pin SIP resistors for which the drive has sockets). The jumpers on the drive were already set to have a SCSI ID of 0, so I didn’t have to make any physical changes to the drive or the adapter.

I booted up the PC, and was happy to see that the Adaptec card was correctly reporting the Viper 2150S as being connected with an ID of 0. So far, so good! The computer booted into Windows XP, and...


DiskDigger released

The latest version of DiskDigger is now available for download! Go to the DiskDigger website to check out the new features and download the updated program.
