Tuesday, April 16, 2013

How to recover any file from corrupted or broken HDs, SSDs, or SD cards


A couple of weeks ago, I lost a Raspberry Pi SD card to corruption so bad that it didn't even have backup superblocks to restore from. Since I couldn't mount the card, there seemed to be nothing I could do about it. Restoring the system wasn't a big deal. However, I really wanted some of the cpp, h, js, php, sh, and other files that I had changed a good bit but hadn't yet backed up or committed. Sure, we all back up and/or use git/svn, but we have to remember Murphy's Law.

This is a guide on how to easily recover those files that you really want but hadn't backed up.

Needed packages

To do this, we are going to use scalpel, xxd, diff, and grep (so we are most definitely using Linux).

You most likely already have xxd, diff, and grep on your system. To install scalpel, run:
sudo apt-get install scalpel
or yum or pacman or whatever package manager you use.

Note: If you are able, it is a good idea to make an image of the drive and run scalpel on that image instead of on the damaged device itself.
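
As a rough sketch of making that image with dd (the device name /dev/sdX is a placeholder, and make_image is just a helper name made up for this example):

```shell
# make_image SRC DST: copy a device (or any file) to an image file.
# conv=noerror,sync tells dd to keep going after a read error and to pad
# the unreadable block with zeros, so later data keeps its original offset.
# The final block is also padded to a full 64 KiB, which is harmless here.
make_image() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=65536 conv=noerror,sync
}

# Example (placeholder device; check the real name with lsblk, run as root):
#   make_image /dev/sdX yourcopy.img
```

If the drive has a lot of unreadable sectors, a dedicated tool such as GNU ddrescue may cope better than plain dd.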

Editing Scalpel configuration

The default scalpel configuration file is located at /etc/scalpel.conf
All you need to do is comment out the file types you don't care about and add the filetypes you do care about that aren't in there.
To create custom scalpel configurations, we will use a little bit of common sense and xxd as a hex dumper.
Find a different file with the same file type or an older copy of the file you want and run the following command:
xxd -l 0x04 filename.type; xxd -s -0x04 filename.type

This prints the first 4 and last 4 bytes of the file, which should be enough to make the file type unique, though you can change the byte count as needed. You should see a pattern across a lot of your files.
For instance, my sh files (the ones I wrote) often start with #!/b and end with sac. or fi.. (the . that xxd shows in its text column stands for a non-printable byte, here the trailing newline, 0x0a),
and my cpp files often start with #inc or // and end with }.
Now we will take these hex patterns, reformat them as \x?? and put them into scalpel.conf.
You should use REVERSE whenever the file may contain multiple instances of the end statement, so that scalpel searches backwards for the footer and takes the last one within the size limit. You may also need to increase the maximum file size (in bytes) from 50000 to something much bigger for larger files.
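
To save retyping the hex by hand, the conversion to \x?? can be scripted. The sketch below uses od (part of coreutils) rather than xxd for the dump, purely because its output is easy to post-process; hex_escapes and signature are helper names invented for this example.

```shell
# hex_escapes: read bytes on stdin, print them as \xNN escapes on one line.
hex_escapes() {
    printf '%s\n' "$(od -An -v -tx1 | tr -d ' \n' | sed 's/../\\x&/g')"
}

# signature FILE [N]: show the first and last N bytes (default 4) of FILE
# in the form scalpel.conf expects for its header and footer columns.
signature() {
    file=$1
    n=${2:-4}
    printf 'header: '
    head -c "$n" "$file" | hex_escapes
    printf 'footer: '
    tail -c "$n" "$file" | hex_escapes
}

# Example:
#   signature filename.type
#   signature filename.type 8
```

Run it over a handful of files of the same type and keep only the bytes that stay the same in all of them.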

Here are some of my custom scalpel configurations:

php    y    50000    \x3c\x3f\x70\x68\x70    \x3f\x3e    REVERSE
js     y    50000    \x3c\x73\x63\x72\x69\x70\x74\x20\x74\x79\x70\x65\x3d\x22\x74\x65\x78\x74\x2f\x6a\x61\x76\x61\x73\x63\x72\x69\x70\x74\x22\x3e    \x3c\x2f\x73\x63\x72\x69\x70\x74\x3e
cpp    y    50000    \x23\x69\x6e\x63    \x7d\x0a    REVERSE
cpp    y    50000    \x2f\x2f\x20        \x7d\x0a    REVERSE
h      y    50000    \x23\x69\x66\x6e    \x64\x69\x66    REVERSE
sh     y    50000    \x23\x21\x2f\x62    \x73\x61\x63    REVERSE
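
Rather than editing /etc/scalpel.conf in place, one option is to keep just the rules you need in a separate file and point scalpel's -c option at it. A minimal sketch, where write_conf and the file name recover.conf are made up for this example and the two rules are the php and sh ones from my list:

```shell
# write_conf PATH: write a small scalpel configuration to PATH.
write_conf() {
    # The quoted 'EOF' stops the shell from touching the backslashes.
    cat > "$1" <<'EOF'
# extension  case-sensitive  max-size  header  footer  [option]
php    y    50000    \x3c\x3f\x70\x68\x70    \x3f\x3e        REVERSE
sh     y    50000    \x23\x21\x2f\x62        \x73\x61\x63    REVERSE
EOF
}

# Example:
#   write_conf ./recover.conf
#   scalpel -c ./recover.conf yourcopy.img -o output
```

This leaves the stock configuration untouched, so you can always go back to it.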


Running Scalpel

Once your scalpel configuration is finished, you can start extracting the files. This will probably take a while and will produce a lot of files, including multiple copies of each file that matches. Scalpel won't know the original filename or which copy is the newest version, but we will look at that later. You can run scalpel with the following command:
scalpel -c /etc/scalpel.conf yourcopy.img -o output
Or, if you didn't make a copy of your HDD, run it against the device directly (you will have to specify the appropriate drive and partition number):
scalpel -c /etc/scalpel.conf /dev/sdx# -o output

Once scalpel is finished, everything it recovered will be in the output folder, and we can look for the newest version of your file.

Finding your files

cd into the output directory and you can see the recovered files organized into directories by file type.
cd output; ls
Media files such as videos and pictures should all be listed there, and you will have to rename them yourself.
For your other files, we can use grep recursively to find the newest version. This unfortunately requires you to remember something about their contents: it is a good idea to search for a function or variable name that you know you used. Ex:
grep -R "playing()" ./
Searches for the function playing in all of your files; this could give you matches in js, php, cpp, c, or h files.
grep -R "::CurlWriter" ./cpp*
Searches for the definition of CurlWriter in all of the cpp files.
grep -R "VoiceCommand" ./h*
Searches for the variable or function VoiceCommand in all of your header (h) files.
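
If one search term returns too many hits, it can help to require a second identifier from the same file. A small sketch (candidates is a made-up helper name; -F makes grep treat the terms as plain strings rather than regular expressions):

```shell
# candidates DIR FIRST SECOND: list files under DIR that contain both
# FIRST and SECOND as literal strings.
candidates() {
    dir=$1
    first=$2
    second=$3
    grep -RlF -- "$first" "$dir" | while IFS= read -r f; do
        if grep -qF -- "$second" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example:
#   candidates . "::CurlWriter" "VoiceCommand"
```

The more recently you added the second identifier to your code, the more likely the remaining files are the newer copies.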

These will probably give you a couple of different results. In my case, these were all different versions of my file. This is a really good chance to use diff: all you have to do is pick two of the files and diff them to see the differences. Ex:
diff ./sh-5-1/00135386.sh ./sh-5-1/00135393.sh
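
Because scalpel tends to carve several byte-for-byte identical copies, it may be worth collapsing the duplicates before diffing. A sketch using cksum (unique_copies is a made-up helper name, and it assumes the paths contain no spaces, which holds for scalpel's numbered file names as long as your output path has none):

```shell
# unique_copies DIR: print one path for each distinct file content in DIR.
unique_copies() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        # cksum on stdin prints "CRC SIZE"; append the path as a third field.
        printf '%s %s\n' "$(cksum < "$f")" "$f"
    done | sort | awk '{ key = $1 " " $2 } !seen[key]++ { print $3 }'
}

# Example:
#   unique_copies ./sh-5-1
```

Only the files it prints need to be compared with diff; the rest are exact duplicates of one of them.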

Once you start getting used to this, you will get better at searching for the newest function or variable name you remember. Once you narrow it down to the newest copy of your file, you can move it elsewhere.
Ex:
mv ./sh-5-1/00135393.sh ~/ImportantFile.sh


There it is. This should allow you to recover almost any file you have created from a corrupted or bad HD, SSD, or SD card.
