scarpino.dev

How I lost my blog content

…and, luckily, how I restored it!

Let me say this before you start reading: back up your data NOW!

Really, do it. I postponed this for so long and, as a result, I had a dramatic weekend.

Last Friday I had the wonderful idea to update my Ghost setup to the newer 0.5. I did this from my summer house via SSH, but the network isn’t the culprit here.

You should know that some months ago, maybe more, I switched from a package installation, through this PKGBUILD, to an installation via npm. So, as soon as I typed npm update, everything in my node_modules/ghost directory was gone, database included. Yep, I must be dumb.

After a few minutes, which helped me better understand the situation, I immediately shut down the BeagleBone Black.

The day after, I went home, installed Arch Linux ARM on a microSD card and, obviously, the super TestDisk, which has had SQLite support for a while now. Cool!

This way I restored the Ghost database, BUT it was corrupted. However, a Stack Overflow search pointed me to this command:

cat <( sqlite3 ghost.db .dump | grep "^ROLLBACK" -v ) <( echo "COMMIT;" ) | sqlite3 ghost-fixed.db

After that, I was able to open the database and to restore 14 of 40 posts.
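The one-liner above can be read as three steps: dump whatever SQL is still readable, drop the ROLLBACK line, and append a COMMIT so the replay into a new file is kept. As far as I know, sqlite3 ends a dump with "ROLLBACK; -- due to errors" instead of "COMMIT;" when it hits damaged pages, which is why the filter is needed. Here is a minimal sketch of the same idea split into functions; fix_dump and recover_db are hypothetical helper names, not part of sqlite3.

```shell
#!/bin/sh
# Sketch only: fix_dump and recover_db are made-up names for illustration.

# Read an SQL dump on stdin, drop the ROLLBACK line that sqlite3 prints
# when .dump ran into errors, then close the transaction with COMMIT.
fix_dump() {
  grep -v '^ROLLBACK'
  echo 'COMMIT;'
}

# Dump what is still readable from the damaged file ($1)
# and replay it into a fresh database file ($2).
recover_db() {
  sqlite3 "$1" .dump | fix_dump | sqlite3 "$2"
}
```

With these defined, `recover_db ghost.db ghost-fixed.db` should behave like the original command. Rows stored on unreadable pages are still lost; the filter only keeps what the dump managed to print.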

My second attempt was to use the Google cache. With this method I recovered about 10 posts. Nice, I already had more than 50% of the total content! I was feeling optimistic.

The Arch Linux Planet let me recover 3 more posts, which I could have recovered anyway using Bartle Doo; I had never heard of this website before, but thanks to it I recovered some posts by searching for my first and last name.

I was almost there. About 10 posts were still missing, but how to recover them? I didn’t remember the titles, and googling without specific keywords didn’t help either.

I went back to the broken SQLite database. Vim can open it, so let’s look into it for some data. Bingo! The missing posts’ titles are still there!

Then I started googling again, this time for the specific titles, which pointed me to websites mirroring my posts’ content. At the end of this step I had 38 of 40 posts!
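Scrolling through a binary file in Vim works, but the readable text can also be pulled out on its own. The following is a rough sketch, not what was actually run here: printable_runs is a hypothetical helper that turns every non-printable byte into a line break and keeps only runs of a minimum length (8 by default, an arbitrary choice).

```shell
#!/bin/sh
# Sketch only: printable_runs is a made-up helper name.
# Usage: printable_runs FILE [MIN_LENGTH]

printable_runs() {
  # Replace every byte that is not printable ASCII with a newline,
  # then keep only the lines that are at least MIN_LENGTH long.
  LC_ALL=C tr -c '[:print:]' '\n' < "$1" |
    awk -v min="${2:-8}" 'length($0) >= min'
}
```

Something like `printable_runs ghost.db 12 | less` would then list candidate titles and text fragments without the binary noise in between.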

I can’t stop now; it’s more than a challenge at this point.

I went back again to the broken database, where the posts’ content is corrupted: there’s some text, then symbols, and then more text which doesn’t make any sense together with the first part. This looks like a tedious job. This Saturday can end here.

It’s Sunday; I’m motivated and I can’t lose those 2 posts because of my laziness. I have the missing posts’ titles and I now remember their content, so I started to look for their phrases in the database and, to my great surprise and with a lot of patience, I recovered their content! This worked mainly because Ghost keeps both the Markdown and the HTML text in the database, so each post’s content is stored twice, which decreases the chance of the same phrase being corrupted in both copies.
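The phrase hunt described above can be partly automated. This is a hedged sketch assuming GNU grep (for the -b byte-offset option): context_around is a hypothetical helper that finds every occurrence of a remembered phrase in the raw file and prints the bytes around it, with non-printable bytes shown as dots. The 200-byte default window is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: context_around is a made-up helper name; assumes GNU grep.
# Usage: context_around FILE PHRASE [BYTES_OF_CONTEXT]

context_around() {
  file=$1
  phrase=$2
  span=${3:-200}
  # -a: treat the binary file as text, -b -o: print "offset:match",
  # -F: take the phrase literally instead of as a regex.
  LC_ALL=C grep -a -b -o -F -- "$phrase" "$file" | cut -d: -f1 |
  while read -r offset; do
    start=$((offset - span))
    if [ "$start" -lt 0 ]; then start=0; fi
    length=$((offset - start + ${#phrase} + span))
    # tail -c +N counts from 1, so add 1 to the zero-based start offset.
    tail -c "+$((start + 1))" "$file" | head -c "$length" |
      LC_ALL=C tr -c '[:print:]' '.'
    echo
  done
}
```

Because the same phrase may show up once in the Markdown copy and once in the HTML copy, a call such as `context_around ghost.db 'some remembered phrase'` can print two hits, and the damaged part of one may be intact in the other.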

Another summer, another Linux survival experience (that I’m pleased to link to!).


By Andrea Scarpino on 2014-08-20
