How I Restored My Database from a Memory Dump

Yesterday evening, I was coding on my free-time project and dared to touch some back-end code and a database schema.

I wanted to rename a key in a collection from created to createdAt, nothing complicated. Renaming a key in MongoDB is normally as simple as calling Schema.updateMany({}, { $rename: { oldKeyName: 'newKeyName' } }), but for some reason it didn't work. After going through the docs, I found out that this was not possible for that specific key name:

The createdAt property is immutable, and Mongoose overwrites any user-specified updates to updatedAt by default.

So I came up with another stupid idea: just fetch all documents, delete them from the database, and insert them back in with the changed key name. This is the code I wrote for the operation:

const docs = await Schema.find({})

await Promise.all(docs.map(async (doc) => {
	await Schema.deleteOne({ _id: doc._id })
	
	doc.createdAt = doc.created
	delete doc.created
	
	await new Schema(doc).save()
}))

You may already notice where this went wrong. I ran the script, it deleted all the documents from the database, and didn't insert them back in, because of some error at await new Schema(doc).save().
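
A less fragile ordering for a migration like this is to finish every step that can fail harmlessly before anything is deleted: write the untouched documents to a backup file, build the renamed copies in memory, and only then run the destructive write. The sketch below illustrates that ordering in Python on plain dictionaries rather than on a real MongoDB collection. `rename_key` and `backup_then_transform` are hypothetical helper names, not part of Mongoose or any database driver.

```python
import json


def rename_key(doc, old, new):
    """Return a copy of doc with key `old` renamed to `new`; doc itself is untouched."""
    renamed = dict(doc)
    if old in renamed:
        renamed[new] = renamed.pop(old)
    return renamed


def backup_then_transform(docs, backup_path, old, new):
    """Write the original documents to disk, then return renamed copies."""
    # Step 1: persist the originals. If this raises, nothing has been changed yet.
    with open(backup_path, "w", encoding="utf-8") as f:
        json.dump(docs, f)
    # Step 2: build the new documents in memory, leaving the originals intact.
    return [rename_key(doc, old, new) for doc in docs]
```

Only after both steps succeed would the write against the database run, and if it fails halfway, the backup file still holds every original document.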

And so I was sitting there, crying, staring at the blank screen, trying to somehow get my data back from a backup (of course I had no backup) or some cache or from console logs or by a miracle, for several hours, but I was left with nothing.

I was about to accept my fate and was thinking about how to explain to my friends, who test and use my app while it's in development, how I could lose about five hundred user notes by accident, but then I somehow stumbled upon this question on StackExchange from a user who lost form data in Firefox. One of the answers suggested:

Do not restart your browser or press the back button! This solution is hit or miss, and works on Linux. In short: dump the memory of the Firefox process, and search through it for fragments of your text. It's ugly, but it's your last resort.

A spark of hope lit up within me: the server process was still running, and some data from the last database fetch was probably still in memory. I followed the answer and created a core dump with gcore, and after about five minutes, a massive 65 GB core dump file had been written to my disk. Using grep, I could confirm that at least some of the documents were still somewhere in there, and after trying to find the data by hand (text editors get really slow when you open a 65 GB file), I decided to write a Python script to do the job for me.

import os

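def process_file(input_file_path, output_file_path, search_string=b'"_id": "6', chunk_size=4096):
    # Copy every chunk of the dump that contains search_string to the output file.
    # Caveat: a match that straddles two chunks is missed, and only chunks that
    # contain a match are written, so a document that continues into the next
    # chunk is cut off there.
    total_size = os.path.getsize(input_file_path)
    processed_size = 0

    # Read the dump as raw bytes; append to the output so earlier results are kept
    with open(input_file_path, 'rb') as input_file, open(output_file_path, 'ab') as output_file:
        while True:
            chunk = input_file.read(chunk_size)

            if not chunk:
                break  # End of file

            processed_size += len(chunk)
            progress = (processed_size / total_size) * 100

            if search_string in chunk:
                print(f'Match found. Progress: {progress:.2f}% - {processed_size} / {total_size}')

                output_file.write(chunk)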

The script splits the file into small chunks and searches all of them for the object ids of the deleted documents, which takes under two minutes. The result? An almost perfect JSON representation of the data just moments before it was deleted. After painstakingly sorting out about thirty corrupted documents by hand, I could recover ~350 of the ~450 deleted documents. Some parts of the data are still corrupted, especially images stored as Base64 strings, but I think it would be much worse to be left with nothing, so I'm thankful for what I have now.
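
Part of the manual sorting could in principle be automated by trying to parse a JSON object at every opening brace in the recovered bytes and keeping only the ones that decode cleanly. The following is a minimal sketch of that idea, not the method used above: `extract_documents` is a hypothetical helper, and it assumes the documents sit in memory as complete JSON objects with an `_id` key. Anything truncated or containing stray bytes is skipped and would still have to be inspected by hand.

```python
import json


def extract_documents(raw):
    """Return every cleanly decodable JSON object with an "_id" key found in raw bytes."""
    # Undecodable bytes become U+FFFD so the text can still be scanned
    text = raw.decode("utf-8", errors="replace")
    decoder = json.JSONDecoder()
    docs = []
    seen = set()

    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and reports where it ended
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not a valid object here (truncated or corrupted): try the next brace
            pos = text.find("{", pos + 1)
            continue

        if isinstance(obj, dict) and "_id" in obj:
            # The same document may appear in memory several times; keep the first copy
            key = json.dumps(obj["_id"], sort_keys=True)
            if key not in seen:
                seen.add(key)
                docs.append(obj)

        pos = text.find("{", end)

    return docs
```

Feeding it the contents of the output file written by the script, for example via `open(path, 'rb').read()`, would return a list of dictionaries that can be checked before being re-inserted.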

This entire adventure cost me about four hours of sleep, and if there is one thing to learn from it, it is this: keep backups of everything, even if it doesn't seem that important.
