@Rajveer100
Last active June 19, 2023 08:47
Analysing MongoDB's Full Time Diagnostic Capture (FTDC)

About FTDC

FTDC, originally short for full-time diagnostic data capture, is MongoDB's internal diagnostic data collection facility. It encodes data in a space-efficient format, which allows MongoDB to record diagnostic information every second and store weeks of data in only a few hundred megabytes (MB) of storage. While it's a great initiative on Mongo's end to help us monitor such metrics, at the moment there isn't a concrete tool provided by Mongo, nor an open-source repository, that we can rely on by itself to effectively decode the data.

But as usual, after some research across the internet, there are blogs, tools and docs where software developers and open-source contributors have shared great findings that get us close to visualising this data in different formats using open-source tools. Let's explore them.

Motivation and Initial Steps

GoLang happens to have good documentation about FTDC parsing and generation, with its specific tools and functions, as here. It contains all the details one would need to start some work from scratch. Fortunately, there is an open-source tool called Keyhole, implemented by Ken Chen, which allows us to visualise FTDC metrics and statistics along with several other informative details, as described in his blogs.

GoLang Setup

Before proceeding to install Keyhole or any Go-specific tool/repo, here are some prerequisites for setting up GoLang, as it depends entirely on its modules/packages/directory layout and can otherwise cause problems while setting up other projects written in GoLang as well:

  • Install GoLang via its website, ensuring the go version command works as described.
  • Open Terminal (or similar, based on your OS) and type go env; you will see a list of environment variables. We are interested in GOROOT="/usr/local/go" and GOPATH="$HOME/go".
  • To make Go accessible from any directory in your command line, export Go's bin directory onto your PATH: export PATH=$PATH:/usr/local/go/bin.

That completes the initial installation setup. One last step: whenever a new project/workspace needs to be created, it must live in your $GOPATH (i.e. "$HOME/go"). Before proceeding, ensure the /src, /pkg and /bin directories exist in the $GOPATH (if not, create them with mkdir the first time you install Go). Any new Go project then goes under the /src directory, as sketched below.
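
As a quick reference, the steps above amount to something like the following. The Keyhole clone at the end is only an illustration of placing a project under $GOPATH/src; the repository URL is Ken Chen's project, but double-check it against his blogs.

# make the go toolchain reachable from any directory
export PATH=$PATH:/usr/local/go/bin
go version

# create the standard GOPATH layout once (default GOPATH is $HOME/go)
mkdir -p "$HOME/go/src" "$HOME/go/pkg" "$HOME/go/bin"

# example: a Go project such as Keyhole lives under $GOPATH/src
cd "$HOME/go/src"
git clone https://github.com/simagix/keyhole.git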

Decoding FTDC Metrics Locally

A typical FTDC metrics file contains two kinds of documents, distinguished by a type field: type 0 is a metadata document, whose doc field is plain BSON, and type 1 is a metrics chunk, whose data field holds the metrics encoded in binary/base64 form (i.e. compressed BSON) that needs to be decompressed before it can be read.
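
To get a feel for this layout, you can peek at both document types directly with bsondump and jq (the input file name below is a placeholder; the doc/data field names follow the description above):

# type 0: metadata document, plain BSON under the .doc field
bsondump --quiet metrics.ftdc | jq 'select(.type."$numberInt" == "0") | .doc | keys'

# type 1: metrics chunk, .data holds base64-encoded, zlib-compressed BSON
bsondump --quiet metrics.ftdc | jq 'select(.type."$numberInt" == "1") | .data."$binary".base64' | head -c 200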

To save our time once again, Alex Bevilacqua has a well-written blog about FTDC data and how we can use Mongo's bsondump utility and jq to query and filter out the necessary data based on the type of the metrics.
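
For reference, here is a one-shot sketch along the lines of that blog post (file names are placeholders): extract the type-1 chunks, base64-decode them, drop the leading 4-byte uncompressed-length field, and inflate the rest.

# one-shot decode (may hit buffer/EOF limits on large files, as noted below)
bsondump --quiet metrics.ftdc \
  | jq -r 'select(.type."$numberInt" == "1") | .data."$binary".base64' \
  | ruby -rzlib -rbase64 -e 'STDIN.each_line { |l| print Zlib::Inflate.new.inflate(Base64.decode64(l)[4..-1]) }' \
  > DecodedAll.bson

bsondump --quiet DecodedAll.bson | jq -s '.' > decoded_metrics.json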

Limitations of the Decoding Process Using bsondump

If you try to execute the above command, you may encounter issues related to byte size (memory/EOF/buffer limits) or something similar. We can instead divide the metrics data into multiple chunks/objects, decode each object individually, and append the result to the output file each time the i-th chunk is successfully decoded. Here is a modified script for this purpose:

Bash Script

# Remove the output file if it already exists; the script will create/append to it.
if test -f "$2"; then
    rm "$2"
fi

# Decode the metrics BSON file into JSON and collect the base64-encoded chunks.
bsondump --quiet "$1" | jq 'select( .type."$numberInt" == "1" ) | .data."$binary".base64' | jq -s '.' > EncodedArray.json

# Get the number of chunks.
EncodedArrayLength=$( cat EncodedArray.json | jq 'length' )
echo "$EncodedArrayLength Encoded Chunks Found"

# The Base64 decoding and bsondump MUST be done separately for each chunk.
for (( index=0; index<$EncodedArrayLength; index++ ))
do
    echo -ne "Decoding Chunk $index\033[0K\r"
    cat EncodedArray.json | jq --argjson i $index '.[$i]' | ruby -rzlib -rbase64 -e 'd = STDIN.read; print Zlib::Inflate.new.inflate(Base64.decode64(d)[4..-1])' > "DecodedArray.bson"

    # Append the resulting chunk's JSON data to the end of the output file.
    bsondump --quiet DecodedArray.bson | jq -s '.' >> "$2"
done

# Clean up temp files used.
rm DecodedArray.bson
rm EncodedArray.json
echo "All Chunks Decoded"


# Tip: search (Cmd+F) the output JSON for "vmstat" to locate the system metrics section.

The above script can be executed as ./[Script_Name] [input metrics file] [output JSON file], and you will then have the decoded JSON data in your working directory.
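
For instance, assuming the script is saved as decode_ftdc.sh (a hypothetical name) and the input comes from a node's diagnostic.data directory (the metrics file name below is illustrative):

chmod +x decode_ftdc.sh
./decode_ftdc.sh /path/to/diagnostic.data/metrics.2023-06-01T00-00-00Z-00000 decoded_metrics.json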

Importing Decoded Data to MongoDB

One could also import the same data into MongoDB to get a better overview and to query the necessary data based on the attributes of the collection (in this case, FTDC metrics). Mongo already provides the mongoimport tool with various arguments (i.e., --options); in our scenario we want to use the --jsonArray flag for each chunk in the metrics file and append it during the decoding process (as above).

We can now modify the script by adding the following inside the for loop, just before the end of each iteration:

# Import each chunk into Mongo using mongoimport.
touch temp_decoded_chunk.json
bsondump --quiet DecodedArray.bson | jq -s '.' >> temp_decoded_chunk.json
mongoimport --db [db_name] --collection [collection_name] --file temp_decoded_chunk.json --jsonArray
rm temp_decoded_chunk.json

You may verify the successful import by checking in mongosh (or an equivalent GUI, e.g. Compass).
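
For example, a quick count from the command line (the db/collection names are placeholders and should match those passed to mongoimport):

mongosh --quiet --eval 'db.getSiblingDB("[db_name]").getCollection("[collection_name]").countDocuments()'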

Analysing Mongo Logs

There are various tools across the internet that allow us to analyse Mongo logs; in our case we will use Ken Chen's tool called Hatchet (refer to the GoLang setup above for installation). It offers a variety of ways to ingest logs, e.g. via URL, AWS S3, Atlas, etc., which you can choose based on your requirements. The tool requires the log file to be in .gz format, which you can create with the gzip tool. The rest is well documented in the repo.
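
As a rough sketch (the log path is a placeholder, and the Hatchet invocation is an assumption; check the repo's README for the exact build and run steps):

# compress the log as Hatchet expects (-k keeps the original file)
gzip -k /path/to/mongod.log

# run Hatchet against the compressed log; binary name/location assumed from the repo's build output
./hatchet /path/to/mongod.log.gz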
