So I discovered that the compile_commands.json files generated by Bear contain duplicate entries, even though Bear's documentation says it should filter out duplicates. That was probably the result of an outdated (two-year-old) version of Bear. I have since updated it, but I have not had time to check whether the issue is fixed.
How do you filter out duplicate entries from a JSON array with thousands of elements, preferably keeping only the last element of each group, since it contains the most up-to-date compilation parameters?
One way is to simply remove compile_commands.json and rebuild the whole project, but that can be very tedious: on my machine it takes more than an hour to rebuild the project I work on, and I have a dozen different versions in separate directories... After some thinking I recalled a tool called jq ('JSON query'). I had some doubts about whether I would be able to do my specific task with it, so I went to the documentation page: https://stedolan.github.io/jq/manual/
After many unsuccessful attempts I found the correct syntax to filter out the duplicate entries:
jq '[group_by(.file)[] | last]' compile_commands.json > compile_commands_reduced.json
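To see what this filter does, here is a minimal sketch on a made-up three-entry sample (the paths and flags are hypothetical, not taken from my project). group_by(.file) sorts the entries by .file and collects entries with the same file into sub-arrays, and last takes the final entry of each sub-array. As far as I can tell, jq keeps the original order inside each group, so last should be the most recent entry for that file.

```shell
# Hypothetical sample: a.c appears twice with different flags.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.json" <<'EOF'
[
  {"directory": "/src", "file": "a.c", "command": "cc -O0 -c a.c"},
  {"directory": "/src", "file": "b.c", "command": "cc -O2 -c b.c"},
  {"directory": "/src", "file": "a.c", "command": "cc -O2 -c a.c"}
]
EOF

# Group entries by file, keep the last entry of each group,
# and wrap the results back into a single array.
reduced=$(jq -c '[group_by(.file)[] | last]' "$tmpdir/sample.json")
echo "$reduced"

rm -rf "$tmpdir"
```

Two things to keep in mind: the result comes out sorted by file name rather than in the original order, and grouping by .file alone would also collapse entries where the same file is legitimately compiled more than once (for example, for different targets). In my case neither mattered, but that is an assumption about my project, not a general rule.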
Some files were reduced from ~20000 records to fewer than 2000 records! To check the number of records I used the following commands:
# number of entries in the original file
jq 'length' compile_commands.json
# number of entries after reduction (the same filter applied to the original file)
jq '[group_by(.file)[] | last] | length' compile_commands.json
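Besides the totals, it can be useful to see which files are actually duplicated. The following is a sketch under the same assumptions as before (a hypothetical sample file, duplicates identified by .file only): it prints the number of entries and the file name for every file that occurs more than once.

```shell
# Hypothetical sample: a.c appears twice, b.c once.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.json" <<'EOF'
[
  {"directory": "/src", "file": "a.c", "command": "cc -O0 -c a.c"},
  {"directory": "/src", "file": "b.c", "command": "cc -O2 -c b.c"},
  {"directory": "/src", "file": "a.c", "command": "cc -O2 -c a.c"}
]
EOF

# For each group with more than one entry, print "<count><TAB><file>".
dups=$(jq -r 'group_by(.file)[] | select(length > 1) | "\(length)\t\(.[0].file)"' "$tmpdir/sample.json")
echo "$dups"

rm -rf "$tmpdir"
```

On a real compile_commands.json, replace the sample path with the real file and pipe the output through sort -rn to list the most duplicated files first.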
After doing this I am really impressed by the possibilities of the jq tool.