@apollosadmin
Created June 23, 2025 17:34

Diff Summary.

diff File

The diff file describes the net changes across the entire DB. Any change to this file should reflect an intended change to the shovel output. If you are performing a refactor with no intended changes to the output, you should expect no changes to this file.

csv File

The csv file lists every operation that resulted in a data update. This doesn't include touching a content item (updating only a timestamp), but it does include wiping a field out and then restoring that field to its original contents. It also tracks id changes.

Ideally, a DAG that is not producing changes should not wipe and restore data on a run (it should produce an empty csv). In practice, we have a long way to go in this area. Some operations (such as keeping track of content_tags) are easier to do by wiping the table and restoring it, which produces operations but doesn't result in a net diff.

table_name changed_fields client_query action row_data
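
To see which tables are producing wipe/restore churn, one option is to count the csv rows per table and action. The sketch below is a minimal illustration, not part of the shovel tooling. It assumes the csv is comma-delimited and has a header row with the column names listed above. The function name `summarize_operations` and the `DELETE`/`INSERT` action values in the sample are hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_operations(csv_text):
    """Count logged operations per (table_name, action) pair.

    Assumes a comma-delimited csv whose header row includes the
    `table_name` and `action` columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        counts[(row["table_name"], row["action"])] += 1
    return counts


# Hypothetical sample: a wipe-and-restore of one content_tags row.
# The action values DELETE/INSERT are illustrative assumptions.
sample = (
    "table_name,changed_fields,client_query,action,row_data\n"
    "content_tags,,,DELETE,{}\n"
    "content_tags,,,INSERT,{}\n"
)

summary = summarize_operations(sample)
for (table, action), n in sorted(summary.items()):
    print(f"{table}\t{action}\t{n}")
```

An empty result means the run logged no operations. A table with many operations but no entry in the diff file is a likely wipe/restore candidate.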