The diff file describes the net changes across the entire DB. Changes to this file should represent intended changes to the shovel output. If you are performing a refactor with no intended changes to the output, you should expect no changes to this file.
The csv file records every operation that resulted in a data update. This doesn't include touching a content item (updating only its timestamp), but it does include wiping a field and then restoring it to its original contents. It also tracks id changes.
Ideally, a dag that is not producing changes should not wipe and restore data on a run, and so should produce an empty csv. In practice, we have a long way to go in this area. Some operations (such as keeping track of content_tags) are easier to do by wiping the table and restoring it, which produces operations but doesn't result in a net diff.
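
As a rough illustration of why the csv can be non-empty while the diff stays empty, here is a minimal Python sketch. The data model (rows stored as a dict keyed by id) and the names `apply_operations` and `net_diff` are hypothetical and are not part of the shovel tooling; they only mimic the wipe-and-restore pattern described above.

```python
from copy import deepcopy


def apply_operations(snapshot, operations):
    """Apply (row_id, field, new_value) operations to a copy of the snapshot."""
    result = deepcopy(snapshot)
    for row_id, field, new_value in operations:
        result.setdefault(row_id, {})[field] = new_value
    return result


def net_diff(before, after):
    """Return {(row_id, field): (old, new)} for every field whose value changed."""
    changes = {}
    for row_id in before.keys() | after.keys():
        old_row = before.get(row_id, {})
        new_row = after.get(row_id, {})
        for field in old_row.keys() | new_row.keys():
            old, new = old_row.get(field), new_row.get(field)
            if old != new:
                changes[(row_id, field)] = (old, new)
    return changes


# Hypothetical starting state: one content item with a tags field.
before = {1: {"content_tags": "a,b"}}

# Wipe the field, then restore it to its original contents.
operations = [(1, "content_tags", None), (1, "content_tags", "a,b")]
after = apply_operations(before, operations)

print(len(operations))          # two operations, analogous to two csv rows
print(net_diff(before, after))  # empty dict, analogous to an empty diff
```

In this sketch the operation list plays the role of the csv and the result of `net_diff` plays the role of the diff file: the run logs two operations, yet the before and after states are identical.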