- Can be used in conjunction with rsync. Executed before rsync to renames files to their new location and rsync can synchronize the rest as it works well for that
- Does detect the file renames that happened in a tree and apply the changes in the other tree
No change:
- Given two directories A and B with similar files in both
- When
renametree A B
is executed - Then no change is performed
- And the index file is updated for A and B
Simple rename:
- Given two directories A and B already synchronized
- And some files renamed in A
- When I execute
renametree A B
- Then Files in B are renamed the same way they were renamed in A
- And index for A and B is updated to track the new location of those files
Reverse rename:
- Given two directories A and B already synchronized
- And some files renamed in B
- When I execute
renametree A B
- Then Files in A are renamed the same way they were renamed in B
- And index for A and B is updated to track the new location of those files
Conflicting rename:
- Given two directories A and B already synchronized
- And some files renamed in A
- And the same files renamed in B with a different name
- When I execute
renametree A B
- Then the command stops and shows the files that are in rename conflict
- And index for A and B is updated to track the new location of those files
- A file is a regular file or a directory with an inode. Renames are detected by looking at the inode and finding it in a new location
- A directory tracked by this tool contains an index file which contains the location of each file given their inode
- A rename is detected when the inode is found in another location than specified in the index file
- The index track all earlier locationf of the files too
- The index is updated each time the tool is executed
- Files are associated unique identifiers that are shared across synchronized directories
main:
- record the start time
- for each file in A and B:
- if the file does not have an id
- generate an id based on hashing: the start time, the relative path and a hash of the data
- if the file locations is different than in index
- record the new file location in the index with the start time
- if the file does not have an id
- we know all files in A and B have an id and their location is recorded
- for each file id that exists in both A and B:
- if their relative path is the same
- do nothing, files are already at their correct location
- if the location history tells one file is on an older location
- move it to the new location and record the new location history in the index
- if both files have different locations but history cannot tell which one is the correct location
- record file to show a conflict error
- if their relative path is the same
- show all conflict errors
- exit with status >0 in case of errors
tell history for files A and B
- starting by the end of each file history, find a location that is the same on both files at the same time
- if not found
- this is a conflict, return now
- else, process only history from that point in time to the end of history.
- while the two histories have the same path as first entry in the history
- remove that entry from both histories
- if there is a file with an empty history
- that file is older, return the remaining history of the new file which contains the new location
- else, both files still have an history continuing
- return that there is a conflict
- while the two histories have the same path as first entry in the history
Solving a conflict is as easy as renaming a file in one copy to the same path of the other copy and running the tool again. it will record the same path for both copies of the file for the same point in time and be happy with it.
The index file name should contain in its name the inode number of the directory it refers to, so in case the file is copied by rsync elsewhere, it is not read mistakenly.
The index should contain entries with:
- the list of past locations of a file
- the inode number of a file
- the unique id of a file
- given a file name or an opened file, tell a unique id for it across directories
- given a file name or an opened file, create a new entry if it does not exists
- given a file name or an opened file, add a directory to the location history, with time
- given a unique file id, tell the timed history of the locations of this file
- The database is named
.renametree-%d
where %d is the inode number of the directory the file is on - The database is a line based text file
- the first work up to the first TAB or SPACE character tells the line type:
FID <UUID> <INUM>
: the line associates a unique ID to an inodeLOC <UUID> <TIME> <ESCAPED-PATH>
: the line associates a location in time to a file id
The format can be append-only and is easy to parse
Without an index file synchronized on both parties, a file can be identified by:
- an inode number, but only for the scope of a directory
- a data hash but only at some point in time
- a file name until this is renamed too
It seems impossible to track renames without an index