@chrisroos
Last active August 29, 2015 14:20
Exploring how to act on files that have changed in a directory

I've been using FileWatcher to perform actions on new/changed files. I ran into problems where FileWatcher invokes the block before the file has finished copying and my code then fails.
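
One way to illustrate the "file has not finished copying" problem is a check that waits until a file's size and mtime stop changing between polls. This is only a minimal sketch and not something the scripts in this gist do: `wait_until_stable` and its parameters are hypothetical names, and the check is a heuristic, since a stalled copy looks the same as a finished one.

```ruby
# Hypothetical helper (not part of FileWatcher or of this gist's scripts).
# Returns true once the file's size and mtime have been identical for
# `required_matches` consecutive polls, or false if `timeout` seconds pass
# first. Assumption: a file that has stopped changing has finished copying.
def wait_until_stable(filename, interval: 1.0, required_matches: 2, timeout: 60.0)
  deadline = Time.now + timeout
  previous = nil
  matches = 0

  while Time.now < deadline
    stat = File.stat(filename)
    current = [stat.size, stat.mtime]

    # Count consecutive polls that saw the same size and mtime.
    matches = current == previous ? matches + 1 : 0
    return true if matches >= required_matches

    previous = current
    sleep interval
  end

  false
end
```

A caller could skip, or retry later, any file for which this returns false instead of processing it straight away.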

My script runs under launchd, so I considered using the WatchPaths launchd.plist directive. However, the man page discourages its use, citing some of the same problems I've been having with FileWatcher:

Use of this key is highly discouraged, as filesystem event monitoring is highly race-prone, and it is entirely possible for modifications to be missed. When modifications are caught, there is no guarantee that the file will be in a consistent state when the job is launched.

The scripts in this gist are investigations into alternative approaches to only acting on files that have been created/changed.

In 'storing-a-hash-of-files.rb' I also compared the performance of Ruby's Digest::MD5 to that of the system md5 command, but there didn't seem to be any difference. The system md5 required me to use Shellwords to escape the filenames, so ultimately I think the Digest::MD5 version is a bit easier to understand.
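
The comparison might have looked something like the sketch below. The method names are hypothetical, and it assumes the macOS `md5` command, whose `-q` flag prints only the checksum; it is not the code that was actually benchmarked.

```ruby
require 'benchmark'
require 'digest'
require 'shellwords'

# Pure-Ruby digest: no shell involved, so no escaping is needed.
def ruby_md5(filename)
  Digest::MD5.file(filename).hexdigest
end

# Builds the shell command for the system md5 (assumed: macOS `md5 -q`).
# Shellwords.escape protects filenames containing spaces, quotes, etc.
def system_md5_command(filename)
  "md5 -q #{Shellwords.escape(filename)}"
end

# Runs the system md5 and returns the checksum without the trailing newline.
def system_md5(filename)
  `#{system_md5_command(filename)}`.strip
end

# Times both approaches over the same list of files (seconds, wall clock).
def compare_md5(filenames)
  ruby_seconds = Benchmark.realtime { filenames.each { |f| ruby_md5(f) } }
  system_seconds = Benchmark.realtime { filenames.each { |f| system_md5(f) } }
  { ruby: ruby_seconds, system: system_seconds }
end
```

Calling `compare_md5` with the result of the directory glob returns the two timings side by side. The extra `system_md5_command` step is the escaping overhead mentioned above, which the Digest::MD5 version avoids.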

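# storing-a-hash-of-files.rb
# Polls a directory and records an MD5 digest of every file it has seen, so
# that only new or changed files trigger an action.
require 'digest'
require 'fileutils'
require 'logger'

logger = Logger.new(STDOUT)

# An empty marker file named after each digest is kept in this directory.
digest_directory = File.expand_path('../tmp/digests', __FILE__)
FileUtils.mkdir_p(digest_directory) # FileUtils.touch fails if it is missing

loop do
  logger.info 'Checking for new/changed files'

  Dir['/Users/chrisroos/Desktop/tracks/*'].each do |filename|
    next unless File.file?(filename) # Digest::MD5.file raises on directories

    digest = Digest::MD5.file(filename).hexdigest
    digest_path = File.join(digest_directory, digest)

    # File.exists? was removed in Ruby 3.2; File.exist? is the supported name.
    unless File.exist?(digest_path)
      logger.info "New or changed file: #{File.basename(filename)}"
      FileUtils.touch(digest_path)
    end
  end

  logger.info 'Finished checking'
  sleep 10
end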

# Alternative approach: compare each file's ctime with the time of the last run.
require 'time'

LAST_RAN_AT_FILE = File.expand_path('../tmp/last_ran_at.txt', __FILE__)

# Returns the time recorded by the previous run, or nil on the first run.
def get_last_ran_at
  return nil unless File.exist?(LAST_RAN_AT_FILE)

  Time.parse(File.read(LAST_RAN_AT_FILE))
end

# Records the current time for the next pass to compare against.
def set_last_ran_at
  File.open(LAST_RAN_AT_FILE, 'w') { |f| f.puts(Time.now) }
end

loop do
  last_ran_at = get_last_ran_at

  Dir['/Users/chrisroos/Desktop/tracks/*'].each do |filename|
    # File.ctime avoids leaving a File handle open for every file on each pass.
    # On Unix, ctime is the inode change time, not the creation time.
    if last_ran_at.nil? || last_ran_at < File.ctime(filename)
      p "file modified since last run"
    end
  end

  set_last_ran_at
  sleep 10
end