public
Last active

fixing rails' broken caching interface with monkey patches. #YUCK

  • Download Gist
a.rb
Ruby
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
def write_entry(key, entry, options)
data = Entry.data_for(entry)
expires_at = entry.expires_at.to_i
created_at = Time.now.utc.to_i
 
collection.find(_id: key).upsert(_id: key, data: data, expires_at: expires_at, created_at: created_at)
end
 
def read_entry(key, options = {})
expires_at = Time.now.utc.to_i
doc = collection.find(_id: key, expires_at: {'$gt' => expires_at}).first
 
if doc
data = doc['data'].to_s
value = Marshal.load(data)
created_at = doc['created_at'].to_f
 
Entry.for(value, created_at)
end
end
 
# this class exists to normalize between rails3 and rails4, but also to
# repair totally broken interfaces in rails - especially in rails3 - that
# result in lots of extra serialization/deserialzation in a class which is
# supposed to be FAST
#
class Entry < ::ActiveSupport::Cache::Entry
def Entry.is_rails3?
unless defined?(@is_rails3)
@is_rails3 = new(nil).instance_variable_defined?('@value')
end
@is_rails3
end
 
# extract marshaled data from a cache entry without doing unnecessary
# marshal round trips. rails3 will have either a nil or pre-marshaled
# @value whereas rails4 will have either a marshaled or un-marshaled @v.
# in both cases we want to avoid calling the silly 'value' accessor
# since this will cause a potential Marshal.load call and require us to
# make a subsequent Marshal.dump call which is SLOOOWWW.
#
def Entry.data_for(entry)
if is_rails3?
value = entry.instance_variable_get('@value')
marshaled = value.nil? ? Marshal.dump(value) : value
else
v = entry.instance_variable_get('@v')
marshaled = entry.send('compressed?') ? v : entry.send('compress', v)
end
 
Moped::BSON::Binary.new(:generic, marshaled.force_encoding('binary'))
end
 
# the intializer for rails' default Entry class will go ahead and
# perform and extraneous Marshal.dump on the data we just got from the
# db even though we don't need it here. rails3 has a factory to avoid
# this but rails4 does not so we just build the object we want and
# ensure to avoid any unnecessary calls to Marshal.dump/load... sigh.
#
def Entry.for(value, created_at)
allocate.tap do |entry|
if is_rails3?
entry.instance_variable_set(:@value, value)
entry.instance_variable_set(:@created_at, created_at)
entry.instance_variable_set(:@compressed, false)
else
entry.instance_variable_set(:@v, value)
entry.instance_variable_set(:@c, false)
end
end
end
 
def value
Entry.is_rails3? ? @value : @v
end
 
def raw_value
Entry.is_rails3? ? @value : @v
end
end
notes.md
Markdown

rails3

impossible for cache plugin authors to know the state of @value since the object is either contructed via Entry.new, where @value will be nil or Marshal.dump'd

https://github.com/rails/rails/blob/3-2-stable/activesupport/lib/active_support/cache.rb#L561

or maybe the object was created this way

https://github.com/rails/rails/blob/3-2-stable/activesupport/lib/active_support/cache.rb#L541

however, the accessor will go boom if the second way is chosen, since it handles only nil or marshaled values

https://github.com/rails/rails/blob/3-2-stable/activesupport/lib/active_support/cache.rb#L587

so this is broken, but it's also slow since implementations of 'read_entry' will always cause a Marshal.dump to occur if they don't subvert the contructor with .create - but using .create results in broken entries since #value assumes marshal'd... ;-/

rails4

pretty close. since we can call either

  Entry.new(object, :compress => true)

and hit the marshaling code, or not

  Entry.new(object)

the issues with this are

  • it's really a terrible signature: having :compress imply marshal is weird
  • it's inverted, we need to know all possible calls to Entry.new to know if people are inadvertently passing in :compress or not... there is a lot of currying in the cache code so this is definitely possible.

solutions

i think the rails3 concept of having two discerte factories is the correct one since, logically, we are either creating an entry and preparing to be written to cache or loading one from cache. so two approaches suggest themselves

add back Entry.create, something like

def Entry.create(object, *args, &block)
  warn 'never pass extract arguments to .create' unless args.empty?

  allocate.tap do |entry|
    entry.instance_eval do
      @v = object
      @c = false
    end
  end
end

this is ok but the signatures

  Entry.create

  Entry.new

seem entirely inverted to me. as in .create in ruby-land would normally be saving whereas new would not.

maybe naming would help

  def Entry.for_write(object, created_at, options = {})
  end

  def Entry.for_read(key)
  end

but rails doesn't take this approach much - it seems to think .new is the best...

maybe some sort of callback/hook that's called to compress/marshal entries at the last moment...

  def before_cache
    @v = serialize(@v)
  end

icky.

and the end of the day i'd probably take the absolute simplest path: never automagically compress / marshal objects in the base Entry class and, instead, leave that up to Store authors defining 'read_entry' and 'write_entry'. that, however, is a major breaking change and i'm not sure what the rules for rails4 and api signatures are...

thoughts?

Please sign in to comment on this gist.

Something went wrong with that request. Please try again.