A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and "big data" database systems, and I figured I'd share the list. It is biased and rather incomplete, but it may be of interest to someone. While many are obvious choices, I think there are a few underappreciated gems.
### Dataflow/Stream Processing Engines:
Dryad--general-purpose distributed parallel dataflow engine
http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf
Google Dremel--columnar storage for fast queries (cf. Impala)
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/36632.pdf