Created
June 7, 2013 05:31
Notes on Facebook's Presto
These notes are from Facebook's Analytics @ Scale conference. I didn't take notes during the presentation or the discussions with developers, so feel free to correct any inconsistencies:
- Presto is an ANSI SQL engine built on top of HDFS
- Similar functionality to Cloudera's Impala
- Facebook started developing this project prior to the Impala announcement and made some different design choices
- Implemented in Java
- Queries execute around 10x faster than Hive; aggregation-based queries can be 100x faster
- Byte code generation is used for efficient predicate processing
- Efficient fixed-memory data structures are used (very low GC overhead)
- Presto daemons do not have to run on data nodes
- Facebook does run the daemons directly on the data nodes so that they do not saturate the network, allocating about 8GB of RAM per daemon
- Looking to open source in Fall 2013
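
To make the byte code generation point a bit more concrete, here is a rough sketch of the general idea. It is not Presto's code: Presto is written in Java and would generate JVM byte code, while this is only a Python analogy. The tuple-based predicate format and the names `interpret`, `to_source`, and `compile_predicate` are made up for illustration. The point is that a predicate can be compiled once per query instead of walking an expression tree for every row.

```python
import operator

# Hypothetical predicate representation (an assumption for this sketch):
# nested tuples such as ("and", (">", "age", 30), ("==", "country", "US")).
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


def interpret(pred, row):
    """Baseline: walk the predicate tree again for every single row."""
    op = pred[0]
    if op == "and":
        return all(interpret(p, row) for p in pred[1:])
    if op == "or":
        return any(interpret(p, row) for p in pred[1:])
    _, column, literal = pred
    return OPS[op](row[column], literal)


def to_source(pred):
    """Translate the predicate tree into a Python expression string."""
    op = pred[0]
    if op in ("and", "or"):
        return "(" + f" {op} ".join(to_source(p) for p in pred[1:]) + ")"
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    _, column, literal = pred
    return f"(row[{column!r}] {op} {literal!r})"


def compile_predicate(pred):
    """Compile the predicate to byte code once and return a plain function."""
    code = compile(f"lambda row: {to_source(pred)}", "<predicate>", "eval")
    # The generated lambda only touches `row`, so no builtins are exposed.
    return eval(code, {"__builtins__": {}})


predicate = ("and", (">", "age", 30), ("==", "country", "US"))
rows = [
    {"age": 25, "country": "US"},
    {"age": 41, "country": "US"},
    {"age": 52, "country": "DE"},
]

matches = compile_predicate(predicate)  # compiled once per query
filtered = [row for row in rows if matches(row)]  # cheap call per row
```

Both paths give the same answer; the compiled version just moves the tree walk out of the per-row loop, which is presumably where the savings come from on large scans.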