
Google Summer of Code 2017 - Long Running Ruby and Rails Benchmarks

Project: Long Running Ruby and Rails Benchmarks
Organization: Ruby on Rails
Project site: https://rubybench.org
Repositories: https://github.com/ruby-bench
GSoC link: https://summerofcode.withgoogle.com/projects/#5251786557358080
Mentors: Robin Dupret, Noah Gibbs
Student: Marko Bogdanović
Proposal: link to PDF
Total changes: 202 commits / 5891 ++ / 3224 --

Project goal

The main goal was to create a comparison chart that could be used to measure framework overhead. We needed something to tell us how much overhead ORM libraries such as Active Record or Sequel bring, by comparing them with raw SQL queries issued directly through a Ruby database adapter (mysql2, pg). On the other hand, comparing the overhead of Active Record against Sequel can help us decide which library to use in our stack.
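
To make the idea concrete, here is a tiny worked example of how such an overhead figure could be derived from two benchmark results; the numbers are made up purely for illustration.

```ruby
# Hypothetical numbers, just to illustrate how an overhead figure falls out of two results.
raw_ips = 5000.0   # iterations/second for the raw pg query
orm_ips = 1000.0   # iterations/second for the same query through Active Record

# Overhead is the extra time the ORM spends per query, relative to the raw adapter.
overhead = (1.0 / orm_ips - 1.0 / raw_ips) / (1.0 / raw_ips)
puts "Active Record overhead: #{(overhead * 100).round}%" # => 400% with these made-up numbers
```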

Work done

Working on RubyBench was an awesome experience and I've enjoyed every contribution. We have reached the main goal and have also made a lot of contributions beyond it, like introducing an admin interface, automating the deploy process, etc.

The following Pull Requests are the result of three months of work during this program:

1. Comparison chart

This was the very first task I was assigned. It is the one through which I became most familiar with the codebase, and it was done incrementally, step by step, since almost everything was still unknown to me at that point.

The main goal has been accomplished, and you are now able to compare results on RubyBench. This is handy for comparing benchmarks of the same kind across different flavors (like Sequel and Active Record benchmarks), since the data are plotted on the same scale.

I have used the HighStock multi-series feature to get this nice visualization.

(Screenshot: Comparison chart)

Code
ruby-bench-web#199 Compare benchmarks on commits graph
ruby-bench-web#202 Sequel events are comming from Jeremy's repo
ruby-bench-web#203 Handle only commits to main branches
ruby-bench-web#216 Show both scripts below comparing commits graphs
ruby-bench-web#231 Improve comparison chart

2. Suite for pg gem

Benchmarks added to the pg suite were used to calculate the overhead of Sequel and Active Record. Both libraries rely on Ruby database adapters (pg, mysql2) to communicate with the RDBMS, so comparing results from these benchmarks can show exactly how much overhead Active Record or Sequel brings.
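
As a rough sketch (not one of the actual suite scripts), a benchmark of this kind could look roughly like the following; the database name, the users table, and the use of benchmark-ips are assumptions made for the example.

```ruby
# Hypothetical sketch: run the same query through the raw pg adapter and through
# Active Record, so the difference between the two series is the ORM overhead.
# The database name and the users table are made up for illustration.
require "benchmark/ips"
require "pg"
require "active_record"

ActiveRecord::Base.establish_connection(
  adapter: "postgresql",
  database: "rubybench_example"
)

class User < ActiveRecord::Base; end

raw_conn = PG.connect(dbname: "rubybench_example")

Benchmark.ips do |x|
  # Baseline: raw SQL through the pg adapter.
  x.report("pg")            { raw_conn.exec("SELECT * FROM users LIMIT 100").to_a }
  # Same query through the ORM.
  x.report("active_record") { User.limit(100).to_a }
  x.compare!
end
```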

Code
ruby-bench-web#243 Add ssh method to support PG suite
ruby-bench-docker#35 Add pg Docker image
ruby-bench-docker#37 Install setup bundle in PG Dockerfile
ruby-bench-docker#43 PG setup for specific commit
ruby-bench-docker#44 Fix PG suite setup
ruby-bench-docker#45 Fix PG setup for previous commits
ruby-bench-suite#76 Add pg gem scope_all benchmark
ruby-bench-suite#92 Add pg suite runner
ruby-bench-suite#94 Add discourse real world benchmark in PG suite
ruby-bench-suite#102 Send pg suite results to web

3. Admin UI

This had been needed for convenience for some time. Sometimes you want to run a suite for a number of previous commits, and it's handy to do that through an admin UI.

(Screenshot: Admin manual runner)

I have also introduced grouping of benchmarks, to avoid comparing 🍎 to 🍊.

(Screenshot: Benchmark groups)

Code
ruby-bench-web#210 Add manual runner
ruby-bench-web#227 Add admin UI
ruby-bench-web#234 Grouping benchmarks

4. Storing logs on runner server

Logs for suite runs were scattered across random temporary files. We needed these logs in one place to be able to debug errors easily.

Code
ruby-bench-web#9bb7b39 Execute scripts on bare metal server
ruby-bench-docker#25 Move ssh commands to runner scripts
ruby-bench-docker#26 Just pass prepared statements argument in rails releases script
ruby-bench-docker#27 Fix rails master script

5. Fixed broken Rails suite

Benchmarks in the Rails suite had been broken for some time due to outdated versions. They are now revived.

Code
ruby-bench-docker#28 Bump ruby version to 2.4.1
ruby-bench-docker#29 Update rails Dockerfiles
ruby-bench-docker#31 Install postgres and mysql clients in rails_releases Dockerfile
ruby-bench-suite#84 Don't reuse endpoints in sequel and rails drivers
ruby-bench-suite#89 Fix rails benchmarks
ruby-bench-suite#90 Fix scaffold benches

6. Other various contributions

Apart from working on the assigned tasks, I've made a number of other contributions. Mostly I've been trying to fix and improve anything that got in my way:

ruby-bench-suite#78 Make common benchmark runner to avoid duplication
ruby-bench-suite#79 Add sequel scope all with default scope bench
ruby-bench-suite#81 Update readme
ruby-bench-suite#83 Fix: rename organization to jeremyevans
ruby-bench-suite#85 Fix selection benches to construct string from records
ruby-bench-suite#86 Use sequel_pg
ruby-bench-suite#88 Add ActiveRecord benchmark as Discourse example
ruby-bench-suite#100 Remove deprecated set_allowed_columns method
ruby-bench-web#224 Setup automatic deploy on production branch
ruby-bench-web#245 Replace Bugsnag with Sentry

Future work

After GSoC is finished, I am looking forward to continuing my contributions to RubyBench.

The following list presents issues I would like to address in the future:

More work on comparison chart

We should also include invalid results on the chart and mark the points where they became invalid. We could use the HighStock flags feature.

Some additional info needs to be added to the comparison chart:

  • Docker version
  • Ruby version
  • Suite version used

Make a separate page for displaying comparison graphs

I have in mind something like this - https://speed.python.org/comparison

Get notified if running benchmark suite fails

At the moment, logs from every run are stored in files on the server, one file per suite. If any run fails, you need to SSH to the server and go through the logs to find out what caused the failure. A more convenient way would be to have these somewhere on the web. We use Sentry to catch errors from the ruby-bench-web production server, so we could use it to catch errors from the runner server as well.
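
A minimal sketch of what that could look like in a runner script, assuming the runner is Ruby and uses the sentry-raven client; the run_suite helper and the SENTRY_DSN environment variable are hypothetical.

```ruby
# Hypothetical sketch: report runner failures to Sentry instead of only logging to a file.
# SENTRY_DSN and run_suite are assumptions for the example.
require "sentry-raven"

Raven.configure do |config|
  config.dsn = ENV["SENTRY_DSN"]
end

begin
  run_suite("pg") # placeholder for the actual suite runner invocation
rescue => e
  Raven.capture_exception(e) # the failure shows up in Sentry alongside ruby-bench-web errors
  raise
end
```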

Make official reports on collected results

We need a way to reach out to people with the results we have collected. Weekly reports would be awesome.

Fill ruby-bench/ruby-bench repo with docs

We should include here all kinds of information about the project: what is being run, where, and how. We should make it easier for people to make their first contribution.

Acknowledgments

Many thanks to my mentors, Noah Gibbs and Robin Dupret, who supervised my work along the way. Thank you for your responsiveness to every call for help, and thank you for your support and your kindness!

Special thanks to Sam Saffron and Guo Xiang Tan (the guy who built most of the RubyBench you see today). Thank you for actively helping with my work during the GSoC program, even though you weren't officially mentoring this year.
