
Google Summer of Code 2017 - Long Running Ruby and Rails Benchmarks

Project: Long Running Ruby and Rails Benchmarks
Organization: Ruby on Rails
Project site: https://rubybench.org
Repositories: https://github.com/ruby-bench
GSoC link: https://summerofcode.withgoogle.com/projects/#5251786557358080
Mentors: Robin Dupret, Noah Gibbs
Student: Marko Bogdanović
Proposal: link to PDF
Total changes: 202 commits / 5891 ++ / 3224 --

Project goal

The main goal was to create a comparison chart that could be used to measure framework overhead. We needed something to tell us how much overhead ORM libraries such as Active Record or Sequel bring, by comparing them with raw SQL queries issued directly through a Ruby database adapter (mysql2, pg). On the other hand, comparing the overhead of Active Record against Sequel can help us decide which library to use in our stack.
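
To make the idea concrete, here is a tiny worked example of how such an overhead figure could be derived from two benchmark results; the numbers are made up purely for illustration.

```ruby
# Hypothetical numbers, just to illustrate how an overhead figure falls out of two results.
raw_ips = 5000.0   # iterations/second for the raw pg query
orm_ips = 1000.0   # iterations/second for the same query through Active Record

# Overhead is the extra time the ORM spends per query, relative to the raw adapter.
overhead = (1.0 / orm_ips - 1.0 / raw_ips) / (1.0 / raw_ips)
puts "Active Record overhead: #{(overhead * 100).round}%" # => 400% with these made-up numbers
```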

Work done

Working on RubyBench was an awesome experience and I've enjoyed every contribution. We have reached the main goal and have also made a lot of contributions beyond it, like introducing an admin interface, automating the deploy process, etc.

The following Pull Requests are the result of three months of work during this program:

1. Comparison chart

This was the very first task I was assigned. It is the one through which I became most familiar with the codebase, and it was done incrementally, step by step, since almost everything was still unknown to me at that point.

The main goal has been accomplished, and you are now able to compare results on RubyBench. This is handy for comparing benchmarks of the same kind across different flavors (like Sequel and Active Record benchmarks), since the data are plotted on the same scale.

I have used the HighStock multi-series feature to get this nice visualization.

(Screenshot: Comparison chart)

Code
ruby-bench-web#199 Compare benchmarks on commits graph
ruby-bench-web#202 Sequel events are comming from Jeremy's repo
ruby-bench-web#203 Handle only commits to main branches
ruby-bench-web#216 Show both scripts below comparing commits graphs
ruby-bench-web#231 Improve comparison chart

2. Suite for pg gem

Benchmarks added to the pg suite were used to calculate the overhead of Sequel and Active Record. Both libraries rely on Ruby database adapters (pg, mysql2) to communicate with the RDBMS, so comparing results from these benchmarks can show exactly how much overhead Active Record or Sequel brings.
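
As a rough sketch (not one of the actual suite scripts), a benchmark of this kind could look roughly like the following; the database name, the users table, and the use of benchmark-ips are assumptions made for the example.

```ruby
# Hypothetical sketch: run the same query through the raw pg adapter and through
# Active Record, so the difference between the two series is the ORM overhead.
# The database name and the users table are made up for illustration.
require "benchmark/ips"
require "pg"
require "active_record"

ActiveRecord::Base.establish_connection(
  adapter: "postgresql",
  database: "rubybench_example"
)

class User < ActiveRecord::Base; end

raw_conn = PG.connect(dbname: "rubybench_example")

Benchmark.ips do |x|
  # Baseline: raw SQL through the pg adapter.
  x.report("pg")            { raw_conn.exec("SELECT * FROM users LIMIT 100").to_a }
  # Same query through the ORM.
  x.report("active_record") { User.limit(100).to_a }
  x.compare!
end
```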

Code
ruby-bench-web#243 Add ssh method to support PG suite
ruby-bench-docker#35 Add pg Docker image
ruby-bench-docker#37 Install setup bundle in PG Dockerfile
ruby-bench-docker#43 PG setup for specific commit
ruby-bench-docker#44 Fix PG suite setup
ruby-bench-docker#45 Fix PG setup for previous commits
ruby-bench-suite#76 Add pg gem scope_all benchmark
ruby-bench-suite#92 Add pg suite runner
ruby-bench-suite#94 Add discourse real world benchmark in PG suite
ruby-bench-suite#102 Send pg suite results to web

3. Admin UI

This had been needed for convenience for some time. Sometimes you want to run a suite for a number of previous commits, and it's handy to do that through an admin UI.

(Screenshot: Admin manual runner)

I have also introduced grouping of benchmarks, to avoid comparing 🍎 to 🍊.

(Screenshot: Benchmark groups)

Code
ruby-bench-web#210 Add manual runner
ruby-bench-web#227 Add admin UI
ruby-bench-web#234 Grouping benchmarks

4. Storing logs on runner server

Logs for suite runs were scattered across random temporary files. We needed these logs in one place to be able to debug errors easily.

Code
ruby-bench-web#9bb7b39 Execute scripts on bare metal server
ruby-bench-docker#25 Move ssh commands to runner scripts
ruby-bench-docker#26 Just pass prepared statements argument in rails releases script
ruby-bench-docker#27 Fix rails master script

5. Fixed broken Rails suite

Benchmarks in the Rails suite had been broken for some time due to outdated versions. They are now revived.

Code
ruby-bench-docker#28 Bump ruby version to 2.4.1
ruby-bench-docker#29 Update rails Dockerfiles
ruby-bench-docker#31 Install postgres and mysql clients in rails_releases Dockerfile
ruby-bench-suite#84 Don't reuse endpoints in sequel and rails drivers
ruby-bench-suite#89 Fix rails benchmarks
ruby-bench-suite#90 Fix scaffold benches

6. Other various contributions

Apart from working on the assigned tasks, I've made a number of other contributions. Mostly I've been trying to fix and improve anything that got in my way:

ruby-bench-suite#78 Make common benchmark runner to avoid duplication
ruby-bench-suite#79 Add sequel scope all with default scope bench
ruby-bench-suite#81 Update readme
ruby-bench-suite#83 Fix: rename organization to jeremyevans
ruby-bench-suite#85 Fix selection benches to construct string from records
ruby-bench-suite#86 Use sequel_pg
ruby-bench-suite#88 Add ActiveRecord benchmark as Discourse example
ruby-bench-suite#100 Remove deprecated set_allowed_columns method
ruby-bench-web#224 Setup automatic deploy on production branch
ruby-bench-web#245 Replace Bugsnag with Sentry

Future work

After GSoC is finished, I am looking forward to continuing my contributions to RubyBench.

The following list presents issues I would like to address in the future:

More work on comparison chart

We should also include invalid results on the chart and mark the points where they became invalid. We could use the HighStock flags feature.

Some additional info needs to be added to the comparison chart:

  • Docker version
  • Ruby version
  • Suite version used

Make a separate page for displaying comparison graphs

I have in mind something like this - https://speed.python.org/comparison

Get notified if running benchmark suite fails

At the moment, logs from every run are stored in files on the server, one file per suite. If any run fails, you need to SSH to the server and go through the logs to find out what caused the failure. A more convenient way would be to have these somewhere on the web. We use Sentry to catch errors from the ruby-bench-web production server, so we could use it to catch errors from the runner server as well.
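
A minimal sketch of what that could look like in a runner script, assuming the runner is Ruby and uses the sentry-raven client; the run_suite helper and the SENTRY_DSN environment variable are hypothetical.

```ruby
# Hypothetical sketch: report runner failures to Sentry instead of only logging to a file.
# SENTRY_DSN and run_suite are assumptions for the example.
require "sentry-raven"

Raven.configure do |config|
  config.dsn = ENV["SENTRY_DSN"]
end

begin
  run_suite("pg") # placeholder for the actual suite runner invocation
rescue => e
  Raven.capture_exception(e) # the failure shows up in Sentry alongside ruby-bench-web errors
  raise
end
```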

Make official reports on collected results

We need a way to reach out to people with the results we have collected. Weekly reports would be awesome.

Fill ruby-bench/ruby-bench repo with docs

We should include here all kinds of information about the project: what is being run, where, and how. We should make it easier for people to make their first contribution.

Acknowledgments

Many thanks to my mentors, Noah Gibbs and Robin Dupret, who supervised my work along the way. Thank you for your responsiveness to every call for help, and thank you for your support and your kindness!

Special thanks to Sam Saffron and Guo Xiang Tan (the guy who built most of the RubyBench you see today). Thank you for actively helping with my work during the GSoC program, even though you weren't officially mentoring this year.
