@satwikkansal
Last active November 8, 2019 08:14

Python vs. Ruby: Performance and other factors that matter

In this blog post, we'll go through two server-side scripting languages, Python and Ruby, with a focus on comparing performance and other factors that might help you decide which language to pick for your web application.

Let's begin with performance.

What does performance mean?

For the context of this post, you can think of a high-performance language as one that:

  • Provides fast code execution in general
  • Handles concurrent tasks efficiently
  • Has low utilization of computing resources (typically CPU utilization and memory footprint)

And a high-performance web framework as one that:

  • Has short response times
  • Provides high throughput (typically responses per minute)
  • Provides fast and efficient serialization and deserialization
  • Has high availability and fault tolerance
  • Scales well with more resources when the load increases

Comparing performance of Python with Ruby

We're aware that a "real" comparison (a.k.a. benchmarking) would require a lot of standardization of the execution environment. I'll be running the code snippets in this post on my i5 machine with 4 cores and 8 GB of RAM, taking measures to reduce external influence as much as possible. Let's start by evaluating the execution times of both languages for simple iterative and recursive programs.

Comparing run times of simple iterative and recursive programs in Python and Ruby

We'll take two well known mathematical problem statements,

  1. Compute nth value in the fibonacci sequence.
  2. Compute factorial of n.

Here are our simple implementations for the same,

Note: It can be argued that these programs are not equivalent in terms of implementation in their respective languages, and that a faster version can be written for each. Writing those is beyond the scope of this blog post (for reference, here's what an equivalent implementation of pidigits in Python and Ruby would look like).

The point of this green-apples-to-red-apples comparison is to see, in practice, whether there's a noteworthy difference in the execution times of the "typical" implementations of these programs in the respective languages. This is going to be the theme of the entire post.

# Python version 3.6.9 (CPython implementation)
import timeit

def fib(n):
    # Iterative fibonacci
    a, b = 0, 1
    for i in range(0, n):
        a, b = b, a + b
    return a

def fib_r(n):
    # Recursive fibonacci
    if n < 2:
        return n
    return fib_r(n - 1) + fib_r(n - 2)

def fac(n):
    # Iterative factorial
    x = 1
    for i in range(2, n + 1):
        x = x * i
    return x

def fac_r(n):
    # Recursive factorial
    if n >= 1:
        return n * fac_r(n - 1)
    return 1

# Printing out the run times; the value of n is decided based on execution times and maximum stack depth
print(timeit.timeit(lambda: fib(1000000), number=1))
print(timeit.timeit(lambda: fib_r(40), number=1))
print(timeit.timeit(lambda: fac_r(900), number=1))
print(timeit.timeit(lambda: fac(100000), number=1))
# Ruby version 2.6.5 (CRuby implementation)
require 'benchmark'

def fib(n)
  # Iterative fibonacci (exclusive range, so the loop runs n times like Python's range(0, n))
  a, b = 0, 1
  for i in 0...n
    a, b = b, a + b
  end
  return a
end

def fib_r(n)
  # Recursive fibonacci
  return n if n < 2
  return fib_r(n - 1) + fib_r(n - 2)
end

def fac(n)
  # Iterative factorial (inclusive range, equivalent to Python's range(2, n + 1))
  x = 1
  for i in 2..n
    x = x * i
  end
  return x
end

def fac_r(n)
  # Recursive factorial
  if n >= 1
    return n * fac_r(n - 1)
  end
  return 1
end

# Printing out the run times; the value of n is decided based on execution times and maximum stack depth
puts Benchmark.measure { fib(1000000) }
puts Benchmark.measure { fib_r(40) }
puts Benchmark.measure { fac(100000) }
puts Benchmark.measure { fac_r(900) }

The following are the average execution times after running these scripts at 7 different points in time. I tried to make sure no other process was running, to reduce bias. The value of n is adjusted so that the program doesn't take too long and doesn't throw a "maximum recursion depth exceeded" error (which happened with Python). Here are the results:

| Method | n | Ruby | Python |
|--------|---------|-------------|-------------|
| fib | 1000000 | 27.935831 s | 10.435885 s |
| fib_r | 40 | 9.442680 s | 36.948102 s |
| fac | 100000 | 6.833936 s | 2.502855 s |
| fac_r | 900 | 2.701 ms | 0.643 ms |
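The averaging over several runs described above can be sketched with `timeit.repeat`. This is a minimal sketch, not the script that produced the table: the helper name `average_runtime` and the small value of n are assumptions chosen so that it finishes quickly.

```python
import statistics
import timeit

def fib(n):
    # Iterative Fibonacci, same logic as the implementation above
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def average_runtime(func, runs=7):
    # Hypothetical helper: time `func` once per run, then report mean and best
    timings = timeit.repeat(func, repeat=runs, number=1)
    return statistics.mean(timings), min(timings)

mean_s, best_s = average_runtime(lambda: fib(10000))
print("mean: %.6f s, best: %.6f s" % (mean_s, best_s))
```

Reporting the best run next to the mean is a common way to see how much background noise crept into the measurements.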

A few things stand out in these numbers:

  • Python is roughly 2.7x faster than Ruby for computations built on a typical for loop (fib and fac). Ruby's slowness here is attributed to the introduction of a new scope for every iteration, which involves creating and deleting these variables each time around the loop.
  • With recursion the picture is mixed: Ruby is about 4x faster on fib_r, while Python is about 4x faster on fac_r. It is commonly known that function call overhead is expensive in Python. Both languages offer several optimizations for dealing with exploding call stacks like the ones in our naive implementations of Fibonacci and factorial.
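The recursion depth error mentioned earlier comes from CPython's cap on the depth of the call stack. As a rough sketch, the limit can be inspected with `sys.getrecursionlimit()` and the failure caught as `RecursionError`. The wrapper name `safe_fac_r` and the depth of 100000 are assumptions made for illustration.

```python
import sys

def fac_r(n):
    # Recursive factorial, same logic as the implementation above
    if n >= 1:
        return n * fac_r(n - 1)
    return 1

# CPython caps the call stack depth; the default is commonly 1000
print(sys.getrecursionlimit())

def safe_fac_r(n):
    # Hypothetical wrapper: return None instead of crashing on deep recursion
    try:
        return fac_r(n)
    except RecursionError:
        return None

print(safe_fac_r(10))  # 3628800
print(safe_fac_r(100000) is None)
```

This is why n for fac_r stays at 900 in the benchmark: it sits just under the default limit.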

If you're interested in benchmarks of these languages against programs like fannkuch-redux, fasta, k-nucleotide, mandelbrot, n-body, etc., the Benchmarks Game's Ruby vs. Python 3 comparison is highly recommended (similar source).

Moving on, let's see how these languages perform when it comes to reading files from disk and parsing common formats like JSON.

Comparing run times for file reading from disk and JSON parsing programs in Python and Ruby

For the data, I've taken one of my scraping data dumps in JSON, which is 5.5 MB in size. Its schema looks something like this:

{
  "source": "https://www.startupranking.com",
  "data": [
    {
      "country": "United States",
      "startups": [
        {
          "sr_rank": "93,324",
          "name": "Airbnb",
          "overall_rank": "1",
          "url": "https://www.startupranking.com/airbnb",
          "country_rank": "1",
          "pitch": "Vacation Rentals, Homes, Experiences & Places\n                         -                                         Airbnb is a trusted online marketplace for people  ...",
          "fundingRounds": [
            {
              "date": "Jun 28, 2015",
              "amount": "$ 1,500,000,000",
              "name": "Series E",
              "investors": [
                "General Atlantic",
                "Hillhouse Capital",
                "Tiger Global",
                "Baillie Gifford",
                "China Broadband Capital",
                "Fidelity Investments",
                "Ggv Capital",
                "Horizon Ventures",
                "Kleiner Perkins Caufield Byers",
                "Sequoia Capital",
                "T Rowe Price",
                "Temasek",
                "Wellington Management",
                "Groupe Arnault",
                "Horizons Ventures"
              ]
            },
            {
              "date": "Apr 16, 2014",
              "amount": "$ 475,000,000",
              "name": "Series D",
              "investors": [
                "Dragoneer Investment Group",
                "Sequoia Capital",
                "Sherpa Ventures",
                "T Rowe Price",
                "Tpg Growth",
                "Andreessen Horowitz"
              ]
            },
            {
              "date": "Oct 28, 2013",
              "amount": "$ 200,000,000",
              "name": "Series C",
              "investors": [
                "Founders Fund",
                "Ashton Kutcher",
                "Crunchfund",
                "Sequoia Capital",
                "Airbnb"
              ]
            },
            ...
            ...
]}]}]}

It's nested enough and contains maps as well as arrays. The next task is to create methods to read this file and then parse its contents as JSON.

# Python version 3.6.9 (CPython implementation)

# Importing the built-in json module for parsing and timeit for timing
import json
import timeit

def read_file(path):
    # Reading file contents from path
    with open(path, 'r') as f:
        content = f.read()
    return content

def load_json(path):
    # Parsing JSON stored in the file
    return json.loads(read_file(path))

# print(timeit.timeit(lambda: read_file('data.json'), number=1))
print(timeit.timeit(lambda: load_json('data.json'), number=1))
# Ruby version 2.6.5 (CRuby implementation)

# Importing the built-in json module for parsing and benchmark for timing
require "json"
require "benchmark"

def read_file(path)
  # Reading file from `path`
  return File.read(path)
end

def load_json(path)
  # Parsing JSON from file
  JSON.parse(read_file(path))
end

# puts Benchmark.measure { read_file("data.json") }
puts Benchmark.measure { load_json("data.json") }

Nothing fancy: we just use the built-in ways to read a file and parse JSON from a string, and record their execution times one by one. Here are the results:

| Method | Ruby | Python |
|-----------|-----------|-----------|
| read_file | 4.676 ms | 6.014 ms |
| load_json | 96.573 ms | 48.906 ms |

Observations

  • Reading from a file is slightly faster in Ruby than in Python.
  • However, parsing JSON using standard library methods takes almost twice as long in Ruby as in Python.
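Since load_json includes the time spent in read_file, it can help to time the two steps separately. The following is a minimal sketch of that idea. The `timed` helper and the generated sample document are assumptions standing in for the real 5.5 MB dump, which isn't reproduced here.

```python
import json
import os
import tempfile
import time

# Made-up stand-in data with a similar nested shape (maps and arrays)
sample = {
    "source": "https://example.com",
    "data": [
        {
            "country": "Country %d" % i,
            "startups": [
                {"name": "Startup %d" % j, "fundingRounds": []}
                for j in range(50)
            ],
        }
        for i in range(20)
    ],
}

def timed(func, *args):
    # Hypothetical helper: return the result and the elapsed wall-clock seconds
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

def read_file(path):
    with open(path, "r") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json")
    with open(path, "w") as f:
        json.dump(sample, f)
    content, read_s = timed(read_file, path)      # disk read only
    parsed, parse_s = timed(json.loads, content)  # parsing only

print("read: %.3f ms, parse: %.3f ms" % (read_s * 1000, parse_s * 1000))
```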

Concurrency in Python and Ruby

Coming to concurrency, the popular implementations of both languages (CPython and CRuby) have a Global Interpreter Lock (called the Global VM Lock, or GVL, in CRuby), which means:

  • Only one thread can execute interpreter code at a time, even if you have a multi-core processor.
  • In essence, you can create multiple threads, but they will take turns instead of running in parallel (concurrency without parallelism).
  • Parallel I/O is still possible (and happens) among multiple threads.
  • To achieve parallelism for CPU-bound processing, the program needs to spawn separate processes and coordinate with them.

Python provides some abstraction for multiprocessing through the built-in multiprocessing module, and Ruby provides the Process module, which is closer to the OS level. For concurrent I/O-related tasks, Python has included the asyncio module in the standard library since version 3.4, and the module received significant usability and performance improvements in Python 3.7. Popular third-party options in Ruby are the async framework and the concurrent-ruby toolkit. There's also a proposed pull request in Ruby for a fiber-based selector that aims to enhance concurrency.
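To illustrate the point about I/O still overlapping under the lock, here is a small sketch using Python's `concurrent.futures`. The `fake_io` function and its 0.2 second delay are assumptions: `time.sleep` stands in for a blocking I/O call such as a network request.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id, delay=0.2):
    # time.sleep stands in for blocking I/O; the GIL is released while waiting
    time.sleep(delay)
    return task_id

def run_serial(n):
    # Run the tasks one after another
    start = time.perf_counter()
    results = [fake_io(i) for i in range(n)]
    return results, time.perf_counter() - start

def run_threaded(n):
    # Run the tasks in n threads so their waits overlap
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(fake_io, range(n)))
    return results, time.perf_counter() - start

serial_results, serial_s = run_serial(4)
threaded_results, threaded_s = run_threaded(4)
print("serial: %.2f s, threaded: %.2f s" % (serial_s, threaded_s))
```

For CPU-bound work the threaded version would not be expected to help. The usual approach there is to swap in `ProcessPoolExecutor` from the same module, which spawns separate processes.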

Comparing performances of web frameworks in Python and Ruby

I'm going to take two popular minimalistic web frameworks, Flask and Sinatra, and compare their response times for the following operations exposed through REST APIs:

  • Simple GET request
  • Simple POST request
  • Rendering a JSON response from an already initialized variable
  • Instantiating a new object and then rendering a JSON response
  • Rendering an HTML response via templating

Here's the code for all this:

# Filename: app.py
# Flask version: 1.1.1
from flask import Flask, jsonify, render_template
app = Flask(__name__)

# Already initialized list of languages
languages = [
    {
        "name": "Python",
        "is_interpreted": True,
        "version": "3.6.9"
    },
    {
        "name": "Ruby",
        "is_interpreted": True,
        "version": "2.6.5"
    },
]

class Language:
    # Our language class
    def __init__(self, name, is_interpreted, version):
        self.name = name
        self.is_interpreted = is_interpreted
        self.version = version

# Simple GET request
@app.route("/simple-get")
def get():
    return "Hello Scout!"

# Simple POST request
@app.route("/simple-post", methods=["POST"])
def post():
    return "Hello Scout!"

# Rendering a JSON response
@app.route("/simple-json", methods=["GET"])
def render_json():
    return jsonify(languages)

# Instantiating an object and then rendering a JSON response
@app.route("/simple-json-2", methods=["GET"])
def render_json_custom_object():
    lang = Language(**{
        "name": "Python",
        "is_interpreted": True,
        "version": "3.6.9"
    })
    return jsonify(lang.__dict__)

# Rendering an HTML response via templating
@app.route("/render-html", methods=["GET"])
def render_html():
    return render_template('template_python.html', languages=languages)

if __name__ == "__main__":
    app.run(debug=False)
<!-- Filename templates/template_python.html -->
<html>
  <head>
    <title>Languages comparison</title>
  </head>
  <body>
    {% for language in languages %}
    <h1> {{ language["name"] }} </h1>
      <ul>
        <li>Is interpreted: {{ language["is_interpreted"] }}</li>
        <li>Current version: {{ language["version"] }}</li>
      </ul>
    {% endfor %}
  </body>
</html>
# Filename: app.rb
# Sinatra version 2.0.7

require 'sinatra'
require 'json'

# Already initialized array of languages
languages = [
  { :name => 'Ruby', :is_interpreted => true, :version => '2.6.5' },
  { :name => 'Python', :is_interpreted => true, :version => '3.6.9' },
]

class Language
  # Our language class
  attr_accessor :name, :is_interpreted, :version

  def initialize(name, is_interpreted, version)
    @name = name
    @is_interpreted = is_interpreted
    @version = version
  end

  def as_json(options={})
    {
      name: @name,
      is_interpreted: @is_interpreted,
      version: @version
    }
  end

  def to_json(*options)
    as_json(*options).to_json(*options)
  end
end

# Simple GET request
get '/simple-get' do
  "Hello Scout!"
end

# Simple POST request
post '/simple-post' do
  "Hello Scout!"
end

# Rendering a JSON response
get '/simple-json' do
  content_type :json
  languages.to_json
end

# Instantiating an object and then rendering a JSON response
get '/simple-json-2' do
  language = Language.new("Ruby", true, "2.6.5")
  content_type :json
  language.to_json
end

# Rendering an HTML response via templating
get '/render-html' do
  erb :template_ruby, :locals => {:languages => languages}
end
<!-- Filename views/template_ruby.erb -->
<html>
  <head>
    <title>Languages comparison</title>
  </head>
  <body>
    <% languages.each do |language| %>
      <h1><%= language[:name] %></h1>
      <ul>
        <li>Is interpreted: <%= language[:is_interpreted] %></li>
        <li>Current version: <%= language[:version] %></li>
      </ul>
    <% end %>
  </body>
</html>

I used Postman to record the response times; you can also use cURL to do the same. Here are the results:

| Endpoint | Flask response time (in ms) | Sinatra response time (in ms) |
|----------------|---|---|
| /simple-get | | |
| /simple-post | | |
| /simple-json | | |
| /simple-json-2 | | |
| /render-html | | |
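If you'd rather script the measurement than click through Postman, response times can also be collected with Python's standard library. The sketch below is an assumption-laden stand-in, not the setup used for this post: `HelloHandler` is a tiny local server that mimics the /simple-get endpoint so the example runs without Flask or Sinatra, and `measure` is a hypothetical helper.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    # Stand-in for the /simple-get endpoint
    def do_GET(self):
        body = b"Hello Scout!"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet

def measure(url, runs=5):
    # Return the last response body and per-request timings in milliseconds
    timings = []
    body = b""
    for _ in range(runs):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            body = response.read()
        timings.append((time.perf_counter() - start) * 1000)
    return body, timings

server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/simple-get" % server.server_port

body, timings = measure(url)
print(body.decode(), "avg %.2f ms" % (sum(timings) / len(timings)))

server.shutdown()
server.server_close()
```

To measure the real applications, you would drop the stand-in server and point `url` at the running app, for example port 5000 for Flask's development server or port 4567 for Sinatra, which are their defaults.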

You can also add Scout to your application to monitor the response times. Here's how you'd set it up for Flask:

from flask import Flask
from scout_apm.flask import ScoutApm

# Setup a flask 'app' as normal
app = Flask(__name__)

# Attach ScoutApm to the Flask App
ScoutApm(app)

# Scout settings
app.config["SCOUT_MONITOR"] = True
app.config["SCOUT_KEY"] = "YOUR_SCOUT_API_KEY"
app.config["SCOUT_NAME"] = "flask_endpoints"

And here's how you'd do it for Sinatra:

require 'sinatra'
require 'scout_apm'
ScoutApm::Rack.install!

run Sinatra::Application

get '/simple-get' do
  # Tell Scout to track this specific request as a Rack transaction
  ScoutApm::Rack.transaction("get /simple-get", request.env) do
    "Hello Scout!"
  end
end

Here are some observations:

  • Most of the response times are equivalent. Templating is slightly faster in Flask, and the same goes for rendering a JSON response from a class object.
  • Sinatra saved some time on the "DNS Lookup" and "TCP Handshake" parts of the total time by caching.

I wanted to compare network and database performance as well, but ditched the idea because of the language-specific differences in how database drivers and network libraries are implemented. Anyway, just like with the language comparisons, if you're interested in more technical benchmarking of Python and Ruby frameworks, I'd recommend checking out this link.

In a real-world scenario, web framework speed might be just one part of a bigger story. A request cycle might consist of the following critical components, in sequential order from the moment the client triggers a request:

  • Load balancers like HAProxy
  • Web accelerators like Varnish and Squid
  • Web servers like nginx (nginx by the way can also take up the job of accelerator and load balancer)
  • Application servers like Unicorn and Gunicorn
  • The frameworks like Ruby On Rails and Django
  • Caches at the application level like Redis and Memcached
  • Finally the I/O in the form of disk, databases, network, etc.

The frameworks and technologies used at each of these steps also contribute to the final response time, so choosing the right design is critical here.

Differences beyond Performance

From our analysis so far, it's evident that some things are slower in one language and some in the other. A lot of these differences stem from the design philosophies of the languages and how they evolved over time. There can also be reasons other than performance to pick one of Python and Ruby over the other. Let's go through them before we conclude the post.

Design philosophies of Python, Ruby and their frameworks

Ruby is designed to be a friendly language that keeps the programmer's comfort in mind. The core principle in Ruby is "the principle of least surprise". As a result, Ruby has a lot of high-level functionality to make programming enjoyable. Some programmers also call Ruby, and frameworks like Ruby on Rails, "magical" in that sense.

On the other hand, the core philosophy of Python is aptly summarized in the Zen of Python:

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

The major theme is being explicit and encouraging one particular way of doing things in Python. This is in slight contrast with Ruby, where a lot of things happen implicitly and there are multiple ways to do the same thing. You can notice some of these subtle differences in the way these languages deal with:

  • Unicode strings and byte strings (Ruby is more implicit about the encodings)
  • Switch statements (Python has only if-else and no switch construct)
  • Anonymous functions (Python has only one way; lambdas, while Ruby contains blocks, Procs, and lambdas)
  • Getters and setters (Python uses properties, built on the descriptor protocol, to control access to instance variables, whereas in Ruby you can specify attr_reader and attr_writer accessors or write explicit getter and setter methods)
  • for loops (Python has typical for x in y way, whereas Ruby has multiple ways like n.times do, collection.each do |item|, along with for x in y)
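The Python side of a few of these points can be sketched as follows. The `Language` class and the type check in the setter are made up for illustration; they only show the shape of a property, a lambda, and the for x in y loop.

```python
class Language:
    def __init__(self, name, version):
        self.name = name
        self._version = version

    @property
    def version(self):
        # Getter: read like a plain attribute, as in lang.version
        return self._version

    @version.setter
    def version(self, value):
        # Setter: runs on assignment, as in lang.version = "..."
        if not isinstance(value, str):
            raise TypeError("version must be a string")
        self._version = value

lang = Language("Python", "3.6.9")
lang.version = "3.7.0"

# The one anonymous-function form: a single-expression lambda
describe = lambda l: "%s %s" % (l.name, l.version)

# The one loop form: for x in y
for item in [lang]:
    print(describe(item))  # Python 3.7.0
```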

Talking about the most popular frameworks in these languages (Rails and Django), Rails is an integral part of most Ruby programmers' skill set, and some people go as far as arguing that Rails is what has kept the language alive. Unlike Ruby as a programming language, Rails is designed to be strongly opinionated, favouring convention over configuration, which is why it is considered good for fast prototyping and quick iterations (Rails will do all the heavy lifting if you do things the Rails way). Django, on the other hand, is more explicit. It requires the programmer to configure different aspects of the application and thus involves a slight learning curve. You can see similar differences in other framework comparisons, like Sinatra and Flask. Both approaches have their pros and cons, and one may outweigh the other depending on your use case.

Community

Python's community has been growing rapidly due to the language's suitability for domains beyond web applications (like data analytics, image processing, deep learning, etc.). This increasing interest has given the language a great boost over the past few years in terms of features, performance, and supporting packages. The most active community for Ruby is the Rails community, so Rails as a framework is still growing decently.

Dependency Management

Python's dependency management ecosystem is slightly more mature and developer-friendly than Ruby's. I find myself in dependency hell more often in Ruby than in Python, mostly because managing isolated environments in Ruby is trickier than using Python's virtual environments. The other aspect is that PyPI (Python's package index) is more versatile when it comes to finding reusable, actively maintained libraries and avoiding re-inventing the wheel. At the time of writing this post, there are 191,743 Python packages on PyPI and 155,401 gems hosted at RubyGems.

Testing and debugging

In my personal experience, debugging in Ruby has been slightly more difficult. However, it's still friendlier than in most other languages. For testing, RSpec is widely used for behavior-driven development (BDD). In Python, the popular BDD framework is behave, followed by pytest plugins like pytest-bdd. Developers find RSpec to be more mature than the Python alternatives.

Python and Ruby's current usage in real-world Web development

Both of these languages are used in the tech stacks of large-scale websites. Some examples:

Some popular websites that use Ruby

Some popular websites that use Python

In this blog post, we'll be going through two server-side scripting languages; Python and Ruby, with a focus on comparing the performance and other factors that might help you in deciding which language to pick over the other for your web application.

Let's begin with performance first,

What does performance mean?

For the context of this post, you can think of high performant language as the one that,

  • Provides fast code execution in general
  • Handles concurrent tasks efficiently
  • Has low utilization of computing resources (typically the CPU utilization and memory footprint)

And a high performant web framework as the one that,

  • Has short response time
  • Provides high throughput (typically responses-per-minute)
  • Provides fast and efficient serialization and deserialization
  • Has high availability and fault tolerance
  • Scales better with more resources when the load increases.

Comparing the performance of Python with Ruby

We're aware that "real" comparison (a.k.a benchmarking) would require a lot of standardization in terms of the execution environment. I'll be running the code snippets in this post on my i5 machine, having four cores and 8 GB RAM, taking measures to reduce external influence as much as possible. Let's start by evaluating the execution times of both of these languages for simple iterative and recursion based programs.

Comparing run times of simple iterative and recursive programs in Python and Ruby

We'll take two well known mathematical problem statements,

  1. Compute nth value in the Fibonacci sequence.
  2. Compute factorial of n.

Here are our simple implementations for the same,

Note: It can be argued that these programs are not equivalent in terms of implementation in their respective languages, and a faster version can be written for them. Written that is beyond the scope of this blog-post (for reference, here's what would be an equivalent implementation of pidgits in Python and Ruby would look like).

The point of the above green-apples to red-apples like comparison is to practically see if there's a noteworthy difference in the execution times among the "typical" implementations of these programs in respective languages. This is going to be the theme of the entire post.

# Python version 3.6.9 (CPython implementation)
def fib(n):
    # Iterative fibonacci
    a, b = 0, 1
    for i in range(0, n):
        a, b = b, a + b
    return a
  
def fib_r(n): 
  # Recursive fibonacci
  return n if n < 2:  
  return fib_r(n-1) + fib_r(n-2) 

def fac(n):
  # Iterative factorial
  x = 1
  for i in range(2, n + 1):
    x = x * i

def fac_r(n):
  # Recursive factorial
  if  n >= 1:
    return n * fac_r(n - 1)
  return 1

# Printing out the run times, the value of n is decided based on execution times and maximum stack depth
print(timeit.timeit(lambda: fib(1000000), number=1))
print(timeit.timeit(lambda: fib_r(40), number=1))
print(timeit.timeit(lambda: fac_r(900), number=1))
print(timeit.timeit(lambda: fac(100000), number=1))
# Ruby version 2.6.5 (CRuby implementation)
require 'benchmark'

def fib(n)
  # Iterative fibonacci
   a, b = 0, 1
   for i in 0..n
     a, b = b, a + b
   end
end

def fib_r(n)
  # Recursive fibonacci
  return 1 if n < 2
  return fib_r(n - 1) + fib_r(n - 2)
end

def fac(n)
  # Iterative factorial
  x = 1
  for i in 2..n + 1
    x = x * i

def fac_r(n)
  # Recursive factorial
  if  n >= 1:
    return n * fac_r(n - 1)
  return 1

# Printing out the run times, The value of n is decided based on execution times and maximum stack depth
puts Benchmark.measure { fib(1000000) }
puts Benchmark.measure { fib_r(40) }
puts Benchmark.measure { fac(100000) }
puts Benchmark.measure { fac_r(900) }

The following are the average execution times after running theses scripts at 7 different points of time. I tried to make sure no other process was running to reduce bias. The n value is adjusted so that the program doesn't take too long and doesn't throw "Maximum recursion depth exceeded error" (happened with Python). Here are the observations,

Method n Ruby Python
fib 1000000 27.935831 s 10.435885478975251 s
fib_r 40 9.442680 s 36.948102285154164 s
fac 100000 6.833936 s 2.502855138000001 s
fac_r 900 2.701 ms 0.643335000006573 ms

And here are some of our observations,

  • Python is in the magnitude of 2.5x faster than Ruby when it comes to computations with typical for loop iteration. The slowness of Ruby here is due to the introduction of a new scope for every iteration, which involves the creation and deletion of these variables in every iteration.
  • Ruby is sometimes significantly faster, sometimes slower when it comes to recursion. It is commonly known that function call overhead is expensive in Python. Both of the languages have multiple optimizations for dealing with exploding call stacks, just like in the case of our naive implementation of Fibonacci and factorial.

If you're interested in benchmarking of these languages against programs like fannkuch-redux, fasta, k-nucleotide, mandlebrot, nbody, etc., Benchmarks Game's Ruby v/s Python 3 comparison is highly recommended (similar source).

Moving on, let's see how these languages perform when it comes to reading files from disk and parsing common formats like JSON.

Comparing run times for file reading from disk and JSON parsing programs in Python and Ruby

For the data, I've taken one of my scraping data dump in JSON which is 5.5 Mb in size, and its schema looks something like below

{
  "source": "https://www.startupranking.com",
  "data": [
    {
      "country": "United States",
      "startups": [
        {
          "sr_rank": "93,324",
          "name": "Airbnb",
          "overall_rank": "1",
          "url": "https://www.startupranking.com/airbnb",
          "country_rank": "1",
          "pitch": "Vacation Rentals, Homes, Experiences & Places\n                         -                                         Airbnb is a trusted online marketplace for people  ...",
          "fundingRounds": [
            {
              "date": "Jun 28, 2015",
              "amount": "$ 1,500,000,000",
              "name": "Series E",
              "investors": [
                "General Atlantic",
                "Hillhouse Capital",
                "Tiger Global",
                "Baillie Gifford",
                "China Broadband Capital",
                "Fidelity Investments",
                "Ggv Capital",
                "Horizon Ventures",
                "Kleiner Perkins Caufield Byers",
                "Sequoia Capital",
                "T Rowe Price",
                "Temasek",
                "Wellington Management",
                "Groupe Arnault",
                "Horizons Ventures"
              ]
            },
            {
              "date": "Apr 16, 2014",
              "amount": "$ 475,000,000",
              "name": "Series D",
              "investors": [
                "Dragoneer Investment Group",
                "Sequoia Capital",
                "Sherpa Ventures",
                "T Rowe Price",
                "Tpg Growth",
                "Andreessen Horowitz"
              ]
            },
            {
              "date": "Oct 28, 2013",
              "amount": "$ 200,000,000",
              "name": "Series C",
              "investors": [
                "Founders Fund",
                "Ashton Kutcher",
                "Crunchfund",
                "Sequoia Capital",
                "Airbnb"
              ]
            },
            ...
            ...
]}]}]}

It's nested enough and contains maps as well as arrays. The next task is to create methods to read this file and then load that as JSON.

# Python version 3.6.9 (CPython implementation)

# Importing the in-built json module for parsing
import json

def read_file(path):
  # Reading file contents from path
  with open(path, 'r') as f:
    content = f.read()
  return content

def load_json(path):
  # Parsing json from stored in file
  return(json.loads(read_file(path)))

# print(timeit.timeit(lambda: read_file('data.json'), number=1))
print(timeit.timeit(lambda: load_json('data.json'), number=1))
# Ruby version 2.6.5 (CRuby implementation)

# Importing the in-built json module for parsing
require "json"

def read_file(path)
  # Reading file from `path`
  return File.read(path)
end

def load_json(path)
  # Parsing JSON from file
  JSON.parse(read_file(path))
end

# puts Benchmark.measure { read_data("data.json") }
puts Benchmark.measure { load_json("data.json") }

Nothing fancy, just using in-built ways to read a file and parse JSON from a string, and recording their execution times one-by-one. Here are the results,

Method Ruby Python
read_file 4.676 ms 6.013702999553061 ms
load_json 96.573 ms 48.90625600000931 ms

Observations

  • Reading from a file is slightly faster in Ruby than in Python.
  • However, parsing json using standard library methods takes almost twice as time in Ruby than in Python.

Concurrency in Python and Ruby

Popular implementations of both the languages (CPython and Ruby) are blessed with Global Interpreter lock, which means,

  • Only one thread can execute at a time on a CPU, even if you have a multi-core processor.
  • In essence, you can create multiple threads, but they will run turn-by-turn instead of running in parallel (concurrency without parallelism).
  • Parallel I/O is still possible (and happens) among multiple threads.
  • To achieve parallelism with processing, the program will need to spawn separate processes and coordinate with them.

Python provides some abstraction for performing multiprocessing through the built-in multiprocessing module, and Ruby provides the Process module which is closer to the OS level. For parallelization of I/O related tasks, Python included asyncio module from 3.x onwards, and the module received significant usability and performance improvements in the recent Python 3.7.x version. Popular third-party options in Ruby are the async framework and the concurrent-ruby toolkit. There's a proposed pull request in Ruby for a fibre-based selector that will enhance concurrency.

Comparing performances of web frameworks in Python and Ruby

I'm going to take the popular minimalistic web frameworks Flask and Sinatra in the respective languages, and compare their response times for the following operations exposed through REST APIs,

  • Simple GET request
  • Simple POST request
  • Rendering a JSON response from an already initialized variable
  • Instantiating a new object and then rendering a JSON response
  • Rendering an HTML response via templating

Here's the code for all this,

# Filename: app.py
# Flask version: 1.1.1
from flask import Flask, jsonify, render_template
app = Flask(__name__)

# Already initialized list of languages
languages = [
        {
            "name": "Python",
            "is_interpreted": True,
            "version": "3.6.9"
        },
        {
            "name": "Ruby",
            "is_interpreted": True,
            "version": "2.6.5"
        },
]

class Language:
    # Our language class
    def __init__(self, name, is_interpreted, version):
        self.name = name
        self.is_interpreted = is_interpreted
        self.version = version

# Simple GET request
@app.route("/simple-get")
def get():
    return "Hello Scout!"

# Simple POST request
@app.route("/simple-post", methods=["POST"])
def post():
    return "Hello Scout!"

# Rendering a JSON response
@app.route("/simple-json", methods=["GET"])
def render_json():
    return jsonify(languages)

# Instantiating an object and then rendering a JSON response
@app.route("/simple-json-2", methods=["GET"])
def render_json_custom_object():
    lang = Language(**{
            "name": "Python",
            "is_interpreted": True,
            "version": "3.6.9"
        })
    return jsonify(lang.__dict__)

# Rendering an HTML response via templating
@app.route("/render-html", methods=["GET"])
def render_html():
    return render_template('template_python.html', languages=languages)

if __name__ == "__main__":
    app.run(debug=False)

<!-- Filename: templates/template_python.html -->
<html>
  <head>
    <title>Languages comparison</title>
  </head>
  <body>
    {% for language in languages %}
    <h1> {{ language["name"] }} </h1>
      <ul>
        <li>Is interpreted: {{ language["is_interpreted"] }}</li>
        <li>Current version: {{ language["version"] }}</li>
      </ul>
    {% endfor %}
  </body>
</html>

# Filename: app.rb
# Sinatra version 2.0.7

require 'sinatra'
require 'json'

# Already initialized array of languages
languages = [
        { :name => 'Ruby', :is_interpreted => true, :version => '2.6.5' },
        { :name => 'Python', :is_interpreted => true, :version => '3.6.9' },
 ]

class Language
  # Our language class
  attr_accessor :name, :is_interpreted, :version

  def initialize(name, is_interpreted, version)
    @name = name
    @is_interpreted = is_interpreted
    @version = version
  end

  def as_json(options={})
    {
      name: @name,
      is_interpreted: @is_interpreted,
      version: @version
    }
  end

  def to_json(*options)
    as_json(*options).to_json(*options)
  end
end

# Simple GET request
get '/simple-get' do
  "Hello, Scout!"
end

# Simple POST request
post '/simple-post' do
  "Hello, Scout!"
end

# Rendering a JSON response
get '/simple-json' do
  content_type :json
  languages.to_json
end

# Instantiating an object and then rendering a JSON response
get '/simple-json-2' do
  language = Language.new("Ruby", true, "2.6.5")
  content_type :json
  language.to_json
end

# Rendering an HTML response via templating
get '/render-html' do
  erb :template_ruby, :locals => {:languages => languages}
end

<!-- Filename: views/template_ruby.erb -->
<html>
  <head>
    <title>Languages comparison</title>
  </head>
  <body>
    <% languages.each do |language| %>
      <h1><%= language[:name] %></h1>
      <ul>
        <li>Is interpreted: <%= language[:is_interpreted] %></li>
        <li>Current version: <%= language[:version] %></li>
      </ul>
    <% end %>
  </body>
</html>

I used Postman to record the response times; you can also use cURL to do the same. Here are the observations,

| Endpoint | Flask response time (in ms) | Sinatra response time (in ms) |
|---|---|---|
| /simple-get | | |
| /simple-post | | |
| /simple-json | | |
| /simple-json-2 | | |
| /render-html | | |
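
If you'd rather script the measurements than click through Postman, something along these lines could work. It is a sketch under assumptions: `time_request`, `summarize` and `measure_all` are hypothetical helper names, only Python's standard library is used, and the timings include the overhead of Python's own HTTP client, so they won't match Postman's numbers exactly.

```python
import statistics
import time
import urllib.request

# The endpoints defined by both apps above
ENDPOINTS = [
    ("GET", "/simple-get"),
    ("POST", "/simple-post"),
    ("GET", "/simple-json"),
    ("GET", "/simple-json-2"),
    ("GET", "/render-html"),
]


def time_request(url, method="GET", repeats=5):
    """Return the wall-clock time of each request in milliseconds."""
    timings = []
    for _ in range(repeats):
        # An empty body is enough for the simple POST endpoint
        data = b"" if method == "POST" else None
        request = urllib.request.Request(url, data=data, method=method)
        start = time.perf_counter()
        with urllib.request.urlopen(request) as response:
            response.read()  # include the time taken to receive the body
        timings.append((time.perf_counter() - start) * 1000)
    return timings


def summarize(timings):
    """Reduce a list of timings to min / median / max."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def measure_all(base_url, repeats=5):
    """Time every endpoint of one app and return {path: summary}."""
    return {
        path: summarize(time_request(base_url + path, method, repeats))
        for method, path in ENDPOINTS
    }
```

With both apps running locally on their default development ports, `measure_all("http://127.0.0.1:5000")` would cover Flask and `measure_all("http://127.0.0.1:4567")` would cover Sinatra. The median is reported because the first request to a freshly started server is often slower than the rest.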

You can also choose to add Scout to your application here to monitor the response times. Here's how you'd set it up for Flask,

from flask import Flask
from scout_apm.flask import ScoutApm

# Setup a flask 'app' as normal
app = Flask(__name__)

# Attach ScoutApm to the Flask App
ScoutApm(app)

# Scout settings
app.config["SCOUT_MONITOR"] = True
app.config["SCOUT_KEY"] = "YOUR_SCOUT_API_KEY"
app.config["SCOUT_NAME"] = "flask_endpoints"

And here's how you'd do it for Sinatra,

require 'sinatra'
require 'scout_apm'
ScoutApm::Rack.install!

# When the app is started through Rack, this line goes in config.ru
run Sinatra::Application

get '/simple-get' do
  # Tell Scout to track this request as a Rack transaction
  ScoutApm::Rack.transaction("get /simple-get", request.env) do
    "Hello, Scout!"
  end
end

Here are some observations,

  • Most of the response timings are equivalent. Templating is slightly faster in Flask, and so is rendering a JSON response from a class object.
  • Sinatra saved some time on the "DNS Lookup" and "TCP Handshake" parts of the total time by caching.

I wanted to compare network performance and database operations as well, but dropped the idea because of the language-specific differences in how the database drivers and network libraries are implemented. Just like with the language comparisons, if you're interested in more technical benchmarking of Python and Ruby frameworks, I'd recommend you check out this link.

In a real-world scenario, web framework speed might be just one part of the bigger story. A request cycle might consist of the following critical components, in sequential order from the moment the client triggers a request,

  • Load balancers like HAProxy
  • Web accelerators like Varnish and Squid
  • Web servers like Nginx (which, by the way, can also take up the job of accelerator and load balancer)
  • Application servers like Unicorn and Gunicorn
  • Frameworks like Ruby on Rails and Django
  • Caches at the application level like Redis and Memcached
  • Finally, the I/O in the form of disk, databases, network, etc.

The frameworks and technologies used at each of these steps also contribute to the final response time, so choosing the right design is critical here.

Differences beyond Performance

From our analysis so far, it's evident that some things are slower in one language and some in the other. A lot of these differences stem from the design philosophy of each language and how it evolved over time. There can also be reasons other than performance to pick between Python and Ruby. Let's go through those next before we conclude the post.

Design philosophies of Python, Ruby and their frameworks

Ruby is designed to be a friendly language that keeps the programmer's comfort in mind. The core principle in Ruby is "the principle of least surprise". As a result, Ruby has a lot of high-level functionality to make programming enjoyable, and some programmers call Ruby and frameworks like Ruby on Rails "magical" in that sense.

On the other hand, the core philosophy of Python is aptly summarized in the Zen,

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

The major theme is being explicit and encouraging one particular way of doing things. This is in slight contrast with Ruby, where a lot of things happen implicitly and there are multiple ways to do the same thing. You can notice some of these subtle differences in the way these languages deal with,

  • Unicode strings and byte strings (Ruby is more implicit about the encodings)
  • Switch statements (Python has only if-else and no switch construct)
  • Anonymous functions (Python has only one way, lambdas, while Ruby has blocks, Procs, and lambdas)
  • Getters and setters (Python exposes instance attributes directly and offers properties, built on descriptors, when you need getter or setter logic, whereas in Ruby you declare attr_reader and attr_writer accessors or write explicit getter and setter methods)
  • for loops (Python has typical for x in y way, whereas Ruby has multiple ways like n.times do, collection.each do |item|, along with for x in y)
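
To make the Python side of a few of these points concrete, here is a small sketch. The `Language` class mirrors the one used in the Flask app above, but the validation rule in the setter and the `square` and `describe` helpers are made up purely for illustration.

```python
class Language:
    def __init__(self, name, version):
        self.name = name         # plain attribute: read and written directly
        self._version = version  # "private" by convention only

    @property
    def version(self):
        # Getter: runs when code reads `lang.version`
        return self._version

    @version.setter
    def version(self, value):
        # Setter: runs when code assigns `lang.version = ...`
        # (the string-only rule is an illustrative assumption)
        if not isinstance(value, str):
            raise TypeError("version must be a string")
        self._version = value


# The single way to write an anonymous function in Python
square = lambda x: x * x


def describe(version):
    # No switch construct in the Python versions used in this post:
    # branching is spelled out with if / elif / else
    if version.startswith("2."):
        return "legacy"
    elif version.startswith("3."):
        return "current"
    else:
        return "unknown"
```

The caller still writes `lang.version = "3.7.4"` as if it were a plain attribute, which is roughly what `attr_accessor` gives you in Ruby, except that here the extra logic is opted into explicitly.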

Talking about the most popular frameworks in these languages (Rails and Django), Rails is an integral skill for most Ruby programmers, and some people go as far as to argue that Rails is what has kept the language alive. Unlike Ruby as a programming language, Rails is designed to be strongly opinionated, favoring convention over configuration, which is why it is considered good for fast prototyping and quick iterations (Rails will do all the heavy lifting if you do things the Rails way). Django, on the other hand, is more explicit. It requires the programmer to configure different aspects of the application and thus involves a slight learning curve. You can see similar differences in other framework comparisons like Sinatra and Flask. Both approaches have their pros and cons, and one may outweigh the other depending on your use case.

Community

Python's community has been growing rapidly due to its suitability in domains beyond web applications (like data analytics, image processing, deep learning, etc.). This increasing interest has given a great boost to the language in the past few years in terms of features, performance, and supporting packages. The most active community for Ruby is the Rails community, so Rails as a framework is still growing decently.

Dependency Management

Python's dependency management ecosystem is slightly more mature and developer-friendly than Ruby's. I find myself in dependency hell in Ruby more often than in Python, mostly because managing isolated environments in Ruby is trickier than using Python's virtual environments. The other aspect is that PyPI (Python's package index) is more versatile when it comes to finding actively maintained, reusable libraries so that you can avoid re-inventing the wheel. At the time of writing this post, there are 191,743 Python packages on PyPI and 155,401 gems hosted at RubyGems.
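
For readers who haven't used virtual environments, here is a rough sketch of creating one from Python itself with the standard library's venv module. The helper name `make_isolated_env` and the directory name `demo-env` are hypothetical; in day-to-day work most people would run the equivalent command from a shell instead.

```python
import tempfile
import venv
from pathlib import Path


def make_isolated_env(parent_dir, name="demo-env"):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = Path(parent_dir) / name
    # with_pip=False keeps the sketch fast and offline; pass with_pip=True
    # if you want pip bootstrapped into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = make_isolated_env(tmp)
        # pyvenv.cfg marks the directory as a virtual environment
        print((env / "pyvenv.cfg").is_file())
```

Packages installed into such an environment stay out of the system-wide Python, which is what makes per-project dependency sets easy to keep apart.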

Testing and debugging

In my personal experience, debugging in Ruby has been slightly more difficult than in Python. However, it's still friendlier than in most other languages. For testing, RSpec is widely used in Ruby for behavior-driven development (BDD). In Python, the popular BDD framework is behave, followed by pytest plugins like pytest-bdd. Many developers find RSpec to be more mature than the Python alternatives.

Python and Ruby's current usage in real-world Web development

Both of these languages are used in the tech stacks of large-scale websites. Some examples being,

Some popular websites that use Ruby

Some popular websites that use Python

Conclusion

In this post, we tried to evaluate the performance of Python, Ruby, and their frameworks for simple but commonly performed tasks. There are certain cases where one language outshines the other, but performance alone doesn't seem like a good reason to pick one of these languages over the other because,

  • Developers matter more: The per hour CPU costs in the cloud are cheaper than per hour developer time.
  • In most business cases solving the problem first (getting product-market-fit) is more important than focusing on performance.
  • For large scale web applications, performance is more of a design-architecture game than of picking one language among the two.
  • If language-performance is really what you want, then there are other low-level languages (probably the compiled ones), which can do much better.

Neither of these two languages can objectively be called better than the other. Recent Stack Overflow developer survey results are slightly more favorable to Python and its frameworks, but both languages are happily used and even supported by large-scale companies. Better reasons for choosing between the two can be,

  • The community support for your use-case
  • The developer team's familiarity and preference
  • Necessary third-party support in terms of reusable packages and their ease of use (documentation)
  • The level of control that you need (configuration v/s convention)
  • The speed at which you want to develop your application

I hope this article helped you inch closer to your decision to pick one language out of these two. Anyways, no matter which language you end up choosing, Scout has got your application monitoring needs covered :)
