@Magnus167
Last active November 22, 2023 16:43
What is a Monte Carlo simulation anyway?

In a nutshell, what is a Monte Carlo simulation anyway?

Monte Carlo simulations are typically used to estimate a result when the actual problem is too vast to solve in its entirety (or in a reasonable amount of time). They do this by working with random samples instead of the whole problem.

There are 3 examples below, each closer to a real-world use than the last. Example 1 is the best 😄

Example 1: Estimating the number of potholes in London

Let's take an example: Palash wants to get an estimate of the total number of potholes in London. Now, a seemingly obvious way to do this would be to systematically go through each and every street in London and count the number of potholes in each street.

However, let's say that there are a million streets in London. You can't expect Palash to go through each and every street and count the potholes; he'd be dead before he was done with even a fraction of the streets.

So what does Palash do? Being the wise man he is, he decides to take a sample of the streets in London. He randomly selects 100 streets and counts the number of potholes in each of them. He then takes the average of the number of potholes in these 100 streets and multiplies it by the total number of streets in London to get an estimate of the total number of potholes.

This is a Monte Carlo simulation.
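Palash's approach can be sketched in a few lines of Python. This is only an illustration: the "city" is made-up data (100,000 streets rather than a million, to keep it quick), and the function name `estimate_total_potholes` is invented for this example.

```python
import random

def estimate_total_potholes(potholes_per_street, sample_size, rng):
    """Estimate the city-wide pothole count from a random sample of streets."""
    # Pick streets uniformly at random, without replacement.
    sample = rng.sample(potholes_per_street, sample_size)
    # Average potholes per sampled street, scaled up to every street.
    mean_per_street = sum(sample) / sample_size
    return mean_per_street * len(potholes_per_street)

rng = random.Random(42)  # fixed seed so the run is repeatable
# Made-up "ground truth" (an assumption): 100,000 streets, 0 to 10 potholes each.
city = [rng.randint(0, 10) for _ in range(100_000)]

estimate = estimate_total_potholes(city, sample_size=100, rng=rng)
actual = sum(city)
print(f"estimated {estimate:,.0f} potholes, actual {actual:,}")
```

In real life Palash never gets to see `actual`; it is only computed here to show that 100 streets already land reasonably close.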

You can have additional factors in your simulation, such as:

  • the length of the street
  • the width of the street
  • the total number of cars you saw on the street
  • the time of the day you saw the street

to enhance the accuracy of your simulation.

But the whole idea is to take a sample of the population and use it to estimate the characteristics of the population.
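As a rough sketch of how one of those extra factors could help, here is a version that uses street length. It assumes Palash knows the total length of all streets in London, and it scales potholes per kilometre instead of potholes per street (often called a ratio estimate). The data and the name `ratio_estimate` are made up for illustration.

```python
import random

def ratio_estimate(sample, total_length_km):
    """Scale the sample's potholes-per-km up to the city's total street length."""
    potholes = sum(count for count, _ in sample)
    length_km = sum(length for _, length in sample)
    return potholes / length_km * total_length_km

rng = random.Random(7)
# Made-up city (an assumption): longer streets tend to have more potholes.
streets = []
for _ in range(50_000):
    length_km = rng.uniform(0.1, 3.0)
    potholes = round(length_km * rng.uniform(2.0, 6.0))
    streets.append((potholes, length_km))

total_length_km = sum(length for _, length in streets)
sample = rng.sample(streets, 100)  # 100 random (potholes, length) pairs
estimate = ratio_estimate(sample, total_length_km)
actual = sum(count for count, _ in streets)
print(f"estimated {estimate:,.0f} potholes, actual {actual:,}")
```

This only pays off when potholes really do grow with street length; if they don't, the plain per-street average is just as good.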

Example 2: Estimating the average commute time for people in New York

Estimating the average commute time for everyone in New York City is a massive task. It's impractical to ask every single person about their commute time. The city is huge, and there are millions of people with varying distances and modes of transport.

How would we sample the population?

Instead of surveying everyone, we decide to randomly select a sample of, say, 5000 people from different parts of the city.

What data are we collecting?

For each person, we collect the following data:

  • The distance they travel to work
  • The mode of transport they use
  • The suburbs/neighborhoods they commute to and from
  • The times at which they make their commute

What are some seasonal or time-based factors that we should consider?

  • The time of the year (summer, winter, spring, fall)
  • The time of the day (rush hour, mid-day, late night)
  • Weather conditions (rain, snow, sunny)
  • Holidays (Christmas, Thanksgiving, New Year's Eve, Black Friday)

Calculating the average commute time

This isn't trivial, because the data mixes distances, modes of transport, and times of day. But it's quite doable with some Excel and statistics.

Extrapolating the results to the entire population
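For anyone who prefers Python to Excel, a minimal sketch of the calculation might look like this. The survey answers are randomly generated stand-ins (not real commute data), and `mean_with_ci` is a name invented here. It reports the sample average along with an approximate 95% confidence interval, which says how far off the average could plausibly be.

```python
import random
import statistics

def mean_with_ci(samples, z=1.96):
    """Return (mean, low, high) for an approximate 95% confidence interval."""
    mean = statistics.fmean(samples)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, mean - z * stderr, mean + z * stderr

rng = random.Random(1)
# Made-up survey answers (an assumption): commute minutes for 5,000 people.
commutes = [max(5.0, rng.gauss(40, 15)) for _ in range(5_000)]

mean, low, high = mean_with_ci(commutes)
print(f"average commute: {mean:.1f} min (95% CI {low:.1f} to {high:.1f})")
```

The interval shrinks as the sample grows, which is one way to decide whether 5000 people is enough.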

This is the most important part of the simulation, where we "apply" our results to the entire population.

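This step typically includes some validation, in the spirit of A/B testing, where we test our results against a control group, for example a second random sample that wasn't used to produce the estimate.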
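One simple way to "apply" sample results to the whole city is to weight each group's average by how common that group is in the population. The sketch below assumes the population shares per mode of transport are known from some other source. Every number in it is invented for illustration, and `extrapolate` is a hypothetical helper.

```python
def extrapolate(sample_means, population_shares):
    """Weight each group's sample mean by that group's share of the population."""
    total_share = sum(population_shares.values())
    weighted = sum(sample_means[group] * share
                   for group, share in population_shares.items())
    return weighted / total_share

# All numbers below are invented for illustration, not real New York data.
sample_means = {"subway": 45.0, "car": 30.0, "walk": 15.0}    # average minutes
population_shares = {"subway": 0.5, "car": 0.3, "walk": 0.2}  # fraction of commuters

estimate = extrapolate(sample_means, population_shares)
print(f"city-wide estimate: {estimate:.1f} min")
```

The weighting matters when the sample over- or under-represents a group, say if the survey happened to reach far more drivers than subway riders.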

Example 3: Studying the structural integrity of infrastructure using cameras

This is a more realistic example of how Monte Carlo simulations are used.

Problem:

In a large urban area, structural engineers want to assess the long-term effects of environmental stress (like wind, earthquakes, heavy traffic vibrations, etc.) on buildings. It's impractical to instrument every building with sensors due to cost and logistical constraints.

Sampling:

The engineers select a sample of buildings, say 100, of various ages, designs, and in different locations throughout the city.

Photographic Data Collection:

They set up time-lapse cameras or periodically take high-resolution photographs of these buildings over an extended period. These photographs capture subtle changes in the buildings' structures that might indicate stress, like cracks, tilting, or bending.

Additional Data Gathering:

Alongside photographs, they might also collect data on:

  • Environmental conditions (temperature, humidity, wind speed).
  • Seismic activity in the area.
  • Traffic density around the buildings (for vibration analysis).
  • Historical data on the buildings’ maintenance and renovations.

Image Analysis for Stress Indicators:

Using image processing techniques, the engineers analyze the photographs for signs of structural stress over time. They look for patterns or changes that might indicate weakening or potential failure points in the buildings.

The actual Monte Carlo Simulation:

They use this data to run simulations for each building in the sample. Each simulation is repeated many times with randomly varied environmental factors, which are combined with the observed data to estimate the building's stress level and potential lifespan.
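To give a feel for what "running simulations" means here, below is a toy sketch. The damage model is entirely hypothetical (a fixed yearly wear plus a random shock) and is not how a structural engineer would actually model a building. The point is only the Monte Carlo pattern: play out many random futures and look at the spread of outcomes. The names `simulate_lifespan` and `estimate_lifespan` are invented for this example.

```python
import random
import statistics

def simulate_lifespan(initial_damage, rng, failure_threshold=1.0, max_years=500):
    """One random future: the year accumulated damage first reaches the threshold."""
    damage = initial_damage
    for year in range(1, max_years + 1):
        # Hypothetical yearly wear: steady traffic vibration (0.005)
        # plus a random shock from wind or seismic activity (mean 0.005).
        damage += 0.005 + rng.expovariate(1 / 0.005)
        if damage >= failure_threshold:
            return year
    return max_years

def estimate_lifespan(initial_damage, trials=2_000, seed=0):
    """Run many random futures; return (median, roughly 5th percentile) in years."""
    rng = random.Random(seed)
    runs = sorted(simulate_lifespan(initial_damage, rng) for _ in range(trials))
    return statistics.median(runs), runs[int(0.05 * trials)]

# initial_damage stands in for a score from the image analysis (0 = pristine).
for name, damage in [("new tower", 0.0), ("cracked warehouse", 0.4)]:
    median, pessimistic = estimate_lifespan(damage)
    print(f"{name}: median {median:.0f} years, about 5% of runs fail by year {pessimistic}")
```

Reporting a pessimistic percentile next to the median is a design choice: for safety decisions the bad-luck scenarios matter more than the typical one.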

Extrapolation and Predictive Analysis:

From these simulations, the engineers develop a model to estimate the structural integrity and lifespan of similar buildings across the city. For example, if the data shows that buildings of a certain age or design are more prone to stress under specific conditions, they can predict which buildings in the city might need closer monitoring or reinforcement.


If you'd like to watch a video on this topic, you can watch it here:

https://ocw.mit.edu/courses/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/resources/lecture-6-monte-carlo-simulation/

If you simply want to read through the lecture notes, you can find them here:

https://ocw.mit.edu/courses/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/5af20311b02eaab959fcdb7ffb5694d3_MIT6_0002F16_lec6.pdf

ChatGPT explains Monte Carlo simulations as a middle-aged Indian man in Harrow with a newfound love for rap music

Alright, let's frame an explanation of Monte Carlo simulations in a way that resonates with a middle-aged Indian man living in Harrow, who has a keen interest in rap music:


"Chalo, let's roll with some desi beats on Monte Carlo simulations, bhai style!"

Picture this, yaar: you're in Harrow, your second home, but your heart beats to the dhol of your Indian roots. Now, you're vibing with rap, trying to mix those sick beats with a tadka of Bollywood zest. But here's the twist – you wanna know how your unique blend will groove with the crowd, not just in Harrow, but maybe back in the gullies of Mumbai too. Enter the world of Monte Carlo simulations, like a DJ mixing East with West.

  1. Sampling Like a Masala Mix (Selecting a Sample):

    • It's like choosing the right spices for your biryani – essential! You select a sample of, say, 100 spots where your desi rap might echo, from the streets of Harrow to the lanes of Delhi.
  2. Laying Down Your Ragas and Rhymes (Collecting Data):

    • At each spot, you imagine dropping a beat. Will the crowd swing to your fusion of tabla and turntables? That's your data, bro – the response to your unique sound.
  3. The Remix of Variables (Considering Variables):

    • Every venue has its own vibe, yaar. From the age group (college crowd or the more mature uncles and aunties), to the location (a hip London club or a Mumbai chai tapri), even to festive times (Diwali beats vs Christmas tunes).
  4. Spinning the Harmonium and Decks (Running Simulations):

    • Mix all this like you're blending classical with hip-hop. Each scenario spins a different crowd reaction. Imagine the aunties nodding to your rhymes or the London crowd tapping to a dhol beat.
  5. The Grand Finale (Analyzing Results):

    • After your musical yatra across these scenarios, you get a range of reactions. Maybe your desi rap hits big in Harrow, but it's only mildly nod-worthy in Mumbai chai tapris. That's your insight, dost.
  6. The Encore Across Seas (Extrapolation):

    • Now, you use this knowledge to gauge the feel across different scenes. If your Indo-Western mix got 70% of your sample grooving, maybe it's the new thing across Harrow and beyond!

So, bhai, Monte Carlo simulations in our world? It's like testing your Indo-rap across a myriad of scenarios, gathering insights without having to physically drop beats in every corner of London or Mumbai. It's playing with the probabilities of success, just like blending classical Indian tunes with street-smart Western rap. Let's keep the beats rolling, and the data flowing! 🎤🎧🇮🇳✌️

