@NeganSmith92
Created February 26, 2024 06:11
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "52fcce79",
"metadata": {},
"outputs": [],
"source": [
"#EXERCISE_1\n",
"# 1.Explain the difference between median and middle value?\n",
"# 2.Are mean and mode values always the same for unsorted and sorted dataset? Why?\n",
"# 3.If range is caculate with last datapoint – first datapoint, should the dataset is sorted first or not? Why?\n",
"\n",
"\n",
"# 1.The median is a measure of central tendency that represents the middle value of a dataset when it is ordered from smallest to largest. If the dataset has an odd number of observations, the median is the middle value. If the dataset has an even number of observations, the median is the average of the two middle values.\n",
"# The middle value is simply the value that occupies the middle position in a dataset. It does not necessarily have to be the median. For example, in the dataset [1, 3, 5, 7, 9], the median is 5, while the middle value is also 5 because it occupies the middle position. However, in the dataset [1, 2, 3, 4, 5, 6], the median is (3 + 4) / 2 = 3.5, but the middle value is 4.\n",
"\n",
"# 2.The mean (average) and mode (most frequent value) may or may not be the same for unsorted and sorted datasets.\n",
"# For the mean, sorting the dataset does not change the sum of the values, so the mean remains the same whether the dataset is sorted or not.\n",
"# For the mode, sorting the dataset may change the frequency distribution of the values, potentially affecting which value appears most frequently and thus changing the mode. However, if there is a single mode (one value that appears most frequently), it will remain the same regardless of sorting.\n",
"\n",
"# 3.It doesn't matter whether the dataset is sorted or not when calculating the range. The range is simply the difference between the maximum and minimum values in the dataset.\n",
"# Sorting the dataset does not change the maximum and minimum values, so whether the dataset is sorted or not, the range will be the same.\n",
"# Therefore, sorting the dataset is unnecessary and does not affect the calculation of the range."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "de193fbe",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[34, 66, 41, 68, 32, 72, 78, 37, 49, 58, 41, 33, 34, 47, 41, 10, 68, 77, 38, 20, 34, 61, 49, 64, 48, 56, 11, 62, 46, 67, 43, 28, 68, 37, 66, 13, 50, 41, 74, 43, 40, 77, 72, 11, 79, 78, 33, 38, 53, 11, 45, 23, 50, 66, 58, 44, 47, 65, 12, 63, 41, 40, 70, 36, 45, 27, 14, 77, 42, 17, 54, 64, 24, 23, 56, 15, 51, 32, 36, 31, 28, 54, 10, 72, 61, 31, 57, 21, 59, 72, 46, 42, 38, 73, 16, 13, 24, 13, 80, 43]\n"
]
},
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#EXERCISE_2\n",
"import random as rnd\n",
"x = [\n",
" rnd.randint(10, 80)\n",
" for i in range(0, 100)\n",
"]\n",
"print(x)\n",
"\n",
"import matplotlib.pyplot as plt\n",
"plt.hist(x, bins=25, color='red', edgecolor='black')\n",
"plt.xlabel('Value')\n",
"plt.ylabel('Frequency')\n",
"plt.title('Histogram of Random Data by zafran marvis')\n",
"plt.grid(True)\n",
"plt.show()\n",
"\n",
"# In this example, I've chosen bins=25 to divide the data into 25 equally spaced bins.\n",
"# Because to make the histogram look didn't worst and the detailed information is easy to read.\n",
"# You can adjust the number of bins based on your preference and the distribution of your data.\n",
"# If you have more data points or if you want more detailed information about the distribution, you may consider increasing the number of bins.\n",
"# Conversely, if you have fewer data points or if you want a broader overview of the distribution, you may consider decreasing the number of bins."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "4ac1eb33",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#EXERCISE_3\n",
"import numpy as np\n",
"import statistics as stat\n",
"N = 1000; a = 20; b = 60; x = []\n",
"for i in range(N):\n",
" xi = rnd.randint(a, b)\n",
" x.append(xi)\n",
"plt.hist(x, bins=10)\n",
"plt.xticks(np.arange(20, 64, 4))\n",
"plt.xlim([20, 60])\n",
"plt.grid()\n",
"plt.show()\n",
"\n",
"# n = b - a + 1 = 60 - 20 + 1 = 41\n",
"# Average Value = N * 1/n * the width of the bins = 1000 * 1/41 * 4 = 97,5609"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "027f2e30",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#EXERCISE_4\n",
"N = 6000; a=1; b=6; dice = []\n",
"for i in range(N):\n",
" xi = rnd.randint(a, b)\n",
" dice.append(xi)\n",
"plt.hist(dice, bins=6)\n",
"plt.grid()\n",
"plt.show()\n",
"\n",
"# if we choose bin = 6, The value is 1000"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "f585227f",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 6 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#EXERCISE_5\n",
"mu = 100\n",
"si = 10\n",
"N = 100000\n",
"values = np.random.normal(mu, si, N)\n",
"\n",
"plt.subplot(3,3,1)\n",
"plt.hist(values, 100)\n",
"plt.axvline(values.mean(), color='r')\n",
"plt.xlim([60,340])\n",
"plt.title(\"mu = 100\")\n",
"\n",
"mu = 200\n",
"si = 10\n",
"N = 100000\n",
"values = np.random.normal(mu, si, N)\n",
"\n",
"plt.subplot(3,3,4)\n",
"plt.hist(values, 100)\n",
"plt.axvline(values.mean(), color='r')\n",
"plt.xlim([60,340])\n",
"plt.title(\"mu = 200\")\n",
"\n",
"mu = 300\n",
"si = 10\n",
"N = 100000\n",
"values = np.random.normal(mu, si, N)\n",
"\n",
"plt.subplot(3,3,7)\n",
"plt.hist(values, 100)\n",
"plt.axvline(values.mean(), color='r')\n",
"plt.xlim([60,340])\n",
"plt.title(\"mu = 300\")\n",
"\n",
"mu = 100\n",
"si = 5\n",
"N = 100000\n",
"values = np.random.normal(mu, si, N)\n",
"\n",
"plt.subplot(3,3,3)\n",
"plt.hist(values, 100)\n",
"plt.axvline(values.mean(), color='r')\n",
"plt.xlim([60,340])\n",
"plt.title(\"si = 5\")\n",
"\n",
"mu = 100\n",
"si = 10\n",
"N = 100000\n",
"values = np.random.normal(mu, si, N)\n",
"\n",
"plt.subplot(3,3,6)\n",
"plt.hist(values, 100)\n",
"plt.axvline(values.mean(), color='r')\n",
"plt.xlim([60,340])\n",
"plt.title(\"si = 10\")\n",
"\n",
"mu = 100\n",
"si = 20\n",
"N = 100000\n",
"values = np.random.normal(mu, si, N)\n",
"\n",
"plt.subplot(3,3,9)\n",
"plt.hist(values, 100)\n",
"plt.axvline(values.mean(), color='r')\n",
"plt.xlim([60,340])\n",
"plt.title(\"si = 20\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e16ff1f1",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
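The Exercise 1 answers on median, mean, mode and range can be checked with a short sketch. The dataset [7, 1, 9, 3, 5] and the tied dataset below are illustrative assumptions, not data from the notebook, and the variable names are hypothetical.

```python
import statistics

# Illustrative unsorted dataset (an assumption, not taken from the notebook)
data = [7, 1, 9, 3, 5]
ordered = sorted(data)

# Middle value: whatever sits at the middle position of the data as given
middle_value = data[len(data) // 2]
# Median: statistics.median sorts internally, so input order does not matter
median = statistics.median(data)

# Mean is unchanged by sorting (same sum, same count)
mean_same = statistics.mean(data) == statistics.mean(ordered)

# Mode: the set of most frequent values is unchanged by sorting
tied = [2, 2, 1, 1, 3]  # illustrative dataset with two equally frequent values
modes_same = sorted(statistics.multimode(tied)) == sorted(statistics.multimode(sorted(tied)))

# Range as last - first is only correct on sorted data
naive_range = data[-1] - data[0]
sorted_range = ordered[-1] - ordered[0]
true_range = max(data) - min(data)

print("middle value:", middle_value, "median:", median)
print("mean unchanged by sorting:", mean_same)
print("modes unchanged by sorting:", modes_same)
print("last - first (unsorted):", naive_range)
print("last - first (sorted):", sorted_range, "max - min:", true_range)
```

Using `statistics.multimode` rather than `statistics.mode` sidesteps the tie-breaking question, since it returns every most frequent value instead of a single one.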