"metadata": {
"id": "4J2lIggJLUMS",
"colab_type": "text"
"cell_type": "markdown",
"source": [
# **Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook**
"metadata": {
"id": "fu_dR92sSS5-",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Part 1.1-1.2**
"* [Part 1.1: Introductory Concepts in Python, IPython and Jupyter](\n",
"* [Part 1.2 Functions](\n",
"* [References](\n",
"metadata": {
"id": "c_Id55m6Jsbu",
"colab_type": "text"
"cell_type": "markdown",
"source": [
## Pragmatic AI Labs
"metadata": {
"id": "e5p96AqpSDZa",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"![alt text](\n",
This notebook was produced by [Pragmatic AI Labs](
"* Buying a copy of [Pragmatic AI: An Introduction to Cloud-Based Machine Learning](\n",
"* Reading an online copy of [Pragmatic AI:Pragmatic AI: An Introduction to Cloud-Based Machine Learning](\n",
"* Watching video [Essential Machine Learning and AI with Python and Jupyter Notebook-Video-SafariOnline]( on Safari Books Online.\n",
"* Purchasing video [Essential Machine Learning and AI with Python and Jupyter Notebook- Purchase Video](\n",
"* Viewing more content at [](\n"
## Part 1.1: Introductory Concepts in Python, IPython and Jupyter
"metadata": {
"id": "nP_6tcHM7yFm",
"colab_type": "text"
"cell_type": "markdown",
"source": [
### Using IPython, Jupyter, and Python executable
"metadata": {
"id": "n7JKyOrPkITH",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Using IPython
"Very similar to Jupyter, but run from terminal:\n",
"* IPython predates Jupyter\n",
"* Both Jupyter and IPython accept *!ls -l* format to execute shell commands\n",
"metadata": {
"id": "Pci2aotqPIME",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Jupyter Notebook
"Many flavors of Jupyter Notebook. A few popular ones:\n",
"![Jupyter]( =100x100)\n",
"![JupyterHug]( =150x150)\n",
"![Colab]( =100x100)\n",
"![Kaggle]( =200x100)\n",
"![Sagemaker]( =200x100)\n",
"#### Hosted Commercial Flavors\n",
"* [Google Colaboratory]( Free\n",
"* [Kaggle]( Free\n",
"#### Pure Open Source\n",
"* [Jupyter]( standalone, original\n",
"* [JupyterHub]( multi-user, docker friendly\n",
"#### Hybrid Solutions\n",
"* Running Jupyter on [AWS Spot Instances](\n",
"* [Google Data Lab](\n",
"* [Azure Data Science Virtual Machines](\n",
"* [AWS Sagemaker](\n",
"* [Azure Machine Learning Studio](\n"
#### Colab Notebook Key Features
##### Upload to Colab
"id": "0kmgxW7Sxa6j",
"colab_type": "text"
"cell_type": "markdown",
"source": [
##### Forms in Colab
"cell_type": "code",
"source": [
"print(f\"You select it is {Use_Python} you use Python\")"
"execution_count": 18,
"outputs": [
"output_type": "stream",
"text": [
"You select it is True you use Python\n"
"name": "stdout"
#### Python executable
"cell_type": "code",
"source": [
"#this is how you capture input to a program\n",
"import sys;sys.argv"
"execution_count": 2,
"outputs": [
"output_type": "execute_result",
"data": {
"text/plain": [
" '-f',\n",
" '/root/.local/share/jupyter/runtime/kernel-a0b0350b-f3e1-4e60-87be-11bef5dc5d43.json']"
"metadata": {
"id": "6HJtGuiKUYFU",
"colab_type": "text"
"cell_type": "markdown",
"source": [
### Introductory Concepts
"* **Procedural Statements**\n",
"* Strings and String Formatting\n",
"* Numbers and Arithmetic Operations\n",
"* Data Structures\n",
#### Procedural Statements
"metadata": {
"id": "yTePE3imVqfP",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"metadata": {
"id": "8nUjNnApV6cC",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"**Create Variable and Use Variable**"
"metadata": {
"id": "H004n8R-WDGF",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"**Multiple procedural statements**"
"metadata": {
"id": "6IiZ6RnLWKsq",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"**Adding Numbers**"
"metadata": {
"id": "mlszsN87WmcO",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"**Adding Phrases**"
"metadata": {
"id": "WSrfrIO1XYts",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"**Complex statements**\n",
"More complex statements can be created that use data structures like the belts variable, which is a list."
"metadata": {
"id": "UYzPEDtnXlbx",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Strings and String Formatting
"Strings are a sequence of characters and they are often programmatically formatted. Almost all Python programs have strings because they can be used to send messages to users who use the program. When creating strings there are few core concepts to understand:\n",
"* Strings can be create with the single, double and triple/double quotes\n",
"* Strings are can be formatted\n",
"* One complication of strings is they can be encoded in several formats including unicode\n",
"* Many methods are available to operate on strings. In an editor or IPython shell you can see these methods by tab completion: \n",
" capitalize() format() islower() lower() rpartition() title() \n",
" casefold() format_map() isnumeric() lstrip() rsplit() translate() \n",
" center() index() isprintable() maketrans() rstrip() upper() \n",
" count() isalnum() isspace() partition() split() zfill() \n",
" encode() isalpha() istitle() replace() splitlines() \n",
" endswith() isdecimal() isupper() rfind() startswith() \n",
" expandtabs() isdigit() join() rindex() strip() \n",
" find() isidentifier() ljust() rjust() swapcase() \n",
"metadata": {
"id": "8afQjS25YDvQ",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Basic String**
"metadata": {
"id": "GMYqc-cKYJlL",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Splitting String**
"Turn a string in a list by splitting on spaces, or some other thing"
"metadata": {
"id": "Ti9hySuGZbbX",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**All Capital**
"Turn a string into all Capital Letter"
"metadata": {
"metadata": {
"id": "ZlifO6c4Zk5W",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Slicing Strings**
"Strings can be referenced by length and sliced"
"metadata": {
"id": "lxvtolozZy1f",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Strings Can Be Added Together**
"metadata": {
"id": "nNYxshNHZ6O8",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**F-Strings Can Be Formatted in More Complex Ways**
"One of the best ways to format a string in modern Python 3 is to use f-strings"
"metadata": {
"id": "wlpMuwdmbui0",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Strings Can Use Triple Quotes to Wrap**
"metadata": {
"id": "mBuMTMQWb239",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Line Breaks Can Be Removed with Replace**
"The last long line contained line breaks, which are the **\\n** character, and they can be removed by using the replace method"
"metadata": {
"id": "1sv8Q1ALePXM",
"colab_type": "code",
"colab": {}
"cell_type": "code",
"source": [
"execution_count": 0,
"outputs": []
#### Numbers and Arithmetic Operations
**Adding and Subtracting Numbers**
"metadata": {
"id": "nMKS0ZYkdjvp",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Multiplication with Decimals**
"Can use float type to solve decimal problems"
"metadata": {
"id": "gjMkBnuxiCQW",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Can also use Decimal Library to set precision and deal with repeating decimal\n"
"metadata": {
"id": "f1_B6OUrdxlU",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Using Exponents**
"Using the Python math library it is straightforward to call 2 to the 3rd power"
"metadata": {
"id": "nkPcDry7jWt-",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Can also use built in exponent operator to accomplish same thing"
"metadata": {
"id": "P6gOf9qtd6Nt",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Converting Between Different Numerical Types**
"There are many numeric types to be aware of in Python.\n",
"A couple of the most common are:\n",
"* Integers\n",
"* Floats"
"metadata": {
"id": "MrrFXb99gQ1Z",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Numbers can also be rounded**
"Python Built in round "
"metadata": {
"id": "SuJDUTLFWUJz",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Numpy round"
"metadata": {
"id": "X3aHIe6qW8ab",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Pandas round"
"metadata": {
"id": "elnncfa0XtOt",
"colab_type": "text"
"cell_type": "markdown",
"source": [
Simple benchmark of all three (**Python**, **NumPy** and **Pandas** round) using **%timeit**
"*Depending on what is being rounded (e.g. a very large DataFrame), performance may vary, so knowing how to benchmark round is important.*\n"
"metadata": {
"id": "CuJ2OPSZiewO",
"colab_type": "text"
"cell_type": "markdown",
"source": [
### Data Structures
"Python has a couple of core Data Structures that are used very frequently\n",
"* Lists\n",
"* Dictionaries\n",
"Dictionaries and lists are the real workhorses of Python, but there are also other Data Structers like tuples, sets, Counters, etc, that are worth exploring too."
#### Python Dictionaries
##### Creating Python Dictionaries
"metadata": {
"id": "TjNRS-YOjYxw",
"colab_type": "text"
"cell_type": "markdown",
"source": [
##### Using Python Dictionaries
"A common dictionary usage pattern is to *iterate* on a dictionary by using the items method. In the example below the key and the value are printed:"
"metadata": {
"id": "5cue3xbzjfhF",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Dictionaries can also be used to *filter*. In the example below, only the submission attacks on the lower body are displayed:"
"id": "9jkR5VTHjsc2",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Dictionary keys and values can also be selected with built in *keys() * and *values()* methods"
"metadata": {
"metadata": {
"id": "sgPM6V_slmoy",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Key lookup is very performant, and one of the most common ways to use a dictionary."
"metadata": {
"id": "IyHkox1Rj66y",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Python Lists
"Lists are also very commonly used in Python. They allow for sequential collections. Lists can hold dictionaries, just as dictionaries can hold lists."
##### Creating Lists
"metadata": {
"id": "WUgrRDDsoobr",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Another method os creating lists is with built in *list()* method\n"
"metadata": {
"id": "i1gjso964T1T",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Yet another way, very performant way to create lists is to use list comprehsion syntax"
"metadata": {
"id": "CBDIYoFB7yHS",
"colab_type": "text"
"cell_type": "markdown",
"source": [
##### Using Lists
"For loops are one of the simplist ways to use a list."
"metadata": {
"metadata": {
"id": "YAMxQJobkCy8",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Lists can also be used to select elements by slicing."
"metadata": {
"id": "XgxwgI1I7Ex3",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Lists can also be used to unpack powerful, succinct statements when used with built-in functions like zip.\n"
"metadata": {
"id": "YXalviiD7yHZ",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Python Sets
"Sets are unordered unique collections"
"metadata": {
##### Creating Python Sets
"metadata": {
"id": "Kj_JgH317yHZ",
"colab_type": "text"
"cell_type": "markdown",
"source": [
##### Using Sets
"One of the most powerful ways to use sets is to find the differences between to collections"
"metadata": {
"metadata": {
"id": "ZTmBkJG0Xud_",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Question: \n",
"Q: set() is used to select unique values. what is its performance for a deep learning large data sets. in large data sets, if set() is not performant enough, what are the alternatives?"
"metadata": {
"id": "1wY-GUQc3te-",
"colab_type": "text"
"cell_type": "markdown",
"source": [
## Part 1.2 Functions
"* *[Read related material covered in Chapter 1 (Functions Section) of Pragmatic AI](*\n",
"* *[Watch video section 2: Writing and Applying Functions](* \n",
"* **Writing Functions**\n",
"* Function arguments: positional, keyword\n",
"* Functional Currying: Passing uncalled functions\n",
"* Functions that Yield\n",
"* Decorators: Functions that wrap other functions\n",
"* Making Classes Behave Like Functions\n",
"* Applying a Function to a Pandas DataFrame\n",
"* Writing Lambdas"
#### Writing Functions
**Simple function**
"id": "lq9VQm0V4BTE",
"colab_type": "code",
"colab": {}
"cell_type": "code",
"source": [
"def myfunc():pass"
"execution_count": 0,
"outputs": []
"metadata": {
"id": "AMPU657F4JWF",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Documenting Functions**
"It is a very good idea to document functions. \n",
"In Jupyter Notebook and IPython docstrings can be viewed by referring to the function with a ?. ie.\n",
"In [2]: favorite_martial_art_with_docstring?\n",
"Signature: favorite_martial_art_with_docstring()\n",
"Docstring: This function returns the name of my favorite martial art\n",
"File: ~/src/functional_intro_to_python/<ipython-input-1-bef983c31735>\n",
"Type: function\n",
"metadata": {
"id": "8OW1SbDg4Rd1",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"**Docstrings of functions can be printed out by referring to *```__doc__```*** "
"metadata": {
"id": "ew4-TwH84YJa",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Function arguments: positional, keyword
"A function is most useful when arguments are passed to the function. New values for times are processed inside the function. This function is also a 'positional' argument, vs a keyword argument. Positional arguments are processed in the order they are created in."
"metadata": {
"id": "LILwF-Nt4iTm",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Positional Arguments are processed in order**
"Note, *position* is the key to pay attention to.\n",
"metadata": {
"metadata": {
"id": "ntX-r2Bp4uTw",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Keyword Arguments are processed by key, value and can have default values**
"One handy feature of keyword arguments is that you can set defaults and only change the defaults you want to change."
"metadata": {
"metadata": {
"id": "2wBL_tQ846Qd",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**\*args and \*\*kwargs**
"These allow dynamic argument passing to functions.\n",
"They should be used with discretion because they can make code hard to understand"
"metadata": {
"metadata": {
"id": "6Bd0PwiD5NnI",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**passing dictionary of keywords to function**
"**kwargs syntax can also be used to pass in arguments all at once"
"metadata": {
"metadata": {
"id": "3Srgz3C65Vdk",
"colab_type": "text"
"cell_type": "markdown",
"source": [
**Passing Around Functions**
"Object-Oriented programming is a very popular way to program, but it isn't the only style available in Python. For concurrency and for Data Science, functional programming fits as a complementary style.\n",
"In the example, below a function can be used inside of another function by being passed into the function itself as an argument."
"metadata": {
"metadata": {
"metadata": {
"metadata": {
"id": "bCdDA35S5mQ4",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Closures and Functional Currying
"Closures are functions that contain other nested functions with state from outer function.\n",
"In Python, a common way to use them is to keep track of the state. In the example below, the outer function, attack_counter keeps track of counts of attacks. The inner fuction attack_filter uses the \"nonlocal\" keyword in Python3, to modify the variable in the outer function.\n",
"This approach is called \"functional currying\". It allows for a specialized function to be created from general functions. As shown below, this style of function could be the basis of a simple video game or maybe for the statistics crew of a mma match."
"metadata": {
"metadata": {
"id": "hZx708yd7lpU",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Partial Functions
"Useful to partial assign default values to functions"
"metadata": {
"id": "fOhajVSK7oeX",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
"outputId": "547b37cb-a2da-469f-de4a-913b8a89fd95"
"metadata": {
"id": "ymZDwSka7rrh",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"By using this partial function, only one argument is needed"
Alternatively, the original function can also be called with two different attacks
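A minimal sketch of this pattern with `functools.partial`. The function names and attack names are hypothetical.

```python
from functools import partial


def multiple_attacks(attack_one, attack_two):
    """Hypothetical function that takes two attacks and returns them as a list."""
    return [attack_one, attack_two]


# Fix the first argument; the new function needs only the second one
attack_this = partial(multiple_attacks, "kimura")

one_arg = attack_this("knee_bar")
two_args = multiple_attacks("darce", "heel_hook")   # the original still takes two

print(one_arg, two_args)
```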
"metadata": {
"id": "5-tKvYXt6ftS",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Lazy Evaluated Functions (Generators)
"A very useful style of programming is \"lazy evaluation\". A generator is an example of that. Generators yield an items at a time.\n",
"The example below return an \"infinite\" random sequence of attacks. The lazy portion comes into play in that while there is an infinite amount of values, they are only returned when the function is called."
"metadata": {
"metadata": {
"id": "0IxoCkJu6y-B",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
"outputId": "4919032d-73f0-4de9-e204-f52275ac28b9"
"cell_type": "code",
"source": [
"for _ in range(1):\n",
" print(next(attack))"
"execution_count": 66,
"outputs": [
"output_type": "stream",
"text": [
"name": "stdout"
"metadata": {
"id": "hlJXHUVq6391",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Decorators: Functions that wrap other functions
##### Randomized Sleep Decorator
"metadata": {
"id": "qO-uoYUClvqr",
"colab_type": "text"
"cell_type": "markdown",
"source": [
##### Timing Decorator
"metadata": {
"id": "McePAV81l2Cm",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Using a decorator to time code is very common"
"metadata": {
"metadata": {
"id": "SFisqs8tmJer",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"Using decorator to time execution of a function"
"metadata": {
"metadata": {
"id": "WT4sChux989o",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Making Classes Behave Like Functions
"Creating callable functions"
"metadata": {
"metadata": {
"id": "hQpf0vu_-Cnk",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Applying a Function to a Pandas DataFrame
"The final lesson on functions is to take this knowledge and use it on a DataFrame in Pandas. One of the more fundamental concepts in Pandas is use apply on a column vs iterating through all of the values. An example is shown below where all of the numbers are rounded to a whole digit."
"metadata": {
"metadata": {
"id": "QrQiWvRC-QY9",
"colab_type": "text"
"cell_type": "markdown",
"source": [
This was done with a built in function, but a custom function can also be written and applied to a column. In the example below, the values are multiplied by 100. The alternative way to accomplish this would be to create a loop, transform the data and then write it back. In Pandas, it is straightforward and simple to apply custom functions instead.
"metadata": {
"metadata": {
"id": "gq3i_yeDWWHt",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 87
"outputId": "8ea67505-8873-4f1b-d901-f11876837eb4"
"cell_type": "code",
"source": [
"#example of a smarter function\n",
"def smart_multiply_by_100(x):\n",
" if x > 5:\n",
" return 1\n",
" return x\n",
"inputs = [1,2,6,10]\n",
"for input in inputs:\n",
" print(smart_multiply_by_100(input))\n",
" \n",
" "
"execution_count": 174,
"outputs": [
"output_type": "stream",
"text": [
"name": "stdout"
"metadata": {
"id": "C95FIjW5-rJc",
"colab_type": "text"
"cell_type": "markdown",
"source": [
#### Writing Lambdas
"Generally considered to be unnecessary. A Python lambda is an inline python and it can often lead to confusing code. \n"
"metadata": {
"metadata": {
"id": "FlIAJo2GK7KL",
"colab_type": "code",
"colab": {}
"metadata": {
"metadata": {
"id": "rnXLQ6rZLKj5",
"colab_type": "code",
"colab": {}
"metadata": {
"id": "fQhEoJu1tNRe",
"colab_type": "text"
"cell_type": "markdown",
"source": [
## References
"metadata": {
"id": "u6BaXDS7tQTD",
"colab_type": "text"
"cell_type": "markdown",
"source": [
"* [Pragmatic AI: An Introduction to Cloud-Based Machine Learning-Physical Book](\n",
"* [Pragmatic AI: Pragmatic AI: An Introduction to Cloud-Based Machine Learning-SafariOnline Book](\n",
"* [Essential Machine Learning and AI with Python and Jupyter Notebook-Video-SafariOnline](\n",
"* [Essential Machine Learning and AI with Python and Jupyter Notebook- Purchase Video](\n",
"* [](\n",
"* [](\n",
"* [Forbes: Here come the notebooks](\n",
"* [Circleci: Increase reliability in data science and ML projects]("
## FAQ
### Q: Why are Python Dictionaries so commonly used vs other data structures?
A: They are highly performant, easy to use, and flexible to program with
"metadata": {
"id": "P3xpdAnD3aqo",
"colab_type": "text"
"cell_type": "markdown",
"source": [
### Q: Can I use '+= 1' to keep adding to a variable instead of writing 'upper_body_counter = upper_body_counter + 1'?
"A: Yes, ```var += 1``` is *syntactic sugar* for ```var = var + 1```\n"
"id": "nSZHcS0x37yc",
"colab_type": "code",
"colab": {}
"metadata": {
"id": "5Kh5LRGT4CA1",
"colab_type": "text"
"cell_type": "markdown",
"source": [
### Q: Does a function with 'return' instead of using 'print' do the same thing?
"A: No, these are different mostly"
"metadata": {
"metadata": {
"metadata": {
"id": "SBJKnr3h5P0L",
"colab_type": "text"
"cell_type": "markdown",
"source": [
A function with no return statement in Python returns None
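A minimal sketch of the difference between printing and returning. The function names are hypothetical.

```python
def print_total(a, b):
    """Display the sum; with no return statement the result is None."""
    print(a + b)


def return_total(a, b):
    """Hand the sum back to the caller so it can be reused."""
    return a + b


printed = print_total(1, 2)      # prints the sum, but printed is None
returned = return_total(1, 2)    # prints nothing, but returned holds the sum

print(printed, returned)
```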
"metadata": {
"metadata": {
"metadata": {
"metadata": {
"id": "1EBc2ohAonjw",
"colab_type": "text"
"cell_type": "markdown",
"source": [
## Homework Exercises
"metadata": {
### Can you return random attacks without repeating?