{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "ML Dotnet in Google Colab.ipynb",
      "provenance": [],
      "collapsed_sections": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "S4EDsgBJArmU"
      },
      "source": [
        "<p align=\"center\">\n",
        "<a target=\"_blank\" href=\"https://colab.research.google.com/drive/1MQF2B_chRddk7WFKS6H5wLs_vdI7Xfg8?usp=sharing\">\n",
        "  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
        "</a>\n",
        "</p>\n",
        "\n",
        "# Running ML Dotnet in Colab\n",
        "\n",
        "In this tutorial, we will learn to build a classification model in Colab using **ML Dotnet** which is a machine learning framework for .NET. To know more about, ML.NET visit [https://dotnet.microsoft.com/learn/ml-dotnet](https://dotnet.microsoft.com/learn/ml-dotnet). Let's start building our classification model."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Dyn4zYHqEAMd"
      },
      "source": [
        "**Step 1:** First we need to install .NET Core SDK and ML.NET in our workspace."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "A4SpMCbaJOls"
      },
      "source": [
        "! wget -q https://packages.microsoft.com/config/ubuntu/18.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb \\\n",
        "&& dpkg -i packages-microsoft-prod.deb \\\n",
        "&& add-apt-repository universe \\\n",
        "&& apt-get update \\\n",
        "&& apt-get install apt-transport-https \\\n",
        "&& apt-get update \\\n",
        "&& apt-get install dotnet-sdk-3.1 \\\n",
        "&& dotnet tool install -g mlnet"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5dBaf_PMEN8j"
      },
      "source": [
        "Let's check the version of dotnet and mlnet to make sure everything is ready to explore."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "7rUEn7ACPenO",
        "outputId": "3182c889-cb5c-45c4-9b3d-64cec58e01d6"
      },
      "source": [
        "! dotnet --version &&  ~/.dotnet/tools/mlnet --version"
      ],
      "execution_count": 2,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "3.1.404\n",
            "Welcome to the ML.NET CLI!\r\n",
            "--------------------------\r\n",
            "Learn more about the ML.NET CLI: https://aka.ms/mlnet-cli\r\n",
            "Use 'mlnet --help' to see available commands and options.\r\n",
            "\r\n",
            "Telemetry\r\n",
            "---------\r\n",
            "The ML.NET CLI tool collects usage data in order to help us improve your experience.\r\n",
            "The data doesn't include personal information or data from your datasets.\r\n",
            "You can opt-out of telemetry by setting the MLDOTNET_CLI_TELEMETRY_OPTOUT environment variable to '1' or 'true' using your favorite shell.\r\n",
            "\r\n",
            "Read more about ML.NET CLI telemetry: https://aka.ms/mlnet-cli-telemetry\r\n",
            "\n",
            "16.2.0+511cc1082bef7a4bbb2f83ad88c1c425c932079c\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9gme4JERQK6m"
      },
      "source": [
        "**Step 2:** Seems like everythings is okay to start exploring. Now we will import our dataset and save it as **dataset.tsv**."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "yPcECNqys-e5"
      },
      "source": [
        "! wget -O dataset.tsv https://raw.githubusercontent.com/dotnet/machinelearning/master/test/data/wikipedia-detox-250-line-data.tsv"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JbHAmeUFQ2gx"
      },
      "source": [
        "Each row in **dataset.tsv** represents a different review left by a user on Wikipedia. The first column represents the sentiment of the text (0 is non-toxic, 1 is toxic), and the second column represents the comment left by the user. The columns are separated by tabs. The dataset looks like:\n",
        "\n",
        "```\n",
        "Sentiment\tSentimentText\n",
        "1\t        ==RUDE== Dude, you are rude upload that carl picture back, or else.\n",
        "1\t        == OK! ==  IM GOING TO VANDALIZE WILD ONES WIKI THEN!!!\n",
        "0\t        I hope this helps.\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xKgpLDpkRNYD"
      },
      "source": [
        "**Step 3:** Now we'll train our model with the **dataset.tsv** dataset. In this case we need to specify the target column which we want to predict (**Sentiment**). Additionally, we need to specify the amount of time we would like the ML.NET CLI to explore the different models, in this case **8 seconds**."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "D4JM5oFzw5r8",
        "outputId": "3d6f8a54-3e00-4e2e-e11d-e752c5d74b72"
      },
      "source": [
        "! ~/.dotnet/tools/mlnet classification --dataset \"dataset.tsv\" --label-col \"Sentiment\" --train-time 8"
      ],
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "\u001b[?1h\u001b=\u001b[m\u001b[37mStart Training\n",
            "\u001b[m\u001b[m\u001b[37m|     Trainer                              MicroAccuracy  MacroAccuracy  Duration #Iteration                     |\n",
            "\u001b[m\u001b[m\u001b[37m|1    AveragedPerceptronOva                       0.7529         0.7111       4.4          1                     |\n",
            "\u001b[m\u001b[m\u001b[37m\n",
            "===============================================Experiment Results=================================================\n",
            "------------------------------------------------------------------------------------------------------------------\n",
            "|                                                     Summary                                                    |\n",
            "------------------------------------------------------------------------------------------------------------------\n",
            "|ML Task: multiclass-classification                                                                              |\n",
            "|Dataset: /content/dataset.tsv                                                                                   |\n",
            "|Label : Sentiment                                                                                               |\n",
            "|Total experiment time : 4.3551944 Secs                                                                          |\n",
            "|Total number of models explored: 1                                                                              |\n",
            "------------------------------------------------------------------------------------------------------------------\n",
            "\n",
            "\u001b[m\u001b[m\u001b[33m\u001b[m\u001b[m\u001b[37m|                                              Top 1 models explored                                             |\n",
            "------------------------------------------------------------------------------------------------------------------\n",
            "|     Trainer                              MicroAccuracy  MacroAccuracy  Duration #Iteration                     |\n",
            "|1    AveragedPerceptronOva                       0.7529         0.7111       4.4          1                     |\n",
            "------------------------------------------------------------------------------------------------------------------\n",
            "\n",
            "\u001b[m\u001b[?1h\u001b=\u001b[?1h\u001b=\u001b[m\u001b[37mCode Generated\n",
            "\u001b[m\u001b[m\u001b[37mGenerated C# code for model consumption: /content/SampleClassification/SampleClassification.ConsoleApp\n",
            "\u001b[m\u001b[m\u001b[37mCheck out log file for more information: /root/.mlnet/log.txt\n",
            "\u001b[m\u001b[m\u001b[37mExiting ...\n",
            "\u001b[m"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jMlqAikDSdLd"
      },
      "source": [
        "**Hurrrrrrrrrrrayyy !!!** We have done it !!\n",
        "\n",
        "After the ML.NET CLI selects the best model, it will display the training Summary, which is showing us a summary of the exploration process, including how many models were explored in the given training time, in this case 1.\n",
        "\n",
        "If you check the file explorer in colab, you'll see a **SampleClassification** folder which contains our trained model.\n",
        "\n",
        "<p align=\"center\">\n",
        "<img width=\"250\" src=\"https://i.imgur.com/8P8WDn5.png\"  />\n",
        "</p>\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ebsjnUjKbnQd"
      },
      "source": [
        "**Step 4:** Now we will create a new dotnet console application which is where we will test our trained model.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "yeGiAUa7zh6x"
      },
      "source": [
        "! dotnet new console -o consumeModelApp"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "eeP-UsfXWKVZ"
      },
      "source": [
        "**Step 5:** If you check your colab workspace again, there should be a new dotnet console application named as consumeModelApp. Now open the consumeModelApp, double click over the **Program.cs** file and in the csharp file replace everything with this piece of code below and press **Ctrl + S** to save the code there.\n",
        "\n",
        "```csharp\n",
        "using System;\n",
        "using SampleClassification.Model;\n",
        "\n",
        "namespace consumeModelApp\n",
        "{\n",
        "    class Program\n",
        "    {\n",
        "        static void Main(string[] args)\n",
        "        {\n",
        "            // Add input data\n",
        "            var input = new ModelInput();\n",
        "            input.SentimentText = \"That is rude.\";\n",
        "\n",
        "            // Load model and predict output of sample data\n",
        "            ModelOutput result = ConsumeModel.Predict(input);\n",
        "            Console.WriteLine(\"\\n\\n\");\n",
        "            Console.WriteLine($\"Text: {input.SentimentText}\\nIs Toxic: {result.Prediction}\");\n",
        "        }\n",
        "    }\n",
        "}\n",
        "```\n",
        "\n",
        "When you double click over the Program.cs file you will see the editor tab at the right and there you can paste that piece of code like below.\n",
        "\n",
        "<p align=\"center\">\n",
        "<img width=\"600\" src=\"https://i.imgur.com/bVPVfDZ.png\" />\n",
        "</p>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JGcZ-XoBXKJX"
      },
      "source": [
        "**Final Step:** Now we will copy and add a reference of our trained model in our console application and then we will build and run our application to predict a text's sentiment which in background uses our classification model."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "tHcH5hTi-_5e",
        "outputId": "d1840a0e-6771-4459-cd55-3a5407386e56"
      },
      "source": [
        "! cp SampleClassification/SampleClassification.Model/MLModel.zip consumeModelApp \\\n",
        "&& cd consumeModelApp \\\n",
        "&& dotnet add reference \"../SampleClassification/SampleClassification.Model/SampleClassification.Model.csproj\" \\\n",
        "&& dotnet run"
      ],
      "execution_count": 6,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Reference `..\\SampleClassification\\SampleClassification.Model\\SampleClassification.Model.csproj` added to the project.\n",
            "\u001b[?1h\u001b=\u001b[?1h\u001b=\u001b[?1h\u001b=\u001b[?1h\u001b=\u001b[?1h\u001b=\u001b[?1h\u001b=\n",
            "\n",
            "\n",
            "Text: That is rude.\n",
            "Is Toxic: 1\n",
            "\u001b[?1h\u001b="
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3QXQH__ZdaUv"
      },
      "source": [
        "Finally, we are done with building a classification model in Colab with ML.NET. Cheers !!!\n",
        "\n",
        "# 🎉🎉🎉"
      ]
    }
  ]
}