@luisquintanilla
Created September 14, 2022 21:15
Infer data schema and train using ML.NET AutoML
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install NuGet Packages"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div><div><strong>Restore sources</strong><ul><li><span>https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json</span></li></ul></div><div></div><div><strong>Installed Packages</strong><ul><li><span>Microsoft.Data.Analysis, 0.20.0-preview.22424.1</span></li><li><span>Microsoft.ML.AutoML, 0.20.0-preview.22424.1</span></li><li><span>Plotly.NET.CSharp, 0.0.1</span></li><li><span>Plotly.NET.Interactive, 3.0.2</span></li></ul></div></div>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"Loading extensions from `Plotly.NET.Interactive.dll`"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"Loading extensions from `Microsoft.Data.Analysis.Interactive.dll`"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"#i \"nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json\"\n",
"#r \"nuget: Plotly.NET.Interactive, 3.0.2\"\n",
"#r \"nuget: Plotly.NET.CSharp, 0.0.1\"\n",
"#r \"nuget: Microsoft.ML.AutoML, 0.20.0-preview.22424.1\"\n",
"#r \"nuget: Microsoft.Data.Analysis, 0.20.0-preview.22424.1\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import namespaces"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"using System;\n",
"using System.IO;\n",
"using Microsoft.Data.Analysis;\n",
"using Microsoft.ML;\n",
"using Microsoft.ML.AutoML;\n",
"using Microsoft.ML.Data;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define training data path"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"var trainDataPath = @\"C:\\Datasets\\taxi-fare-train.csv\";"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize MLContext"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"var ctx = new MLContext();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Infer training data schema"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"var columnInferenceResults = ctx.Auto().InferColumns(trainDataPath, \"fare_amount\", groupColumns: false);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inspect inference results\n",
"\n",
"### Label Column (Column to predict)\n",
"\n",
"fare_amount\n",
"\n",
"### Features\n",
"\n",
"- Numeric Columns\n",
" - rate_code\n",
" - passenger_count\n",
" - trip_time_in_secs\n",
" - trip_distance\n",
"- Categorical Columns\n",
" - vendor_id\n",
" - payment_type"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [
{
"data": {
"text/html": [
"<table><thead><tr><th>LabelColumnName</th><th>UserIdColumnName</th><th>GroupIdColumnName</th><th>ItemIdColumnName</th><th>ExampleWeightColumnName</th><th>SamplingKeyColumnName</th><th>CategoricalColumnNames</th><th>NumericColumnNames</th><th>TextColumnNames</th><th>IgnoredColumnNames</th><th>ImagePathColumnNames</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">fare_amount</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">[ vendor_id, payment_type ]</div></td><td><div class=\"dni-plaintext\">[ rate_code, passenger_count, trip_time_in_secs, trip_distance ]</div></td><td><div class=\"dni-plaintext\">[ ]</div></td><td><div class=\"dni-plaintext\">[ ]</div></td><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"columnInferenceResults.ColumnInformation"
]
},
{
"cell_type": "markdown",
"metadata": {
"dotnet_interactive": {
"language": "csharp"
}
},
"source": [
"## Load data into IDataView"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"var textLoader = ctx.Data.CreateTextLoader(columnInferenceResults.TextLoaderOptions);\n",
"var idv = textLoader.Load(trainDataPath);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inspect IDataView Schema"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [
{
"data": {
"text/html": [
"<table><thead><tr><th><i>index</i></th><th>Name</th><th>Index</th><th>IsHidden</th><th>Type</th><th>Annotations</th></tr></thead><tbody><tr><td>0</td><td>vendor_id</td><td><div class=\"dni-plaintext\">0</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.ReadOnlyMemory&lt;System.Char&gt;</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>1</td><td>rate_code</td><td><div class=\"dni-plaintext\">1</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>2</td><td>passenger_count</td><td><div class=\"dni-plaintext\">2</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>3</td><td>trip_time_in_secs</td><td><div class=\"dni-plaintext\">3</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>4</td><td>trip_distance</td><td><div class=\"dni-plaintext\">4</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>5</td><td>payment_type</td><td><div class=\"dni-plaintext\">5</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.ReadOnlyMemory&lt;System.Char&gt;</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>6</td><td>fare_amount</td><td><div class=\"dni-plaintext\">6</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr></tbody></table>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"idv.Schema"
]
},
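{
"cell_type": "markdown",
"metadata": {},
"source": [
"The schema lists column names and types but not the values themselves. A minimal sketch of peeking at a few rows, assuming the `idv` variable from the previous cells and the `Preview` extension method from `Microsoft.ML` (the row count of 5 is an arbitrary choice):\n",
"\n",
"```csharp\n",
"using System;\n",
"using System.Linq;\n",
"using Microsoft.ML;\n",
"\n",
"// Assumption: `idv` is the IDataView created in the \"Load data into IDataView\" cell.\n",
"var preview = idv.Preview(maxRows: 5);\n",
"\n",
"foreach (var row in preview.RowView)\n",
"{\n",
"    // Each row exposes its values as (column name, value) pairs.\n",
"    Console.WriteLine(string.Join(\", \", row.Values.Select(kv => $\"{kv.Key}={kv.Value}\")));\n",
"}\n",
"```\n",
"\n",
"`Preview` materializes the requested rows eagerly, so it is best kept to small row counts while exploring."
]
},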
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Split data into train / validation set\n",
"\n",
"- 80% Train\n",
"- 20% Validation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"var trainTestSplit = ctx.Data.TrainTestSplit(idv, testFraction: 0.2);\n",
"var trainSet = trainTestSplit.TrainSet;\n",
"var validation = trainTestSplit.TestSet;"
]
},
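{
"cell_type": "markdown",
"metadata": {},
"source": [
"The split assigns rows at random, so repeated runs may train on different rows. A possible variation, assuming a reproducible split is wanted, is to pass the optional `seed` parameter of `TrainTestSplit`. The variable names and the seed value 42 below are illustrative, not part of the original notebook:\n",
"\n",
"```csharp\n",
"using Microsoft.ML;\n",
"\n",
"// Assumption: `ctx` and `idv` come from the earlier cells; 42 is an arbitrary seed.\n",
"var seededSplit = ctx.Data.TrainTestSplit(idv, testFraction: 0.2, seed: 42);\n",
"var seededTrainSet = seededSplit.TrainSet;   // roughly 80% of the rows\n",
"var seededValidation = seededSplit.TestSet;  // roughly 20% of the rows\n",
"```"
]
},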
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define training pipeline\n",
"\n",
"- `Featurizer`: Transforms the data to prepare it for training, based on the schema information in the column inference results. The output is a feature vector column named *Features*.\n",
"- `Regression`: Estimator that automatically explores various regression algorithms and settings to find the best model for the given dataset. Passing `useLgbm: false` leaves LightGBM out of the search."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"var pipeline =\n",
"    ctx.Auto().Featurizer(trainSet, columnInferenceResults.ColumnInformation, outputColumnName: \"Features\")\n",
"        .Append(ctx.Auto().Regression(labelColumnName: columnInferenceResults.ColumnInformation.LabelColumnName, useLgbm: false));"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure Experiment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"var experiment = ctx.Auto().CreateExperiment();\n",
"\n",
"experiment\n",
"\t.SetPipeline(pipeline)\n",
"\t.SetTrainingTimeInSeconds(60)\n",
"\t.SetRegressionMetric(RegressionMetric.RootMeanSquaredError, labelColumn: columnInferenceResults.ColumnInformation.LabelColumnName)\n",
"\t.SetDataset(trainSet, validation);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure logging"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"// Print only the log messages emitted by the AutoML experiment\n",
"ctx.Log += (object? sender, LoggingEventArgs e) =>\n",
"{\n",
"    if (e.Source.Contains(\"AutoMLExperiment\")) Console.WriteLine(e.RawMessage);\n",
"};"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run experiment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Channel started\r\n",
"Channel started\r\n",
"Update Running Trial - Id: 0 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Running Trial - Id: 0 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Completed Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 2744\r\n",
"Update Completed Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 2744\r\n",
"Update Best Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Best Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Running Trial - Id: 1 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Running Trial - Id: 1 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Completed Trial - Id: 1 - Metric: 5.402454859092199 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 2892\r\n",
"Update Completed Trial - Id: 1 - Metric: 5.402454859092199 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 2892\r\n",
"Update Running Trial - Id: 2 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Running Trial - Id: 2 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Completed Trial - Id: 2 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3014\r\n",
"Update Completed Trial - Id: 2 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3014\r\n",
"Update Running Trial - Id: 3 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Running Trial - Id: 3 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Completed Trial - Id: 3 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 5229\r\n",
"Update Completed Trial - Id: 3 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 5229\r\n",
"Update Running Trial - Id: 4 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Running Trial - Id: 4 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Completed Trial - Id: 4 - Metric: 5.3937182735638975 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 5183\r\n",
"Update Completed Trial - Id: 4 - Metric: 5.3937182735638975 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 5183\r\n",
"Update Running Trial - Id: 5 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Running Trial - Id: 5 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Completed Trial - Id: 5 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 2010\r\n",
"Update Completed Trial - Id: 5 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 2010\r\n",
"Update Running Trial - Id: 6 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Running Trial - Id: 6 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Completed Trial - Id: 6 - Metric: 14.211149762211765 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 5686\r\n",
"Update Completed Trial - Id: 6 - Metric: 14.211149762211765 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 5686\r\n",
"Update Running Trial - Id: 7 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n",
"Update Running Trial - Id: 7 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n",
"Update Completed Trial - Id: 7 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 3093\r\n",
"Update Completed Trial - Id: 7 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 3093\r\n",
"Update Running Trial - Id: 8 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Running Trial - Id: 8 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Completed Trial - Id: 8 - Metric: 14.226784641417762 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 7934\r\n",
"Update Completed Trial - Id: 8 - Metric: 14.226784641417762 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 7934\r\n",
"Update Running Trial - Id: 9 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n",
"Update Running Trial - Id: 9 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n",
"Update Completed Trial - Id: 9 - Metric: 5.040059189971107 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 4136\r\n",
"Update Completed Trial - Id: 9 - Metric: 5.040059189971107 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 4136\r\n",
"Update Running Trial - Id: 10 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Running Trial - Id: 10 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Completed Trial - Id: 10 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3233\r\n",
"Update Completed Trial - Id: 10 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3233\r\n",
"Update Running Trial - Id: 11 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Running Trial - Id: 11 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n",
"Update Completed Trial - Id: 11 - Metric: 13.89212957681258 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 3765\r\n",
"Update Completed Trial - Id: 11 - Metric: 13.89212957681258 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 3765\r\n",
"Update Running Trial - Id: 12 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Running Trial - Id: 12 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Completed Trial - Id: 12 - Metric: 4.442418125775676 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 6165\r\n",
"Update Completed Trial - Id: 12 - Metric: 4.442418125775676 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 6165\r\n",
"Update Running Trial - Id: 13 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Running Trial - Id: 13 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n",
"Update Completed Trial - Id: 13 - Metric: 14.50397627304784 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 3285\r\n",
"Update Completed Trial - Id: 13 - Metric: 14.50397627304784 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 3285\r\n",
"Update Running Trial - Id: 14 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n",
"Update Running Trial - Id: 14 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n"
]
}
],
"source": [
"var result = await experiment.RunAsync();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Display evaluation metric for the best model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RMSE: 2.775531046547196\r\n"
]
}
],
"source": [
"Console.WriteLine($\"RMSE: {result.Metric}\");"
]
},
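{
"cell_type": "markdown",
"metadata": {},
"source": [
"The experiment reports only the metric it was configured to optimize (`RegressionMetric.RootMeanSquaredError`). A minimal sketch of computing further regression metrics, including R-Squared, by scoring the validation set with the best model. It assumes the variables from the earlier cells and that the model writes its predictions to the default `Score` column:\n",
"\n",
"```csharp\n",
"using System;\n",
"using Microsoft.ML;\n",
"\n",
"// Assumption: `ctx`, `result`, `validation`, and `columnInferenceResults` come from the earlier cells.\n",
"var predictions = result.Model.Transform(validation);\n",
"\n",
"// Evaluate reads the label column and the default \"Score\" column from the scored data.\n",
"var metrics = ctx.Regression.Evaluate(\n",
"    predictions,\n",
"    labelColumnName: columnInferenceResults.ColumnInformation.LabelColumnName);\n",
"\n",
"Console.WriteLine($\"RMSE: {metrics.RootMeanSquaredError}\");\n",
"Console.WriteLine($\"MAE: {metrics.MeanAbsoluteError}\");\n",
"Console.WriteLine($\"R-Squared: {metrics.RSquared}\");\n",
"```"
]
},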
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Save the best model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"ctx.Model.Save(result.Model,idv.Schema,\"taxi-fare.mlnet\");"
]
}
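,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A saved model can be loaded back and used for scoring. The following is a sketch rather than part of the original notebook: it assumes the file written by the previous cell exists, that `ctx` and `validation` are still in scope, and that predictions land in the default `Score` column:\n",
"\n",
"```csharp\n",
"using System;\n",
"using System.Linq;\n",
"using Microsoft.ML;\n",
"using Microsoft.ML.Data;\n",
"\n",
"// Assumption: \"taxi-fare.mlnet\" was written by the previous cell.\n",
"var loadedModel = ctx.Model.Load(\"taxi-fare.mlnet\", out DataViewSchema inputSchema);\n",
"\n",
"// Score the validation rows and print the first few predicted fares.\n",
"var scored = loadedModel.Transform(validation);\n",
"foreach (var fare in scored.GetColumn<float>(\"Score\").Take(5))\n",
"{\n",
"    Console.WriteLine(fare);\n",
"}\n",
"```\n",
"\n",
"The `inputSchema` returned by `Load` is the schema passed to `Save`, which can help confirm that new data has the expected columns."
]
}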
],
"metadata": {
"kernelspec": {
"display_name": ".NET (C#)",
"language": "C#",
"name": ".net-csharp"
},
"language_info": {
"name": "C#"
}
},
"nbformat": 4,
"nbformat_minor": 2
}