Using the Python bindings and their local cluster object to find the centroids for the rows of a CSV file.
Usage:
cat input.csv | python local_batch_centroid.py > centroids.csv
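The stdin-to-stdout flow in the usage line can be sketched as follows. This is a minimal sketch, not the script itself: `assign_centroids` and the stand-in `fake_centroid` are hypothetical names, and it assumes the local cluster object offers a call that takes a dict of field values and returns a dict with a `centroid_name` entry.

```python
import csv
import io


def assign_centroids(infile, outfile, find_centroid):
    """Copy CSV rows from infile to outfile, appending a centroid column.

    find_centroid is any callable that maps a {field: value} dict to a
    dict with a "centroid_name" key (assumed shape of the result).
    """
    reader = csv.DictReader(infile)
    fields = list(reader.fieldnames or [])
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(fields + ["centroid"])
    for row in reader:
        result = find_centroid(row)
        writer.writerow([row[name] for name in fields] + [result["centroid_name"]])


def fake_centroid(row):
    # Hypothetical stand-in for the local cluster object: a fixed threshold
    # on one field, used only so the sketch runs without a trained cluster.
    name = "Cluster 0" if float(row["petal length"]) < 2.5 else "Cluster 1"
    return {"centroid_name": name}


# Small in-memory demo; the real script would pass sys.stdin and sys.stdout.
demo_in = io.StringIO("sepal length,petal length\n5.1,1.4\n6.3,4.9\n")
demo_out = io.StringIO()
assign_centroids(demo_in, demo_out, fake_centroid)
print(demo_out.getvalue(), end="")
```

Passing the centroid lookup in as a callable keeps the CSV handling separate from the clustering, so the same loop works with `sys.stdin` and `sys.stdout` or with any pair of open files.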
{
  "name": "Unite datasets",
  "description": "Creating a dataset that contains the data in several datasets tagged with an identifier.",
  "inputs": [
    {
      "name": "tag",
      "type": "string",
      "description": "tag identifier"
    }
  ],
{
  "name": "Forecast dataset",
  "description": "Creating a dataset that contains the original fields and the forecast prediction for a certain ets-model",
  "inputs": [
    {
      "name": "timeseries-id",
      "type": "timeseries-id",
      "description": "Select the timeseries"
    },
    {
{
  "description": "Script that updates the field types for an existing Source. The `base-type` is assigned to all the fields unless otherwise stated in `explicit-types`.\nThe `explicit-types` argument expects a list of [field, type] pairs like:\n\n[[\"field1\", \"categorical\"], [\"field2\", \"numeric\"]]",
  "inputs": [
    {
      "name": "source-id",
      "type": "source-id",
      "description": "Select the source to be updated"
    },
    {
      "name": "base-type",
sepal length | sepal width | petal length | petal width | species
---|---|---|---|---
5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa
4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa
4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa
4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa
5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa
5.4 | 3.9 | 1.7 | 0.4 | Iris-setosa
4.6 | 3.4 | 1.4 | 0.3 | Iris-setosa
5.0 | 3.4 | 1.5 | 0.2 | Iris-setosa
4.4 | 2.9 | 1.4 | 0.2 | Iris-setosa
;; Here's a custom generator for creating BigML ensembles. As
;; "random_candidate_ratio" tends towards 1, the ensemble becomes a
;; bag.
(define (smacdown-ensemble--model-params-generator objective-type)
  (lambda ()
    (let (max-trees 127
          max-nodes 1999
          regression (= "numeric" objective-type))
      {"random_candidate_ratio" (rand)
       "stat_pruning" (if (< (rand) 0.5) false true)
Using the Python bindings and their local model object, you can create predictions for test data stored in any local file. In the example, the CSV data is read from stdin and the predictions are written to stdout, but this can easily be changed to use any local file.
The command options available are:
  -h, --help            show the help message and exit
  --delimiter DELIMITER
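The option listing above follows the layout that Python's `argparse` prints for its help message. A minimal sketch of a parser that would accept these options is given here. The listing is cut off, so the default delimiter, the help text for `--delimiter`, the description string, and the function name `build_parser` are all assumptions rather than the script's actual values.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Create predictions for CSV rows read from stdin."
    )
    # argparse adds -h/--help automatically.
    parser.add_argument(
        "--delimiter",
        default=",",  # assumed default, not taken from the original script
        help="field delimiter of the CSV data (assumed wording)",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--delimiter", ";"])
print(args.delimiter)
```

The parsed value could then be handed to `csv.reader(..., delimiter=args.delimiter)` when reading the input.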
{"model": {"mean_squared_error_standard_deviation": 39.620234999999994, "average_mean_squared_error": 225.551395, "mean_absolute_error_standard_deviation": 1.6843499999999993, "r_squared_standard_deviation": 0.13543, "average_mean_absolute_error": 12.1284, "average_r_squared": 0.44447}, "random": {"mean_squared_error_standard_deviation": 218.02284500000002, "average_mean_squared_error": 944.4215550000001, "mean_absolute_error_standard_deviation": 4.773059999999999, "r_squared_standard_deviation": 0.693865, "average_mean_absolute_error": 24.850749999999998, "average_r_squared": -1.335135}, "mean": {"mean_squared_error_standard_deviation": 29.405699999999996, "average_mean_squared_error": 413.17726000000005, "mean_absolute_error_standard_deviation": 0.9322850000000003, "r_squared_standard_deviation": 0.0, "average_mean_absolute_error": 17.066145, "average_r_squared": 0.0}}
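The result above reports averaged metrics for the model next to two other entries, `random` and `mean`, which appear to be baselines. One way to read such a result is sketched below: it loads a trimmed copy of the JSON (only the `average_*` keys, with values copied from the output) and checks, metric by metric, whether the model does better than both. The helper name `beats_baselines` and the `LOWER_IS_BETTER` table are hypothetical.

```python
import json

# Trimmed copy of the result: only the averaged metrics.
result = json.loads("""
{"model": {"average_mean_squared_error": 225.551395,
           "average_mean_absolute_error": 12.1284,
           "average_r_squared": 0.44447},
 "random": {"average_mean_squared_error": 944.4215550000001,
            "average_mean_absolute_error": 24.850749999999998,
            "average_r_squared": -1.335135},
 "mean": {"average_mean_squared_error": 413.17726000000005,
          "average_mean_absolute_error": 17.066145,
          "average_r_squared": 0.0}}
""")

# Error metrics are better when lower; R squared is better when higher.
LOWER_IS_BETTER = {
    "average_mean_squared_error": True,
    "average_mean_absolute_error": True,
    "average_r_squared": False,
}


def beats_baselines(result, metric):
    """Return True if the model is better than both other entries on metric."""
    model = result["model"][metric]
    baselines = [result[name][metric] for name in ("random", "mean")]
    if LOWER_IS_BETTER[metric]:
        return all(model < value for value in baselines)
    return all(model > value for value in baselines)


for metric in LOWER_IS_BETTER:
    print(metric, beats_baselines(result, metric))
```

This check ignores the reported standard deviations, so it only compares the averages.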