@imshashank, last active August 29, 2015
# Using the Hopdata API

HOST: http://api.hopdata.com/

Machine Learning, Data Mining and Artificial Intelligence algorithms as a Service

Tag lines: Simplifying machine learning. Bringing the power of AI to startups and developers. Machine learning for hackers.

Our goal is to create a machine learning platform that is simple to integrate, yet powerful enough to provide high accuracy and a low-latency API.

Such a system provides Data Mining, Machine Learning and Artificial Intelligence algorithms as a service. The system can create a training model from datasets uploaded as a training set, and can then classify similar datasets in the future using the saved model.

## Algorithms Implemented

#### KNN
Performs k-nearest neighbors classification of the data based on previously uploaded and trained training data. The k value is set to the number of labels in the training dataset.
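Hopdata does not publish its implementation, but as a rough illustration of what a KNN endpoint computes, here is a minimal pure-Python sketch (the toy data and function name are ours, not part of the API):

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy dataset: two small clusters labelled "a" and "b".
train = [([0.0, 0.0], "a"), ([0.1, 0.2], "a"),
         ([1.0, 1.0], "b"), ([0.9, 1.1], "b")]
print(knn_predict(train, [0.2, 0.1], k=2))  # a query point near the "a" cluster
```

In the hosted service the equivalent of `train` is your uploaded dataset and `k` is chosen from the number of labels, as described above.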

#### K-Means Clustering and Train KNN
If the data is not labelled, we can cluster the dataset using the k-means algorithm and then train the KNN classifier on the clustered data. The data can then be classified against the clustered training set.
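The clustering step can be sketched in pure Python with Lloyd's algorithm; the resulting cluster ids then stand in for labels when training the KNN classifier. This is an illustrative sketch, not Hopdata's internal code:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm: return a cluster id (0..k-1) for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # pick k initial centers
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assign

points = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]]
labels = kmeans(points, k=2)
# The cluster ids now act as labels for training a KNN classifier:
train = list(zip(points, labels))
```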

#### Naive Bayes (Bernoulli, Gaussian and Multinomial)
Provides three types of Naive Bayes classification. Users can choose the algorithm that fits their requirements and train a model; once trained, they can classify data using that model.
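As a concrete example of one of the three variants, here is a minimal Bernoulli Naive Bayes sketch (presence/absence features with Laplace smoothing). The spam/ham toy data is ours; the service's actual training format is the CSV source described below:

```python
import math

def train_bernoulli_nb(docs, labels, vocab):
    """Estimate P(label) and P(word present | label) with Laplace smoothing."""
    priors, cond = {}, {}
    for y in set(labels):
        rows = [d for d, l in zip(docs, labels) if l == y]
        priors[y] = len(rows) / len(docs)
        cond[y] = {w: (sum(w in d for d in rows) + 1) / (len(rows) + 2)
                   for w in vocab}
    return priors, cond

def predict_bernoulli_nb(doc, priors, cond, vocab):
    """Score each label with log P(y) + sum of log P(word present/absent | y)."""
    def score(y):
        s = math.log(priors[y])
        for w in vocab:
            p = cond[y][w]
            s += math.log(p if w in doc else 1 - p)
        return s
    return max(priors, key=score)

docs = [{"free", "win"}, {"win", "now"}, {"meeting", "notes"}, {"project", "notes"}]
labels = ["spam", "spam", "ham", "ham"]
vocab = {"free", "win", "now", "meeting", "notes", "project"}
priors, cond = train_bernoulli_nb(docs, labels, vocab)
print(predict_bernoulli_nb({"win", "free"}, priors, cond, vocab))  # prints "spam"
```

Gaussian and multinomial variants differ only in how `cond` models the features (normal densities over continuous values, or smoothed count frequencies).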

### Source
A source is the CSV-formatted file that you upload to train the model. The training source is comma separated, and the first column is the label when training a model.

Download our sample dataset from here: http://hopdata.com/sample.csv. This dataset trains the model on different languages.
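The expected layout (label first, features after) can be checked locally with the standard `csv` module. The inline rows below are a hypothetical stand-in, not the contents of the real sample file:

```python
import csv
import io

# Hypothetical stand-in for a training source; the real sample lives at
# http://hopdata.com/sample.csv.
sample = """english,the quick brown fox
french,le renard brun rapide
english,hello world
"""

labels, rows = [], []
for record in csv.reader(io.StringIO(sample)):
    labels.append(record[0])   # first column is the label
    rows.append(record[1:])    # remaining columns are the features

print(labels)  # prints ['english', 'french', 'english']
```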

### Dataset
We convert the dataset to a "patent pending" format which helps us extract all the features from the CSV file and improve overall accuracy.

### Model
The model is the final trained machine learning model created from the datasets. When an API call is made, the model is used to compute the prediction.

### Tasks
You can see all scheduled jobs here. Any request for creating a dataset or model is sent here. Based on your account, different tasks have different priorities.

## Tutorial

  1. Upload a source here; see our "Source Guidelines" to make sure that you upload a valid source.

You may use one of our sample sources: Twitter Sentiment or Adult Income Prediction.

  2. Creating a dataset.
     i) Go to the dataset page and click on "New Dataset".
     ii) You will see the list of uploaded sources. Enter a name for the dataset and select the source from the dropdown list.
     iii) The "Dataset Wizard" will start. Select whether your file has a header.
     iv) The wizard will ask you to select the appropriate data type for each column:
        a) Numerical -> all the entries are continuous and represent numerical quantities.
        b) String -> the column has only string values.
        c) Categorical -> the column has discrete values which can represent some category.

     v) On the new page, select the "target" column which you want the model to predict. If you selected "no header" on the previous page, you will see newly created column names; these are the column labels. Once done, click Submit.

     vi) A new "Task" will be created and sent to the job scheduler for processing. Based on your account, jobs have different priorities. Once the job is done you will see it on the dataset page.

  3. Creating a Model
     i) Go to the dataset page and click on "New Model".
     ii) Enter a name for the Model and select from the dropdown list the Dataset you want to create the model from.
     iii) A new "Task" will be created and sent to the job scheduler for processing. Based on your account, jobs have different priorities. Once the job is done you will see it on the models page.

  4. Using the API
     i) Go to the Predict->API page and note the API_KEY.
     ii) You can use one of our sample codes from the API page.
     iii) To get the predicted class, send a POST request with the appropriate headers to "api.hopdata.com".
     iv) On successful request processing, a JSON response is returned.
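The request described above can be sketched with the standard library. Note that the `/predict` path, the `X-API-Key` header name, and the JSON body layout here are assumptions for illustration only; copy the exact values from the sample code on the API page:

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # taken from the Predict->API page

def build_predict_request(features):
    """Build (but do not send) a prediction POST request.

    The path, header name, and body shape are illustrative assumptions,
    not the documented Hopdata wire format.
    """
    body = json.dumps({"data": features}).encode("utf-8")
    return urllib.request.Request(
        "http://api.hopdata.com/predict",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": API_KEY},
        method="POST",
    )

req = build_predict_request(["hello world"])
# urllib.request.urlopen(req) would send it; the JSON response body can then
# be read with json.load(...) to obtain the predicted class.
```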
