Vijay Anand Pandian vijayanandrp

## tutorial_with_solutions.md

      
Created: December 15, 2017 07:42
PyCon 2016 tutorial by Kevin Markham.

Tutorial: Machine Learning with Text in scikit-learn

Agenda

1. Model building in scikit-learn (refresher)
2. Representing text as numerical data
3. Reading a text-based dataset into pandas
4. Vectorizing our dataset
5. Building and evaluating a model
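To make the "representing text as numerical data" and "vectorizing" agenda items concrete, here is a minimal bag-of-words sketch in plain Python. It is only an illustration of the idea, not the tutorial's own code: the helper names `build_vocabulary` and `vectorize` and the three sample sentences are assumptions made for this example, and the whitespace tokenization is simpler than what scikit-learn's vectorizers do.

```python
from collections import Counter


def build_vocabulary(documents):
    """Collect every distinct lowercase token, in sorted order."""
    tokens = {tok for doc in documents for tok in doc.lower().split()}
    return sorted(tokens)


def vectorize(documents, vocabulary):
    """Turn each document into a list of token counts, one column per vocabulary entry."""
    rows = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        rows.append([counts[tok] for tok in vocabulary])
    return rows


# Hypothetical sample documents, chosen only for illustration.
docs = ['call you tonight', 'Call me a cab', 'please call me PLEASE']
vocab = build_vocabulary(docs)
matrix = vectorize(docs, vocab)

print(vocab)
for row in matrix:
    print(row)
```

Each row of `matrix` has one count per vocabulary word, so documents of different lengths all map to vectors of the same size, which is the shape a scikit-learn estimator expects as input.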


## gender_analysis.md

      
Last active: December 7, 2017 12:35
Data wrangling tool with a simple example: https://informationcorners.com/ml-002-data-wrangling-2/

Data wrangling

Data wrangling, sometimes referred to as data munging, is the process of transforming and mapping data from one "raw" data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics. A data wrangler is a person who performs these transformation operations. (Source: Wikipedia)

Wrangler is an interactive tool for data cleaning and transformation. Spend less time formatting and more time analyzing your data. (Source: Stanford)

Example - 1


## age_analysis.md

      
Last active: December 7, 2017 12:32
Data wrangling tool with a simple example: https://informationcorners.com/ml-002-data-wrangling-1/


## gender_analysis.ipynb

      
Last active: December 7, 2017 10:23
Data wrangling tool with a simple example: https://informationcorners.com/ml-002-data-wrangling-2/
      
    
## age_analysis.ipynb

      
Created: December 6, 2017 00:33
Data wrangling tool with a simple example: https://informationcorners.com/ml-002-data-wrangling-1/
      
    
## Evaluation of the model used.md

      
Last active: November 30, 2017 15:05
https://informationcorners.com/ml-001-name-text-gender-predictor-classifier/

This is a manual mode in which you can test the prediction model with names entered at runtime.

import sys


def model_evaluation(classifier):
    print('<<<  Testing Module   >>> ')
    print('Enter "q" or "quit" to end testing module')
    while True:
        test_name = input('\n Enter name to classify: ')
        # Quitting is a normal exit, so report success (status 0) rather than an error.
        if test_name.lower() in ('q', 'quit'):
            print('End')
            sys.exit(0)
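The preview above stops before the name is actually classified. As a rough sketch of how such an interactive loop could be completed, the version below takes any callable that maps a name to a label. The names `run_manual_test` and `toy_classify` are hypothetical, and the toy rule (names ending in "a" are labelled female) is only a stand-in for a trained classifier, not the author's model. Input and output functions are passed in so the loop can be exercised without a keyboard.

```python
def toy_classify(name):
    """Hypothetical stand-in for a trained model: a crude last-letter rule."""
    return 'female' if name.lower().endswith('a') else 'male'


def run_manual_test(classify, read_name=input, write=print):
    """Prompt for names until 'q' or 'quit'; return the (name, label) pairs seen."""
    results = []
    write('Enter "q" or "quit" to end testing module')
    while True:
        name = read_name('\n Enter name to classify: ').strip()
        if name.lower() in ('q', 'quit'):
            write('End')
            return results
        if not name:
            # Ignore empty input instead of classifying an empty string.
            continue
        label = classify(name)
        write('{} is classified as {}'.format(name, label))
        results.append((name, label))


if __name__ == '__main__':
    run_manual_test(toy_classify)
```

Returning from the function instead of exiting the interpreter keeps the loop reusable from other code; with an NLTK classifier, `classify` would presumably wrap a call such as `classifier.classify(extract_feature(name))`.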

  
## Applying Machine learning algorithm.md

      
Created: November 30, 2017 14:44
https://informationcorners.com/ml-001-name-text-gender-predictor-classifier/

import random


def train_and_test(train_percent=0.80):
    # prepare_data_set() and validate_data_set() are not shown in this preview.
    feature_set = prepare_data_set()
    validate_data_set(feature_set)
    # Shuffle in place so the split is not biased by the original ordering.
    random.shuffle(feature_set)
    total = len(feature_set)
    cut_point = int(total * train_percent)
    # Split the dataset into train and test portions.
    train_set = feature_set[:cut_point]
    test_set = feature_set[cut_point:]
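The shuffle-then-slice split used above can be tried in isolation. The following is a minimal, self-contained sketch under a few assumptions: the function name `split_train_test` and the optional `seed` parameter are introduced here for illustration and do not appear in the original gist.

```python
import random


def split_train_test(items, train_percent=0.80, seed=None):
    """Shuffle a copy of items and cut it into (train, test) lists."""
    rng = random.Random(seed)      # a private generator keeps the split reproducible
    shuffled = list(items)         # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    cut_point = int(len(shuffled) * train_percent)
    return shuffled[:cut_point], shuffled[cut_point:]


train_set, test_set = split_train_test(range(10), train_percent=0.80, seed=0)
print(len(train_set), len(test_set))
```

Because every item lands in exactly one of the two slices, the train and test sets never overlap, which is what makes accuracy measured on the test set meaningful.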

  
## feature_extraction.md

      
Last active: November 30, 2017 14:57
https://informationcorners.com/ml-001-name-text-gender-predictor-classifier

Extracting features (also called attributes, inputs, or predictors) from a given name string.
def extract_feature(name: str):
    name = name.upper()
    feature = dict()
    
    # additional feature extraction
    # feature["first_1"] = name[0]
    # for letter in 'abcdefghijklmnopqrstuvwxyz'.upper():
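Since the preview is cut off before any features are filled in, here is a small sketch of what a name-to-features function of this kind might look like. It is an assumption-laden illustration, not the author's implementation: the function name `sketch_extract_feature` and the feature keys `last_1`, `last_2`, and `count_<LETTER>` are hypothetical, with only `first_1` and the per-letter loop hinted at by the commented-out lines above.

```python
import string


def sketch_extract_feature(name: str) -> dict:
    """Map a name to a dict of simple character-level features."""
    name = name.strip().upper()
    if not name:
        return {}
    feature = {
        'first_1': name[0],    # first letter
        'last_1': name[-1],    # last letter
        'last_2': name[-2:],   # last two letters (whole name if shorter)
    }
    # One count feature per letter of the alphabet.
    for letter in string.ascii_uppercase:
        feature['count_' + letter] = name.count(letter)
    return feature


print(sketch_extract_feature('Anna')['last_2'])
```

A dict of named features is the input format NLTK classifiers such as `NaiveBayesClassifier` expect, which is presumably why the original function builds one.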

  
## prepare_dataset.md

      
Last active: November 30, 2017 14:28
https://informationcorners.com/ml-001-name-text-gender-predictor-classifier/

You can download the dataset here.
#!/usr/bin/env python3.5
# -*- coding: utf-8 -*-

import os
import random
from zipfile import ZipFile
from nltk import NaiveBayesClassifier, MaxentClassifier, DecisionTreeClassifier, classify

  
## read_email.py
# -*- coding: utf-8 -*-

import re
import email
import smtplib
import mimetypes
from email.mime.multipart import MIMEMultipart
from email import encoders
from email.mime.audio import MIMEAudio
from email.mime.base import MIMEBase