Devon turingDH

## Supervised_SKLearn.ipynb

      
SKLearn Supervised Learning

Created March 28, 2019 21:34
    
## sum_square_Pool.py
# Credit to LucidProgramming https://www.youtube.com/watch?v=u2jTn-Gj2Xw for the walkthrough. My changes weren't terribly significant.

import os # to get core count on machine
import time  # to time the duration

from multiprocessing import Pool  # to instantiate a Pool of workers to distribute the process across the cores in CPU


def sum_square(number):
    s=0
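
The preview above stops right after `s=0`, so the rest of the script is not visible here. As a rough sketch only, assuming from the filename that `sum_square` adds up the squares of the integers below `number` and that the work is spread over a `Pool`, the whole thing might look something like this. The loop body, the helper `sum_square_with_pool`, and the input range are assumptions, not the author's code.

```python
import os
import time
from multiprocessing import Pool


def sum_square(number):
    """Return the sum of i*i for every i in range(number) (assumed behavior)."""
    s = 0
    for i in range(number):
        s += i * i
    return s


def sum_square_with_pool(numbers, processes=None):
    """Hypothetical helper: map sum_square over numbers with a worker pool."""
    # Default to one worker per core reported by the OS.
    with Pool(processes=processes or os.cpu_count()) as pool:
        return pool.map(sum_square, numbers)


if __name__ == "__main__":
    numbers = range(1000)  # illustrative workload, not from the original gist
    start = time.perf_counter()
    results = sum_square_with_pool(numbers)
    elapsed = time.perf_counter() - start
    print(f"Processed {len(results)} inputs in {elapsed:.3f} s using a Pool")
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by spawning a fresh interpreter, since the module is re-imported in each worker.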

## gist:80a91745a7d60d4272486c0618a91476
library(dplyr)
library(DT)
library(purrr)
library(tidyr)

phraseToSearch <- 'this|that'

scriptFiles <-
  bind_rows(
    map_dfc('~/R', list.files, full.names = T) %>% rename(fileNames = V1),
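
The R preview is cut off mid-expression, so what it does with the file list is not shown. Purely as an illustration, assuming the goal is to find which script files mention `phraseToSearch`, a comparable search in Python could be sketched as below. The function name `find_matching_lines` and the returned tuple layout are hypothetical.

```python
import re
from pathlib import Path


def find_matching_lines(root, pattern):
    """Return (path, line_number, line) for each matching line in files directly under root."""
    root_path = Path(root).expanduser()
    if not root_path.is_dir():
        return []
    regex = re.compile(pattern)
    matches = []
    # Non-recursive listing, mirroring the default behavior of R's list.files().
    for path in sorted(root_path.iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                matches.append((str(path), line_number, line.strip()))
    return matches


if __name__ == "__main__":
    # Same directory and pattern as the R preview.
    for path, line_number, line in find_matching_lines("~/R", "this|that"):
        print(f"{path}:{line_number}: {line}")
```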

## sparklyr_cv_pipeline_example.R
# Load packages
library(dplyr)
library(sparklyr)

# Set up the connection
sc <- spark_connect(master = "local")

# Create a Spark DataFrame of mtcars
mtcars_sdf <- copy_to(sc, mtcars)

## gist:54fa4d3a712760ccba15ccb7ebaea8d1
R to Python: useful data wrangling snippets

The dplyr package in R makes data wrangling significantly easier.
The beauty of dplyr is that, by design, the options available are limited.
Specifically, a set of key verbs form the core of the package.
Using these verbs you can solve a wide range of data problems effectively in a shorter timeframe.
While transitioning to Python, I have greatly missed the ease with which I can think through and solve problems using dplyr in R.
The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data using Python (with the pandas package).

dplyr is organised around six key verbs
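
The preview ends before the verbs are listed. The verbs usually named for dplyr are `filter`, `select`, `arrange`, `mutate`, `summarise`, and `group_by`; assuming those are the six meant here, a minimal pandas sketch of each could look like this. The toy data frame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data, not taken from the original gist.
df = pd.DataFrame({
    "model": ["a", "b", "c", "d"],
    "cyl": [4, 6, 4, 8],
    "mpg": [30.0, 20.0, 26.0, 15.0],
})

# filter(): keep rows that satisfy a condition
filtered = df[df["cyl"] == 4]

# select(): keep a subset of columns
selected = df[["model", "mpg"]]

# arrange(): sort rows, here by mpg in descending order
arranged = df.sort_values("mpg", ascending=False)

# mutate(): add a column derived from existing ones
mutated = df.assign(mpg_per_cyl=df["mpg"] / df["cyl"])

# summarise(): collapse a column to a single value
mean_mpg = df["mpg"].mean()

# group_by() followed by summarise(): one summary row per group
by_cyl = df.groupby("cyl", as_index=False).agg(mean_mpg=("mpg", "mean"))

print(by_cyl)
```

Each step returns a new object and leaves `df` unchanged, which is close to how dplyr pipelines behave.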

## gist:e13650caa08c6b65388e915d14ac29a4
import numpy as np
import pandas as pd

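# Lowercase Greek letters: code points 945 (alpha) through 969 (omega)
greeks = [chr(code) for code in range(945,970)]
Greeks = ['alpha', 'beta', 'gamma', 'delta', 'epsilon', 'zeta', 'eta', 'theta', 'iota', 'kappa', 'lambda', 'mu', 'nu', 'xi', 'omicron', 'pi', 'rho', 'word-final sigma', 'sigma', 'tau', 'upsilon', 'phi', 'chi', 'psi', 'omega']

# The names start out as the index; the first reset_index() turns them into a column,
# and the second adds a 0-based row counter as the leading column
df = pd.DataFrame(greeks, Greeks).reset_index().reset_index()
df.rename(columns={df.columns[0]:"chr_val", df.columns[1]:"greek text", df.columns[2]:"greek symbol"}, inplace=True)
df['chr_val'] += 945  # shift the 0-based counter to the actual code point
print(df)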

## AirQuality_Regression.ipynb

      
UCI Air Quality: initial EDA

Last active March 30, 2018 00:39
    
## LogisticRegressionRanges_R_Screen.png

      
Logistic Regression: probability vs. odds vs. log odds

Last active February 9, 2018 03:52
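
The gist itself is an image and an R file that are not shown in this listing. For readers who want the relationship the title refers to, here is a small sketch of the standard conversions between probability, odds, and log odds. The function names are hypothetical and none of this is taken from the author's files.

```python
import math


def prob_to_odds(p):
    """Odds in favor of an event with probability p (requires 0 <= p < 1)."""
    return p / (1 - p)


def odds_to_log_odds(odds):
    """Natural log of the odds, also called the logit (requires odds > 0)."""
    return math.log(odds)


def log_odds_to_prob(z):
    """Inverse of the logit: the logistic (sigmoid) function."""
    return 1 / (1 + math.exp(-z))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        odds = prob_to_odds(p)
        log_odds = odds_to_log_odds(odds)
        print(f"p={p:.2f}  odds={odds:.3f}  log odds={log_odds:.3f}")
```

Probability is bounded between 0 and 1 and odds are bounded below by 0, while log odds range over the whole real line, which is why logistic regression models the log odds as a linear function of the predictors.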