| <component name="InspectionProjectProfileManager"> | |
| <profile version="1.0"> | |
| <option name="myName" value="Project Default" /> | |
| <inspection_tool class="PyArgumentEqualDefaultInspection" enabled="true" level="WEAK WARNING" enabled_by_default="true" /> | |
| <inspection_tool class="PyAugmentAssignmentInspection" enabled="true" level="WEAK WARNING" enabled_by_default="true" /> | |
| <inspection_tool class="PyClassicStyleClassInspection" enabled="true" level="WARNING" enabled_by_default="true" /> | |
| <inspection_tool class="PyCompatibilityInspection" enabled="true" level="WARNING" enabled_by_default="true"> | |
| <option name="ourVersions"> | |
| <value> | |
| <list size="2"> |
| import math | |
| import quandl | |
| import pandas as pd | |
| df = quandl.get('WIKI/GOOGL') | |
| df = df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']] | |
| # High-to-close spread as a percentage of the close | |
| df['HL_PCT'] = (df['Adj. High'] - df['Adj. Close']) / df['Adj. Close'] * 100.0 | |
| # Open-to-close change as a percentage of the open | |
| df['HL_Change'] = (df['Adj. Close'] - df['Adj. Open']) / df['Adj. Open'] * 100.0 | |
| df = df[['Adj. Close', 'HL_PCT', 'HL_Change', 'Adj. Volume']] | |
| forecast_col = 'Adj. Close' |
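The two percentage features depend on where the parentheses sit, so they are worth checking on hand-made numbers. The sketch below uses plain Python rather than pandas and assumes the intended formulas are (high - close) / close * 100 and (close - open) / open * 100. The helper names `hl_pct` and `pct_change` and the sample prices are hypothetical, not part of the gist.

```python
def hl_pct(high: float, close: float) -> float:
    """High-to-close spread as a percentage of the close."""
    return (high - close) / close * 100.0


def pct_change(open_: float, close: float) -> float:
    """Open-to-close change as a percentage of the open."""
    return (close - open_) / open_ * 100.0


# Hypothetical prices for one trading day.
day = {"open": 100.0, "high": 110.0, "close": 105.0}

spread = hl_pct(day["high"], day["close"])
change = pct_change(day["open"], day["close"])

# Without the parentheses, division and multiplication bind before the
# subtraction, so the result is no longer a percentage of anything.
unparenthesized = day["high"] - day["close"] / 100 * day["close"]

print(f"HL_PCT={spread:.2f} change={change:.2f} unparenthesized={unparenthesized:.2f}")
```

The same grouping carries over unchanged to pandas columns, since arithmetic on a Series follows Python's operator precedence.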
| This gist includes the components of an Oozie workflow: scripts/code, sample data, | |
| and commands. Oozie actions covered: shell action, email action. | |
| Action 1: The shell action runs a shell script that counts the lines in the files matching a | |
| given glob and writes the line count to standard output. | |
| Action 2: The email action emails the output of action 1. | |
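Action 1 boils down to counting lines across the files that match a glob and printing the total. The gist does this in a shell script; the Python sketch below only illustrates the same logic. The function name `count_lines`, the `lineCount` key, and the sample files are assumptions made for illustration. The `key=value` output format reflects that an Oozie shell action configured with `<capture-output/>` reads standard output as Java properties, which is how a later action such as the email can pick up the value.

```python
import glob
import os
import tempfile


def count_lines(pattern: str) -> int:
    """Return the total number of lines in all files matching the glob."""
    total = 0
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", encoding="utf-8") as handle:
            total += sum(1 for _ in handle)
    return total


# Build two small hypothetical input files so the sketch runs anywhere.
workdir = tempfile.mkdtemp()
for name, body in [("a.log", "one\ntwo\n"), ("b.log", "three\n")]:
    with open(os.path.join(workdir, name), "w", encoding="utf-8") as handle:
        handle.write(body)

line_count = count_lines(os.path.join(workdir, "*.log"))

# Emit the result as key=value so it can be captured as a property.
print(f"lineCount={line_count}")
```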
| Pictorial overview of job: | |
| -------------------------- |
| Sample Pig script; runs fine in local mode. The Elephant Bird magic is the JsonLoader() in the LOAD command, followed by | |
| converting user to a Java map so that I can extract screen_name. I haven't read the docs yet, so there may be a better way to do this. I'm sure I can combine the two GENERATE statements into one; this is just a first attempt. | |
| REGISTER '/Users/nkodner/Downloads/cdh3/elephant-bird/build/elephant-bird-2.2.4-SNAPSHOT.jar'; | |
| REGISTER '/Users/nkodner/Downloads/cdh3/pig-0.8.1-cdh3u4/contrib/piggybank/java/lib/json-simple-1.1.jar'; | |
| REGISTER '/Users/nkodner/Downloads/cdh3/pig-0.8.1-cdh3u4/build/ivy/lib/Pig/guava-r06.jar'; | |
| raw = LOAD '/Users/nkodner/clean_tweets/with_deletedaa' using com.twitter.elephantbird.pig.load.JsonLoader(); | |
| bah = limit raw 100; | |
| cc = foreach bah generate (chararray)$0#'text' as text,(long)$0#'id' as id,com.twitter.elephantbird.pig.piggybank.JsonStringToMap($0#'user') as user; | |
| dd = foreach cc generat |
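For comparison, the extraction described in the note above (tweet text, id, and the user's screen_name from line-delimited JSON) can be sketched in plain Python. This is not Elephant Bird code and does not complete the truncated statement; the function name `extract_fields` and the two sample records are hypothetical, and the 100-record cap mirrors the `limit raw 100` step.

```python
import json
from itertools import islice
from typing import Iterable, Iterator, Tuple


def extract_fields(lines: Iterable[str], limit: int = 100) -> Iterator[Tuple[str, int, str]]:
    """Yield (text, id, screen_name) for the first `limit` JSON records."""
    for line in islice(lines, limit):
        tweet = json.loads(line)
        # json.loads already returns nested dicts, so no separate
        # string-to-map conversion is needed for the user object.
        user = tweet.get("user") or {}
        yield tweet["text"], int(tweet["id"]), user.get("screen_name", "")


# Hypothetical records standing in for the tweet file.
sample = [
    '{"id": 1, "text": "hello", "user": {"screen_name": "alice"}}',
    '{"id": 2, "text": "world", "user": {"screen_name": "bob"}}',
]

rows = list(extract_fields(sample))
for text, tweet_id, screen_name in rows:
    print(tweet_id, screen_name, text)
```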