# first, make sure you have downloaded the Azure XPlat CLI
# http://azure.microsoft.com/en-us/documentation/articles/xplat-cli/#install
# e.g., sudo npm install azure-cli -g
# log in to Azure (will prompt for your username/password)
azure login
# show all of your storage accounts
azure storage account list
from pyspark.sql.functions import udf
import httplib, urllib, base64, json

def whichLanguage(text):
    headers = {
        # Request headers
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key': '{your subscription key here}',
    }
    params = urllib.urlencode({
text_file = sc.textFile("hdfs://sandbox.hortonworks.com/user/guest/install.log")
counts = text_file.flatMap(lambda line: line.split(" ")) \
    .map(lambda word: (word, 1)) \
    .reduceByKey(lambda a, b: a + b)
counts.saveAsTextFile("hdfs://sandbox.hortonworks.com/user/guest/output1.txt")
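The same flatMap / map / reduceByKey pipeline can be traced without a cluster. What follows is a minimal plain-Python sketch, assuming the input is a small list of lines held in memory. The `word_count` helper is a hypothetical name and is not part of Spark.

```python
from collections import defaultdict


def word_count(lines):
    """Mimic flatMap -> map -> reduceByKey on an in-memory list of lines."""
    # flatMap: split every line on a single space and flatten the result
    words = [word for line in lines for word in line.split(" ")]
    # map: pair each word with a count of 1
    pairs = [(word, 1) for word in words]
    # reduceByKey: sum the counts that share the same key
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

As in the Spark version, splitting on a single space means consecutive spaces produce empty-string "words" that are counted too.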
su hdfs
hadoop fs -mkdir /user/root
hadoop fs -chmod 777 /user/root
hadoop fs -chmod 777 /user/guest
exit
wget http://www.gutenberg.org/files/50831/50831-0.txt
# start of something to help me manage git repos...
# starting with a very basic list of them.
gci -recurse -force -include .git | select -Property Parent, {$_.Parent.FullName}
# ideal next step would potentially parse & return a list of remotes...
# once I get remotes, I should also output current branches
# serialize to a json doc and allow me to pick that back up elsewhere
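The next steps listed in the comments (current branches, JSON output) could look roughly like the sketch below. It is written in Python rather than PowerShell, and it assumes each repository has a regular `.git` directory whose `HEAD` file holds `ref: refs/heads/<branch>` when a branch is checked out. The function name `find_git_repos` and the output keys are hypothetical, and remotes are not parsed here.

```python
import json
import os


def find_git_repos(root):
    """Return a list of dicts describing each repository found under root."""
    repos = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".git" in dirnames:
            head_path = os.path.join(dirpath, ".git", "HEAD")
            branch = None
            if os.path.isfile(head_path):
                with open(head_path) as fh:
                    head = fh.read().strip()
                # A checked-out branch is stored as "ref: refs/heads/<name>"
                prefix = "ref: refs/heads/"
                if head.startswith(prefix):
                    branch = head[len(prefix):]
            repos.append({
                "name": os.path.basename(dirpath),
                "path": dirpath,
                "branch": branch,
            })
            # Do not descend into the .git directory itself
            dirnames.remove(".git")
    return repos


if __name__ == "__main__":
    # Serialize the result so it can be picked up elsewhere
    print(json.dumps(find_git_repos("."), indent=2))
```

A detached HEAD (a bare commit hash in `HEAD`) is reported with `branch` set to `null` in the JSON output.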
// sample U-SQL UDO (Processor) written in F#
// Note: currently (11/2015) requires deployment of FSharp.Core
namespace fSharpProcessor
open Microsoft.Analytics.Interfaces
type myProcessor() =
    inherit IProcessor()
' sample U-SQL UDO (Processor) written in VB.NET
Imports Microsoft.Analytics.Interfaces
Public Class vbProcessor
    Inherits IProcessor
    Private CountryTranslation As New Dictionary(Of String, String) From
        {{"Deutschland", "Germany"},
         {"Schwiiz", "Switzerland"},
-- assumes the table is my_json, with one column containing the entire JSON body
add file wasb:///example/apps/process_json.py;
SELECT transform(json_body)
USING 'd:\python27\python.exe process_json.py'
AS id, lessonbranch, elapsedseconds, activity, the_date
FROM my_json;
# This is a Python streaming program designed to be called from a Hive query.
# It processes a complex JSON document and returns the matching set of columns and rows.
# A second gist contains the Hive query that can be used to run it.
import sys
import json
# this returns five columns:
# id, lessonbranch, elapsedseconds, activity, datetime
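The rest of the script is not shown, so the following is only a sketch of how such a streaming program might emit the five columns. It assumes each stdin line is one JSON document with top-level `id` and `lessonbranch` fields and a list of activities. The `activities` key, the per-activity key names, and the function names are all hypothetical.

```python
import json
import sys


def rows_from_document(doc_text):
    """Yield one tab-separated row per activity in a JSON document."""
    doc = json.loads(doc_text)
    # "activities" and the per-activity keys are hypothetical names
    for item in doc.get("activities", []):
        columns = [
            doc.get("id"),
            doc.get("lessonbranch"),
            item.get("elapsedseconds"),
            item.get("activity"),
            item.get("datetime"),
        ]
        # Hive TRANSFORM reads tab-separated columns, one row per line
        yield "\t".join("" if c is None else str(c) for c in columns)


def main(stream=sys.stdin):
    for line in stream:
        line = line.strip()
        if line:
            for row in rows_from_document(line):
                print(row)


if __name__ == "__main__":
    main()
```

One input document can fan out into several output rows, which is the usual reason to use a streaming script here rather than a per-column JSON lookup.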
import sys

lines = []
for line in sys.stdin:
    lines.append(line)
if len(lines) > 0:
    cleaned_lines = [line.strip() for line in lines]
    single_line = ' '.join(cleaned_lines)
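The snippet stops before showing what happens to `single_line`. One plausible use, stated here as an assumption, is flattening a pretty-printed JSON document so that it fits on one line. The sketch below wraps the strip-and-join step in a hypothetical `collapse_lines` helper and checks that the result still parses.

```python
import json


def collapse_lines(lines):
    """Strip each line and join them into one space-separated string."""
    cleaned = [line.strip() for line in lines]
    return " ".join(cleaned)


if __name__ == "__main__":
    pretty = ['{\n', '  "id": 1,\n', '  "activity": "read"\n', '}\n']
    single_line = collapse_lines(pretty)
    print(single_line)
    # The flattened text is still valid JSON
    print(json.loads(single_line))
```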