@AzimUddin
AzimUddin / HadoopConfig_HDI_PowerShell.ps1
Last active August 29, 2015 13:56
Hadoop job configurations with HDInsight PowerShell
# MapReduce example with Hadoop job configurations
$clusterName = "YourClusterName"
$jobConfig = @{ "mapred.output.compress"="true"; "mapred.output.compression.codec"="org.apache.hadoop.io.compress.GzipCodec" }
$myWordCountJob = New-AzureHDInsightMapReduceJobDefinition -JarFile "/example/jars/hadoop-examples.jar" -ClassName "wordcount" -JobName "WordCountJob" -StatusFolder "/MyMRJobs/WordCountJobStatus" -Defines $jobConfig
$myWordCountJob.Arguments.Add("/example/data/gutenberg/davinci.txt")
$myWordCountJob.Arguments.Add("MyMRJobs/WordCountOutput")
$MyMRJob = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $myWordCountJob
# Hive job example with Hadoop job configurations
$clusterName = "YourClusterName"
AzimUddin / HadoopJobConfig_HDI_SDK.cs
Last active August 29, 2015 13:56
Hadoop job configurations via HDInsight .NET SDK
var mapReduceJob = new MapReduceJobCreateParameters()
{
    ClassName = "wordcount", // Required
    JobName = "MyWordCountJob", // Optional
    JarFile = "/example/jars/hadoop-examples.jar", // Required; alternative syntax: wasb://hdijobs@azimasv2.blob.core.windows.net/example/jar/hadoop-examples.jar
    StatusFolder = "/AzimMRJobs/WordCountJobStatus" // Optional, but useful for knowing where logs are uploaded in Azure Storage
};
// The WordCount program needs two arguments
mapReduceJob.Arguments.Add("/example/data/gutenberg/davinci.txt"); // Input file
AzimUddin / hadoop_config_via_rest_api.ps1
Created February 13, 2014 19:15
Hadoop Job configurations via direct REST API call
# An example of passing Hadoop configurations for a job in HDInsight via a direct REST API call
$MyHDInsightUserName = "YourClusterUserName"
$MyHDInsightPwd = "YourPwd"
$clusterName = "YourClusterName"
$storageAcctname = "YourStorageAcctname"
$containerName = "YourDefaultContainerName"
$HdInsightPwd = ConvertTo-SecureString $MyHDInsightPwd -AsPlainText -Force
$HdInsightCreds = New-Object System.Management.Automation.PSCredential ($MyHDInsightUserName, $HdInsightPwd)
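The preview above stops after the credentials are created. As a rough sketch of the next step, the function below builds the form-encoded body for a MapReduce submission, with each Hadoop configuration passed as a repeated `define` field. It is written in JavaScript rather than PowerShell, and it assumes the cluster exposes the WebHCat (Templeton) `mapreduce/jar` endpoint with the field names `user.name`, `jar`, `class`, `statusdir`, `define`, and `arg`. The function name `buildMapReduceJobBody` and the shape of the `job` object are hypothetical, not part of the original gist.

```javascript
// Sketch only: builds an application/x-www-form-urlencoded body for a
// WebHCat MapReduce job submission. The field names are assumed to match
// the WebHCat mapreduce/jar resource; verify them against your cluster.
function buildMapReduceJobBody(job) {
  var params = new URLSearchParams();
  params.append('user.name', job.userName);
  params.append('jar', job.jarFile);
  params.append('class', job.className);
  params.append('statusdir', job.statusFolder);

  // Each Hadoop configuration becomes its own "define" field: name=value.
  Object.keys(job.defines || {}).forEach(function (key) {
    params.append('define', key + '=' + job.defines[key]);
  });

  // Job arguments are repeated "arg" fields, in order.
  (job.args || []).forEach(function (arg) {
    params.append('arg', arg);
  });

  return params.toString();
}

// Example values taken from the PowerShell gist earlier in this listing.
var body = buildMapReduceJobBody({
  userName: 'YourClusterUserName',
  jarFile: '/example/jars/hadoop-examples.jar',
  className: 'wordcount',
  statusFolder: '/MyMRJobs/WordCountJobStatus',
  defines: {
    'mapred.output.compress': 'true',
    'mapred.output.compression.codec': 'org.apache.hadoop.io.compress.GzipCodec'
  },
  args: ['/example/data/gutenberg/davinci.txt', 'MyMRJobs/WordCountOutput']
});
```

The resulting string would be sent as the body of an authenticated POST request; the request itself is left out here because the exact URL and authentication setup depend on the cluster.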
AzimUddin / HDInsight_Cluster_Customization_Via_SDK.cs
Last active August 29, 2015 14:00
An example of HDInsight cluster customization via HDInsight .NET SDK
/* 1. Create a Visual Studio 2012 project.
2. Add the HDInsight SDK NuGet package to your project:
In Visual Studio 2012, click Tools -> Library Package Manager -> Package Manager Console
PM> Install-Package Microsoft.WindowsAzure.Management.HDInsight
3. Use the following code, fill in the relevant info, then build and run:
*/
using System;
using System.Collections.Generic;
using System.Linq;
AzimUddin / HBase-JAVA-API-hbase-site.xml
Last active August 29, 2015 14:08
Sample hbase-site.xml for using the HBase Java API
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
* Copyright 2010 The Apache Software Foundation
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.microsoft.css</groupId>
<artifactId>HBaseJavaApiTest</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>HBaseJavaApiTest</name>
<url>http://maven.apache.org</url>
<dependencies>
AzimUddin / JDBC_Hive_POM_XML.xml
Last active August 29, 2015 14:22
POM.xml for using JDBC to access HiveServer2
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.microsoft.css</groupId>
<artifactId>HiveJdbcTest</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>HiveJdbcTest</name>
<url>http://maven.apache.org</url>
<dependencies>
AzimUddin / ExecuteWithRetryNode.js
Last active August 30, 2015 17:52
Azure DocumentDB Node.js example of executing a method with retry to handle RequestRateTooLargeException (HTTP 429) errors
var queryIterator = documentClient.queryDocuments(collection._self, query);
executeNextWithRetry(yourCallback);
function executeNextWithRetry(callback) {
    queryIterator.executeNext(function(err, results, responseHeaders) {
        if (err && err.code === 429 && responseHeaders['x-ms-retry-after-ms']) {
            console.log("Retrying after " + responseHeaders['x-ms-retry-after-ms']);
            setTimeout(function() {
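The preview above is cut off inside the retry callback. The following is a minimal, self-contained sketch of the same retry-on-429 idea, not the author's full gist. It assumes only what the preview shows: an iterator with `executeNext(callback)` whose callback receives `(err, results, responseHeaders)`, and an `x-ms-retry-after-ms` header on throttled responses. The `maxRetries` cap, the injectable `schedule` parameter, and the `makeFakeIterator` stand-in are additions for illustration and are hypothetical.

```javascript
// Sketch: retry iterator.executeNext while the service answers HTTP 429,
// waiting for the interval given in the x-ms-retry-after-ms header.
// maxRetries and schedule are illustrative additions, not SDK features.
function executeNextWithRetry(iterator, callback, maxRetries, schedule) {
  maxRetries = maxRetries === undefined ? 5 : maxRetries;
  schedule = schedule || setTimeout; // injectable so the wait can be stubbed

  iterator.executeNext(function (err, results, responseHeaders) {
    var headers = responseHeaders || {};
    var retryAfterMs = parseInt(headers['x-ms-retry-after-ms'], 10);

    if (err && err.code === 429 && !isNaN(retryAfterMs) && maxRetries > 0) {
      // Throttled: wait as long as the service asked, then try again.
      schedule(function () {
        executeNextWithRetry(iterator, callback, maxRetries - 1, schedule);
      }, retryAfterMs);
      return;
    }

    // Success, a non-throttling error, or retries exhausted.
    callback(err, results, headers);
  });
}

// Stand-in iterator (not the DocumentDB SDK): reports 429 for the first
// throttleCount calls, then returns one document.
function makeFakeIterator(throttleCount) {
  var calls = 0;
  return {
    executeNext: function (cb) {
      calls += 1;
      if (calls <= throttleCount) {
        cb({ code: 429 }, undefined, { 'x-ms-retry-after-ms': '10' });
      } else {
        cb(undefined, [{ id: 'doc1' }], {});
      }
    },
    callCount: function () { return calls; }
  };
}
```

With a real client, the iterator returned by `documentClient.queryDocuments(...)` would be passed in place of the fake one. Capping the number of retries keeps a persistently throttled query from looping forever.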
AzimUddin / QueryMasterResources.cs
Last active September 2, 2015 21:29
An example of querying Master resources in Azure DocumentDB
// Check if database exists, if not create it
Database database = client.CreateDatabaseQuery().Where(db => db.Id == id).ToArray().FirstOrDefault();
if (database == null)
{
    database = await client.CreateDatabaseAsync(new Database { Id = id });
}
// Get collection
StudentsCollection = client.CreateDocumentCollectionQuery(database.SelfLink).Where(c => c.Id == collectionId).ToArray().FirstOrDefault();
AzimUddin / MeasureRequestCharge.cs
Last active September 3, 2015 15:02
An example of measuring Request Charge for an insert Operation for Azure DocumentDB
private async Task InsertDocumentAsync(Student student, bool showDebugInfo)
{
    ResourceResponse<Document> response = await client.CreateDocumentAsync(colSelfLink, student);
    // Report the request charge only when debug output is requested
    if (showDebugInfo)
    {
        Console.WriteLine("{0}\tInsert Operation, # of RUs: {1}", DateTime.UtcNow, response.RequestCharge);
    }
}
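The same measurement can be sketched for the Node.js callbacks used earlier in this listing, where the charge arrives as a response header rather than a `RequestCharge` property. This example assumes the charge is reported in the `x-ms-request-charge` header as a numeric string; the helper names `readRequestCharge` and `totalRequestCharge` are hypothetical.

```javascript
// Sketch: read the request charge (in RUs) from one set of response headers.
// Returns 0 when the header is missing or not numeric.
function readRequestCharge(responseHeaders) {
  var raw = (responseHeaders || {})['x-ms-request-charge'];
  var charge = parseFloat(raw);
  return isNaN(charge) ? 0 : charge;
}

// Sketch: total the charges across several operations, for example the
// header objects collected from a batch of inserts.
function totalRequestCharge(headersList) {
  return headersList.reduce(function (sum, headers) {
    return sum + readRequestCharge(headers);
  }, 0);
}
```

Summing the charge across a batch gives a rough picture of how many request units a workload consumes, which helps when sizing collection throughput.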