@AzimUddin
AzimUddin / HadoopConfig_HDI_PowerShell.ps1
Last active August 29, 2015 13:56
Hadoop job configurations with HDInsight PowerShell
# MapReduce example with Hadoop job configurations
$clusterName = "YourClusterName"
$jobConfig = @{ "mapred.output.compress"="true"; "mapred.output.compression.codec"="org.apache.hadoop.io.compress.GzipCodec" }
$myWordCountJob = New-AzureHDInsightMapReduceJobDefinition -JarFile "/example/jars/hadoop-examples.jar" -ClassName "wordcount" -jobName "WordCountJob" -StatusFolder "/MyMRJobs/WordCountJobStatus" -Defines $jobConfig
$myWordCountJob.Arguments.Add("/example/data/gutenberg/davinci.txt")
$myWordCountJob.Arguments.Add("MyMRJobs/WordCountOutput")
$MyMRJob = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $myWordCountJob
# Hive job example with Hadoop job configurations
$clusterName = "YourClusterName"
AzimUddin / HadoopJobConfig_HDI_SDK.cs
Last active August 29, 2015 13:56
Hadoop job configurations via the HDInsight .NET SDK
var mapReduceJob = new MapReduceJobCreateParameters()
{
    ClassName = "wordcount", // Required
    JobName = "MyWordCountJob", // Optional
    JarFile = "/example/jars/hadoop-examples.jar", // Required; alternative syntax: wasb://hdijobs@azimasv2.blob.core.windows.net/example/jar/hadoop-examples.jar
    StatusFolder = "/AzimMRJobs/WordCountJobStatus" // Optional, but useful for knowing where logs are uploaded in Azure Storage
};
// The WordCount program needs two arguments
mapReduceJob.Arguments.Add("/example/data/gutenberg/davinci.txt"); // Input file
AzimUddin / hadoop_config_via_rest_api.ps1
Created February 13, 2014 19:15
Hadoop job configurations via a direct REST API call
# An example of passing Hadoop configurations for a job in HDInsight via a direct REST API call
$MyHDInsightUserName = "YourClusterUserName"
$MyHDInsightPwd = "YourPwd"
$clusterName = "YourClusterName"
$storageAcctname = "YourStorageAcctname"
$containerName = "YourDefaultContainerName"
$HdInsightPwd = ConvertTo-SecureString $MyHDInsightPwd -AsPlainText -Force
$HdInsightCreds = New-Object System.Management.Automation.PSCredential ($MyHDInsightUserName, $HdInsightPwd)
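For orientation, a direct call to WebHCat (Templeton) carries each Hadoop configuration as a repeated `define=NAME=VALUE` form field, alongside `jar`, `class`, `arg`, and `statusdir`. The sketch below is written in JavaScript rather than the gist's PowerShell and only builds the form body. The helper name `buildMapReduceJarBody`, the shape of its argument, and the sample values are illustrative assumptions, not part of the original gist.

```javascript
// Minimal sketch: build the form body for a WebHCat "mapreduce/jar" request.
// buildMapReduceJarBody and the job object layout are hypothetical.
function buildMapReduceJarBody(job) {
  const body = new URLSearchParams();
  body.append("user.name", job.userName);
  body.append("jar", job.jar);
  body.append("class", job.className);
  body.append("statusdir", job.statusDir);
  // Each Hadoop configuration becomes its own define=NAME=VALUE field.
  for (const [name, value] of Object.entries(job.defines)) {
    body.append("define", `${name}=${value}`);
  }
  // Program arguments are passed as repeated arg fields, in order.
  for (const arg of job.args) {
    body.append("arg", arg);
  }
  return body;
}

// Sample values mirror the placeholders used elsewhere on this page.
const exampleBody = buildMapReduceJarBody({
  userName: "YourClusterUserName",
  jar: "/example/jars/hadoop-examples.jar",
  className: "wordcount",
  statusDir: "/MyMRJobs/WordCountJobStatus",
  defines: {
    "mapred.output.compress": "true",
    "mapred.output.compression.codec": "org.apache.hadoop.io.compress.GzipCodec",
  },
  args: ["/example/data/gutenberg/davinci.txt", "/MyMRJobs/WordCountOutput"],
});
console.log(exampleBody.toString());
```

The resulting body would then be sent as an authenticated POST to the cluster's WebHCat endpoint, which on HDInsight is typically exposed under `/templeton/v1/mapreduce/jar`.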
AzimUddin / HDInsight_Cluster_Customization_Via_SDK.cs
Last active August 29, 2015 14:00
An example of HDInsight cluster customization via the HDInsight .NET SDK
/*1. Create a Visual Studio 2012 project.
2. Add the HDInsight SDK NuGet package to your project:
In Visual Studio 2012, click Tools -> Library Package Manager -> Package Manager Console
PM> Install-Package Microsoft.WindowsAzure.Management.HDInsight
3. Use the following code, fill in the relevant info, then build and run:
*/
using System;
using System.Collections.Generic;
using System.Linq;
AzimUddin / HBase-JAVA-API-hbase-site.xml
Last active August 29, 2015 14:08
Sample hbase-site.xml for using the HBase Java API
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
* Copyright 2010 The Apache Software Foundation
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.microsoft.css</groupId>
<artifactId>HBaseJavaApiTest</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>HBaseJavaApiTest</name>
<url>http://maven.apache.org</url>
<dependencies>
AzimUddin / MyHiveJdbcTest.java
Last active November 6, 2018 14:32
Java sample using JDBC to connect to HiveServer2 on Azure HDInsight
package com.microsoft.css;
/**
* Created by muddin on 6/4/2015.
*/
import java.sql.*;
public class MyHiveJdbcTest {
public static void main(String[] args) throws SQLException {
AzimUddin / JDBC_Hive_POM_XML.xml
Last active August 29, 2015 14:22
pom.xml for using JDBC to access HiveServer2
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.microsoft.css</groupId>
<artifactId>HiveJdbcTest</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>HiveJdbcTest</name>
<url>http://maven.apache.org</url>
<dependencies>
AzimUddin / ExecuteWithRetryExample.cs
Last active January 18, 2016 10:18
Azure DocumentDB .NET SDK example of executing an async method with retries to handle RequestRateTooLargeException (HTTP 429) errors
/// <summary>
/// Executes the function, retrying when the request is throttled.
/// </summary>
/// <typeparam name="V">The result type of the function.</typeparam>
/// <param name="client">The DocumentDB client.</param>
/// <param name="function">The async function to execute.</param>
/// <returns>The result of the function.</returns>
private static async Task<V> ExecuteWithRetries<V>(DocumentClient client, Func<Task<V>> function)
{
TimeSpan sleepTime = TimeSpan.Zero;
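The preview above is cut off, so the general shape of retry-on-throttle is sketched here in JavaScript instead of the gist's C#. It assumes, hypothetically, that a throttled call rejects with an error whose `code` is 429 and which carries a `retryAfterInMilliseconds` hint. The `maxAttempts` cap is an addition for illustration and is not taken from the original sample.

```javascript
// Minimal sketch of retry-on-throttle, not the gist's actual implementation.
// Assumption: throttled calls reject with { code: 429, retryAfterInMilliseconds }.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function executeWithRetries(fn, maxAttempts = 10) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const throttled = Boolean(err) && err.code === 429;
      // Rethrow anything that is not a throttle, or once attempts run out.
      if (!throttled || attempt >= maxAttempts) {
        throw err;
      }
      // Wait for the server-suggested delay before trying again.
      await sleep(err.retryAfterInMilliseconds || 0);
    }
  }
}
```

A caller would wrap each request in a function, for example `executeWithRetries(() => doRequest())`, where `doRequest` stands in for whatever async operation may be throttled.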
AzimUddin / BulkImport.js
Last active September 18, 2015 23:10
An example of a DocumentDB performance scale test with the .NET SDK
/* Copied from: https://github.com/Azure/azure-documentdb-net/blob/master/samples/code-samples/ServerSideScripts/JS/BulkImport.js
*/
function bulkImport(docs) {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    // The count of imported docs, also used as current doc index.
    var count = 0;