Ian Downard iandow

Block or report user

Report or block iandow

Hide content and notifications from this user.

Learn more about blocking users

Contact Support about this user’s behavior.

Learn more about reporting abuse

Report abuse
View GitHub Profile
View app.py
############################################################################
# This code shows how to split large text documents along sentence
# boundaries using NLTK and process each chunk with AWS Translate.
############################################################################
# Be sure to first install nltk and boto3
import nltk.data
import boto3
# Define the source document that needs to be translated
source_document = "My little pony heart is yours..."
# Tell the NLTK data loader to look for resource files in /tmp/
View mediainfo.json
{
  "tracks": [
    {
      "track_type": "General",
      "count": "331",
      "count_of_stream_of_this_kind": "1",
      "kind_of_stream": "General",
      "other_kind_of_stream": [
        "General"
      ],
View text_splitter.py
import nltk

# Tell the NLTK data loader to look for resource files in /tmp/
nltk.data.path.append("/tmp/")
# Download the NLTK punkt tokenizer models to /tmp/
# We use /tmp because that's where AWS Lambda provides write access to the local file system.
nltk.download('punkt', download_dir='/tmp/')
# Load the English language tokenizer
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
# Split input text into a list of sentences
# (`transcript` is the input text string; it is defined outside this excerpt)
sentences = tokenizer.tokenize(transcript)
print("Input text length: " + str(len(transcript)))
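Once the text is split into sentences, the sentences still have to be regrouped into chunks small enough for a single translation request. The sketch below shows one way to do that with a greedy packer. The function name `chunk_sentences` and the `max_bytes` default of 5000 are assumptions made for illustration and do not come from the original gist; check the current AWS Translate request size limit before relying on any particular value.

```python
def chunk_sentences(sentences, max_bytes=5000):
    """Greedily pack sentences into chunks of at most max_bytes UTF-8 bytes.

    Hypothetical helper: max_bytes is an assumed limit, not a value
    taken from the original code.
    """
    chunks = []
    current = ""
    for sentence in sentences:
        # Join with a single space so sentence boundaries stay readable.
        candidate = sentence if not current else current + " " + sentence
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence larger than max_bytes is kept whole here;
            # a real pipeline would need to split it further.
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    demo = ["aa.", "bb.", "cc."]
    print(chunk_sentences(demo, max_bytes=7))
```

Each resulting chunk could then be passed to the boto3 Translate client, for example `boto3.client("translate").translate_text(Text=chunk, SourceLanguageCode="en", TargetLanguageCode="es")`, reading the result from the `TranslatedText` key of the response. The language codes here are placeholders.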
View create_stack.sh
$ aws cloudformation describe-stack-events --stack-name mas1
{
  "StackEvents": [
    {
      "StackId": "arn:aws:cloudformation:us-west-2:773074507832:stack/mas1/e0df81c0-54c7-11e9-9a27-0ae01deb57d8",
      "EventId": "49ea43c0-54c9-11e9-bfc1-064b816f3a4a",
      "StackName": "mas1",
      "LogicalResourceId": "mas1",
      "PhysicalResourceId": "arn:aws:cloudformation:us-west-2:773074507832:stack/mas1/e0df81c0-54c7-11e9-9a27-0ae01deb57d8",
      "ResourceType": "AWS::CloudFormation::Stack",
View gist:8236b9789d17c6788a572e04369f28f1
[Feb 22, 2019 1:49:45 PM]: Integration test for 'Connection' started.
[Feb 22, 2019 1:49:45 PM]: Using Radoop version 9.1.0.
[Feb 22, 2019 1:49:45 PM]: Running 8 tests: [Fetch dynamic settings, NameNode networking, DataNode networking, YARN services networking, MapReduce, HDFS upload, Radoop jar upload, Import job]
[Feb 22, 2019 1:49:45 PM]: Running test 1/8: Fetch dynamic settings
[Feb 22, 2019 1:49:45 PM]: Retrieving required configuration properties...
[Feb 22, 2019 1:49:45 PM]: Successfully fetched property: hive.execution.engine
[Feb 22, 2019 1:49:45 PM]: Successfully fetched property: yarn.resourcemanager.scheduler.address
[Feb 22, 2019 1:49:45 PM]: Successfully fetched property: yarn.resourcemanager.resource-tracker.address
[Feb 22, 2019 1:49:45 PM]: Successfully fetched property: yarn.resourcemanager.admin.address
[Feb 22, 2019 1:49:45 PM]: Successfully fetched property: yarn.app.mapreduce.am.staging-dir
View license.txt
-----BEGIN SIGNED MESSAGE-----
clusterid: "5427061983691452296"
version: "4.0"
customerid: "ignore"
issuer: "MapR Technologies"
licType: Demo
description: "MapR Enterprise Trial Edition"
enforcement: HARD
gracePeriod: 0
issuedate: 1550010076
View teststaticpvc.yaml
# Copyright (c) 2009 & onwards. MapR Tech, Inc., All rights reserved
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-static-pvc
  namespace: test-csi
spec:
  accessModes:
    - ReadWriteOnce
  resources:
iandow / teststaticpv.yaml
Last active Jan 29, 2019
teststaticpv.yaml
View teststaticpv.yaml
# Copyright (c) 2009 & onwards. MapR Tech, Inc., All rights reserved
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-static-pv
  namespace: test-csi
spec:
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
View mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>nodec:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>nodec:19888</value>
  </property>
  <property>
View yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>nodeb</value>
    <description>host is the hostname of the resourcemanager</description>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>