Albert alpoza

@alpoza
alpoza / ucd_upload_plugins_from_dir.py
Created March 20, 2019 10:01 — forked from sgwilbur/ucd_upload_plugins_from_dir.py
Example of how to use the UCD REST API to update plugins from a directory of plugins; requires the requests library.
#!/usr/bin/env python
import os, requests, json, re
## requirements.txt:
#
# httplib2==0.8
# requests==2.0.1
# wsgiref==0.1.2
# hard coded
@alpoza
alpoza / curl.sh
Created November 20, 2018 11:24 — forked from bartfastiel/curl.sh
curl -s -u admin:admin -XPOST "localhost:9000/api/organizations/enable_support" | python -m json.tool
curl -s -u admin:admin -XPOST "localhost:9000/api/organizations/create?name=myorg&key=myorg" | python -m json.tool
curl -s -u admin:admin -XPOST "localhost:9000/api/qualityprofiles/create?name=myprofile&language=java&organization=myorg" | python -m json.tool
curl -s -u admin:admin -XPOST "localhost:9000/api/qualityprofiles/search" | python -m json.tool
curl -s -u admin:admin -XPOST "localhost:9000/api/qualityprofiles/search?organization=myorg" | python -m json.tool
curl -s -u admin:admin -XPOST "localhost:9000/api/qualityprofiles/search?defaults=true" | python -m json.tool
curl -s -u admin:admin -XPOST "localhost:9000/api/projects/create?project=myproject&name=myproject" | python -m json.tool
curl -s -u admin:admin -XPOST "localhost:9000/api/qualityprofiles/search?project=myproject" | python -m json.tool
#delete
@alpoza
alpoza / curl.md
Created March 23, 2018 09:20 — forked from subfuzion/curl.md
curl POST examples

Common Options

-#, --progress-bar Make curl display a simple progress bar instead of the more informational standard meter.

-b, --cookie <name=data> Supply cookie with request. If no =, then specifies the cookie file to use (see -c).

-c, --cookie-jar <file name> File to save response cookies to.

@alpoza
alpoza / Query LDAP from R
Created September 26, 2017 09:31 — forked from jeremyshantz/Query LDAP from R
Query LDAP from R
library(RCurl)
val <- getURL('ldap://ldap.domain.net/DC=domain,DC=net?sAMAccountName?sub?(employeeID=0123456)',
.opts=list(userpwd = "DOMAIN\\domainid:password"))
@alpoza
alpoza / jenkins_copying_configuration.sh
Last active March 23, 2017 08:45 — forked from mriddle/jenkins_copying_configuration.sh
Moving Jenkins server configuration from one server to another
ORIGINAL_JENKINS_SERVER=
ORIGINAL_SERVER_USER=
NEW_JENKINS_SERVER=
NEW_SERVER_USER=
# ON THE ORIGINAL JENKINS SERVER
ssh $ORIGINAL_SERVER_USER@$ORIGINAL_JENKINS_SERVER
cd /var/lib/jenkins/
for i in `ls jobs`; do echo "jobs/$i/config.xml";done > config.totar
@alpoza
alpoza / import_csv_to_mongo
Created February 26, 2017 10:04 — forked from mprajwala/import_csv_to_mongo
Store CSV data into mongodb using python pandas
#!/usr/bin/env python
import sys
import pandas as pd
import pymongo
import json
def import_content(filepath):
    mng_client = pymongo.MongoClient('localhost', 27017)
@alpoza
alpoza / CsvSlurper.groovy
Created December 1, 2016 16:59
Groovy Csv slurper
package groovy.csv
/**
* CSV slurper which parses text or reader content into a data structure of lists and maps.
* <p>
* Example usage:
* <code><pre>
* def slurper = new CsvSlurper()
* def result = slurper.parseText('''
* name, age
@alpoza
alpoza / gist:26f4177d85d3d134c350b9752bcf772a
Created November 29, 2016 21:55 — forked from mpas/gist:58497115057068f15751
Groovy script to convert Csv to Json
import groovy.json.JsonOutput
/**
* A simple CSV file to Json converter
*
* The CSV file is expected to have a header row to identify the columns. These
* columns will be used to generate the corresponding Json field.
*
* @author Marco Pas
*/
@alpoza
alpoza / one-hot.py
Created May 11, 2016 13:58 — forked from ramhiser/one-hot.py
Apply one-hot encoding to a pandas DataFrame
import pandas as pd
import numpy as np
from sklearn.feature_extraction import DictVectorizer
def encode_onehot(df, cols):
"""
One-hot encoding is applied to columns specified in a pandas DataFrame.
Modified from: https://gist.github.com/kljensen/5452382

The pyspark documentation doesn't include an example for the aggregateByKey RDD method. I didn't find any nice examples online, so I wrote my own.

Here's what the documentation does say:

aggregateByKey(self, zeroValue, seqFunc, combFunc, numPartitions=None)

Aggregate the values of each key, using given combine functions and a neutral "zero value". This function can return a different result type, U, than the type of the values in this RDD, V. Thus, we need one operation for merging a V into a U and one operation for merging two U's. The former operation is used for merging values within a partition, and the latter is used for merging values between partitions. To avoid memory allocation, both of these functions are allowed to modify and return their first argument instead of creating a new U.
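To make the roles of zeroValue, seqFunc and combFunc concrete, here is a minimal pure-Python sketch that emulates the described behaviour on plain lists. It is not Spark code: the helper `aggregate_by_key`, the sample `partitions`, and the (sum, count) accumulator are all hypothetical names chosen for illustration. Here V is an int and U is a (sum, count) tuple, so the result type differs from the value type.

```python
import copy


def aggregate_by_key(partitions, zero_value, seq_func, comb_func):
    """Emulate the semantics described above on a list of partitions.

    Each partition is a list of (key, value) pairs. This is an illustrative
    stand-in, not Spark's implementation.
    """
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            if key not in acc:
                # Fresh copy per key, since seq_func may mutate its first argument.
                acc[key] = copy.deepcopy(zero_value)
            # Merge a V into a U (within a partition).
            acc[key] = seq_func(acc[key], value)
        per_partition.append(acc)

    merged = {}
    for acc in per_partition:
        for key, u in acc.items():
            # Merge two U's (between partitions).
            merged[key] = comb_func(merged[key], u) if key in merged else u
    return merged


def seq_func(acc, value):
    # acc is a (sum, count) tuple; value is a single int.
    return (acc[0] + value, acc[1] + 1)


def comb_func(left, right):
    # Both arguments are (sum, count) tuples from different partitions.
    return (left[0] + right[0], left[1] + right[1])


# Hypothetical sample data: two partitions of (key, value) pairs.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 5), ("b", 4)],
]

result = aggregate_by_key(partitions, (0, 0), seq_func, comb_func)
means = {key: total / count for key, (total, count) in result.items()}
print(result)
print(means)
```

With a real RDD of (key, value) pairs, the same two functions would be passed as `rdd.aggregateByKey((0, 0), seq_func, comb_func)`.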

reduceByKey and aggregateByKey are much more efficient than groupByKey and should be used for aggregations as much as possible.
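One way to see where the efficiency gain comes from is to count how many records would have to leave each partition. The sketch below is a toy model under a stated assumption: a groupByKey-style shuffle moves every (key, value) pair, while a reduceByKey/aggregateByKey-style shuffle first combines values locally and moves one record per distinct key per partition. The function names and sample data are hypothetical, and the model ignores everything else Spark does.

```python
def shuffled_records_group(partitions):
    # groupByKey-style: every (key, value) pair is sent across the network.
    return sum(len(part) for part in partitions)


def shuffled_records_combine(partitions):
    # reduceByKey/aggregateByKey-style: values are combined inside each
    # partition first, so only one record per distinct key is sent.
    return sum(len({key for key, _ in part}) for part in partitions)


# Hypothetical sample data: two partitions of (key, value) pairs.
partitions = [
    [("a", 1), ("a", 2), ("a", 3), ("b", 4)],
    [("a", 5), ("b", 6), ("b", 7)],
]

group_count = shuffled_records_group(partitions)
combine_count = shuffled_records_combine(partitions)
print(group_count, combine_count)
```

In this toy model the gap grows with the number of repeated keys per partition, which is the case where local combining helps most.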