Zmicier Zaleznicenka dzzh

View GitHub Profile
@dzzh
dzzh / teamcity_github_pr_branch.py
Created March 7, 2017 15:07
Inject the source and target branches of a GitHub pull request into TeamCity CI
#!/usr/bin/env python
import os
import sys
import urllib2
import json
'''
This script queries GitHub for the source and target branches of a pull request
and updates environment variables at TeamCity CI to make these variable
/**
* Workaround for the CRUNCH-102 bug (https://issues.apache.org/jira/browse/CRUNCH-102)
*
* If groupByKey() was executed on one of two PCollections but not on the other before calling union(),
* the union will include data only from the grouped collection.
*
* This bug was fixed in Crunch 0.4.0, but it still occurs for those using CDH3.
*
* To avoid this bug, call this method on the ungrouped collection before passing it to union().
*