Jake Ferriero jaketf
@jaketf
jaketf / lack_of_ordering_pipelines.py
Created July 2, 2019 06:48
Demonstrates that PCollections do maintain some order.
"""
Example pipeline to show that PCollections are not written in order.
"""
import os
import numpy as np
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
#!/usr/bin/env python3
"""
Module for sessionizing on sampling rate.
"""
import argparse
import csv
import os
import logging
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from time import sleep, time
from functools import wraps
logging.basicConfig(
format='%(asctime)s %(levelname)-8s %(message)s',
"""
Command-line utility to create/update views on a BigQuery table.
This utility pulls the JSON schema of a given table, iterates through its fields,
filters out the blacklisted fields, and creates a string with all the
required columns for a view.
Input as command-line args:
- Arg 1: projectname.datasetname.tablename (fully qualified table name)
- Arg 2: blacklisted fields: a string of comma-separated blacklisted fields
- Arg 3: bq command path
- Arg 4: View project name.View Dataset.View Name
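The docstring above describes filtering blacklisted fields out of a table's JSON schema and joining the remaining columns into a string for a view. A minimal sketch of that one step follows, assuming the schema is a JSON list of field objects with a `name` key (the shape printed by `bq show --schema --format=prettyjson`). The function name `build_view_columns` and the sample schema are hypothetical and are not taken from the gist.

```python
import json


def build_view_columns(schema_json, blacklisted_fields):
    """Return a comma-separated column list that omits blacklisted fields.

    schema_json: JSON text holding a list of field objects with a "name" key
        (assumed shape of a BigQuery table schema).
    blacklisted_fields: comma-separated field names to leave out.
    """
    # Normalize the comma-separated argument into a set of clean names.
    blacklist = {f.strip() for f in blacklisted_fields.split(",") if f.strip()}
    fields = json.loads(schema_json)
    # Keep the schema order; drop only top-level fields that are blacklisted.
    return ", ".join(f["name"] for f in fields if f["name"] not in blacklist)


# Hypothetical schema, for illustration only.
SAMPLE_SCHEMA = json.dumps([
    {"name": "id", "type": "INTEGER"},
    {"name": "email", "type": "STRING"},
    {"name": "created_at", "type": "TIMESTAMP"},
])

columns = build_view_columns(SAMPLE_SCHEMA, "email, ssn")
print(f"SELECT {columns} FROM `projectname.datasetname.tablename`")
```

This sketch handles top-level fields only; nested RECORD fields would need a recursive walk, which is left out here.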
@jaketf
jaketf / iam_search.sh
Created February 14, 2020 01:17
Search for GCP Roles containing a given permission
#!/usr/bin/bash
# This is not an official product of Google Inc.
# This is a SLOW but convenient utility for finding
# GCP roles containing a permission.
# Example use:
# ./iam_search.sh compute.instances.setMetadata
# $1 a gcp permission (e.g. compute.instances.setMetadata)
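The search the script header describes comes down to checking each role's permission list for one entry. A rough Python sketch of that check follows, assuming the role definitions have already been fetched as dictionaries with `name` and `includedPermissions` keys (the field names used by the IAM role resource). The function `roles_with_permission` and the sample roles are hypothetical; how the gist itself fetches roles is not shown in the preview.

```python
def roles_with_permission(roles, permission):
    """Return sorted names of roles whose permission list contains `permission`.

    roles: iterable of dicts with "name" and optional "includedPermissions".
    permission: a permission string such as "compute.instances.setMetadata".
    """
    return sorted(
        role["name"]
        for role in roles
        # A role fetched without its permission list simply never matches.
        if permission in role.get("includedPermissions", [])
    )


# Hypothetical role definitions, for illustration only.
SAMPLE_ROLES = [
    {"name": "roles/example.operator",
     "includedPermissions": ["compute.instances.get",
                             "compute.instances.setMetadata"]},
    {"name": "roles/example.viewer",
     "includedPermissions": ["compute.instances.get"]},
    {"name": "roles/example.admin",
     "includedPermissions": ["compute.instances.setMetadata"]},
    {"name": "roles/example.empty"},
]

for name in roles_with_permission(SAMPLE_ROLES, "compute.instances.setMetadata"):
    print(name)
```

Fetching every role definition once and filtering in memory avoids one lookup per role per search, which may be part of why the script calls itself slow.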
jferriero@shadow-gallery:~/VersionControl/beam$ sdks/java/build-tools/beam-linkage-check.sh
Fri 10 Apr 2020 02:19:44 PM PDT: Installing artifacts of HL7v2IO(730ab326) to Maven local repository.
Fri 10 Apr 2020 02:23:39 PM PDT: Running linkage check for beam-sdks-java-core in HL7v2IO
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Fri 10 Apr 2020 02:24:04 PM PDT: Done: build/linkagecheck/730ab326-beam-sdks-java-core
Fri 10 Apr 2020 02:24:04 PM PDT: Running linkage check for beam-sdks-java-io-google-cloud-platform in HL7v2IO
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
@jaketf
jaketf / EarlyFiringsTest.java
Created April 29, 2020 00:16
[BEAM-9847] Single Elements Outputs CANNOT be eagerly output via triggering
/** Utility DoFn for the purpose of this test (defined in a test util class). */
public static class EmitNSlowlyFn extends DoFn<Integer, Integer> {
@ProcessElement
public void slowlyEmit(ProcessContext context) throws InterruptedException {
Integer num = context.element();
int i = 0;
while (i < num) { // output elements at a rate of ~2 elements per second
Sleeper.DEFAULT.sleep(500);
context.output(i++);
}
@jaketf
jaketf / checkstyleError.out
Created May 14, 2020 19:10
checkstyleErrors for 4c7d546
./gradlew checkstyleMain checkstyleTest
Starting a Gradle Daemon (subsequent builds will be faster)
Configuration on demand is an incubating feature.
> Task :runners:apex:buildDependencyTree
See the report at: file:///usr/local/google/home/jferriero/VersionControl/beam/runners/apex/build/classes/java/main/org/apache/beam/runners/apex/dependency-tree
> Task :sdks:java:io:solr:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
@jaketf
jaketf / pre-commit.out
Created May 14, 2020 20:23
Output of pre-commit for f583065cb69d5c4f62e77d8e60c7811652ab49aa
jferriero@shadow-gallery:~/VersionControl/beam$ git rev-parse HEAD && ./gradlew spotlessApply && ./gradlew checkstyleMain checkstyleTest javadoc spotbugsMain compileJava compileTestJava
f583065cb69d5c4f62e77d8e60c7811652ab49aa
Configuration on demand is an incubating feature.
Deprecated Gradle features were used in this build, making it incompatible with Gradle 6.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See https://docs.gradle.org/5.2.1/userguide/command_line_interface.html#sec:command_line_warnings
BUILD SUCCESSFUL in 2s
100 actionable tasks: 1 executed, 99 up-to-date
@jaketf
jaketf / lookup_side_input_with_cache.py
Last active June 9, 2020 03:18
Apache Beam Python Example: Side Input look up with cache
# Copyright 2020 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,