dharadhruve / flatten.java
Last active July 18, 2019 08:15 — forked from ebuildy/flatten.java
Flatten Spark data frame fields structure, via SQL in Java. This fork also supports ArrayType fields.
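class Toto
{
    public void Main()
    {
        // Load the nested data frame and build a flat SELECT list from its schema.
        final DataFrame source = GetDataFrame();
        final String querySelectSQL = flattenSchema(source.schema(), null);
        // Expose the data frame to SQL under the name "source", then query it.
        source.registerTempTable("source");
        final DataFrame flattenData = sqlContext.sql("SELECT " + querySelectSQL + " FROM source");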
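
The preview above cuts off before `flattenSchema` is shown, so the following is only a rough sketch of what such a helper might do, not the gist's actual code. It is written in plain Python with no Spark dependency: the schema is modelled as a list of `(name, type)` pairs, where a nested struct is itself a list of pairs. That representation, the `flatten_columns` helper, and the underscore-joined aliases are all assumptions made for illustration. ArrayType handling, which the fork adds, is not covered here.

```python
def flatten_columns(fields, prefix=None):
    """Walk a nested schema and return one "path AS alias" entry per leaf.

    `fields` is a list of (name, dtype) pairs. A dtype that is a list is
    treated as a nested struct (hypothetical stand-in for Spark's StructType).
    """
    columns = []
    for name, dtype in fields:
        # Dotted path to reach the field, e.g. "user.address.city".
        path = name if prefix is None else f"{prefix}.{name}"
        if isinstance(dtype, list):
            # Nested struct: recurse with the current path as prefix.
            columns.extend(flatten_columns(dtype, path))
        else:
            # Leaf field: alias it with underscores so the result is flat.
            alias = path.replace(".", "_")
            columns.append(f"{path} AS {alias}")
    return columns


def flatten_schema(fields, prefix=None):
    """Join the flattened columns into a SELECT list."""
    return ", ".join(flatten_columns(fields, prefix))


# Example schema (made up for this sketch).
schema = [
    ("id", "long"),
    ("user", [
        ("name", "string"),
        ("address", [("city", "string")]),
    ]),
]

query = "SELECT " + flatten_schema(schema) + " FROM source"
print(query)
```

The resulting string has the same shape as the one passed to `sqlContext.sql` above: every leaf of the nested structure becomes its own top-level column.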
gxercavins / more_sessions.py
Last active January 6, 2020 10:51
SO question 55261957 - Per-session statistics
import argparse, json, logging, time
import apache_beam as beam
import apache_beam.transforms.window as window
from apache_beam.io import WriteToText
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
class AnalyzeSession(beam.DoFn):