Ivan Sadikov (sadikovi)

@sadikovi
sadikovi / parquet_read.scala
Created Nov 4, 2018
Read a Parquet file with parquet-mr and list all of the records
View parquet_read.scala
////////////////////////////////////////////////////////////////
// == Parquet read ==
////////////////////////////////////////////////////////////////
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce._
import org.apache.hadoop.mapreduce.lib.input.FileSplit
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
import org.apache.parquet.hadoop.ParquetInputSplit
import org.apache.parquet.hadoop.ParquetRecordReader
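The preview above stops at the imports. A minimal sketch of how these pieces are typically wired together follows; it assumes parquet-mr's example Group/GroupReadSupport classes and a hypothetical local file path, and is not the gist's exact code. Depending on the parquet-mr version, the plain FileSplit is converted to a ParquetInputSplit internally by ParquetRecordReader.initialize.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.TaskAttemptID
import org.apache.hadoop.mapreduce.lib.input.FileSplit
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
import org.apache.parquet.example.data.Group
import org.apache.parquet.hadoop.ParquetRecordReader
import org.apache.parquet.hadoop.example.GroupReadSupport

val conf = new Configuration()
val path = new Path("/tmp/example.parquet") // hypothetical input file
val len = path.getFileSystem(conf).getFileStatus(path).getLen

// Treat the whole file as a single split and print every record.
val split = new FileSplit(path, 0, len, Array.empty[String])
val context = new TaskAttemptContextImpl(conf, new TaskAttemptID())
val reader = new ParquetRecordReader[Group](new GroupReadSupport())
reader.initialize(split, context)
try {
  while (reader.nextKeyValue()) {
    println(reader.getCurrentValue)
  }
} finally {
  reader.close()
}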
@sadikovi
sadikovi / CustomEncoder.scala
Created Sep 21, 2018
Code to create a custom ExpressionEncoder for a class whose fields use different encoders, including an encoder for Row
View CustomEncoder.scala
def clazz[T](cls: Class[T], encoders: Seq[(String, ExpressionEncoder[_])]): ExpressionEncoder[T] = {
  encoders.foreach { case (_, enc) => enc.assertUnresolved() }
  val schema = StructType(encoders.map {
    case (fieldName, e) =>
      val (dataType, nullable) = if (e.flat) {
        e.schema.head.dataType -> e.schema.head.nullable
      } else {
        e.schema -> true
      }
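The preview cuts off inside the schema construction. A hypothetical call site for this helper could look like the following; the Record class, field names, and payload schema are invented for illustration, and ExpressionEncoder()/RowEncoder() are Spark 2.x internal APIs.

import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.encoders.{ExpressionEncoder, RowEncoder}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical target class: a primitive field, a String field, and a raw Row field.
case class Record(id: Int, name: String, payload: Row)

val payloadSchema = StructType(Seq(StructField("value", StringType, nullable = true)))

// Combine per-field encoders into a single encoder for Record via the helper above.
val encoder: ExpressionEncoder[Record] = clazz(
  classOf[Record],
  Seq(
    "id" -> ExpressionEncoder[Int](),
    "name" -> ExpressionEncoder[String](),
    "payload" -> RowEncoder(payloadSchema)))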
@sadikovi
sadikovi / Example.java
Last active Sep 14, 2018
Issue #158 example
View Example.java
final class Example {
  void /* test */ func() {
    String a = "a";
    String b = "a" + b + "c()";
    Buffer buf = "test" + "new Buffer() {};";
    HashSet<String> test = new HashSet<String>();
  }
  public int get_int() {
@sadikovi
sadikovi / DefaultSource.scala
Last active Jun 18, 2018
Example of StreamSinkProvider for structured streaming with custom query execution
View DefaultSource.scala
package org.apache.spark.sql.sadikovi
import java.io.{ObjectInputStream, ObjectOutputStream}
import java.util.UUID
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.{JobContext, TaskAttemptContext}
import org.apache.spark.internal.io._
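Only the package declaration and imports of the gist are visible above. For orientation, a minimal StreamSinkProvider/Sink pair against the Spark 2.x internal streaming APIs is sketched below; the class names and the row-counting behaviour are placeholders, not the gist's actual custom query execution.

import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.Sink
import org.apache.spark.sql.sources.{DataSourceRegister, StreamSinkProvider}
import org.apache.spark.sql.streaming.OutputMode

// Toy sink: collect each micro-batch and report its size.
class CountingSink extends Sink {
  override def addBatch(batchId: Long, data: DataFrame): Unit = {
    val rows = data.collect()
    println(s"batch $batchId: ${rows.length} rows")
  }
}

// Provider that structured streaming looks up via format(...) on writeStream.
class CountingSinkProvider extends StreamSinkProvider with DataSourceRegister {
  override def shortName(): String = "counting-sink"

  override def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink = new CountingSink()
}

Such a provider would be used as df.writeStream.format(classOf[CountingSinkProvider].getName).start(), or via its short name if registered through META-INF/services.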
@sadikovi
sadikovi / module.patch
Created May 31, 2018
Java 9 module grammar for language-java
View module.patch
diff --git a/grammars/java.cson b/grammars/java.cson
index cb9947a..399c914 100644
--- a/grammars/java.cson
+++ b/grammars/java.cson
@@ -109,6 +109,9 @@
{
'include': '#code'
}
+ {
+ 'include': '#module'
@sadikovi
sadikovi / stats.rs
Created Apr 26, 2018
Mutable Statistics Buffer (for collecting statistics during writes). For PR https://github.com/sunchao/parquet-rs/pull/94
View stats.rs
// ----------------------------------------------------------------------
// Statistics updates
struct MutableStatisticsBuffer<T: DataType> {
  typed: TypedStatistics<T>,
  sort_order: SortOrder
}

impl<T: DataType> MutableStatisticsBuffer<T> {
  pub fn new(column_order: ColumnOrder, is_min_max_deprecated: bool) -> Self {
@sadikovi
sadikovi / utc_date.rs
Last active Apr 15, 2018
Convert a timestamp in seconds into a UTC datetime, as a Rust function
View utc_date.rs
use std::fmt;
#[derive(Clone, Debug)]
pub struct DateTime {
  /// Seconds after the minute - [0, 59]
  pub sec: i32,
  /// Minutes after the hour - [0, 59]
  pub min: i32,
  /// Hours after midnight - [0, 23]
  pub hour: i32,
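The preview shows only the first fields of the hand-rolled DateTime struct. For comparison, the same seconds-to-UTC conversion on the JVM is a one-liner with java.time; this Scala sketch is independent of the gist's Rust implementation.

import java.time.{Instant, ZoneOffset}

// 1523795445 seconds since the Unix epoch is 2018-04-15T12:30:45Z.
val utc = Instant.ofEpochSecond(1523795445L).atOffset(ZoneOffset.UTC)
println(s"${utc.getYear}-${utc.getMonthValue}-${utc.getDayOfMonth} ${utc.getHour}:${utc.getMinute}:${utc.getSecond}")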
@sadikovi
sadikovi / rustfmt.toml
Created Apr 4, 2018
rustfmt configuration for parquet-rs
View rustfmt.toml
max_width = 90
hard_tabs = false
tab_spaces = 2
newline_style = "Unix"
indent_style = "Block"
use_small_heuristics = false
format_strings = false
wrap_comments = true
comment_width = 90
normalize_comments = true
@sadikovi
sadikovi / spellchecker.scala
Last active Apr 25, 2018
Simple spell checker based on dynamic programming
View spellchecker.scala
abstract class Spelling
case class CorrectSpelling(word: String) extends Spelling
case class IncorrectSpelling(word: String, suggestions: List[String]) extends Spelling

case class Spellchecker(dictionary: String) {
  private val numSuggestions = 10
  private val maxDistance = 5
  // set of valid words (replace with trie for space efficiency)
  private val set = readDict(dictionary)
  private val heap = new java.util.PriorityQueue[(Int, String)](
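The preview ends before the actual dynamic programming. The distance such a checker typically relies on is the textbook Levenshtein edit distance, sketched below; this is the standard recurrence, not necessarily the gist's exact code. Suggestions would then be the dictionary words within maxDistance, with the numSuggestions closest kept (presumably what the priority queue above is for).

// Classic DP edit distance: insert, delete, and substitute all cost 1.
def editDistance(a: String, b: String): Int = {
  val dp = Array.ofDim[Int](a.length + 1, b.length + 1)
  for (i <- 0 to a.length) dp(i)(0) = i
  for (j <- 0 to b.length) dp(0)(j) = j
  for (i <- 1 to a.length; j <- 1 to b.length) {
    val subst = if (a(i - 1) == b(j - 1)) 0 else 1
    dp(i)(j) = math.min(
      math.min(dp(i - 1)(j) + 1, dp(i)(j - 1) + 1), // delete from a / insert into a
      dp(i - 1)(j - 1) + subst)                     // substitute (or match)
  }
  dp(a.length)(b.length)
}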
@sadikovi
sadikovi / minimum-ascii-delete-sum-for-two-strings.md
Last active Nov 6, 2017
712. Minimum ASCII Delete Sum for Two Strings
View minimum-ascii-delete-sum-for-two-strings.md

Given two strings s1, s2, find the lowest ASCII sum of deleted characters to make two strings equal.

Example 1:

  • Input: s1 = "sea", s2 = "eat"
  • Output: 231
  • Explanation: Deleting "s" from "sea" adds the ASCII value of "s" (115) to the sum. Deleting "t" from "eat" adds 116 to the sum. At the end, both strings are equal, and 115 + 116 = 231 is the minimum sum possible to achieve this.

Example 2:

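The gist preview is cut off at this point (Example 2 is not shown). The problem itself is a classic two-string DP; a straightforward Scala solution, independent of the gist, uses dp(i)(j) as the minimum deleted-ASCII sum that makes the first i characters of s1 and the first j characters of s2 equal:

def minimumDeleteSum(s1: String, s2: String): Int = {
  val dp = Array.ofDim[Int](s1.length + 1, s2.length + 1)
  // Against an empty string, every character of the other prefix must be deleted.
  for (i <- 1 to s1.length) dp(i)(0) = dp(i - 1)(0) + s1(i - 1)
  for (j <- 1 to s2.length) dp(0)(j) = dp(0)(j - 1) + s2(j - 1)
  for (i <- 1 to s1.length; j <- 1 to s2.length) {
    dp(i)(j) =
      if (s1(i - 1) == s2(j - 1)) dp(i - 1)(j - 1)
      else math.min(dp(i - 1)(j) + s1(i - 1), dp(i)(j - 1) + s2(j - 1))
  }
  dp(s1.length)(s2.length)
}

// Matches Example 1: minimumDeleteSum("sea", "eat") == 231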