Sam Bessalah samklr

## ItemSimilarity.scala
import com.twitter.scalding._
import com.twitter.algebird.{ MinHasher, MinHasher32, MinHashSignature }

/**
 * Computes similar items (with a string itemId), based on approximate
 * Jaccard similarity, using LSH.
 *
 * Assumes an input data TSV file of the following format:
 *
 *    itemId   userId

## benchmark-commands.txt
Producer

Setup
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test-rep-one --partitions 6 --replication-factor 1
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test --partitions 6 --replication-factor 3

Single thread, no replication

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 50000000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

## SparkCopyPostgres.scala
import java.io.InputStream

import org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils
import org.apache.spark.sql.{ DataFrame, Row }

import org.postgresql.copy.CopyManager
import org.postgresql.core.BaseConnection

val jdbcUrl = s"jdbc:postgresql://..." // db credentials elided
val connectionProperties = {

## gist:5558573
// This F# dojo is directly inspired by the
// Digit Recognizer competition from Kaggle.com:
// http://www.kaggle.com/c/digit-recognizer
// The datasets below are simply shorter versions of
// the training dataset from Kaggle.

// The goal of the dojo will be to
// create a classifier that uses training data
// to recognize hand-written digits, and
// evaluate the quality of our classifier

## INSTALL.md

      
Headless Transmission on Mac OS X (jpillora, last active October 25, 2023)

Go to https://developer.apple.com/downloads/index.action, search for "Command line tools", and choose the package that matches your version of Mac OS X.


Go to http://brew.sh/ and paste the one-liner into the Terminal. You now have brew installed (a better MacPorts).


Install transmission-daemon with
brew install transmission


Copy the startup config for launchctl with
ln -sfv /usr/local/opt/transmission/*.plist ~/Library/LaunchAgents


## article.md

      
Error handling pitfalls in Scala (jkpl, last active October 17, 2023)

There are multiple strategies for error handling in Scala.
Errors can be represented as [exceptions][],
which is a common way of dealing with errors in languages such as Java.
However, exceptions are invisible to the type system,
which can make them challenging to deal with.
It's easy to leave out the necessary error handling,
which can result in unfortunate runtime errors.
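
To make the contrast concrete, here is a minimal sketch of the same parsing function written twice: once with an exception that the signature hides, and once with an `Either` that exposes the failure in the return type. The names `ParseError`, `EmptyInput`, `NotANumber`, `parseUnsafe`, `parseSafe`, and `describe` are hypothetical and chosen only for this illustration; they do not come from the article.

```scala
import scala.util.Try

object ErrorHandlingSketch {
  // Hypothetical error type for this sketch; not taken from the article.
  sealed trait ParseError
  case object EmptyInput extends ParseError
  final case class NotANumber(input: String) extends ParseError

  // Exception-based version: the signature promises an Int,
  // but the call can throw NumberFormatException at runtime.
  def parseUnsafe(s: String): Int = s.trim.toInt

  // Either-based version: the possible failure is part of the return type,
  // so a caller has to deal with the Left case to get at the Int.
  def parseSafe(s: String): Either[ParseError, Int] =
    if (s.trim.isEmpty) Left(EmptyInput)
    else Try(s.trim.toInt).toOption.toRight(NotANumber(s))

  // Pattern matching on the result handles every error case explicitly.
  def describe(s: String): String =
    parseSafe(s) match {
      case Right(n)            => s"parsed $n"
      case Left(EmptyInput)    => "input was empty"
      case Left(NotANumber(x)) => s"'$x' is not a number"
    }
}
```

Because `ParseError` is a sealed trait, the compiler can warn when a `match` such as the one in `describe` leaves out an error case, which is the kind of check that a thrown exception does not get.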

  
## FIleSystemOperations.java
package com.cloudwick.mapreduce.FileSystemAPI;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

## kafka-move-leadership.sh
#!/usr/bin/env bash
#
# File:      kafka-move-leadership.sh
#
# Description
# ===========
#
# Generates a Kafka partition reassignment JSON snippet to STDOUT to move the leadership
# of any replicas away from the provided "source" broker to different, randomly selected
# "target" brokers.  Run this script with `-h` to show detailed usage instructions.

## tsp.py
#!/usr/bin/env python

"""
This Python code is based on Java code by Lee Jacobson found in an article
entitled "Applying a genetic algorithm to the travelling salesman problem"
that can be found at: http://goo.gl/cJEY1
"""

import math
import random

## article.org

      
Enforcing invariants in Scala datatypes (jkpl, last active November 9, 2022)

Scala provides many tools to help us build programs with fewer runtime errors.
  Instead of relying on nulls, the recommended practice is to use the Option type.
  Instead of throwing exceptions, Try and Either types are used for representing potential error scenarios.
  What’s common with these features is that they’re used for capturing runtime features in the type system,
  thus lifting the runtime scenario handling to the compilation phase:
  your program doesn’t compile until you’ve explicitly handled nulls, exceptions, and other runtime features in your code.
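
As a small illustration of the points above, the following sketch uses `Option` where a lookup might otherwise return null, and `Try` where a conversion might otherwise throw. The object name `TypeSystemSketch`, the `ports` map, and the helper functions are assumptions made up for this example, not code from the article.

```scala
import scala.util.{Failure, Success, Try}

object TypeSystemSketch {
  // Hypothetical data for this sketch.
  val ports: Map[String, String] = Map("http" -> "80", "broken" -> "eighty")

  // Option instead of null: a missing key is a None, not a null reference.
  def rawPort(service: String): Option[String] = ports.get(service)

  // Try instead of a thrown exception: toInt may throw,
  // and Try captures that outcome as an ordinary value.
  def parsePort(raw: String): Try[Int] = Try(raw.toInt)

  // The Int is only reachable after both the "missing" and the
  // "failed" cases have been handled.
  def portMessage(service: String): String =
    rawPort(service) match {
      case None => s"no port configured for $service"
      case Some(raw) =>
        parsePort(raw) match {
          case Success(port) => s"$service listens on $port"
          case Failure(e)    => s"invalid port for $service: ${e.getMessage}"
        }
    }
}
```

An `Option[String]` cannot be used where a `String` is expected, so code that forgets the `None` case fails to compile rather than failing at runtime.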
In his “Strategic Scala Style” blog post series,