@timvw
timvw / sparkdemo.scala
Created September 22, 2016 19:53
SparkSQL and CTE for increased readability
val df = spark.read.text(inputFile)
df.createOrReplaceTempView("data")
val query =
"""
| WITH loglevel AS (SELECT SPLIT(value, ' ')[0] AS level FROM data WHERE LENGTH(value) > 0),
| levelcount AS (SELECT level, COUNT(*) as count FROM loglevel GROUP BY level)
| SELECT level, count FROM levelcount ORDER BY count DESC
""".stripMargin
@Chaser324
Chaser324 / GitHub-Forking.md
Last active May 2, 2024 05:49
GitHub Standard Fork & Pull Request Workflow

Whether you're trying to give back to the open source community or collaborating on your own projects, knowing how to properly fork and generate pull requests is essential. Unfortunately, it's quite easy to make mistakes or not know what you should do when you're initially learning the process. I know that I certainly had considerable initial trouble with it, and I found a lot of the information on GitHub and around the internet to be rather piecemeal and incomplete - part of the process described here, another there, common hangups in a different place, and so on.

In an attempt to collate this information for myself and others, this short tutorial covers what I've found to be a fairly standard procedure for creating a fork, doing your work, issuing a pull request, and merging that pull request back into the original project.

Creating a Fork

Just head over to the GitHub page and click the "Fork" button. It's just that simple. Once you've done that, you can use your favorite git client to clone your repo or j

@Bouke
Bouke / gist:10454272
Last active September 22, 2023 17:23
Install FreeTDS, unixODBC and pyodbc on OS X

First, install the following libraries:

$ brew install unixodbc
$ brew install freetds --with-unixodbc

FreeTDS should already work now, without configuration:

$ tsql -S [IP or hostname] -U [username] -P [password]
locale is "en_US.UTF-8"

locale charset is "UTF-8"

@NotBadPad
NotBadPad / MMWordMatch
Created March 11, 2014 13:17
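A simple maximum prefix matching segmenter for Chinese. For now it can only split continuous Chinese sentences; handling punctuation, digits, English, and special characters as well would let it process general text. In testing, segmenting a 22 MB file of more than 8 million characters took about 900 ms, probably because both the approach and the file content are fairly simple. When I have time I'll try implementing a Chinese trie to see how it compares.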
package com.gj.split;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.HashSet;
@viktorklang
viktorklang / Future-retry.scala
Last active July 23, 2023 23:48
Asynchronous retry for Future in Scala
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext
import scala.concurrent.Future
import akka.pattern.after
import akka.actor.Scheduler
/**
* Given an operation that produces a T, returns a Future containing the result of T, unless an exception is thrown,
* in which case the operation will be retried after _delay_ time, if there are more possible retries; the number of retries is configured through
* the _retries_ parameter. If the operation does not succeed and there are no retries left, the resulting Future will contain the last failure.
@jars
jars / hacker-radiio
Last active December 29, 2022 16:25
Music For Hackers
Music For Hackers
==
To a hacker, there's something distracting about booting up a GUI to listen to your tunes. You live your life in the terminal and treat the mouse like a high-voltage tap.
So give these commands a run in the terminal, and toss on your headphones.
sudo apt-get install mplayer
echo "alias defcon-start='nohup mplayer http://sfstream1.somafm.com:6200 > /dev/null 2>&1 &'" >> ~/.bashrc
echo "alias defcon-stop='killall -9 mplayer'" >> ~/.bashrc
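When silencing a background player like this, the order of the redirections matters: `> /dev/null 2>&1` discards both streams, whereas `> /dev/null 1>&2` points stdout at stderr and lets everything through. The sketch below illustrates the difference with a hypothetical `noisy` function standing in for mplayer, so it runs without any stream or player installed.

```shell
#!/bin/sh
# "noisy" is a hypothetical stand-in for mplayer: it writes to both streams.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

# stdout goes to /dev/null first, then stderr follows it there: nothing escapes.
silenced=$( { noisy > /dev/null 2>&1; } 2>&1 )

# stdout goes to /dev/null, then is re-pointed at stderr: both streams escape.
leaky=$( { noisy > /dev/null 1>&2; } 2>&1 )

echo "silenced captured: [$silenced]"
echo "leaky captured: [$leaky]"
```

The outer `2>&1` on each brace group is only there so the command substitution can capture whatever reaches stderr.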
@mobilemind
mobilemind / git-tag-delete-local-and-remote.sh
Last active April 30, 2024 23:36
How to delete a git tag locally and remotely
# delete local tag '12345'
git tag -d 12345
# delete remote tag '12345' (eg, GitHub version too)
git push origin :refs/tags/12345
# alternative approach
git push --delete origin tagName
git tag -d tagName
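To try the commands above without touching a real project, one option is a throwaway sandbox. This is a minimal sketch that assumes git is installed; the temporary directory, the bare repository standing in for the remote, and the placeholder commit identity are all inventions for the demo, not part of the original gist.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: a local bare repository plays the role of the remote.
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"
git clone --quiet "$work/remote.git" "$work/clone" 2>/dev/null
cd "$work/clone"

# A commit is needed before anything can be tagged; the identity is a placeholder.
git -c user.name=demo -c user.email=demo@example.com \
  commit --quiet --allow-empty -m "initial commit"

# Create tag 12345 locally and publish it to the remote.
git tag 12345
git push --quiet origin 12345

# Delete the tag locally, then delete it on the remote.
git tag -d 12345
git push --quiet origin :refs/tags/12345

# After both deletions, neither listing should mention the tag.
local_tags=$(git tag -l 12345)
remote_tags=$(git ls-remote --tags origin 12345)
echo "local: [$local_tags] remote: [$remote_tags]"
```

Deleting the local tag does not affect the remote and vice versa, which is why the script checks both listings separately.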
@tommct
tommct / README.md
Last active November 9, 2023 20:00
FreeTDS and pyodbc on Mac OS X 10.8 via Homebrew

After spending many hours trying to get FreeTDS and unixodbc to run on a Mac OS X 10.8 system with the python module, pyodbc, I eventually came to this recipe, which is remarkably simple thanks to homebrew. I also found unixodbc was unnecessary and I couldn't get it to play well with FreeTDS, so this install does not include unixodbc. See also http://www.acloudtree.com/how-to-install-freetds-and-unixodbc-on-osx-using-homebrew-for-use-with-ruby-php-and-perl/ and http://www.cerebralmastication.com/2013/01/installing-debugging-odbc-on-mac-os-x/.

Prerequisites: Be sure you have Xcode and the Command Line Tools for Xcode installed from Apple. Also install Homebrew, then run brew update and brew doctor.

Install FreeTDS:

brew install freetds

Test your install:

@komamitsu
komamitsu / gist:1528682
Created December 28, 2011 17:00
NIO Echo Server
package com.komamitsu;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;