Georgios Gousios gousiosg

gousiosg /
Last active Feb 4, 2020
Rebuild RAID array when one disk has been marked as faulty (but it is not really)
View deps.bib
title={Could I Have a Stack Trace to Examine the Dependency Conflict Issue?},
author={Wang, Ying and Wen, Ming and Wu, Rongxin and Liu, Zhenwei and Tan, Shin Hwei and Zhu, Zhiliang and Yu, Hai and Cheung, Shing-Chi},
booktitle={ICSE 2019},
Note = {
The authors consider the problem of dependency conflicts.
This happens when imported libraries include classes of the same name or when multiple versions of the same library are imported.
The authors found several issues on GitHub related to dependency conflicts.
They build a full-scale CFG (including the program and its dependencies) and initially short-circuit all branch conditions
gousiosg / ml4se.bib
Last active Jul 16, 2019
My reading list for ML4SE
View ml4se.bib
author = {Alon, Uri and Zilberstein, Meital and Levy, Omer and Yahav, Eran},
title = {Code2Vec: Learning Distributed Representations of Code},
journal = {Proc. ACM Program. Lang.},
issue_date = {January 2019},
volume = {3},
number = {POPL},
month = jan,
year = {2019},
issn = {2475-1421},
highlight -O rtf -s seashell -k Monaco -K 20 foo.rb |pbcopy
#!/usr/bin/env python
# (c) 2018 Georgios Gousios <>
# Barebones linear equation solving trainer
from __future__ import division
from random import randint
import codecs
import sys
gousiosg /
Last active Apr 30, 2018
Restoring the GHTorrent MongoDB database

This is a collection of scripts to restore a full GHTorrent MongoDB database from the dumps available at

To do the restore:

  1. Open a MongoDB terminal and run the createCollections.js script to create the necessary collections. You can set block_compressor to either snappy or zlib to compress your databases. I am using none here, as I already compress at the filesystem level.

  2. Run to restore the cumulative dumps. Wait 3-4 days.
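
As a rough illustration of the block_compressor choice in step 1, the sketch below builds the per-collection options that select a WiredTiger block compressor. It is not the createCollections.js script itself: the helper name `collection_options` is hypothetical, and it assumes the WiredTiger storage engine is in use.

```python
# Hypothetical helper, not part of the original restore scripts.
# Assumes MongoDB runs with the WiredTiger storage engine.
VALID_COMPRESSORS = {"none", "snappy", "zlib"}


def collection_options(block_compressor="none"):
    """Return createCollection options selecting a WiredTiger block compressor."""
    if block_compressor not in VALID_COMPRESSORS:
        raise ValueError("unsupported block_compressor: %r" % block_compressor)
    return {
        "storageEngine": {
            "wiredTiger": {
                "configString": "block_compressor=%s" % block_compressor
            }
        }
    }
```

With PyMongo, these options should be passable as keyword arguments, for example `db.create_collection("commits", **collection_options("zlib"))`, where the collection name is only an example.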

digraph g {
graph [fontname = "helvetica"];
node [shape=record, fontname = "helvetica"];
edge [fontname = "helvetica"];
1 -> 95;
1 -> 10;
2 -> 78;
gousiosg /
Last active Nov 20, 2017
How compatible is your Unix with the original one?
#!/usr/bin/env bash
echo 0 0 > $TEMPFILE
curl ""|
grep "(I)"|
gousiosg /
Last active Sep 23, 2016
Experiments with various languages on low level file parsing

So today I was experimenting with various languages in order to make the GHTorrent MySQL "CSV" dumps behave like RFC-compliant CSV files. This involved parsing multi-GB, UTF-8 encoded files and running a small state machine at the character level. I started with Ruby, but it was slow:

$ time ruby csvify.rb projects.csv >/dev/null

real	0m36.714s
user	0m35.689s
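
To give a feel for the kind of character-level state machine described above, here is a minimal Python sketch. It is not the csvify.rb script being timed. It assumes (this is not stated in the text) that the dumps escape quotes inside quoted fields with a backslash, whereas RFC 4180 expects them doubled; the function name `csvify` and the three states are chosen for illustration only.

```python
# Illustrative sketch only; the escaping convention is an assumption.
OUTSIDE, INSIDE, ESCAPED = 0, 1, 2


def csvify(text):
    """Rewrite backslash-escaped quotes as RFC 4180 doubled quotes."""
    state = OUTSIDE
    out = []
    for ch in text:
        if state == OUTSIDE:
            if ch == '"':
                state = INSIDE          # opening quote of a field
            out.append(ch)
        elif state == INSIDE:
            if ch == '\\':
                state = ESCAPED         # hold the backslash until the next char
            else:
                if ch == '"':
                    state = OUTSIDE     # closing quote of the field
                out.append(ch)
        else:  # ESCAPED
            if ch == '"':
                out.append('""')        # \" becomes ""
            elif ch == '\\':
                out.append('\\')        # \\ becomes a single backslash
            else:
                out.append('\\' + ch)   # leave other escapes untouched
            state = INSIDE
    return ''.join(out)
```

Backslashes outside quoted fields pass through unchanged. For multi-GB inputs one would feed the function a stream of chunks rather than a single string, carrying `state` across chunk boundaries.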
View gist:8e735fb65a8d5666c502fabed8071c5d
# start the replset nodes
$ mongod --dbpath mongodb/ --replSet ghtorrent
$ mongod --dbpath mongodb-repl1/ --port 27018 --replSet ghtorrent
$ mongod --dbpath mongodb-repl2/ --port 27019 --replSet ghtorrent
# connect to primary
$ mongo
# In mongo shell
ghtorrent:PRIMARY> rs.initiate()