Aaron Morton amorton

@amorton
amorton / sstable_link.py
Created October 20, 2011 00:26
sstable_link
amorton / reverse_query_profile.py
Created October 3, 2011 08:43
reverse_query_profile.py
#!/usr/bin/env python
"""Tool for profiling Cassandra query performance using reverse comparators.
Tests are run by profile() multiple times and the 'Read Latency' is
extracted using nodetool.
Usage:
#Create the schema using the cassandra-cli.
amorton / token_range.py
Created September 29, 2011 10:39
Script to output the token range ownership for Cassandra nodes.
#!/usr/bin/env python
"""Script to output the token range ownership for Cassandra nodes.
usage:
$ ./token_range.py --interactive 154009024815050802110273337963779530663 141704132449535340642001248672108470009 102889564695022956386161396156024583904
154009024815050802110273337963779530663 141704132449535340642001248672108470009 102889564695022956386161396156024583904
69.95% - 102889564695022956386161396156024583904
22.81% - 141704132449535340642001248672108470009
7.23% - 154009024815050802110273337963779530663
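The percentages in the usage example are consistent with a ring of 2**127 tokens (the RandomPartitioner token space), where each node owns the range that starts just after the previous token and ends at its own token, wrapping around the end of the ring. A minimal sketch of that calculation follows; the function name `token_ownership` and the `RING_SIZE` constant are assumptions for illustration and are not taken from the gist itself.

```python
RING_SIZE = 2 ** 127  # assumed: RandomPartitioner token space


def token_ownership(tokens, ring_size=RING_SIZE):
    """Return (percent_owned, token) pairs, largest share first.

    Each token is assumed to own the range that starts just after the
    previous token on the ring and ends at the token itself.
    """
    ordered = sorted(tokens)
    shares = []
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # i == 0 wraps around to the last token
        # A width of 0 only happens for a lone token, which owns the whole ring.
        width = (token - previous) % ring_size or ring_size
        shares.append((100.0 * width / ring_size, token))
    return sorted(shares, reverse=True)


if __name__ == "__main__":
    example_tokens = [
        154009024815050802110273337963779530663,
        141704132449535340642001248672108470009,
        102889564695022956386161396156024583904,
    ]
    for percent, token in token_ownership(example_tokens):
        print("%.2f%% - %d" % (percent, token))
```

Sorting by share rather than by token matches the order of the sample output, which lists the largest owner first.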
amorton / gist:1153956
Created August 18, 2011 12:30
log4j 1.2.16 diff to show FileAppender being reset.
diff --git a/build.xml b/build.xml
index e5ec33b..c05f24c 100644
--- a/build.xml
+++ b/build.xml
@@ -99,8 +99,8 @@
<!-- Directory for temporary files. -->
<property name="dist.tmp" value="${dist.dir}/tmp"/>
- <property name="javac.source" value="1.2"/>
- <property name="javac.target" value="1.1"/>
amorton / gist:1106067
Created July 26, 2011 05:50
Brisk Hive b2 select from counter column
2011-07-26 17:23:34,376 ERROR CliDriver (SessionState.java:printError(343)) - Failed with exception java.io.IOException:java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:341)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:133)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
amorton / gist:1103267
Created July 24, 2011 23:59
100 hits a minute, distributed per second using a Poisson distribution
In [6]: import numpy as np
In [7]: np.random.po
np.random.poisson np.random.power
In [7]: np.random.poisson(100 / 60.0, 60)
Out[7]:
array([0, 2, 2, 1, 4, 3, 5, 3, 1, 0, 3, 1, 1, 1, 1, 2, 3, 2, 2, 0, 3, 2, 0,
2, 3, 1, 2, 1, 4, 1, 5, 1, 3, 1, 1, 3, 1, 3, 1, 1, 0, 1, 3, 1, 2, 4,
2, 3, 4, 2, 3, 3, 4, 3, 3, 1, 2, 3, 1, 1])
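Where NumPy is not available, the same idea can be sketched with the standard library alone. The helper below draws Poisson-distributed counts with Knuth's multiplication method, which is a swapped-in technique and not necessarily what `np.random.poisson` does internally; the names `poisson_sample` and `hits_per_second` are hypothetical.

```python
import math
import random


def poisson_sample(lam, rng=random):
    """Draw one Poisson(lam) variate using Knuth's method (suited to small lam)."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    # Keep multiplying uniform draws until the product falls to the threshold.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def hits_per_second(hits_per_minute=100, seconds=60, rng=random):
    """Return per-second hit counts whose mean rate is hits_per_minute / 60."""
    rate = hits_per_minute / 60.0
    return [poisson_sample(rate, rng) for _ in range(seconds)]


if __name__ == "__main__":
    print(hits_per_second())
```

Individual minutes will not sum to exactly 100 hits; only the long-run average rate is fixed.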
amorton / query_profile.py
Created July 10, 2011 17:17
Tool for profiling Cassandra query performance.
"""Tool for profiling Cassandra query performance.
Tests are run by profile() multiple times and the 'Read Latency' is
extracted using nodetool.
Usage:
#Create the schema using the cassandra-cli.
create keyspace query
amorton / gist:1068855
Created July 7, 2011 03:28
Logging Cassandra column pages for slice by Name
diff --git a/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java b/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java
index 6cbc64b..014ce2c 100644
--- a/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java
+++ b/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java
@@ -184,6 +184,7 @@ public class SSTableNamesIterator extends SimpleAbstractColumnIterator implement
}
}
}
+ logger.info("Column Page Count {}/{}", Integer.toString(indexList.size()), Integer.toString(ranges.size()));
}
amorton / gist:1064312
Created July 5, 2011 05:56
My super simple Python CLI
if __name__ == "__main__":
    action = sys.argv[1] if len(sys.argv) > 1 else None
    args = [
        token
        for token in sys.argv[2:]
        if token.find("=") < 0
    ]
    kwargs = dict(
        token.split("=")
        for token in sys.argv[2:]
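The preview of this gist is cut off, so the following is only a sketch of how the same pattern could be completed: tokens containing `=` become keyword arguments and the remaining tokens become positional arguments. The function name `parse_cli` and the choice to split on the first `=` only are assumptions, not taken from the gist.

```python
import sys


def parse_cli(argv):
    """Split argv into (action, args, kwargs).

    argv[1] is the action; later tokens are keyword arguments when they
    contain "=", otherwise positional arguments.
    """
    action = argv[1] if len(argv) > 1 else None
    rest = argv[2:]
    args = [token for token in rest if "=" not in token]
    # Split on the first "=" only, so values may themselves contain "=".
    kwargs = dict(token.split("=", 1) for token in rest if "=" in token)
    return action, args, kwargs


if __name__ == "__main__":
    action, args, kwargs = parse_cli(sys.argv)
    print(action, args, kwargs)
```

Taking `argv` as a parameter, rather than reading `sys.argv` directly, keeps the parsing easy to exercise from a unit test.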
amorton / gist:1024636
Created June 14, 2011 10:17
Diff the files in a directory using Python
#used as part of a unit test
def _compare_directories(self, expected_dir, actual_dir, header=None):
    """Compares two directories and diffs any different files.
    The file list of both directories must match and the files must match.
    Subdirectories are not considered, as I don't need them in my case.
    The diff will not show that a file was deleted from one side. Instead it
    shows that all the lines were removed.
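
The preview ends inside the docstring, so the body of the method is not shown. As a rough, standalone sketch of the behaviour the docstring describes, the function below compares only the regular files directly inside two directories and collects a unified diff for any that differ. The name `compare_directories`, the use of `difflib.unified_diff`, and raising `ValueError` when the file lists differ are all assumptions for illustration.

```python
import difflib
import os


def compare_directories(expected_dir, actual_dir):
    """Return unified-diff lines for differing files; empty list if none differ.

    Only regular files directly inside each directory are considered.
    """
    def list_files(path):
        return sorted(
            name for name in os.listdir(path)
            if os.path.isfile(os.path.join(path, name))
        )

    expected_files = list_files(expected_dir)
    actual_files = list_files(actual_dir)
    if expected_files != actual_files:
        # Assumed behaviour: mismatched file lists are reported as an error.
        raise ValueError(
            "file lists differ: %r != %r" % (expected_files, actual_files)
        )

    diff_lines = []
    for name in expected_files:
        with open(os.path.join(expected_dir, name)) as handle:
            expected = handle.readlines()
        with open(os.path.join(actual_dir, name)) as handle:
            actual = handle.readlines()
        # unified_diff yields nothing when the two line lists are identical.
        diff_lines.extend(difflib.unified_diff(
            expected, actual,
            fromfile=os.path.join(expected_dir, name),
            tofile=os.path.join(actual_dir, name),
        ))
    return diff_lines
```

In a unit test, an assertion that the returned list is empty, with the joined diff as the failure message, would surface the differences directly in the test report.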