Sean Busbey busbey

  • Champaign, IL USA

HBASE 2.4.7 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.


  • HBASE-26274 | Major | Create an option to reintroduce BlockCache to mapreduce job

Introduces `hfile.onheap.block.cache.fixed.size`, which is disabled by default. When using ClientSideRegionScanner, it is enabled with a fixed size for caching INDEX/LEAF_INDEX blocks when a client (e.g. a snapshot scanner) scans the entire HFile and does not need to seek/reseek to an index block multiple times.

busbey / host_offset.txt
Last active August 3, 2021 19:01
Helper for running YCSB "Core Workloads" on multiple hosts.
client-1.example.com:0
client-2.example.com:429496729
client-3.example.com:858993458
client-4.example.com:1288490187
client-5.example.com:1717986916
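
The offsets listed above are evenly spaced in steps of 429496729. A minimal sketch of generating such a file follows, assuming a fixed step between consecutive hosts. The function name `host_offsets` and its `step` parameter are hypothetical and not part of the original gist, and how the helper consumes these offsets is not shown in this preview.

```python
def host_offsets(hosts, step=429496729):
    """Return 'host:offset' lines with evenly spaced offsets.

    The default step is simply the spacing observed in the listing above;
    it is an assumption, not a value documented by the gist.
    """
    return ["{}:{}".format(host, index * step) for index, host in enumerate(hosts)]


# Placeholder host names matching the pattern used in the listing.
hosts = ["client-{}.example.com".format(n) for n in range(1, 6)]
lines = host_offsets(hosts)
print("\n".join(lines))
```

Deriving the offsets from the host index keeps the file consistent when hosts are added or removed, at the cost of assuming every host gets an equally sized range.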
busbey / example.py
Created March 3, 2020 16:37
Example of calling a python script as a part of a maven build
#!/usr/bin/env python2
# Copyright 2020 Sean Busbey
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of
# this software and associated documentation files (the "Software"), to deal in
# the Software without restriction, including without limitation the rights to
# use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
# of the Software, and to permit persons to whom the Software is furnished to do
# so, subject to the following conditions:
#
#!/bin/bash -e
if [[ "true" = "${DEBUG}" ]]; then
  set -x
  printenv
fi
##To set jenkins Environment Variables:
export TOOLS_HOME=/home/jenkins/tools
export FINDBUGS_HOME=${TOOLS_HOME}/findbugs/latest
busbey / 0 example for compatible rename
Last active March 1, 2019 23:02
mike had a question
Let's walk through renaming a class in a way that maintains compatibility for a factory class!
Licensed under the MIT license (https://opensource.org/licenses/MIT)
Copyright 2019 Sean Busbey
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
# Run with --help for cli options
#
# Look below for the section marked XXX on how to enable deletes
#
# python2 old_slack_files.py --aggregate-by-type --domain example path/to/my/example.oauth.token.file
#
# Original content from
#
# https://www.shiftedup.com/2014/11/13/how-to-bulk-remove-files-from-slack
#
hbase(main):002:0> create 't1', 'family', 'access'
hbase(main):006:0> put = org.apache.hadoop.hbase.client.Put.new("row1".to_java_bytes)
hbase(main):008:0> put.add_column("family".to_java_bytes, "column1".to_java_bytes, "a value".to_java_bytes)
=> #<Java::OrgApacheHadoopHbaseClient::Put:0x648855fd>
hbase(main):009:0> put.add_column("family".to_java_bytes, "column2".to_java_bytes, "another value".to_java_bytes)
=> #<Java::OrgApacheHadoopHbaseClient::Put:0x648855fd>
hbase(main):010:0> put.add_column("access".to_java_bytes, nil, "@group1".to_java_bytes)
=> #<Java::OrgApacheHadoopHbaseClient::Put:0x648855fd>
hbase(main):012:0> get_table("t1").table.put(put)
0 row(s) in 0.0010 seconds
>>> import jenkins
>>> server = jenkins.Jenkins('https://builds.apache.org/')
>>> job_info = server.get_job_info('PreCommit-HBASE-Build', 1, True)
>>> builds = [catch(lambda : server.get_build_info('PreCommit-HBASE-Build', x['number'])) for x in job_info['builds']]
>>> for build in builds:
...     if build['result'] == 'SUCCESS':
...         host_counts[build['builtOn']]['success']+=1
...     elif build['result'] == 'FAILURE':
...         for test_log in [artifact for artifact in build['artifacts'] if artifact['fileName'].startswith('patch-unit')]:
...             try:
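
The session above is cut off, and neither `catch` nor `host_counts` is defined in the visible part. As a rough sketch of the per-host tally it appears to be building, the following works on plain dictionaries shaped like the build info used above (`result`, `builtOn`) instead of calling a Jenkins server. The `catch` body, the `tally_by_host` function, and the `'failure'` key are assumptions for illustration. The original's failure branch inspects `patch-unit` artifacts, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict


def catch(func, default=None):
    """Hypothetical helper: call func and return default if it raises."""
    try:
        return func()
    except Exception:
        return default


def tally_by_host(builds):
    """Count successful and failed builds per host, skipping missing entries."""
    host_counts = defaultdict(lambda: defaultdict(int))
    for build in builds:
        if build is None:
            # A lookup that failed inside catch() yields None; ignore it.
            continue
        result = build.get('result')
        if result == 'SUCCESS':
            host_counts[build['builtOn']]['success'] += 1
        elif result == 'FAILURE':
            host_counts[build['builtOn']]['failure'] += 1
    return host_counts


# Made-up sample data standing in for server.get_build_info() results.
sample = [
    {'result': 'SUCCESS', 'builtOn': 'H1'},
    {'result': 'FAILURE', 'builtOn': 'H1'},
    {'result': 'SUCCESS', 'builtOn': 'H2'},
    catch(lambda: 1 / 0),  # simulates a failed lookup
]
counts = tally_by_host(sample)
```

The nested `defaultdict` avoids having to initialise each host before incrementing its counters.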

Problem Statement

Currently, Hadoop exposes downstream clients to a variety of third-party libraries. As our code base grows and matures, we increase the set of libraries we rely on. At the same time, as our user base grows, we increase the likelihood that some downstream project will run into a conflict while attempting to use a different version of a library we depend on. While a few hot-button third-party libraries drive most of the development and support issues (e.g. Guava, Apache Commons, and Jackson), a coherent general practice will ensure that we avoid future complications. Simply attempting to coordinate library versions among Hadoop and various downstream projects is untenable, because each project has its own release schedule and often attempts to support multiple versions of other ecosystem projects. Furthermore, our current conservative approach to dependency updates leads to reliance on stale versions of everything. Those stale versions include

#!/bin/bash -e
if [[ "true" = "${DEBUG}" ]]; then
  set -x
fi
##To set jenkins Environment Variables:
export TOOLS_HOME=/home/jenkins/tools
export JAVA_HOME=${TOOLS_HOME}/java/jdk1.7.0_79
export FINDBUGS_HOME=${TOOLS_HOME}/findbugs/latest