@toddlipcon
Created September 20, 2016 18:59
commit b866f40184b9198a3d4d2fd7c3f78bf728e9ff19
Author: Todd Lipcon <todd@apache.org>
Date: Mon Sep 12 20:27:23 2016 -0700
Change version to non-SNAPSHOT in branch
Change-Id: Ibc73006692673591a78c1bf3a101058ad62fc014
Reviewed-on: http://gerrit.cloudera.org:8080/4399
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <todd@apache.org>
commit 9504d5e312aee05c6214c916fd8d36760af1873c
Author: Todd Lipcon <todd@apache.org>
Date: Mon Sep 12 20:27:37 2016 -0700
Revert "java: fix leak of TabletClient objects in client2tablets map"
This reverts commit d5082d8ec1218e3f3bd2143d117ddd64772a6162. It was
found to be unstable on master, so reverting for branch-1.0.
Change-Id: I2e9e227af0acfb677572a8baad9692ce08c46be0
Reviewed-on: http://gerrit.cloudera.org:8080/4398
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <dralves@apache.org>
commit d5082d8ec1218e3f3bd2143d117ddd64772a6162
Author: Todd Lipcon <todd@apache.org>
Date: Wed Aug 24 16:41:28 2016 -0700
java: fix leak of TabletClient objects in client2tablets map
After running YCSB for a week, most of the clients had hit OOMEs. Some
heap dump analysis showed that the client2tablets map had hundreds of
thousands of leaked clients.
It seems that we were neglecting to remove the client from the
client2tablets map upon a disconnect. This fixes the issue and adds a
regression test which reproduced the bug.
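To illustrate the kind of leak described above, here is a minimal sketch, not the actual Kudu client code. The class ClientTabletRegistry and its methods are hypothetical names chosen for illustration; only the client2tablets map name comes from the commit message.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical registry; the names are illustrative and do not mirror Kudu's classes.
final class ClientTabletRegistry {
  // Maps a client connection id to the tablets reached through it.
  private final Map<String, List<String>> client2tablets = new ConcurrentHashMap<>();

  void addTablet(String clientId, String tabletId) {
    client2tablets
        .computeIfAbsent(clientId, k -> new CopyOnWriteArrayList<>())
        .add(tabletId);
  }

  // Called when a connection drops. Skipping this removal is the leak pattern:
  // each reconnect adds a new key while the stale entries stay reachable.
  void onDisconnect(String clientId) {
    client2tablets.remove(clientId);
  }

  int trackedClients() {
    return client2tablets.size();
  }
}
```

A regression test in this style would connect and disconnect many clients in a loop and then check that the map has not grown.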
This patch was worked on by Todd Lipcon and David Alves.
Change-Id: I302650f2a6526e7c51537264137a4f00cbbda073
Reviewed-on: http://gerrit.cloudera.org:8080/4119
Tested-by: David Ribeiro Alves <dralves@apache.org>
Reviewed-by: Todd Lipcon <todd@apache.org>
commit 7eaeb6d902a59428780a91a460ded537afe3c4d4
Author: Todd Lipcon <todd@apache.org>
Date: Fri Sep 9 01:35:39 2016 -0700
KUDU-1594. Rename TIMESTAMP to UNIXTIME_MICROS
As described in the JIRA, this is to reduce confusion between Kudu's
timestamp (which is just an int64 count of microseconds since the Unix
UTC epoch) and the TIMESTAMP type in other systems.
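As a rough illustration of what such a value represents, the sketch below converts between java.time.Instant and a plain long. It assumes, based on the new type name, that the unit is microseconds; the UnixtimeMicros helper class is hypothetical and is not part of the Kudu client API.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not a Kudu class: treats a long as microseconds since the Unix epoch.
final class UnixtimeMicros {
  // Whole microseconds between the epoch and t (sub-microsecond precision is dropped).
  static long fromInstant(Instant t) {
    return ChronoUnit.MICROS.between(Instant.EPOCH, t);
  }

  // Inverse conversion back to an Instant.
  static Instant toInstant(long micros) {
    return Instant.EPOCH.plus(micros, ChronoUnit.MICROS);
  }
}
```

The value carries no time zone or calendar information, which is the distinction from the TIMESTAMP types of other systems that the rename is meant to highlight.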
Change-Id: Ia2c76ade3c82bc7c413aebdb2a09a26a70b47c62
Reviewed-on: http://gerrit.cloudera.org:8080/4343
Reviewed-by: David Ribeiro Alves <dralves@apache.org>
Tested-by: Kudu Jenkins
commit c7dab48ebc1dd05ff05a872ed7c3ab2a92f72a0d
Author: Dan Burkert <dan@cloudera.com>
Date: Fri Aug 19 13:30:52 2016 -0700
KUDU-1065: [java client] Flexible Partition Pruning
This commit introduces an internal utility ByteVec class which is a
mashup of the C++ std::string / Rust Vec<u8> types. KeyEncoder has been
transitioned to use this type instead of ByteArrayOutputStream. The
partition pruning algorithm incrementally builds up partition keys from
predicates, and requires cloning the keys as they are being built in
order to multiply over hash partition buckets. ByteArrayOutputStream
doesn't have a clone method. ByteArrayOutputStream is also synchronized
internally, which is dumb. Thus begat ByteVec.
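A minimal sketch of the idea follows: an unsynchronized, growable byte buffer that can be cloned. This is not the ByteVec added by this commit; the class name SimpleByteVec and its method names are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch of an unsynchronized, cloneable byte vector.
final class SimpleByteVec implements Cloneable {
  private byte[] data = new byte[16];
  private int len = 0;

  // Appends a single byte, growing the backing array if needed.
  void push(byte b) {
    reserve(1);
    data[len++] = b;
  }

  // Appends all bytes of src.
  void append(byte[] src) {
    reserve(src.length);
    System.arraycopy(src, 0, data, len, src.length);
    len += src.length;
  }

  int len() {
    return len;
  }

  // Returns a copy of the bytes written so far.
  byte[] toArray() {
    return Arrays.copyOf(data, len);
  }

  // Deep copy: the clone gets its own backing array, so later writes do not alias.
  @Override
  public SimpleByteVec clone() {
    try {
      SimpleByteVec copy = (SimpleByteVec) super.clone();
      copy.data = data.clone();
      return copy;
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e);
    }
  }

  // Ensures capacity for extra more bytes, at least doubling on growth.
  private void reserve(int extra) {
    int needed = len + extra;
    if (needed > data.length) {
      data = Arrays.copyOf(data, Math.max(needed, data.length * 2));
    }
  }
}
```

With a type like this, a partially built key can be cloned once per hash bucket and each copy extended independently, which is the "multiply over hash partition buckets" step mentioned above.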
This version of partition pruning only looks at predicates when
determining which partitions to prune. Constraints in the primary key
bounds are not considered, unless the table is range partitioned over
the primary key columns and not hash partitioned (simple partitioning).
This limits the pruned partitions in some pretty rare cases, but the
workaround of explicitly setting the predicate is not too onerous.
Finally, this commit changes the default test flags to remove mini
cluster verbose logging, since it is extremely noisy.
Change-Id: Ib27b54841d87cf854175ab8cdfa8798b337d71f9
Reviewed-on: http://gerrit.cloudera.org:8080/4299
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <adar@cloudera.com>
commit 4fd0572eef60b163f9fb02742a3f5661117e8c8b
Author: Dan Burkert <dan@cloudera.com>
Date: Mon Aug 29 17:48:31 2016 -0700
[java-client] Add ScanToken.stringifySerializedToken
stringifySerializedToken takes a serialized scan token, and returns a
String suitable for debug printing. The string contains information
sufficient to determine which range of tablet(s) the scan token will
cover, and which rows within the tablets. Namely, the range partition
bounds and primary key bounds. Example output, wrapped for readability
(a, b, and c are column names with type STRING):
ScanToken{table=org.apache.kudu.client.TestKuduClient-1472595465767,
lower-bound-primary-key=(string a=1, string b=3, string c=5),
upper-bound-primary-key=(string a=2, string b=4, string c=),
hash-partition-buckets: [2],
range-partition: [(string a=0, string b=0, string c=0),
(string a=3, string b=5, string c=6))}
The Java client did not have any method of deserializing encoded primary
or partition keys, so most of the work in this commit is introducing
utility methods for that purpose. I haven't added tests of the specific
format of the strings, but I have added the printing to many of the
existing ScanToken tests in order to make sure that the formatting code
itself doesn't fail. I've also verified that the output looks good. The
format doesn't include information like predicates or consistency, but
that could be added in the future if so desired.
Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Reviewed-on: http://gerrit.cloudera.org:8080/4173
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <dan@cloudera.com>
commit ed216bcdf4dc62926eef7776dcf6d97129214c97
Author: Todd Lipcon <todd@apache.org>
Date: Tue Aug 23 15:09:44 2016 -0700
KUDU-1231. Add "unlock" flag for experimental and unsafe flags
This adds two new flags:
--unlock_experimental_flags
--unlock_unsafe_flags
If a flag is tagged as 'unsafe' or 'experimental', and the user tries
to set this flag on the command line without the corresponding 'unlock'
flag being set, then the process will exit at startup with an error.
Example error output without flags unlocked:
E0824 14:04:57.263624 14821 flags.cc:296] Flag --never_fsync is unsafe and unsupported.
E0824 14:04:57.263749 14821 flags.cc:302] 1 unsafe flag(s) in use.
E0824 14:04:57.263761 14821 flags.cc:303] Use --unlock_unsafe_flags to proceed at your own risk.
E0824 14:04:57.264104 14821 flags.cc:296] Flag --local_ip_for_outbound_sockets is experimental and unsupported.
E0824 14:04:57.264128 14821 flags.cc:302] 1 experimental flag(s) in use.
E0824 14:04:57.264137 14821 flags.cc:303] Use --unlock_experimental_flags to proceed at your own risk.
<exits>
Example error output with flags unlocked:
W0824 14:04:42.922560 14773 flags.cc:294] Enabled unsafe flag: --never_fsync=true
W0824 14:04:42.923032 14773 flags.cc:294] Enabled experimental flag: --local_ip_for_outbound_sockets=127.0.0.1
<continues>
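The gating rule described above can be sketched as follows. Kudu implements this in C++ (flags.cc); the Java class FlagGate, its Tag enum, and the validate method below are hypothetical and only model the decision logic, reusing the message wording from the example output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the unlock check; not Kudu's implementation.
final class FlagGate {
  enum Tag { STABLE, EXPERIMENTAL, UNSAFE }

  // Returns one error per tagged flag that was set without its unlock flag.
  // An empty list means startup may proceed.
  static List<String> validate(Map<String, Tag> tags, Set<String> flagsSet,
                               boolean unlockExperimental, boolean unlockUnsafe) {
    List<String> errors = new ArrayList<>();
    // Sort the flag names so the output order is deterministic.
    for (String flag : new TreeSet<>(flagsSet)) {
      Tag tag = tags.getOrDefault(flag, Tag.STABLE);
      if (tag == Tag.UNSAFE && !unlockUnsafe) {
        errors.add("Flag --" + flag + " is unsafe and unsupported.");
      } else if (tag == Tag.EXPERIMENTAL && !unlockExperimental) {
        errors.add("Flag --" + flag + " is experimental and unsupported.");
      }
    }
    return errors;
  }
}
```

A caller would exit at startup if the returned list is non-empty, and otherwise log a warning for each tagged flag that was enabled.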
Change-Id: Iec49e77fca604a7c5ee7501121a6263b7ee590d6
Reviewed-on: http://gerrit.cloudera.org:8080/4100
Reviewed-by: Adar Dembo <adar@cloudera.com>
Tested-by: Todd Lipcon <todd@apache.org>
commit d911b304d7f4e2c345a01d4e6a7637cf71835a4a
Author: Todd Lipcon <todd@apache.org>
Date: Tue Aug 23 15:53:51 2016 -0700
KUDU-1157. Don't use array reference equality for EMPTY_ARRAY
It seems like the only way that the user can specify these byte[] bounds
is through some deprecated 'raw bounds' APIs. So, this isn't likely a fix
for any real bugs. However, it's quite bad form to compare reference
equality on arrays.
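The pitfall can be shown with a short sketch. The class EmptyArrayCheck and its methods are hypothetical; only the EMPTY_ARRAY name comes from the commit title, and the content-based check shown here is one reasonable alternative rather than necessarily the one used in the patch.

```java
// Hypothetical illustration of reference equality versus a content check on arrays.
final class EmptyArrayCheck {
  static final byte[] EMPTY_ARRAY = new byte[0];

  // Fragile: true only if the caller passed the very same array object.
  static boolean isEmptyByReference(byte[] bound) {
    return bound == EMPTY_ARRAY;
  }

  // Robust: any zero-length array counts as empty.
  static boolean isEmptyByContent(byte[] bound) {
    return bound.length == 0;
  }
}
```

A caller who passes a freshly allocated new byte[0] is treated as non-empty by the reference check and as empty by the content check.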
Change-Id: I30163098926822aafbf23b03ba4c9e26a7c91349
Reviewed-on: http://gerrit.cloudera.org:8080/4101
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <dan@cloudera.com>
commit 854b71563dee5791a3a593947fe1b7233b59cba1
Author: Dan Burkert <dan@cloudera.com>
Date: Tue Aug 16 11:45:25 2016 -0700
[c++-client] fix KuduScanTokenBuilder token generation bugs
This commit fixes two critical issues in the KuduScanTokenBuilder
implementation:
1. Column predicates are now correctly carried through to the scan token.
Prior testing didn't catch this because the predicates were being
transformed into PK bounds, which have always worked correctly. This is
only an issue on the serialization side, so it doesn't affect scan tokens
generated by the Java client and deserialized by the C++ client.
2. Token building now works on tables with non-covering range
partitions. The fix is mostly copy/paste from scanner-internal, which is
very similar.
flex_partitioning-itest has been extended to check scan tokens, and the scan
token unit tests have been updated. I also added a unit test for issue 1 on the
Java side to add some confidence that the Java side is not affected.
Change-Id: Iff3ec3e2399b191c71595c96212471b1e21c7446
Reviewed-on: http://gerrit.cloudera.org:8080/4007
Reviewed-by: Adar Dembo <adar@cloudera.com>
Tested-by: Kudu Jenkins
commit 60f7851601707e8e5059d1c100c69f5d55cb1c9b
Author: Todd Lipcon <todd@apache.org>
Date: Tue Aug 16 00:17:14 2016 -0700
Enable replay cache by default
We've done looping of the stress tests, as well as some significant
cluster testing (YCSB on 72 node cluster for several days). In the
cluster, we didn't see any significant increased memory usage due to
this feature, nor any stability issues.
Additionally, the risk of the feature is relatively low: the worst
it could cause is memory leaks or crashes, with no chance of permanent
data corruption or loss. For the time being, I am leaving the ability
to disable the feature via a gflag, in case we do see any instability.
We can remove the flag entirely in the 1.0 timeframe.
Change-Id: I35b5e74ac30aa3309a0a7e035c8dff7f61c3f275
Reviewed-on: http://gerrit.cloudera.org:8080/4002
Reviewed-by: Adar Dembo <adar@cloudera.com>
Tested-by: Kudu Jenkins
commit 188058d0b6337df970257c99a52fd8af0f70ee0b
Author: Todd Lipcon <todd@apache.org>
Date: Mon Aug 15 23:19:26 2016 -0700
Bump version to 1.0.0-SNAPSHOT
Change-Id: I555d01b7704f4bd71559207520b68f64d58cd66c
Reviewed-on: http://gerrit.cloudera.org:8080/4000
Reviewed-by: Mike Percy <mpercy@apache.org>
Tested-by: Kudu Jenkins