2010-09-29 10:28:27
Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b16 mixed mode):
"org.apache.hadoop.hdfs.server.datanode.DataBlockScanner@1278dc4c" daemon prio=10 tid=0x0000000050e67000 nid=0x153b waiting on condition [0x0000000042a7b000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:625)
        at java.lang.Thread.run(Thread.java:619)
"org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@406754d6" daemon prio=10 tid=0x0000000050e64800 nid=0x1537 runnable [0x000000004297a000]
2010-09-29 11:03:14,157 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 127.0.0.1:50010 storage DS-622878021-127.0.0.1-50010-1285783394153
2010-09-29 11:03:14,160 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/127.0.0.1:50010
2010-09-29 11:03:40,476 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root,root,bin,daemon,sys,adm,disk,wheel ip=/127.0.0.1 cmd=create src=/user/root/core-site.xml dst=null perm=root:supergroup:rw-r--r--
2010-09-29 11:03:40,486 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1(excluded: 127.0.0.1:50010)
2010-09-29 11:03:40,488 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call addBlock(/user/root/core-site.xml, DFSClient_1907075850, null) from 127.0.0.1:39665: error: java.io.IOException: File /user/root/core-site.xml could only be replicated to 0 nodes, instead of 1
java.io.IOException
// Build an Avro HTTP transceiver pointing at the local test server, then obtain a
// SpecificRequestor proxy for the NozzleIPC protocol.
URL url = new URL("http://localhost:" + server.getPort());
xceiver = new HttpTransceiver(url);
proxy = (NozzleIPC) SpecificRequestor.getClient(NozzleIPC.class, xceiver);
henryr / gist:1977470
Created March 5, 2012 08:28
HDFS-2834 test code
#include <time.h>

/* Returns the current CLOCK_MONOTONIC reading in nanoseconds. */
long get_time() {
  struct timespec tp;
  clock_gettime(CLOCK_MONOTONIC, &tp);
  return (long)((tp.tv_sec * 1000000000L) + tp.tv_nsec);
}
#include "../hadoop-common/hadoop-hdfs-project/hadoop-hdfs
hive> create table oh_hive(col int) partitioned by (part int);
OK
Time taken: 0.091 seconds
hive> insert into table oh_hive partition(part) select NULL, NULL from functional.alltypes limit 1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Loading data to table hank.oh_hive partition (part=null)
Loading partition {part=__HIVE_DEFAULT_PARTITION__}
Partition hank.oh_hive{part=__HIVE_DEFAULT_PARTITION__} stats: [num_files: 1, num_rows: 0, total_size: 3, raw_data_size: 0]
Table hank.oh_hive stats: [num_partitions: 1, num_files: 1, num_rows: 0, total_size: 3, raw_data_size: 0]
#
[thread 140550751581952 also had an error]
[thread 140550776760064 also had an error]
[thread 140550785152768 also had an error]
[thread 140550759974656 also had an error]
[thread 140550734796544 also had an error]
[thread 140550676047616 also had an error]
C [impalad+0x985df7] impala::HdfsOp::Execute() const+0x119
#
# An error report file with more information is saved as:
# /home/henry/src/cloudera/impala/hs_err_pid14403.log
Let's say this: an operation is durable if its effects persist following any sequence of crash-restart failures, including a total restart. Note that this assumes the entire cluster becomes available again after the failure sequence; we can weaken this to requiring only that "sufficiently many" (e.g. N/2+1) nodes are available after the failure sequence if we want.
Meeting this requirement naturally requires persistent storage, because any in-memory-only approach can't survive a total restart. Since the failures may not be staggered, this also rules out anti-entropy-style dissemination, where restarted nodes are told about already-committed operations.
Do we want to propose a parameterised form of durability, where we tolerate up to F crash-restart faults before sacrificing durability? Maybe: then you'd be able to achieve such durability by guaranteeing that data were written to F+1 nodes (i.e. the DF formulation). Or, if you didn't care much about the recency of the version available after F failures, you could have a very weak requirement.
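To make the F+1 formulation concrete, here is a minimal sketch (mine, not from the note above) of acknowledging a write only once it has been persisted on F+1 nodes, so that any F crash-restart faults still leave at least one node holding a durable copy. The Replica type and its persist() method are hypothetical stand-ins for whatever fsync-backed write path the real system has:

import java.util.List;

// Sketch of the F+1 acknowledgement rule; Replica and persist() are hypothetical.
class DurableWriter {
  interface Replica {
    boolean persist(byte[] record);  // assumed to return true only once the bytes are on disk
  }

  private final List<Replica> replicas;
  private final int f;  // number of crash-restart faults to tolerate

  DurableWriter(List<Replica> replicas, int f) {
    this.replicas = replicas;
    this.f = f;
  }

  // Acknowledge a write only after f+1 replicas have persisted it, so any f
  // simultaneous crash-restarts still leave at least one durable copy.
  boolean write(byte[] record) {
    int persisted = 0;
    for (Replica r : replicas) {
      if (r.persist(record) && ++persisted >= f + 1) {
        return true;
      }
    }
    return false;  // fewer than f+1 durable copies: do not acknowledge
  }
}

A real implementation would issue the writes in parallel and apply timeouts, but the acknowledgement condition (f+1 persisted copies before the client sees success) is the point.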
import java.io.ByteArrayOutputStream;

public class TestByteArray {
  static byte[] chunk = new byte[1024 * 1024];

  public static void main(String[] args) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    int numChunks = 2 * 1024 * 1024;
    for (int i = 0; i < numChunks; ++i) {
      long start = System.currentTimeMillis();
      // Each append may copy-and-double the backing array; the run hits the ~2GB byte[] limit well before finishing.
      baos.write(chunk, 0, chunk.length);
      System.out.println("write " + i + ": " + (System.currentTimeMillis() - start) + " ms");
    }
  }
}
henryr / gist:8655574
Created January 27, 2014 19:22
External consistency?
The claim for 'external consistency' is as follows (all quotes are from the journal paper):
"external-consistency invariant: if the start of a transaction T2 occurs after the commit of a transaction T1, then the commit timestamp of T2 must be greater than the commit timestamp of T1."
But when assigning a commit timestamp, section 4.2.1 has:
"The commit timestamp s must be greater than or equal to all prepare timestamps (to satisfy the constraints discussed in Section 4.1.3), greater than TT.now().latest at the time the coordinator received its commit message, and greater than any timestamps the leader has assigned to previous transactions (again, to preserve monotonicity)."
This led me down the following path:
1. Is TT.now().latest monotonically increasing? Presumably not, otherwise the third requirement (about being larger than any previous transaction) would be implicit, and epsilon is not monotonically increasing either.
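As a toy restatement of the rule quoted from 4.2.1 (my sketch, not Spanner's code; the class and parameter names below are invented for illustration), the three constraints amount to taking a maximum on each leader:

// Toy sketch of the commit-timestamp rule quoted from section 4.2.1; all names
// here are invented for illustration, not Spanner's actual API.
class CommitTimestampPicker {
  private long lastAssigned = 0;  // largest timestamp this leader has handed out so far

  // s must be >= every prepare timestamp, > TT.now().latest at the time the
  // coordinator received the commit message, and > any previously assigned
  // timestamp ("+ 1" stands in for "strictly greater" on discrete timestamps).
  synchronized long chooseCommitTimestamp(long maxPrepareTimestamp, long ttNowLatestAtCommit) {
    long s = Math.max(maxPrepareTimestamp,
        Math.max(ttNowLatestAtCommit + 1, lastAssigned + 1));
    lastAssigned = s;
    return s;
  }
}

Written this way, the lastAssigned term is what carries per-leader monotonicity if TT.now().latest can itself move backwards between calls, which is exactly the question in (1).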
com.cloudera.impala.planner.PlannerTest.testJoinOrder
Failing for the past 1 build (since build #594)
Took 0.21 sec.
Error Message
section PLAN of query:
select
n_name,