Let's say this: an operation is durable if its effects persist following any sequence of crash-restart failures, including a total restart. Note that this implies the entire cluster is available after the failure sequence; we can weaken this to require only that "sufficiently many" (e.g. N/2+1) nodes are available after the failure sequence if we want.
Meeting this requirement naturally requires persistent storage, because an in-memory-only approach can't survive a total restart. Since the failures may not be staggered, this also rules out anti-entropy style dissemination where restarted nodes are told about already committed operations.
Do we want to propose a parameterised form of durability, where we tolerate up to F crash-restart faults before sacrificing durability? Maybe: then you'd be able to achieve such durability by guaranteeing that data were written to F+1 nodes (i.e. the DF formulation). Or if you didn't care much about the recency of the version available after F failures, you could have a very weak requirem
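As a rough illustration of the parameterised form, here is a minimal sketch of the F+1 check and the N/2+1 availability threshold mentioned above. The class and method names (`DurabilitySketch`, `copiesToTolerate`, `majority`, `isDurable`) are hypothetical and not taken from any existing system, and the sketch assumes that "persisted" means the node has flushed the operation to stable storage before acknowledging it.

```java
/** Hypothetical helper for reasoning about F-fault durability; names are illustrative only. */
final class DurabilitySketch {

    /** Number of nodes that must hold the write so one copy outlives f faulty nodes: F + 1. */
    static int copiesToTolerate(int f) {
        if (f < 0) {
            throw new IllegalArgumentException("f must be non-negative");
        }
        return f + 1;
    }

    /** Size of a majority of an n-node cluster: N/2 + 1 (integer division). */
    static int majority(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return n / 2 + 1;
    }

    /**
     * persisted[i] is true if node i has the operation on stable storage
     * (an assumption of this sketch). The operation is treated as durable
     * against f faults when at least f + 1 nodes have persisted it.
     */
    static boolean isDurable(boolean[] persisted, int f) {
        int count = 0;
        for (boolean p : persisted) {
            if (p) {
                count++;
            }
        }
        return count >= copiesToTolerate(f);
    }

    public static void main(String[] args) {
        // Five-node cluster in which two nodes have persisted the operation.
        boolean[] persisted = {true, true, false, false, false};
        System.out.println(isDurable(persisted, 1)); // two copies cover one fault
        System.out.println(isDurable(persisted, 2)); // two copies do not cover two faults
        System.out.println(majority(persisted.length));
    }
}
```

The check only counts copies; it says nothing about how recent the surviving version is, which is the separate question the paragraph raises.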
# [thread 140550751581952 also had an error]
[thread 140550776760064 also had an error]
[thread 140550785152768 also had an error]
[thread 140550759974656 also had an error]
[thread 140550734796544 also had an error]
[thread 140550676047616 also had an error]
C [impalad+0x985df7] impala::HdfsOp::Execute() const+0x119
# An error report file with more information is saved as:
# /home/henry/src/cloudera/impala/hs_err_pid14403.log
hive> create table oh_hive(col int) partitioned by (part int);
Time taken: 0.091 seconds
hive> insert into table oh_hive partition(part) select NULL, NULL from functional.alltypes limit 1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Loading data to table hank.oh_hive partition (part=null)
Loading partition {part=__HIVE_DEFAULT_PARTITION__}
Partition hank.oh_hive{part=__HIVE_DEFAULT_PARTITION__} stats: [num_files: 1, num_rows: 0, total_size: 3, raw_data_size: 0]
Table hank.oh_hive stats: [num_partitions: 1, num_files: 1, num_rows: 0, total_size: 3, raw_data_size: 0]
henryr / gist:1977470
Created March 5, 2012 08:28
HDFS-2834 test code
#include <time.h>
/* Returns the current CLOCK_MONOTONIC reading in nanoseconds. */
long get_time() {
  struct timespec tp;
  clock_gettime(CLOCK_MONOTONIC, &tp);
  return (long)((tp.tv_sec * 1000000000) + tp.tv_nsec);
}
#include "../hadoop-common/hadoop-hdfs-project/hadoop-hdfs
URL url = new URL("http://localhost:" + server.getPort());
xceiver = new HttpTransceiver(url);
proxy = (NozzleIPC) SpecificRequestor.getClient(NozzleIPC.class, xceiver);
2010-09-29 11:03:14,157 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from storage DS-622878021-
2010-09-29 11:03:14,160 INFO Adding a new node: /default-rack/
2010-09-29 11:03:40,476 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=root,root,bin,daemon,sys,adm,disk,wheel ip=/ cmd=create src=/user/root/core-site.xml dst=null perm=root:supergroup:rw-r--r--
2010-09-29 11:03:40,486 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1(excluded:
2010-09-29 11:03:40,488 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call addBlock(/user/root/core-site.xml, DFSClient_1907075850, null) from error: File /user/root/core-site.xml could only be replicated to 0 nodes, instead of 1
2010-09-29 10:28:27
Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b16 mixed mode):
"org.apache.hadoop.hdfs.server.datanode.DataBlockScanner@1278dc4c" daemon prio=10 tid=0x0000000050e67000 nid=0x153b waiting on condition [0x0000000042a7b000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
"org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@406754d6" daemon prio=10 tid=0x0000000050e64800 nid=0x1537 runnable [0x000000004297a000]