Mark Papadakis (markpapadakis)
@markpapadakis
markpapadakis / gist:6247810
Created August 16, 2013 06:50
Suggestion for improving reliability and increasing performance for GET/READ requests on Cassandra
Each node maintains a list of UUIDs of all hosts it still has pending Hinted Handoff data to deliver to, whenever possible. Whenever that list changes (e.g all pending data for one of those nodes has been delivered), it issues an ApplicationState update, e.g APPSTATE_PENDING_HH, so that all other nodes in the cluster become aware of it.
Each node also maintains a UUID => [UUIDs] map: for each node, the set of nodes that still have pending HH content to deliver to it.
Effectively, by knowing that no node has pending HH content left to deliver to a node A, you can account for that when building the list of nodes to ask for digests for read repair, e.g:
for (uint32_t i = 0; i != liveNodesForKey.size(); )
{
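The preview cuts off here; a minimal sketch of the filtering step, assuming node identities are plain UUIDs (uint64_t here for brevity) and that the gossiped UUID => [UUIDs] map has been flattened into a set of nodes that are still owed hints. The names (DigestCandidates, PendingHHTargets) are illustrative, not from the gist.

// Sketch only: the types and names below are assumptions made to keep this
// self-contained; the original gist's loop body is truncated.
#include <stdint.h>
#include <unordered_set>
#include <vector>

// Assumed representation: the set of node UUIDs that some node still owes
// hinted-handoff data to, derived from the gossiped APPSTATE_PENDING_HH updates.
using PendingHHTargets = std::unordered_set<uint64_t>;

// Keep only replicas that nobody owes hints to; their digests can be trusted
// for read repair, since they cannot be missing writes parked in hinted handoff.
std::vector<uint64_t> DigestCandidates(const std::vector<uint64_t> &liveNodesForKey,
                                       const PendingHHTargets &pending)
{
    std::vector<uint64_t> out;

    for (uint32_t i = 0; i != liveNodesForKey.size(); ++i)
    {
        if (pending.find(liveNodesForKey[i]) == pending.end())
            out.push_back(liveNodesForKey[i]);
    }

    return out;
}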
#define __ALIGNUP_MASK(x,mask) (((x)+(mask))&~(mask))
#define ALIGNUP_TO_POWEROF2BOUNDARY(x, a) __ALIGNUP_MASK(x, (typeof(x))(a)-1)
int TouchFilePages(int fd, uint64_t offset, size_t len)
{
void *fileData;
static const size_t pageSize = getpagesize();
uint64_t upto, end;
if (len == 0)
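The preview stops at the length check; a plausible completion, assuming the intent is to fault the requested byte range into the page cache by mmap()ing it and reading one byte per page (the gist's actual body is not shown).

// Sketch of a plausible completion; the mmap()-and-touch body below is an
// assumption about the truncated gist's intent, not its actual code.
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define __ALIGNUP_MASK(x, mask) (((x) + (mask)) & ~(mask))
#define ALIGNUP_TO_POWEROF2BOUNDARY(x, a) __ALIGNUP_MASK(x, (typeof(x))(a)-1)

int TouchFilePages(int fd, uint64_t offset, size_t len)
{
    static const size_t pageSize = getpagesize();

    if (len == 0)
        return 0;

    // mmap() needs a page-aligned offset, so align the range outwards.
    const uint64_t alignedOffset = offset & ~(uint64_t)(pageSize - 1);
    const uint64_t end = ALIGNUP_TO_POWEROF2BOUNDARY(offset + len, pageSize);
    const size_t span = end - alignedOffset;

    void *const fileData = mmap(nullptr, span, PROT_READ, MAP_SHARED, fd, alignedOffset);

    if (fileData == MAP_FAILED)
        return -1;

    // Read one byte per page so the kernel faults every page in.
    volatile uint8_t sink = 0;
    for (uint64_t upto = 0; upto != span; upto += pageSize)
        sink += ((const uint8_t *)fileData)[upto];
    (void)sink;

    munmap(fileData, span);
    return 0;
}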
Zynga Engineering
zBase – A high-performance, elastic, distributed key-value store
zBase is a distributed key-value store that is the storage backbone for almost all Zynga games. It supports the memcached protocol on the front end and is designed to persist data to disk, with in-memory data replication for high availability. zBase was forked from membase in early January 2012; since then, the storage services team has added a host of useful features, made many performance enhancements, and successfully deployed it on more than 6,000 zBase nodes in production. We stood on the shoulders of giants and built some really cool stuff, and it's time we gave back to the open-source community. The remainder of this post describes some of the significant features we have added that make zBase what it is today.
Improved Memory Management and Multi-Disk Support
In the lifecycle of a typical Zynga game, the total install base continues to increase, while the Daily Active Users
// Update operation
for (;;)
{
struct Ring *const cur = theRing, *const n = cur->Clone();
n->Update(…);
n->AnotherUpdate(...);
if (cas(&theRing, cur, n))
{
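The preview stops inside the success branch; a self-contained sketch of the same copy-modify-CAS pattern follows, using std::atomic for the CAS and a stand-in Ring type (the original's Ring and cas() definitions aren't shown). On failure another writer won the race, the private copy is discarded, and the loop retries against the fresh pointer; on success the old version still has to be retired safely once no reader can observe it.

// Sketch only: Ring, Update() and the reclamation comments are stand-ins for
// the truncated gist's own definitions.
#include <atomic>
#include <vector>

struct Ring
{
    std::vector<int> nodes;

    Ring *Clone() const { return new Ring(*this); }
    void Update(const int v) { nodes.push_back(v); }
};

std::atomic<Ring *> theRing{new Ring{}};

void UpdateRing(const int v)
{
    for (;;)
    {
        Ring *const cur = theRing.load(std::memory_order_acquire);
        Ring *const n = cur->Clone();

        n->Update(v);

        Ring *expected = cur;
        if (theRing.compare_exchange_weak(expected, n,
                                          std::memory_order_release,
                                          std::memory_order_relaxed))
        {
            // Won the race; `cur` must be reclaimed only after all readers
            // that may still hold it are done (RCU, hazard pointers, epochs).
            break;
        }

        // Another writer replaced theRing first; throw away our copy and retry.
        delete n;
    }
}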
@markpapadakis
markpapadakis / common_ancestors.cpp
Last active December 26, 2015 09:59
Quickly identify a common ancestor among 2+ categories.
// Suppose you have a bunch of category IDs (uint16_t) and you need to identify their common ancestor (0 == root category ID).
// Also, suppose `byIndex[]` provides access to a structure that holds the parentId for each category.
// This implementation quickly identifies the nearest common ancestor among them.
uint16_t CommonAncestor(const uint16_t *const cids, const uint8_t cnt)
{
if (cnt < 2)
{
// Too few to have any common ancestors
return 0;
}
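The preview ends after the early-out; a sketch of one way to finish it, assuming byIndex[id].parentId yields a category's parent and 0 terminates every chain (the truncated gist's actual walk isn't shown): build the ancestor chain of the first category, then narrow it against every other input.

// Sketch under assumptions: the Category/byIndex shapes below are illustrative,
// and the chain-narrowing walk is one way to complete the truncated function.
#include <stdint.h>
#include <algorithm>
#include <vector>

struct Category { uint16_t parentId; };
extern Category byIndex[];  // assumed: indexed by category ID, as in the gist's comment

uint16_t CommonAncestor(const uint16_t *const cids, const uint8_t cnt)
{
    if (cnt < 2)
    {
        // Too few to have any common ancestors
        return 0;
    }

    // Ancestor chain of the first category, from itself up to the root (0).
    std::vector<uint16_t> chain;
    for (uint16_t id = cids[0]; id; id = byIndex[id].parentId)
        chain.push_back(id);
    chain.push_back(0);

    // For every other category, walk towards the root until we land somewhere
    // on the chain, then drop whatever sits below that point.
    for (uint8_t i = 1; i != cnt; ++i)
    {
        uint16_t id = cids[i];

        for (;;)
        {
            const auto it = std::find(chain.begin(), chain.end(), id);

            if (it != chain.end())
            {
                chain.erase(chain.begin(), it);
                break;
            }

            id = byIndex[id].parentId;
        }
    }

    // The front of the chain is now the deepest ancestor shared by all inputs.
    return chain.front();
}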
// Useful for sorting things - e.g TrivialCmp<int>(a, b)
template<typename T>
inline int32_t TrivialCmp(const T a, const T b)
{
if (a < b)
return -1;
else if (a > b)
return 1;
else
return 0;
}
@markpapadakis
markpapadakis / gist:7376127
Last active December 27, 2015 19:19
Timings
// use e.g Timings::Minutes::Sleep(2), Timings::Seconds::ToMicroseconds(1) etc
#include <stdint.h>
#include <time.h>
#include <errno.h>
#include <switch.h>
namespace Timings
{
template<uint64_t asNanoseconds>
struct Unit
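The preview cuts off at the template declaration; a sketch of how the rest might look, with members named to match the usage in the gist's first comment (ToMicroseconds(), Sleep()) and a plain nanosleep()-based sleep. These members and the unit aliases are assumptions, and the gist's own <switch.h> dependency is omitted to keep the sketch self-contained.

// Sketch only: the members and unit aliases below are assumptions matching the
// usage shown in the gist's comment; the actual gist body is truncated.
#include <stdint.h>
#include <time.h>
#include <errno.h>

namespace Timings
{
    template<uint64_t asNanoseconds>
    struct Unit
    {
        static uint64_t ToNanoseconds(const uint64_t n) { return n * asNanoseconds; }
        static uint64_t ToMicroseconds(const uint64_t n) { return ToNanoseconds(n) / 1000; }
        static uint64_t ToMilliseconds(const uint64_t n) { return ToNanoseconds(n) / 1000000; }

        static void Sleep(const uint64_t n)
        {
            const uint64_t total = ToNanoseconds(n);
            struct timespec ts{(time_t)(total / 1000000000ULL), (long)(total % 1000000000ULL)};

            // Retry if interrupted by a signal, sleeping for whatever remains.
            while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
                continue;
        }
    };

    using Nanoseconds  = Unit<1ULL>;
    using Microseconds = Unit<1000ULL>;
    using Milliseconds = Unit<1000000ULL>;
    using Seconds      = Unit<1000000000ULL>;
    using Minutes      = Unit<60ULL * 1000000000ULL>;
}

// e.g Timings::Minutes::Sleep(2), Timings::Seconds::ToMicroseconds(1)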
// Previously
for (uint32_t i = 0; i != bookProps->columns.Size(); ++i)
{
const IColumn *const ic = bookProps->columns[i];
if (likely(!ic->IsSuperColumn()))
{
const Column *const c = (Column *)ic;
@markpapadakis
markpapadakis / narrowedsort.cpp
Last active February 14, 2020 00:20
Fast sort by elimination
// Assuming you, for example, are going through dates and for each date you collect ids and for each id the number of times
// the object identified by that id (e.g an ad banner, a product, anything) was clicked, or anything like that.
//
// That is, assuming that you are going to have to update the value of an object 1+ times, as opposed to just once (you can use other
// more efficient algorithms to deal with that -- e.g maintain a top-K list as you go, etc), how do you then sort the final list that holds
// (id, value) - that is, for each distinct id, the sum of times it was clicked (or anything)?
//
// Sorting a list of 1 million distinct IDs is just really slow.
// This is one way to do it, which is particularly efficient if the distribution of values is large.
//
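The preview only shows the problem statement, so the sketch below is an illustrative alternative rather than the gist's own elimination scheme: a counting/bucket sort keyed by the accumulated value, which sidesteps a general comparison sort entirely when the counts are bounded.

// Illustrative alternative, not the gist's algorithm (its body is truncated):
// bucket the ids by their accumulated count, then emit buckets high-to-low.
#include <stdint.h>
#include <utility>
#include <vector>

std::vector<std::pair<uint32_t, uint32_t>>  // (id, count), count descending
SortByCountDesc(const std::vector<std::pair<uint32_t, uint32_t>> &counts)
{
    uint32_t maxCount = 0;

    for (const auto &it : counts)
    {
        if (it.second > maxCount)
            maxCount = it.second;
    }

    // One bucket per possible count; O(n + maxCount) instead of O(n log n).
    std::vector<std::vector<uint32_t>> buckets(maxCount + 1);

    for (const auto &it : counts)
        buckets[it.second].push_back(it.first);

    std::vector<std::pair<uint32_t, uint32_t>> out;
    out.reserve(counts.size());

    for (uint32_t c = maxCount + 1; c-- != 0; )
    {
        for (const auto id : buckets[c])
            out.emplace_back(id, c);
    }

    return out;
}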
class BufFileReader
: public ISerializer
{
private:
int fd;
uint64_t chunkOffset;
uint64_t chunkSize;
uint64_t chunkEnd;
uint64_t offset;