Migrate from Facebook scribe to Apache Flume (Part II)

In the last article we covered how to set up Flume and write files to HDFS. In this article we start changing Flume to write files organized by category, the way Scribe does.

Multiplexing Way?

Our first thought was to use a multiplexing source to distribute logs to different destinations. Flume routes log events by their event headers, so we searched for the header field that corresponds to the Scribe category.

https://apache.googlesource.com/flume/+/d66bf94b1dd059bc7e4b1ff332be59a280498077/flume-ng-sources/flume-scribe-source/src/main/java/org/apache/flume/source/scribe/ScribeSource.java

The category field in the event header holds the Scribe category, so we tried a multiplexing source.
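The routing idea can be modelled in a few lines of Python. This is only a sketch of header-based multiplexing, not Flume configuration or Flume code: the channel names, the category names, and the helpers select_channel and route are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from Scribe category to a channel name.
CATEGORY_TO_CHANNEL = {"access": "ch_access", "error": "ch_error"}
# Hypothetical fallback channel for unmapped or missing categories.
DEFAULT_CHANNEL = "ch_default"


def select_channel(headers, mapping=CATEGORY_TO_CHANNEL, default=DEFAULT_CHANNEL):
    """Pick a channel from the event's `category` header, else the default."""
    return mapping.get(headers.get("category"), default)


def route(events):
    """Group event bodies by the channel selected for each event."""
    channels = defaultdict(list)
    for headers, body in events:
        channels[select_channel(headers)].append(body)
    return dict(channels)


# Each event is a (headers, body) pair, loosely mirroring a Flume event.
events = [
    ({"category": "access"}, "GET /index.html"),
    ({"category": "error"}, "disk full"),
    ({"category": "billing"}, "invoice 17"),
    ({}, "no category header"),
]
routed = route(events)
```

Note that every category needs its own entry in the mapping; anything unmapped falls through to the default channel.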

Migrate from Facebook scribe to Apache Flume (Part I)

Reason

We used Scribe as our logging server for a long time. At first everything worked fine: it was easy to configure and easy to manage. As the data grew every day, a single Scribe server could no longer handle the load, so we had to migrate some categories to a second log server and attach a big disk. As the data kept growing, we wanted big-data storage instead of local disks, so we decided to use Scribe with the HDFS plugin. That was as tough as the first time we compiled Scribe from source. We finally compiled scribed with HDFS support, but after a short period of use we found a bug that Facebook had never fixed (the project was deprecated several years ago). The bug leaves Scribe unable to write to HDFS after it is accidentally killed with signal 9 (SIGKILL). So we started testing Flume and looking for ways to migrate.

Configure Flume

Flume is easy to deploy because it is written in Java. Install Java and download the jar package, and the setup is done.

dqtweb / redis_cheatsheet.bash
Created October 22, 2018 09:51 — forked from LeCoupa/redis_cheatsheet.bash
Redis Cheatsheet - Basic Commands You Must Know --> UPDATED VERSION --> https://github.com/LeCoupa/awesome-cheatsheets
# Redis Cheatsheet
# All the commands you need to know
redis-server /path/redis.conf # start redis with the related configuration file
redis-cli # opens a redis prompt
# Strings.
dqtweb / flickr-original.js
Created October 21, 2018 17:49
Flickr view original
window.open($(".zoom-modal img:nth-child(2)").attributes.src.value, "_blank");
dqtweb / argparse_date_datetime_custom_types.py
Created September 19, 2018 09:37 — forked from monkut/argparse_date_datetime_custom_types.py
Custom date/datetime type validators for python's argparse module
import argparse
import datetime

def valid_date_type(arg_date_str):
    """custom argparse *date* type for user dates values given from the command line"""
    try:
        return datetime.datetime.strptime(arg_date_str, "%Y-%m-%d")
    except ValueError:
        msg = "Given Date ({0}) not valid! Expected format, YYYY-MM-DD!".format(arg_date_str)
        raise argparse.ArgumentTypeError(msg)

def valid_datetime_type(arg_datetime_str):
    """custom argparse type for user datetime values given from the command line"""
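A usage sketch for the date validator, assuming a hypothetical --start-date option (the option name is not from the gist). The validator is restated so the block runs on its own; argparse calls it on the raw string and stores the returned datetime.

```python
import argparse
import datetime


def valid_date_type(arg_date_str):
    """Parse a YYYY-MM-DD string, or raise an argparse error."""
    try:
        return datetime.datetime.strptime(arg_date_str, "%Y-%m-%d")
    except ValueError:
        msg = "Given Date ({0}) not valid! Expected format, YYYY-MM-DD!".format(arg_date_str)
        raise argparse.ArgumentTypeError(msg)


parser = argparse.ArgumentParser()
# `--start-date` is a hypothetical option name used only for illustration.
parser.add_argument("--start-date", type=valid_date_type, required=True)

# Passing an explicit argument list instead of reading sys.argv.
args = parser.parse_args(["--start-date", "2018-09-19"])
```

With an invalid value such as 2018-13-40, argparse reports the ArgumentTypeError message as a usage error and exits.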
dqtweb / gist:7f5723909c65e4d7fe1f81e2af8af4f3
Created September 19, 2018 04:17 — forked from s4553711/gist:9488399
Some example for subprocess.Popen exception example
#!/usr/bin/python
import subprocess
import os
import sys
#res = subprocess.Popen(['ls','-al','/ahome'],stdout=subprocess.PIPE,stderr=subprocess.PIPE);
#output,error = res.communicate()
#if res.returncode:
# #raise Exception(error)
from datetime import datetime
import time
#-------------------------------------------------
# conversions to strings
#-------------------------------------------------
# datetime object to string
dt_obj = datetime(2008, 11, 10, 17, 53, 59)
date_str = dt_obj.strftime("%Y-%m-%d %H:%M:%S")
print(date_str)
dqtweb / gist:5033fde8ce949686092c27848ecde148
Last active August 27, 2018 06:33
Technical analysis Indicators without Talib
import numpy
import pandas as pd
import math as m
#@author: Bruno Franca
#@author: Peter Bakker
#Moving Average
def MA(df, n):
    # pd.rolling_mean was removed from pandas; Series.rolling(n).mean() is the replacement
    MA = pd.Series(df['Close'].rolling(n).mean(), name = 'MA_' + str(n))
dqtweb / server.js
Created June 16, 2018 14:41 — forked from johannesMatevosyan/server.js
Websockets: send message to all clients except sender.
var http = require('http');
var Static = require('node-static');
var WebSocketServer = new require('ws');
// list of users
var CLIENTS=[];
var id;
// web server is using 8081 port
var webSocketServer = new WebSocketServer.Server({ port: 8081 });
dqtweb / gist:9c03f9ecf5a3b00d0ea2fb462acf091b
Created June 7, 2018 07:47 — forked from dbrugne/gist:2a62d4dd88f11fa36b75
MongoDB bulk insert from mongoose models
var mongoose = require('mongoose');
var hitSchema = mongoose.Schema({
  text: String,
  music: String
});
hitSchema.statics.bulkInsert = function(models, fn) {
  if (!models || !models.length)
    return fn(null);