@ketankr9
Last active April 25, 2018 18:16
Feel free to comment below. This is the first draft, awaiting improvements :)

Introduction

HTTP Batch Sending

syslog-ng is a widely used, cross-platform log management tool, and for such a long-established project, efficiency plays an important role.
This proposed project (HTTP batch sending) enhances the existing libcurl-based HTTP destination driver so that it can combine (batch) multiple messages into a single HTTP request.

Usefulness to the Community

One of the characteristics that makes a piece of software stand out from the crowd is that it consumes fewer resources (network load/uptime, process load/uptime, etc.).
This project aims to reduce network activity time and bandwidth on the host running syslog-ng, provided the destination accepts batch requests, as Splunk does.

At present, N messages are sent over N HTTP requests. This is inefficient, since all N messages could be sent in a single batch request, cutting the number of requests, and with it the per-request overhead, to roughly 1/N of the current value.
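As a small worked example of the saving described above: with a batch size of B, N messages need ceil(N / B) requests instead of N. The helper name `requests_needed` is hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of HTTP requests needed to deliver n_messages when up to
 * batch_size messages share one request (ceiling division).
 * Hypothetical helper, not part of syslog-ng. */
size_t
requests_needed(size_t n_messages, size_t batch_size)
{
  assert(batch_size > 0);
  return (n_messages + batch_size - 1) / batch_size;
}
```

With a batch size of 1 (the proposed default) the result equals the message count, which matches the current one-request-per-message behaviour.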

Why am I taking this project?

  • I am interested in networking and have prior experience with cURL (command line). This opportunity would extend my knowledge to libcurl-based implementation in C.
  • I have become so familiar with syslog-ng that it feels like I have already spent my community bonding period :)
  • I have written a few automation scripts using cURL (command line) in the past, and I have a project in mind that uses curl or libcurl to download files in parts over local college networks and stream them in parts over the same network.
  • And honestly, having gone through many organizations' idea pages, this project looked like the best fit for my interests and prior knowledge.

Project Goals

  • Proper data structures and implementation design.
  • Adding an optional setting such as message-per-count(n) | batch-size(n) to the configuration file; if not specified, it defaults to 1.
    • A sample configuration will be provided for people unfamiliar with the option, and it will be documented at the end of the project.
  • Adding a struct event to HTTPDestinationDriver with attributes such as present_count, batch_size_max, msg_array, start_time [may vary based on the actual design].
  • A function that adds each received message, as an event log, to the buffer and increments n until n == batch_size_max. This function is called from the insert function, so after each msg (logmsg) is added to the batch buffer/list, success is returned to the insert function.
    This implies that a message will rarely be pushed back to the LogThrDestDriver queue: when the buffer is full (n == batch_size_max), it is emptied by handing its contents over to the batch queue. Failure is only possible when the batch queue itself becomes full.
  • A function that sends all the messages in the batch queue when the above condition is met or the time limit is reached, and drops the batched messages if sending fails. {on success: clear/realloc the buffer/list, set n = 0}
  • Queue handling with proper message acknowledgement: when the above function returns FAILURE, the messages are pushed back to the (batch) queue instead of being dropped.
  • Test case development.
  • Self and community testing; modifications based on community feedback.
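The buffering goals above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the type `Batch`, the functions `batch_init`, `batch_add`, `batch_should_flush` and `batch_reset`, the `timeout_sec` field and the fixed `BATCH_CAPACITY` are all hypothetical, and the real design would live inside HTTPDestinationDriver and hold syslog-ng log messages rather than plain strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define BATCH_CAPACITY 64 /* hypothetical hard upper bound for this sketch */

/* Hypothetical batching state, loosely following the attributes listed above. */
typedef struct
{
  const char *msg_array[BATCH_CAPACITY]; /* borrowed pointers to buffered messages */
  size_t present_count;                  /* messages currently buffered */
  size_t batch_size_max;                 /* flush threshold; 1 means no batching */
  time_t start_time;                     /* arrival time of the oldest buffered message */
  time_t timeout_sec;                    /* flush once the oldest message is this old */
} Batch;

void
batch_init(Batch *b, size_t batch_size_max, time_t timeout_sec)
{
  assert(batch_size_max >= 1 && batch_size_max <= BATCH_CAPACITY);
  b->present_count = 0;
  b->batch_size_max = batch_size_max;
  b->start_time = 0;
  b->timeout_sec = timeout_sec;
}

/* Buffer one message. Returns false only when the buffer is already full,
 * in which case the message is not stored. */
bool
batch_add(Batch *b, const char *msg, time_t now)
{
  if (b->present_count >= b->batch_size_max)
    return false;
  if (b->present_count == 0)
    b->start_time = now; /* remember when the batch started filling */
  b->msg_array[b->present_count++] = msg;
  return true;
}

/* True when the batch should be sent: the count limit is reached
 * or the oldest buffered message has waited at least timeout_sec. */
bool
batch_should_flush(const Batch *b, time_t now)
{
  if (b->present_count == 0)
    return false;
  return b->present_count >= b->batch_size_max
         || now - b->start_time >= b->timeout_sec;
}

/* Empty the buffer after a successful send. */
void
batch_reset(Batch *b)
{
  b->present_count = 0;
}
```

In this sketch the insert path would call `batch_add` for every message and then `batch_should_flush`; passing the current time in as a parameter keeps the logic testable without a real clock.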

Areas Relevant to This Project

  • libcurl
    • Basic HTTP requests
    • Batch request implementation (most important, as very few resources are available on the internet) (partially)
  • syslog-ng
    • General destination module:
      • Familiarity with fundamental functions of any destination
      • How a message is handled/pushed back to the queue when sending fails
    • HTTP destination
    • Threading: ivykis{lib}, the event handling library
    • logthrdestdrv{lib} : Present queue management
  • External destinations
  • GNU coding style
    • I must study this soon.

Project Time Schedule

Community Bonding Period (April 23 - May 14)

  • Understanding documentation.
  • Understanding the existing HTTP module, the riemann module (it implements batch functionality, but messages are lost if sending fails), and the Java elastic-v2 destination, which also has a bulk message processor.
  • Implementing various possible configurations for the HTTP destination.
  • General understanding of destinations, sources, queues, and the logmsg library as implemented in syslog-ng.
  • Module design: the configuration option for the HTTP destination that enables batch requests, the batch queue, the message buffer, the batch HTTP format (global headers and event-specific headers), and community feedback on the design.

Coding Period (May 14 - August 6)

Design Phase - 1 Week

  1. Prototyping all the functions decided on during the community bonding period (module design), with proper function signatures.

Functionality Development Phase - 4 Weeks

  1. libcurl implementation: writing a C program based on libcurl that reads a collection (>1) of messages from a file/array/stdin and sends it to any URL in the batch format (decided in the previous step).
  2. Divided into two parts below:
    • Understanding Splunk's HTTP API event handler, testing the above code against it, and modifying the code based on Splunk's requirements
    • Testing the above code against other HTTP endpoints that accept batch requests
  3. Constructing a buffer that collects messages up to a given number and calls debug_print when the limit is reached, by modifying the prototyped code. This step also includes surplus time for unforeseen delays.
  4. Modifying the HTTP module (of syslog-ng) so that it collects messages from the specified sources and prints them (via msg_debug) only when the total message count reaches a certain number, say n.
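A minimal sketch of the payload-building part of step 1, assuming the batch format is simply one message per line. The actual format is still to be decided in the design phase and depends on what the endpoint accepts, so the newline delimiter and the name `batch_build_body` are assumptions. The resulting buffer is what would then be handed to libcurl, for example through its `CURLOPT_POSTFIELDS` option, before a single `curl_easy_perform()` call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Join count messages into one newline-delimited request body.
 * Returns a heap-allocated, NUL-terminated string that the caller must
 * free(), or NULL if allocation fails. Hypothetical helper. */
char *
batch_build_body(const char *const *msgs, size_t count)
{
  size_t total = 1; /* room for the trailing NUL */
  for (size_t i = 0; i < count; i++)
    total += strlen(msgs[i]) + 1; /* message plus '\n' delimiter */

  char *body = malloc(total);
  if (!body)
    return NULL;

  char *p = body;
  for (size_t i = 0; i < count; i++)
    {
      size_t len = strlen(msgs[i]);
      memcpy(p, msgs[i], len);
      p += len;
      *p++ = '\n';
    }
  *p = '\0';
  return body;
}
```

Building the whole body once keeps the send path to a single request per batch; a growable buffer that is appended to as messages arrive would avoid the second pass over the messages.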

Integration Phase - 5 Weeks

  1. Integrating the tested functionality: this time the collected messages are sent as a batch request instead of being printed as debug messages. If sending fails, the message buffer is dropped for now.
  2. Handling the queue and message acknowledgement: preventing the loss of messages when send() or _insert() returns FAILURE or syslog-ng is restarted, including proper functioning of the re-open and retry functionality.
  3. ...continuation of step 2 above.
  4. Divided into two parts below:
    • Alpha - self_testing (including memory leak and extreme cases)
    • Beta - community testing
  5. Documentation of this new feature with examples.
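The difference between step 1 (drop on failure) and step 2 (keep and retry) can be sketched as below. Everything here is hypothetical: `PendingBatch`, `SendFn`, `pending_flush` and the demo sender `flaky_send` are illustrative names, and the real implementation would rely on the acknowledgement and queue handling that logthrdestdrv already provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Callback that tries to deliver a batch; returns true on success. */
typedef bool (*SendFn)(const char *const *msgs, size_t count, void *user_data);

typedef enum { FLUSH_NOTHING, FLUSH_SENT, FLUSH_RETRY_LATER } FlushResult;

typedef struct
{
  const char *msgs[16];     /* buffered messages (borrowed pointers) */
  size_t count;             /* number of buffered messages */
  unsigned failed_attempts; /* consecutive failed sends of this batch */
} PendingBatch;

/* Try to send the batch. On failure the messages stay buffered so a later
 * call can retry them; they are cleared only after a successful send. */
FlushResult
pending_flush(PendingBatch *p, SendFn send_fn, void *user_data)
{
  assert(p != NULL && send_fn != NULL);
  if (p->count == 0)
    return FLUSH_NOTHING;
  if (!send_fn(p->msgs, p->count, user_data))
    {
      p->failed_attempts++;
      return FLUSH_RETRY_LATER;
    }
  p->count = 0;
  p->failed_attempts = 0;
  return FLUSH_SENT;
}

/* Demo sender for this sketch: fails until the counter behind user_data
 * reaches zero, then succeeds. */
bool
flaky_send(const char *const *msgs, size_t count, void *user_data)
{
  (void) msgs;
  (void) count;
  int *remaining_failures = user_data;
  if (*remaining_failures > 0)
    {
      (*remaining_failures)--;
      return false;
    }
  return true;
}
```

Keeping the messages in place on failure is what makes the retry possible; surviving a syslog-ng restart would additionally need the batch to be backed by the persistent queue, which this sketch does not attempt.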

Extra time | Bug Fix - 2 weeks

  • These 2 weeks are kept as a buffer for unforeseen cases, bug fixing, and code refactoring.

Note: Enough time has been allotted to each phase of the project development. Since the Design Phase needs extensive discussion and approval from the community, it will start during the community bonding period itself, so that the design can be improved further and at least one more week may be saved at the end.

Code Submission & Final Evaluation (August 6 - 14)


Deliverables

  • In the HTTP destination, an option will determine the number of messages/events sent as a batch in a single HTTP request, defaulting to 1, i.e., one message per HTTP request.
  • Sending messages in bulk.
  • Verified support for most destinations that accept batch HTTP requests.
  • Examples demonstrating the configuration for HTTP batch sending for each of the threaded destinations, etc.
  • Tests for batch messages.
  • Documentation.

Why syslog-ng

  • syslog-ng has a well-defined repository; the more I explore/use syslog-ng, the more exciting it gets.
  • Configurability - the best part of syslog-ng is its configuration file; you can write custom destinations in your favorite language and use any parsers and filters.
  • The responses of the contributors/mentors of syslog-ng have been great so far.
  • A long-established project foundation.

My involvement in syslog-ng till now.

  • Link to merge request authored by me Click Here
  • Link to issues that I have contributed in by authoring or commenting or testing. Click Here
  • Have actively discussed the problem statement with the mentor on the syslog-ng Gitter channel.
  • Have read about the basics of libcurl (the library with which HTTP batch requests will be implemented) and its usage.
  • Have a brief understanding of what destinations, sources, filters, and parsers do.
  • Have gone through The syslog-ng Open Source Edition 3.13 Administrator Guide lightly (will study it in depth during the community bonding period).

Biographical Information

  • Name : Utsav Krishnan
  • Github : ketankr9
  • Education :
    • 2015-Current: 3rd Year Undergrad Computer Science and Engineering Student at Indian Institute of Technology (Banaras Hindu University), Varanasi, India.
  • Email : utsav.krishnan.cse15@iitbhu.ac.in
  • Phone : +91 9151643205
  • Areas of Interest : IoT, client-server communication, Competitive Programming, tinkering with Raspberry Pi, Web Scraping, REST API, Web App Services, machine learning, Arduino and sensors.
  • Programming Languages : C, Python, Bash, MySQL, Java, C++, Solidity. (chronological order in which I learned them.)
  • Projects Pursued : Click Here
  • In free time you can find me writing scripts or exploring new technologies.
  • I love exploring/testing open source projects/software, especially new ones, and implementing them.
  • Courses which I have taken so far relevant to this project: Software Engineering, Theory of Computation, Computer Programming, Data Structures and Algorithms.