@bubba-h57
Last active August 29, 2015 14:00
Script to get AWS Elastic Load Balancer (ELB) Access Logs into Splunk Storm. Keep in mind that ELB ships logs to S3 at most every five minutes (its shortest publishing interval), so even if you schedule this script to run every five minutes and it runs as fast as possible, Splunk will still lag real time by at least five minutes.
#!/usr/bin/php
<?php
/**
* ELBAccessToSplunk.php
*
* Takes ELB access logs that have been shipped to AWS S3
* and transforms them into Splunk's generic_single_line
* format in a new composite logfile on the system, monitored
* by Splunk Storm.
*
* @author Bubba Hines
* @copyright 2014 Signature Tech Studio
* @license http://www.php.net/license/3_0.txt PHP License 3.0
* @link https://gist.github.com/bubba-h57/11259079
*/
// Include the AWS PHP SDK using the Composer autoloader
require '/some/path/to/composer/vendor/autoload.php';
use Aws\S3\S3Client;
// This is the s3 bucket you are shipping your ELB Access Logs to.
$bucket = '<Enter Your BUCKET Name Here>';
/*
If you instantiate a new client for Amazon Simple Storage Service (S3) with
no parameters or configuration, the AWS SDK for PHP will look for access keys
in the AWS_ACCESS_KEY_ID and AWS_SECRET_KEY environment variables.
We will explicitly define the key and secret in this script for clarity on what
is occurring. However, you are encouraged not to keep credentials hardcoded like this
in a script.
For more information about this interface to Amazon S3, see:
http://docs.aws.amazon.com/aws-sdk-php-2/guide/latest/service-s3.html#creating-a-client
*/
$aws_key = '<Enter your AWS Key Here>';
$aws_secret = '<Enter your AWS Secret Here>';
// Create the factory.
$s3Client = S3Client::factory(array(
'key' => $aws_key,
'secret' => $aws_secret));
// This is the log pattern AWS uses for the ELB Access Logs
$elbLogPattern = '/^(?P<timestamp>[\S]+) (?P<elb>[\S]+) (?P<client>[\S]+) (?P<backend>[\S]+) (?P<request_processing_time>[\S]+) (?P<backend_processing_time>[\S]+) (?P<response_processing_time>[\S]+) (?P<elb_status_code>[\S]+) (?P<backend_status_code>[\S]+) (?P<received_bytes>[\S]+) (?P<sent_bytes>[\S]+) "(?P<request>.*)"$/';
// The generic_single_line splunk format we ultimately want the logs in.
$splunkFormat = '%s elb="%s", client="%s", backend="%s", request_processing_time="%s", backend_processing_time="%s", response_processing_time="%s", elb_status_code="%s", backend_status_code="%s", received_bytes="%s", sent_bytes="%s", request="%s"' . "\n";
// Where we intend to place the splunk formatted log for splunk to monitor.
// Note that splunk will need to be configured similar to:
// [monitor:///var/log/elb/*access*]
// disabled=false
// sourcetype=generic_single_line
// You need to verify that the directory structure exists manually.
$splunkLog = '/var/log/elb/access.log';
// Errors out if there is an issue accessing our splunk log.
if (!($fpSplunkLog = fopen($splunkLog, 'a'))) {
// Double quotes so that $splunkLog is interpolated into the message.
exit("Error opening $splunkLog for appending.\n");
}
// Fetches us a list of all the objects in the bucket.
$s3ObjectIterator = $s3Client->getIterator('ListObjects', array(
'Bucket' => $bucket
));
// We just iterate through the objects.
foreach ($s3ObjectIterator as $object) {
// Checking the extension of each one.
$ext = pathinfo($object['Key'], PATHINFO_EXTENSION);
// if it isn't a log file, we don't care.
if ($ext !== 'log'){
// Go do the next one
continue;
}
// We have a log file.
// Let's get a temp file.
$tempFile = tmpfile();
// We will write the object to a temp file, because it is
// potentially too large to hold in a PHP string in memory.
$result = $s3Client->getObject(array(
'Bucket' => $bucket,
'Key' => $object['Key'],
'SaveAs' => $tempFile
));
// Now, let's open the temp file
$handle = @fopen($result['Body']->getUri(), "r");
if ($handle) {
// and read it line by line
while (($buffer = fgets($handle)) !== false) {
$matches = NULL;
// Skip blank or malformed lines that do not match the ELB pattern,
// otherwise $matches['timestamp'] would be undefined below.
if (!preg_match($elbLogPattern, $buffer, $matches)) {
continue;
}
// Transform the timestamp
$timestamp = new DateTime($matches['timestamp']);
// Print the newly formatted line to our splunk logfile
fprintf($fpSplunkLog, $splunkFormat, $timestamp->format('c'), $matches['elb'], $matches['client'],$matches['backend'],$matches['request_processing_time'],$matches['backend_processing_time'],$matches['response_processing_time'],$matches['elb_status_code'],$matches['backend_status_code'],$matches['received_bytes'],$matches['sent_bytes'],$matches['request']);
}
// We ought to be at the end of file now
if (!feof($handle)) {
echo "Error: unexpected fgets() fail\n";
}
// Clean up the file handle.
fclose($handle);
}else{
echo "We didn't get a handle on {$result['Body']->getUri()}\n";
}
// Always clean up the temp file
unlink($result['Body']->getUri());
// And let's clean up S3 now
$result = $s3Client->deleteObject(array(
'Bucket' => $bucket,
'Key' => $object['Key'],
));
}
// final bit of cleaning.
fclose($fpSplunkLog);
exit();
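
The line-by-line transform in the script above can be tried on its own, without S3 or Splunk. What follows is a minimal sketch under stated assumptions, not part of the original gist: the helper name `elbLineToSplunk` and the sample log line are made up for illustration, and the regex and output layout simply mirror `$elbLogPattern` and `$splunkFormat`. Lines that do not match the pattern yield `null` instead of being written out.

```php
<?php
// Hypothetical helper (not in the original script): turns one ELB access
// log line into the key="value" layout used by $splunkFormat, or returns
// null when the line does not match the expected pattern.
function elbLineToSplunk($line)
{
    // Same fields, in the same order, as $elbLogPattern in the script.
    $pattern = '/^(?P<timestamp>\S+) (?P<elb>\S+) (?P<client>\S+) (?P<backend>\S+) '
        . '(?P<request_processing_time>\S+) (?P<backend_processing_time>\S+) '
        . '(?P<response_processing_time>\S+) (?P<elb_status_code>\S+) '
        . '(?P<backend_status_code>\S+) (?P<received_bytes>\S+) (?P<sent_bytes>\S+) '
        . '"(?P<request>.*)"$/';

    if (preg_match($pattern, $line, $m) !== 1) {
        return null; // blank or malformed line
    }

    try {
        $timestamp = new DateTime($m['timestamp']);
    } catch (Exception $e) {
        return null; // timestamp field was not a parseable date
    }

    // Every field after the timestamp is emitted as name="value".
    $fields = array(
        'elb', 'client', 'backend',
        'request_processing_time', 'backend_processing_time',
        'response_processing_time', 'elb_status_code', 'backend_status_code',
        'received_bytes', 'sent_bytes', 'request',
    );
    $pairs = array();
    foreach ($fields as $name) {
        $pairs[] = sprintf('%s="%s"', $name, $m[$name]);
    }

    return $timestamp->format('c') . ' ' . implode(', ', $pairs) . "\n";
}

// Made-up sample line in the ELB access log layout the pattern expects.
$sample = '2014-02-15T23:39:43.945958Z my-test-loadbalancer 192.168.131.39:2817 '
    . '10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 '
    . '"GET http://www.example.com:80/ HTTP/1.1"' . "\n";

echo elbLineToSplunk($sample);
```

Keeping the transform in a function like this makes it easy to check a handful of real log lines from your bucket before pointing the full script at it, since the script deletes each S3 object after processing.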