
@drumadrian
Last active March 15, 2024 17:22
Sample logstash.conf file for S3 Input plugin
# References:
# https://www.elastic.co/guide/en/logstash/current/plugins-inputs-s3.html
# https://www.elastic.co/blog/logstash-lines-inproved-resilience-in-S3-input
# https://www.elastic.co/guide/en/logstash/6.3/installing-logstash.html
# https://www.elastic.co/guide/en/logstash/current/working-with-plugins.html
# https://www.garron.me/en/bits/curl-delete-request.html
sudo yum update -y
sudo yum install -y java-1.8.0-openjdk
java -version
# Logstash requires Java 8
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
sudo vi /etc/yum.repos.d/logstash.repo
# Insert the following as the file contents (omitting the leading "#"):
# [logstash-6.x]
# name=Elastic repository for 6.x packages
# baseurl=https://artifacts.elastic.co/packages/6.x/yum
# gpgcheck=1
# gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
# enabled=1
# autorefresh=1
# type=rpm-md
# Now install Logstash
sudo yum install -y logstash
sudo systemctl start logstash
sudo systemctl stop logstash
# Ensure that Logstash starts on boot
sudo systemctl enable logstash
# The S3 Logstash plugins should be present by default; otherwise you will need to install them
sudo yum install -y mlocate
sudo updatedb
cd /usr/share/logstash
bin/logstash-plugin list
# Config files are stored here:
# /etc/logstash/conf.d/*.conf
cd /etc/logstash/conf.d/
sudo vi s3_input.conf
sudo systemctl start logstash
# Now watch the Logstash log file:
tail -f /var/log/logstash/logstash-plain.log
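Per the curl reference above, a stale test index can be removed from Elasticsearch with an HTTP DELETE. A minimal Python sketch (the host and index name here are illustrative; point it at your own cluster):

```python
import urllib.request

# Hypothetical endpoint; replace with your Elasticsearch domain.
ES_HOST = "http://localhost:9200"
INDEX = "logs-2024.03.15"

# Build the DELETE request (curl equivalent: curl -XDELETE http://localhost:9200/logs-2024.03.15)
req = urllib.request.Request(f"{ES_HOST}/{INDEX}", method="DELETE")
# urllib.request.urlopen(req)  # uncomment only against a live cluster
```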
# Sample Logstash configuration for creating a simple
# AWS S3 -> Logstash -> Elasticsearch pipeline.
# References:
# https://www.elastic.co/guide/en/logstash/current/plugins-inputs-s3.html
# https://www.elastic.co/blog/logstash-lines-inproved-resilience-in-S3-input
# https://www.elastic.co/guide/en/logstash/current/working-with-plugins.html
input {
  s3 {
    #"access_key_id" => "your_access_key_id"
    #"secret_access_key" => "your_secret_access_key"
    "region" => "us-west-2"
    "bucket" => "testlogstashbucket1"
    "prefix" => "Logs"
    "interval" => "10"
    "additional_settings" => {
      "force_path_style" => true
      "follow_redirects" => false
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://vpc-test-3ozy7xpvkyg2tun5noua5v2cge.us-west-2.es.amazonaws.com:80"]
    index => "logs-%{+YYYY.MM.dd}"
    #user => "elastic"
    #password => "changeme"
  }
}
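The `index => "logs-%{+YYYY.MM.dd}"` setting expands the event's `@timestamp` using Joda-style date codes, so Logstash writes one index per day. A rough Python sketch of the equivalent formatting (the function name is illustrative):

```python
from datetime import datetime

def daily_index(ts: datetime) -> str:
    # strftime equivalent of Logstash's "logs-%{+YYYY.MM.dd}" sprintf pattern
    return ts.strftime("logs-%Y.%m.%d")

print(daily_index(datetime(2023, 4, 16)))  # logs-2023.04.16
```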
stoufa commented Apr 16, 2023

Hi @ganeshk-nd,
Before switching to AWS Kinesis Firehose, we used to generate the date in the required format and inject it into a template config file.
Here are some snippets from both the template file and the Python script.

input {
  s3 {
    "region" => "REGION_PLACEHOLDER"
    "bucket" => "BUCKET_PLACEHOLDER"
    "prefix" => "PREFIX_PLACEHOLDER"
    "interval" => "10"
    "additional_settings" => {
      "force_path_style" => true
      "follow_redirects" => false
    }
  }
}

...
# generate the prefix (folder) containing today's files;
# format: yyyy/mm/dd/ e.g. 2022/07/05/
today = date.today()
PREFIX = f'{today.year:4}/{today.month:02}/{today.day:02}/'

...

data = {
    'REGION_PLACEHOLDER': args.region,
    'BUCKET_PLACEHOLDER': BUCKET_NAME,
    'PREFIX_PLACEHOLDER': PREFIX,
    # if no environment is set, use dev by default
    'ENVIRONMENT_PLACEHOLDER': 'prod' if args.prod else 'dev'
}

with open(f'templates/pipeline_{context}.conf') as f:
    template = f.read()
    result = template
    for placeholder, value in data.items():
        result = result.replace(placeholder, value)

# saving results to a file
output_file_path = '/path/to/logstash-x.y.z/config/pipeline.conf'

with open(output_file_path, 'w') as f:
    f.write(result)

I hope this helps.
