Thank you to the Log2timeline and Timesketch teams for putting out some solid work. I am thankful that I have the opportunity to create this write-up, which comes on the back of their hard work. While there may be holes in their documentation and descriptions, the code is well written and fairly easy to understand. Any criticism should not be interpreted as a reflection of the quality of their work.
Setting up Timesketch and Plaso seems pretty easy, but what happens when we need to create an instance dynamically with custom parsers added? This document captures some of my experience and provides some insight into getting this up and running in a reasonable amount of time.
A co-worker gave me an IP address and asked me to:
- Provide guidance on any additional information
- Investigate related hosts in the network
These are pretty standard tasks, so I fired up a VM and started visiting the IP address, which turned out to be a pretty simple task. I repeatedly visited the site and got redirected through a sketchy ad network.
To support the analysis, an Ubuntu Linux VM was created and a passive DNS service was set up to capture the domain resolutions.
- Ubuntu 18.04 in a VM
- PassiveDNS
Once the rig was set up, the IP address was plugged in, and the site was visited and redirected to a number of different sites. This resulted in several disparate log and information sources (e.g. passive DNS, Firefox web history, and cookies).
Additional tools were needed to combine the different information sources.
Plaso is a tool that helps simplify and automate the extraction of forensic information from files, logs, etc. Plaso can analyze known one-off log files or disk images and build timelines from the events and information recovered. If the log format is not known to Plaso, a new parser and formatter need to be written and added to the project. In this case, the passive DNS service log is not supported, so additional development is required. However, the current documentation is lacking on how to write, test, and deploy the plugin, so this will be covered.
Timesketch is an application that helps with the analysis and visualization of timelines from Plaso or other sources of timeline data. The easiest way to get started with Timesketch is using Docker. However, following the current documentation does not produce a working instance of all the services.
In this example, a Plaso parser and formatter will be created. The parser and formatter will then be used to extract and export events from the log file using Timesketch, psort, etc.
- Create the parser and event Python classes.
  - The event class contains fields that detail the event.
    - set the `DATA_TYPE` (used to map the event to a formatter)
    - create the important fields for the event (not `timestamp`, because it causes aliasing with event data)
  - The parser takes a file object, data, or a parsed log line. Information is extracted and put into an event.
    - set the `NAME` (used to map the parser to a filter)
    - set the `DESCRIPTION`
    - Parsing different file/data types
      - for binary or unpredictable content formats, derive the parser from `interface.FileObjectParser`
      - for textual and predictable content formats (e.g. logs), derive the parser from `text_parser.PyparsingSingleLineTextParser`, etc. (see `parsers.text_parser` for other types)
    - If creating a binary parser, derive the class from `interface.FileObjectParser`:
      - put the parsing logic in `ParseFileObject`; at run-time this method gets called
    - If creating a text parser, derive the class from `text_parser.PyparsingSingleLineTextParser`:
      - put the parsing logic in `ParseRecord`; at run-time this method gets called from `FileObjectParser` after reading a line
      - create the parsing grammar using the `pyparsing` Python module
      - create a mapping of the name to grammar in `LINE_STRUCTURES`, which maps the grammars to a key (see comments in the `text_parser.PyparsingSingleLineTextParser` class)
- Create the formatter Python class
  - import the event created in the newly created parser module
  - set the `DATA_TYPE` (used to map the event to a formatter)
  - create the important fields for the event (not `timestamp`, because it causes aliasing with event data)
- Update the imports in `plaso/parsers/__init__.py` and `plaso/formatters/__init__.py`
- Testing the parser
  - Create the `unittest` test case in `tests/parsers/` based on the new parser module
  - Create the test runner
    - copy/create a sample of the expected log data in `test_data/`
    - copy `run_tests.py` --> `new_run_tests.py`
    - for an idea on how to create the test, look in `tests/parsers` and `tests/formatters` for a representative example
    - change the runner to focus only on the `tests/parsers` directory and the specific parser module: `test_suite = unittest.TestLoader().discover('tests/parsers/', pattern='CHANGEME_TO_PARSER_NAME.py')`
    - change the runner to focus only on the `tests/formatters` directory and the specific formatter module: `test_suite = unittest.TestLoader().discover('tests/formatters/', pattern='CHANGEME_TO_FORMATTER_NAME.py')`
- Testing Plaso
  - test the parser on the specific file content
    - execute `log2timeline.py --parsers PARSER_NAME --status_view none test.plaso EXAMPLE.log`
  - look for potential warnings or errors and correct them in the code (see below for an example): `pinfo.py -v test.plaso`
************************ Warnings generated per parser *************************
Parser (plugin) name : Number of warnings
--------------------------------------------------------------------------------
<No parser> : 1
--------------------------------------------------------------------------------
************************* Pathspecs with most warnings *************************
Number of warnings : Pathspec
--------------------------------------------------------------------------------
1 : type: OS, location:
/home/you/EXAMPLE.log
--------------------------------------------------------------------------------
********************************** Warning: 0 **********************************
Message : unable to process path specification with error: Expected
"||" (at char 82), (line:1, col:83)
Parser chain :
Path specification : type: OS, location:
/home/you/EXAMPLE.log
--------------------------------------------------------------------------------
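The `Expected "||"` message in the warning above suggests that a log line did not contain as many `||` delimiters as the grammar required. One quick way to find such lines before running `log2timeline.py` again is to check the sample log with plain Python. The following is a minimal sketch, not part of Plaso: the nine field names are hypothetical and assume the default PassiveDNS layout, and the sample lines are made up for illustration, so adjust both to match the actual log.

```python
"""Pre-check PassiveDNS-style log lines before handing them to log2timeline.py."""

# Assumed field layout (hypothetical names); adjust to match the real log.
FIELDS = (
    "timestamp", "client", "server", "rr_class",
    "query", "query_type", "answer", "ttl", "count",
)
DELIMITER = "||"


def split_line(line):
    """Return a dict of field name -> value, or raise ValueError if malformed."""
    parts = line.rstrip("\n").split(DELIMITER)
    if len(parts) != len(FIELDS):
        raise ValueError(
            "expected %d fields separated by %r, found %d"
            % (len(FIELDS), DELIMITER, len(parts)))
    return dict(zip(FIELDS, parts))


def find_bad_lines(lines):
    """Return (line_number, message) pairs for lines that do not split cleanly."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are ignored, not reported
        try:
            split_line(line)
        except ValueError as error:
            problems.append((number, str(error)))
    return problems


if __name__ == "__main__":
    # Made-up sample data; the second line is deliberately truncated.
    sample = [
        "1554000000.123456||10.0.0.5||8.8.8.8||IN||example.com.||A||93.184.216.34||300||1",
        "1554000001.000000||10.0.0.5||8.8.8.8||IN||example.com.||A",
    ]
    for number, message in find_bad_lines(sample):
        print("line %d: %s" % (number, message))
```

Lines reported here are good candidates for the test data in `test_data/`, since they are the ones the parser grammar either has to accept or has to skip cleanly.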
Several modifications need to be made to Timesketch to get it working, and the container build also needs to be updated to install a fresh build of Plaso with the new code.
- `Dockerfile` updates
  - create configuration file for the container
  - update code in the Timesketch distribution (see the `tasks.py` item below)
  - update and install Plaso from source with the new parsers
- Added a few explicit configurations to the `docker-compose.yml`
  - set up passwords for `neo4j` and link it to `timesketch`
- Set host IPs and passwords in `timesketch.conf` manually to ensure services can connect. Where IPs are concerned, the address is set to the default docker interface (e.g. 172.17.0.1).
  - set `neo4j` IP and password
  - set `sqlalchemy` URI for `postgres`: IP, username, and password
  - set `celery` IP for URI
  - set `elasticsearch` IP
- Update `tasks.py` in Timesketch to properly handle Plaso uploads.
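Since the addresses in `timesketch.conf` are set by hand, it can help to confirm that each service actually accepts connections on the Docker interface before debugging Timesketch itself. The following is a rough sketch of such a check. The ports are each service's stock default and the use of Redis as the Celery broker is an assumption; neither is taken from the configuration files referenced here, so change them to match the `docker-compose.yml` in use.

```python
"""Check that the services named in timesketch.conf accept TCP connections."""
import socket

# Assumed values: the default Docker bridge address mentioned in the text and
# each service's stock port. Adjust both to match your docker-compose.yml.
DOCKER_HOST = "172.17.0.1"
SERVICES = {
    "neo4j": 7474,
    "postgres": 5432,
    "redis (celery broker)": 6379,
    "elasticsearch": 9200,
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if port_is_open(DOCKER_HOST, port) else "NOT reachable"
        print("%-22s %s:%d %s" % (name, DOCKER_HOST, port, state))
```

A service reported as not reachable usually points at a port that is not published in `docker-compose.yml` or at a container that has not finished starting, which is worth ruling out before changing `timesketch.conf` again.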
In the `Dockerfile`, the following lines were added to facilitate the install and update of Plaso:
# install utility dependencies
RUN apt install -y sudo wget software-properties-common
# clone the plaso repo
RUN git clone https://github.com/log2timeline/plaso/ /tmp/plaso
# place the parser and formatter files into the right directory
ADD docker/parser_passivedns.py /tmp/plaso/plaso/parsers/passivedns.py
ADD docker/formatter_passivedns.py /tmp/plaso/plaso/formatters/passivedns.py
# register the new modules in the package __init__.py files
# (printf is used because the default /bin/sh may not support `echo -e`)
RUN printf '\nfrom plaso.parsers import passivedns\n' >> /tmp/plaso/plaso/parsers/__init__.py
RUN printf '\nfrom plaso.formatters import passivedns\n' >> /tmp/plaso/plaso/formatters/__init__.py
# install dependencies
RUN bash /tmp/plaso/config/linux/gift_ppa_install_py3.sh
RUN pip3 install -r /tmp/plaso/requirements.txt
RUN pip3 install -r /tmp/plaso/test_requirements.txt
RUN pip3 install /tmp/plaso
In the `Dockerfile`, the following lines were added to facilitate the update of Timesketch:
# Copy Timesketch config files into /etc/timesketch
RUN mkdir /etc/timesketch
RUN cp -r /tmp/timesketch/config/* /etc/timesketch/
ADD docker/tasks.py /usr/local/lib/python3.6/dist-packages/timesketch/lib/tasks.py
ADD docker/timesketch.conf /etc
Here are the commands that can be executed to install, update, and run Timesketch:
git clone https://github.com/google/timesketch/
cd timesketch/docker
# assuming wget is installed
# Download required files for timesketch
wget -O Dockerfile https://gist.githubusercontent.com/deeso/7418cb71f37fd9b7e37861de77dd9fb8/raw/dec8cbd34a9e876626b562bd2922b3b6dfb9de66/Dockerfile
wget -O docker-compose.yml https://gist.githubusercontent.com/deeso/5e3e07b515ee6d0a43e91c0dfadf3bdf/raw/5f258ba6257e81890444b6282dfb95f89537dea8/docker-compose.yml
wget -O tasks.py https://gist.githubusercontent.com/deeso/0def71c215509c87e15ce7f22e7441f6/raw/6208919d3fb854e6f142c219375151a8b2ee503a/tasks.py
wget -O timesketch.conf https://gist.githubusercontent.com/deeso/6533d515693247664fa423d6c669e67b/raw/c3be406fdda8d5285466bfcc6c5f47885f44db89/timesketch.conf
# Download required files for the plaso update and install
wget -O parser_passivedns.py https://gist.githubusercontent.com/deeso/513dfad9fbbc9638d57129acd96762d6/raw/d7a5d3af5520eab145067ec98ec70085ee6baa0c/parser_passivedns.py
wget -O formatter_passivedns.py https://gist.githubusercontent.com/deeso/d0f8eb304d9c6771fa9a9b085d40a382/raw/e271ba5d873ccdc0089d9f78c1a70c13a24693c8/formatter_passivedns.py
# build the images, then start the containers in the background
docker-compose build && docker-compose up --detach
Reference Timesketch Files
Reference Plaso Files