@bshuler
Forked from halr9000/README.md
Last active September 30, 2023 15:24
How to make fake data in Splunk using SPL

Sometimes, you need to fake something in Splunk. Might be during development and you don't feel like writing a real search, but you really need a number for a dashboard panel to look right. Maybe you are helping someone with a hairy regex, and you don't want to index data just to test it on your instance. Whatever the reason, here are some searches that have helped me out.

Note that when using these techniques, you are not going through the indexing and parsing pipelines, so you can't test index-time behavior such as line breaking or timestamp extraction.

Make an event containing a string and a numeric field

    | makeresults | eval msg="hello", seq=1

Make events containing a random number

This uses the random() function of the eval command. Unfortunately, this function does not take a range parameter; it returns a pseudo-random integer between 0 and 2^31-1. We can make it fit a desired range with the modulo operator. The remainder of a division is always less than the divisor (10 in this case), so the result falls in 0..9. Note that all events generated this way will have the same _time.

    | makeresults count=10 | eval int=random() % 10
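
If identical timestamps get in the way (for example, when feeding a timechart), one possible workaround is to number the events with streamstats and push each one back in time. This is only a sketch: the field name n and the one-minute spacing are arbitrary choices for illustration.

```spl
| makeresults count=10
| streamstats count AS n
| eval _time = _time - n * 60, int = random() % 10
| fields - n
```

Each event should end up one minute earlier than the one before it, with the newest event one minute before the search time.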

To shift the range, add an offset to the result. This will create numbers in 1..10:

    | makeresults count=10 | eval int=random() % 10 + 1
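
The same idea generalizes to any inclusive range: take the remainder modulo (high - low + 1), then add low. A minimal sketch, assuming you want integers from 20 to 30; low and high are helper fields made up for this example and dropped at the end.

```spl
| makeresults count=10
| eval low = 20, high = 30
| eval int = random() % (high - low + 1) + low
| fields - low high
```

Here the modulo produces 0..10, and adding 20 shifts that to 20..30.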

Fields containing random values from a set

This will create events containing one of the two "answers" supplied to the if() function. Because we are using random() and modulo again, we know the remainder will be in 0..4 (one less than the divisor, which is 5 here). Because the if() checks for equality to 1, the first option will appear in approximately 20% of the events. I'm using a higher count of 100 so the observed split lands closer to that proportion. This makes for better fake data.

    | makeresults count=100 | eval poll=if((random()%5) == 1, "Option A", "Option B")

These are good to pass to a stats function (and then visualizations) like this:

    | makeresults count=100 | eval poll=if((random()%5) == 1, "Option A", "Option B") | stats count by poll
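
To check how often each remainder actually shows up, you can count the raw remainders and convert the counts to percentages. This is a rough sketch; the field names remainder, total, and pct are made up for the example, and the count of 10000 is just large enough to smooth out the noise.

```spl
| makeresults count=10000
| eval remainder = random() % 5
| stats count by remainder
| eventstats sum(count) AS total
| eval pct = round(100 * count / total, 1)
```

With a divisor of 5 there are five possible remainders (0 through 4), so each one should get roughly a fifth of the events.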

To generate more than 2 values, you'll need to use case():

    | makeresults count=10 | eval num = random() % 100, error = case(num < 10, 404, num >= 10 AND num < 13, 500, num >= 13, 200), error_msg = case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")
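
When every value should be about equally likely, an alternative to case() is to put the values in a multivalue field and pick one by random index. A sketch under that assumption; the server names are hypothetical placeholders.

```spl
| makeresults count=10
| eval servers = split("web01,web02,db01", ",")
| eval server = mvindex(servers, random() % mvcount(servers))
| fields - servers
```

Because mvindex() is zero-based and the modulo never reaches mvcount(servers), the index always points at a valid entry. Stick with case() when you need uneven weights like the 10% / 3% / 87% split above.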

To make a table of data:

    | makeresults | eval _raw = "John Doe (jdoe@splunk.com)%Manager%Trusted#Peter Doe (pdoe@splunk.com)%Contractor%Trusted#Sally Doe (sdoe@splunk.com)%VP%WatchList#Al Doe (adoe@splunk.com)%DevOps%WatchList" | rex  max_match=99 "(?<_raw>[^#]+)" | mvexpand _raw | table _raw | rex "(?<Name>.*) \((?<Email>.*)\)%(?<Group>.*)%(?<Class>.*)" | table Name Email Group Class
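
For a table with only a few rows, a more verbose but arguably easier-to-read option is to build each row with its own makeresults and stitch the rows together with append. This sketch reuses two of the rows from the search above.

```spl
| makeresults
| eval Name="John Doe", Email="jdoe@splunk.com", Group="Manager", Class="Trusted"
| append [| makeresults | eval Name="Sally Doe", Email="sdoe@splunk.com", Group="VP", Class="WatchList"]
| table Name Email Group Class
```

This gets tedious past a handful of rows, which is where the delimited-string approach pays off.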

Test a regular expression

Now I'm just slapping a whole event into _raw so that I can test a regular expression on it. Be careful with embedded quotes: they will not parse well, and escaping them is awkward. Since we are faking it, it's usually OK to remove double quotes or change them to single quotes.

    | makeresults | eval _raw = "2016-09-13 22:23:28,289 INFO	[57d8b4a04210814f1d0] cached:77 - memoized decorator used on function <function getEntities at 0x106af22a8> with non hashable arguments" 
    | rex field=_raw "\[(?<foo>.*)\]"
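
If the double quotes have to stay because the regex depends on them, backslash-escaping them in both the eval string and the rex pattern generally works. A sketch with a made-up event; the user and action key-value pairs are hypothetical and not taken from a real log.

```spl
| makeresults
| eval _raw = "2016-09-13 22:23:28,289 INFO user=\"jdoe\" action=\"login\""
| rex field=_raw "user=\"(?<user>[^\"]+)\""
| table _raw user
```

The negated character class [^\"]+ stops at the first closing quote, which tends to be safer than .* when an event contains more than one quoted value.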