bshuler/README.md

## README.md

      
How to make fake data in Splunk using SPL

Sometimes you need to fake something in Splunk. Maybe you are in the middle of
development and don't feel like writing a real search, but you really need a number for a
dashboard panel to look right. Maybe you are helping someone with a hairy regex,
and you don't want to index data just to test it on your instance. Whatever the
reason, here are some searches that have helped me out.

Note that these techniques do not go through the indexing and parsing pipelines,
so you can't test everything with them (index-time behavior such as line breaking
or timestamp extraction, for example).

Make an event containing a string field and a numeric field

    | makeresults | eval msg="hello", seq=1

Make events containing a random number

This uses the random() function of the eval command. Unfortunately, this function
does not take a range parameter, so it spits out a pseudo-random integer between 0
and 2^31-1. We can make it fit a desired range with the modulo operator. Since modulo
math "wraps around", you know that the remainder will always be less than your
divisor, in this case 10, so the values run from 0 to 9. Note that all events
generated this way will have the same _time.

    | makeresults count=10 | eval int=random() % 10
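
If the shared _time gets in the way (a timechart of fake data, for instance), one possible workaround is to number the events and push each one back in time. This is only a sketch: the field name seq and the 60-second spacing are arbitrary choices of mine.

```spl
| makeresults count=10
| streamstats count AS seq
| eval _time = _time - (seq * 60), int = random() % 10
```

Each event should land one minute earlier than the previous one, so the ten events cover roughly the last ten minutes.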

And if you want to shift the range, add an offset to the result. This will create numbers between 1 and 10:

    | makeresults count=10 | eval int=random() % 10 + 1
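
The same idea generalizes to any inclusive range: take the remainder modulo (max - min + 1), then add min. As a sketch, with bounds of 50 and 150 picked purely for illustration, the divisor is 150 - 50 + 1 = 101:

```spl
| makeresults count=10
| eval int = (random() % 101) + 50
```

The remainder falls in 0..100, so after adding 50 the values fall in 50..150.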

Fields containing random values from a set

This will create events containing one of the two "answers" supplied
to the if() function. Because we are using random() and modulo again, we know the
remainder will be in 0..4 (one less than the divisor, which is 5 here). Because the if()
checks for equality to 1, the first option will appear in approximately 20% of the events.
I'm using a higher count of 100 so the random split has enough events to show up. This
makes for better fake data.

    | makeresults count=100 | eval poll=if((random()%5) == 1, "Option A", "Option B")

These are good to pass to a stats function (and then on to visualizations) like this:

    | makeresults count=100 | eval poll=if((random()%5) == 1, "Option A", "Option B") | stats count by poll
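
To check how the split actually comes out, the counts can be turned into percentages. This is a rough sketch, not the only way to do it: the field names total and pct are my own, and I bumped the count to 1000 so the percentages are less noisy.

```spl
| makeresults count=1000
| eval poll = if((random() % 5) == 1, "Option A", "Option B")
| stats count BY poll
| eventstats sum(count) AS total
| eval pct = round(count / total * 100, 1)
```

Since one remainder out of five matches, "Option A" should hover around 20%, though the exact figure changes on every run.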

To generate more than 2 values, you'll need to use case(). Here roughly 10% of the events get a 404, 3% get a 500, and the rest get a 200:

    | makeresults count=10
    | eval num = random() % 100,
        error = case(num < 10, 404, num >= 10 AND num < 13, 500, num >= 13, 200),
        error_msg = case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")
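
When every value should be equally likely, a lighter alternative to case() is to index into a list with the random remainder. This is a sketch under that assumption; the host names are made up for illustration.

```spl
| makeresults count=10
| eval host = mvindex(split("web01,web02,db01", ","), random() % 3)
```

split() turns the string into a multivalue field and mvindex() picks one entry by zero-based position, so the divisor (3 here) has to match the number of entries in the list.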

To make a table of data, pack all the rows into one string (here # separates rows and % separates columns), split it into one event per row, then pull the columns out with rex:

    | makeresults
    | eval _raw = "John Doe (jdoe@splunk.com)%Manager%Trusted#Peter Doe (pdoe@splunk.com)%Contractor%Trusted#Sally Doe (sdoe@splunk.com)%VP%WatchList#Al Doe (adoe@splunk.com)%DevOps%WatchList"
    | rex max_match=99 "(?<_raw>[^#]+)"
    | mvexpand _raw
    | table _raw
    | rex "(?<Name>.*) \((?<Email>.*)\)%(?<Group>.*)%(?<Class>.*)"
    | table Name Email Group Class
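
For simple delimited rows, the same kind of table can probably be built without any regular expressions by leaning on split() and mvindex(). This is a sketch rather than a drop-in replacement: the sample rows, the delimiters (; between rows, a comma between columns), and the field names row, name, and role are all invented for this example.

```spl
| makeresults
| eval row = split("alice,admin;bob,user;carol,auditor", ";")
| mvexpand row
| eval name = mvindex(split(row, ","), 0), role = mvindex(split(row, ","), 1)
| table name role
```

This only works when the delimiters never appear inside the data itself, which is usually easy to arrange when the data is fake anyway.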

Test a regular expression

Now I'm just slapping a whole event into _raw so that I can test a regular expression on it. Be careful
with embedded double quotes: they end the eval string early unless each one is backslash-escaped, which
gets tedious. Since we are faking it, it's usually OK to just remove the double quotes or change them to
single quotes.

    | makeresults | eval _raw = "2016-09-13 22:23:28,289 INFO [57d8b4a04210814f1d0] cached:77 - memoized decorator used on function <function getEntities at 0x106af22a8> with non hashable arguments"
    | rex field=_raw "\[(?<foo>.*)\]"
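
The greedy .* above works because the sample event has only one pair of square brackets. If an event had several, a negated character class is the safer choice. Here is a sketch that also pulls out the timestamp and log level; the field names ts, level, and id are placeholders of mine, and the event is a shortened copy of the sample.

```spl
| makeresults
| eval _raw = "2016-09-13 22:23:28,289 INFO [57d8b4a04210814f1d0] cached:77 - memoized decorator used on function"
| rex field=_raw "^(?<ts>\S+ \S+)\s+(?<level>\w+)\s+\[(?<id>[^\]]+)\]"
| table ts level id
```

Ending with table makes it easy to see at a glance whether each capture group grabbed what you expected.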