Get some extra file names from http
redef record HTTP::Info += {
    potential_fname: string &optional;
};

event http_request(c: connection, method: string, original_URI: string,
                   unescaped_URI: string, version: string) &priority=5
    {
    # Get rid of uri arguments
    local path = split_string(c$http$uri, /\?/)[0];
    local out = split_string(path, /\//);
    # Take the last component in the uri path
    c$http$potential_fname = out[|out|-1];
    }

event http_header(c: connection, is_orig: bool, name: string, value: string) &priority=3
    {
    if ( is_orig )
        return;

    if ( name == "ETAG" && /\"/ in value )
        {
        if ( c$http?$potential_fname && c$http$potential_fname != "" )
            c$http$current_entity$filename = c$http$potential_fname;
        }
    }

duffy-ocraven commented Sep 15, 2020

Oh, and a small clarification, so that we don't digress over a canard. I realize Zeek logs aren't sequences of bytes where anything could end up in them, because the tab-separated and JSON outputs both escape non-printable characters. But internally in Zeek, I worry that values of every datatype are just arbitrary sequences of bytes, which means they can technically have nulls or anything else in them. I would blanch if hash results could have nulls or anything of the sort in them. The point I am raising in this discussion is that programmers carry some semantic baggage as they read variable and type names. I blanch if a "filename" can contain a * or / or \. It needs to be termed a filepath if it is the '/'-delimited hierarchy. It needs to be a fullpath if it is the filepath and filename concatenated. It needs to be a pattern if it can contain * or ?.
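
A minimal sketch of that naming distinction in Zeek script, for illustration only. The helper `is_bare_filename` is hypothetical (it is not part of the gist above or of Zeek's base scripts), and it assumes that "filename" means a non-empty string with no path separators and no glob characters:

```zeek
# Hypothetical helper, not part of the gist: returns T only when the
# candidate looks like a bare filename, i.e. it is non-empty and contains
# no path separators ('/' or '\') and no glob characters ('*' or '?').
function is_bare_filename(candidate: string): bool
    {
    if ( candidate == "" )
        return F;

    # "pattern in string" is Zeek's embedded (substring) pattern match.
    if ( /[\/\\*?]/ in candidate )
        return F;

    return T;
    }

event zeek_init()
    {
    print is_bare_filename("report.pdf");       # T
    print is_bare_filename("docs/report.pdf");  # F: this would be a fullpath
    print is_bare_filename("*.pdf");            # F: this would be a pattern
    }
```

A check along these lines could guard the assignment to `c$http$current_entity$filename`, so that only values matching the narrower meaning of "filename" end up in that field.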
