netstat on all machines -> python -> graphviz -> png
$ knife ssh -m "...every host in the network..." "sudo netstat -nutap" -a hostname > meganetstat.txt
$ python
>>> from collections import Counter as C
>>> HS = "...every host in the network...".split()
>>> ip = lambda s: s.split(":")[0]
>>> xs = [list(map(ip, [x[0], x[4], x[5]])) for x in [x.strip().split() for x in open("meganetstat.txt").readlines() if "tcp" in x] if len(x)>=6]
>>> ipmap = [(h, C([x[1] for x in xs if x[0] == h])) for h in HS]
>>> ipmapx = dict([(sorted([(x,y) for (x,y) in ip[1].items() if x.startswith("10.")], key=lambda t: -t[1])[0][0], ip[0]) for ip in ipmap])
>>> sorted(C(map(ipmapx.get, [x[2] for x in xs if x[2].startswith("10.")])).items(), key=lambda t: t[1])
[...a list of hosts ordered by # of incoming edges, load balancers had the most, etc...]
>>> edges = [(x[0], ipmapx.get(x[2])) for x in xs]
>>> open("out.gv", "w").write(("digraph world {\n" + ("\n".join('\t"%s" -> "%s";' % x for x in set(edges) if "None" not in repr(x) and x[0] != x[1])) + "\n}\n"))
out.gv looks like this:
digraph world {
"host3" -> "host4";
"host18" -> "host1";
"host3" -> "host10";
"host5" -> "host7";
...hundreds of more edges...
}
then you use the "dot" command to render the graph to an image:
$ dot -Tpng out.gv > out
$ dot -Tpng -Ktwopi out.gv > out3.png
$ dot -Tpng -Kcirco out.gv > out4.png
$ dot -Tpng -Ksfdp out.gv > out5.png
$ dot -Ksfdp -Gsize=100! -Goverlap=prism -Tpng out.gv > out6.png
I believe that last one gave the best output.

Sorry this code is really brittle, has horrible names, and depends very specifically on the format of the "meganetstat.txt" file and some hardcoded assumptions. I will try to walk through the code a bit in case you want to re-use it.

Here's what the input file should look like:

foo-server     Active Internet connections (servers and established)
foo-server     Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
foo-server     tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1292/sshd
foo-server     tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1497/master
...

This is just netstat -nutap with the name of the server prefixed to every line. I don't use "knife" anymore, so I would probably do something different (e.g. dump each machine's output to a file?), but the idea is the same: you need the full netstat output along with the host it belongs to.
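
If you are not using knife, one way to get the same layout is to capture each machine's netstat output separately and prefix the host name yourself. This is a minimal sketch, assuming you already have the raw output of one host as a string; the function name `prefix_lines` and the sample data are made up for illustration:

```python
def prefix_lines(hostname, netstat_output):
    """Prefix every non-empty line of one host's netstat output with its hostname."""
    return [
        "%s %s" % (hostname, line)
        for line in netstat_output.splitlines()
        if line.strip()
    ]


# Hypothetical sample: raw `netstat -nutap` output captured from one host.
sample = """Active Internet connections (servers and established)
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1292/sshd
"""

prefixed = prefix_lines("foo-server", sample)
for line in prefixed:
    print(line)
```

Appending the prefixed lines of every host to one file should give something the rest of the code can read, since it only relies on whitespace-separated columns with the host name first.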

Once you have this file, xs is defined to be a list of [src_hostname, src_ip, dst_ip] triples:

>>> pprint.pprint(xs)
[['foo-server', '0.0.0.0', '0.0.0.0'],
 ['foo-server', '127.0.0.1', '0.0.0.0'],
 ['foo-server', '0.0.0.0', '0.0.0.0'],
 ['foo-server', '0.0.0.0', '0.0.0.0'],
 ['foo-server', '172.11.111.11', '172.11.111.55'],
...
]
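
The one-liner that builds xs is hard to read, so here is a spelled-out sketch of the same parsing step. The function name `parse_triples` is mine, and it checks the protocol column instead of looking for "tcp" anywhere in the line, which is slightly stricter than the original. The ESTABLISHED line in the sample is invented for the example:

```python
def parse_triples(lines):
    """Return [src_hostname, src_ip, dst_ip] for every tcp line."""
    triples = []
    for line in lines:
        fields = line.split()
        # Expected columns: hostname, proto, recv-q, send-q, local, foreign, ...
        if len(fields) < 6 or not fields[1].startswith("tcp"):
            continue
        # Keep only the part before the first ":". IPv6 addresses such as
        # ":::22" collapse to an empty string, same as in the original code.
        local_ip = fields[4].split(":")[0]
        foreign_ip = fields[5].split(":")[0]
        triples.append([fields[0], local_ip, foreign_ip])
    return triples


# Sample input in the prefixed format (the last line is made up).
sample_lines = [
    "foo-server     Active Internet connections (servers and established)",
    "foo-server     Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name",
    "foo-server     tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1292/sshd",
    "foo-server     tcp        0      0 172.11.111.11:22        172.11.111.55:39364     ESTABLISHED 1292/sshd",
]

triples = parse_triples(sample_lines)
print(triples)
```

Note that the column positions depend on the host name being the first field. Without that prefix every column shifts left by one.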

Any host can have multiple IPs (127.0.0.1, 0.0.0.0, multiple NICs, public network, private network, etc.), so how do we know how to associate these with other machines in the cluster? We first define ipmap as a list of (hostname, Counter(ip -> occurrences)) pairs.

>>> ipmap = [(h, C([x[1] for x in xs if x[0] == h])) for h in HS]
>>> ipmap
[('foo-server', Counter({'172.11.111.11': 929, '127.0.0.1': 708, '': 679, '0.0.0.0': 3}))]
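
The comprehension above rescans xs once per host in HS. As an alternative sketch, the same counts can be collected in one pass, without needing the HS list at all. The name `count_local_ips` is hypothetical, and it returns a dict keyed by hostname rather than a list of pairs:

```python
from collections import Counter, defaultdict


def count_local_ips(triples):
    """Map each hostname to a Counter of the local IPs seen on that host."""
    counts = defaultdict(Counter)
    for hostname, src_ip, _dst_ip in triples:
        counts[hostname][src_ip] += 1
    return dict(counts)


# Made-up triples in the [src_hostname, src_ip, dst_ip] shape.
triples = [
    ["foo-server", "172.11.111.11", "172.11.111.55"],
    ["foo-server", "127.0.0.1", "0.0.0.0"],
    ["foo-server", "172.11.111.11", "172.11.111.55"],
]

ip_counts = count_local_ips(triples)
print(ip_counts)
```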

For this experiment I only wanted to map out the connections on the private network. For this I used the dumb heuristic of filtering the IPs with x.startswith('10.') (or 172. or whatever it is on your network). I also decided to only use the "most common" private IP to make the code simpler. ipmapx is defined by mapping the most common private IP for each hostname to that hostname:

>>> ipmapx = dict([(sorted([(x,y) for (x,y) in ip[1].items() if x.startswith("172.")], key=lambda t: -t[1])[0][0], ip[0]) for ip in ipmap])
>>> ipmapx
{'172.11.111.11': 'foo-server'}
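
One thing to watch for: the one-liner indexes `[0][0]` into the sorted list, so it raises an IndexError for any host that has no address matching the prefix. Here is a more defensive sketch of the same idea that skips such hosts. The function name `build_ip_to_host`, the `prefix` parameter, and the `bar-server` entry are all assumptions for illustration:

```python
from collections import Counter


def build_ip_to_host(ip_counts, prefix="10."):
    """Map each host's most common IP starting with `prefix` to its hostname."""
    ip_to_host = {}
    for hostname, counter in ip_counts.items():
        private = Counter(
            {ip: n for ip, n in counter.items() if ip.startswith(prefix)}
        )
        if not private:
            # No address on this host matched the prefix, so leave it out
            # instead of failing.
            continue
        most_common_ip, _count = private.most_common(1)[0]
        ip_to_host[most_common_ip] = hostname
    return ip_to_host


ip_counts = {
    "foo-server": Counter(
        {"172.11.111.11": 929, "127.0.0.1": 708, "": 679, "0.0.0.0": 3}
    ),
    # Hypothetical host with only a loopback address.
    "bar-server": Counter({"127.0.0.1": 5}),
}

ip_to_host = build_ip_to_host(ip_counts, prefix="172.")
print(ip_to_host)
```

Hosts that get skipped will simply never appear as a destination in the graph, which may or may not be what you want.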

Finally, we walk over the [src_hostname, src_ip, dst_ip] triples and use ipmapx to look up the hostname for each dst_ip. The result is a list of (src_hostname, dst_hostname) pairs, which are the edges of the network graph:

>>> edges = [(x[0], ipmapx.get(x[2])) for x in xs]
>>> pprint.pprint(edges)
[('foo-server', None),
 ('foo-server', None),
 ('foo-server', None),
 ('foo-server', None),
 ('foo-server', 'foo-server'),
 ('foo-server', 'foo-server'),
 ('foo-server', 'foo-server'),
 ('foo-server', 'foo-server'),
 ('foo-server', 'foo-server'),
...
]
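
As the output shows, the raw edge list contains None destinations (IPs that don't map to a known host), self-loops, and many duplicates. The original code filters these out while writing the file, using `"None" not in repr(x)` and `x[0] != x[1]`. A sketch that does the filtering up front instead, with a hypothetical `build_edges` helper and made-up sample hosts:

```python
def build_edges(triples, ip_to_host):
    """Return unique (src_hostname, dst_hostname) pairs, dropping
    unknown destinations and self-loops."""
    edges = set()
    for src_host, _src_ip, dst_ip in triples:
        dst_host = ip_to_host.get(dst_ip)
        if dst_host is None or dst_host == src_host:
            continue
        edges.add((src_host, dst_host))
    return sorted(edges)


# Made-up mapping and triples for two hosts.
ip_to_host = {"172.11.111.11": "foo-server", "172.11.111.55": "bar-server"}
triples = [
    ["foo-server", "0.0.0.0", "0.0.0.0"],              # unknown destination
    ["foo-server", "172.11.111.11", "172.11.111.11"],  # self-loop
    ["foo-server", "172.11.111.11", "172.11.111.55"],
    ["foo-server", "172.11.111.11", "172.11.111.55"],  # duplicate
    ["bar-server", "172.11.111.55", "172.11.111.11"],
]

edges = build_edges(triples, ip_to_host)
print(edges)
```

Comparing against None directly also avoids dropping a host whose name happens to contain the text "None", which the repr check would do.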

From there, the rest of the code just writes the edges out in the graphviz format as text, like so:

digraph world {
    "host3" -> "host4";
    "host18" -> "host1";
    "host3" -> "host10";
    "host5" -> "host7";
    ...hundreds of more edges...
}
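
The writing step can be sketched as a small function that returns the graphviz text for a list of edges. The name `to_dot` is hypothetical, and it assumes the edges were already filtered for None values and self-loops:

```python
def to_dot(edges, graph_name="world"):
    """Render (src, dst) pairs as a graphviz digraph."""
    lines = ["digraph %s {" % graph_name]
    for src, dst in sorted(set(edges)):
        lines.append('\t"%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines) + "\n"


dot_text = to_dot([("host3", "host4"), ("host18", "host1"), ("host3", "host4")])
print(dot_text)

# To save it for the dot command:
# with open("out.gv", "w") as f:
#     f.write(dot_text)
```

Sorting the edges is optional, but it keeps the output stable between runs, which makes diffs of out.gv easier to read.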

The dot commands at the end just try out different layout and size options.

@schlomo

schlomo commented Aug 23, 2013

Cool thing! Sadly I get

Traceback (most recent call last):
  File "netstat2dot.py", line 7, in <module>
    ipmapx = dict([(sorted([(x,y) for (x,y) in ip[1].items() if x.startswith("10.")], key=lambda t: -t[1])[0][0], ip[0]) for ip in ipmap])

My input looks like this:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 0.0.0.0:1099                0.0.0.0:*                   LISTEN      4671/jstatd         
tcp        0      0 0.0.0.0:58027               0.0.0.0:*                   LISTEN      4671/jstatd         
tcp        0      0 0.0.0.0:875                 0.0.0.0:*                   LISTEN      4177/rpc.rquotad    
tcp        0      0 0.0.0.0:40845               0.0.0.0:*                   LISTEN      4181/rpc.mountd     
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      1254/rpcbind        
tcp        0      0 0.0.0.0:28017               0.0.0.0:*                   LISTEN      4634/mongod         
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      4518/sshd           
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      4612/master         
tcp        0      0 0.0.0.0:34331               0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:2049                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:5666                0.0.0.0:*                   LISTEN      4534/nrpe           
tcp        0      0 0.0.0.0:40738               0.0.0.0:*                   LISTEN      4181/rpc.mountd     
tcp        0      0 0.0.0.0:60546               0.0.0.0:*                   LISTEN      1272/rpc.statd      
tcp        0      0 0.0.0.0:8935                0.0.0.0:*                   LISTEN      4621/python         
tcp        0      0 127.0.0.1:199               0.0.0.0:*                   LISTEN      4280/snmpd          
tcp        0      0 0.0.0.0:50920               0.0.0.0:*                   LISTEN      4181/rpc.mountd     
tcp        0      0 0.0.0.0:27017               0.0.0.0:*                   LISTEN      4634/mongod         
tcp        0      0 10.93.242.186:56504         10.93.133.239:389           ESTABLISHED 1198/nslcd          
tcp        0     80 10.93.242.186:22            10.3.134.30:39364           ESTABLISHED 13949/sshd          
tcp        0      0 10.93.242.186:53670         10.1.130.146:2003           ESTABLISHED 5302/python         

Not knowing knife I am not sure what the "-a hostname" does in your knife call. I used pdsh...

@stuart-warren

stuart-warren commented Sep 21, 2013

@schlomo See the docs - not that i've tried this yet, but it looked cool as an idea
http://docs.opscode.com/chef/knife.html#id293

@lost-theory
Author

lost-theory commented Sep 4, 2015

@schlomo sorry for the 2 year late response 😸, but I updated the gist with a walkthrough of the code and the data that each step should be producing:

https://gist.github.com/lost-theory/6309478#file-netstat-2015-md
