@yarick
Last active July 30, 2018 16:14
Fast search for text or a regex pattern in a memory-mapped file
#!/usr/bin/python2.7
# Usage:   ./s1.py file regex
# Example: ./s1.py /tmp/a.log '10.213.194.\S+:http://([^/]+)/'
import contextlib
import mmap
import re
import sys

fn = sys.argv[1]
# Compile the pattern once; re.S lets '.' match newlines as well.
pattern = re.compile(sys.argv[2], re.S)
print "Searching file: ", fn
# Read-only access is enough for searching; a length of 0 maps the whole file.
with open(fn, 'rb') as f:
    with contextlib.closing(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)) as data:
        for match in pattern.finditer(data):
            # group(0) is the entire match, not only the captured group.
            print match.group(0)
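
The script above targets Python 2.7, where `str` is a byte string. Under Python 3 an `mmap` object exposes bytes, so the regex has to be compiled as a bytes pattern. Below is a minimal sketch of the same idea for Python 3. The function name `search_mmap` and the choice to encode the pattern as UTF-8 are assumptions made for illustration and are not part of the original gist.

```python
import mmap
import re


def search_mmap(path, pattern):
    """Yield each match of `pattern` (a str regex) found in the file at `path`.

    Hypothetical helper: matches are yielded as bytes objects.
    """
    # mmap exposes bytes, so the pattern must be a bytes pattern too.
    regex = re.compile(pattern.encode("utf-8"), re.S)
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ avoids needing write permission.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as data:
            for match in regex.finditer(data):
                # group(0) is the full matched text, copied out of the mapping.
                yield match.group(0)
```

A call such as `for hit in search_mmap("/tmp/a.log", r"http://([^/]+)/"): print(hit.decode("utf-8", errors="replace"))` would print every match. Note that `mmap` raises `ValueError` for an empty file, so a caller may want to check the file size first.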