
@FilipDominec
Created December 14, 2018 10:00
File splitter based on (multiple) keyword searches (must be at line start!). Keeps 1st line as header.
#!/usr/bin/python3
#-*- coding: utf-8 -*-
import re, sys

# Usage: splitter.py INPUTFILE PATTERN [PATTERN ...]
with open(sys.argv[1]) as inputf:
    ls = inputf.readlines()

basename = sys.argv[1].replace('.txt', '')
c = 1           # number of the output file being written
fromline = 1    # line 0 is the header, which is copied into every output file
for splitter in sys.argv[2:]:
    # find the first line that starts with the pattern (re.match anchors at line start)
    for cc, ll in enumerate(ls):
        if re.match(splitter, ll):
            toline = cc
            break
    else:
        print("splitting error - no match for", splitter)
        continue    # toline would be undefined or stale, so skip this pattern
    with open(basename + ('_cycle%d.txt' % c), 'w') as outputf:
        print('exporting', fromline, ':', toline, 'c=', c)
        outputf.write(''.join(ls[0:1] + ls[fromline:toline]))
    c += 1
    fromline = toline
# everything after the last match goes into the final file
with open(basename + ('_cycle%d.txt' % c), 'w') as outputf:
    print('exporting', fromline, ':', len(ls), 'c=', c)
    outputf.write(''.join(ls[0:1] + ls[fromline:]))
@FilipDominec

Example: Contents of the "aa" file:

HEADER
adam 
barbora
cyril
dusan
ema
filip

Invocation:
python3 splitter.py aa dusa

Output in "aa_cycle1.txt"

HEADER
adam 
barbora
cyril

And in "aa_cycle2.txt"

HEADER
dusan
ema
filip
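
The same split can be reproduced in memory, which may be handy for checking the expected output without writing any files. This is only a sketch: `split_lines` is a hypothetical helper, not part of the script, and it assumes that every pattern matches at least one line.

```python
import re

def split_lines(lines, patterns):
    """Split lines at the first line matching each pattern; repeat the header in each chunk."""
    header, chunks, fromline = lines[0:1], [], 1
    for pattern in patterns:
        # index of the first line starting with the pattern (re.match anchors at line start)
        toline = next(i for i, line in enumerate(lines) if re.match(pattern, line))
        chunks.append(header + lines[fromline:toline])
        fromline = toline
    chunks.append(header + lines[fromline:])   # remainder after the last match
    return chunks

lines = ["HEADER\n", "adam\n", "barbora\n", "cyril\n", "dusan\n", "ema\n", "filip\n"]
chunks = split_lines(lines, ["dusa"])
print("".join(chunks[0]))   # content that would go to aa_cycle1.txt
print("".join(chunks[1]))   # content that would go to aa_cycle2.txt
```

Each element of `chunks` corresponds to one `_cycleN.txt` file, with the header line repeated at the top.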

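Note that you can detect the rising edges in your (x, y) data using e.g. Python with NumPy (x and y are 1-D arrays of equal length):
import numpy as np
convol = 2**-np.linspace(-2,2,5)**2   ## 5-point Gaussian-like kernel
y = np.convolve(y, convol/np.sum(convol), mode='same')   ## simple smoothing
skippy = (y[1:]-y[:-1])   ## difference between neighbouring samples
skipx = (x[:-1])[skippy>((np.max(y)-np.min(y))/30)]   ## x positions where y jumps by more than 1/30 of its range
print((x[np.argsort(-skippy)])[:30])   ## x positions of the 30 largest jumps, steepest first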
