@bsmithyman
Last active December 17, 2015 17:39
Python gist to take a complicated nested object and grab a list of numbers from it.
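import re
# Grab every run of digits and dots from the object's repr; signs and exponents are ignored.
parser = re.compile(r'[0-9.]+')
justthenumbers = lambda x: [float(item) for item in parser.findall(repr(x))]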
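The pattern above matches only runs of digits and dots, so it drops minus signs, splits exponent notation, and would also pick up the digits in a dtype suffix such as float64. Below is a minimal sketch of a stricter variant, assuming signed decimals with optional exponents are what is wanted. The names `NUMBER` and `just_the_numbers_strict` are hypothetical and not part of the gist.

```python
import re

# Signed decimal with optional exponent. The lookbehind skips digits that are
# glued to a preceding word character or dot (e.g. the "64" in "float64").
NUMBER = re.compile(r'(?<![\w.])[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')


def just_the_numbers_strict(obj):
    """Return every number found in repr(obj) as a flat list of floats."""
    return [float(token) for token in NUMBER.findall(repr(obj))]


class FakeArray:
    """Hypothetical stand-in whose repr mimics a numpy array repr."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return self.text


print(just_the_numbers_strict(FakeArray('array([[[[ 39]\n [149]]]], dtype=object)')))
```

This is still string parsing, so it inherits any truncation or rounding that the repr applies to large arrays.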
@bpostlethwaite

justthenumbers is the second parsing solution for extracting a 1D data structure from the result of scipy's loadmat attempting to parse a MATLAB .mat file. The structure that scipy.io.loadmat returns is some seriously byzantine crazytown.

So what are we dealing with?

print type(mat['db']['conrad'])
<type 'numpy.ndarray'>

Ok, so mat['db'] is a dictionary and attribute conrad is a numpy array.

print type(mat['db'])
<type 'numpy.ndarray'>

oh.
Well we actually want the .ih attribute... it's supposed to be on conrad, maybe conrad is a dictionary after all.

print type(mat['db']['conrad']['ih'])
---------------------------------------------------------------------------
ValueError

Ok, so it's an ndarray. What if we ...

print type(mat['db'][0,0]['conrad']['ih'])
<type 'numpy.ndarray'>

right.

What does this array look like anyway?

print mat['db'][0,0]['conrad']['ih']
[[[[ 39]
 [149]]]]

Of course it is.
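Much of the [0,0] indexing and the nested singleton dimensions above comes from loadmat's default settings. As a hedged sketch only: scipy.io.loadmat accepts `squeeze_me` and `struct_as_record` keyword arguments that usually make MATLAB structs easier to walk. The helper name `load_db` and the file name in the usage comment are assumptions, and this is not what the gist authors did.

```python
def load_db(path):
    """Load a .mat file with structs exposed as attribute-style objects."""
    # Imported lazily so this sketch can be defined without SciPy installed.
    from scipy.io import loadmat

    # squeeze_me drops the singleton dimensions behind the [0, 0] indexing;
    # struct_as_record=False returns MATLAB structs as objects with attributes.
    mat = loadmat(path, squeeze_me=True, struct_as_record=False)
    return mat['db']


# Hypothetical usage, assuming 'data.mat' holds the 'db' struct shown above:
# db = load_db('data.mat')
# print(db.conrad.ih)
```

With those options the field should be reachable by attribute access instead of chained indexing, though the exact shape still depends on how the file was written.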

@bsmithyman is dangerous, so here is the first hack attempt at parsing a bunch of this data, for the record:

repr(mat['db'][0,0]['conrad']['ih'])
Out[63]: 'array([[[[ 39]\n [149]]]], dtype=object)'

which demands:

fixer = lambda x: [float(item.replace('[','').replace(']','').replace(',','')) for item in repr(x).split()[1:-1]]

We went with the regex.
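For comparison, here is a minimal sketch of an approach that avoids string parsing altogether by walking the nested object and collecting numeric leaves. It relies on duck typing: anything with a `tolist()` method (numpy arrays and scalars have one) is converted to plain Python containers first. The function name `flatten_numbers` is hypothetical and this is an alternative technique, not the one the gist uses.

```python
from numbers import Real


def flatten_numbers(obj):
    """Recursively collect real-valued leaves of a nested object as floats."""
    # numpy arrays and scalars expose tolist(); plain Python objects do not.
    if hasattr(obj, 'tolist'):
        obj = obj.tolist()
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(obj, Real) and not isinstance(obj, bool):
        return [float(obj)]
    if isinstance(obj, (list, tuple)):
        out = []
        for item in obj:
            out.extend(flatten_numbers(item))
        return out
    # Strings, dicts, None, and anything else are ignored.
    return []


print(flatten_numbers([[[[39], [149]]]]))
```

Unlike the repr-based versions, this keeps full precision and is unaffected by how large arrays are abbreviated when printed.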
