Borys Zibrov borowis

You changed your file like this:

<key>S_Credit_Amount_Will_Expire_ORIGINAL</key>
<dict>
    <key>NSStringLocalizedFormatKey</key>
    <string>%#@creditamount@</string>
    <key>creditamount</key>
    <dict>
        <key>NSStringFormatSpecTypeKey</key>
borowis / xml_split.py
Last active September 2, 2017 14:07 — forked from benallard/xml_split.py
``./xml_split.py -M 8092 --split_on_tag "tu" big.xml`` --> splits a potentially huge big.xml into chunks of approximately 8 MB named big.0.xml, big.1.xml, etc. Splits only on </tu> tags (the option can be omitted if you do not care where the split falls).
#!/usr/bin/env python
import os
import xml.parsers.expat
from xml.sax.saxutils import escape
from optparse import OptionParser
from math import log10
DEBUG_MODE = False
borowis / file_split.py
Created April 22, 2016 12:14
Split text/xml files on a regexp and a file size. Similar to a hybrid of csplit + split. Usage: ./file_split.py -M 8092 --regex "^<tu>\s*$" big.xml --> splits a potentially huge big.xml into chunks of approximately 8 MB named big.0.xml, big.1.xml, etc. Splits on lines matching the regular expression provided. Initial idea from https://gist.github.com/benal…
#!/usr/bin/env python
import os
import re
import codecs
from optparse import OptionParser
from math import log10
# default encoding to use
ENCODING = "utf-8"
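The preview above is cut off before the splitting logic, so here is a minimal sketch of the idea the description states: start a new chunk only at a line matching the regular expression, and only once the current chunk has reached the size limit. The names `split_lines` and `chunk_name` are hypothetical and are not taken from the gist; the real script may differ in how it counts bytes and handles options.

```python
import os
import re


def split_lines(lines, pattern, max_bytes, encoding="utf-8"):
    """Group lines into chunks (hypothetical helper, not the gist's code).

    A new chunk starts at a line matching `pattern`, but only after the
    current chunk has grown to at least `max_bytes` encoded bytes.
    """
    boundary = re.compile(pattern)
    chunks, current, size = [], [], 0
    for line in lines:
        if current and size >= max_bytes and boundary.search(line):
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += len(line.encode(encoding))
    if current:
        chunks.append(current)
    return chunks


def chunk_name(path, index):
    """Build names like big.0.xml, big.1.xml from big.xml."""
    root, ext = os.path.splitext(path)
    return "%s.%d%s" % (root, index, ext)


sample = ["<tu>\n", "a\n", "</tu>\n", "<tu>\n", "b\n", "</tu>\n"]
for i, chunk in enumerate(split_lines(sample, r"^<tu>\s*$", max_bytes=1)):
    print(chunk_name("big.xml", i), len(chunk), "lines")
```

Because the size check runs only at matching lines, chunks can overshoot the limit, which is consistent with the "approximately 8 MB" wording in the description.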
import pandas
variables_of_interest = {
'S3BQ1A5': ('ever used cannabis', {1: 'Yes', 2: 'No', 9: 'Unknown'}),
'S3BQ1A6': ('ever used cocaine/crack', {1: 'Yes', 2: 'No', 9: 'Unknown'}),
'S3BQ1A9A': ('ever used heroin', {1: 'Yes', 2: 'No', 9: 'Unknown'}),
'S3CD5Q13A': ('age at onset of cannabis abuse', {
'5-64': 'Age',
99: 'Unknown',
'BL': 'NA, didnt meet symptom criteria for lifetime cannabis abuse'}),
Python 2.7.10 |Anaconda 2.4.0 (64-bit)| (default, Oct 19 2015, 18:04:42)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> >>> >>> >>> >>> main()
# of observations is 43093
# of variables is 3008
__main__:59: FutureWarning: convert_objects is deprecated. Use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.
counts for S3BQ1A6 -- ever used cocaine/crack( 1 : Yes 2 : No 9 : Unknown )