Created December 5, 2016 14:53
eyeballing AMDB data with lots of spells
using AMDB
using DataStructures
using GZip
using Serialization  # provides `deserialize` on Julia 0.7 and later (it was in Base when this was written)
using UnicodePlots
# replace this line with the path on your own machine
records = GZip.open(deserialize,
                    expanduser("~/research/AMDB/data/AMDB_subsample.jls.gz"), "r")
long_samples = [data for (id, data) in records if length(data.AMP_spells) > 1000]
length(long_samples) # 6
# let's look at the first one
s = long_samples[1].AMP_spells
unique(spell.employer for spell in s) # about 9 employers
# let's count them
c = counter(Tuple{Int,AMP.Spell})
for spell in s
    push!(c, (spell.employer, spell.status))
end
c.map # lots of minor employment spells
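
The counting step above can be tried without access to the AMDB data. The following is a minimal sketch that tallies (employer, status) pairs with a plain `Dict` instead of `counter`. The `ToySpell` type, its `Symbol` status values, and the sample records are made up for illustration and are not part of the AMDB package.

```julia
# Hypothetical stand-in for a spell record; the real AMDB types may differ.
struct ToySpell
    employer::Int
    status::Symbol
end

# Made-up sample data, only for illustration.
spells = [ToySpell(1, :employed), ToySpell(1, :employed), ToySpell(2, :minor),
          ToySpell(1, :minor), ToySpell(2, :minor)]

# Tally each (employer, status) pair in a plain Dict.
counts = Dict{Tuple{Int,Symbol},Int}()
for spell in spells
    key = (spell.employer, spell.status)
    counts[key] = get(counts, key, 0) + 1
end

# Print the pairs from most to least frequent.
for (key, n) in sort(collect(counts); by = last, rev = true)
    println(key, " => ", n)
end
```

Sorting the collected pairs by count makes it easier to see at a glance which employer and status combinations dominate a long spell history.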