Created March 17, 2020 15:30
Metacello new
    baseline: 'DataFrame';
    repository: 'github://PolyMathOrg/DataFrame/src';
    load.

covidDataFile := '/Users/oleks/Documents/Data/COVID-19-geographic-disbtribution-worldwide-2020-03-17.csv' asFileReference.

"The date format is not understood by Pharo by default.
So we read CSV with all values as strings (without type inference)
and then manually convert dates and numbers"
reader := DataFrameCsvReader new.
reader shouldParseTypes: false.
covid := DataFrame readFrom: covidDataFile using: reader.

covid columnNames.
"an OrderedCollection('DateRep' 'Day' 'Month' 'Year' 'Cases' 'Deaths' 'Countries and territories' 'GeoId')"

"Those columns are redundant"
covid removeColumns: #(Day Month Year GeoId).

covid
    renameColumn: 'DateRep' to: 'Date';
    renameColumn: 'Countries and territories' to: 'Country'.

covid columnNames.
"an OrderedCollection('Date' 'Cases' 'Deaths' 'Country')"

covid
    toColumn: 'Date'
    applyElementwise: [ :each |
        Date readFrom: each pattern: 'dd/mm/yyyy' ].

covid
    toColumns: #(Cases Deaths)
    applyElementwise: [ :each | each asInteger ].

covidFrance := covid select: [ :row | (row at: 'Country') = 'France' ].

(covidFrance columns: #(Cases Deaths)) sum.
"a DataSeries('Cases'->6633 'Deaths'->148)"

deathsByCountry := (covid
    group: 'Deaths'
    by: 'Country'
    aggregateUsing: #sum) sortDescending.