Decision tree breakdown of the Titanic survivor data

Referencing the decision tree from https://en.wikipedia.org/wiki/File:CART_tree_titanic_survivors.png

Original data from https://www.kaggle.com/c/titanic/data

| Description | Count | % of total | Notes |
| --- | --- | --- | --- |
| Total | 891 | 100% | |
| Died | 549 | 62% | |
| Survived | 342 | 38% | |
| Male | 577 | 65% | |
| Female | 314 | 35% | (is male: no) |
| Female died | 81 | 9% | |
| Female survived | 233 | 26% | Out of female population: 74% |
| Males age>9.5 | 421 | 47% | |
| Males age unknown | 124 | 14% | |
| Males age>9.5 or unknown | 545 | 61% | (is male, age>9.5) |
| Males age<=9.5 | 32 | 4% | |
| Males age>9.5/unk died | 455 | 51% | Out of male>9.5/unk population: 83% |
| Males age>9.5/unk survived | 90 | 10% | Out of male>9.5/unk population: 17% |
| Males age<=9.5, sibsp>2.5 | 14 | 2% | (is male, age<=9.5, sibsp>2.5) |
| Males age<=9.5, sibsp<=2.5 | 18 | 2% | (is male, age<=9.5, sibsp<=2.5) |
| (From here on, the values differ from those on the tree) | | | |
| Males age<=9.5, sibsp>2.5 died | 13 | 1.46% | |
| Males age<=9.5, sibsp>2.5 survived | 1 | 0.11% | |
| Males age<=9.5, sibsp<=2.5 died | 0 | 0% | |
| Males age<=9.5, sibsp<=2.5 survived | 18 | 2.02% | |
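
The breakdown above could be reproduced with something like the following. This is only a minimal sketch, not the script used to build the table: the function names (`tree_leaf`, `count_leaves`, `count_leaves_from_csv`, `percent_of_total`) are made up for illustration, and it assumes the Kaggle training file with its `Sex`, `Age`, `SibSp` and `Survived` columns. As in the table, males of unknown age are grouped with males older than 9.5.

```python
import csv
from collections import Counter


def tree_leaf(sex, age, sibsp):
    """Return the leaf of the tree that a passenger falls into.

    age is a float, or None when unknown. Following the table,
    males of unknown age go down the age>9.5 branch.
    """
    if sex != "male":
        return "female"
    if age is None or age > 9.5:
        return "male, age>9.5/unk"
    if sibsp > 2.5:
        return "male, age<=9.5, sibsp>2.5"
    return "male, age<=9.5, sibsp<=2.5"


def count_leaves(rows):
    """Count (leaf, survived) pairs over an iterable of dict rows.

    Rows hold string values, as csv.DictReader yields them;
    an empty Age string is treated as unknown.
    """
    counts = Counter()
    for row in rows:
        age = float(row["Age"]) if row["Age"] else None
        leaf = tree_leaf(row["Sex"], age, int(row["SibSp"]))
        counts[(leaf, row["Survived"] == "1")] += 1
    return counts


def count_leaves_from_csv(path):
    """Read a CSV file (assumed to be the Kaggle train.csv) and count leaves."""
    with open(path, newline="") as f:
        return count_leaves(csv.DictReader(f))


def percent_of_total(count, total=891):
    """Percentage of the total passenger count, rounded to two decimals."""
    return round(100 * count / total, 2)
```

Calling `count_leaves_from_csv("train.csv")` (the path is an assumption) returns a counter keyed by `(leaf, survived)`, and `percent_of_total` turns each count into the "% of total" column, for example `percent_of_total(13)` for the 13 males aged 9.5 or under with sibsp>2.5 who died.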