Adeleke Oladapo Joseph oladapo-joseph

  • Nigeria

Interpreting the clusters using the top 3 components.

component key
marital_status {' Never-married': -0.9, ' Married-civ-spouse': -0.1, ' Divorced': 0.7, ' Married-spouse-absent': 1.6, ' Separated': 2.4, ' Married-AF-spouse': 3.1, ' Widowed': 3.9}
sex {' Male': -0.75, ' Female': 1.45}
fnlwgt -1.8 up to 8
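
The key pairs each category with its scaled value, so a scaled value from the combined table can be traced back to a label by picking the closest entry. A minimal sketch, assuming nearest-value matching is a reasonable way to read the key; `nearest_label` is a hypothetical helper that does not appear in the gists, and the dictionaries are copied from the key, leading spaces included.

```python
# Scaled value for each category, copied from the key.
MARITAL_STATUS_KEY = {
    ' Never-married': -0.9,
    ' Married-civ-spouse': -0.1,
    ' Divorced': 0.7,
    ' Married-spouse-absent': 1.6,
    ' Separated': 2.4,
    ' Married-AF-spouse': 3.1,
    ' Widowed': 3.9,
}
SEX_KEY = {' Male': -0.75, ' Female': 1.45}


def nearest_label(key, scaled_value):
    """Return the category whose scaled value is closest to scaled_value.

    Hypothetical helper: assumes the key lists approximate scaled values
    and that the closest one identifies the category.
    """
    return min(key, key=lambda label: abs(key[label] - scaled_value))


# Scaled values taken from row 2 of combined.tsv.
print(nearest_label(MARITAL_STATUS_KEY, 0.716596))
print(nearest_label(SEX_KEY, -0.695154))
```

For row 2 of combined.tsv this yields ' Divorced' and ' Male', which agrees with the third data row of head.csv.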
oladapo-joseph / combined.tsv
Created February 8, 2024 12:13
clustering-adult-dataset-combined
age workclass fnlwgt education education_num marital_status occupation relationship race sex ... Component4 Component5 Component6 Component7 Component8 Component9 Component10 Component11 Component12 Label
0 0.029411 -1.869799 -1.064913 -0.993444 1.152189 -0.857994 -1.386350 -1.069920 -0.361967 -0.695154 ... -0.884192 -1.154893 -0.381603 -1.499777 -0.322020 -0.950544 0.235116 0.078541 0.445735 1
1 0.838632 -1.074884 -1.010422 -0.993444 1.152189 -0.070699 -1.092619 -0.376697 -0.361967 -0.695154 ... -1.014554 -0.416200 -0.455524 -1.024343 -0.255394 -2.341608 0.458212 0.336097 -0.572849 2
2 -0.044154 -0.279969 0.233924 -0.698058 -0.425936 0.716596 -0.798888 -1.069920 -0.361967 -0.695154 ... -0.734063 -0.941219 -0.144708 -0.367693 0.759681 -0.286324 -0.044357 -0.174632 0.600966 2
3 1.059328 -0.279969 0.413286 -0.402671 -1.214999 -0.070699 -0.798888 -0.376697 1.296910 -0.695154 ... 0.414431 -1.028683 -0.393852 -0.496796 0.684878 -0.138222 1.081454 -0.716558 0.065552 2
4 -0.779809 -0.279969 1.388265 -0.993444 1.152
oladapo-joseph / kmeans.tsv
Created February 8, 2024 12:02
clustering-adult-dataset-kmeans
Component1 Component2 Component3 Component4 Component5 Component6 Component7 Component8 Component9 Component10 Component11 Component12
0 1.385096 -1.743607 0.757280 -0.884192 -1.154893 -0.381603 -1.499777 -0.322020 -0.950544 0.235116 0.078541 0.445735
1 0.377618 -0.438815 1.517646 -1.014554 -0.416200 -0.455524 -1.024343 -0.255394 -2.341608 0.458212 0.336097 -0.572849
2 0.371911 0.135254 0.196502 -0.734063 -0.941219 -0.144708 -0.367693 0.759681 -0.286324 -0.044357 -0.174632 0.600966
3 -0.237800 0.444952 -0.042851 0.414431 -1.028683 -0.393852 -0.496796 0.684878 -0.138222 1.081454 -0.716558 0.065552
4 -0.663421 -1.546723 1.359329 0.604464 0.441148 -0.127318 0.498663 1.386125 0.010999 0.200780 -1.176535 0.398386
oladapo-joseph / pca.md
Last active February 8, 2024 10:49
clustering-adult-dataset-pca-important
feature Value
marital_status 0.490227
sex 0.405224
fnlwgt 0.370218
capital_loss 0.359067
education 0.290163
pay_grade 0.259627
race 0.209083
native_country 0.187307
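
The table ranks features by value, and the three largest entries are the ones used when interpreting the clusters. A minimal sketch of that selection, assuming the Value column holds per-feature weights (such as PCA loadings) where a larger magnitude means a stronger contribution; the gist does not show how the values were computed, and `top_features` is a hypothetical helper.

```python
# Feature values copied from the table above.
PCA_VALUES = {
    "marital_status": 0.490227,
    "sex": 0.405224,
    "fnlwgt": 0.370218,
    "capital_loss": 0.359067,
    "education": 0.290163,
    "pay_grade": 0.259627,
    "race": 0.209083,
    "native_country": 0.187307,
}


def top_features(values, k=3):
    """Return the k feature names with the largest absolute value.

    Hypothetical helper: the absolute value is used so that a strongly
    negative weight would rank as high as a strongly positive one.
    """
    ranked = sorted(values, key=lambda name: abs(values[name]), reverse=True)
    return ranked[:k]


print(top_features(PCA_VALUES))
```

With the values listed here, the selection returns marital_status, sex and fnlwgt.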
oladapo-joseph / info.txt
Last active February 8, 2024 12:06
clustering-adult-dataset-info
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 age 5000 non-null int64
1 workclass 5000 non-null object
2 fnlwgt 5000 non-null int64
3 education 5000 non-null object
4 education_num 5000 non-null int64
5 marital_status 5000 non-null object
6 occupation 5000 non-null object
7 relationship 5000 non-null object
oladapo-joseph / head.csv
Created February 8, 2024 10:01
clustering-adult-dataset-head
age workclass fnlwgt education education_num marital_status occupation relationship race sex capital_gain capital_loss hours_per_week native_country pay_grade
39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K
50 Self-emp-not-inc 83311 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 0 0 13 United-States <=50K
38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K
53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K
28 Private 338409 Bachelors 13 Married-civ-spouse Prof-specialty Wife Black Female 0 0 40 Cuba <=50K
oladapo-joseph / Dataset.md
Last active February 8, 2024 09:34
Clustering-adult-dataset
# Variable Type Missing Values
1 age Integer no
2 workclass Categorical yes
3 fnlwgt Integer no
4 education Categorical no
5 education-num Integer no
6 marital-status Categorical no
7 occupation Categorical yes
8 relationship Categorical no