@halfak
Created April 8, 2019 20:15
$ python
Python 3.5.1+ (default, Mar 30 2016, 22:46:26)
[GCC 5.3.1 20160330] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from revscoring import Model
>>> m = Model.load(open("models/enwiki.damaging.gradient_boosting.model"))
>>> m.estimator.feature_importances_
array([8.39962459e-03, 7.93533540e-03, 3.04496444e-08, 2.25381150e-02,
2.08058610e-02, 2.26880141e-02, 1.87132900e-02, 1.60180859e-02,
2.23545834e-02, 2.14488512e-02, 1.90494208e-02, 2.54534679e-02,
2.66413539e-02, 2.75077614e-02, 2.24572831e-02, 2.85643380e-02,
1.40673462e-02, 1.08283797e-02, 5.28226208e-03, 1.98192473e-02,
1.20970667e-02, 7.50752762e-03, 9.40235191e-03, 9.02839723e-03,
3.37030110e-03, 1.13464298e-02, 1.21129339e-02, 6.38404279e-03,
3.09859252e-03, 2.05198411e-03, 1.40617645e-03, 5.02068509e-03,
2.84700947e-03, 2.23312854e-03, 2.60885299e-02, 1.92700904e-02,
1.58282365e-02, 1.34496524e-02, 4.13376140e-03, 7.59400902e-03,
1.05878865e-02, 6.09718173e-03, 1.14254854e-02, 5.77975715e-03,
3.86463005e-03, 3.94155522e-03, 5.69761327e-03, 2.79282947e-06,
5.06161491e-03, 3.82128496e-05, 2.79054289e-02, 2.26680674e-05,
3.60489149e-02, 1.23666918e-01, 5.83742795e-03, 1.04978762e-02,
2.40607425e-03, 2.32938135e-03, 2.94554717e-04, 4.04082964e-03,
3.80689748e-03, 2.61396452e-04, 3.73664007e-03, 3.22918560e-03,
3.67060822e-04, 7.11120470e-03, 6.88971235e-03, 2.62036017e-04,
1.51509897e-02, 1.34228730e-02, 9.61967865e-03, 2.93981526e-02,
2.76017846e-02, 1.56533996e-02, 9.17425514e-03, 6.41188488e-03,
4.61004255e-03, 2.15756091e-02, 1.37733335e-02, 7.55350051e-03])
>>> feature_importances = list(zip(m.features, m.estimator.feature_importances_))
>>> feature_importances.sort(key=lambda i: i[1], reverse=True)
>>> for feature, importance in feature_importances:
...     print(feature, importance)
...
feature.log((temporal.revision.user.seconds_since_registration + 1)) 0.12366691844592674
feature.revision.user.is_anon 0.036048914867770496
feature.english.dictionary.revision.diff.dict_word_prop_delta_sum 0.02939815258260414
feature.revision.parent.markups_per_token 0.028564338001569347
feature.revision.user.is_patroller 0.027905428860413462
feature.english.dictionary.revision.diff.dict_word_prop_delta_increase 0.027601784631011272
feature.revision.parent.words_per_token 0.027507761420741125
feature.revision.parent.chars_per_word 0.026641353940771998
feature.revision.diff.chars_change 0.02608852986429265
feature.log((wikitext.revision.parent.ref_tags + 1)) 0.02545346787161345
feature.log((len(<datasource.wikitext.revision.parent.words>) + 1)) 0.022688014137860642
feature.log((wikitext.revision.parent.chars + 1)) 0.022538115006001207
feature.revision.parent.uppercase_words_per_word 0.022457283134036873
feature.log((wikitext.revision.parent.wikilinks + 1)) 0.022354583376726862
feature.english.dictionary.revision.diff.non_dict_word_prop_delta_sum 0.021575609104639532
feature.log((wikitext.revision.parent.external_links + 1)) 0.021448851189920102
feature.log((len(<datasource.tokenized(datasource.revision.parent.text)>) + 1)) 0.020805860972874828
feature.wikitext.revision.diff.markup_prop_delta_sum 0.01981924734354541
feature.revision.diff.tokens_change 0.019270090409458637
feature.log((wikitext.revision.parent.templates + 1)) 0.01904942084841798
feature.log((len(<datasource.wikitext.revision.parent.uppercase_words>) + 1)) 0.01871328996256505
feature.log((wikitext.revision.parent.headings + 1)) 0.016018085890865324
feature.revision.diff.words_change 0.01582823654911597
feature.english.dictionary.revision.diff.dict_word_prop_delta_decrease 0.015653399551568116
feature.english.dictionary.revision.diff.dict_word_delta_sum 0.015150989716091658
feature.wikitext.revision.diff.markup_delta_sum 0.014067346247203705
feature.english.dictionary.revision.diff.non_dict_word_prop_delta_increase 0.013773333528693383
feature.revision.diff.markups_change 0.013449652435971635
feature.english.dictionary.revision.diff.dict_word_delta_increase 0.01342287297765271
feature.wikitext.revision.diff.number_prop_delta_increase 0.01211293387957829
feature.wikitext.revision.diff.markup_prop_delta_increase 0.012097066726975006
feature.revision.diff.tags_change 0.011425485365658749
feature.wikitext.revision.diff.number_prop_delta_sum 0.01134642977250017
feature.wikitext.revision.diff.markup_delta_increase 0.010828379723584245
feature.revision.diff.wikilinks_change 0.010587886494373096
feature.revision.comment.has_link 0.010497876181199798
feature.english.dictionary.revision.diff.dict_word_delta_decrease 0.009619678652952076
feature.wikitext.revision.diff.number_delta_sum 0.0094023519140284
feature.english.dictionary.revision.diff.non_dict_word_delta_sum 0.0091742551371313
feature.wikitext.revision.diff.number_delta_increase 0.009028397234705816
feature.revision.page.is_articleish 0.00839962458724949
feature.revision.page.is_mainspace 0.007935335395640522
feature.revision.diff.external_links_change 0.0075940090208491155
feature.english.dictionary.revision.diff.non_dict_word_prop_delta_decrease 0.007553500511543981
feature.wikitext.revision.diff.markup_prop_delta_decrease 0.007507527624647461
feature.english.informals.revision.diff.match_prop_delta_sum 0.007111204704111318
feature.english.informals.revision.diff.match_prop_delta_increase 0.006889712352520465
feature.english.dictionary.revision.diff.non_dict_word_delta_increase 0.00641188487936276
feature.wikitext.revision.diff.number_prop_delta_decrease 0.0063840427928057355
feature.revision.diff.templates_change 0.006097181732340459
feature.revision.comment.suggests_section_edit 0.00583742794779075
feature.revision.diff.ref_tags_change 0.005779757152431128
feature.revision.user.is_bot 0.005697613274778659
feature.wikitext.revision.diff.markup_delta_decrease 0.005282262078290739
feature.revision.user.is_admin 0.005061614910269018
feature.wikitext.revision.diff.uppercase_word_prop_delta_sum 0.005020685087434651
feature.english.dictionary.revision.diff.non_dict_word_delta_decrease 0.004610042548827057
feature.revision.diff.headings_change 0.004133761404837777
feature.english.badwords.revision.diff.match_prop_delta_sum 0.004040829642063472
feature.revision.diff.longest_new_repeated_char 0.003941555219983169
feature.revision.diff.longest_new_token 0.003864630046476644
feature.english.badwords.revision.diff.match_prop_delta_increase 0.0038068974752894892
feature.english.informals.revision.diff.match_delta_sum 0.0037366400654502505
feature.wikitext.revision.diff.number_delta_decrease 0.003370301104268293
feature.english.informals.revision.diff.match_delta_increase 0.0032291856024817024
feature.wikitext.revision.diff.uppercase_word_delta_sum 0.0030985925217799654
feature.wikitext.revision.diff.uppercase_word_prop_delta_increase 0.0028470094671780564
feature.english.badwords.revision.diff.match_delta_sum 0.0024060742467007412
feature.english.badwords.revision.diff.match_delta_increase 0.0023293813475623493
feature.wikitext.revision.diff.uppercase_word_prop_delta_decrease 0.002233128541417208
feature.wikitext.revision.diff.uppercase_word_delta_increase 0.002051984108632766
feature.wikitext.revision.diff.uppercase_word_delta_decrease 0.001406176448753365
feature.english.informals.revision.diff.match_delta_decrease 0.0003670608217441542
feature.english.badwords.revision.diff.match_delta_decrease 0.00029455471697025686
feature.english.informals.revision.diff.match_prop_delta_decrease 0.0002620360172541197
feature.english.badwords.revision.diff.match_prop_delta_decrease 0.0002613964515332821
feature.revision.user.is_trusted 3.82128496286288e-05
feature.revision.user.is_curator 2.2668067373879832e-05
feature.revision.user.has_advanced_rights 2.7928294714833776e-06
feature.revision.page.is_draftspace 3.044964439759688e-08
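
The session above pairs each feature with its importance and sorts the pairs in descending order. A minimal standalone sketch of that same step, assuming plain Python lists in place of `m.features` and `m.estimator.feature_importances_`, might look like the following. The helper `rank_importances`, its `top` parameter, and the sample names and values are hypothetical and only illustrate the idea; they are not part of revscoring.

```python
def rank_importances(names, importances, top=None):
    """Pair each feature name with its importance, sorted highest first."""
    if len(names) != len(importances):
        raise ValueError("names and importances must have the same length")
    ranked = sorted(zip(names, importances),
                    key=lambda pair: pair[1], reverse=True)
    # Optionally keep only the `top` highest-ranked pairs.
    return ranked if top is None else ranked[:top]


# Hypothetical stand-ins for m.features and
# m.estimator.feature_importances_ (values rounded for illustration).
names = ["is_anon", "seconds_since_registration", "is_bot"]
importances = [0.036, 0.124, 0.006]

for name, importance in rank_importances(names, importances, top=2):
    print("{0:<30} {1:.4f}".format(name, importance))
```

Checking that the two sequences have the same length guards against silently dropping entries, since `zip` stops at the shorter input.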