import mwreverts
from models import RevRevert, Page, Revision
import mwxml
import pdb
from collections import deque
from mwapilib import get_revs_for_revert_labeling
import sys
# This script processes edits from the dump for reverts and stores
# the revert status in a revert table. Edits for the pages from the page table
Max depth - 6, learning_rate - 0.1, max_features - log2
Estimators: 50
real 1m20.208s
user 1m19.112s
sys 0m1.3s
Estimators: 75
real 1m18.758s
user 1m17.748s
Max depth - 3, learning_rate - 0.1, max_features - log2
Estimators: 50
real 1m29.432s
user 1m28.176s
sys 0m1.632s
Estimators: 75
real 1m23.595s
user 1m22.288s
sys 0m1.484s
2018-03-16 03:47:33,577 WARNING:revscoring.scoring.statistics.classification.micro_macro_stats -- Could not generate micro-average of f1: unsupported operand type(s) for *: 'NoneType' and 'int'
2018-03-16 03:47:33,577 WARNING:revscoring.scoring.statistics.classification.micro_macro_stats -- Could not generate macro-average of f1: unsupported operand type(s) for +: 'float' and 'NoneType'
2018-03-16 03:47:52,831 DEBUG:revscoring.utilities.tune -- Cross-validated GradientBoosting with n_estimators=50, max_depth=5, max_features="log2", learning_rate=0.01 in 48.394 minutes: pr_auc.macro=0.6543
2018-03-16 03:48:04,537 WARNING:revscoring.scoring.statistics.classification.micro_macro_stats -- Could not generate micro-average of precision: unsupported operand type(s) for *: 'NoneType' and 'int'
2018-03-16 03:48:04,537 WARNING:revscoring.scoring.statistics.classification.micro_macro_stats -- Could not generate macro-average of precision: unsupported operand type(s) for +: 'float' and 'NoneType'
2018-03-16 03:48:04,538
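The warnings above report a `TypeError` raised while averaging per-label statistics, which suggests that at least one label's value was `None` (for example, a precision that is undefined because the label was never predicted). The sketch below reproduces that kind of failure with plain Python and shows one possible None-tolerant macro-average. It is only an illustration: `macro_average` and the sample values are hypothetical, and this is not revscoring's implementation. Whether undefined labels should be skipped or counted as zero is a design choice that changes the result.

```python
from typing import Iterable, Optional


def macro_average(values: Iterable[Optional[float]]) -> Optional[float]:
    """Average the defined values; return None if every value is undefined."""
    defined = [v for v in values if v is not None]
    return sum(defined) / len(defined) if defined else None


# Hypothetical per-label f1 values; the None stands for an undefined statistic.
per_label_f1 = [0.71, None, 0.64]

try:
    sum(per_label_f1)  # naive averaging starts by summing, which fails on None
except TypeError as err:
    print("naive sum fails:", err)

print("macro-average over defined labels:", macro_average(per_label_f1))
```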
"""
These meta-datasources operate on :class:`revscoring.Datasource` instances
that return lists of items and produce vectors from them.

.. autoclass:: revscoring.datasources.meta.vectors
"""
import os.path
import logging
from gensim.models.keyedvectors import KeyedVectors
  PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%    TIME+ Command
 5814 codezee   20   0 1271M  399M 27464 S 92.0  2.5  0:59.55 python buggy.py
 5834 codezee   20   0 1271M  398M 27464 R 92.0  2.5  0:54.93 python buggy.py
 5828 codezee   20   0 1417M  836M  6592 S  0.0  5.2  0:08.23 python buggy.py
 5829 codezee   20   0 1417M  836M  6592 S  0.0  5.2  0:08.28 python buggy.py
 5827 codezee   20   0 1416M  836M  6592 S  0.0  5.2  0:08.41 python buggy.py
 5848 codezee   20   0 19532  3884  2872 R  0.7  0.0  0:00.17 htop
 5835 codezee   20   0 1271M  398M 27464 S  0.7  2.5  0:00.12 python buggy.py
 5645 redis     20   0 40860  2264  1304 S  0.0  0.0  1h44:44 /usr/bin/redis-server 0.0.0.0:6379
 5826 codezee   20   0 1416M  836M  6592 S  0.0  5.2  0:08.56 python buggy.py
Statistics:
counts (n=93415):
label                                              n         TP    FN     FP     TN
---------------------------------------------  -----  ---  -----  ----  -----  -----
'STEM.Time'                                     2382  -->   1904   478   4702  86331
'STEM.Physics'                                  2633  -->   2411   222   7577  83205
'STEM.Space'                                    2522  -->   2381   141   2824  88069
'STEM.Mathematics'                              1659  -->   1462   197   5666  86090
'Culture.Crafts and hobbies'                    2150  -->   1754   396   2419  88846
'History_And_Society.Transportation'            4276  -->   3711   565   2520  86619
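Per-label precision, recall, and F1 follow directly from the counts above. The sketch below is a minimal illustration, not part of the original tooling: `LabelCounts`, `safe_div`, and `summarize` are hypothetical names. It reads the four counts after the arrow by position as TP, FN, FP, TN. That reading is an assumption based on the row arithmetic: in every row the first two counts sum to `n`, and all four sum to the 93415 observations.

```python
from typing import NamedTuple, Optional


class LabelCounts(NamedTuple):
    """Confusion counts for one label, read by position from a table row."""
    tp: int
    fn: int
    fp: int
    tn: int


def safe_div(num: float, den: float) -> Optional[float]:
    """Return num / den, or None when the ratio is undefined."""
    return num / den if den else None


def summarize(c: LabelCounts) -> dict:
    """Compute precision, recall, and F1; undefined values come back as None."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Two rows copied from the table above, in (TP, FN, FP, TN) order.
rows = {
    "STEM.Time": LabelCounts(1904, 478, 4702, 86331),
    "STEM.Space": LabelCounts(2381, 141, 2824, 88069),
}

for label, counts in rows.items():
    stats = summarize(counts)
    shown = {k: None if v is None else round(v, 3) for k, v in stats.items()}
    print(label, shown)
```

Returning `None` for an undefined ratio keeps a label with no predictions from raising a division error, at the cost of forcing callers to handle the missing value explicitly.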
counts (n=93415):
label                                              n         TP     FN    FP     TN
---------------------------------------------  -----  ---  -----  -----  ----  -----
'STEM.Time'                                     2382  -->   1515    867   116  90917
'STEM.Physics'                                  2633  -->   1498   1135   378  90404
'STEM.Space'                                    2522  -->   2135    387   101  90792
'STEM.Mathematics'                              1659  -->   1090    569    74  91682
'Culture.Crafts and hobbies'                    2150  -->   1236    914    67  91198
'History_And_Society.Transportation'            4276  -->   3091   1185   339  88800
'Geography.Maps'                                2552  -->   1374   1178    73  90790
{"title": "List_of_fish_on_stamps_of_Madeira", "actual": ["Culture.Crafts and hobbies", "Geography.Europe", "Assistance.Maintenance", "STEM.Biology"], "predicted": ["Culture.Crafts and hobbies", "Geography.Countries", "Culture.Language and literature", "Geography.Europe"]}
{"title": "Arne_Tumyr", "actual": ["Culture.Language and literature", "Geography.Europe"], "predicted": ["Culture.Media", "History_And_Society.History and society", "Culture.Language and literature", "Geography.Europe", "History_And_Society.Politics and government"]}
{"title": "Irradiation", "actual": ["STEM.Technology", "STEM.Physics", "STEM.Medicine"], "predicted": ["STEM.Physics", "STEM.Biology", "STEM.Technology", "STEM.Engineering", "STEM.Medicine", "STEM.Chemistry"]}
{"title": "Wesbank,_Western_Cape", "actual": ["Geography.Countries"], "predicted": ["Geography.Countries", "History_And_Society.History and society", "Geography.Europe"]}
{"title": "60_Cycle", "actual": ["Culture.Language and literature", "Geography.Countries", "Culture.P
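Each complete record above pairs an article's actual topic labels with the predicted ones, so a per-article comparison reduces to set operations. The following sketch is one way to do that, using the "Irradiation" record copied from the lines above; `compare_labels` and its output keys are hypothetical names chosen for illustration, not part of the original tooling.

```python
import json


def compare_labels(record: dict) -> dict:
    """Split one record's labels into correct, missed, and spurious sets."""
    actual = set(record["actual"])
    predicted = set(record["predicted"])
    return {
        "title": record["title"],
        "correct": sorted(actual & predicted),   # true positives
        "missed": sorted(actual - predicted),    # false negatives
        "spurious": sorted(predicted - actual),  # false positives
    }


# One complete JSON line copied from the listing above.
line = (
    '{"title": "Irradiation", '
    '"actual": ["STEM.Technology", "STEM.Physics", "STEM.Medicine"], '
    '"predicted": ["STEM.Physics", "STEM.Biology", "STEM.Technology", '
    '"STEM.Engineering", "STEM.Medicine", "STEM.Chemistry"]}'
)

result = compare_labels(json.loads(line))
for key, value in result.items():
    print(f"{key}: {value}")
```

Summing the sizes of these sets per label across all records would yield label-level TP, FN, and FP counts.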
counts (n=10000):
label                                             n         TP    FN    FP    TN
---------------------------------------------  ----  ---  ----  ----  ----  ----
'STEM.Time'                                     270  -->   200    70   155  9575
'STEM.Physics'                                  284  -->   245    39   554  9162
'STEM.Space'                                    251  -->   229    22   111  9638
'STEM.Mathematics'                              164  -->   133    31   283  9553
'Culture.Crafts and hobbies'                    232  -->   156    76    44  9724
'History_And_Society.Transportation'            469  -->   389    80   155  9376
'Geography.Maps'                                287  -->   217    70   564  9149