@chtnnh
chtnnh / questbook-cla.md
Created January 24, 2022 15:50
Individual CLA for Questbook by CreatorOS

CreatorOS Individual Contributor License Agreement

Thank you for your interest in contributing to open source software projects (“Projects”) made available by CreatorOS Inc. or its affiliates (“CreatorOS”). This Individual Contributor License Agreement (“Agreement”) sets out the terms governing any source code, object code, bug fixes, configuration changes, tools, specifications, documentation, data, materials, feedback, information or other works of authorship that you submit or have submitted, in any form and in any manner, to CreatorOS in respect of any of the Projects (collectively “Contributions”). If you have any questions respecting this Agreement, please contact engineering@creatoros.co.

You agree that the following terms apply to all of your past, present and future Contributions. Except for the licenses granted in this Agreement, you retain all of your right, title and interest in and to your Contributions.

Copyright License. You hereby grant, and agree to grant, to CreatorOS a non-excl

terminal >> npm i --save graphql-voyager
npm WARN old lockfile
npm WARN old lockfile The package-lock.json file was created with an old version of npm,
npm WARN old lockfile so supplemental metadata must be fetched from the registry.
npm WARN old lockfile
npm WARN old lockfile This is a one-time fix-up, please be patient...
npm WARN old lockfile
npm WARN deprecated domelementtype@1.3.0: update to domelementtype@1.3.1
npm WARN deprecated flatten@0.0.1: flatten is deprecated in favor of utility frameworks such as lodash.
npm WARN deprecated flatten@1.0.2: flatten is deprecated in favor of utility frameworks such as lodash.
"""
Process a collection of XML dumps looking for the introduction and removal of {{Beginnetje}} templates
and assume the introduction represents a quality label ("E") and the removal represents the quality
label "D". Note: This script does not yet handle reverts (e.g. vandalism). To do that, look into
the mwreverts library.
USAGE:
nlwiki_template_extractor (-h|--help)
nlwiki_template_extractor <xml-dump>...
[--processes=<num>]
"""
Process a collection of XML dumps looking for the introduction and removal of {{Beginnetje}} templates
and assume the introduction represents a quality label ("E") and the removal represents the quality
label "D". Note: This script does not yet handle reverts (e.g. vandalism). To do that, look into
the mwreverts library.
USAGE:
nlwiki_template_extractor (-h|--help)
nlwiki_template_extractor <xml-dump>...
[--namespace=<num>...] [--processes=<num>]
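
The labeling rule described in the docstring above can be sketched without the dump-reading machinery. This is a minimal illustration, not the gist's actual implementation: the function name `label_template_changes`, the regular expression, and the sample history are all assumptions made for the example, and reverts are ignored just as the docstring warns.

```python
import re

# Assumed pattern: {{Beginnetje}} with optional parameters, case-insensitive.
BEGINNETJE_RE = re.compile(r"\{\{\s*beginnetje\s*(\|[^{}]*)?\}\}", re.IGNORECASE)


def label_template_changes(revisions):
    """Yield (rev_id, label) when the template is introduced ("E") or removed ("D").

    `revisions` is an iterable of (rev_id, text) pairs in chronological order.
    This hypothetical helper does not detect reverts.
    """
    had_template = False
    for rev_id, text in revisions:
        has_template = bool(BEGINNETJE_RE.search(text or ""))
        if has_template and not had_template:
            yield rev_id, "E"  # template introduced
        elif had_template and not has_template:
            yield rev_id, "D"  # template removed
        had_template = has_template


# Made-up page history used only to exercise the sketch.
history = [
    (1, "Een kort artikel."),
    (2, "Een kort artikel. {{Beginnetje}}"),
    (3, "Een kort artikel met meer tekst. {{Beginnetje|biologie}}"),
    (4, "Een uitgebreid artikel."),
]
labels = list(label_template_changes(history))
print(labels)  # [(2, 'E'), (4, 'D')]
```

Tracking only the previous revision's state keeps the pass streaming-friendly, which matters when revisions come from a large XML dump rather than an in-memory list.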
@chtnnh
chtnnh / revertslabel
Created October 28, 2020 05:24 — forked from codez266/revertslabel
Reverts labeling from dump - involves loading revids from db and storing back, but that part is trivial
import mwreverts
from models import RevRevert, Page, Revision
import mwxml
import pdb
from collections import deque
from mwapilib import get_revs_for_revert_labeling
import sys
# This script is used for processing edits from the dump for reverts and store
# the revert status in a revert table. Edits for the pages from the page table
@chtnnh
chtnnh / cmd.bash
Created May 4, 2020 15:06 — forked from halfak/cmd.bash
Sample of labels and words_to_watch
$ bzcat datasets/ptwiki.draft_quality.balanced_3k.with_text.json.bz2 | \
shuf -n 100 | python demo_ptwiki_w2w.py | sort -k1,1
$ python
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from revscoring import Model
>>> model = Model.load(open("models/ptwiki.wp10.gradient_boosting.model"))
>>> importance_features = list(sorted(zip(model.estimator.feature_importances_, model.features), reverse=True))
>>> for importance, feature in importance_features:
... print(round(importance, 3), feature)
...
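
The ranking step in the session above can be tried without a trained model. The sketch below uses stand-in lists (the feature names and importance values are invented for illustration, not taken from the ptwiki model) and sorts on the importance alone, so tied importances never fall back to comparing the feature objects themselves.

```python
# Stand-in data: illustrative values only, not from models/ptwiki.wp10.gradient_boosting.model.
importances = [0.12, 0.40, 0.12, 0.36]
features = ["feature_a", "feature_b", "feature_c", "feature_d"]

# Sort on the importance only, highest first; ties keep their original order.
ranked = sorted(zip(importances, features), key=lambda pair: pair[0], reverse=True)

for importance, feature in ranked:
    print(round(importance, 3), feature)
```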
@chtnnh
chtnnh / example.py
Last active March 12, 2020 14:03 — forked from halfak/example.py
import mwparserfromhell
example = """
{{foo bar baz}}
{{I am a random template}}
{{Marca de projeto|3|Biografias|4|Políticos|4|Brasil|3|WP Offline|2|bot=4/20111127|rev=20170714}}"""
templates = list(mwparserfromhell.parse(example).filter_templates())
def from_template(template):
import time
import textstat
import mwapi
from revscoring.dependencies import solve
from revscoring.datasources.meta import filters
from revscoring.features import wikitext
from articlequality.feature_lists.enwiki import text_complexity
session = mwapi.Session("https://en.wikipedia.org")