halfak/arwiki.txt

## arwiki.txt
$ python get_thresholds.py arwiki
-------------------------------------------  --------  ---------  ---------  ------
label                                        pop rate  threshold  precision  recall
Culture.Biography.Biography*                 0.123     0.338      0.7        0.975
Culture.Biography.Women                      0.015     0.617      0.5        0.661
Culture.Food and drink                       0.002     0.792      0.7        0.61
Culture.Internet culture                     0.004     0.818      0.7        0.702
Culture.Linguistics                          0.007     0.251      0.7        0.739
Culture.Literature                           0.016     0.707      0.7        0.636
Culture.Media.Books                          0.004     0.583      0.7        0.727
Culture.Media.Entertainment                  0.004     0.218      0.15       0.675
Culture.Media.Films                          0.011     0.207      0.7        0.847
Culture.Media.Media*                         0.059     0.635      0.7        0.768
Culture.Media.Music                          0.024     0.268      0.7        0.825
Culture.Media.Radio                          0.002     0.311      0.3        0.618
Culture.Media.Software                       0.001     0.797      0.3        0.598
Culture.Media.Television                     0.009     0.621      0.7        0.539
Culture.Media.Video games                    0.003     0.361      0.7        0.893
Culture.Performing arts                      0.003     0.375      0.3        0.577
Culture.Philosophy and religion              0.011     0.453      0.5        0.52
Culture.Sports                               0.071     0.064      0.7        0.96
Culture.Visual arts.Architecture             0.011     0.454      0.7        0.688
Culture.Visual arts.Comics and Anime         0.002     0.839      0.7        0.692
Culture.Visual arts.Fashion                  0.001     0.489      0.3        0.736
Culture.Visual arts.Visual arts*             0.018     0.6        0.7        0.67
Geography.Geographical                       0.024     0.409      0.7        0.753
Geography.Regions.Africa.Africa*             0.008     0.905      0.7        0.616
Geography.Regions.Africa.Central Africa      0.0       0.9        < 0.15
Geography.Regions.Africa.Eastern Africa      0.0       0.305      0.3        0.814
Geography.Regions.Africa.Northern Africa     0.001     0.834      0.3        0.592
Geography.Regions.Africa.Southern Africa     0.001     0.603      0.5        0.786
Geography.Regions.Africa.Western Africa      0.001     0.519      0.5        0.782
Geography.Regions.Americas.Central America   0.003     0.707      0.7        0.553
Geography.Regions.Americas.North America     0.064     0.381      0.5        0.765
Geography.Regions.Americas.South America     0.006     0.468      0.7        0.764
Geography.Regions.Asia.Asia*                 0.046     0.532      0.7        0.821
Geography.Regions.Asia.Central Asia          0.001     0.924      0.5        0.569
Geography.Regions.Asia.East Asia             0.011     0.507      0.7        0.732
Geography.Regions.Asia.North Asia            0.001     0.457      0.15       0.72
Geography.Regions.Asia.South Asia            0.015     0.092      0.7        0.886
Geography.Regions.Asia.Southeast Asia        0.006     0.257      0.7        0.798
Geography.Regions.Asia.West Asia             0.011     0.651      0.7        0.765
Geography.Regions.Europe.Eastern Europe      0.013     0.542      0.7        0.72
Geography.Regions.Europe.Europe*             0.076     0.639      0.7        0.708
Geography.Regions.Europe.Northern Europe     0.031     0.594      0.7        0.599
Geography.Regions.Europe.Southern Europe     0.013     0.653      0.7        0.669
Geography.Regions.Europe.Western Europe      0.019     0.666      0.7        0.652
Geography.Regions.Oceania                    0.015     0.072      0.7        0.918
History and Society.Business and economics   0.01      0.286      0.3        0.674
History and Society.Education                0.007     0.246      0.3        0.575
History and Society.History                  0.011     0.462      0.3        0.504
History and Society.Military and warfare     0.014     0.748      0.7        0.541
History and Society.Politics and government  0.028     0.647      0.7        0.519
History and Society.Society                  0.013     0.207      0.15       0.669
History and Society.Transportation           0.015     0.198      0.7        0.915
STEM.Biology                                 0.034     0.109      0.7        0.879
STEM.Chemistry                               0.002     0.82       0.5        0.626
STEM.Computing                               0.003     0.587      0.3        0.742
STEM.Earth and environment                   0.005     0.788      0.7        0.508
STEM.Engineering                             0.005     0.799      0.7        0.601
STEM.Libraries & Information                 0.001     0.512      0.3        0.658
STEM.Mathematics                             0.0       0.94       0.5        0.528
STEM.Medicine & Health                       0.006     0.797      0.7        0.596
STEM.Physics                                 0.001     0.418      0.15       0.742
STEM.STEM*                                   0.069     0.429      0.7        0.89
STEM.Space                                   0.006     0.089      0.7        0.948
STEM.Technology                              0.005     0.578      0.3        0.677
-------------------------------------------  --------  ---------  ---------  ------

## cswiki.txt
$ python get_thresholds.py cswiki
-------------------------------------------  --------  ---------  ---------  ------
label                                        pop rate  threshold  precision  recall
Culture.Biography.Biography*                 0.123     0.191      0.7        0.964
Culture.Biography.Women                      0.015     0.498      0.7        0.864
Culture.Food and drink                       0.002     0.77       0.7        0.742
Culture.Internet culture                     0.004     0.791      0.7        0.76
Culture.Linguistics                          0.007     0.25       0.7        0.846
Culture.Literature                           0.016     0.645      0.7        0.707
Culture.Media.Books                          0.004     0.541      0.7        0.84
Culture.Media.Entertainment                  0.004     0.626      0.5        0.506
Culture.Media.Films                          0.011     0.258      0.7        0.899
Culture.Media.Media*                         0.059     0.55       0.7        0.882
Culture.Media.Music                          0.024     0.208      0.7        0.925
Culture.Media.Radio                          0.002     0.723      0.7        0.568
Culture.Media.Software                       0.001     0.833      0.3        0.589
Culture.Media.Television                     0.009     0.291      0.7        0.905
Culture.Media.Video games                    0.003     0.238      0.7        0.957
Culture.Performing arts                      0.003     0.739      0.7        0.616
Culture.Philosophy and religion              0.011     0.588      0.5        0.566
Culture.Sports                               0.071     0.04       0.7        0.965
Culture.Visual arts.Architecture             0.011     0.535      0.7        0.756
Culture.Visual arts.Comics and Anime         0.002     0.338      0.7        0.914
Culture.Visual arts.Fashion                  0.001     0.635      0.5        0.76
Culture.Visual arts.Visual arts*             0.018     0.579      0.7        0.757
Geography.Geographical                       0.024     0.731      0.7        0.56
Geography.Regions.Africa.Africa*             0.008     0.638      0.7        0.652
Geography.Regions.Africa.Central Africa      0.0       0.9        < 0.15
Geography.Regions.Africa.Eastern Africa      0.0       0.415      0.3        0.728
Geography.Regions.Africa.Northern Africa     0.001     0.416      0.3        0.734
Geography.Regions.Africa.Southern Africa     0.001     0.701      0.7        0.521
Geography.Regions.Africa.Western Africa      0.001     0.116      0.3        0.603
Geography.Regions.Americas.Central America   0.003     0.482      0.7        0.695
Geography.Regions.Americas.North America     0.064     0.451      0.7        0.672
Geography.Regions.Americas.South America     0.006     0.348      0.7        0.76
Geography.Regions.Asia.Asia*                 0.046     0.505      0.7        0.812
Geography.Regions.Asia.Central Asia          0.001     0.84       0.5        0.509
Geography.Regions.Asia.East Asia             0.011     0.414      0.7        0.8
Geography.Regions.Asia.North Asia            0.001     0.613      0.15       0.639
Geography.Regions.Asia.South Asia            0.015     0.134      0.7        0.869
Geography.Regions.Asia.Southeast Asia        0.006     0.325      0.7        0.791
Geography.Regions.Asia.West Asia             0.011     0.434      0.7        0.813
Geography.Regions.Europe.Eastern Europe      0.013     0.585      0.5        0.694
Geography.Regions.Europe.Europe*             0.076     0.75       0.7        0.615
Geography.Regions.Europe.Northern Europe     0.031     0.416      0.7        0.706
Geography.Regions.Europe.Southern Europe     0.013     0.66       0.7        0.609
Geography.Regions.Europe.Western Europe      0.019     0.755      0.7        0.579
Geography.Regions.Oceania                    0.015     0.187      0.7        0.813
History and Society.Business and economics   0.01      0.465      0.5        0.655
History and Society.Education                0.007     0.568      0.7        0.553
History and Society.History                  0.011     0.382      0.3        0.724
History and Society.Military and warfare     0.014     0.79       0.7        0.553
History and Society.Politics and government  0.028     0.61       0.7        0.508
History and Society.Society                  0.013     0.428      0.3        0.577
History and Society.Transportation           0.015     0.201      0.7        0.952
STEM.Biology                                 0.034     0.114      0.7        0.915
STEM.Chemistry                               0.002     0.806      0.5        0.75
STEM.Computing                               0.003     0.866      0.5        0.627
STEM.Earth and environment                   0.005     0.767      0.7        0.653
STEM.Engineering                             0.005     0.737      0.7        0.714
STEM.Libraries & Information                 0.001     0.765      0.5        0.629
STEM.Mathematics                             0.0       0.862      0.5        0.789
STEM.Medicine & Health                       0.006     0.641      0.7        0.703
STEM.Physics                                 0.001     0.676      0.3        0.724
STEM.STEM*                                   0.069     0.41       0.7        0.916
STEM.Space                                   0.006     0.096      0.7        0.973
STEM.Technology                              0.005     0.829      0.5        0.547
-------------------------------------------  --------  ---------  ---------  ------

## enwiki.txt
$ python get_thresholds.py enwiki
-------------------------------------------  --------  ---------  ---------  ------
label                                        pop rate  threshold  precision  recall
Culture.Biography.Biography*                 0.123     0.247      0.7        0.946
Culture.Biography.Women                      0.015     0.667      0.5        0.668
Culture.Food and drink                       0.002     0.782      0.7        0.661
Culture.Internet culture                     0.004     0.797      0.7        0.722
Culture.Linguistics                          0.007     0.201      0.7        0.814
Culture.Literature                           0.016     0.763      0.7        0.618
Culture.Media.Books                          0.004     0.858      0.7        0.516
Culture.Media.Entertainment                  0.004     0.387      0.3        0.593
Culture.Media.Films                          0.011     0.318      0.7        0.864
Culture.Media.Media*                         0.059     0.637      0.7        0.814
Culture.Media.Music                          0.024     0.146      0.7        0.908
Culture.Media.Radio                          0.002     0.365      0.7        0.824
Culture.Media.Software                       0.001     0.639      0.15       0.543
Culture.Media.Television                     0.009     0.573      0.7        0.722
Culture.Media.Video games                    0.003     0.335      0.7        0.921
Culture.Performing arts                      0.003     0.816      0.7        0.547
Culture.Philosophy and religion              0.011     0.499      0.5        0.551
Culture.Sports                               0.071     0.03       0.7        0.97
Culture.Visual arts.Architecture             0.011     0.641      0.7        0.682
Culture.Visual arts.Comics and Anime         0.002     0.932      0.7        0.614
Culture.Visual arts.Fashion                  0.001     0.778      0.5        0.645
Culture.Visual arts.Visual arts*             0.018     0.728      0.7        0.66
Geography.Geographical                       0.024     0.416      0.7        0.712
Geography.Regions.Africa.Africa*             0.008     0.74       0.7        0.785
Geography.Regions.Africa.Central Africa      0.0       0.9        < 0.15
Geography.Regions.Africa.Eastern Africa      0.0       0.99       0.7        0.506
Geography.Regions.Africa.Northern Africa     0.001     0.818      0.5        0.627
Geography.Regions.Africa.Southern Africa     0.001     0.884      0.7        0.628
Geography.Regions.Africa.Western Africa      0.001     0.381      0.3        0.838
Geography.Regions.Americas.Central America   0.003     0.755      0.7        0.59
Geography.Regions.Americas.North America     0.064     0.53       0.7        0.678
Geography.Regions.Americas.South America     0.006     0.652      0.7        0.684
Geography.Regions.Asia.Asia*                 0.046     0.473      0.7        0.867
Geography.Regions.Asia.Central Asia          0.001     0.944      0.7        0.61
Geography.Regions.Asia.East Asia             0.011     0.542      0.7        0.762
Geography.Regions.Asia.North Asia            0.001     0.448      0.15       0.689
Geography.Regions.Asia.South Asia            0.015     0.065      0.7        0.94
Geography.Regions.Asia.Southeast Asia        0.006     0.243      0.7        0.853
Geography.Regions.Asia.West Asia             0.011     0.3        0.7        0.84
Geography.Regions.Europe.Eastern Europe      0.013     0.534      0.7        0.746
Geography.Regions.Europe.Europe*             0.076     0.648      0.7        0.678
Geography.Regions.Europe.Northern Europe     0.031     0.607      0.7        0.622
Geography.Regions.Europe.Southern Europe     0.013     0.619      0.7        0.642
Geography.Regions.Europe.Western Europe      0.019     0.71       0.7        0.537
Geography.Regions.Oceania                    0.015     0.117      0.7        0.904
History and Society.Business and economics   0.01      0.395      0.3        0.565
History and Society.Education                0.007     0.211      0.3        0.673
History and Society.History                  0.011     0.364      0.3        0.559
History and Society.Military and warfare     0.014     0.673      0.7        0.647
History and Society.Politics and government  0.028     0.514      0.7        0.628
History and Society.Society                  0.013     0.31       0.3        0.532
History and Society.Transportation           0.015     0.301      0.7        0.898
STEM.Biology                                 0.034     0.067      0.7        0.914
STEM.Chemistry                               0.002     0.588      0.3        0.668
STEM.Computing                               0.003     0.765      0.3        0.511
STEM.Earth and environment                   0.005     0.645      0.7        0.67
STEM.Engineering                             0.005     0.77       0.7        0.645
STEM.Libraries & Information                 0.001     0.702      0.3        0.529
STEM.Mathematics                             0.0       0.903      0.3        0.571
STEM.Medicine & Health                       0.006     0.735      0.7        0.613
STEM.Physics                                 0.001     0.83       0.3        0.51
STEM.STEM*                                   0.069     0.389      0.7        0.895
STEM.Space                                   0.006     0.069      0.7        0.937
STEM.Technology                              0.005     0.63       0.3        0.588
-------------------------------------------  --------  ---------  ---------  ------

## get_thresholds.py
"""
Queries for optimal thresholds from ORES.


Usage:
    get_thresholds (-h|--help)
    get_thresholds <wiki>

Options:
    -h --help  Prints this documentation
    <wiki>     The DBname of the wiki to query thresholds for.
"""
import docopt
import requests
from tabulate import tabulate

ORES_HOST = "https://ores.wikimedia.org"
PATH = "/v3/scores"
MODEL = "articletopic"
PRECISION_TARGETS = [0.7, 0.5, 0.3, 0.15]


def main(argv=None):
    args = docopt.docopt(__doc__, argv=argv)

    wiki = args['<wiki>']

    headers = [['label', 'pop rate', 'threshold', 'precision', 'recall']]

    table_data = headers
    for label, pop_rate in get_labels(wiki, MODEL):
        threshold, precision, recall = get_best_threshold(wiki, label)
        row = [label, pop_rate, threshold, precision, recall]
        table_data.append(row)

    print(tabulate(table_data))


def get_labels(wiki, model):
    # Fetch the model's label list and the population rate of each label.
    doc = requests.get(
        ORES_HOST + PATH + "/" + wiki + "/",
        params={
            'models': model,
            'model_info': "params|statistics.rates"
        }
    ).json()
    model_info = doc[wiki]['models'][model]
    labels = model_info['params']['labels']
    pop_rates = model_info['statistics']['rates']['population']
    return [(label, pop_rates[label]) for label in labels]


def get_threshold(wiki, label, target):
    doc = requests.get(
        ORES_HOST + PATH + "/" + wiki + "/",
        params={
            'models': MODEL,
            'model_info': "statistics.thresholds.{0}.'maximum recall @ precision >= {1}'".format(repr(label), target)
        }
    ).json()

    thresholds = doc[wiki]['models'][MODEL]['statistics']['thresholds'][label]
    if len(thresholds) == 1 and thresholds[0] is not None:
        return thresholds[0]['threshold'], thresholds[0]['recall']
    else:
        return None, None


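def get_best_threshold(wiki, label):
    # Try the precision targets from strictest to most lenient and keep the
    # first one that still gives at least 50% recall.
    for target in PRECISION_TARGETS:
        threshold, recall = get_threshold(wiki, label, target)
        if recall is not None and recall >= 0.5:
            return threshold, target, recall

    # No target reached 50% recall: fall back to a fixed 0.9 threshold.
    return 0.9, "< 0.15", None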


if __name__ == '__main__':
    main()

## getting_ores_thresholds.md

      
Let's get some useful thresholds for the models.  Generally, these thresholds are going to look a lot worse than they really are, mostly because the labels we used for training are messy and incomplete.  We're targeting at least 70% precision, but we're likely to get that even when we ask for 50% precision, and in some cases we'll still get it when we target even lower precision.

So!  We're going to use ORES's "threshold optimization" querying system.  We'll need to make a call for each topic in order to get an appropriate threshold:

Culture.Biography.Biography* [maximum recall @ precision >= 0.5]

              {
                "!f1": 0.925,
                "!precision": 0.996,
                "!recall": 0.863,
                "accuracy": 0.877,
                "f1": 0.662,
                "filter_rate": 0.759,
                "fpr": 0.137,
                "match_rate": 0.241,
                "precision": 0.5,
                "recall": 0.977,
                "threshold": 0.086
              }


Culture.Biography.Women [maximum recall @ precision >= 0.5]

              {
                "!f1": 0.993,
                "!precision": 0.995,
                "!recall": 0.99,
                "accuracy": 0.985,
                "f1": 0.572,
                "filter_rate": 0.981,
                "fpr": 0.01,
                "match_rate": 0.019,
                "precision": 0.501,
                "recall": 0.668,
                "threshold": 0.667
              }


Culture.Media.Entertainment [maximum recall @ precision >= 0.5]

              {
                "!f1": 0.998,
                "!precision": 0.998,
                "!recall": 0.998,
                "accuracy": 0.996,
                "f1": 0.47,
                "filter_rate": 0.997,
                "fpr": 0.002,
                "match_rate": 0.003,
                "precision": 0.503,
                "recall": 0.442,
                "threshold": 0.646
              }


STEM.Mathematics [maximum recall @ precision >= 0.3]

              {
                "!f1": 1.0,
                "!precision": 1.0,
                "!recall": 0.999,
                "accuracy": 0.999,
                "f1": 0.401,
                "filter_rate": 0.999,
                "fpr": 0.001,
                "match_rate": 0.001,
                "precision": 0.309,
                "recall": 0.571,
                "threshold": 0.903
              }
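
The fields in these responses are not independent, which gives a quick way to sanity-check them. The sketch below is not part of ORES; `f1_score` and `match_rate` are helper names made up here. It assumes the standard definitions (F1 as the harmonic mean of precision and recall, match rate as the share of all articles that land above the threshold) and uses the 0.123 pop rate for Culture.Biography.Biography* from the enwiki table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def match_rate(pop_rate, precision, recall):
    """Share of all articles above the threshold.

    True positives make up pop_rate * recall of all articles, and dividing
    by precision adds the false positives back in.
    """
    return pop_rate * recall / precision


# Values copied from the Culture.Biography.Biography* response above.
# The pop rate comes from the enwiki table, not from this response.
biography = {"precision": 0.5, "recall": 0.977}
pop_rate = 0.123

f1 = f1_score(biography["precision"], biography["recall"])
matched = match_rate(pop_rate, biography["precision"], biography["recall"])

print("f1:", round(f1, 3))
print("match_rate:", round(matched, 3))
print("filter_rate:", round(1 - matched, 3))
```

Because every number in the response is rounded to three places, the recomputed values should only be expected to agree with the reported `f1`, `match_rate` and `filter_rate` to within rounding.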

Here, we can see some diversity.  Culture.Biography.Biography* is easy to model and it's very common in the labeled data, so we can get very high precision and very high recall and a strict threshold.  STEM.Mathematics is on the other end of the spectrum.  There are very few math-related articles at all.  I've relaxed the minimum precision to 0.3 in order to get a threshold.
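
Once there is a threshold per topic, using them is just a comparison against the model's predicted probabilities. Here is a minimal sketch of that step. It assumes a topic counts as a match when its probability is at or above its threshold. The thresholds are copied from the enwiki table, while `topics_for` and the example probabilities are made up for illustration and do not come from a real ORES response.

```python
# Thresholds copied from the enwiki table (a small subset of the labels).
ENWIKI_THRESHOLDS = {
    "Culture.Biography.Biography*": 0.247,
    "Culture.Biography.Women": 0.667,
    "STEM.Mathematics": 0.903,
    "STEM.STEM*": 0.389,
}


def topics_for(probabilities, thresholds):
    """Return the labels whose probability meets its threshold, sorted by name.

    Labels missing from `probabilities` are treated as probability 0.0.
    """
    return sorted(
        label
        for label, threshold in thresholds.items()
        if probabilities.get(label, 0.0) >= threshold
    )


# Hypothetical probabilities for one article, e.g. a mathematician's biography.
example_probabilities = {
    "Culture.Biography.Biography*": 0.91,
    "Culture.Biography.Women": 0.40,
    "STEM.Mathematics": 0.95,
    "STEM.STEM*": 0.60,
}

predicted = topics_for(example_probabilities, ENWIKI_THRESHOLDS)
print(predicted)
```

Note how much the bar differs by topic: 0.40 is not enough for Culture.Biography.Women (threshold 0.667), while a much lower score would already pass for Culture.Biography.Biography* (threshold 0.247).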

  
## kowiki.txt
$ python get_thresholds.py kowiki
-------------------------------------------  --------  ---------  ---------  ------
label                                        pop rate  threshold  precision  recall
Culture.Biography.Biography*                 0.123     0.236      0.7        0.954
Culture.Biography.Women                      0.015     0.739      0.7        0.608
Culture.Food and drink                       0.002     0.688      0.7        0.76
Culture.Internet culture                     0.004     0.851      0.7        0.661
Culture.Linguistics                          0.007     0.276      0.7        0.797
Culture.Literature                           0.016     0.657      0.7        0.705
Culture.Media.Books                          0.004     0.552      0.7        0.759
Culture.Media.Entertainment                  0.004     0.414      0.3        0.627
Culture.Media.Films                          0.011     0.301      0.7        0.876
Culture.Media.Media*                         0.059     0.62       0.7        0.826
Culture.Media.Music                          0.024     0.255      0.7        0.883
Culture.Media.Radio                          0.002     0.264      0.5        0.717
Culture.Media.Software                       0.001     0.827      0.3        0.585
Culture.Media.Television                     0.009     0.484      0.7        0.725
Culture.Media.Video games                    0.003     0.419      0.7        0.91
Culture.Performing arts                      0.003     0.541      0.5        0.623
Culture.Philosophy and religion              0.011     0.458      0.5        0.594
Culture.Sports                               0.071     0.043      0.7        0.948
Culture.Visual arts.Architecture             0.011     0.508      0.7        0.724
Culture.Visual arts.Comics and Anime         0.002     0.876      0.7        0.743
Culture.Visual arts.Fashion                  0.001     0.442      0.3        0.772
Culture.Visual arts.Visual arts*             0.018     0.597      0.7        0.715
Geography.Geographical                       0.024     0.647      0.7        0.574
Geography.Regions.Africa.Africa*             0.008     0.676      0.7        0.677
Geography.Regions.Africa.Central Africa      0.0       0.9        < 0.15
Geography.Regions.Africa.Eastern Africa      0.0       0.083      0.15       0.844
Geography.Regions.Africa.Northern Africa     0.001     0.785      0.5        0.617
Geography.Regions.Africa.Southern Africa     0.001     0.329      0.5        0.804
Geography.Regions.Africa.Western Africa      0.001     0.044      0.3        0.794
Geography.Regions.Americas.Central America   0.003     0.495      0.7        0.715
Geography.Regions.Americas.North America     0.064     0.452      0.7        0.68
Geography.Regions.Americas.South America     0.006     0.432      0.7        0.765
Geography.Regions.Asia.Asia*                 0.046     0.724      0.7        0.725
Geography.Regions.Asia.Central Asia          0.001     0.79       0.5        0.659
Geography.Regions.Asia.East Asia             0.011     0.668      0.5        0.739
Geography.Regions.Asia.North Asia            0.001     0.792      0.3        0.599
Geography.Regions.Asia.South Asia            0.015     0.111      0.7        0.889
Geography.Regions.Asia.Southeast Asia        0.006     0.422      0.7        0.764
Geography.Regions.Asia.West Asia             0.011     0.373      0.7        0.804
Geography.Regions.Europe.Eastern Europe      0.013     0.543      0.7        0.761
Geography.Regions.Europe.Europe*             0.076     0.575      0.7        0.772
Geography.Regions.Europe.Northern Europe     0.031     0.28       0.7        0.825
Geography.Regions.Europe.Southern Europe     0.013     0.543      0.7        0.722
Geography.Regions.Europe.Western Europe      0.019     0.558      0.7        0.712
Geography.Regions.Oceania                    0.015     0.149      0.7        0.85
History and Society.Business and economics   0.01      0.512      0.5        0.614
History and Society.Education                0.007     0.451      0.7        0.683
History and Society.History                  0.011     0.604      0.5        0.543
History and Society.Military and warfare     0.014     0.654      0.7        0.623
History and Society.Politics and government  0.028     0.563      0.7        0.549
History and Society.Society                  0.013     0.387      0.3        0.577
History and Society.Transportation           0.015     0.191      0.7        0.939
STEM.Biology                                 0.034     0.106      0.7        0.909
STEM.Chemistry                               0.002     0.701      0.5        0.762
STEM.Computing                               0.003     0.865      0.5        0.554
STEM.Earth and environment                   0.005     0.663      0.7        0.623
STEM.Engineering                             0.005     0.657      0.7        0.699
STEM.Libraries & Information                 0.001     0.737      0.5        0.634
STEM.Mathematics                             0.0       0.955      0.5        0.562
STEM.Medicine & Health                       0.006     0.561      0.7        0.7
STEM.Physics                                 0.001     0.74       0.3        0.662
STEM.STEM*                                   0.069     0.413      0.7        0.905
STEM.Space                                   0.006     0.051      0.7        0.962
STEM.Technology                              0.005     0.588      0.3        0.704
-------------------------------------------  --------  ---------  ---------  ------

## viwiki.txt
$ python get_thresholds.py viwiki
-------------------------------------------  --------  ---------  ---------  ------
label                                        pop rate  threshold  precision  recall
Culture.Biography.Biography*                 0.123     0.193      0.7        0.954
Culture.Biography.Women                      0.015     0.545      0.5        0.788
Culture.Food and drink                       0.002     0.73       0.7        0.679
Culture.Internet culture                     0.004     0.788      0.7        0.757
Culture.Linguistics                          0.007     0.25       0.7        0.81
Culture.Literature                           0.016     0.583      0.7        0.752
Culture.Media.Books                          0.004     0.522      0.7        0.763
Culture.Media.Entertainment                  0.004     0.557      0.5        0.641
Culture.Media.Films                          0.011     0.334      0.7        0.881
Culture.Media.Media*                         0.059     0.528      0.7        0.873
Culture.Media.Music                          0.024     0.11       0.7        0.95
Culture.Media.Radio                          0.002     0.202      0.7        0.747
Culture.Media.Software                       0.001     0.76       0.3        0.726
Culture.Media.Television                     0.009     0.382      0.7        0.821
Culture.Media.Video games                    0.003     0.27       0.7        0.957
Culture.Performing arts                      0.003     0.767      0.7        0.597
Culture.Philosophy and religion              0.011     0.394      0.5        0.65
Culture.Sports                               0.071     0.028      0.7        0.963
Culture.Visual arts.Architecture             0.011     0.359      0.7        0.804
Culture.Visual arts.Comics and Anime         0.002     0.694      0.7        0.898
Culture.Visual arts.Fashion                  0.001     0.713      0.5        0.807
Culture.Visual arts.Visual arts*             0.018     0.485      0.7        0.807
Geography.Geographical                       0.024     0.661      0.7        0.519
Geography.Regions.Africa.Africa*             0.008     0.859      0.7        0.614
Geography.Regions.Africa.Central Africa      0.0       0.9        < 0.15
Geography.Regions.Africa.Eastern Africa      0.0       0.346      0.3        0.73
Geography.Regions.Africa.Northern Africa     0.001     0.917      0.5        0.502
Geography.Regions.Africa.Southern Africa     0.001     0.67       0.5        0.662
Geography.Regions.Africa.Western Africa      0.001     0.154      0.3        0.905
Geography.Regions.Americas.Central America   0.003     0.833      0.7        0.52
Geography.Regions.Americas.North America     0.064     0.29       0.7        0.812
Geography.Regions.Americas.South America     0.006     0.746      0.7        0.689
Geography.Regions.Asia.Asia*                 0.046     0.658      0.7        0.794
Geography.Regions.Asia.Central Asia          0.001     0.659      0.5        0.785
Geography.Regions.Asia.East Asia             0.011     0.875      0.7        0.614
Geography.Regions.Asia.North Asia            0.001     0.838      0.3        0.584
Geography.Regions.Asia.South Asia            0.015     0.103      0.7        0.911
Geography.Regions.Asia.Southeast Asia        0.006     0.89       0.7        0.534
Geography.Regions.Asia.West Asia             0.011     0.194      0.7        0.894
Geography.Regions.Europe.Eastern Europe      0.013     0.52       0.7        0.834
Geography.Regions.Europe.Europe*             0.076     0.434      0.7        0.846
Geography.Regions.Europe.Northern Europe     0.031     0.238      0.7        0.809
Geography.Regions.Europe.Southern Europe     0.013     0.319      0.7        0.826
Geography.Regions.Europe.Western Europe      0.019     0.304      0.7        0.846
Geography.Regions.Oceania                    0.015     0.253      0.7        0.812
History and Society.Business and economics   0.01      0.719      0.7        0.507
History and Society.Education                0.007     0.497      0.7        0.617
History and Society.History                  0.011     0.376      0.3        0.662
History and Society.Military and warfare     0.014     0.7        0.7        0.719
History and Society.Politics and government  0.028     0.591      0.7        0.517
History and Society.Society                  0.013     0.368      0.3        0.625
History and Society.Transportation           0.015     0.09       0.7        0.964
STEM.Biology                                 0.034     0.094      0.7        0.967
STEM.Chemistry                               0.002     0.829      0.5        0.622
STEM.Computing                               0.003     0.752      0.5        0.74
STEM.Earth and environment                   0.005     0.647      0.7        0.652
STEM.Engineering                             0.005     0.582      0.7        0.828
STEM.Libraries & Information                 0.001     0.612      0.5        0.761
STEM.Mathematics                             0.0       0.779      0.5        0.82
STEM.Medicine & Health                       0.006     0.635      0.7        0.733
STEM.Physics                                 0.001     0.72       0.3        0.7
STEM.STEM*                                   0.069     0.371      0.7        0.937
STEM.Space                                   0.006     0.055      0.7        0.965
STEM.Technology                              0.005     0.842      0.5        0.53
-------------------------------------------  --------  ---------  ---------  ------
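
To illustrate how a table like the ones above might be consumed, here is a minimal sketch. It assumes the aligned output has been captured as text and that per-label probabilities for an article are available as a plain dict. The helper names `parse_threshold_table` and `labels_above_threshold` are hypothetical and not part of get_thresholds.py, and the probabilities in the example are made up.

```python
import re


def parse_threshold_table(text):
    """Parse aligned table output into {label: (threshold, precision, recall)}.

    Columns are assumed to be separated by two or more spaces, which lets
    labels such as "Culture.Food and drink" keep their single spaces.
    """
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the shell command, separator rules, and the header row.
        if not line or line.startswith(("$", "---", "label ")):
            continue
        fields = re.split(r"\s{2,}", line)
        if len(fields) < 4:
            continue
        label, _pop_rate, threshold, precision = fields[:4]
        # Fallback rows ("< 0.15") have no recall column.
        recall = float(fields[4]) if len(fields) > 4 else None
        # Precision is kept as a string because it may read "< 0.15".
        rows[label] = (float(threshold), precision, recall)
    return rows


def labels_above_threshold(probabilities, thresholds):
    """Return the labels whose probability meets that label's threshold."""
    return sorted(
        label
        for label, probability in probabilities.items()
        if label in thresholds and probability >= thresholds[label][0]
    )


SAMPLE = """\
label                                        pop rate  threshold  precision  recall
STEM.Space                                   0.006     0.055      0.7        0.965
STEM.Technology                              0.005     0.842      0.5        0.53
"""

thresholds = parse_threshold_table(SAMPLE)
probs = {"STEM.Space": 0.40, "STEM.Technology": 0.40}  # made-up scores
print(labels_above_threshold(probs, thresholds))  # ['STEM.Space']
```

The same probability of 0.40 passes for one label and fails for the other, which is why the thresholds are applied per label rather than as a single global cutoff.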