@kanekomasahiro
Created August 23, 2022 20:42
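Extract the annotator labels from SNLI or MNLI data in jsonl format and tally how many annotators agree on the most frequent label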
import argparse
import json
from collections import Counter, defaultdict


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", type=str, required=True,
                        help="Data file in jsonl format")
    args = parser.parse_args()
    return args


def main(args):
    # Maps "votes received by the most frequent label" -> number of examples.
    type_num_distributions = defaultdict(int)
    with open(args.input) as f:
        for l in f:
            d = json.loads(l)
            # Size of the largest group of annotators that agree on one label.
            type_num = max(Counter(d["annotator_labels"]).values())
            type_num_distributions[type_num] += 1
    print(type_num_distributions)


if __name__ == "__main__":
    args = parse_args()
    main(args)
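To make the printed result easier to read, here is a minimal sketch of the same tally run on a few in-memory records instead of a file. The function name `agreement_distribution` and the sample records are hypothetical and not part of the gist. Only the `annotator_labels` field comes from the script above. The sketch also skips blank lines, which the script does not do.

```python
import json
from collections import Counter


def agreement_distribution(lines):
    """Count examples by the size of their largest group of agreeing annotators."""
    distribution = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            # Assumption: tolerate blank lines rather than failing in json.loads.
            continue
        record = json.loads(line)
        top_votes = max(Counter(record["annotator_labels"]).values())
        distribution[top_votes] += 1
    return distribution


# Hypothetical records shaped like SNLI/MNLI jsonl rows (only the field used here).
sample_lines = [
    json.dumps({"annotator_labels": ["neutral", "neutral", "entailment", "neutral", "neutral"]}),
    json.dumps({"annotator_labels": ["contradiction"]}),
    json.dumps({"annotator_labels": ["entailment", "entailment", "neutral", "contradiction", "entailment"]}),
    json.dumps({"annotator_labels": ["neutral", "neutral", "neutral", "neutral", "entailment"]}),
]

result = agreement_distribution(sample_lines)
print(dict(result))  # {4: 2, 1: 1, 3: 1}
```

Each key is the number of annotators who chose the most frequent label for an example, and each value is how many examples had that level of agreement. A key of 1 typically corresponds to examples with a single annotator label.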