@theo-m
Last active March 24, 2021 08:57
huggingface/datasets validation errors
$ ./scripts/datasets_metadata_validator.py --check_all
WARNING:root:❌ Failed to validate 'datasets/acronym_identification/README.md':
1 validation error for DatasetMetadata
task_ids
'structure-prediction-other-acronym-identification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/ade_corpus_v2/README.md':
3 validation errors for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/adversarial_qa/README.md'
WARNING:root:❌ Failed to validate 'datasets/aeslc/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/afrikaans_ner_corpus/README.md':
1 validation error for DatasetMetadata
licenses
'other-Creative Commons Attribution 2.5 South Africa License' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/ag_news/README.md'
WARNING:root:❌ Failed to validate 'datasets/ai2_arc/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/air_dialogue/README.md':
2 validation errors for DatasetMetadata
annotations_creators
'human-annotated' is not a registered tag for 'annotations', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/creators.json (type=value_error)
task_ids
'conditional-text-generation-other-dialogue-generation' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/ajgt_twitter_ar/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<n<10k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/allegro_reviews/README.md'
DEBUG:root:✅️ Validated 'datasets/allocine/README.md'
DEBUG:root:✅️ Validated 'datasets/alt/README.md'
DEBUG:root:✅️ Validated 'datasets/amazon_polarity/README.md'
WARNING:root:❌ Failed to validate 'datasets/amazon_reviews_multi/README.md':
4 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
licenses
'other-amazon-license' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
multilinguality
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/amazon_us_reviews/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/ambig_qa/README.md'
DEBUG:root:✅️ Validated 'datasets/amttl/README.md'
WARNING:root:❌ Failed to validate 'datasets/anli/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/app_reviews/README.md'
DEBUG:root:✅️ Validated 'datasets/aqua_rat/README.md'
DEBUG:root:✅️ Validated 'datasets/aquamuse/README.md'
WARNING:root:❌ Failed to validate 'datasets/ar_cov19/README.md':
1 validation error for DatasetMetadata
licenses
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/ar_res_reviews/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<n<10k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/ar_sarcasm/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-sarcasm-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/arabic_billion_words/README.md':
1 validation error for DatasetMetadata
licenses
'unkown' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/arabic_pos_dialect/README.md'
DEBUG:root:✅️ Validated 'datasets/arabic_speech_corpus/README.md'
WARNING:root:❌ Failed to validate 'datasets/arcd/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/arsentd_lev/README.md':
2 validation errors for DatasetMetadata
licenses
'other-Copyright-2018-by-[American-University-of-Beirut]' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
size_categories
'1K<n<10K"' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/art/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/arxiv_dataset/README.md'
WARNING:root:❌ Failed to validate 'datasets/aslg_pc12/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/asnq/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/asset/README.md':
2 validation errors for DatasetMetadata
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/assin/README.md'
DEBUG:root:✅️ Validated 'datasets/assin2/README.md'
DEBUG:root:✅️ Validated 'datasets/atomic/README.md'
WARNING:root:❌ Failed to validate 'datasets/autshumato/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/bbc_hindi_nli/README.md'
DEBUG:root:✅️ Validated 'datasets/bc2gm_corpus/README.md'
WARNING:root:❌ Failed to validate 'datasets/best2009/README.md':
2 validation errors for DatasetMetadata
size_categories
'100k<n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'structure-prediction-other-word-tokenization' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:⁉️ Something unexpected happened on 'datasets/bianet/README.md':
while scanning a simple key
in "<unicode string>", line 6, column 1:
en-to-ku
^
could not find expected ':'
in "<unicode string>", line 7, column 1:
- en
^
WARNING:root:❌ Failed to validate 'datasets/bible_para/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
DEBUG:root:✅️ Validated 'datasets/big_patent/README.md'
DEBUG:root:✅️ Validated 'datasets/billsum/README.md'
DEBUG:root:✅️ Validated 'datasets/bing_coronavirus_query_set/README.md'
WARNING:root:❌ Failed to validate 'datasets/biomrc/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/blended_skill_talk/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/blimp/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/blog_authorship_corpus/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/bn_hate_speech/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-hate-speech-topic-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/bookcorpus/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/bookcorpusopen/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/boolq/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/bprec/README.md'
WARNING:root:❌ Failed to validate 'datasets/break_data/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/brwac/README.md'
DEBUG:root:✅️ Validated 'datasets/bsd_ja_en/README.md'
DEBUG:root:✅️ Validated 'datasets/bswac/README.md'
WARNING:root:❌ Failed to validate 'datasets/c3/README.md':
1 validation error for DatasetMetadata
licenses
'other-non-commercial-research' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:⁉️ Something unexpected happened on 'datasets/c4/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/c4/README.md'
DEBUG:root:✅️ Validated 'datasets/cail2018/README.md'
WARNING:root:❌ Failed to validate 'datasets/caner/README.md':
1 validation error for DatasetMetadata
licenses
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
DEBUG:root:✅️ Validated 'datasets/capes/README.md'
WARNING:root:❌ Failed to validate 'datasets/catalonia_independence/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'text-classification-other-stance-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/cawac/README.md'
WARNING:root:❌ Failed to validate 'datasets/cbt/README.md':
3 validation errors for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/cc100/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
DEBUG:root:✅️ Validated 'datasets/cc_news/README.md'
WARNING:root:❌ Failed to validate 'datasets/ccaligned_multilingual/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
DEBUG:root:✅️ Validated 'datasets/cdsc/README.md'
DEBUG:root:✅️ Validated 'datasets/cdt/README.md'
WARNING:root:❌ Failed to validate 'datasets/cfq/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/chr_en/README.md':
7 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
languages
none is not an allowed value (type=type_error.none.not_allowed)
licenses
'other-different-license-per-source' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
multilinguality
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/cifar10/README.md':
2 validation errors for DatasetMetadata
languages
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
multilinguality
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/cifar100/README.md':
2 validation errors for DatasetMetadata
languages
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
multilinguality
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/circa/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-question-answer-pair-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/civil_comments/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/clickbait_news_bg/README.md'
DEBUG:root:✅️ Validated 'datasets/climate_fever/README.md'
DEBUG:root:✅️ Validated 'datasets/clinc_oos/README.md'
WARNING:root:❌ Failed to validate 'datasets/clue/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/cmrc2018/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/cnn_dailymail/README.md'
DEBUG:root:✅️ Validated 'datasets/coached_conv_pref/README.md'
WARNING:root:❌ Failed to validate 'datasets/coarse_discourse/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/codah/README.md'
WARNING:root:❌ Failed to validate 'datasets/code_search_net/README.md':
1 validation error for DatasetMetadata
licenses
'other-several-licenses' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/com_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/common_gen/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/common_voice/README.md'
WARNING:root:❌ Failed to validate 'datasets/commonsense_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/compguesswhat/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/conceptnet5/README.md'
WARNING:root:❌ Failed to validate 'datasets/conll2000/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/conll2002/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/conll2003/README.md'
DEBUG:root:✅️ Validated 'datasets/conllpp/README.md'
WARNING:root:❌ Failed to validate 'datasets/conv_ai/README.md':
1 validation error for DatasetMetadata
task_ids
'text-scoring-other-evaluating-dialogue-systems' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/conv_ai_2/README.md':
1 validation error for DatasetMetadata
task_ids
'text-scoring-other-evaluating-dialogue-systems' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/conv_ai_3/README.md':
1 validation error for DatasetMetadata
task_ids
'text-scoring-other-evaluating-dialogue-systems' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/coqa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/cord19/README.md':
1 validation error for DatasetMetadata
licenses
'other-cc0' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/cornell_movie_dialog/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/cos_e/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/cosmos_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/counter/README.md'
DEBUG:root:✅️ Validated 'datasets/covid_qa_castorini/README.md'
DEBUG:root:✅️ Validated 'datasets/covid_qa_deepset/README.md'
WARNING:root:❌ Failed to validate 'datasets/covid_qa_ucsd/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/covid_tweets_japanese/README.md'
DEBUG:root:✅️ Validated 'datasets/covost2/README.md'
DEBUG:root:✅️ Validated 'datasets/craigslist_bargains/README.md'
DEBUG:root:✅️ Validated 'datasets/crawl_domain/README.md'
WARNING:root:❌ Failed to validate 'datasets/crd3/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/crime_and_punish/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/crows_pairs/README.md':
1 validation error for DatasetMetadata
task_ids
'text-scoring-other-bias-evaluation' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/cryptonite/README.md'
DEBUG:root:✅️ Validated 'datasets/cs_restaurants/README.md'
WARNING:root:⁉️ Something unexpected happened on 'datasets/csv/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/csv/README.md'
WARNING:root:❌ Failed to validate 'datasets/curiosity_dialogs/README.md':
1 validation error for DatasetMetadata
task_ids
'sequence-modeling-other-conversational-curiosity' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/daily_dialog/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/dane/README.md':
1 validation error for DatasetMetadata
source_datasets
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
DEBUG:root:✅️ Validated 'datasets/danish_political_comments/README.md'
WARNING:root:❌ Failed to validate 'datasets/dart/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-rdf-to-text' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/datacommons_factcheck/README.md':
1 validation error for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/dbpedia_14/README.md'
DEBUG:root:✅️ Validated 'datasets/dbrd/README.md'
WARNING:root:❌ Failed to validate 'datasets/deal_or_no_dialog/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-dialogue-generation' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/definite_pronoun_resolution/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/dengue_filipino/README.md'
WARNING:root:❌ Failed to validate 'datasets/dialog_re/README.md':
1 validation error for DatasetMetadata
licenses
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
DEBUG:root:✅️ Validated 'datasets/diplomacy_detection/README.md'
DEBUG:root:✅️ Validated 'datasets/disaster_response_messages/README.md'
WARNING:root:❌ Failed to validate 'datasets/discofuse/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/discovery/README.md':
2 validation errors for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'text-classification-other-discourse-marker-prediction' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/doc2dial/README.md'
WARNING:root:❌ Failed to validate 'datasets/docred/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/doqa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/dream/README.md'
WARNING:root:❌ Failed to validate 'datasets/drop/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/duorc/README.md':
1 validation error for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/dutch_social/README.md':
1 validation error for DatasetMetadata
size_categories
'100K< n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/dyk/README.md'
WARNING:root:❌ Failed to validate 'datasets/e2e_nlg/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-meaning-representtion-to-text' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/e2e_nlg_cleaned/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-meaning-representtion-to-text' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/ecb/README.md'
WARNING:root:❌ Failed to validate 'datasets/ehealth_kd/README.md':
1 validation error for DatasetMetadata
task_ids
'structure-prediction-other-relation-prediction' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/eitb_parcc/README.md'
DEBUG:root:✅️ Validated 'datasets/eli5/README.md'
DEBUG:root:✅️ Validated 'datasets/emea/README.md'
WARNING:root:❌ Failed to validate 'datasets/emo/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/emotion/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/emotone_ar/README.md':
2 validation errors for DatasetMetadata
size_categories
'1k<n<10k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'emotion-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/empathetic_dialogues/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/enriched_web_nlg/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/eraser_multi_rc/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/esnli/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/eth_py150_open/README.md':
1 validation error for DatasetMetadata
annotations_creators
'no-annotations' is not a registered tag for 'annotations', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/creators.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/ethos/README.md':
2 validation errors for DatasetMetadata
language_creators
'found, other' is not a registered tag for 'annotations', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/creators.json (type=value_error)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/euronews/README.md'
DEBUG:root:✅️ Validated 'datasets/europa_eac_tm/README.md'
DEBUG:root:✅️ Validated 'datasets/europa_ecdc_tm/README.md'
DEBUG:root:✅️ Validated 'datasets/europarl_bilingual/README.md'
WARNING:root:❌ Failed to validate 'datasets/event2Mind/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/evidence_infer_treatment/README.md'
WARNING:root:❌ Failed to validate 'datasets/exams/README.md':
3 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
multilinguality
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/factckbr/README.md'
DEBUG:root:✅️ Validated 'datasets/fake_news_english/README.md'
DEBUG:root:✅️ Validated 'datasets/fake_news_filipino/README.md'
DEBUG:root:✅️ Validated 'datasets/farsi_news/README.md'
WARNING:root:❌ Failed to validate 'datasets/fashion_mnist/README.md':
3 validation errors for DatasetMetadata
language_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
languages
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
multilinguality
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/fever/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/few_rel/README.md':
1 validation error for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/financial_phrasebank/README.md'
DEBUG:root:✅️ Validated 'datasets/finer/README.md'
WARNING:root:❌ Failed to validate 'datasets/flores/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/flue/README.md':
1 validation error for DatasetMetadata
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/fquad/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/freebase_qa/README.md'
WARNING:root:❌ Failed to validate 'datasets/gap/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/gem/README.md':
9 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
language_creators
none is not an allowed value (type=type_error.none.not_allowed)
languages
none is not an allowed value (type=type_error.none.not_allowed)
licenses
'other-research-only' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
multilinguality
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/generated_reviews_enth/README.md'
DEBUG:root:✅️ Validated 'datasets/generics_kb/README.md'
DEBUG:root:✅️ Validated 'datasets/german_legal_entity_recognition/README.md'
WARNING:root:❌ Failed to validate 'datasets/germaner/README.md':
3 validation errors for DatasetMetadata
annotations_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
language_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
licenses
'other-ASL 2.0' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/germeval_14/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/giga_fren/README.md'
WARNING:root:❌ Failed to validate 'datasets/gigaword/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/glucose/README.md':
1 validation error for DatasetMetadata
task_ids
'sequence-modeling-other-common-sense-inference' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/glue/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/gnad10/README.md'
WARNING:root:❌ Failed to validate 'datasets/go_emotions/README.md':
2 validation errors for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'text-classification-other-emotion' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/google_wellformed_query/README.md':
2 validation errors for DatasetMetadata
language_creators
field required (type=value_error.missing)
licenses
'CC-BY-SA-4.0' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/grail_qa/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-knowledge-base-qa' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/great_code/README.md':
1 validation error for DatasetMetadata
size_categories
'1M<n<5M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/guardian_authorship/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/gutenberg_time/README.md'
WARNING:root:❌ Failed to validate 'datasets/hans/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/hansards/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/hard/README.md':
1 validation error for DatasetMetadata
size_categories
'10k<n<100k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/harem/README.md'
WARNING:root:❌ Failed to validate 'datasets/has_part/README.md':
1 validation error for DatasetMetadata
task_ids
'text-scoring-other-Meronym-Prediction' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/hate_offensive/README.md':
2 validation errors for DatasetMetadata
size_categories
'10k<n<100k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-hate-speech-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/hate_speech18/README.md'
WARNING:root:❌ Failed to validate 'datasets/hate_speech_filipino/README.md':
1 validation error for DatasetMetadata
task_ids
'sentiment-analysis' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/hate_speech_offensive/README.md':
2 validation errors for DatasetMetadata
size_categories
'10k<n<100K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-hate-speech-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/hate_speech_pl/README.md'
WARNING:root:❌ Failed to validate 'datasets/hate_speech_portuguese/README.md':
2 validation errors for DatasetMetadata
size_categories
'1k<n<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-hate-speech-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/hatexplain/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-hate-speech-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/hausa_voa_ner/README.md'
DEBUG:root:✅️ Validated 'datasets/hausa_voa_topics/README.md'
DEBUG:root:✅️ Validated 'datasets/hda_nli_hindi/README.md'
WARNING:root:❌ Failed to validate 'datasets/head_qa/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/health_fact/README.md'
DEBUG:root:✅️ Validated 'datasets/hebrew_projectbenyehuda/README.md'
DEBUG:root:✅️ Validated 'datasets/hebrew_sentiment/README.md'
WARNING:root:❌ Failed to validate 'datasets/hebrew_this_world/README.md':
1 validation error for DatasetMetadata
licenses
'gpl' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/hellaswag/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/hind_encorp/README.md'
WARNING:root:❌ Failed to validate 'datasets/hindi_discourse/README.md':
2 validation errors for DatasetMetadata
licenses
'other-MIDAS-LAB-IIITD-Delhi' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
task_ids
'sequence-modeling-other-discourse-analysis' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/hippocorpus/README.md':
2 validation errors for DatasetMetadata
licenses
'other-my-license' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
task_ids
'text-scoring-other-narrative-flow' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/hkcancor/README.md'
WARNING:root:❌ Failed to validate 'datasets/hope_edi/README.md':
4 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
multilinguality
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'text-classification-other-hope-speech-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/hotpot_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/hover/README.md'
DEBUG:root:✅️ Validated 'datasets/hrenwac_para/README.md'
DEBUG:root:✅️ Validated 'datasets/hrwac/README.md'
WARNING:root:❌ Failed to validate 'datasets/humicroedit/README.md':
2 validation errors for DatasetMetadata
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/hybrid_qa/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-multihop-tabular-text-qa' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/hyperpartisan_news_detection/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/iapp_wiki_qa_squad/README.md'
WARNING:root:❌ Failed to validate 'datasets/id_clickbait/README.md':
1 validation error for DatasetMetadata
size_categories
'10k>n>100k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/id_liputan6/README.md'
WARNING:root:❌ Failed to validate 'datasets/id_nergrit_corpus/README.md':
1 validation error for DatasetMetadata
licenses
'other-nergrit-license' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/id_newspapers_2018/README.md'
DEBUG:root:✅️ Validated 'datasets/id_panl_bppt/README.md'
DEBUG:root:✅️ Validated 'datasets/id_puisi/README.md'
DEBUG:root:✅️ Validated 'datasets/igbo_english_machine_translation/README.md'
WARNING:root:❌ Failed to validate 'datasets/igbo_monolingual/README.md':
1 validation error for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/igbo_ner/README.md'
WARNING:root:❌ Failed to validate 'datasets/ilist/README.md':
3 validation errors for DatasetMetadata
annotations_creators
'unknown' is not a registered tag for 'annotations', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/creators.json (type=value_error)
language_creators
field required (type=value_error.missing)
task_ids
'text-classification-other-language-identification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/imdb/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/imdb_urdu_reviews/README.md'
DEBUG:root:✅️ Validated 'datasets/imppres/README.md'
WARNING:root:❌ Failed to validate 'datasets/indic_glue/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/indonlu/README.md':
3 validation errors for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/inquisitive_qg/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-question-generation' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/interpress_news_category_tr/README.md':
2 validation errors for DatasetMetadata
size_categories
'100k<n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-news-category-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/interpress_news_category_tr_lite/README.md':
2 validation errors for DatasetMetadata
size_categories
'100k<n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-news-category-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/irc_disentangle/README.md'
WARNING:root:❌ Failed to validate 'datasets/isixhosa_ner_corpus/README.md':
1 validation error for DatasetMetadata
licenses
'other-Creative Commons Attribution 2.5 South Africa License' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/isizulu_ner_corpus/README.md':
1 validation error for DatasetMetadata
licenses
'other-Creative Commons Attribution 2.5 South Africa' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/iwslt2017/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/jeopardy/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/jfleg/README.md':
2 validation errors for DatasetMetadata
multilinguality
'other-language-learner' is not a registered tag for 'multilinguality', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'conditional-text-generation-other-grammatical-error-correction' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/jigsaw_toxicity_pred/README.md'
DEBUG:root:✅️ Validated 'datasets/jnlpba/README.md'
WARNING:root:❌ Failed to validate 'datasets/journalists_questions/README.md':
2 validation errors for DatasetMetadata
size_categories
'1k<n<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-question-identification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:⁉️ Something unexpected happened on 'datasets/json/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/json/README.md'
DEBUG:root:✅️ Validated 'datasets/kannada_news/README.md'
DEBUG:root:✅️ Validated 'datasets/kd_conv/README.md'
DEBUG:root:✅️ Validated 'datasets/kde4/README.md'
DEBUG:root:✅️ Validated 'datasets/kelm/README.md'
WARNING:root:❌ Failed to validate 'datasets/kilt_tasks/README.md':
6 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
language_creators
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/kilt_wikipedia/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/kinnews_kirnews/README.md':
1 validation error for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/kor_3i4k/README.md'
DEBUG:root:✅️ Validated 'datasets/kor_hate/README.md'
DEBUG:root:✅️ Validated 'datasets/kor_ner/README.md'
WARNING:root:❌ Failed to validate 'datasets/kor_nli/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/kor_nlu/README.md'
DEBUG:root:✅️ Validated 'datasets/kor_qpair/README.md'
DEBUG:root:✅️ Validated 'datasets/kor_sae/README.md'
WARNING:root:❌ Failed to validate 'datasets/kor_sarcasm/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-sarcasm-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/labr/README.md':
1 validation error for DatasetMetadata
size_categories
'10k<n<100k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/lama/README.md':
1 validation error for DatasetMetadata
task_ids
'text-scoring-other-probing' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/lambada/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-long-range-dependency' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/large_spanish_corpus/README.md':
2 validation errors for DatasetMetadata
licenses
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/laroseda/README.md':
1 validation error for DatasetMetadata
size_categories
'n=15K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/lc_quad/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/lener_br/README.md'
WARNING:root:❌ Failed to validate 'datasets/liar/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-fake-news-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/librispeech_asr/README.md'
WARNING:root:❌ Failed to validate 'datasets/librispeech_lm/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/limit/README.md'
WARNING:root:❌ Failed to validate 'datasets/lince/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/linnaeus/README.md'
DEBUG:root:✅️ Validated 'datasets/liveqa/README.md'
WARNING:root:❌ Failed to validate 'datasets/lj_speech/README.md':
1 validation error for DatasetMetadata
licenses
'other-public-domain' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:⁉️ Something unexpected happened on 'datasets/lm1b/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/lm1b/README.md'
WARNING:root:❌ Failed to validate 'datasets/lst20/README.md':
3 validation errors for DatasetMetadata
licenses
'other-aiforthai' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
size_categories
'100k<n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'structure-prediction-other-clause-segmentation' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/m_lama/README.md':
2 validation errors for DatasetMetadata
size_categories
'1M>n>100K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-scoring-other-probing' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/mac_morpho/README.md'
WARNING:root:❌ Failed to validate 'datasets/makhzan/README.md':
1 validation error for DatasetMetadata
licenses
'other-my-license' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/math_dataset/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/math_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/matinf/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/mc_taco/README.md'
WARNING:root:❌ Failed to validate 'datasets/md_gender_bias/README.md':
5 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
language_creators
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'text-classification-other-gender-bias' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/mdd/README.md':
1 validation error for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/med_hop/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-multi-hop' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/medal/README.md'
WARNING:root:❌ Failed to validate 'datasets/medical_dialog/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
'n>1K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/medical_questions_pairs/README.md'
DEBUG:root:✅️ Validated 'datasets/menyo20k_mt/README.md'
WARNING:root:❌ Failed to validate 'datasets/meta_woz/README.md':
1 validation error for DatasetMetadata
licenses
'Microsoft Research Data License Agreement' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/metooma/README.md':
1 validation error for DatasetMetadata
licenses
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/metrec/README.md':
2 validation errors for DatasetMetadata
size_categories
'10k<n<100k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-poetry-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/miam/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/mkb/README.md':
1 validation error for DatasetMetadata
language_creators
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/mkqa/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/mlqa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/mlsum/README.md':
2 validation errors for DatasetMetadata
licenses
'other-research-only' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
size_categories
'1M<n<5M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/mnist/README.md':
4 validation errors for DatasetMetadata
annotations_creators
'experts' is not a registered tag for 'annotations', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/creators.json (type=value_error)
languages
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
licenses
'MIT' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
multilinguality
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/mocha/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-generative-reading-comprehension-metric' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/moroco/README.md'
WARNING:root:❌ Failed to validate 'datasets/movie_rationales/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/mrqa/README.md'
WARNING:root:❌ Failed to validate 'datasets/ms_marco/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/ms_terms/README.md':
1 validation error for DatasetMetadata
languages
'other-bn-india' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/msr_genomics_kbcomp/README.md':
1 validation error for DatasetMetadata
licenses
'other-my-license' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/msr_sqa/README.md'
WARNING:root:❌ Failed to validate 'datasets/msr_text_compression/README.md':
1 validation error for DatasetMetadata
licenses
'other-Microsoft Research Data License' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/msr_zhen_translation_parity/README.md'
DEBUG:root:✅️ Validated 'datasets/msra_ner/README.md'
DEBUG:root:✅️ Validated 'datasets/mt_eng_vietnamese/README.md'
WARNING:root:❌ Failed to validate 'datasets/muchocine/README.md':
1 validation error for DatasetMetadata
licenses
'cc-by-2.1' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/multi_booked/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/multi_news/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/multi_nli/README.md':
1 validation error for DatasetMetadata
licenses
'other-Open Portion of the American National Corpus' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/multi_nli_mismatch/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/multi_para_crawl/README.md'
WARNING:root:❌ Failed to validate 'datasets/multi_re_qa/README.md':
2 validation errors for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/multi_woz_v22/README.md'
DEBUG:root:✅️ Validated 'datasets/multi_x_science_sum/README.md'
DEBUG:root:✅️ Validated 'datasets/mutual_friends/README.md'
WARNING:root:❌ Failed to validate 'datasets/mwsc/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/myanmar_news/README.md':
2 validation errors for DatasetMetadata
annotations_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
language_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
DEBUG:root:✅️ Validated 'datasets/narrativeqa/README.md'
DEBUG:root:✅️ Validated 'datasets/narrativeqa_manual/README.md'
WARNING:root:❌ Failed to validate 'datasets/natural_questions/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/ncbi_disease/README.md'
DEBUG:root:✅️ Validated 'datasets/nchlt/README.md'
DEBUG:root:✅️ Validated 'datasets/ncslgr/README.md'
DEBUG:root:✅️ Validated 'datasets/nell/README.md'
WARNING:root:❌ Failed to validate 'datasets/neural_code_search/README.md':
2 validation errors for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/news_commentary/README.md'
WARNING:root:❌ Failed to validate 'datasets/newsgroup/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/newsph/README.md'
DEBUG:root:✅️ Validated 'datasets/newsph_nli/README.md'
WARNING:root:❌ Failed to validate 'datasets/newspop/README.md':
1 validation error for DatasetMetadata
licenses
'cc-by-4' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/newsqa/README.md':
1 validation error for DatasetMetadata
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/newsroom/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/nkjp-ner/README.md'
WARNING:root:❌ Failed to validate 'datasets/nli_tr/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/norec/README.md'
WARNING:root:❌ Failed to validate 'datasets/norwegian_ner/README.md':
2 validation errors for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
licenses
'unknown-' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/nq_open/README.md'
DEBUG:root:✅️ Validated 'datasets/nsmc/README.md'
DEBUG:root:✅️ Validated 'datasets/numer_sense/README.md'
WARNING:root:❌ Failed to validate 'datasets/numeric_fused_head/README.md':
3 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'structure-prediction-other-fused-head-identification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/oclar/README.md'
WARNING:root:❌ Failed to validate 'datasets/offcombr/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-hate-speech-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/offenseval2020_tr/README.md':
3 validation errors for DatasetMetadata
licenses
'found' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
size_categories
'10k<n<100k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-offensive-language-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/offenseval_dravidian/README.md':
3 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'text-classification-other-offensive-language' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/ofis_publik/README.md'
WARNING:root:❌ Failed to validate 'datasets/ohsumed/README.md':
2 validation errors for DatasetMetadata
annotations_creators
'human-annotated' is not a registered tag for 'annotations', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/creators.json (type=value_error)
size_categories
'100k< n<500K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/ollie/README.md':
1 validation error for DatasetMetadata
licenses
'other-university-of-washington-academic' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/omp/README.md'
DEBUG:root:✅️ Validated 'datasets/onestop_english/README.md'
WARNING:root:❌ Failed to validate 'datasets/open_subtitles/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/openbookqa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/openwebtext/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/opinosis/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/opus100/README.md':
3 validation errors for DatasetMetadata
language_creators
field required (type=value_error.missing)
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
'10K<n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/opus_books/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
DEBUG:root:✅️ Validated 'datasets/opus_dgt/README.md'
DEBUG:root:✅️ Validated 'datasets/opus_dogc/README.md'
WARNING:root:❌ Failed to validate 'datasets/opus_elhuyar/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/opus_euconst/README.md'
WARNING:root:❌ Failed to validate 'datasets/opus_finlex/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/opus_fiskmo/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<100K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/opus_gnome/README.md'
WARNING:root:❌ Failed to validate 'datasets/opus_infopankki/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/opus_memat/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/opus_montenegrinsubs/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/opus_openoffice/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/opus_paracrawl/README.md'
WARNING:root:❌ Failed to validate 'datasets/opus_rf/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/opus_tedtalks/README.md'
DEBUG:root:✅️ Validated 'datasets/opus_ubuntu/README.md'
DEBUG:root:✅️ Validated 'datasets/opus_wikipedia/README.md'
WARNING:root:❌ Failed to validate 'datasets/opus_xhosanavy/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/orange_sum/README.md'
WARNING:root:❌ Failed to validate 'datasets/oscar/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
WARNING:root:⁉️ Something unexpected happened on 'datasets/pandas/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/pandas/README.md'
WARNING:root:❌ Failed to validate 'datasets/para_crawl/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/para_pat/README.md'
DEBUG:root:✅️ Validated 'datasets/parsinlu_reading_comprehension/README.md'
WARNING:root:❌ Failed to validate 'datasets/paws/README.md':
3 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'text-scoring-other-paraphrase-identification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/paws-x/README.md':
1 validation error for DatasetMetadata
task_ids
'text-scoring-other-paraphrase-identification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/pec/README.md':
1 validation error for DatasetMetadata
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/peer_read/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-acceptability-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/peoples_daily_ner/README.md'
DEBUG:root:✅️ Validated 'datasets/per_sent/README.md'
DEBUG:root:✅️ Validated 'datasets/persian_ner/README.md'
WARNING:root:❌ Failed to validate 'datasets/pg19/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/php/README.md'
WARNING:root:❌ Failed to validate 'datasets/piaf/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/pib/README.md'
DEBUG:root:✅️ Validated 'datasets/piqa/README.md'
DEBUG:root:✅️ Validated 'datasets/pn_summary/README.md'
DEBUG:root:✅️ Validated 'datasets/poem_sentiment/README.md'
DEBUG:root:✅️ Validated 'datasets/polemo2/README.md'
DEBUG:root:✅️ Validated 'datasets/poleval2019_cyberbullying/README.md'
DEBUG:root:✅️ Validated 'datasets/poleval2019_mt/README.md'
DEBUG:root:✅️ Validated 'datasets/polsum/README.md'
WARNING:root:❌ Failed to validate 'datasets/polyglot_ner/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/prachathai67k/README.md'
WARNING:root:❌ Failed to validate 'datasets/pragmeval/README.md':
5 validation errors for DatasetMetadata
annotations_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
language_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
licenses
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
DEBUG:root:✅️ Validated 'datasets/proto_qa/README.md'
DEBUG:root:✅️ Validated 'datasets/psc/README.md'
WARNING:root:❌ Failed to validate 'datasets/ptb_text_only/README.md':
1 validation error for DatasetMetadata
licenses
'other-LDC User Agreement for Non-Members' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/pubmed/README.md':
2 validation errors for DatasetMetadata
licenses
'other-nlm-license' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
task_ids
'text-scoring-other-citation-estimation' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/pubmed_qa/README.md':
1 validation error for DatasetMetadata
size_categories
'1K<n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/py_ast/README.md':
1 validation error for DatasetMetadata
task_ids
'sequence-modeling-code-modeling' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/qa4mre/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/qa_srl/README.md'
WARNING:root:❌ Failed to validate 'datasets/qa_zre/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/qangaroo/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/qanta/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/qasc/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/qed/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-explanations-in-question-answering' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/qed_amara/README.md'
DEBUG:root:✅️ Validated 'datasets/quac/README.md'
WARNING:root:❌ Failed to validate 'datasets/quail/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/quarel/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/quartz/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/quora/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/quoref/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/race/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/re_dial/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-dialogue-sentiment-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/reasoning_bg/README.md'
DEBUG:root:✅️ Validated 'datasets/recipe_nlg/README.md'
WARNING:root:⁉️ Something unexpected happened on 'datasets/reclor/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/reclor/README.md'
WARNING:root:❌ Failed to validate 'datasets/reddit/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/reddit_tifu/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/refresd/README.md':
1 validation error for DatasetMetadata
source_datasets
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/reuters21578/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/ro_sent/README.md'
WARNING:root:❌ Failed to validate 'datasets/ro_sts/README.md':
1 validation error for DatasetMetadata
annotations_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/ro_sts_parallel/README.md':
1 validation error for DatasetMetadata
annotations_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:⁉️ Something unexpected happened on 'datasets/roman_urdu/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/roman_urdu/README.md'
DEBUG:root:✅️ Validated 'datasets/ronec/README.md'
DEBUG:root:✅️ Validated 'datasets/ropes/README.md'
WARNING:root:❌ Failed to validate 'datasets/rotten_tomatoes/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/s2orc/README.md'
DEBUG:root:✅️ Validated 'datasets/samsum/README.md'
WARNING:root:❌ Failed to validate 'datasets/sanskrit_classic/README.md':
1 validation error for DatasetMetadata
licenses
'other-Public Domain Mark 1.0' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/saudinewsnet/README.md'
WARNING:root:❌ Failed to validate 'datasets/scan/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/scb_mt_enth_2020/README.md'
DEBUG:root:✅️ Validated 'datasets/schema_guided_dstc8/README.md'
WARNING:root:❌ Failed to validate 'datasets/scicite/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/scielo/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/scientific_papers/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/scifact/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/sciq/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/scitail/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/scitldr/README.md'
WARNING:root:❌ Failed to validate 'datasets/search_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/selqa/README.md'
WARNING:root:❌ Failed to validate 'datasets/sem_eval_2010_task_8/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/sem_eval_2014_task_1/README.md'
WARNING:root:❌ Failed to validate 'datasets/sem_eval_2020_task_11/README.md':
2 validation errors for DatasetMetadata
language_creators
'original' is not a registered tag for 'annotations', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/creators.json (type=value_error)
task_ids
'text-classification-other-propaganda-technique-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/sent_comp/README.md'
WARNING:root:❌ Failed to validate 'datasets/senti_lex/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/senti_ws/README.md':
1 validation error for DatasetMetadata
task_ids
'structure-prediction-other-pos-tagging' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/sentiment140/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/sepedi_ner/README.md':
1 validation error for DatasetMetadata
licenses
'other-Creative Commons Attribution 2.5 South Africa License' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/sesotho_ner_corpus/README.md':
1 validation error for DatasetMetadata
licenses
'other-Creative Commons Attribution 2.5 South Africa License' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/setimes/README.md'
WARNING:root:❌ Failed to validate 'datasets/setswana_ner_corpus/README.md':
1 validation error for DatasetMetadata
licenses
'other-Creative Commons Attribution 2.5 South Africa License' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/sharc/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-conversational-qa' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/sharc_modified/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-conversational-qa' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/sick/README.md':
2 validation errors for DatasetMetadata
licenses
'CC-BY-NC-SA-3.0' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
size_categories
'1k<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/silicone/README.md':
1 validation error for DatasetMetadata
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/simple_questions_v2/README.md':
1 validation error for DatasetMetadata
source_datasets
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/siswati_ner_corpus/README.md':
1 validation error for DatasetMetadata
licenses
'other-Creative Commons Attribution 2.5 South Africa License' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/smartdata/README.md'
DEBUG:root:✅️ Validated 'datasets/sms_spam/README.md'
DEBUG:root:✅️ Validated 'datasets/snips_built_in_intents/README.md'
DEBUG:root:✅️ Validated 'datasets/snli/README.md'
DEBUG:root:✅️ Validated 'datasets/snow_simplified_japanese_corpus/README.md'
DEBUG:root:✅️ Validated 'datasets/so_stacksample/README.md'
WARNING:root:❌ Failed to validate 'datasets/social_bias_frames/README.md':
1 validation error for DatasetMetadata
task_ids
'hate-speech-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/social_i_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/sofc_materials_articles/README.md'
WARNING:root:❌ Failed to validate 'datasets/sogou_news/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/spanish_billion_words/README.md':
1 validation error for DatasetMetadata
languages
'esXXX_CI_SHOULD_FAIL_HERE' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/spc/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/species_800/README.md'
WARNING:root:❌ Failed to validate 'datasets/spider/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-stuctured-to-text' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/squad/README.md'
DEBUG:root:✅️ Validated 'datasets/squad_adversarial/README.md'
WARNING:root:❌ Failed to validate 'datasets/squad_es/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/squad_it/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/squad_kor_v1/README.md'
DEBUG:root:✅️ Validated 'datasets/squad_kor_v2/README.md'
WARNING:root:❌ Failed to validate 'datasets/squad_v1_pt/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/squad_v2/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/squadshifts/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/srwac/README.md'
WARNING:root:❌ Failed to validate 'datasets/sst/README.md':
3 validation errors for DatasetMetadata
licenses
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/stereoset/README.md':
1 validation error for DatasetMetadata
task_ids
'text-classification-other-stereotype-detection' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/stsb_mt_sv/README.md'
WARNING:root:❌ Failed to validate 'datasets/style_change_detection/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/super_glue/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/swag/README.md'
DEBUG:root:✅️ Validated 'datasets/swahili/README.md'
DEBUG:root:✅️ Validated 'datasets/swahili_news/README.md'
DEBUG:root:✅️ Validated 'datasets/swda/README.md'
DEBUG:root:✅️ Validated 'datasets/swedish_ner_corpus/README.md'
DEBUG:root:✅️ Validated 'datasets/swedish_reviews/README.md'
DEBUG:root:✅️ Validated 'datasets/tab_fact/README.md'
DEBUG:root:✅️ Validated 'datasets/tamilmixsentiment/README.md'
WARNING:root:❌ Failed to validate 'datasets/tanzil/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/tapaco/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'conditional-text-generation-other-given-a-sentence-generate-a-paraphrase-either-in-same-language-or-another-language' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/tashkeela/README.md'
DEBUG:root:✅️ Validated 'datasets/taskmaster1/README.md'
DEBUG:root:✅️ Validated 'datasets/taskmaster2/README.md'
DEBUG:root:✅️ Validated 'datasets/taskmaster3/README.md'
DEBUG:root:✅️ Validated 'datasets/tatoeba/README.md'
WARNING:root:❌ Failed to validate 'datasets/ted_hrlr/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/ted_iwlst2013/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/ted_multi/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/ted_talks_iwslt/README.md':
1 validation error for DatasetMetadata
size_categories
'100K< n<500k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/telugu_books/README.md'
DEBUG:root:✅️ Validated 'datasets/telugu_news/README.md'
WARNING:root:❌ Failed to validate 'datasets/tep_en_fa_para/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<10K' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:⁉️ Something unexpected happened on 'datasets/text/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/text/README.md'
DEBUG:root:✅️ Validated 'datasets/thai_toxicity_tweet/README.md'
WARNING:root:❌ Failed to validate 'datasets/thainer/README.md':
1 validation error for DatasetMetadata
size_categories
'1k<n<10k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/thaiqa_squad/README.md'
DEBUG:root:✅️ Validated 'datasets/thaisum/README.md'
WARNING:root:❌ Failed to validate 'datasets/tilde_model/README.md':
1 validation error for DatasetMetadata
languages
'False' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
DEBUG:root:✅️ Validated 'datasets/times_of_india_news_headlines/README.md'
WARNING:root:❌ Failed to validate 'datasets/timit_asr/README.md':
1 validation error for DatasetMetadata
licenses
'other-LDC-User-Agreement-for-Non-Members' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/tiny_shakespeare/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/tlc/README.md'
WARNING:root:❌ Failed to validate 'datasets/tmu_gfm_dataset/README.md':
1 validation error for DatasetMetadata
task_ids
'conditional-text-generation-other-grammatical-error-correction' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/totto/README.md'
WARNING:root:❌ Failed to validate 'datasets/trec/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/trivia_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/tsac/README.md'
WARNING:root:❌ Failed to validate 'datasets/ttc4900/README.md':
2 validation errors for DatasetMetadata
size_categories
'100k<n<1M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'text-classification-other-news-category-classification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/tunizi/README.md'
DEBUG:root:✅️ Validated 'datasets/tuple_ie/README.md'
WARNING:root:❌ Failed to validate 'datasets/turk/README.md':
1 validation error for DatasetMetadata
licenses
'gnu-gpl-v3.0' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/turkish_movie_sentiment/README.md'
DEBUG:root:✅️ Validated 'datasets/turkish_ner/README.md'
DEBUG:root:✅️ Validated 'datasets/turkish_product_reviews/README.md'
DEBUG:root:✅️ Validated 'datasets/turkish_shrinked_ner/README.md'
DEBUG:root:✅️ Validated 'datasets/turku_ner_corpus/README.md'
WARNING:root:❌ Failed to validate 'datasets/tweet_eval/README.md':
4 validation errors for DatasetMetadata
annotations_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/tweet_qa/README.md'
DEBUG:root:✅️ Validated 'datasets/tweets_ar_en_parallel/README.md'
DEBUG:root:✅️ Validated 'datasets/tweets_hate_speech_detection/README.md'
DEBUG:root:✅️ Validated 'datasets/twi_text_c3/README.md'
WARNING:root:❌ Failed to validate 'datasets/twi_wordsim353/README.md':
1 validation error for DatasetMetadata
source_datasets
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/tydiqa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/ubuntu_dialogs_corpus/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/udhr/README.md'
DEBUG:root:✅️ Validated 'datasets/um005/README.md'
WARNING:root:❌ Failed to validate 'datasets/un_ga/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
'n>10M' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/un_multi/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/un_pc/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/universal_dependencies/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'constituency-parsing' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/universal_morphologies/README.md':
3 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
'structure-prediction-other-morphology' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/urdu_fake_news/README.md'
DEBUG:root:✅️ Validated 'datasets/urdu_sentiment_corpus/README.md'
WARNING:root:❌ Failed to validate 'datasets/web_nlg/README.md':
3 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/web_of_science/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/web_questions/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/weibo_ner/README.md'
WARNING:root:❌ Failed to validate 'datasets/wi_locness/README.md':
2 validation errors for DatasetMetadata
multilinguality
'other-language-learner' is not a registered tag for 'multilinguality', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
task_ids
'conditional-text-generation-other-grammatical-error-correction' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/wiki40b/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/wiki_asp/README.md'
WARNING:root:❌ Failed to validate 'datasets/wiki_atomic_edits/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/wiki_auto/README.md':
1 validation error for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/wiki_bio/README.md'
WARNING:root:❌ Failed to validate 'datasets/wiki_dpr/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wiki_hop/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-multi-hop' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
WARNING:root:❌ Failed to validate 'datasets/wiki_lingua/README.md':
2 validation errors for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/wiki_movies/README.md'
WARNING:root:❌ Failed to validate 'datasets/wiki_qa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/wiki_qa_ar/README.md'
WARNING:root:❌ Failed to validate 'datasets/wiki_snippets/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/wiki_source/README.md'
WARNING:root:❌ Failed to validate 'datasets/wiki_split/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/wiki_summary/README.md'
WARNING:root:❌ Failed to validate 'datasets/wikiann/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/wikicorpus/README.md':
5 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
languages
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:⁉️ Something unexpected happened on 'datasets/wikihow/README.md':
did not find a yaml block in '/home/theo/sync/datasets/datasets/wikihow/README.md'
WARNING:root:❌ Failed to validate 'datasets/wikipedia/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wikisql/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wikitext/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/wikitext_tl39/README.md'
WARNING:root:❌ Failed to validate 'datasets/wili_2018/README.md':
2 validation errors for DatasetMetadata
languages
'other-roa-tara' is not recognised as a valid language code (BCP47 norm), you can refer to https://github.com/LuminosoInsight/langcodes (type=value_error)
task_ids
'text-classification-other-language-identification' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/wino_bias/README.md'
DEBUG:root:✅️ Validated 'datasets/winograd_wsc/README.md'
WARNING:root:❌ Failed to validate 'datasets/winogrande/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wiqa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wisesight1000/README.md':
1 validation error for DatasetMetadata
task_ids
'structure-prediction-other-word-tokenization' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/wisesight_sentiment/README.md'
WARNING:root:❌ Failed to validate 'datasets/wmt14/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wmt15/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wmt16/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wmt17/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wmt18/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wmt19/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wmt20_mlqe_task1/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/wmt20_mlqe_task2/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/wmt20_mlqe_task3/README.md'
WARNING:root:❌ Failed to validate 'datasets/wmt_t2t/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wnut_17/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/wongnai_reviews/README.md':
2 validation errors for DatasetMetadata
annotations_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
language_creators
ensure this value has at least 1 items (type=value_error.list.min_items; limit_value=1)
WARNING:root:❌ Failed to validate 'datasets/woz_dialogue/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
DEBUG:root:✅️ Validated 'datasets/wrbsc/README.md'
WARNING:root:❌ Failed to validate 'datasets/x_stance/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/xcopa/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/xed_en_fi/README.md'
WARNING:root:❌ Failed to validate 'datasets/xglue/README.md':
9 validation errors for DatasetMetadata
annotations_creators
none is not an allowed value (type=type_error.none.not_allowed)
language_creators
none is not an allowed value (type=type_error.none.not_allowed)
languages
none is not an allowed value (type=type_error.none.not_allowed)
licenses
none is not an allowed value (type=type_error.none.not_allowed)
multilinguality
none is not an allowed value (type=type_error.none.not_allowed)
size_categories
none is not an allowed value (type=type_error.none.not_allowed)
source_datasets
none is not an allowed value (type=type_error.none.not_allowed)
task_categories
none is not an allowed value (type=type_error.none.not_allowed)
task_ids
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/xnli/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/xor_tydi_qa/README.md'
WARNING:root:❌ Failed to validate 'datasets/xquad/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/xquad_r/README.md':
1 validation error for DatasetMetadata
languages
none is not an allowed value (type=type_error.none.not_allowed)
WARNING:root:❌ Failed to validate 'datasets/xsum/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/xsum_factuality/README.md'
WARNING:root:❌ Failed to validate 'datasets/xtreme/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
DEBUG:root:✅️ Validated 'datasets/yahoo_answers_qa/README.md'
DEBUG:root:✅️ Validated 'datasets/yahoo_answers_topics/README.md'
WARNING:root:❌ Failed to validate 'datasets/yelp_polarity/README.md':
9 validation errors for DatasetMetadata
annotations_creators
field required (type=value_error.missing)
language_creators
field required (type=value_error.missing)
languages
field required (type=value_error.missing)
licenses
field required (type=value_error.missing)
multilinguality
field required (type=value_error.missing)
size_categories
field required (type=value_error.missing)
source_datasets
field required (type=value_error.missing)
task_categories
field required (type=value_error.missing)
task_ids
field required (type=value_error.missing)
WARNING:root:❌ Failed to validate 'datasets/yelp_review_full/README.md':
1 validation error for DatasetMetadata
licenses
'other-yelp-licence' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
DEBUG:root:✅️ Validated 'datasets/yoruba_bbc_topics/README.md'
WARNING:root:❌ Failed to validate 'datasets/yoruba_gv_ner/README.md':
2 validation errors for DatasetMetadata
licenses
'Creative Commons 3.0' is not a registered tag for 'licenses', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/licenses.json (type=value_error)
size_categories
'200<n<1k' is not a registered tag for 'size_categories', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils//home/theo/sync/datasets/src/datasets/utils/metadata.py (type=value_error)
DEBUG:root:✅️ Validated 'datasets/yoruba_text_c3/README.md'
DEBUG:root:✅️ Validated 'datasets/yoruba_wordsim353/README.md'
DEBUG:root:✅️ Validated 'datasets/youtube_caption_corrections/README.md'
WARNING:root:❌ Failed to validate 'datasets/zest/README.md':
1 validation error for DatasetMetadata
task_ids
'question-answering-other-yes-no-qa' is not a registered tag for 'tasks_ids', reference at https://github.com/huggingface/datasets/tree/master/src/datasets/utils/resources/tasks.json (type=value_error)
INFO:root:❌ Failed on 378 files.
Process finished with exit code 1