import json
from pytorch_pretrained_bert import cached_path

url = "https://s3.amazonaws.com/datasets.huggingface.co/personachat/personachat_self_original.json"

# Download and load JSON dataset
personachat_file = cached_path(url)
with open(personachat_file, "r", encoding="utf-8") as f:
    dataset = json.loads(f.read())

# Tokenize and encode the dataset using our loaded GPT tokenizer
def tokenize(obj):
    if isinstance(obj, str):
        return tokenizer.convert_tokens_to_ids(tokenizer.tokenize(obj))
    if isinstance(obj, dict):
        return dict((n, tokenize(o)) for n, o in obj.items())
    return list(tokenize(o) for o in obj)

dataset = tokenize(dataset)
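To see what the recursive `tokenize` does to nested JSON, here is a minimal sketch that runs it on a tiny hand-written sample. `ToyTokenizer` is a hypothetical stand-in for the GPT tokenizer (it only splits on whitespace and hands out ids in order of first appearance), and the keys in `sample` are made up for illustration rather than taken from the real dataset.

```python
class ToyTokenizer:
    """Hypothetical stand-in for the GPT tokenizer, for illustration only."""

    def __init__(self):
        self.vocab = {}

    def tokenize(self, text):
        # Lowercase and split on whitespace (the real tokenizer uses BPE)
        return text.lower().split()

    def convert_tokens_to_ids(self, tokens):
        # Give each unseen token the next free integer id
        return [self.vocab.setdefault(tok, len(self.vocab)) for tok in tokens]


tokenizer = ToyTokenizer()


def tokenize(obj):
    # Strings are encoded; dicts and lists are walked recursively,
    # so the output keeps the same nesting as the input
    if isinstance(obj, str):
        return tokenizer.convert_tokens_to_ids(tokenizer.tokenize(obj))
    if isinstance(obj, dict):
        return dict((n, tokenize(o)) for n, o in obj.items())
    return list(tokenize(o) for o in obj)


# Made-up sample with nested dicts and lists
sample = {
    "personality": ["i like cats", "i like tea"],
    "utterances": [{"history": ["hello there"]}],
}
encoded = tokenize(sample)
print(encoded)
```

Every string is replaced by a list of ids while dict keys and list nesting are left untouched, which is why the function can be applied to the whole loaded dataset in one call.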
getting the same error
same error here too
Should be fixed now
@thomwolf the error still persists. Unable to download the json dataset due to that issue.
I fixed the error. It was an error on my end. I had to reconfigure the AWS credentials.
I am still getting the same error. Please help.
@sashank06 I am still getting the error. Can you please share how you rectified it?
this URL has worked for me
"https://s3.amazonaws.com/datasets.huggingface.co/personachat/personachat_self_original.json"
Thanks Khaled, this "https://s3.amazonaws.com/datasets.huggingface.co/personachat/personachat_self_original.json" worked for me too.
It worked for me with that url = "https://s3.amazonaws.com/datasets.huggingface.co/personachat/personachat_self_original.json"
Thanks Khaled
Hi, I am trying to download the file from the S3 bucket you have indicated in the link, but it raises an error:
NoCredentialsError: Unable to locate credentials
This happens at the function
s3_etag(url)
It seems that some kind of credentials is needed. Any help would be welcome.
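A possible workaround, sketched under the assumption that the failing call was given an `s3://` URL: as far as I can tell, `cached_path` only goes through `s3_etag` (and therefore boto3, which looks for AWS credentials) for that scheme, while the `https://s3.amazonaws.com/...` form reported as working in this thread is fetched as a plain public download. The helper names `s3_to_https` and `load_json` below are hypothetical and not part of `pytorch_pretrained_bert`.

```python
import json
import urllib.request
from urllib.parse import urlparse


def s3_to_https(url):
    """Rewrite s3://bucket/key as a path-style HTTPS URL; leave other URLs alone."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        return url
    # netloc is the bucket name, path is "/" followed by the object key
    return "https://s3.amazonaws.com/{}{}".format(parsed.netloc, parsed.path)


def load_json(url):
    """Download and parse a public JSON file without any AWS credentials."""
    with urllib.request.urlopen(s3_to_https(url)) as response:
        return json.loads(response.read().decode("utf-8"))


print(s3_to_https("s3://datasets.huggingface.co/personachat/personachat_self_original.json"))
```

This only works for publicly readable objects. Passing the rewritten HTTPS URL straight to `cached_path` should have the same effect while keeping the library's local caching.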