@mrm8488
Created April 17, 2020 03:17
Create an efficient text dataset
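import csv
import linecache
import subprocess

from torch.utils.data import Dataset


class LazyTextDataset(Dataset):
    def __init__(self, filename):
        self._filename = filename
        # Count lines once with `wc -l`; an argument list avoids shell quoting problems.
        self._total_data = int(subprocess.check_output(["wc", "-l", filename]).split()[0])

    def __getitem__(self, idx):
        # linecache line numbers are 1-based, so shift the 0-based index.
        line = linecache.getline(self._filename, idx + 1)
        # Parse the single line as one CSV record and return its fields.
        return next(csv.reader([line]))

    def __len__(self):
        return self._total_data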
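
The lookup done in `__getitem__` can be tried without PyTorch. Below is a minimal sketch, assuming a small made-up CSV written to a temporary file; `get_record` and `count_lines` are hypothetical helper names, not part of the gist, and `count_lines` is only a portable stand-in for the `wc -l` call.

```python
import csv
import linecache
import os
import tempfile

# Hypothetical sample data, written to a temporary file for the demo.
rows = ["id,text", '1,"hello, world"', "2,second row"]
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as f:
    f.write("\n".join(rows) + "\n")
    path = f.name


def get_record(filename, idx):
    """Return the CSV fields of the idx-th line (0-based), as __getitem__ does."""
    line = linecache.getline(filename, idx + 1)  # linecache counts from 1
    return next(csv.reader([line]))


def count_lines(filename):
    """Count lines in pure Python (unlike `wc -l`, this also counts a
    final line that has no trailing newline)."""
    with open(filename, "rb") as fh:
        return sum(1 for _ in fh)


first = get_record(path, 0)
second = get_record(path, 1)
total = count_lines(path)
print(first, second, total)

# Drop the cached lines and remove the temporary file.
linecache.clearcache()
os.remove(path)
```

Note that `linecache` reads the whole file into memory the first time a line from it is requested, so later lookups are fast but the file's contents stay cached until `linecache.clearcache()` is called.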