@gorborukov
Created April 13, 2023 05:59
Removing fields from JSONL dataset
require 'json'

# Read the JSON data from the original file line by line
modified_data = ''
File.foreach('original_file.jsonl') do |line|
  # Parse the line as a JSON object
  json_object = JSON.parse(line)

  # Remove the 'category' and 'context' fields from the JSON object
  json_object.delete('category')
  json_object.delete('context')

  # Convert the modified JSON object back into a JSON string
  # and append it to the modified data variable
  modified_data << JSON.generate(json_object) << "\n"
end

# Write the modified JSON data to a new file
File.write('modified_file.jsonl', modified_data)
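
The script above collects the whole output in one string before writing it, which can be costly for a large dataset. As a possible alternative, here is a minimal sketch that writes each record as soon as it is processed. The `strip_fields` helper and its parameters are hypothetical names introduced for illustration and are not part of the original gist. The sketch assumes every non-blank line holds a JSON object, and it skips blank lines, which `JSON.parse` would otherwise reject.

```ruby
require 'json'

# Hypothetical helper (not part of the original gist): copies a JSONL file,
# dropping the given top-level keys from every record.
# Assumes each non-blank line is a JSON object.
# Returns the number of records written.
def strip_fields(input_path, output_path, fields)
  count = 0
  File.open(output_path, 'w') do |out|
    File.foreach(input_path) do |line|
      # Skip blank lines, which JSON.parse would reject
      next if line.strip.empty?

      record = JSON.parse(line)
      # Hash#delete does nothing when the key is absent
      fields.each { |field| record.delete(field) }
      # Write the record immediately instead of buffering the whole file
      out.puts(JSON.generate(record))
      count += 1
    end
  end
  count
end
```

Calling `strip_fields('original_file.jsonl', 'modified_file.jsonl', %w[category context])` should produce the same output as the script above for well-formed input, while keeping only one record in memory at a time.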