@kevinmelodi
Created September 13, 2023 02:30
#!/bin/bash
# Make sure prodigy.json has appropriate environment variables
python mkconfig.py
# Print a usage cheat sheet for the numbered pipeline scripts, in step order
echo 'Prepare new PDFs for labeling: python scripts/1_format_PDFs_to_label_format-DocAI.py --input_file customer-PDFs'
echo 'Load the most recently formatted labeling data into Prodigy: python scripts/2_load_data_to_prodigy.py'
echo 'Start labeling: python scripts/3_start_prodigy_UI.py --dataset your_dataset_name --input_file your_input_file_path.jsonl'
echo 'Export labels to GCS: python scripts/4_export_prodigy_to_gcs.py dataset_name'
echo 'Create contexts from labels: python scripts/5_create_contexts_from_label-OpenAI.py dataset_name.jsonl'
# Show the stats, just for the logs
python -m prodigy stats -l
# Load the dummy data into Prodigy, then start the labeling UI
python scripts/2_load_data_to_prodigy.py --input_file dummy-data
python scripts/3_start_prodigy_UI.py --dataset dummy-data_ds --input_file recent.jsonl