@roycoding
Last active June 27, 2017 18:38
All zeros benchmark for Random Acts of Pizza

All zeros benchmark for Random Acts of Pizza competition

In the Random Acts of Pizza competition on Kaggle, the goal is to predict whether people posting on Reddit's Random Acts of Pizza subreddit will actually receive a free pizza, based on their post. For this classification problem, the evaluation metric is AUC (area under the ROC curve).
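
For context on why an all-zeros submission makes a natural baseline: AUC depends only on how the predictions rank positive examples against negative ones, so a constant prediction carries no ranking information. Assuming the usual convention that tied scores count as half (an assumption about how the metric is computed, not something stated above), the value works out to 0.5. Here s^+ and s^- denote the predicted scores of a randomly chosen positive and negative example:

```latex
\mathrm{AUC} = P(s^{+} > s^{-}) + \tfrac{1}{2}\,P(s^{+} = s^{-})
             = 0 + \tfrac{1}{2} \cdot 1
             = 0.5
```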

I recreated the all-zeros benchmark using a couple of Unix command-line tools.

  1. Create the CSV header:
echo "request_id,requester_received_pizza" > zero-benchmark.csv
  2. Extract the request_id from each JSON entry in the test set and set the prediction to 0, i.e. no pizza. For this I used the powerful jq tool to process the JSON and extract the request_id values.
jq -r '.[].request_id+",0"' test.json >> zero-benchmark.csv
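
The two steps above can also be combined into a single jq call. The following is a minimal sketch, not part of the original recipe: the function names `make_zero_benchmark` and `check_row_count` are hypothetical, and it assumes the test file is a single JSON array of objects that each have a string `request_id` field, which is what the jq filter above implies.

```shell
#!/bin/sh
# Hypothetical helper: write an all-zeros submission for every request
# in a JSON test file. Assumes the file is one JSON array of objects,
# each with a string "request_id" field.
make_zero_benchmark() {
    input="$1"    # e.g. test.json
    output="$2"   # e.g. zero-benchmark.csv

    # Emit the header first, then one "<request_id>,0" row per entry.
    jq -r '"request_id,requester_received_pizza", (.[] | .request_id + ",0")' \
        "$input" > "$output"
}

# Hypothetical sanity check: the CSV should contain exactly one data row
# per JSON entry. Returns non-zero if the counts differ.
check_row_count() {
    expected=$(jq 'length' "$1")
    actual=$(($(wc -l < "$2") - 1))   # subtract the header line
    [ "$expected" -eq "$actual" ]
}
```

After sourcing these definitions, `make_zero_benchmark test.json zero-benchmark.csv && check_row_count test.json zero-benchmark.csv` would produce the file and confirm that no request was dropped. Writing with `>` rather than `>>` means rerunning the command overwrites the file instead of appending duplicate rows.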