Created
May 18, 2020 16:02
Custom record validator function for training a model on UC Irvine's Heart Disease Dataset
# Validate each generated record.
# Note: This custom validator verifies that the record structure matches
# the expected format for UCI healthcare data, and also that
# generated records are female (i.e. column 1 is 0).
def validate_record(line):
    rec = line.strip().split(",")
    # Check the field count first so a short record reports a clear error
    # instead of failing on a missing column.
    if len(rec) != 14:
        raise Exception("record not 14 parts")
    if int(rec[1]) != 0:
        raise Exception("record generated must be female")
    # Column 9 must parse as a float; every other column must parse as an int.
    # int() and float() raise ValueError on malformed values.
    for i, value in enumerate(rec):
        if i == 9:
            float(value)
        else:
            int(value)
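
The validator signals a bad record only by raising an exception, so a caller has to wrap it in `try`/`except`. A minimal sketch of how it might be exercised by hand is shown below. The `filter_valid` helper and the sample lines are hypothetical and not part of the original gist. The validator is restated in condensed form so that the block runs on its own.

```python
def validate_record(line):
    # Condensed restatement of the validator above, so this sketch runs standalone.
    rec = line.strip().split(",")
    if len(rec) != 14:
        raise Exception("record not 14 parts")
    if int(rec[1]) != 0:
        raise Exception("record generated must be female")
    for i, value in enumerate(rec):
        if i == 9:
            float(value)
        else:
            int(value)


def filter_valid(lines, validator):
    """Hypothetical helper: split lines into accepted and (line, reason) rejected."""
    accepted, rejected = [], []
    for line in lines:
        try:
            validator(line)
            accepted.append(line)
        except Exception as err:
            rejected.append((line, str(err)))
    return accepted, rejected


# Illustrative sample lines (made up for this sketch).
samples = [
    "41,0,1,130,204,0,0,172,0,1.4,2,0,2,1",  # well-formed, column 1 is 0
    "63,1,3,145,233,1,0,150,0,2.3,0,0,1,1",  # well-formed, but column 1 is 1
    "57,0,0,120,354",                         # too few fields
    "57,0,0,abc,354,0,1,163,1,0.6,2,0,2,1",   # non-numeric field
]

accepted, rejected = filter_valid(samples, validate_record)
print(len(accepted), "accepted")
for line, reason in rejected:
    print("rejected:", reason)
```

Collecting the rejection reason alongside each line makes it easier to see whether generated records fail on structure (wrong field count, non-numeric values) or on the female-only constraint.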