Created
November 9, 2017 15:43
Pseudo code for cost estimation
def cost_model(**kwargs):
    # tc is charged per word not in the dictionary, sc per word in the dictionary
    tc, sc = 15, 5
    in_dictionary = kwargs['included']
    not_in_dictionary = kwargs['excluded']
    if in_dictionary:
        return tc * not_in_dictionary + sc * in_dictionary
    return tc * not_in_dictionary


def naive(errors):
    # No dictionary is used: every error is charged at the rate tc
    return cost_model(excluded=errors, included=None)


def spell_check(correctable, uncorrectable):
    # Correctable words are charged sc each, uncorrectable words tc each
    return cost_model(excluded=uncorrectable, included=correctable)


def cluster_cost(components, **kwargs):
    # components: a list of clusters, each cluster a collection of words
    mode = kwargs['mode']
    cost = 0
    tc, ti = 20, 2  # tc = time to correct one cluster, ti = time to correct each word in a cluster
    if mode == 'Naive':
        # One correction per cluster
        return len(components) * tc
    if mode == 'Advanced':
        # Assumes each word in a cluster is charged the per-word time ti
        for c in components:
            cost += len(c) * ti
    return cost
Hey, could you please provide some more insight into cost estimation using clustering? The function cluster_cost is a rough implementation of what I understood. What more should be added?
Secondly, what is the mode of correction for the annotator? That is, will the annotator only select and deselect the word predictions, or will we also provide the option to either type the correct word or select a suggestion from a drop-down menu?
If we provide the latter two options, the cost computation will change.
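To illustrate how the computation might change, here is a minimal sketch. It assumes the annotator either selects a suggestion from the drop-down (cost sc) or types the word (cost tc), and that some fraction of the errors have the correct word among the suggestions. The function name `annotation_cost` and the parameter `hit_rate` are hypothetical. The default costs 15 and 5 are borrowed from cost_model, on the assumption that tc and sc there stand for the typing and selection costs.

```python
def annotation_cost(n_errors, hit_rate, tc=15, sc=5):
    """Estimate the cost when the annotator can select a suggestion or type.

    n_errors: number of erroneous words to correct
    hit_rate: assumed fraction of errors whose correct form is in the drop-down
    tc, sc:   assumed cost of typing a word and of selecting a suggestion
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Words fixed by picking a suggestion vs. words that must be typed
    selectable = round(n_errors * hit_rate)
    typed = n_errors - selectable
    return sc * selectable + tc * typed


# Half of 100 errors are selectable: 50 * 5 + 50 * 15
print(annotation_cost(100, 0.5))  # 1000
```

With `hit_rate=0.0` this reduces to the naive cost (every word typed), so the estimate moves between the two extremes as the suggestions improve.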