@Deepayan137
Created November 9, 2017 15:43
Pseudo code for cost estimation
def cost_model(**kwargs):
    """Estimate correction cost from counts of included and excluded words."""
    # tc is charged per word not in the dictionary, sc per word in the dictionary.
    tc, sc = 15, 5
    in_dictionary = kwargs['included']
    not_in_dictionary = kwargs['excluded']
    if in_dictionary:
        return tc * not_in_dictionary + sc * in_dictionary
    return tc * not_in_dictionary


def naive(errors):
    # Every error is charged at the full rate tc.
    return cost_model(excluded=errors, included=None)


def spell_check(correctable, uncorrectable):
    # Correctable words are charged sc; uncorrectable words are charged tc.
    return cost_model(excluded=uncorrectable, included=correctable)


def cluster_cost(components, **kwargs):
    mode = kwargs['mode']
    # tc = time to correct one cluster, ti = time to correct each word in a cluster
    tc, ti = 20, 2
    if mode == 'Naive':
        # One correction per cluster; assumes components is a list of clusters.
        return len(components) * tc
    if mode == 'Advanced':
        # Assumes each word in a cluster costs the per-word time ti.
        return sum(len(c) * ti for c in components)
    raise ValueError(f"unknown mode: {mode!r}")

Deepayan137 commented Nov 9, 2017

Hey, could you please provide some more insight into cost estimation using clustering? The function cluster_cost is a rough implementation of what I understood. What more should I add?

Secondly, what is the mode of correction for the annotator? What I mean is: will the annotator only select and deselect the predicted word, or do we also provide an option to either type the correct word or select a suggestion from a drop-down menu?

If we provide the latter two options, then the cost computation will change.
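
As a rough sketch of how the computation might change, the cost could be charged per action rather than per word category. The function name `annotation_cost`, the action names, and the per-action times below are all hypothetical; the values simply reuse tc = 15, sc = 5 from cost_model and ti = 2 from cluster_cost as placeholders.

```python
# Hypothetical per-action times; placeholders borrowed from the constants
# in cost_model (15, 5) and cluster_cost (2), not measured values.
ACTION_COST = {
    "select": 2,     # select or deselect a predicted word
    "dropdown": 5,   # pick a suggestion from a drop-down menu
    "type": 15,      # type the correct word by hand
}


def annotation_cost(action_counts, action_cost=ACTION_COST):
    """Total cost given how many words were handled by each kind of action."""
    unknown = set(action_counts) - set(action_cost)
    if unknown:
        raise ValueError(f"unknown actions: {sorted(unknown)}")
    return sum(action_cost[action] * n for action, n in action_counts.items())


# Example: 10 words selected, 3 picked from the drop-down, 2 typed.
print(annotation_cost({"select": 10, "dropdown": 3, "type": 2}))  # 65
```

With only select/deselect available, the "dropdown" and "type" counts would be zero and the estimate reduces to a single per-word rate.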
