@rjurney
Last active August 29, 2015 14:04
How can I optimize this Python code?
from pig_util import outputSchema
import sys, os, re

@outputSchema('matches:bag{t:tuple(name:chararray)}')  # I am a pig schema
def match_names(one_name, all_names):  # all_names is an array with 150,000 string elements
    match_pairs = []
    for name_tuple in all_names:
        name = name_tuple[0]
        # str.find returns the index of the first occurrence, or -1 if absent
        match = one_name.find(name)
        if match >= 0:
            # other operations that can't be optimized
            # ...
            match_pairs.append(name)
        else:
            pass  # no match
    return match_pairs
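
One possible starting point, offered as a sketch and not as a measured result: the loop only needs to know whether each name occurs in one_name, so the `in` operator can replace `find`, and any name longer than one_name can be skipped before the substring test runs. The function name `match_names_sketch` is hypothetical, the Pig decorator is left out so the snippet runs on its own, and the "other operations" from the original are omitted. Whether this helps under Pig's Jython runtime is an assumption that would need profiling on the real data.

```python
def match_names_sketch(one_name, all_names):
    """Return every name in all_names that occurs as a substring of one_name.

    Hypothetical variant of match_names: same matching rule, fewer steps per
    iteration. all_names is assumed to be an iterable of 1-tuples of strings,
    as in the original.
    """
    one_len = len(one_name)  # computed once instead of once per candidate
    matches = []
    for name_tuple in all_names:
        name = name_tuple[0]
        # A candidate longer than the haystack can never match, so the
        # length check skips the substring search for those entries.
        if len(name) <= one_len and name in one_name:
            matches.append(name)
    return matches


if __name__ == "__main__":
    sample = [("john",), ("smith",), ("jane",), ("john smith junior",)]
    print(match_names_sketch("john smith", sample))
```

If all_names is the same 150,000-element bag on every call, the larger cost is likely the repeated full scan itself, which this sketch does not remove.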