Navigation Menu

Skip to content

Instantly share code, notes, and snippets.

@jgomo3
Last active February 21, 2020 15:34
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jgomo3/70df1704a5c242be8b17ab4198bbdf70 to your computer and use it in GitHub Desktop.
Save jgomo3/70df1704a5c242be8b17ab4198bbdf70 to your computer and use it in GitHub Desktop.
Return the first n elements of a collection by natural order of by a key function
import heapq
def top(n, col, key=None):
return heapq.nsmallest(n, col, key)
# Alternative, descent only for small n.
from itertools import count
def top_small_n(n, col, key=None):
podium = []
for i, e in enumerate(col):
if i < n:
podium.append(e)
else:
podium[-1] = e
if i >= n:
podium.sort(key=key)
if i < n:
podium.sort(key=key)
return podium
@jackboot7
Copy link

def top(n: int, col:list, key=None) -> list: 
     return sorted(col, key=key)[:n] 

@jgomo3
Copy link
Author

jgomo3 commented Feb 21, 2020

@jackboot7: Good! Really straightforward.

As the docs of heapq.nsmallest says:

The latter two (nsmallest and nlargest) functions perform best for smaller values of n (the size of the collection). For larger values, it is more efficient to use the sorted() function. Also, when n==1, it is more efficient to use the built-in min() and max() functions. If repeated usage of these functions is required, consider turning the iterable into an actual heap.

And I was thinking, that top_small function could be the best option for really big size of collections and small n. Think about it: the top 10 of many millions of something. The function would traverse the collection only once, and sort the podium of 10 elements, many millions of times. That is still linear, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment