Sort python dictionaries by values

Perhaps you already encountered a problem like the following one yourself:
You have a large list of items (let’s say URIs for this example) and want to sum up how often they were viewed (or edited or… whatever). A small one-shot solution in python looks like the following and uses the often unknown operator.itemgetter:

import sys
import operator
uriViews = {}
for line in sys.stdin:
    uri, views = line.strip().split()
    views = int(views)
    uriViews[uri] = uriViews.get(uri, 0) - views
    # why minus? could do + and use reverse=True below?
    # right, but then items also get sorted in reverse, if views are
    # the same (so Z would come first, a last)
for uri, views in sorted(uriViews.items(),
                         key=operator.itemgetter(1,0)):
    print -views, uri

This approach can be a lot faster than self written lambda functions called for every comparison or a list comprehension which turns around all tuples and then sorts it. Also in contrast to many other solutions you can find online this one uses operator.itemgetter(1,0), which means that if two items have the same amount of views, their URIs are sorted alphabetically.

Remember that this approach sorts the whole list in RAM, so you might want to chose other approaches in case your lists are getting huge.
For further information you might want to read PEP-0265, which also includes a hint what to do when you’re only interested in the Top 1000 for example (will only sort these top1000):

import heapq
top1000 = heapq.nlargest(1000, uriViews.iteritems(), itemgetter(1,0))
for uri,views in top1000:
   print -views, uri

2 thoughts on “Sort python dictionaries by values

  1. Gecko

    Hi Jörn.

    itemgetter ist auch super, um ein mehrdimensionales array nach einem beliebigen key zu sortieren:

    from operator import itemgetter
    someArray = [['o',3],['n',1],['e',2]]
    
    someArray.sort()
    print "Standard Sortierung:n", someArray
    # [['e', 2], ['n', 1], ['o', 3]]
    
    someArray.sort(key=itemgetter(1))
    print "sortiert nach den Ziffern:n", someArray
    # [['n', 1], ['e', 2], ['o', 3]]
    

    Gruß,
    Gecko

    Reply
  2. joern Post author

    Hi Gecko,
    ja, aber erneut: itemgetter(1) holt nur jeweils das 2te item, ich würde itemgetter(1,0) benutzen, damit klar ist, wie’s bei Gleichheit vom 2ten item weitergeht.
    Viele Grüße,
    Jörn

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

 

This site uses Akismet to reduce spam. Learn how your comment data is processed.