Python Practice #2

I orginally shared this story on medium.com

Use a hash map/dictionary!, e-mail #2 example pattern #3

Notes: spends space to save time

https://www.interviewcake.com/question/python/top-scores

I’m still really new to Big O notation, so when I saw the example mentioned that I needed to write an algorithm that sorts faster than O(n log(n)), I had to do some research to figure out what that meant! Thank Turing for Google and StackOverflow.

First I found this helpful post on StackOverflow:

https://stackoverflow.com/questions/42238323/how-is-on-log-n-different-then-olog-n

which lead me to this very helpful Big O cheat sheet:

http://bigocheatsheet.com/.

Looking at the cheatsheet, it looks like O(n) would be faster, and I’m assuming it’s the fastest we can get considering we have to sort through all the scores. Parker also hints that these problems usually make use of a “counts” hash, and looking at the cheatsheet again we can see that the counting sort array sorting algorithm is pretty fast, using O(n + k) time which is faster than O(n log(n)), but has a slight trade off in space using O(k) space, which is not too bad.

We have to write a function that takes in the highest possible score, and an unsorted list of scores, and returns a sorted list of scores. So how can we make use of the counting sort algorithm to handle this problem? Let’s find out!

This is the clearest implementation and explanation of a counting sort in Python that I found:

http://www.geekviewpoint.com/python/sorting/countingsort

The only problem is, it’s backwards, it returns the list in order of lowest to highest, so we gotta figure out how to turn that around.

#define the function
def sort_scores(us, hps):
    #Initialize the buckets array, a 0 for each entry from 0-100 (the highest possible score)
    counter = [0] * (hps + 1)
    #loop through the unsorted scores and add one to the counter array where
    #the index is equal to the value of the score.
    for i in us:
        counter[i] += 1
    
    #Initialize sorted list with the same length as the unsorted list
    sl = [0] * (len(us))
    #create an index object to insert values in the scores array
    ndx = len(us) - 1
    #loop through counter array
    for i in range(len(counter)):
        #create an entry in sorted scores array for each count of each bucket
        while 0 < counter[i]:
            sl[ndx] = i
            ndx -= 1
            counter[i] -= 1
    return sl

sorted_scores = sort_scores([37, 89, 41, 65, 91, 53], 100)
print(sorted_scores)

Link to gist

The main differences between my code here and the code I borrowed from the link above are that I created a new list called sl (sorted list) and that I reversed the order that the scores are displayed in. I wasn’t sure when I started that it would work the way I did it, and for a little while I was confounded by a syntax error where I accidentally nested the counter array for loop inside of the unsorted scores for loop, giving me a whacky result that had me second guessing my logic for too long. Finally I figured out what I had done and was vindicated!

I wanted to figure out how to do the reversal myself since I had already borrowed so much via Google and StackOverflow, but after I had come to this solution I did a quick search and found a reversed() function for lists that apparently does not use extra space, so I could have probably left the borrowed code as is and just reversed the list after the fact, or I could return it as sl[::-1]. I’m glad that I did it my way in the end, I gained more from it by working the ole noggin a bit than I would have by just putting a borrowed code collage together.

Parker’s solution is elegant as always:


def sort_scores(unsorted_scores, highest_possible_score):
    # List of 0s at indices 0..highest_possible_score
    score_counts = [0] * (highest_possible_score+1)

    # Populate score_counts
    for score in unsorted_scores:
        score_counts[score] += 1

    # Populate the final sorted list
    sorted_scores = []

    # For each item in score_counts
    for score in xrange(len(score_counts) - 1, -1, -1):
        count = score_counts[score]

        # For the number of times the item occurs
        for time in xrange(count):
            # Add it to the sorted list
            sorted_scores.append(score)

    return sorted_scores

Link to gist

It is essentially the same as my solution, it’s a counting sort algorithm but it is simpler than my solution because he uses the range/xrange functions built in ability to step backwards over the list.

Besides the straightforward things I learned about Big O notation and using it to think about the time and space efficiency of algorithms I write and read, my main take away from this exercise is to remember to look into the functions I’m using and look at their built in functionality to see if a solution for my problem has already been encoded in the function itself. Since this problem I was trying to solve was a fairly basic one, iterating backwards over a list, it has obviously been encountered and solved by many people over time, and of course it makes sense that the range function could step forward as well as backward. The good news is, the next time I encounter a problem like this I’ll be ready!