I've observed that there is a little bit of a disconnect in understanding what it is that needs to be done to properly implement the lock_pairs function for cs50 Tideman. The goal of this little write-up is simply an attempt at explaining what the problem actually is, and why a cycle imposes a problem.
First:
If you are unfamiliar with the actual problem, or have not read through the entire cs50 Tideman problem description. Then I think you should start there.
cs50 Tideman
Second:
This little write-up is only narrowing in on the idea of cycles, and a way to think about what a cycle is and determine if locking a pair in the pairs
array would create that cycle. This does not talk about any other part of the Tideman problem.
lock_pairs Purpose:
The function lock_pairs does one thing, and one thing only. That is update the locked
array for all the pairs that should get locked. The lock_pairs function does not know anything about the Tideman problem itself. The function does not know what a source is, it does not care who the eventual winner the Tideman algorithm would pick. It does one thing. It decides whether to lock a pair, or not lock a pair.
I'm saying this explicitly because I see a lot of people make the mistake of inferring the logic the Tideman algorithm would use to take shortcuts in the lock_pairs function. So don't do that. I've also see people conflate the extra ideas that the Tideman algorithm uses, that creates confusion in what it is the lock_pairs function is actually doing.
Once again. lock_pairs doesn't know what a source is, and doesn't care who the eventual winner would be. It is the function that creates the guide (locked
) that you can later determine who the winner is.
So what is a cycle?
As explained in the cs50 Tideman problem description. A cycle is is when the winner of a pair can be traced through the graph
of what is already locked and get back to that same winner.
Wait... Graph
. Maybe that term means something in the realm of computer science.
Wikipedia Graph(abstract data type)
Ok. Don't let that overwhelm you. Just realize that a graph
is an idea of how to conceptualize and order data for a problem that can provide a solution in a new way.
So what does a graph
have to do with this problem? Lets dig into that with a little bit of a real example.
Consider this:
Candidates = [a, b, c, d]
Sorted Pairs = [(d, a), (a, b), (b, c), (c, a), (d, b), (d, c)]
Note for the Sorted Pairs. If you test and your sorted order is slightly different than this. That is ok based on differences in the sorting algorithm.
We take each pair one pair at a time and determine whether to lock it or not.
When we take a pair, we are in a sense creating a graph, using the locked
array as our guide. As more pairs get locked, we have more pairs we may have to check. Our goal is to create a chain, from winner through loser and see if we can get back to our original winner. If we can that is a cycle, and if not then no cycle is created.
So what would that look like?
Taking the first pair (d, a) we take the winner d and see if a wins in any pair that is locked. Since no pairs are locked (this is our first pair) a never wins. So we can lock (d, a). The chain formed would look as follows.
d -> a
Taking the second pair (a, b) we perform the same algorithm. b is never found to win in any locked pair though. So we can safely lock.
a -> b
Taking the third pair (b, c) we do the same once again. c is never found to win in any locked pair though. So we can safely lock.
b -> c
Taking the fourth pair (c, a). a is found to win against another candidate and that pair is locked. So we build a chain starting with c, though a, and see how far it takes us.
c -> a -> b -> c
In this case we find that using the previously locked pairs, we can create a chain from c and get back to c. So we shouldn't lock this pair. It is a cycle.
Taking the fifth pair (d, b) we once again do the same.
d -> b -> c
Because the pair (c, a) was not locked prior to checking our pair (d, b) we can safely lock this pair.
Taking the final pair (d, c) we do the same one more time.
d -> c
And we can lock the final pair.
So considering the above we locked all but one pair. Pair (c, a) was found to create a cycle.
Our final graph would look something like this
a | ||
---|---|---|
➚ | ↓ | |
d | ➙ | b |
➘ | ↓ | |
c |
Takeaway
The primary thing I hope you observe in the above is the simple logic we can apply to determine what forms a cycle, and what does not. Every time we find a new link in the chain, we have to re-examine all the pairs once again to see if the newly found loser would win against any other candidate in any locked pair. If we find one, we create a new link in the chain and repeat the process.
For every pair we consider. We are creating a new graph and attempting to traverse from the original winner, back to that same original winner. Even though the ultimate result is one big graph.
Please note. This is a very simple example, and it is possible that you have to check more than one chain when looking for a cycle. For example if a candidate wins agains multiple other candidates, but also loses.
Why recursion is helpful
The reason recursion can help in solving this type of problem is based on how each recursive function remembers
where it left off if you continue to call it over an over. I do believe this cs50 short video explains how that memory
works pretty well. So every time you find you should create a new link in the chain, you can recursively call your function to look for a cycle and all the frames remember
where they left off.
I hope my couple thoughts are helpful in aiding you in solving the Tideman problem. Feel free to let me know if you see any error in my above explanation of things, or if you feel I could/should add anything I didn't touch on.
Feel free to use the text4.txt file to test your own solution to the Tideman problem.
To use the file just download to your folder where you tideman program exists and run as
./tideman a b c d < test4.txt
For everyone still having the ":( lock_pairs skips final pair if it creates cycle" error, this might be due to the following error in your code / logic: If the other lock_pairs tests are working, then you are probably on the right path but not checking for a special edge case.
Consider the previously mentioned graph:
If you write it out on paper, you will see that from the number 4 the graph splits and you can either go to 2 or to 3.
Now, with the incomplete logic your code is probably looking for "leafs", i.e. numbers that are only "losers" and not "winners", only (to be precise: only the first leaf).
However, this might result in the code to end prematurely. In the above case, because 2 < 3, the code would go to number 2 before 3. 2 is a leaf at the moment of the check (1). Then, having found this leaf that causes no loop, the code terminates.
This leads to the code not checking the next branch, i.e. 3, which would lead to the proper conclusion, that at (1) a loop would be created.
Long story short: You need to include logic into your code to check every available branch, and if one causes circularity, not lock the pair.
Wrong:
Correct:
Note: These are excerpts of a recursive helper function to check for circularity and not the complete code.