A lot of people are familiar with the problem of detecting a loop in a linked list. The problem goes as follows: “Given a linked list, what is the algorithm to determine if it has any cycles (loops)?”
The algorithm is pretty straightforward:
- We start at the beginning of the linked list with two pointers.
- The first pointer is incremented through each node of the list. The second pointer moves twice as fast, and skips every other node.
- If the linked list contains a loop, these two pointers will eventually meet at the same node, thus indicating that the linked list contains a loop.
The algorithm is straightforward and it is relatively easy to create a mental model and get an intuitive sense of why it works.
Now, a slight twist to the same question asks: “Given a circular linked list, what is the algorithm to find the first node of the loop.”
For instance, in the circular list
A->B->C->D->E->C, the first node of the loop is node C. The first part of the algorithm is identical to the algorithm for finding if there is a loop (above). Once a loop has been found, the following additional steps will give us the starting node of the loop:
- Once a loop as been detected (step-3 above), move one of the pointers to the beginning (head) of the linked list. The second pointer remains where it was at the end of step-3.
- Increment both pointers one node at a time. The node at which the two pointers meet will be the starting node of the loop!
This algorithm isn’t too difficult compared to the algorithm for detecting a loop. However, the mental model seems a bit trickier. Why and how does it always find the start of the loop?
How does the Algorithm work? An intuitive explanation:
Here’s some explanation which would hopefully help you intuitively understand why the algorithm works, without going into a lot of mathematical detail.
First, meeting point of two pointers in a loop
Consider two pointers: a slow pointer
S that increments by one node at each step, and a fast pointer
F that increments by two nodes at each step (i.e. it is twice as fast as
S). Both pointers start at the same time from the beginning of an n-node loop. In the time
S covers n nodes.
F will have covered
2n nodes and they will both meet at the start of the loop.
Now, let us say that the slow pointer
S starts at the beginning of the loop, and the fast pointer
F starts at node k (where k < n) of the loop. As these two pointers move along the loop, they will meet at node (n-x).
What we really need to do is figure out x, as it will give us the node at which the two pointers meet inside the loop.
n/2steps, it will be at node
n/2. During the same time,
Fwill have taken
2(n/2) = nsteps, and it will be at node
(k+n). Since the we are inside a loop,
Fwill be effectively back at node
- In order for the two pointers to meet at node
Sneeds to take a further
(n-x-n/2)=(n/2-x)steps and it will end up at node
n-x. During the same time,
Fwill have taken 2*(n/2-x)=(n-2x) steps and will be at node
k+(n-2x). Given our assumption that both S and
Fmeet at the same node:
This means that if S starts from the start of the loop, and F starts k nodes into the loop, both of them will meet at node (n-k), i.e k nodes from the end of the loop. This is a key insight.
Circular Linked List
Now, coming back to the linked list that contains a loop. Suppose the start of the loop is
m=3) nodes from the start of the linked list. Both
F start at the beginning of the linked list [Figure-1].
Figure-1: Circular linked list with S and F pointers at the start
By the time
S gets to node
m (i.e. start of loop),
F will be at node
2m [Figure-2]. This means that
S will be at the start of the loop and
F will be
m nodes into the loop.
Figure-2: Circular linked list, with S at the start of loop and F m nodes into the loop
Based on the discussion above, we already know that if
S begins from the start of the loop and
F starts from node
m, they will meet
m nodes from the end of the loop (i.e. the orange-node in [Figure-3]).
Figure-3: Both F and S meet m nodes from the end of the loop
At this point, keep the pointer
F at the orange-node where the two pointers met (i.e.
m-nodes from the start of the loop), and move the pointer
S to the beginning of the linked list [Figure-4]. Now, if we increment both
F one node at a time, it is obvious that they will meet at ‘Node-m’ (red-node) of the list, which is the start of the loop.
Figure-4: S at the start of linked list, F at the point they met. Both increment one at a time from here-on
For the curious, here’s the Java code snippets for detecting a loop in a linked list and finding the starting node. The complete source code for my linked list project is at my Github page (https://github.com/umairsd/LinkedList-Java):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54