A lot of people are familiar with the problem of detecting a loop in a linked list. The problem goes as follows: “Given a linked list, what is the algorithm to determine if it has any cycles (loops)?”

The algorithm is pretty straightforward:

- We start at the beginning of the linked list with two pointers.
- The first pointer is incremented through each node of the list. The second pointer moves twice as fast, and skips every other node.
- If the linked list contains a loop, these two pointers will eventually meet at the same node, thus indicating that the linked list contains a loop.

The algorithm is straightforward and it is relatively easy to create a mental model and get an intuitive sense of why it works.

Now, a slight twist to the same question asks: “Given a circular linked list, what is the algorithm to find the first node of the loop.” For instance, in the circular list A->B->C->D->E->C, the first node of the loop is node C. The first part of the algorithm is identical to the algorithm for finding if there is a loop (above). Once a loop has been found, the following additional steps will give us the starting node of the loop:

- Once a loop as been detected (step-3 above), move one of the pointers to the beginning (head) of the linked list. The second pointer remains where it was at the end of step-3.
- Increment both pointers one node at a time. The node at which the two pointers meet will be the starting node of the loop!

This algorithm isn’t too difficult compared to the algorithm for detecting a loop. However, the mental model seems a bit trickier. Why and how does it always find the start of the loop?

### How does the Algorithm work? An intuitive explanation:

Here’s some explanation which would hopefully help you intuitively understand why the algorithm works, without going into a lot of mathematical detail.

**First, meeting point of two pointers in a loop**

Consider two pointers: a slow pointer S that increments by one node at each step, and a fast pointer F that increments by two nodes at each step (i.e. it is twice as fast as S). Both pointers start at the same time from the beginning of an n-node loop. In the time S covers n nodes. F will have covered 2n nodes and they will both meet at the start of the loop.

Now, let us say that the slow pointer S starts at the beginning of the loop, and the fast pointer F starts at node k (where k < n) of the loop. As these two pointers move along the loop, they will meet at node (n-x).

What we really need to do is figure out x, as it will give us the node at which the two pointers meet inside the loop.

- When S takes n/2 steps, it will be at node n/2. During the same time, F will have taken 2(n/2) = n steps, and it will be at node (k+n). Since the we are inside a loop, F will be effectively back at node k.
- In order for the two pointers to meet at node (n-x), S needs to take a further (n-x-n/2)=(n/2-x) steps and it will end up at node n-x. During the same time, F will have taken 2*(n/2-x)=(n-2x) steps and will be at node k+(n-2x). Given our assumption that both S and F meet at the same node:

n-x = k+n-2x => x = k

This means that if S starts from the start of the loop, and F starts k nodes into the loop, both of them will meet at node (n-k), i.e k nodes from the end of the loop. This is a key insight.

**Circular Linked List**

Now, coming back to the linked list that contains a loop. Suppose the start of the loop is m (e.g. m=3) nodes from the start of the linked list. Both S and F start at the beginning of the linked list [Figure-1].

By the time S gets to node m (i.e. start of loop), F will be at node 2m [Figure-2]. This means that S will be at the start of the loop and F will be m nodes ***into the loop***.

Based on the discussion above, we already know that if S begins from the start of the loop and F starts from node m, they will meet m nodes from the end of the loop (i.e. the orange-node in [Figure-3]).

At this point, keep the pointer F at the orange-node where the two pointers met (i.e. m-nodes from the start of the loop), and move the pointer S to the beginning of the linked list [Figure-4]. Now, if we increment both S and F *one node at a time*, it is obvious that they will meet at ‘Node-m’ (red-node) of the list, which is the start of the loop.

For the curious, here’s the Java code snippets for detecting a loop in a linked list and finding the starting node. The complete source code for my linked list project is at my Github page (https://github.com/umairsd/LinkedList-Java):

/** * Checks if the given linked list is a circular linked list (i.e. it * contains a loop). This means a list in which a node's next pointer points * to an earlier node, so as to make a loop in the linked list. For * instance: * A -> B -> C -> D -> E -> C * * @param linkedList * the linked list to be tested * @return true if there is a loop, false if there isn't */ public static boolean hasLoop(LinkedList linkedList) { if (linkedList == null || linkedList.getHead() == null) { return false; } IntegerNode slow = linkedList.getHead(); IntegerNode fast = linkedList.getHead(); while (true) { slow = slow.getNext(); if (fast.getNext() != null) { fast = fast.getNext().getNext(); } else { return false; } if (slow == null || fast == null) { return false; } if (slow == fast) { return true; } } }

/** * Returns the node at the start of a loop in the given circular linked * list. A circular list is one in which a node's next pointer points * to an earlier node, so as to make a loop in the linked list. For * instance: * * input: A -> B -> C -> D -> E -> C [the same C as earlier] * output: C * * (CCI_0205) * * @param linkedList * list to be tested * @return the node at the start of the loop if there is a loop, null * otherwise */ public static IntegerNode findLoopStart(LinkedList linkedList) { if (linkedList == null || linkedList.getHead() == null) { return null; } IntegerNode loopStartNode = null; IntegerNode slow = linkedList.getHead(); IntegerNode fast = linkedList.getHead(); while (slow != null && fast != null) { slow = slow.getNext(); if (fast.getNext() == null) { loopStartNode = null; break; } fast = fast.getNext().getNext(); // If slow and fast point to the same node, it means that the // linkedList contains a loop. if (slow == fast) { slow = linkedList.getHead(); while (slow != fast) { // Keep incrementing the two pointers until they both // meet again. When this happens, both the pointers will // point to the beginning of the loop slow = slow.getNext(); // Can't be null, as we have a loop fast = fast.getNext(); // Can't be null, as we have a loop } loopStartNode = slow; break; } } return loopStartNode; }

thank you..

Thanx man…!!

nicely explained..!!

Keep up the good work..!!

Awesome job man..

Clearly explained..

Thank you buddy..!

Are you sure this works for all cases?

Assume the case that you started out with namely:

A -> B -> C -> D -> E -> C

When you increment the slow pointer by one and the fast pointer with two, they will eventually meet at E (n and 2n as you later describe).

Now if you set the slow pointer to A and let the fast pointer be and start incrementing them one node at a time they will follow the pattern as below:

Slow: A -> B -> C -> D -> E

Fast: E -> C -> D -> E -> C

Please correct me if I got anything wrong.

Actually, for your example: For the first pass, both the fast and the slow pointers will meet at D to detect the presence of the loop. Here’s the path that they’ll follow:

Slow: A -> B -> C -> D

Fast: A -> C -> E -> D

Now, when you reset the slow pointer to the beginning, and move them one node at a time, they’ll follow the patterns as below:

Slow: A -> B -> C

Fast: D -> E -> C

Coolest explanation ever!!!!! Highly useful.

It’s nice job.i m very clear abt slow and fast ptr.thz u sir.

Cool n Awesome .. keep going man ..

nice job! can i give the following explanation (my idea is at the end):

1, Consider two pointers: a slow pointer S that increments by one node at each step, and a fast pointer F that increments by two nodes at each step (i.e. it is twice as fast as S). Both pointers start at the same time from the beginning of an n-node loop. In the time S covers n nodes. F will have covered 2n nodes and they will both meet at the start of the loop.

2, let us say that the slow pointer S starts at the beginning of the loop, and the fast pointer F starts at node k (where k < n) of the loop. As these two pointers move along the loop, they will meet at node (n-x),

What we really need to do is figure out x, as it will give us the node at which the two pointers meet inside the loop. If we move S backwards k step inside the loop, and F will move backwards 2k step. Then both pointers at same node, so we reduced second case to the first case, so we know the next meet node will be (n-k).

Based on the example “A->B->C->D->E->C” ,

If A,B,C,D and E are nodes of the Linked list. then head and Tail node should be linked to form the circular linked list.

Eg:

“C->D->E->C” , “A->B->C->D->E->A” are circular linked lists.

But this “A->B->C->D->E->C” how could be a circular linked list? and

how this solution work for “A->B->C->D->E->A”?.. Please correct me if i’m wrong..

Neat algorithm. I’d like to suggest a slight modification of the proof that the nodes overlap n – k past the start of the loop; I find the current proof slightly harder to follow and it also has the problem of using n/2 (which doesn’t generalize, strictly speaking, to odd n).

Suppose that S starts at the beginning of the loop and F starts k nodes past the start of the loop and we start walking the loop. At time step x, S will be x nodes past the start of the loop and F will be 2x + k nodes past the start of the loop. Of course, F will not overtake S until F has crossed the start of the loop, at which point we can equivalently describe it as being (2x + k) – n = 2x – (n – k) nodes past the start.

We now ask, “at what step x will S and F overlap?” That is simply when the position of S is equal to the position of F, so x = 2x – (n – k), or (with some simple algebra), x = n – k. Thus, both nodes will be n – k nodes past the start of the loop.

Nice algorithm. I’d like to suggest an alternative proof that the pointers overlap n – k nodes from the start of the loop; I find the current one a little hard to follow and since it uses the quantity n/2, it doesn’t generalize (strictly speaking) to odd n.

Suppose that S starts at the beginning of the loop and F starts k nodes past the start of the loop and we start walking the loop. At time step x, S will be x nodes past the start of the loop and F will be 2x + k nodes past the start of the loop. Of course, F will not overtake S until F has crossed the start of the loop, at which point we can equivalently describe it as being (2x + k) – n = 2x – (n – k) nodes past the start.

We now ask, “at what step x will S and F overlap?” That is simply when the position of S is equal to the position of F, so x = 2x – (n – k), or (with some simple algebra), x = n – k. Thus, both nodes will be n – k nodes past the start of the loop.

Take following example ” A->B->C->D->E->F->G->H->I->J->K->H

now apply your algorithm, and tells us the ans……..!!!!

it will works only if the length of cycle is more then length of non cyclic part.

No. I think you did not quite get it, and your assertion that the algorithm works “only if the length of the cycle is more than the length of non-cyclic part” is incorrect.

In your example, the fast and the slow pointers will meet at node I, on the 8th increment for the slow pointer. The fact that there is a cycle in the list means that the fast pointer will have gone through the cycle twice before the fast & slow pointers meet. But they will meet. Step through and try it :-)

Take a linked list A->B->C->D->E->F->C

Thus one can say C is start

1st step

S A – B – C – D – E – F

F A – C – E – C – E – C

so now they meet at E

2nd step

S A – B – C – D – E –

F E – C – E- C – E -

So here they meet again at E and not C. How you can find start of loop in this case? Correct me if I’m wrong

During step-2, both pointers increment by *one*. So the path will be:

S: A-B-C

F: E-F-C

Thus, both of them meet at C.

Umair, this solution is efficient only for situation where length of loop is larger than distance from from the start to the loop point. Think of situation where loop is thousands of nodes away from start, but loop length is very small, say five nodes. Now you can see how inefficient this solution can be.

Can you care to elaborate how this will be inefficient? Let’s start with the complexity. What do you think is the order of growth for this algorithm?

Hi Umair,

very nice explanation and helped me understand the solution clearly with the help of the images. However, you do not seem to have outlined the case when the loop length is smaller than the nodes traversed from the start to the loop head. I have tried to do it here: http://irrays.blogspot.in/2013/03/finding-loop-starter-of-linked-list.html

but have not been able to do it as clearly as you have.

Thanks for the post.

You explained it excellently, a comprehensive solution.

Simple and easy solution. Thanks

The explanation initialy seems congfusing but the figure remove that too. Brilliant work.

The explanation is so clear. Thanks