Finding the Start of a Loop in a Circular Linked List

A lot of people are familiar with the problem of detecting a loop in a linked list. The problem goes as follows: “Given a linked list, what is the algorithm to determine if it has any cycles (loops)?”

The algorithm is pretty straightforward:

  1. We start at the beginning of the linked list with two pointers.
  2. The first pointer is incremented through each node of the list. The second pointer moves twice as fast, and skips every other node.
  3. If the linked list contains a loop, these two pointers will eventually meet at the same node, thus indicating that the linked list contains a loop.

The algorithm is straightforward and it is relatively easy to create a mental model and get an intuitive sense of why it works.

Now, a slight twist to the same question asks: “Given a circular linked list, what is the algorithm to find the first node of the loop.” For instance, in the circular list A->B->C->D->E->C, the first node of the loop is node C. The first part of the algorithm is identical to the algorithm for finding if there is a loop (above). Once a loop has been found, the following additional steps will give us the starting node of the loop:

  1. Once a loop as been detected (step-3 above), move one of the pointers to the beginning (head) of the linked list. The second pointer remains where it was at the end of step-3.
  2. Increment both pointers one node at a time. The node at which the two pointers meet will be the starting node of the loop!

This algorithm isn’t too difficult compared to the algorithm for detecting a loop. However, the mental model seems a bit trickier. Why and how does it always find the start of the loop?


How does the Algorithm work? An intuitive explanation:

Here’s some explanation which would hopefully help you intuitively understand why the algorithm works, without going into a lot of mathematical detail.

First, meeting point of two pointers in a loop

Consider two pointers: a slow pointer S that increments by one node at each step, and a fast pointer F that increments by two nodes at each step (i.e. it is twice as fast as S). Both pointers start at the same time from the beginning of an n-node loop. In the time S covers n nodes. F will have covered 2n nodes and they will both meet at the start of the loop.

Now, let us say that the slow pointer S starts at the beginning of the loop, and the fast pointer F starts at node k (where k < n) of the loop. As these two pointers move along the loop, they will meet at node (n-x).

What we really need to do is figure out x, as it will give us the node at which the two pointers meet inside the loop.

  1. When S takes n/2 steps, it will be at node n/2. During the same time, F will have taken 2(n/2) = n steps, and it will be at node (k+n). Since the we are inside a loop, F will be effectively back at node k.
  2. In order for the two pointers to meet at node (n-x), S needs to take a further (n-x-n/2)=(n/2-x) steps and it will end up at node n-x. During the same time, F will have taken 2*(n/2-x)=(n-2x) steps and will be at node k+(n-2x). Given our assumption that both S and F meet at the same node:
		n-x = k+n-2x
	=>	  x = k

This means that if S starts from the start of the loop, and F starts k nodes into the loop, both of them will meet at node (n-k), i.e k nodes from the end of the loop. This is a key insight.

Circular Linked List

Now, coming back to the linked list that contains a loop. Suppose the start of the loop is m (e.g. m=3) nodes from the start of the linked list. Both S and F start at the beginning of the linked list [Figure-1].

Figure-1: Circular linked list with S and F pointers at the start

Figure-1: Circular linked list with S and F pointers at the start

 

By the time S gets to node m (i.e. start of loop), F will be at node 2m [Figure-2]. This means that S will be at the start of the loop and F will be m nodes *into the loop*.

Figure-2: Circular linked list, with S at the start of loop and F m nodes into the loop

Figure-2: Circular linked list, with S at the start of loop and F m nodes into the loop

 

Based on the discussion above, we already know that if S begins from the start of the loop and F starts from node m, they will meet m nodes from the end of the loop (i.e. the orange-node in [Figure-3]).

Figure-3: Both F and S meet m nodes from the end of the loop

Figure-3: Both F and S meet m nodes from the end of the loop

 

At this point, keep the pointer F at the orange-node where the two pointers met (i.e. m-nodes from the start of the loop), and move the pointer S to the beginning of the linked list [Figure-4]. Now, if we increment both S and F *one node at a time*, it is obvious that they will meet at ‘Node-m’ (red-node) of the list, which is the start of the loop.

Figure-4: S at the start of linked list, F at the point they met. Both increment one at a time from here-on

Figure-4: S at the start of linked list, F at the point they met. Both increment one at a time from here-on

 

For the curious, here’s the Java code snippets for detecting a loop in a linked list and finding the starting node. The complete source code for my linked list project is at my Github page (https://github.com/umairsd/LinkedList-Java):

	/**
	 * Checks if the given linked list is a circular linked list (i.e. it
	 * contains a loop). This means a list in which a node's next pointer points
	 * to an earlier node, so as to make a loop in the linked list. For
	 * instance:
	 * 			A -> B -> C -> D -> E -> C
	 *
	 * @param linkedList
	 *            the linked list to be tested
	 * @return true if there is a loop, false if there isn't
	 */
	public static boolean hasLoop(LinkedList linkedList) {
		if (linkedList == null || linkedList.getHead() == null) {
			return false;
		}

		IntegerNode slow = linkedList.getHead();
		IntegerNode fast = linkedList.getHead();

		while (true) {
			slow = slow.getNext();

			if (fast.getNext() != null) {
				fast = fast.getNext().getNext();
			} else {
				return false;
			}

			if (slow == null || fast == null) {
				return false;
			}

			if (slow == fast) {
				return true;
			}
		}
	}

 


	/**
	 * Returns the node at the start of a loop in the given circular linked
	 * list. A circular list is one in which a node's next pointer points
	 * to an earlier node, so as to make a loop in the linked list. For
	 * instance:
	 *
	 * input: A -> B -> C -> D -> E -> C [the same C as earlier]
	 * output: C
	 *
	 * (CCI_0205)
	 *
	 * @param linkedList
	 *            list to be tested
	 * @return the node at the start of the loop if there is a loop, null
	 * otherwise
	 */
	public static IntegerNode findLoopStart(LinkedList linkedList) {
		if (linkedList == null || linkedList.getHead() == null) {
			return null;
		}

		IntegerNode loopStartNode = null;
		IntegerNode slow = linkedList.getHead();
		IntegerNode fast = linkedList.getHead();

		while (slow != null && fast != null) {
			slow = slow.getNext();
			if (fast.getNext() == null) {
				loopStartNode = null;
				break;
			}
			fast = fast.getNext().getNext();

			// If slow and fast point to the same node, it means that the
			// linkedList contains a loop.
			if (slow == fast) {

				slow = linkedList.getHead();

				while (slow != fast) {
					// Keep incrementing the two pointers until they both
					// meet again. When this happens, both the pointers will
					// point to the beginning of the loop
					slow = slow.getNext(); // Can't be null, as we have a loop
					fast = fast.getNext(); // Can't be null, as we have a loop
				}

				loopStartNode = slow;
				break;
			}
		}

		return loopStartNode;
	}

About these ads

31 Comments

Filed under Algorithms, Java

31 responses to “Finding the Start of a Loop in a Circular Linked List

  1. Pracheer

    Thanx man…!!
    nicely explained..!!
    Keep up the good work..!!

  2. Sreeram

    Awesome job man..
    Clearly explained..
    Thank you buddy..!

  3. anshul

    Are you sure this works for all cases?
    Assume the case that you started out with namely:
    A -> B -> C -> D -> E -> C
    When you increment the slow pointer by one and the fast pointer with two, they will eventually meet at E (n and 2n as you later describe).
    Now if you set the slow pointer to A and let the fast pointer be and start incrementing them one node at a time they will follow the pattern as below:
    Slow: A -> B -> C -> D -> E
    Fast: E -> C -> D -> E -> C

    Please correct me if I got anything wrong.

    • Umair

      Actually, for your example: For the first pass, both the fast and the slow pointers will meet at D to detect the presence of the loop. Here’s the path that they’ll follow:

      Slow: A -> B -> C -> D
      Fast: A -> C -> E -> D

      Now, when you reset the slow pointer to the beginning, and move them one node at a time, they’ll follow the patterns as below:

      Slow: A -> B -> C
      Fast: D -> E -> C

  4. Amit

    Coolest explanation ever!!!!! Highly useful.

  5. Malarvizhi

    It’s nice job.i m very clear abt slow and fast ptr.thz u sir.

  6. Ahmad

    Cool n Awesome .. keep going man ..

  7. nice job! can i give the following explanation (my idea is at the end):
    1, Consider two pointers: a slow pointer S that increments by one node at each step, and a fast pointer F that increments by two nodes at each step (i.e. it is twice as fast as S). Both pointers start at the same time from the beginning of an n-node loop. In the time S covers n nodes. F will have covered 2n nodes and they will both meet at the start of the loop.
    2, let us say that the slow pointer S starts at the beginning of the loop, and the fast pointer F starts at node k (where k < n) of the loop. As these two pointers move along the loop, they will meet at node (n-x),

    What we really need to do is figure out x, as it will give us the node at which the two pointers meet inside the loop. If we move S backwards k step inside the loop, and F will move backwards 2k step. Then both pointers at same node, so we reduced second case to the first case, so we know the next meet node will be (n-k).

  8. Vijay

    Based on the example “A->B->C->D->E->C” ,
    If A,B,C,D and E are nodes of the Linked list. then head and Tail node should be linked to form the circular linked list.
    Eg:
    “C->D->E->C” , “A->B->C->D->E->A” are circular linked lists.
    But this “A->B->C->D->E->C” how could be a circular linked list? and
    how this solution work for “A->B->C->D->E->A”?.. Please correct me if i’m wrong..

  9. Neat algorithm. I’d like to suggest a slight modification of the proof that the nodes overlap n – k past the start of the loop; I find the current proof slightly harder to follow and it also has the problem of using n/2 (which doesn’t generalize, strictly speaking, to odd n).

    Suppose that S starts at the beginning of the loop and F starts k nodes past the start of the loop and we start walking the loop. At time step x, S will be x nodes past the start of the loop and F will be 2x + k nodes past the start of the loop. Of course, F will not overtake S until F has crossed the start of the loop, at which point we can equivalently describe it as being (2x + k) – n = 2x – (n – k) nodes past the start.

    We now ask, “at what step x will S and F overlap?” That is simply when the position of S is equal to the position of F, so x = 2x – (n – k), or (with some simple algebra), x = n – k. Thus, both nodes will be n – k nodes past the start of the loop.

  10. nevsan

    Nice algorithm. I’d like to suggest an alternative proof that the pointers overlap n – k nodes from the start of the loop; I find the current one a little hard to follow and since it uses the quantity n/2, it doesn’t generalize (strictly speaking) to odd n.

    Suppose that S starts at the beginning of the loop and F starts k nodes past the start of the loop and we start walking the loop. At time step x, S will be x nodes past the start of the loop and F will be 2x + k nodes past the start of the loop. Of course, F will not overtake S until F has crossed the start of the loop, at which point we can equivalently describe it as being (2x + k) – n = 2x – (n – k) nodes past the start.

    We now ask, “at what step x will S and F overlap?” That is simply when the position of S is equal to the position of F, so x = 2x – (n – k), or (with some simple algebra), x = n – k. Thus, both nodes will be n – k nodes past the start of the loop.

  11. sonesh

    Take following example ” A->B->C->D->E->F->G->H->I->J->K->H
    now apply your algorithm, and tells us the ans……..!!!!
    it will works only if the length of cycle is more then length of non cyclic part.

    • Umair

      No. I think you did not quite get it, and your assertion that the algorithm works “only if the length of the cycle is more than the length of non-cyclic part” is incorrect.

      In your example, the fast and the slow pointers will meet at node I, on the 8th increment for the slow pointer. The fact that there is a cycle in the list means that the fast pointer will have gone through the cycle twice before the fast & slow pointers meet. But they will meet. Step through and try it :-)

  12. here ‘a another solution

    lets say nodes
    1 -> 2 -> 3* -> 4 -> 5 -> 6->3

    #1 somewhere fast/slow met,, keep counter of N nods for slower
    fast/slow met somewhere betwen 3..6 (including)

    #3 from place they met , run pointer 1step node to get lenght of loop, got K nodes

    #4 so we have N nod position for slow, and k nods of loop
    , so run pointer from START OF LIST to n-k node..
    in worse case scenario it directly points to 3 (if slow/fast meet at 6)
    , or before 3,,

    #5 last and simple,,, from n-k point

    run slow pointer 1 node per step
    and run k nodes per step pointer, aka run whole loop in 1 step
    as soon as pointers met, its start,,

    seems thats it..

  13. ro

    Take a linked list A->B->C->D->E->F->C
    Thus one can say C is start
    1st step
    S A – B – C – D – E – F
    F A – C – E – C – E – C

    so now they meet at E

    2nd step
    S A – B – C – D – E –
    F E – C – E- C – E -

    So here they meet again at E and not C. How you can find start of loop in this case? Correct me if I’m wrong

    • Umair

      During step-2, both pointers increment by *one*. So the path will be:

      S: A-B-C
      F: E-F-C

      Thus, both of them meet at C.

  14. Code Guru

    Umair, this solution is efficient only for situation where length of loop is larger than distance from from the start to the loop point. Think of situation where loop is thousands of nodes away from start, but loop length is very small, say five nodes. Now you can see how inefficient this solution can be.

    • Umair

      Can you care to elaborate how this will be inefficient? Let’s start with the complexity. What do you think is the order of growth for this algorithm?

  15. Hi Umair,
    very nice explanation and helped me understand the solution clearly with the help of the images. However, you do not seem to have outlined the case when the loop length is smaller than the nodes traversed from the start to the loop head. I have tried to do it here: http://irrays.blogspot.in/2013/03/finding-loop-starter-of-linked-list.html
    but have not been able to do it as clearly as you have.

    Thanks for the post.

  16. Faisal Raza

    You explained it excellently, a comprehensive solution.

  17. Naveen T

    Simple and easy solution. Thanks

  18. lakshmi101192@gmail.com

    thank you very much :):):)

  19. Kashim

    The explanation initialy seems congfusing but the figure remove that too. Brilliant work.

  20. Weilan

    Nice and neat algorithm – now a follow-up — after we find out where the loop starts, can we find out how many nodes in the loop?

    Thanks,
    Weilan

  21. Sangar Jitendra

    Once we get the node in loop,we can find out number of nodes in loop, let’s say K nodes are in loop. With that information at hand we only need to find Kth node from the end of the list. :)
    Full explanation and code can be referred here

    http://algorithmsandme.blogspot.in/2013/10/linked-list-detecting-loop-in-singly.html

  22. sknair

    Wonderful explanation!!

  23. The explanation is so clear. Thanks

  24. M.Zito

    Having trouble seeing the part: This means that if S starts from the start of the loop, and F starts k nodes into the loop, both of them will meet at node (n-k), i.e k nodes from the end of the loop. This is a key insight.

  25. Very nice post man. Borrowed the diagram from your posts. Here is my post : http://k2code.blogspot.in/2011/12/finding-start-of-loop-in-circular.html

  26. Sidharth

    Excellent explaination!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s