A Quine in Objective-C

A quine is a program that takes no input and outputs its own source code. It has been a while since I last wrote a quine, so I figured I’ll write one in Objective-C. In general, quines follow a fairly simple formula. The program contains a string that includes all the code before the string and all the code after the string. Depending on the programming language, the string might also contain format string (for languages that use format-strings to print to stdout)

Here is the code for the quine. I’ve broken the code below into segments to make it simpler to understand:

#import <Foundation/Foundation.h>  
int main (int argc, const char * argv[]) { 
    @autoreleasepool 
    { 
        NSString *str=@" \
            #import <Foundation/Foundation.h> %c \
            int main (int argc, const char * argv[]) { @autoreleasepool { NSString *str= \
            %c%c%@%c; \
            NSLog(str, 10, 64, 34, str, 34);} return 0;}";
        NSLog(str, 10, 64, 34, str, 34); 
    } 
    return 0;
}

Before I go on, 10, 64 and 34 are the ASCII character codes for newline, @ and " respectively. To make it clearer, I've broken the string into four lines. The first two lines contain the source code before the string. The 3rd line is the format string and the last line is the source code after the string. Here's the code with all lines collapsed:

#import <Foundation/Foundation.h>  
int main (int argc, const char * argv[]) { 
    @autoreleasepool 
    { 
        NSString *str=@"#import <Foundation/Foundation.h> %c int main (int argc, const char * argv[]) { @autoreleasepool { NSString *str=%c%c%@%c; NSLog(str, 10, 64, 34, str, 34);} return 0;}";
        NSLog(str, 10, 64, 34, str, 34); 
    } 
    return 0;
}

Finally, here's the output when you execute this program. As you can tell, this is the same as the original program (minus the whitespace)

#import <Foundation/Foundation.h> 
 int main (int argc, const char * argv[]) { @autoreleasepool { NSString *str=@"#import <Foundation/Foundation.h> %c int main (int argc, const char * argv[]) { @autoreleasepool { NSString *str=%c%c%@%c; NSLog(str, 10, 64, 34, str, 34);} return 0;}"; NSLog(str, 10, 64, 34, str, 34);} return 0;}

Leave a comment

Filed under Cocoa

Compiling Objective-C via Command Line

If you want to do any kind of programming for iOS or Mac OS X, you really have to use Xcode and the sooner you become comfortable with it, the better off you will be. However, there are times when I want to write a very simple (and mostly throwaway) command-line program. For such single-file programs, occasionally I find it preferable to simply use the LLVM Clang Objective-C compiler via the command line, as it saves me from creating an entire Xcode project.

Here is the general format of the clang command that I use most commonly:

clang -fobjc-arc –framework Foundation files -o outputName

The -framework option tells the compiler about the frameworks to use. If, like me, you use Cocoa, you will want to include the Foundation framework.

The -fobjc-arc option tells the compiler to use ARC (Automatic Reference Counting). Skipping this option means that I will need to take care of memory management.

Leave a comment

Filed under Cocoa

Finding the Start of a Loop in a Circular Linked List

A lot of people are familiar with the problem of detecting a loop in a linked list. The problem goes as follows: “Given a linked list, what is the algorithm to determine if it has any cycles (loops)?”

The algorithm is pretty straightforward:

  1. We start at the beginning of the linked list with two pointers.
  2. The first pointer is incremented through each node of the list. The second pointer moves twice as fast, and skips every other node.
  3. If the linked list contains a loop, these two pointers will eventually meet at the same node, thus indicating that the linked list contains a loop.

The algorithm is straightforward and it is relatively easy to create a mental model and get an intuitive sense of why it works.

Now, a slight twist to the same question asks: “Given a circular linked list, what is the algorithm to find the first node of the loop.” For instance, in the circular list A->B->C->D->E->C, the first node of the loop is node C. The first part of the algorithm is identical to the algorithm for finding if there is a loop (above). Once a loop has been found, the following additional steps will give us the starting node of the loop:

  1. Once a loop as been detected (step-3 above), move one of the pointers to the beginning (head) of the linked list. The second pointer remains where it was at the end of step-3.
  2. Increment both pointers one node at a time. The node at which the two pointers meet will be the starting node of the loop!

This algorithm isn’t too difficult compared to the algorithm for detecting a loop. However, the mental model seems a bit trickier. Why and how does it always find the start of the loop?


How does the Algorithm work? An intuitive explanation:

Here’s some explanation which would hopefully help you intuitively understand why the algorithm works, without going into a lot of mathematical detail.

First, meeting point of two pointers in a loop

Consider two pointers: a slow pointer S that increments by one node at each step, and a fast pointer F that increments by two nodes at each step (i.e. it is twice as fast as S). Both pointers start at the same time from the beginning of an n-node loop. In the time S covers n nodes. F will have covered 2n nodes and they will both meet at the start of the loop.

Now, let us say that the slow pointer S starts at the beginning of the loop, and the fast pointer F starts at node k (where k < n) of the loop. As these two pointers move along the loop, they will meet at node (n-x).

What we really need to do is figure out x, as it will give us the node at which the two pointers meet inside the loop.

  1. When S takes n/2 steps, it will be at node n/2. During the same time, F will have taken 2(n/2) = n steps, and it will be at node (k+n). Since the we are inside a loop, F will be effectively back at node k.
  2. In order for the two pointers to meet at node (n-x), S needs to take a further (n-x-n/2)=(n/2-x) steps and it will end up at node n-x. During the same time, F will have taken 2*(n/2-x)=(n-2x) steps and will be at node k+(n-2x). Given our assumption that both S and F meet at the same node:
		n-x = k+n-2x
	=>	  x = k

This means that if S starts from the start of the loop, and F starts k nodes into the loop, both of them will meet at node (n-k), i.e k nodes from the end of the loop. This is a key insight.

Circular Linked List

Now, coming back to the linked list that contains a loop. Suppose the start of the loop is m (e.g. m=3) nodes from the start of the linked list. Both S and F start at the beginning of the linked list [Figure-1].

Figure-1: Circular linked list with S and F pointers at the start

Figure-1: Circular linked list with S and F pointers at the start

 

By the time S gets to node m (i.e. start of loop), F will be at node 2m [Figure-2]. This means that S will be at the start of the loop and F will be m nodes *into the loop*.

Figure-2: Circular linked list, with S at the start of loop and F m nodes into the loop

Figure-2: Circular linked list, with S at the start of loop and F m nodes into the loop

 

Based on the discussion above, we already know that if S begins from the start of the loop and F starts from node m, they will meet m nodes from the end of the loop (i.e. the orange-node in [Figure-3]).

Figure-3: Both F and S meet m nodes from the end of the loop

Figure-3: Both F and S meet m nodes from the end of the loop

 

At this point, keep the pointer F at the orange-node where the two pointers met (i.e. m-nodes from the start of the loop), and move the pointer S to the beginning of the linked list [Figure-4]. Now, if we increment both S and F *one node at a time*, it is obvious that they will meet at ‘Node-m’ (red-node) of the list, which is the start of the loop.

Figure-4: S at the start of linked list, F at the point they met. Both increment one at a time from here-on

Figure-4: S at the start of linked list, F at the point they met. Both increment one at a time from here-on

 

For the curious, here’s the Java code snippets for detecting a loop in a linked list and finding the starting node. The complete source code for my linked list project is at my Github page (https://github.com/umairsd/LinkedList-Java):

	/**
	 * Checks if the given linked list is a circular linked list (i.e. it
	 * contains a loop). This means a list in which a node's next pointer points
	 * to an earlier node, so as to make a loop in the linked list. For
	 * instance:
	 * 			A -> B -> C -> D -> E -> C
	 *
	 * @param linkedList
	 *            the linked list to be tested
	 * @return true if there is a loop, false if there isn't
	 */
	public static boolean hasLoop(LinkedList linkedList) {
		if (linkedList == null || linkedList.getHead() == null) {
			return false;
		}

		IntegerNode slow = linkedList.getHead();
		IntegerNode fast = linkedList.getHead();

		while (true) {
			slow = slow.getNext();

			if (fast.getNext() != null) {
				fast = fast.getNext().getNext();
			} else {
				return false;
			}

			if (slow == null || fast == null) {
				return false;
			}

			if (slow == fast) {
				return true;
			}
		}
	}

 


	/**
	 * Returns the node at the start of a loop in the given circular linked
	 * list. A circular list is one in which a node's next pointer points
	 * to an earlier node, so as to make a loop in the linked list. For
	 * instance:
	 *
	 * input: A -> B -> C -> D -> E -> C [the same C as earlier]
	 * output: C
	 *
	 * (CCI_0205)
	 *
	 * @param linkedList
	 *            list to be tested
	 * @return the node at the start of the loop if there is a loop, null
	 * otherwise
	 */
	public static IntegerNode findLoopStart(LinkedList linkedList) {
		if (linkedList == null || linkedList.getHead() == null) {
			return null;
		}

		IntegerNode loopStartNode = null;
		IntegerNode slow = linkedList.getHead();
		IntegerNode fast = linkedList.getHead();

		while (slow != null && fast != null) {
			slow = slow.getNext();
			if (fast.getNext() == null) {
				loopStartNode = null;
				break;
			}
			fast = fast.getNext().getNext();

			// If slow and fast point to the same node, it means that the
			// linkedList contains a loop.
			if (slow == fast) {

				slow = linkedList.getHead();

				while (slow != fast) {
					// Keep incrementing the two pointers until they both
					// meet again. When this happens, both the pointers will
					// point to the beginning of the loop
					slow = slow.getNext(); // Can't be null, as we have a loop
					fast = fast.getNext(); // Can't be null, as we have a loop
				}

				loopStartNode = slow;
				break;
			}
		}

		return loopStartNode;
	}

32 Comments

Filed under Algorithms, Java

Incorrect C++ compiler behavior in Visual Studio

[Note: Special credit for this post goes to my colleague Yuliy Gerchikov, who first discovered this issue]

As most C++ programmers know, an object that has been constructed via the new expression must be destroyed by calling delete. The delete-expression invokes the object’s destructor which in turn is responsible for clean up tasks. For instance, here’s some sample code that shows object creation, and then freeing it up via delete:

	Foo *pF = new Foo();
	// Do some work
	delete pF;

The C++ Specification for delete has this to say:

The delete-expression operator destroys a most derived object (1.8) or array created by a new-expression.

delete-expression:
::opt delete cast-expression
::opt delete [ ] cast-expression

The first alternative is for non-array objects, and the second is for arrays. The operand shall have a pointer type, or a class type having a single conversion function (12.3.2) to a pointer type. The result has type void. (emphasis mine)

So, given that the delete expression has the return type void, what do you think the C++ compiler should do when it sees the following code:

	Foo *pF = new Foo();
	// Do some work
	delete delete pF;

But delete returns void, so it is reasonable to assume that the C++ compilers would throw a fit on the above code, right? Right? Unfortunately, that’s not always the case.

Let’s look at some C++ code. I have two user defined types Foo and Bar. The difference between the two types is that I declare and define an explicit destructor for Foo.

#include <iostream>
class Foo {
	public: ~Foo() { std::cout<<"In dtor"<<std::endl; }
};

class Bar {};

int main(){
	Foo *pf = new Foo();
    delete delete pf;

    Bar *pb = new Bar();
    delete delete pb;

    return 0;
}
Listing-1

When I compile this using Microsoft’s cl compiler as it ships with Visual Studio 2010 (Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01 for x64) here’s what I get:

test.cpp(13) : error C2541: 'delete' : cannot delete
 objects that are not pointers

And here’s the screen shot:

Screen shot-1

What is going on? Why doesn’t the compiler complain at both lines 10 and 13? It turns out, if a user-defined type has an explicit destructor, then the Visual Studio compiler happily compiles the double delete expression. It appears that in such a case the compiler violates the C++ specifications as the “delete ptr-to-type” returns a non-void pointer instead of void.

Suppose I get my code to compile by getting rid of the offending Bar object (code shown below). When I run my program, it is guaranteed to crash as we are calling delete on a ptr to unallocated space!

#include <iostream>
class Foo {
	public: ~Foo() { std::cout<<"In dtor"<<std::endl; }
};

int main(){
	Foo *pf = new Foo();
    delete delete pf;		//Uh Oh!

    std::cout<<"Hello world"<<std::endl;
    return 0;
}
Listing-2

And here’s the screen shot of the crash. Notice that as expected, the “Hello world” line never gets executed.

Screen shot-2

I tried out the same code in gcc (i686-apple-darwin10-gcc-4.2.1), and gcc behaves correctly, i.e. it errors out on both lines 10 and 13 in the listing-1 above.

I am not sure if this behavior of the Visual Studio compiler is intentional. What do you think? Have you encountered any weird C++ compiler behaviors? Please share your thoughts in the comments below.

1 Comment

Filed under Programming, Tools

Stern-Brocot Tree

Stern-Brocot tree is a tree data structure whose vertices correspond to the set of non-negative rational numbers. Thus, this tree provides a very elegant way for constructing the set of fractions m/n, where m and n are relatively prime. To construct the tree, the basic idea is to start with two fractions (0/1, 1/0) and then repeat the following operation:

Insert (m+m’)/(n+n’) between two adjacent fractions m/n and m’/n’

The first step gives us the entry 1/1 between 0/1 and 1/0. Similarly, the 2nd step gives us two more: 0/1,1/2,1/1,2/1,1/0

Continuing on like this results in an infinite binary search tree which preserves the usual ordering of rational numbers.

The figure below shows the 1st 4 levels of the Stern-Brocot tree.

Stern-Brocot.png

Finding the Path to k in Stern-Brocot Tree

The path from the root of the tree to a number k in the Stern-Brocot tree can be found using binary search. At each node, k will either be in the left half of the tree, or the right half. We continue down the left or right subtree until we finally find k.

  • Initialize the left fraction L to 0/1 and right fraction R to 1/0
  • Repeat the following until k is found:
    • Compute the mediant M (which is (m+m’)/(n+n’) )
    • If (k>M), then k is in the right half of the tree. L:=M and continue.
    • Else If (M>k), then k is in the left half of the tree. R:=M and continue.
    • Else k=M, terminate search.

Implementation

There’s a couple of things to tackle in our implementation. First, I need an easy way to represent fractions, so I create my own SternBrocotFraction class. I deliberately chose to make it very specific to this algorithm because I needed a special way to handle the fraction 1/0 (which by definition is greater than all other rationals).

Secondly, I needed a good way to represent the path from the root of the tree to k. I do this by using a StringBuilder, and at each step I append either the letter ‘L’ or ‘R’ depending on which sub-tree we take. When the search is finished, this gives us a String representation of the path from the root of the tree to the number k. This approach is similar to the approach advocated by ACM Programming Competitions for the “Stern-Brocot Number System” problem.

Here’s the code to find path to a number k:

package com.umairsaeed.algorithm;

public class SternBrocotPath {
	private static final char LEFT_SUB = 'L';
	private static final char RIGHT_SUB = 'R';

	public String findPathTo(SternBrocotFraction f) {
		SternBrocotFraction L = new SternBrocotFraction(0, 1);
		SternBrocotFraction R = new SternBrocotFraction(1, 0);

		StringBuilder results = new StringBuilder();
		SternBrocotPath.find(f, L, R, results);
		return results.toString();
	}

	public static void find(SternBrocotFraction f,
			SternBrocotFraction L,
			SternBrocotFraction R,
			StringBuilder results)
	{
		SternBrocotFraction M = L.add(R);

		if (M.compareTo(f) < 0) {
			L = M;
			results.append(RIGHT_SUB);
			SternBrocotPath.find(f, L, R, results);
		} else if (M.compareTo(f) > 0) {
			R = M;
			results.append(LEFT_SUB);
			SternBrocotPath.find(f, L, R, results);
		}
		return;
	}
}

The special SternBrocotFraction class is:

package com.umairsaeed.algorithm;

public class SternBrocotFraction implements
				Comparable<SternBrocotFraction> {
	private int numerator;
	private int denominator;

	public SternBrocotFraction(int numerator, int denominator) {
		if (denominator < 0) {
			numerator *= -1;
			denominator *= -1;
		}

		this.numerator = numerator;
		this.denominator = denominator;
	}

	public double doubleValue() {
		if (this.denominator == 0) {
			return Double.MAX_VALUE;
		} else {
			return (double) this.numerator /
						(double) this.denominator;
		}
	}

	public SternBrocotFraction add(SternBrocotFraction other) {
		return new SternBrocotFraction(
				this.numerator + other.numerator,
				this.denominator + other.denominator);
	}

	public int compareTo(SternBrocotFraction other) {
		if (this.doubleValue() < other.doubleValue()) {
			return -1;
		} else if (this.doubleValue() > other.doubleValue()) {
			return 1;
		}
		return 0;
	}
}

Finally, some test code to exercise my class:

package com.umairsaeed.algorithm;

public class SternBrocotTester {
	public static void main(String[] args) {
		testSternBrocotPath();
	}

	public static void testSternBrocotPath() {
		SternBrocotPath t = new SternBrocotPath();

		SternBrocotFraction f = new SternBrocotFraction(5, 7);
		System.out.println(t.findPathTo(f));

		f = new SternBrocotFraction(19, 101);
		System.out.println(t.findPathTo(f));

		f = new SternBrocotFraction(977, 331);
		System.out.println(t.findPathTo(f));

		f = new SternBrocotFraction(1049, 7901);
		System.out.println(t.findPathTo(f));
	}
}

2 Comments

Filed under Algorithms

Manual Memory Management in Objective-C

Objective-C on iOS has no garbage collector, so it is up to the programmer to make sure that memory is properly freed once an object is no longer needed. On the other hand, Objective-C on the Mac does have a garbage collector (in Objective C 2.0). This blog post focuses on how to manage memory in the absence of a garbage collector.

When managing memory manually, two major issues to watch out for are premature deallocation and memory leaks.

Cocoa Touch framework uses manual reference counting to manage memory on the iOS devices. Reference counting works on the principle that once created, every object has an owner. During its existence, its owner may change and it may even have more than one owners. When the number of owners for an object drops to zero, it deallocates itself to free up the memory being used.

retain & release:

Owners are tracked via retain counts. When an object is created it always has a retain count of 1. To own an object its retain count is incremented via the retain message. On the other hand, when the object is no longer needed its ownership is relinquished by decrementing the retain count via the release message. When the count reaches zero, the object sends itself the dealloc message and returns all the memory back to the heap.

autorelease:

autorelease marks an object for future release (delayed release). When an object is sent the autorelease message, its retain count is incremented and it is added to an instance of NSAutoreleasePool. The Autorelease pool keeps track of all the objects that have been sent the autorelease message. This pool is drained periodically, at which time all the objects within it are sent the release message.

autorelease is really handy when the creator of an object (e.g. a factory) simply creates the object and returns it to the caller. At this point, the creator has nothing to do with the object anymore, so it is up to the caller to retain the returned object in order to continue using it.

An Example – A Ticket class

Let us work through an example to see manual memory management in action. Suppose I am writing a ticketing framework, and I have a Ticket entity. The header file for Ticket looks as:

@interface Ticket : NSObject {
	NSString *ticketId;
}

- (NSString *) ticketId;
- (void) setTicketId:(NSString *) tid;

- (id) initWithId:(NSString *) tid;
+ (id) ticketWithId:(NSString *) tid;

@end

And the implementation file for Ticket looks as follows:

#import "Ticket.h"
@implementation Ticket

- (void) dealloc
{
	[ticketId release];
	[super dealloc];
}

- (NSString *) ticketId
{
	return ticketId;
}

- (void) setTicketId:(NSString *)tid
{
	[tid retain];
	[ticketId release];
	ticketId = tid;
}

- (id) initWithId:(NSString *) tid
{
	if (self = [super init])
	{
		[self setTicketId:tid];
	}

	return self;
}

+ (id) ticketWithId:(NSString *)tid
{
	Ticket *newTkt = [[Ticket alloc] initWithId:tid];
	return [newTkt autorelease];
}

@end

There are three memory management points to note here:

1. Inside dealloc, an object must release all its instance variables first. Then it should go up its class hierarchy and release any instance variables of its superclass. We should never directly send the dealloc message to instance variables, as some other objects might still have references to those variables.

2. A setter must retain the value passed in before it releases the old value, as tid and ticketId could be pointers to the same object.

3. The ticketWithId: method creates a ticket and simply returns it to the caller. It has no use for the newTkt, but it owns newTkt by virtue of creating it. At this point, if newTkt were released before method-exit, then the caller would get a pointer to unallocated heap. To avoid this, we put the newTkt on the autorelease pool, thus incrementing the retain count to 1. Periodically, the autorelease pool is drained and all the objects it it are sent the release message thus decrementing the retain count.

Essentially, the ticketWithId: method is saying that it does not want to be the owner for newTkt and puts that responsibility on the caller. If the caller wants to hold on to newTkt once it is returned, it must send it the retain message.

Using the Ticket class:

// EXAMPLE-1
- (void) processTicketWithId:(NSString *)ticketId
{
	Ticket* tkt = [[Ticket alloc] initWithId:ticketId];
	// Do something with tkt
	// ...
	[tkt release];
}

At this point, the retain count for tkt is 1. Moreover, since the processTicketWithId: method created the tkt object, it is now the owner and thus is responsible for cleaning it up before this method exits. Clean up is done by sending it the release message

Let’s see another example:

// EXAMPLE-2
- (void) processTicketWithId:(NSString *)ticketId
{
	Ticket* tkt = [Ticket ticketWithId:ticketId];
	[tkt retain];
	// Do something with tkt
	// ...
	[tkt release];
}

In this example, the memory for tkt wasn’t allocated by processTicketWithId: method, so it doesn’t own the tkt object. However, as we’ve seen in the implementation of the Ticket class, the ticketWithId: method created the Ticket object and added it to the autorelease pool. In order to continue using tkt, we must retain it so that even if it is drained from the autorelease pool, we can still continue to use the tkt object. Once done, we need to clean up and send the release message.

Summary of Memory Management Rules for Objective-C:

Rule-1: If you get/create the object from new, alloc or copy, you must release it when done.

Rule-2: If you get the object any other way, assume that it has been autoreleased. If you want to hold on to this object, send it the retain message.

Rule-3: If you retain an object, you must balance every retain with a release.

Rule-4: Never send the dealloc message to an object directly. Others might be holding references to this object, and if deallocated, they’ll be left with pointers to unallocated memory.

6 Comments

Filed under Cocoa

Snapshots in Xcode

When writing code, I usually write code in an incremental fashion. I try to get one set of functionality to work before I start on the next set. Due to this, I frequently create incremental backups of my files/project before I begin the next set of major changes or mass edits. This allows me to come back to a clean and consistent state if I ever need to.

I typically don’t work on a private branch, so I cannot check in my code until I am reasonably done. To create temporary incremental backups, I usually just end up making copies of my project folder.

Xcode has a handy feature called Snapshot that let’s you create a safety net before you make any major or risky changes to your files. Go to File->Make Snapshot (shortcut: command-control-s), and Xcode remembers the state of the project.

Screen shot-1.png

Now we are free to make changes to the source files all we want. If we need to return to the previous state we use the snapshot window. This window can be invoked from File->Snapshots

Screen shot-2.png

Simply select the snapshot that you want, and press the Restore button. Xcode automatically creates another snapshot when you restore to an older snapshot, which allows you to jump back and forth between two different versions of the project. Pretty useful!

Snapshots are stored in a disk image that lives in ~/Library/Application Support/Developer/Shared/SnapshotRepository.sparseimage

5 Comments

Filed under Tools, Xcode