I've been wanting to write about this for a long time and googling didn't return popular hits back then but just had time to write this but sharing nevertheless.
In a number of programming interview guides (most notably that of Joel Spolsky) it has been mentioned that one of the ways (note: only one and not the be all and end of it) to find out whether someone is "smart and gets the job done" is to ask them about recursion. And I agree in the sense that YES someone who has a good idea on recursion probably has good logic and flows and thus could be a good programmer but someone who might not know how to approach one might not be that brilliant but could still do the job (this one is of course my opinion).
Then it came to me that one doesn't necessarily have to know recursions or recursive functions if they do know how to approach the problem using stacks. Now why would I do that? The answer would be performance (very significant especially if the operation involves millions of level in the hierarchy and heavy operations for each node). The drawback is it is a little more code and possibly more complicated. But for most applications, the performance gain is very much worth it. [more]
According to wikipedia
"Recursion, in mathematics and computer science, is a method of defining functions in which the function being defined is applied within its own definition. The term is also used more generally to describe a process of repeating objects in a self-similar way."
while stack is an abstract data type and data structure based on the principle of Last In First Out (LIFO).
To illustrate better. Let's say we have a simple class named Node with properties Name and ChildNodes (a node can contain nodes in itself) and a method to generate a sample nodes with descendant nodes.
1 class Node
2 {
3 public string Name;
4 public List<Node> ChildNodes = new List<Node>();
5
6 public static Node GenerateTestData()
7 {
8 Node retVal = new Node();
9 retVal.Name = "Categories";
10
11 Node a = new Node();
12 a.Name = "Food";
13
14 Node b = new Node();
15 b.Name = "Electronics";
16
17 Node c = new Node();
18 c.Name = "Places";
19
20 retVal.ChildNodes.Add(a);
21 retVal.ChildNodes.Add(b);
22 retVal.ChildNodes.Add(c);
23
24 // Food level 1
25 Node a1 = new Node();
26 a1.Name = "Fruits";
27
28 Node a2 = new Node();
29 a2.Name = "Vegetables";
30
31 Node a3 = new Node();
32 a3.Name = "Meat";
33
34 a.ChildNodes.Add(a1);
35 a.ChildNodes.Add(a2);
36 a.ChildNodes.Add(a3);
37
38 // Fruits level 2
39 Node a1a = new Node();
40 a1a.Name = "Apple";
41
42 Node a1b = new Node();
43 a1b.Name = "Orange";
44
45 a1.ChildNodes.Add(a1a);
46 a1.ChildNodes.Add(a1b);
47
48 // Meat level 2
49 Node a3a = new Node();
50 a3a.Name = "Beef";
51
52 Node a3b = new Node();
53 a3b.Name = "Pork";
54
55 Node a3c = new Node();
56 a3c.Name = "Chicken Meat";
57
58 a3.ChildNodes.Add(a3a);
59 a3.ChildNodes.Add(a3b);
60 a3.ChildNodes.Add(a3c);
61
62 // Electronics level 1
63 Node b1 = new Node();
64 b1.Name = "TV";
65
66 Node b2 = new Node();
67 b2.Name = "Phone";
68
69 Node b3 = new Node();
70 b3.Name = "Radio";
71
72 b.ChildNodes.Add(b1);
73 b.ChildNodes.Add(b2);
74 b.ChildNodes.Add(b3);
75
76 // Places level 1
77 Node c1 = new Node();
78 c1.Name = "Parks";
79
80 Node c2 = new Node();
81 c2.Name = "Hospitals";
82
83 Node c3 = new Node();
84 c3.Name = "Malls";
85
86 c.ChildNodes.Add(c1);
87 c.ChildNodes.Add(c2);
88 c.ChildNodes.Add(c3);
89
90 return retVal;
91 }
92 }
The node hierarchy is something like:
Categories
Food
Fruits
Apple
Orange
Vegetables
Meat
Beef
Pork
Chicken Meat
Electronics
TV
Phone
Radio
Places
Parks
Hospitals
Malls
Now, let's say the requirement now is to display all the nodes inside the generated test node including itself (in a Console Application for example)
The common approach is to have a recursive function similar to this
1 private static void PrintNodeNamesRecursive(Node n)
2 {
3 Console.WriteLine(n.Name);
4 foreach (Node child in n.ChildNodes)
5 {
6 PrintNodeNamesRecursive(child);
7 }
8 }
What it does is when you pass a node, it displays the node's name, then loops over the child nodes and calling the same function (itself) for each of the child node.
Running the function from the main method (like below)
1 static void Main(string[] args)
2 {
3 Node c = Node.GenerateTestData();
4 PrintNodeNamesRecursive(c);
5 Console.ReadKey();
6 }
should result to something like
Now on the main topic. The same result can be produced with the following method making use of stack instead of calling itself.
1 private static void PrintNodeNamesWithStack(Node p)
2 {
3 Stack<Node> s = new Stack<Node>();
4 s.Push(p);
5 while (s.Count > 0)
6 {
7 Node current = s.Pop();
8 Console.WriteLine(current.Name);
9
10 // we can use foreach loop here and still go through every node but
11 // to come up with exactly the same output as the recursive function
12 // we push the last child node first so it is popped last and thus printed
13 // last, the same order as the recursive function
14 for (int a = current.ChildNodes.Count-1; a >= 0; a–)
15 {
16 s.Push(current.ChildNodes[a]);
17 }
18 }
19 }
Now see for yourself and you'd get the same results.
So what makes the performance difference?
If you will notice, when you debug the recursive approach, the function is called over and over again and the call stack is used to internally track which call to return to after one is finished. This is where using the explicit stack approach makes the difference. In the latter we loose the overhead of having the program remember which function call/line to return to; rather it just needs to loop until all the items in the stack is popped out and thus involves less work.
Again the alternative might/likely involve a little more code and honestly could be more complicated (depending on how you think) but if you understand the concept and you're particular about performance (which I would recommend you are) then have a look at it.
Before posting this post, I googled for "replacing recursion with stack" and I came across this post from Phil Haack on the same topic with a sample on finding controls inside an ASP.NET page which is one of the practical applications of this alternative so please go ahead and read it too : Replacing Recursion with a Stack (yeah same title I know but that's what it is really all about)
I'm not quite sure where I knew about this, probably some posts or code I've read some time ago but I do remember getting reminded and getting help on this while working with a former colleague. So special thanks to Joseph.
Hope you'd find the approach helpful too. 🙂