# Complexity of Algorithm to calculate number of nodes in a binary tree

#### Chinta

##### New member
I guess this is the first question on a partly CS topic in this forum, but I think you guys will be able to help me.

I have an algorithm which goes as follows:

int CN(struct node *node)
{
    if (node == NULL)
        return 0;
    return 1 + CN(node->left) + CN(node->right);
}

My question is how to calculate the complexity of the above code, and what that complexity is in terms of the number of nodes n.
My guess is O(n log n), but the answer is given as O(n). I'm clueless about how to approach it to get O(n).

#### CaptainBlack

##### Well-known member
Let's assume this is a binary tree, and that we have decided on the worst case. Suppose that $$t(n)$$ is the worst-case time for a tree of $$n$$ nodes; then strong induction can be used to show that $$t(n) \in O(n)$$, using the fact that $$t(n)=k+t(a)+t(n-a-1)$$ for some constant $$k$$, with $$a$$ nodes in the right subtree and $$n-a-1$$ in the left subtree.

That is, we are not going to calculate the complexity; we are going to prove that it is $$O(n)$$.

CB
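
One way to make the strong induction concrete is to solve the recurrence exactly. The sketch below assumes that a call on an empty tree costs a constant $$c_0$$ (a symbol introduced here for illustration, not used in the thread), so that $$t(0)=c_0$$, and that the split $$a$$ may be anything from $$0$$ to $$n-1$$. Assuming $$t(m) = (k+c_0)m + c_0$$ for all $$m<n$$ and substituting into the recurrence:

$$
\begin{aligned}
t(n) &= k + t(a) + t(n-a-1) \\
     &= k + \big[(k+c_0)\,a + c_0\big] + \big[(k+c_0)(n-a-1) + c_0\big] \\
     &= k + (k+c_0)(n-1) + 2c_0 \\
     &= (k+c_0)\,n + c_0 .
\end{aligned}
$$

The split $$a$$ cancels, so under these assumptions every tree shape gives the same total, and $$t(n) \le (k+2c_0)\,n$$ for all $$n \ge 1$$, which is $$O(n)$$.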

#### Chinta

##### New member
As you said, let's assume this is a binary tree; then we have a = n/2 and n-a-1 = (n/2)-1.
Then T(n) can be written as follows:

$$T(n) = T(n/2) + T(n/2 - 1) + k \approx 2T(n/2) + k,$$

as a result of which $$T(n) \in O(n)$$.

But in the worst case, is "k" not a linear function of n? Assuming each addition takes constant time c, in the worst case it can become c(n-1) [because the number of additions is n-1 in the worst case, where n is the number of nodes], whence this recurrence becomes $$T(n) = 2T(n/2) + c(n-1)$$, so that $$T(n) \in O(n \log n)$$.

#### Evgeny.Makarov

##### Well-known member
MHB Math Scholar
Suppose that $$t(n)$$ is the worst-case time for a tree of $$n$$ nodes; then strong induction can be used to show that $$t(n) \in O(n)$$, using the fact that $$t(n)=k+t(a)+t(n-a-1)$$ for some constant $$k$$, with $$a$$ nodes in the right subtree and $$n-a-1$$ in the left subtree.
In order to prove that t(n) is O(n) by induction on n, we need to consider some different property P(n); then we prove ∀n P(n) by induction, and "t(n) is O(n)" is going to be a simple corollary of ∀n P(n). This is because we cannot say, e.g., that t(0) is O(n): for such a statement we have to consider a complete function, not just an individual value. We can, for example, say that there exists a constant C such that t(n) <= C * n.

As you said, let's assume this is a binary tree; then we have a = n/2 and n-a-1 = (n/2)-1.
The fact that the tree is binary does not mean that both subtrees have an (almost) equal number of nodes.

But in the worst case, is "k" not a linear function of n? Assuming each addition takes constant time c, in the worst case it can become c(n-1) [because the number of additions is n-1 in the worst case, where n is the number of nodes]
What do you mean by an addition?

I would first prove that the body of the function CN is executed exactly n times on non-null nodes, where n is the number of nodes. This can be proved by strong induction on n, as CB said. Each execution of CN, given the results of the recursive calls, takes constant time, from which it follows that there exists a constant C such that t(n) <= C * n.
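
The counting claim can be checked empirically. The following is a minimal sketch, not part of the original posts: the counter `cn_calls` and the helpers `build_chain`, `build_balanced`, `free_tree` and `calls_for` are hypothetical names added for illustration. It counts every invocation of CN, including the calls on NULL pointers, so a tree with n nodes should produce 2n + 1 invocations regardless of its shape.

```c
#include <stdlib.h>

struct node {
    struct node *left;
    struct node *right;
};

/* Hypothetical instrumentation: counts every invocation of CN,
 * including the ones that receive a NULL pointer. */
static long cn_calls = 0;

int CN(struct node *node)
{
    cn_calls++;
    if (node == NULL)
        return 0;
    return 1 + CN(node->left) + CN(node->right);
}

/* Degenerate tree: n nodes linked only through right pointers. */
struct node *build_chain(int n)
{
    struct node *root = NULL;
    for (int i = 0; i < n; i++) {
        struct node *p = malloc(sizeof *p);
        if (p == NULL)
            abort();
        p->left = NULL;
        p->right = root;
        root = p;
    }
    return root;
}

/* Tree with n nodes split as evenly as possible between the subtrees. */
struct node *build_balanced(int n)
{
    if (n == 0)
        return NULL;
    struct node *p = malloc(sizeof *p);
    if (p == NULL)
        abort();
    int a = (n - 1) / 2;
    p->left = build_balanced(a);
    p->right = build_balanced(n - 1 - a);
    return p;
}

void free_tree(struct node *node)
{
    if (node == NULL)
        return;
    free_tree(node->left);
    free_tree(node->right);
    free(node);
}

/* Resets the counter, runs CN once, frees the tree, returns the call count. */
long calls_for(struct node *root)
{
    cn_calls = 0;
    CN(root);
    free_tree(root);
    return cn_calls;
}
```

Calling `calls_for(build_chain(7))` and `calls_for(build_balanced(7))` should both give 15: seven calls on real nodes plus eight on NULL pointers. The shape of the tree changes the recursion depth, not the number of calls.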

#### CaptainBlack

##### Well-known member
In order to prove that t(n) is O(n) by induction on n, we need to consider some different property P(n); then we prove ∀n P(n) by induction, and "t(n) is O(n)" is going to be a simple corollary of ∀n P(n). This is because we cannot say, e.g., that t(0) is O(n): for such a statement we have to consider a complete function, not just an individual value. We can, for example, say that there exists a constant C such that t(n) <= C * n.
I did not want to be too specific, but what I envisaged for the induction step is the assumption that there is some constant $$C$$ such that, for some $$N \in \mathbb{N}$$, we have $$t(n)<Cn$$ for all $$n\le N$$. Then, choosing any $$C\ge k$$, induction on $$N$$ should do the trick, since we would then have proven that there exists a $$C$$ such that $$t(r)<C\times r$$ for all $$r \in \mathbb{N}$$.

CB

#### Chinta

##### New member
I am not concerned about the inductive proof. All I want to know is whether the constant "k" will be there or not. There are (n-1) additions being performed for the statement 1 + CN(node->left) + CN(node->right), and assuming each addition takes "c" units of time, we get c(n-1) instead of the constant k in the recurrence t(n) = k + t(a) + t(n-a-1). That is what leads to O(n log n) time; otherwise the time is linear, i.e. O(n).

Another thing is that in the worst case the binary tree would be a full binary tree, in which the numbers of nodes in the left and right subtrees would be almost the same, giving a ≈ n/2.

Help me if I'm wrong somewhere!!

#### CaptainBlack

##### Well-known member
I am not concerned about the inductive proof. All I want to know is whether the constant "k" will be there or not. There are (n-1) additions being performed for the statement 1 + CN(node->left) + CN(node->right), and assuming each addition takes "c" units of time, we get c(n-1) instead of the constant k in the recurrence t(n) = k + t(a) + t(n-a-1). That is what leads to O(n log n) time; otherwise the time is linear, i.e. O(n).

Another thing is that in the worst case the binary tree would be a full binary tree, in which the numbers of nodes in the left and right subtrees would be almost the same, giving a ≈ n/2.

Help me if I'm wrong somewhere!!
The constant represents the overhead and common operations of counting a root node and calling the counting routine on the left and right subtrees.

Until proven otherwise, you do not know that the worst-case timing for an n-node tree occurs for a full tree.

CB

#### Chinta

##### New member
So is k a constant at all or a linear function of n?

#### CaptainBlack

##### Well-known member
So is k a constant at all or a linear function of n?
It is a constant (it does not depend on n since all you are doing is some pointer manipulation and calling functions etc).

CB
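
The distinction between work per call and work over the whole run can be made visible by counting additions. This is a rough sketch under assumptions: the counter `total_additions` and the helpers `build_left_chain` and `additions_for` are hypothetical names, and the return statement is counted as two additions per non-null node as it is written (the thread's n-1 figure counts them differently). Either way, the total over the whole run grows linearly with n, while any single call performs a fixed number of additions, which is why the additive term in the recurrence is a constant.

```c
#include <stdlib.h>

struct node {
    struct node *left;
    struct node *right;
};

/* Hypothetical instrumentation: additions performed over the whole run. */
static long total_additions = 0;

int CN(struct node *node)
{
    if (node == NULL)
        return 0;               /* no addition on an empty subtree */
    int left_count = CN(node->left);
    int right_count = CN(node->right);
    total_additions += 2;       /* "1 + left_count" and "... + right_count" */
    return 1 + left_count + right_count;
}

/* Degenerate tree: n nodes linked only through left pointers. */
struct node *build_left_chain(int n)
{
    struct node *root = NULL;
    for (int i = 0; i < n; i++) {
        struct node *p = malloc(sizeof *p);
        if (p == NULL)
            abort();
        p->left = root;
        p->right = NULL;
        root = p;
    }
    return root;
}

/* Builds an n-node chain, runs CN once, frees the nodes,
 * and returns the total number of additions performed. */
long additions_for(int n)
{
    struct node *root = build_left_chain(n);
    total_additions = 0;
    CN(root);
    while (root != NULL) {      /* a chain can be freed iteratively */
        struct node *next = root->left;
        free(root);
        root = next;
    }
    return total_additions;
}
```

With this counting, `additions_for(5)` gives 10 and `additions_for(50)` gives 100: the total is 2n, spread evenly as two additions per non-null call. Putting c(n-1) into the recurrence would charge the whole run's additions to every single call.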

#### Chinta

##### New member 