Binary Trees
| We extend the concept of linked data structures to structure containing nodes with more than one self-referenced field. A binary tree is made of nodes, where each node contains a "left" reference, a "right" reference, and a data element. The topmost node in the tree is called the root. Every node (excluding a root) in a tree is connected by a directed edge from exactly one other node. This node is called a parent. On the other hand, each node can be connected to arbitrary number of nodes, called children. Nodes with no children are called leaves, or external nodes. Nodes which are not leaves are called internal nodes. Nodes with the same parent are called siblings. | |
- The depth of a node is the number of edges from the root to the node.
- The height of a node is the number of edges from the node to the deepest leaf.
- The height of a tree is a height of the root.
- A full binary tree.is a binary tree in which each node has exactly zero or two children.
- A complete binary tree is a binary tree, which is completely filled, with the possible exception of the bottom level, which is filled from left to right.

n = 1 + 2 + 4 + ... + 2h-1 + 2h = 2h+1 - 1
h = O(log n)
Advantages of trees
Trees are so useful and frequently used, because they have some very serious advantages:- Trees reflect structural relationships in the data
- Trees are used to represent hierarchies
- Trees provide an efficient insertion and searching
- Trees are very flexible data, allowing to move subtrees around with minumum effort
Traversals
A traversal is a process that visits all the nodes in the tree. Since a tree is a nonlinear data structure, there is no unique traversal. We will consider several traversal algorithms with we group in the following two kinds- depth-first traversal
- breadth-first traversal
- PreOrder traversal - visit the parent first and then left and right children;
- InOrder traversal - visit the left child, then the parent and the right child;
- PostOrder traversal - visit left child, then the right child and then the parent;
| As an example consider the following tree and its four traversals:
PreOrder - 8, 5, 9, 7, 1, 12, 2, 4, 11, 3 InOrder - 9, 5, 1, 7, 2, 12, 8, 4, 3, 11 PostOrder - 9, 1, 2, 12, 7, 5, 3, 11, 4, 8 LevelOrder - 8, 5, 4, 9, 7, 11, 1, 12, 3, 2 |
|


Binary Search Trees
We consider a particular kind of a binary tree called a Binary Search Tree (BST). The basic idea behind this data structure is to have such a storing repository that provides the efficient way of data sorting, searching and retriving. |
A BST is a binary tree where nodes are ordered in the following way:
|
Implementation
We implement a binary search tree using a private inner class BSTNode. In order to support the binary search tree property, we require that data stored in each node is Comparable:public class BST <AnyType extends Comparable<AnyType>>
{
private Node<AnyType> root;
private class Node<AnyType>
{
private AnyType data;
private Node<AnyType> left, right;
public Node(AnyType data)
{
left = right = null;
this.data = data;
}
}
...
}
Insertion
The insertion procedure is quite similar to searching. We start at the root and recursively go down the tree searching for a location in a BST to insert a new node. If the element to be inserted is already in the tree, we are done (we do not insert duplicates). The new node will always replace a NULL reference.
11, 6, 8, 19, 4, 10, 5, 17, 43, 49, 31
Draw a binary search tree by inserting the above numbers from left to right.

Searching
Searching in a BST always starts at the root. We compare a data stored at the root with the key we are searching for (let us call it astoSearch). If the node
does not contain the key we proceed either to the left or right child depending upon
comparison. If the result of comparison is negative we go to the left child,
otherwise - to the right child. The recursive structure of a BST yields a recursive
algorithm.
Searching in a BST has O(h) worst-case runtime complexity, where h is
the height of the tree. Since s binary search tree with n nodes has a
minimum of O(log n) levels, it takes at least O(log n) comparisons to
find a particular node.
Unfortunately, a binary serch tree can degenerate to a linked list,
reducing the search time to O(n).
Deletion
Deletion is somewhat more tricky than insertion. There are several cases to consider. A node to be deleted (let us call it astoDelete)
- is not in a tree;
- is a leaf;
- has only one child;
- has two children.
toDelete is not in the tree, there is nothing to delete. If toDelete
node has only one child the procedure of deletion is identical to
deleting a node from a linked list - we just bypass that node being
deleted


See BST.java for a complete implementation.
Exercise. Given a sequence of numbers:
11, 6, 8, 19, 4, 10, 5, 17, 43, 49, 31
Draw a binary search tree by inserting the above numbers from left to right and
then show the two trees that can be the result after the removal of 11.

Non-Recursive Traversals
Depth-first traversals can be easily implemented recursively.A non-recursive implementation is a bit more difficult. In this section we implement a pre-order traversal as a tree iteratorpublic Iterator<AnyType> iterator()
{
return new PreOrderIterator();
}
private class PreOrderIterator implements Iterator<AnyType>
{
...
}
next() method, we check if the top element has
a left child. If it has a left child, we push that child on a stack and return a parent
node. If there is no a left child, we check for a right child. If it has a right child, we
push that child on a stack and return a parent node. If there is no right child, we
move back up the tree (by popping up elements from a stack) until we find a
node with a right child. Here is the next() implementation
public AnyType next()
{
Node cur = stk.peek();
if(cur.left != null)
{
stk.push(cur.left);
}
else
{
Node tmp = stk.pop();
while(tmp.right == null)
{
if (stk.isEmpty()) return cur.data;
tmp = stk.pop();
}
stk.push(tmp.right);
}
return cur.data;
}
Hashing and Hash
Tables
Hashing
Hashing is a
technique used for storing and retrieving keys in a rapid manner.
In hashing, a string
of characters are transformed into a usually shorter length value or key that
represents the
original
string.
Hashing is used to
index and retrieve items in a database because it is faster to find item using
shorter hashed
key than
to find it using the original value.
Hashing can also be
defined as a concept of distributing keys in an array called a hash table using
a
predetermined function called hash function.
Hash table
Hash table is a table
(array) where we store the original string. Index of the table is the hashed
key while the
value is
the original string.
The size of hash
table is usually several orders of magnitude lower than the total number of
possible string, so
several
string might have a same hashed-key.
For example, there
are 267 (8,031,810,176) string of length 7 character consists
of lowercase only.
Hash Function
There are many ways
to hash a string into a key. The following are some of the important methods
for
constructing hash functions.
Mid-square
Division
(most common)
Folding
Digit
Extraction
Rotating Hash
Mid-square
Square the
string/identifier and then using an appropriate number of bits from the middle
of the square to obtain
the
hash-key.
If the key is a
string, it is converted to a number.
Steps:
1. Square the value of the key. (k2)
2. Extract the middle r bits of the
result obtained in Step 1 Division
Divide the
string/identifier by using the modulus operator.
It’s the most simple
method of hashing an integer x.
Function: h(z) = z mod M
z
= key
M = the
value using to divide the key, usually using a prime number, the table size or
the size of memory used. Folding
Partition the
string/identifier into several parts, then add the parts together to obtain the
hash key
Steps:
1.Divide the key value into a number of
parts
2.Add the individual part (usually the
last carry is ignored) Digit Extraction
A predefined digit of
the given number is considered as the hash address.
Example:
–Consider x = 14,568
–If we extract the 1st,
3rd, and 5th digit, we will get a hash code of : 158.
Rotating Hash
Use any hash method
(such as division or mid-square method)
After getting the
hash code/address from the hash method, do rotation
Rotation is performed
by shifting the digits to get a new hash address.
Example:
–Given hash address =
20021 Rotation result: 12002 (fold the
digits)
Collision
What happened if we want to store these
strings using the previous hash function (use the first character of each
string)
define, float, exp, char, atan, ceil,
acos, floor.
There are several strings which have
the same hash-key, it’s float and floor (hash-key: 5), char and ceil (hash-key:
2).
It’s called a collision. How can
we handle this?
There are two general ways to handle
collisions:
Linear Probing
Search the next empty slot and
put the string there.
This is the hash table of these string:
define, float, exp, char, atan,
ceil, floor, acos.
Note that ceil is stored in h[6], acos
is
stored in h[1] and floor is stored in
h[7].
When we want to store “ceil”, there is
already “char” stored in h[2], so we
search the next empty slot which is
h[6].
Linear probing has a bad
search complexity if there are
many collisions.
The table “step” on the right describe
how many loop/step needed to find the string. Supposed we want to find ceil, we
compute the hash key and found
2. But ceil is not there so we should
iterate until we found ceil.
Chaining
Put the string in a slot as a
chained list (linked list).
In chaining, we store each string in a
chain (linked list). So if there is collision, we only need to iterate on that
chain.
No comments:
Post a Comment