1.2 - Ise Notes

1.2 A Short History of File Structure Design

Earlier, the file access was sequential, and the cost of access grew in direct proportion to the size of the file. So, Indexes were added to files.
Indexes made it possible to keep a list of keys and pointers in a smaller file that could be searched more quickly.
Simple indexes became difficult to manage for dynamic files in which the set of keys changes. Hence tree structures were introduced.
Trees grew unevenly as records were added and deleted, resulting in long searches requiring multiple disk accesses to find a record. Hence an elegant, self-adjusting binary tree structure called an AVL tree was developed for data in memory.
Even with a balanced binary tree, dozens of accesses were required to find a record in moderate-sized files.
A method was needed to keep a tree balanced when each node of the tree was not a single record, as in a binary tree, but a file block containing hundreds of records. Hence, B-Trees were introduced...
AVL trees grow from top down as records are added, B-Trees grow from the bottom up.
B-Trees provided excellent access performance but, a file could not be accessed sequentially with efficiency.
The above problem was solved using B+ tree which is a combination of a B-Tree and a sequential linked list added at the bottom level of the B-Tree.
To further reduce the number of disk accesses, hashing was introduced for files that do not change size greatly over time.
Extendible, dynamic hashing was introduced for volatile, dynamic files which change.