Directory Hierarchy Traversal
Tree traversal is an important topic in computer science. Other synonyms used in the literature for traversing such structures are walking, navigating, and visiting a tree.
Note that in order to traverse a directory hierarchy and access its entries, the directory must have execute permission.
The Files class provides methods to traverse a directory hierarchy which has a tree structure. We will use the directory hierarchy shown below to illustrate these traversal methods. The directory hierarchy below is rooted at directory a—that is, this directory is the start or the root of the directory hierarchy. Traversal of the directory hierarchy starts by visiting directory a and is considered to be at depth 0. Dropping down each level of directories in the hierarchy increases the depth. The directory a/b and the file a/x.txt are at depth 1 in relation to the directory a, and are said to be siblings at depth 1. The directory hierarchy below has a maximum depth of 3, as there are no directories below this depth. Note that the file a/b/c/dir_link is a symbolic link to the directory a/b, and as we shall see, it can influence the traversal if symbolic links are to be followed.
Traversal Strategies
There are primarily two main strategies for traversing a tree structure, or as in our case, a directory hierarchy. Both strategies start at the root of the directory hierarchy.
- Depth-first traversal
Any subdirectory encountered at a particular depth level is traversed to its maximum depth before continuing with any sibling of this subdirectory at this depth level. The depth-first traversal of the hierarchy rooted at directory a is shown below. Note how at depth level 2, the directory a/b/c is traversed completely before traversing its sibling directory a/b/d. The traversal methods of the Files class use the depth-first traversal strategy.
Visited entry Depth
./a 0
./a/x.txt 1
./a/b 1
./a/b/c 2
./a/b/c/z.txt 3
./a/b/c/dir_link 3
./a/b/d 2
./a/b/d/y.txt 3
- Breadth-first traversal
The siblings at each depth level are traversed before moving on to the next depth level—in other words, the traversal is by depth level. The breadth-first traversal of the hierarchy rooted at directory a is shown below. Note how entries at each depth are traversed before traversing the entries at the next depth level.
Visited entry Depth
./a 0
./a/x.txt 1
./a/b 1
./a/b/d 2
./a/b/c 2
./a/b/c/z.txt 3
./a/b/c/dir_link 3
./a/b/d/y.txt 3
Both strategies have their pros and cons, and each excels at solving particular traversal problems. For example, finding a search goal that is closest to the root is best found by the breadth-first strategy in the shortest amount of time, whereas a goal that is farthest from the root is best found by the depth-first strategy in the shortest amount of time. Depth-first search may need to backtrack when trying to find a solution, whereas breadth-first search is more predictive. Breath-first search requires more memory as entries at previous levels must be maintained. The interested reader should consult the ample body of literature readily available on this important subject of trees and graph algorithms.