Check out the new USENIX Web site. next up previous
Next: Publius content type Up: Implementation issues Previous: Publishing mutually hyperlinked documents

Publishing a directory

Publius contains a directory publishing tool that automatically publishes all files in a directory. In addition, if some file, f, contains a hyperlink to another file, g, in that same directory, then f's hyperlink to g is rewritten to reflect g's Publius URL. Mutually hyperlinked HTML documents are also dealt with, as described in the previous section. The first step in publishing a directory, D, is to publish all of D's non-HTML files and record, for later use, each file's corresponding Publius URL. All HTML files in D are then scanned for hyperlinks to other files within D. If a hyperlink, h, to a previously published non-HTML file, f, is found then hyperlink h is changed to the Publius URL of f. Information concerning hyperlinks between HTML files in directory D is recorded in a data structure called a dependency graph. Dependency graph, G, is a directed graph containing one node for each HTML file in D. A directed edge (x,y) is added to G if the HTML file x must be published before file y. In other words, the edge (x,y) is added if file y contains a hyperlink to file x. If, in addition, file x contains a hyperlink to file y the edge (y,x) would be added to the graph causing the creation of a cycle. Cycles in the graph indicate that we need to utilize the Publius Update trick that Alice uses when publishing her mutually hyperlinked files C and D (Section 4.4). Once all the HTML files have been scanned the dependency graph G is checked for cycles. All HTML files involved in a cycle are published and their Publius URLs recorded for later use. Any hyperlink, h, referring to a file, f, involved in a cycle, is replaced with f's Publius URL. All nodes in the cycle are removed from G leaving G cycle-free. A topological sort is then performed on G yielding R, the publishing order of the remaining HTML files. The result of a topological sort of a directed acyclic graph (DAG) is a linear ordering of the nodes of the DAG such that if there is a directed edge from vertex i to vertex j then i appears before j in the linear ordering [1]. The HTML files are published according to order R. After each file, f, is published, all hyperlinks pointing to f are modified to reflect f's Publius URL. Finally a Publius Update operation is performed on all files that were part of a cycle in G.
next up previous
Next: Publius content type Up: Implementation issues Previous: Publishing mutually hyperlinked documents
Avi Rubin
2000-06-13