Next: Publius content type
Up: Implementation issues
Previous: Publishing mutually hyperlinked documents
Publius contains a directory publishing tool that
automatically publishes all files in a directory.
In addition, if some file, f,
contains a hyperlink to another file, g, in that
same directory, then f's hyperlink to g is rewritten
to reflect g's Publius URL. Mutually hyperlinked
HTML documents are also dealt with, as described in the previous
section.
The first step in publishing a directory, D,
is to publish all of D's non-HTML files and record,
for later use, each file's corresponding Publius URL.
All HTML files in D are then scanned for hyperlinks
to other files within D. If a hyperlink, h, to a previously
published non-HTML file, f, is found then hyperlink h
is changed to the Publius URL of f.
Information concerning hyperlinks between HTML files
in directory D is recorded in a data structure called
a dependency graph. Dependency graph, G, is a directed
graph containing one node for each HTML file in D.
A directed edge (x,y) is added to G if the HTML file
x must be published before file y. In other words,
the edge (x,y) is added if file y contains a
hyperlink to file x. If, in addition, file x contains
a hyperlink to file y the edge (y,x) would be added
to the graph causing the creation of a cycle. Cycles in the
graph indicate that we need to utilize the Publius
Update trick that Alice uses when publishing her
mutually hyperlinked files
C and D (Section 4.4).
Once all the HTML files have been scanned the dependency graph G
is checked for cycles. All HTML files
involved in a cycle are published and their
Publius URLs recorded for later use. Any
hyperlink, h, referring to a file, f, involved
in a cycle, is replaced with f's Publius URL.
All nodes in the cycle are removed from G
leaving G cycle-free. A topological sort
is then performed on G yielding R, the publishing
order of the remaining HTML files.
The result of a topological sort of a directed acyclic graph (DAG) is
a linear ordering of the nodes of the DAG
such that if there is a directed edge
from vertex i to vertex j then i appears
before j in the linear ordering [1].
The HTML files are published according to order
R. After each file, f, is published, all hyperlinks
pointing to f are modified to reflect f's Publius URL.
Finally a Publius Update operation is
performed on all files that were part
of a cycle in G.
Next: Publius content type
Up: Implementation issues
Previous: Publishing mutually hyperlinked documents
Avi Rubin
2000-06-13