Next: Benefits of the approach
Up: DSL Implementation Using Staging
Previous: Intensional analysis of code
The important issues of efficient language implementation by refinement from high-level
specifications are: the efficient use of the underlying target environment, and removing the
layer of interpretative computation introduced by such specifications. We have shown that
monads and staging are the right abstraction mechanisms to accomplish the task. To
effectively use these tools we propose that DSL implementers follow a well defined
method. We reiterate our method here:
- Domain analysis. The problem domain is analyzed to find the common
abstractions around which the language is designed. This step is perhaps the most
important step in a good language design. It has been studied extensively by
others [32, 2, 3]. Our research group has been
investigating the integration of DSL design and domain analysis for several years.
Recently Widen and Hook have summarized a ``top level" view of this integration,
which is called the Software Design Automation (SDA) method [33]. This
method provides a design process and many synthesis techniques to facilitate the
integration of traditional domain analysis activities with language design and
implementation. The method we propose can be used in the context of SDA. It
specifically addresses the language implementation phase of the process.
- Definitional interpreter. Once the language has been identified, the next
step is to provide it with a semantics given as a pure functional interpreter. This program
can be thought of as its high-level definition [14, 25]. high-level
interpreters are usually easy to construct and provide a reference which can be
consulted to resolve any ambiguity in the language specification discovered in further
steps. By building it in an executable framework (a functional language, such as
Haskell or ML) it also provides a rapid prototype against which expectations can be
measured.
- Binding time improvements. The next step requires a binding
separation [8]. By identifying compile-time versus run-time data
structures in the definitional interpreter, we can separate those with both components
into separate data-structures. Examples of binding time improvements include the separation of
environments, which map names to values, into a compile-time index and a run-time stack, and
the introduction of a local recursive function to separate the recursion which drives the
analysis of the syntax of the program being interpreted from the recursion that encodes the
looping of the while command.
- Target domain analysis. The next step is to analyze the target language to
identify the primitive implementation features that will support the translation. This
step is usually straight-forward as the target language is often fixed, and well
understood.
- Design a monad. The next step is to design a monad to capture the effects and
actions implicit in the target language. This is a hard step in the process since it requires
both abstract knowledge about the structure and properties of monads, and detailed concrete
knowledge about the target domain. The choices made in this step influence the structure of the
monad, the structure of the monadic interpreter, and the run-time system which interacts with
the low-level effects of the target language.
Once the monad is designed, an implementation for the monad as a pure functional
emulation must be produced. The implementation must emulate the actions in a purely
functional setting by explicitly threading abstract representations of the actions such
as ``stores'', ``I/O streams'', or ``exception continuations'' in and out of all
computations.
- Monadic Interpreter. The next step is to refine the purely functional
definitional interpreter into one written in a monadic
style [28, 24, 13]. This implementation is still purely functional
because the actions of the monad are emulated in a functional style. But because the actions
are now explicit, we have moved the form of definition closer to the target language. This step
often requires a big change to the structure of the source code, because the monad makes
implicit much of the ``plumbing" explicit in the interpreter. The cost of
this restructuring is not without benefit. The removal of the explicit
plumbing results in programs which are simpler, and more immune to future changes.
- Staging. The next step completes the binding-time separation begun in the binding
time improvement step. That step separated the compile-time data from the run-time data.
Staging separates the compile-time computations from the run-time computations. This is
done by placing explicit staging annotations in the program written in METAML. Staging is the
crucial step that differentiates an (inefficient) interpreter from an (efficient) compiler.
- Transformation of residual code.
The residual object-program produced by a staged interpreter is both a data structure that can
be manipulated, and a program that can be run. Control over the form of the residual program is
crucial here. The residual program is always an ML program (ML is the object language). But the
user can control the form of this ML program. A goal of the translation is to make the object
program use only those ML features directly supported by the target language. The restricted
form of the residual object program make it possible to use the intensional analysis of
object-code tools provided by MetaML to easily build the final translation step to the target
language.
Next: Benefits of the approach
Up: DSL Implementation Using Staging
Previous: Intensional analysis of code
Zine-El-abidine Benaissa
Wed Jul 21 11:46:59 PDT 1999