USENIX ;login: - Summaries

2nd Conference on Domain-Specific Languages (DSL '99)
October 3—6, 1999
AUSTIN, TEXAS

These reports were originally published in the February 2000 issue of ;login:.

Refereed Papers Track

Session: Testing and Experience Reports

Hot Research Review

Invited Talks

Our thanks to the summarizer:
Kimberly A. Knowles

REFEREED PAPERS

Session: Testing and Experience Reports

Using Production Grammars in Software Testing
Emin Gün Sirer and Brian N. Bershad, University of Washington

Sirer's work is motivated by the desire to test and debug a Java virtual machine. Known testing techniques include formal methods; manual code analysis; manual test-case generation; and a technique favored perhaps too often, "release the system and wait." The goal for this project was to provide automated test generation via some sort of specialized scripting language. Sirer's system, a language called lava, enables users to use a production grammar to produce test cases. The syntax of lava is very similar to regular YACC-style parsing grammars, except that the grammar is used in reverse, to generate test inputs instead of parsing them. Lava has provisions for specifying limits and weights on productions, as well as a means of generating context-sensitive output. The output of the tool is a set of test cases that may vary widely. These test cases may be used to detect "gross violations of type safety," compute time complexity, and verify the correctness of code transformations.

One problem Sirer had to address was the "oracle problem": because the test cases may be complex, it is unclear what the correct output should be, compared to the output the system computes. While comparative methods are sometimes feasible, as with testing a Java virtual machine, the oracle problem remains for systems that have no other implementation available for comparison. To address this, Sirer has developed an auxiliary tool that generates lambda expressions that reflect the computation desired; as rules in lava are traversed, the lambda expressions are composed. The result is that when a test case is completed, the production rules have also generated a lambda expression in Scheme that reflects the analogous computation. Thus, the output of a JVM (for example) can be compared to the output of a Scheme interpreter when the corresponding lambda expression is evaluated.

This project has two major contributions: on one hand, test generation is made simple with lava, with very high test coverage and control over the types of tests; the other contribution is the integration of the test generation with a description of the expected behavior.

Jargons for Domain Engineering
Lloyd H. Nakatani, Mark A. Ardis, Robert G. Olsen, and Paul M. Pontrelli, Bell Laboratories, Lucent Technologies

This project is a study of the use of jargons, a family of DSLs that share a common syntax and customizable interpreter, in real-world domain-engineering problems. A model expressed in a jargon can be easily transformed into a multiplicity of related products -- for instance, C code and its documentation. Because making jargons does not entail designing their syntax or writing their interpreter from scratch, they are very easy to make. In fact, they are so easy to make that domain experts with no language-design experience can easily make jargons for their own use (therefore called "Do It Yourself" jargons). To test this claim, Nakatani et al. had teams of domain experts working within the FAST domain-engineering process make their own jargons. Each team had access to a consultant who was a jargon expert. Each team successfully made the jargons they needed, confirming that jargons are easily made by domain experts who are not experts in language design and implementation.

Jargons share all the benefits of conventional DSLs — including domain-specific expressiveness, high-level abstraction, and evolution — while avoiding the pitfalls of DSLs such as high language-development and maintenance costs. In addition, because jargons share a common syntax and interpreter, they are easily composed as long as care is taken to avoid keyword conflicts. Composability means that a complex problem can be broken up into simpler subproblems, each subproblem modeled in its own jargon, and the models merged to express a model for the entire problem. Jargons manage thereby to avoid the Tower of Babel syndrome that would otherwise be a consequence of the proliferation of DSLs.

Slicing Spreadsheets: An Integrated Methodology for Spreadsheet Testing and Debugging
James Reichwein, Gregg Rothermel, and Margaret Burnett, Oregon State University

The goal of this project was to provide an easy way for spreadsheet users to debug spreadsheets. In doing so, the authors had to take into account that not only are spreadsheets modelless and offer immediate feedback, but the spreadsheet "developers" are unlikely to understand testing and debugging theory and methodologies. Therefore, they wanted a system to lead the user through debugging and fault localization that would be easy to use. A previous system allowed users to identify correct output of cells when inputs were entered, and it marked untested cells in red. This paper adds the idea of fault localization, which uses a backward dynamic slice to trace user-determined errors back to cells in the spreadsheet that could be causing the error. Because all of the feedback to the user is visual, the interface is easy to use. Likelihoods for each cell are computed on the basis of heuristics and shaded accordingly to tell the user which cells have the highest likelihood of error. This work seems very valuable; testing and debugging support for spreadsheets seems very sparse, yet studies indicate that a high percentage of spreadsheets used for business decision-making contain faults.

HOT RESEARCH REVIEW

DSLs for Programming and Security in Active Networks
Carl A. Gunter, University of Pennsylvania

Carl Gunter presented his talk in two parts: programming active networks, and security.

He noted first that whenever he talks about domain-specific languages, the first question everyone asks him is, "Why not just implement another library?" Library extensions of a general-purpose language (GPL) are useful because they give a lot of information about and control over resources. However, the downside is that using GPLs incurs a significant cost in terms of complexity, such as having to explicitly handle memory allocation, lack of mechanisms for resource restriction, and difficulty of detecting termination.

Active networks were proposed in 1997 by Tennenhouse et al. The idea is that in a network, packets could be considered as programs requesting some way to be treated. For example, a packet could arrive at a router with an instruction for multicast; thus, instead of the typical action of passing the packet on, the router could interpret the instruction for multicast and replicate the packet on outgoing lines. This could reduce network traffic, since a Web page that was very popular could be served in one packet requesting multicast, instead of one packet per request. There are many different levels of network programming: a program could be installed at the routers; the packet could carry the program; or the program could be a "switchlet" — a combination whereby the packet provides input instructions and the switchlet, resident on the router, interprets those instructions. One application of switchlets is the Queue Management Switchlet (Hicks et al., 1999), which implements Flow-Based Adaptive Routing (FBAR).

This research is being carried out in the SwitchWare project at the University of Pennsylvania. SwitchWare uses a three-layer architecture: the top active packet layer, a service layer, and an OS layer. While the service layer is written in a GPL, the active packet layer uses a DSL called PLAN. PLAN is a scripting language for composing active-network services. PLAN is declarative and functional but has no looping mechanism. The routers are protected from malicious code by typechecking, and the network is protected by resource limits on the code. The typechecking is Anytime Typechecking!™: types can be checked either statically or dynamically.

Considerable security problems need to addressed in implementing active networks. Already there exist security problems in just researching active-network technology. SRI runs an active network testbed with machines hosted at different sites across the country. Each site wants to determine its own security policy, but they must all agree on a security protocol. For authentication, a public-key system is preferable but requires a public-key infrastructure (PKI) to establish trust, handle certificates, and express and check authorizations. There are several DSLs for policies; one, the Query Certificate Manager (QCM), achieves general policies, transparent policy distribution, diverse distribution strategies, local autonomy, and a formal model. Thus, in this system, policy-directed certificate retrieval is possible. However, sometimes it becomes necessary to revoke a certificate. For this, Certificate Revocation Lists (CRLs) must also be maintained and distributed. Fox and Lamaccia (1998) have studied CRLs.

In this talk, we saw how DSLs can be used in multiple ways in emerging technologies. Further questions are:

When is a DSL architecture advantageous?

Is there a high overhead in introducing a DSL?

Is there general support for DSL design, development, testing, and deployment?

INVITED TALK

Language Technology for Performance and Security, or, Making Life Better, Not Just Easier
Peter Lee, Carnegie Mellon University and Cedilla Systems Incorporated

This research is motivated by a focus on safety-critical systems, which are those systems in which the cost of failure is "unacceptably high" — for example, those used by the space program or for airplane-guidance systems. These systems are everywhere, used all the time. Such systems typically are required never to crash; always meet deadlines; be reconfigurable without the need to shut down; and be secure, trustworthy, lightweight, extensible, and adaptable. With time, safety-critical systems won't disappear; instead, they will be more integrated with daily activities. Programming-language technology will provide the technology for safety, because the same characteristics that make languages easy to program in can also make them easy to reason about.

The idea of proof-carrying code is to provide easy access to remote resources while maintaining invariants, protecting the key, and matching the allowed behavior (according to the resource protector) with what could happen while running a piece of client code. The idea is that the host publishes rules about what is allowed and a set of verification conditions. The verification conditions can be used to produce a proof checker and a proof generator. Anyone who wants to use the resource uses the verification conditions to produce a proof generator, then feeds the program in to get a proof. The program then carries the proof (say, as part of its header) to the resource. If the untrusted client has used anything but the correct proof generator or verification conditions, the proof checker at the host will detect it and reject the program.

DSLs can provide safety policies specific to the domain. Once we have and can manipulate safety policies, we can use them to generate certificates. Consider a certifying compiler, in which the compiler "explains" why the target code it produces preserves the safety properties of the source. Then, by certifying the compiler and the source, we can conclude that the target is safe. However, instead of certifying the compiler, which would be the equivalent of doing a formal proof of correctness, the strategy instead is to have the compiler provide annotations in the output to allow the theorem prover to reconstruct the translation process. This is what Lee's implementation of a certifying compiler does. The compiler generates optimized code from Java source code. The compiler outputs both code with annotations and a proof, which can be verified by a theorem-prover on the host side, given the hints in the annotations. The compiler is mostly off-the-shelf; the theorem-prover is hidden, and the binaries are in standard format.

Thus, the use of DSLs allows for reasoning to be done about code that is optimized and in standard format, thereby making it easier and more efficient to verify and run untrusted code.

2nd Conference on Domain-Specific Languages (DSL '99) October 3—6, 1999 AUSTIN, TEXAS

REFEREED PAPERS

HOT RESEARCH REVIEW

INVITED TALK

2nd Conference on Domain-Specific Languages (DSL '99)
October 3—6, 1999
AUSTIN, TEXAS