9.2 Lessons Learned
In this section, we point out some of the lessons learned during
and after the deployment of the system.
These points may help in the design, development, and deployment
of other route monitoring systems.
- New tools reveal new failure modes.
The LSAG has allowed us to find and fix several
problems proactively.
Some of these problems (e.g., the refresh LSA bug)
would have been impossible to find
with other network management tools.
- Real-time alerting and off-line analysis
are complementary.
Some problems, such as a router bug, were caught by real-time messages,
whereas others, such as excessive duplicate LSA traffic,
were caught through off-line analysis of LSAs stored
over long time intervals.
Finally, problems such as the refresh LSA bug were identified using
LSAG messages in real time, but they also required
a more detailed off-line analysis.
- OSPF exhibits a significant amount of activity.
In our experience, both of the monitored networks
exhibit a significant amount of OSPF activity.
This activity is due to maintenance tasks as well as network problems.
The efficient and scalable design of the system has
helped us handle this high level of activity with relative
ease.
- Add functionality incrementally.
We have added new functionality and improved the system
through close interaction with network operators.
At one level, this pertained to the user interfaces.
For example, it took several iterations
until the operators were satisfied with the LSAG message formats
and could make sense of the associated logs at a glance.
At another level, it was important to add value
by building custom reports that reflected operational practices.
- Archive all the LSAs.
The analysis of excessive duplicate LSAs and of the refresh LSA bug
required archiving all the LSAs captured from the network,
not just those that indicated topology changes.
The volume of all OSPF LSAs is not onerous.
As seen in Table I,
the volume of raw LSAs collected from each
of the two networks is on the order of 10 MB per day.
This makes it fairly easy to collect all the LSAs from the
network, store them for a long period, and
transfer and replicate the archives as needed.
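
To illustrate the kind of off-line analysis that a complete archive makes possible, the following is a minimal Python sketch that walks archived LSAs in capture order and counts how many are new, changed, refreshed, duplicated, or stale. It is not the system's implementation. The LsaRecord layout, the precomputed content digest, and the classify_lsas name are assumptions made for this example, and sequence-number wraparound is ignored.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class LsaRecord(NamedTuple):
    """One archived LSA (hypothetical record layout, not the system's format)."""
    timestamp: float   # capture time in seconds
    ls_type: int       # OSPF LS type
    ls_id: str         # link-state ID
    adv_router: str    # advertising router ID
    seq: int           # LS sequence number, assumed decoded as a signed integer
    digest: str        # hash of the LSA contents (assumed precomputed)


def classify_lsas(records: Iterable[LsaRecord]) -> Counter:
    """Count new, change, refresh, duplicate, and stale LSAs in capture order."""
    # (ls_type, ls_id, adv_router) -> (seq, digest) of the newest instance seen
    latest = {}
    counts = Counter()
    for rec in records:
        key = (rec.ls_type, rec.ls_id, rec.adv_router)
        prev = latest.get(key)
        if prev is None:
            kind = "new"          # first instance of this LSA in the archive
        elif rec.seq == prev[0]:
            kind = "duplicate"    # same instance received again
        elif rec.seq < prev[0]:
            kind = "stale"        # older instance than the one already seen
        elif rec.digest == prev[1]:
            kind = "refresh"      # newer sequence number, unchanged contents
        else:
            kind = "change"       # newer sequence number, different contents
        if prev is None or rec.seq > prev[0]:
            latest[key] = (rec.seq, rec.digest)
        counts[kind] += 1
    return counts
```

A high ratio of duplicate to change and refresh counts over a long interval would be one signal worth investigating. Note that refresh and duplicate LSAs can only be counted if they were archived in the first place. As a rough storage estimate, at about 10 MB per day a year of raw LSAs from one network amounts to roughly 3.65 GB.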
aman shaikh
2004-02-07