The IETF, Reliable Multicast, and Distributed Simulation

J. Mark Pullen

Department of Computer Science and C3I Center

George Mason University

mpullen@gmu.edu

 

Keywords:

network standards, IETF, large scale multicast, RTI, reliable multicast

 

ABSTRACT: This paper continues a series dealing with the use of standard, open protocols from the Internet Protocol Suite (IPS) as a basis for distributed simulation networking. It sets forth the reasons why this approach is important for distributed simulation and describes the Internet Engineering Task Force (IETF), which is the premier forum in this area. Progress of the IETF’s Large Scale Multicast Applications working group is reported, showing that the group was successful in focusing parts of the IETF on distributed simulation. A related effort to create Reliable Multicast (RM) technology is reported, and its importance is explained. The paper concludes with a report on the approach to RM being developed by the author’s group, called the Selectively Reliable Transmission Protocol.

1. Introduction

This paper continues a series begun in the Distributed Interactive Simulation (DIS) workshops and continued under SIW, dealing with the use of standard, open protocols from the Internet Protocol Suite (IPS) as a basis for distributed simulation networking [PMB97]. DoD has found it effective to develop custom solutions such as the Defense Simulation Internet [PuWo95] and the Synthetic Theater of War runtime infrastructure (RTI) and network [WBVM97] that extend beyond the capabilities of commercial networks. However, we believe there is a long-term benefit in pursuing open, commercial standards, because systems built around these technologies will ultimately prove more cost-effective, stable, and supportable, since they share the growing commercial networking technical base.

At present there are significant shortfalls in the IPS, which were reported at the SIW last year [PMB97]. DoD’s continuing interest in using shared networks for distributed simulation can be met only in very constrained ways by currently available technology [BRO97]. It is therefore to the advantage of the Defense distributed simulation community to track progress in these technologies and to influence their development to meet distributed simulation needs to the largest extent possible. This paper provides an update on last year’s report concerning the Internet Engineering Task Force (IETF) Large Scale Multicast Applications working group, showing some success in influencing commercial technologies. It also describes our role in a new working group on Reliable Multicast that could produce technology with significant benefit to future High Level Architecture (HLA) Runtime Infrastructures (RTIs).

2. The IETF

The IETF was formed by the managers of the Federal Government Internet in the late 1980s to coordinate engineering concerns among the networks that constituted the Internet. Early in its existence the IETF became the focal point for coordination of, and agreement on, the Internet Protocol Suite (IPS). The IPS consists of all networking and related protocols proposed or agreed to for experimental, draft, or standardized use in the Internet. To facilitate this coordination, commercial providers of hardware and software for the Internet became part of the IETF. With commercialization of the Internet, the IETF has grown into a large, largely informal organization of mostly commercial technical personnel concerned with the interoperability of Internet technologies. It has a reputation for maintaining fairness and due process while sustaining momentum in moving new technologies forward as rapidly as possible.

The Communications Architecture and Security subgroup (CAS) of the DIS Working Group participated in developing the IEEE standard for DIS and developed the IEEE standard for DIS communications [IEEE95]. The latter calls for the use of Internet Protocol multicast (IPmc) as the standard protocol for support of large-scale distributed simulations. During its last year of existence, CAS began forming a relationship with the IETF, with the intention of influencing the IETF technology development process to achieve better networking support for DIS applications.

The IETF is organized as a collection of Working Groups under the coordination of volunteer Area Directors who collectively form the Internet Engineering Steering Group (IESG). For a topic to be taken up there are two firm requirements: (1) that multiple organizations are prepared to invest time developing the technology through a sequence of status steps (Proposed, Draft, and Internet Standard), and (2) that there is a commitment that the resulting standards become part of the open IPS. Documents known as "Internet Drafts" are the input to this process; a published standard or informational document is known, for historical reasons, as a "Request for Comments" (RFC).

The IETF meets three times yearly, but a great deal of work is accomplished over email between meetings. Working groups come and go as technology changes, and their charters are always finite, with "sunset" dates. Procedures tend to be informal, and the credo "we believe in rough consensus and running code" underscores the need to have interoperable implementations before standardization. For details see http://www.ietf.org.

3. The IETF LSMA Working Group

As a result of CAS involvement, the Large Scale Multicast Applications (LSMA) Working Group was formed within the IETF. The purpose of LSMA is to create a consensus-based requirement for Internet protocols that support distributed interactive simulations, whether based on the previous DIS protocol, on new applications developed under the HLA, or on related applications. LSMA applications are characterized by the need to distribute real-time application data over a shared wide-area network in a scalable manner. The number of hosts in this environment will range from a few to tens of thousands. They require the ability to interchange state data with sufficient reliability and timeliness to sustain a three-dimensional virtual visual environment containing large numbers of moving objects. The network supporting such a system will require an appropriate multicast transport implementation. Clearly, LSMA has an aggressive vision of the distributed simulation environment of the future.

LSMA set out to create two Internet Drafts, to be published as Informational RFCs. One would define the limitations of today’s IPS for LSMA; the other would provide representative scenarios so that the networking community could understand the intended applications of LSMA. The process of creating these drafts has been hampered somewhat by minimal support for the authors’ participation. However, both documents have been through multiple drafts; as of the December 1997 IETF meeting the limitations draft [PMB97a] was described as ready for consideration as an RFC, while the scenarios draft [SSM97] requires more work. In addition, due to the strong desire of the IESG for quantified requirements, a group from British Telecom was enlisted to create a "Requirements" draft, but found that they could only create a framework for specification of requirements [BBP97]. The latest versions of the drafts are available from ftp://ds.internic.net/internet-drafts.

SIW participants interested in networking should be aware of two facts about the LSMA working group:

4. Reliable Multicast

The IETF has a less-active twin called the Internet Research Task Force (IRTF). Considerable interest in "reliable multicast" (RM) has been raised by LSMA and several other groups, particularly from the area of Integrated Services, which combines voice, video, and other real-time applications with traditional data transfer. As a result the IRTF has formed an RM working group, which has met three times in the last year. The focus of RM is the ability to transfer some or all components of a message (e.g. a data file or a segment of a data flow) among the members of a multicast group, with a guarantee that they are delivered. Not all RM services handle real-time data; those that do inevitably transfer only a small amount of data reliably, or otherwise take advantage of some special characteristic of the data. A specific requirement on all IRTF RM work is to provide mechanisms to control the congestion arising from storms of ACKs or NACKs that can occur when data segments are lost (known as "ACK implosion").
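
One widely used way to damp such storms, found in protocols such as Scalable Reliable Multicast, is for receivers to delay their NACKs by a random interval and suppress them if another receiver asks for the same repair first. The Python sketch below illustrates the idea; the class, callback, and timer parameters are hypothetical and are not drawn from any specific IRTF proposal.

    import random
    import threading

    class NackSuppressor:
        """Delay each NACK by a random interval and cancel it if an
        equivalent NACK from another group member is heard first
        (illustrative sketch only)."""

        def __init__(self, send_nack, min_delay=0.05, max_delay=0.5):
            self.send_nack = send_nack      # callback that multicasts the NACK
            self.min_delay = min_delay
            self.max_delay = max_delay
            self.pending = {}               # sequence number -> threading.Timer

        def data_missing(self, seq):
            """Called when a gap is detected in the received sequence."""
            if seq in self.pending:
                return
            delay = random.uniform(self.min_delay, self.max_delay)
            timer = threading.Timer(delay, self._fire, args=(seq,))
            self.pending[seq] = timer
            timer.start()

        def overheard_nack(self, seq):
            """Called when another receiver's NACK for seq is seen; suppress ours."""
            timer = self.pending.pop(seq, None)
            if timer:
                timer.cancel()

        def repaired(self, seq):
            """Called when the missing data arrives (original or repair)."""
            self.overheard_nack(seq)

        def _fire(self, seq):
            self.pending.pop(seq, None)
            self.send_nack(seq)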

Reliable multicast is of considerable importance in limiting network capacity requirements for distributed simulation. Some attributes in the simulation data (such as object position) behave as a flow, where each new packet provides new state values; these can be transmitted on a best-effort basis. Other attributes, however, are rarely updated; Cohen [Cohe94] showed that it is more effective to transmit these reliably (that is, to have the protocol guarantee their delivery by retransmitting as needed). Under the HLA, provision is made for these attributes to be transmitted reliably. In a federation with a small number of geographic locations, this can be achieved by multiple reliable unicast (point-to-point) connections as in [BRO97]. When the number of locations expands as envisioned in LSMA, this will become impractical because it involves wide-area transmissions numbering roughly N² for N locations. Advances in RM will be needed, because at present RM for large groups is impractical. The goal for distributed simulation should be standardized RM protocols that allow standard software components to take over a role that now requires custom-generated software in any RTI. This will happen when appropriate IRTF results move to the IETF to become standards for implementation.
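
As a minimal illustration of this attribute-splitting idea, the Python sketch below maps attributes to transport classes in the spirit of Cohen's observation: flow-like state is sent best-effort, while rarely updated state is sent reliably. The attribute names are hypothetical and do not correspond to any actual HLA object model.

    # Hypothetical classification of simulation attributes into transport
    # classes: flow-like state is sent best-effort because a later update
    # supersedes a lost one; rarely updated state is sent reliably because
    # a loss would otherwise never be repaired.
    BEST_EFFORT = "best-effort"   # e.g. plain UDP/IP multicast
    RELIABLE = "reliable"         # e.g. reliable multicast or a unicast mesh

    ATTRIBUTE_CLASS = {
        "position": BEST_EFFORT,
        "velocity": BEST_EFFORT,
        "marking": RELIABLE,
        "damage_state": RELIABLE,
    }

    def transport_for(attribute_name):
        """Choose a transport class for an attribute update (sketch only)."""
        return ATTRIBUTE_CLASS.get(attribute_name, BEST_EFFORT)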

At George Mason University (GMU) we have been developing an RM protocol for distributed simulation, called the Selectively Reliable Transmission Protocol (SRTP) [PuLa95]. SRTP formalizes the capability to mix best-effort and reliable transmission within a single multicast group, taking advantage of the larger amount of best-effort traffic to carry a sequence number that shows when a new reliable transmission has been made. A particular advantage for distributed simulation is that only the last value of any attribute need be saved for repairs when a NACK is received. Combined with several other techniques described in [PuLa97], this can reduce the ACK/NACK problem to a manageable scale. We have a working implementation of SRTP and have used it to create an experimental "Light Weight RTI" under the HLA [PML97].
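
The Python sketch below restates that dual-mode mechanism: best-effort packets piggyback the sequence number of the most recent reliable transmission, and the sender keeps only the latest reliably sent value per attribute so that a NACK can be answered from a small repair cache. It is an illustration of the idea described above, not the actual SRTP implementation; the message layout and interface names are assumptions, and the congestion-control techniques of [PuLa97] are omitted.

    class SelectivelyReliableSender:
        """Illustrative sketch of dual-mode (best-effort plus reliable)
        multicast sending; not the actual SRTP implementation."""

        def __init__(self, multicast_send):
            self.multicast_send = multicast_send  # callback: dict -> None
            self.reliable_seq = 0                 # latest reliable sequence number
            self.repair_cache = {}                # attribute -> (seq, value)

        def send_best_effort(self, attribute, value):
            # Flow-type state: never retransmitted, but it advertises the
            # current reliable sequence number so receivers can detect a
            # missed reliable update and issue a NACK.
            self.multicast_send({"mode": "best-effort",
                                 "attr": attribute,
                                 "value": value,
                                 "latest_reliable_seq": self.reliable_seq})

        def send_reliable(self, attribute, value):
            # Rarely updated state: advance the sequence number and remember
            # only the newest value, which is all that is needed for repairs.
            self.reliable_seq += 1
            self.repair_cache[attribute] = (self.reliable_seq, value)
            self.multicast_send({"mode": "reliable",
                                 "seq": self.reliable_seq,
                                 "attr": attribute,
                                 "value": value})

        def on_nack(self, attribute):
            # Repair by resending the last reliable value of the attribute.
            cached = self.repair_cache.get(attribute)
            if cached:
                seq, value = cached
                self.multicast_send({"mode": "repair",
                                     "seq": seq,
                                     "attr": attribute,
                                     "value": value})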

A new idea we put forward in [PuLa97] is Hop-Hierarchical Multicast Logging (HHML), inspired by [HSC95]. The basic idea is that the simulation host computers on each LAN dynamically select a "logger" to provide repairs when other hosts detect data loss and send a NACK. The LAN logger then dynamically interacts with other wide-area loggers to form a hierarchy of loggers. Using the "time to live" (actually, hops-to-live) feature of IPmc, the loggers localize repairs and minimize NACK traffic. We believe our approach is unique in that it provides for dynamic selection and replacement of loggers and can be scaled to the LSMA distributed simulation environment.
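
To make the hop-based localization concrete, the Python fragment below sketches how a LAN-level logger might answer local NACKs with a multicast TTL of one hop, so repairs never leave the LAN, and escalate to a wider-scope logger only when it lacks the data. The socket options shown are the standard IP multicast TTL controls; the logger election and hierarchy formation of HHML are omitted, and the class, group, and callback names are hypothetical.

    import socket
    import struct

    LAN_TTL = 1      # repairs sent with TTL 1 never leave the local LAN

    def make_repair_socket(ttl):
        """UDP socket whose multicast packets travel at most `ttl` hops."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        return sock

    class LanLogger:
        """Sketch of a LAN-level logger: answer local NACKs from its log,
        otherwise pass the NACK up toward a wider-scope logger."""

        def __init__(self, group, port, upstream_nack):
            self.group = group                  # multicast group address (string)
            self.port = port
            self.upstream_nack = upstream_nack  # callback toward the parent logger
            self.log = {}                       # sequence number -> payload bytes
            self.lan_sock = make_repair_socket(LAN_TTL)

        def record(self, seq, payload):
            """Log every reliable payload seen on this LAN."""
            self.log[seq] = payload

        def on_local_nack(self, seq):
            payload = self.log.get(seq)
            if payload is not None:
                # Repair locally; a TTL of 1 keeps the retransmission off the WAN.
                self.lan_sock.sendto(payload, (self.group, self.port))
            else:
                # Escalate to the next logger in the hierarchy.
                self.upstream_nack(seq)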

When we originally began work on SRTP, we used a simulation system called the Networking Workbench, also developed at GMU, to validate the basic design [PuLa96]. We have now completed a Networking Workbench implementation of HHML. Initial simulations have validated the ability to dynamically form logger hierarchies and localize repairs for RM, even in the presence of high packet loss. We are continuing to develop SRTP within the context of the IRTF RM Working Group and expect to produce a working implementation of SRTP with HHML in the next few months.

5. Summary

The technical and economic leverage available from using commercial networking technology dictates that the distributed simulation community track progress in this area and, wherever possible, influence it to meet our needs. The Internet Engineering Task Force (IETF) and Internet Research Task Force (IRTF) represent key forums for this purpose; they are developing the next generations of today’s dominant commercial networking technology. The IETF Large Scale Multicast Applications (LSMA) group is a case in point. It has focused technology developers on the future needs of distributed simulation, with good results.

A particularly important technology to watch is Reliable Multicast (RM), which can dramatically reduce network capacity requirements in large-scale shared networks by reducing the amount of data that must be either transmitted best-effort or transmitted in many copies via reliable unicast. GMU has developed the Selectively Reliable Transmission Protocol (SRTP) and is now adding a capability to improve its scalability. Called Hop-Hierarchical Multicast Logging (HHML), it is being implemented in the context of the IRTF RM working group’s efforts to achieve congestion control in reliable multicast.

6. References

[BBP97] Bagnall, P., R. Briscoe, and A. Poppitt, "Taxonomy of Communication Requirements for Large-scale Multicast Applications", Internet Engineering Task Force Large Scale Multicast Applications Working Group, December 1997, draft-ietf-lsma-requirements-01.txt, work in progress

[BRO97] Boyle, J., D. Roland and C. O’Donnell, "Planned DSI Support of RSVP and Multicast IP", Simulation Interoperability Workshop, Orlando, Florida, March 1997

[Cohe94] Cohen, D., "Back to Basics", 11th Workshop on Standards for Distributed Interactive Simulation, Orlando, Florida, September 1994

[HSC95] Holbrook, H. W., S. K. Singhal and D. R. Cheriton, "Log-Based Receiver-Reliable Multicast for Distributed Interactive Simulation", Proceedings of ACM SIGCOMM '95, August 1995

[IEEE95] Institute of Electrical and Electronics Engineers, Inc., IEEE 1278.2-1995, Standard for Distributed Interactive Simulation - Communication Services and Profiles

[PMB97] Pullen, J., M. Myjak and C. Bouwens, "Limitations of The Internet Protocol Suite for Distributed Simulation in the Large Multicast Environment", Simulation Interoperability Workshop, Orlando, Florida, March 1997

[PMB97a] Pullen, J., M. Myjak and C. Bouwens, "Limitations of The Internet Protocol Suite for Distributed Simulation in the Large Multicast Environment", Internet Engineering Task Force Large Scale Multicast Applications Working Group, December 1997, draft-ietf-lsma-limitations-02.txt, work in progress

[PuLa95] Pullen, J. and V. Laviano, "A Selectively Reliable Transport Protocol for Distributed Interactive Simulation", 13th Workshop on Standards for the Interoperability of Distributed Simulations, Orlando, Florida, September 1995

[PuLa96] Pullen, J. and V. Laviano, "Prototyping the Selectively Reliable Transport Protocol", 14th Workshop on Standards for the Interoperability of Distributed Simulations, Orlando, Florida, March 1996

[PuLa97] Pullen, J. and V. Laviano, "Adding Congestion Control To The Selectively Reliable Transmission Protocol For Large-Scale Distributed Simulation", Simulation Interoperability Workshop, Orlando, Florida, September 1997

[PML97] Pullen, J., M. Moreau, and V. Laviano, "Creating A Light-Weight RTI As An Evolution Of Dual Mode Multicast Using Selectively Reliable Transmission", Simulation Interoperability Workshop, Orlando, Florida, September 1997

[PuWo95] Pullen, J. M. and D. Wood, "Networking Technology and DIS", Proceedings of the IEEE, August 1995

[SSM97] Seidensticker, S., W. Smith, and M. Myjak, "Scenarios and Appropriate Protocols for Distributed Interactive Simulation", Internet Engineering Task Force Large Scale Multicast Applications Working Group, March 1997, draft-ietf-lsma-scenarios-01.txt, work in progress

[WBVM97] Wolfson, H., S. Boswell, D. Van Hook, and S. McGarry, "Reliable Multicast in the STOW RTI Prototype", Simulation Interoperability Workshop, Orlando, Florida, March 1997