INTERNET DRAFT                                    J.M.Pullen
Expiration: 7 May 1998                            George Mason U.
                                                  Lava K. Lavu
                                                  George Mason U.
                                                  Hai Nguyen
                                                  ESystems Falls Church
                                                  Eric Crawley
                                                  BayNetworks
                                                  7 December 1997

         A Simulation of QOSPF Multicasting for a Large Area

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

This document describes a detailed simulation model of a sizable IPmc/RSVP network using the Quality of Service Extensions to OSPF and the performance predictions produced by the model. The model was developed using the OPNET simulation package with procedures defined in the C language. The model was developed to allow investigation of scaling characteristics of QoS routing by the Internet multicast/resource reservation community. We are making our model publicly available for this purpose.

This Internet-Draft is intended to form the basis for an Informational RFC.

1. Background

The purpose of this document is to describe a simulation model designed to help in understanding the scaling characteristics of the QOSPF protocol in networks of significant size.

The successful deployment of IP multicasting [1] and its availability in the Mbone have led to a continuing increase in real-time multimedia Internet applications.
Because the Internet has traditionally supported only a best-effort quality of service, there is considerable interest in creating mechanisms that will allow adequate resources to be reserved in networks using the Internet protocol suite, such that the quality of real-time traffic such as video and voice can be sustained at specified levels. The RSVP protocol [2] has been developed for this purpose and is the subject of considerable ongoing implementation efforts.

RSVP does not provide routing, but relies on routing protocols to be available in its working environment. One school of thought argues that, to be effective, this routing must be aware of the quality of service (QOS) capabilities of the network components through which the RSVP paths and reservations are to be routed. A proposal has been put forward for QOS-sensitive routing (QOSPF) [3] based on the well-known OSPF routing protocol [4] and its multicast derivative, MOSPF [5]. However, serious questions have been raised about the scalability of such a protocol.

The simulation described in this document is intended to provide a tool to examine the behavior of a sizable QOSPF-routed network with IPmc and RSVP, handling large numbers of resource-reserved, real-time multicast applications. A companion paper describes the simulation models for IPmc and RSVP that support the QOSPF simulation. Each of these models is available for use by the IETF community.

2. The OPNET Simulation Environment

The Optimized Network Engineering Tools (OPNET) [6] is a commercial simulation product of the MIL3 Company, Arlington, VA. It employs a Discrete Event Simulation approach that allows large numbers of closely-spaced events in a sizable network to be represented accurately and efficiently. OPNET uses a modeling approach where networks are built of components interconnected by perfect links (which can be degraded at will). Each component's behavior is modeled as a state-transition diagram.
The process that takes place in each state is described by a program in the C language. We believe this makes the OPNET-based models relatively easy to port to other modeling environments. Perhaps more importantly, given the widespread availability of OPNET, it makes them sufficiently efficient that an extended period of network behavior can be simulated in considerable detail, even for large networks. The following sections describe the state-transition models and process code for the QOSPF model we have created using OPNET.

3. QOSPF Model

The state-transition diagrams for the QOSPF model can be found at http://bacon.gmu.edu/qosip/qospf. The following processing takes place in the indicated modules.

3.1 init

This state initializes all the router variables. Default transition to idle state.

3.2 idle

This state has several transitions. If a packet arrives, it transitions to the arr state. Depending on the interrupts received, it transitions to the BCOspfLsa, BCQospfLsa, BCMospfLsa, or hello_pks state. In future versions, links coming up or down will also cause a transition.

3.3 BCOspfLsa

The transition to this state from the idle state is executed whenever the condition send_ospf_lsa is true, which happens when the network is being initialized and when ospf_lsa_refresh_timout occurs. This state creates Router, Network, and Summary Link State Advertisements and packs all of them into a Link State Update Packet. The Link State Update Packet is sent to the IP layer with a destination address of AllSPFRouters.

3.4 BCQospfLsa

The transition to this state from the idle state is executed whenever the condition send_qospf_lsa is true. This state creates a Link Resource Advertisement and a Resource Reservation Advertisement and packs them into a Qospf Link State Update Packet. This Qospf Link State Update Packet is sent to the IP layer with a destination address of AllSPFRouters.

3.5 BCMospfLsa

The transition to this state from the idle state is executed whenever the condition send_mospf_lsa is true.
This state creates Group Membership Link State Advertisements and packs them into a Mospf Link State Update Packet. This Mospf Link State Update Packet is sent to the IP layer with a destination address of AllSPFRouters.

3.6 arr

The arr state checks the type of packet that is received upon a packet arrival. It calls the following functions depending on the protocol Id of the packet received.

a. QospfPkPro: Depending on the type of QOSPF/OSPF/MOSPF packet received, this function calls one of the following functions.

   1. HelloPk_pro: This function is called whenever a hello packet is received. It updates the router's neighbor information, which is later used when sending the different LSAs.

   2. OspfLsUpdatePk_pro: This function is called when an OSPF LSA update packet is received (router LSA, network LSA, or summary LSA). If the router is an Area Border Router, or if the LSA belongs to the area whose Area Id is the router's Area Id, the Link State database is searched to determine whether this LSA already exists. If it exists and the existing LSA's LS Sequence Number is less than the received LSA's LS Sequence Number, the existing LSA is replaced with the received one. The function processes the Network LSA only if the router is a designated router or an Area Border Router. It processes the Summary LSA only if the router is an Area Border Router. The function also turns on the trigger ospfspfcalc, which is the condition for the transition from the arr state to ospfspfcalc.

   3. MospfLsUpdatePk_pro: This function is called when a MOSPF LSA update packet is received. It updates the group membership link state database of the router.

   4. QospfLsUpdatePk_pro: This function is called when a QOSPF LSA update packet is received. It updates the resource link state database of the router.

b. RsvpPkPro: This function is invoked whenever a packet is received by the arr state from the RSVP daemon.
RSVP sends a packet to the QOSPF daemon whenever the RSVP daemon receives an initial path message, or a reservation for a source is successful. This function calls one of the following two functions depending on the type of packet received by the QOSPF arr state.

   1. Path_Msg_pro: This function gets the source and destination information and the sender transmission specs, and turns the send_qospf_lsa trigger on. send_qospf_lsa is used to send the Qospf LSA update packets, thereby sending the Resource Reservation Advertisement with the new information.

   2. Resv_Msg_Pro: This function gets the reserved-resources information from the TCSB of the RSVP daemon and turns the send_qospf_lsa trigger on.

3.7 qospfspfcalc

This state is used to calculate the QOSPF routing table. Resource LSAs are used to discover the neighbors, and Resource Reservation Advertisements (RRAs) are used to check for the available resources on the link. This state transitions to upstr_node on the detupstrnode condition. Only topology changes (indicated by router LSA, network LSA, resource LSA) trigger recalculation of all flows; other changes (summary LSA, group change, and RRA/DABRA) cause recalculation of the affected entries only.

a. QospfCandidateAddPro: Each vertex's neighbors are checked for inclusion in the candidate list by examining the Resource-LSA. If the existing reservation for the flow (for this source-destination pair) or the available bandwidth on this link meets the QOS requirements of the flow, then the other end of the link is considered for inclusion in the candidate list. The delay from the source to this vertex (the other end of the link) is calculated, and if this vertex is not on the candidate list it is added to the candidate list. Route pinning is used: when adding the vertex, if the parent of the vertex has a reservation for the flow, the vertex is marked reserved.

b. QospfSPFTreeCalc: While the candidate list is not empty, the candidate that is closest to the root is deleted from the list and added to the shortest path tree, and the Resource-LSA of this candidate is used to check for possible inclusion of the other ends of its links in the candidate list. A vertex marked reserved is chosen first in building the shortest path tree.

c. QospfRouteTableCalc: The route table is calculated using the shortest path tree information obtained from the shortest path tree database. The IP layer uses this information to route the QOS flows.

3.8 hello_pks

Hello packets are created and sent with a destination address of AllSPFRouters. Default transition to idle state.

3.9 mospfspfcalc

The following functions are used to calculate the shortest path tree and routing table. This state transitions to upstr_node upon the detupstrnode condition.

a. CandListInit: Depending upon the SourceNet of the datagram, the candidate lists are initialized.

b. MospfCandAddPro: The vertex link is examined, and if the other end of the link is not a stub network and is not already in the candidate list, it is added to the candidate list after calculating the cost to that vertex. If the other end of the link is already on the shortest path tree and the calculated cost is less than the one shown in the shortest path tree entry, the shortest path tree is updated to show the calculated cost.

c. MospfSPFTreeCalc: The vertex in the candidate list that is closest to the root is added to the shortest path tree, and its links are considered for possible inclusion in the candidate list.

d. MCRoutetableCalc: The multicast routing table is calculated using the information in the MOSPF shortest path tree.

3.10 ospfspfcalc

The following functions are used in this state to calculate the shortest path tree and, using this information, the routing table. The state transitions to the qospfspfcalc state on the qospfcalc condition, which is set to one after all functions in the state have been processed.

a. OspfCandidateAddPro: This function initializes the candidate list by examining the link state advertisement of the router. For each link in this advertisement, if the other end of the link is a router or transit network and is not already in the shortest path tree, the distance between these vertices is calculated. If the other end of this link is not already on the candidate list, or if the distance calculated is less than the value that appears for this other end, the other end of the link is added to the candidate list.

b. OspfSPTreeBuild: This function pulls the vertex that is closest to the root from the candidate list and adds it to the shortest path tree. In doing so it deletes the vertex from the candidate list. The function repeats this until the candidate list is empty.

c. OspfStubLinkPro: In this procedure the stub networks are added to the shortest path tree.

d. OspfSummaryLinkPro: If the router is an Area Border Router, the summary links that it has received are examined. The route to the Area Border Router advertising each summary LSA is looked up in the routing table. If one is found, the routing table is updated by adding the route to the network specified in the summary LSA. The cost of this route is the sum of the cost to the advertising Area Border Router and the cost to reach the network from that Area Border Router.

e. RoutingTableCalc: This function updates the routing table by examining the shortest path tree data structure.

3.11 upstr_node

This state does not do anything in the present model. It transitions to the DABRA state.

3.12 DABRA

If the router is an Area Border Router and the area is the source area, then a DABRA message is constructed and sent to all the downstream areas. Default transition to idle state.

4. Performance Predictions

The purpose of generating the model was to use it to predict the performance of large IPmc-RSVP systems using QOSPF routing. We describe results of the simulation below.
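
Because the results below include the time taken by the Dijkstra calculation, it may help to see the kind of computation involved. The candidate-list procedure of Section 3.7 can be sketched in C roughly as follows. This is a minimal illustration and not the OPNET process code: the adjacency-matrix representation, the names qos_spf, Link, MAX_NODES and NO_PATH, and the single rule "admit a link only if its available bandwidth covers the flow" are all assumptions made for the example. Route pinning and the preference for vertices marked reserved are omitted.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 8
#define NO_PATH   INT_MAX   /* hypothetical marker: vertex not reachable */

/* Hypothetical link record. delay is the additive cost; avail_bw stands
 * in for the unreserved bandwidth a Resource-LSA would advertise. */
typedef struct {
    int up;        /* 1 if the link exists */
    int delay;     /* link delay, arbitrary units */
    int avail_bw;  /* available bandwidth, Kbits/sec */
} Link;

/* Build a least-delay tree from root, using only links whose available
 * bandwidth meets need_bw (an assumed admission rule). On return,
 * dist[v] is the delay to v (or NO_PATH) and parent[v] its upstream
 * vertex (or -1). */
void qos_spf(Link g[MAX_NODES][MAX_NODES], int n, int root, int need_bw,
             int dist[MAX_NODES], int parent[MAX_NODES])
{
    int on_tree[MAX_NODES] = {0};

    assert(n > 0 && n <= MAX_NODES && root >= 0 && root < n);

    for (int v = 0; v < n; v++) {
        dist[v] = NO_PATH;
        parent[v] = -1;
    }
    dist[root] = 0;

    for (;;) {
        /* Select the candidate closest to the root. A vertex is a
         * candidate once it has a finite tentative delay. */
        int u = -1;
        for (int v = 0; v < n; v++) {
            if (!on_tree[v] && dist[v] != NO_PATH &&
                (u < 0 || dist[v] < dist[u]))
                u = v;
        }
        if (u < 0)
            break;              /* candidate list is empty */
        on_tree[u] = 1;         /* move it onto the shortest path tree */

        /* Examine the links of u; skip links that cannot carry the flow. */
        for (int v = 0; v < n; v++) {
            if (!g[u][v].up || on_tree[v])
                continue;
            if (g[u][v].avail_bw < need_bw)
                continue;
            int d = dist[u] + g[u][v].delay;
            if (d < dist[v]) {  /* add to, or improve, the candidate list */
                dist[v] = d;
                parent[v] = u;
            }
        }
    }
}
```

As a usage example under the same assumptions: with a direct link 0-1 of delay 1 offering 50 Kbits/sec and a detour 0-2-1 of total delay 4 offering 500 Kbits/sec, a flow needing 77 Kbits/sec is routed through vertex 2, while a flow needing 10 Kbits/sec takes the direct link. The linear scan for the closest candidate keeps the sketch short; a priority queue would be the usual choice for larger networks.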

4.1 Small-scale test network and model calibration

A 5-router test network has been created in a lab environment to study the behavior of the Quality-of-Service Open Shortest Path First (QOSPF) routing protocol. The test network is made up of Bay Networks Backbone Link Node-2 (BLN-2) routers, Silicon Graphics Inc. (SGI) workstations, and Audio Host Processors. Routers are interconnected via crossover cables that perform as T1 links. Workstations and Hosts are nodes on Fiber Distributed Data Interface (FDDI) Local Area Networks (LANs). A diagram of the test network is shown in:

http://www.nac.gmu.edu/qosip/test_networks/SmallNet/index.html

4.1.1 Hardware and Software Configuration

a. Backbone Link Node-2 Routers: The routers are running BayNetworks Image 11.0 Release with the addition of QOSPF, IGMP v2.0, and a subset of RSVP based on Internet Draft (ID) 8.0. For RSVP, fixed filter reservation style using raw RSVP I/O is supported. The ADspec object is not supported. The Router Alert IP option is used. Integrated services Controlled-Load is supported (as specified in draft-ietf-intserv-ctrl-load-svc-03.txt). In addition, a protocol prioritization mechanism is used to control the queuing delay and dropping of QOSPF and RSVP messages.

b. Silicon Graphics Inc. Workstations: These are Indy workstations with a Sysconnect FDDI card (12501) running the IRIX v5.1 operating system. A custom UDP test traffic generator (both unicast and multicast) is included.

c. Audio Host Processors: These are E-Systems proprietary 6U VMEbus boxes consisting of a Motorola 68040 processor (MVME167-033B), Cyclone i960 processors (CVM964), and a Rockwell FDDI interface card (125010). Each Host has an internal Ethernet network for processor-to-processor communication. The units are running the ISI pSOS+ v1.3 real-time operating system. Audio applications use Xpress Transfer Protocol (XTP) v4.0 as their transport layer. IGMP v2.0, RIP v1.0 and a subset of RSVP based on ID 8.0 are also supported.

4.1.2 Test Network Description

The BLN-2 routers are connected via serial synchronous links acting as T1 links. Each link has a line bandwidth of 1.25 Mbits/sec. This is the clock speed closest to the 1.544 Mbits/sec T1 rate that the routers can source. For each link, the Reservable bandwidth is set at 1.075 Mbits/sec, and the Best Effort (BE) bandwidth is set at 175 Kbits/sec.

a. Reserved Flows Characteristics: Audio flows are provided with a Controlled-Load service. The RSVP Tspec has a burst of 5,120 bytes (10 datagrams) and a rate of 77 Kbits/sec. The audio data packet size is 512 bytes. The IP datagram size is 584 bytes (includes data, XTP header, IP header and a second encapsulating IP header). Based on the configured reservable bandwidth of 1.075 Mbits/sec, a maximum of 13 audio flows can be allocated per link. This figure includes a 7% inflation factor.

b. BE Traffic Characteristics: The data packet size is 500 bytes. The UDP IP datagram size is 528 bytes.

4.1.3 Test Scenarios

4.1.3.1 Test Case 1 - Reserved Flows with BE Traffic

a. Test Scenario: Host1 is the source for Audio multicast groups #1-13. Host2 is the source for Audio multicast groups #14-26. Host3 joins and starts receiving Audio multicast groups #1-26. There are 90 Kbits/sec of BE UDP traffic from Workstation2 to Workstation3.

b. Observation: All Reserved flows and BE traffic from R2-R5-R3 are contained within a single link. Similarly, only one link is utilized from R1-R4-R3. Host3 receives all datagrams from Audio multicast groups #1-26. Workstation3 receives 90 Kbits/sec of UDP traffic from Workstation2 via R2-R5-R3. No packets are clipped, since the link bandwidth is greater than that consumed by both the Reserved flows and the BE traffic.

4.1.3.2 Test Case 2 - Reserved Flows with BE Traffic and Link Breaks

a. Test Scenario: Same as Test Case 1. Both Reserved flows and BE traffic are flowing from R1-R4-R3 and R2-R5-R3. The links connecting R2 and R5 are disconnected.

b. Observation: Before the break, Host3 is receiving all Reserved flows from Host1 via R1-R4-R3 and from Host2 via R2-R5-R3, and Workstation3 is receiving all BE UDP traffic from Workstation2 via R2-R5-R3. After the links between R2 and R5 are removed, the Reserved flows from Host2 and the BE traffic from Workstation2 are re-routed to OSPF area 3 via R2-R1-R4-R3. The re-routed traffic, originating from OSPF area 2, now utilizes the second link between R1 and R4 and between R4 and R3.

4.1.3.3 Test Case 3 - Reserved Flows with Heavy BE Traffic

a. Test Scenario: Same as Test Case 1, except that the BE UDP traffic from Workstation2 to Workstation3 is increased from 90 Kbits/sec to 902.4 Kbits/sec, causing link overload. Note that in this scenario the BE traffic exceeds the available BE bandwidth.

b. Observation: Host3 is receiving all of the Reserved flows from Host1 via R1-R4-R3 and from Host2 via R2-R5-R3. Workstation3 is receiving on average 257.15 Kbits/sec of BE UDP traffic from Workstation2 via R2-R5-R3. On average 71.6 BE UDP packets/sec are clipped at R2, since the available BE bandwidth is much less than the offered BE traffic. The Reserved flows are not affected by the heavy BE traffic on the same links.

4.1.4 Calibration

Calibration of the small-scale network simulation against the behavior of the physical model is ongoing.

4.2 Behavior of the scaled-up QOSPF network

The major purpose of this simulation effort was to examine in detail the projected performance of a large QOSPF-based autonomous system. Accordingly we have generated a system of 84 routers that we believe is representative of the sort of environment where QOSPF would be used.

4.2.1 Nature of the scaled-up network

The scaled-up network can be seen at:

http://bacon.gmu.edu/qosip/test_networks/BigNet/index.html

The network consists of four areas, each constructed from four copies of the small-scale network. The areas are cross-linked such that each small-scale network is connected to two area routers.
Further, the area routers are connected to backbone routers and cross-linked for redundancy. The intention is to represent a corporate network with high performance and reliability requirements, where each of the small-scale networks might represent a department and each area router a geographic area. An intermediate network is used as a building block (figure 3).

4.2.2 Simulation results from the scaled-up network

We are still early in the process of understanding just what is happening inside our complex target environment. In particular we have been grappling with a series of problems associated with scaling up the protocols, which appear to be working in the small-scale network. We are running session generators at moderate levels (at most one new session per router per second) and working to verify the validity of the simulation results. Thus far we have found that the target environment is fully able to break any naive simulation we try, either by demanding amounts of memory beyond the virtual memory space of Unix, or by running so slowly that hundreds of hours of wall-clock time would be required to represent one hour of simulated time. We are making good progress and expect to have validated simulation statistics within a month. Meanwhile we are releasing this interim report to the Internet community with the expectation that this will result in a thorough review and theoretical validation, or an indication of where further work is needed, for the model elements described here.

A set of outputs from early simulations of an intermediate-size network (42 routers and 48 hosts) can be found at http://bacon.gmu.edu/qosip/Results/IntrNet/index.html. In this simulation the initial exchange of LSAs among routers required 15 seconds. In order to examine LSA updates we set the LSA update cycle at 20 seconds (rather than the 30 minutes recommended in the QOSPF draft), and set the length of resource-reserved sessions in the range 5 to 10 seconds, so that many multicast groups would form.
Routing appeared stable in these short intervals, and network performance was generally as we would expect.

A set of outputs from early simulations of the scaled-up network can be found at http://bacon.gmu.edu/qosip/Results/BigNet/index.html. This simulation covered only 40 seconds of simulated time; however, system performance was generally as we would expect, and routing overhead appears to be under 10% of total traffic, without explicit routing. The measured time for the Sparc20 hosting the simulation to perform the Dijkstra calculation was always under 0.05 seconds.

5. Future Work

We expect to perform further simulations of the QOSPF protocol to allow it to be defined in such a way as to be most effective. We welcome participation in this process by the Internet community. Some of our concerns are: validating the overall model, testing the explicit routing function, making the resource consumption of QOSPF more readily distinguishable from the requirements of user traffic, investigating the effects of anomalies such as link failures, and running the model for a long period of simulated time (perhaps one hour), which will require about a week of sustained simulation, as the current ratio of simulation time to wall-clock time is about 250:1.

6. References

[1] Deering, "Host Extensions for IP Multicasting", RFC 1112, August 1989

[2] Braden et al., "Resource Reservation Protocol Version 1 Functional Specification", work in progress (draft-ietf-rsvp-spec-14), November 1996

[3] Zhang et al., "Quality of Service Extensions to OSPF or Quality of Service First Path Routing (QOSPF)", work in progress (draft-zhang-qos-qospf-00.txt), June 1996

[4] Moy, "OSPF Specification", RFC 1131, October 1989

[5] Moy, "MOSPF: Analysis and Experience", RFC 1585, March 1994

[6] MIL3 Inc., OPNET: Optimized Network Engineering Tools, Simulation Kernel Manual, November 1991

Authors' Addresses

J. Mark Pullen
C3I Center and Computer Science
Mail Stop 4A5
George Mason University
Fairfax, VA 22032
mpullen@gmu.edu

Lava K. Lavu
C3I Center
Mail Stop 4B5
George Mason University
Fairfax, VA 22030
llavu@bacon.gmu.edu

Hai H. Nguyen
Raytheon E-Systems, Falls Church Division
7700 Arlington Blvd, N201
Falls Church, VA 22046
hai_nguyen@fallschurch.esys.com

Eric Crawley
BayNetworks
Mailstop 3FS-1302
3 Federal St.
Billerica, MA 01821
esc@baynetworks.com