MANET WG Minutes 14 Dec 98 Taken and Prepared by Erik Guttman, Sun Microsystems WG Chairs: Joe Macker and Scott Corson Agenda Bashing State of the Working Group Guidance Drafts Performance Issues Applicability Statement Terminology Draft Addressing / Architectural Framework UCLA Multicast Protocol CEDAR AMRIS CMU NS2 results CMU NS2 results (conclusion) MITRE Sim results Possible network scenarios AODV DSR ZRP CBRP Open Discussion Implementation issues/ status/ plans Lessons learned --------------------------------------------------------------------------- Joe Macker gave presentation on WG status We have many proposals on the table. This is a good thing. We have documents to play with and simulate. There are proactive and reactive proposals. On the other side we need to consider the engineering job of deciding which is better. We need to begin comparative performance analysis at the point where we can do detailed simulation. There are many approaches and trade offs. We need to proceed using 'good science' discipline: Where do the protocols break? How do they scale? We need to proceed using realistic scenarios. There may be multiple protocols advanced as the result of WG efforts, if they can be shown to not be redundant and have different situations where they perform better. Common Simulation toolkit is coming - We need this for cross comparison, analysis and simulation. Others (third parties) should get involved. The next steps: * Collect base line scenarios of interest We don't want to limit these scenarios to too few, nor do we want 1000s of scenarios. * We should have more discussion on the mailing list. Come out! * Lessons learned: Performance enhancements are possible on different link layers. When should we use them? We want to be able to use them, but not to limit the applicability of the base protocols to those which do support them, probably. It would be great if there were available source code from the protocol designers for others to play with. --------------------------------------------------------------------------- Open Discussion of the Performance Issues Draft The 'router ID' section should be removed. It was pointed out that simply because there has not been criticism of the document does not mean it is in great shape. Sometimes working group chairs must actively solicit area expert reviews in order to diligently attain document quality. --------------------------------------------------------------------------- Open Discussion of the Applicability Statement Applicability statement is a required section for future and revised MANET protocol drafts. Discussion of suggested additions contributed via email: Q: Isn't it better not to try to distinguish between voice and data, or video? A: Voice data doesn't mean anything. Some protocols are capable of performing well despite out-of-order deliver and jittery delays. The decision is to not distinguish between the two. Q: What is the estimated computation, storage and communication costs under normal conditions for networks with 10, 100 and 1000 nodes? A: We don't know how scalable the protocols are yet. Group suggests quantification of normal and worst-case costs using standard complexity notation. A: Specifically, we should discuss memory requirements for routing tables as the number of communicating nodes increases. Overhead in delivered data messages used by routing protocols should also be analyzed and included. We also need to discuss TCP performance. This is CRITICAL. We need to emphasize it. Q: Why put in the question "Does the protocol support Multicast? If so, how?" Protocol independent multicast can be built on top of unicast, right? A: If you have multicast, specify what optimizations you have that makes multicast depend on your unicast solution. Otherwise the question is broken. --------------------------------------------------------------------------- Open Discussion of the Terminology draft Two changes are present in -01 in its revision from -00: (1) Comments from the previous draft have been incorporated. (2) Definitions used in various other drafts have been included. The Terminology draft is meant to harmonize the terminology used in different drafts. Comments: . General terms from performance and communications have been defined, such as 'link' and 'bandwidth.' These are not appropriate to the document, it was asserted. . Some definitions are misleading, such as that for 'neighborhood.' Others may be wrong. Many are contentious. But why argue about it, the commenter asked. Other examples: 'throughput', 'jitter'... . It is unclear (to many at the WG session) whether this document is useful at all. It was contended that vocabulary used in drafts has to be defined there. It was asserted that it is entirely appropriate to hold authors to task if they omit definitions for any term they use which is not obvious to the reader. . Although the draft is useful to reduce confusion, it was asserted by several WG participants that a terminology document would be inappropriate as an 'archival document' (meaning an RFC). The only terminology draft so far is RFC 2119 which is more a terminological reference for the IETF standards process than a protocol document. . It was argued in rebuttal that it would be helpful to have a common terminology document to cite and only include differences in the protocol documents. . It was requested that we worry about terminology but use a 'lazy evaluation appoach'. We should take this discussion to the list. --------------------------------------------------------------------------- Open Discussion on Architecture / Addressing Framework There are two options here: (A) Use a common format for all addresses used internally to MANET protocols. (B) Allow any MANET domain protocol to use an internal address scheme provided that it doesn't preclude interoperability with the Internet addressing architecture. . Option A would artificially constrain routing protocols. What really matters is B. . We need to be IP transparent 'below' the protocol, ie. at the periphery of MANET nets where gateways exist. . We also need to be able to accept and use IP addresses 'above' the protocol, at the application / API level. WG Consensus: We will go with option (B). --------------------------------------------------------------------------- On-Demand Multicast Routing Protocl (ODMRP) S.J. Lee, UCLA The following is an overview of several points brought up in the presentation. This approach features a mesh topology instead of a tree. They use a 'forwarding group' concept. This finds the shortest path between nodes. The protocol achieves route maintenance using on-demand mechanisms. All routing information is soft state, so you don't need to explicitely leave groups. A 'bubble' is a set of members forming a 'forwarding group.' This provides redundency to overcome problems arising from channel failure. These groups must be established and maintained. * A sender floods join requests. * All intermediate forwarders remember the previous hop allowing backwards learning. * Neighbors exchange 'join tables' but only within bubbles. Loops: A DSR-like caching strategy is used. Sequence numbers are used to remember the previous traffic. That way duplicates can be suppressed. This is only used in control and discovery. Note that this will result in large caches which will have to be searched on every message. A large sequence number space will likely necessary as well. Properties and Status: * Periodic messages are used, but no control traffic is required when there is no traffic. * ODMRP with only one reciever could be used to support unicast. * Parsec has been used to simulate the protocol. They used synthetic traces. Q: Which movement and communication patterns simulated? A: There's lots of work to be done here. They can't say yet what exactly they've done, for instance, their performance evaluation. The protocol has not yet been compared to others. Q: Is there an assumption that there may always be multiple receivers? There are radio technologies which have single receivers. A: Radio assumptions were made such that broadcasts will always be obtained by multiple receivers provided they are able to receive the transmission (if they are in range, etc.) It was pointed out that this assumption needs to be added to the specification. See draft-gerla-manet-odmrp-00.txt --------------------------------------------------------------------------- Core Extraction Distribution Ad Hoc Routing Protocol (CEDAR) V. Bharghavan, UIUC The basic goal is to provide QoS routing, which they define as 'minimum bandwidth requirements'. This is not a statistical nor control guarantee. Rather, it models the dynamics of the network and figures out the bandwidth over time. Minimum routes are not obtained, rather a balance is arrived at over time, a best effort using heuristics. In a highly mobile network, you allow routing information to be obselete or stale. Once the network has stabilized, routing information is propogated. A self-configuration infrastructure aggregates local state. These constitutes 'core' elements in the routing calculation. There are four elements to CEDAR: (1) Core computation. This maintains the dynamic infrastructure. (2) Core state maintenance. (3) Core broadcast. Broadcast occurs only within the core (4) QoS routing. This is done in an efficient manner, propogating information on how to compute routes. Core: . Have as few nodes as possible in the core. Calculate the minimal dominating set. This is an NP-complete problem which is also hard to approximate. Instead, use a heuristic. . Tunnels are established and optimized to share common paths. . The core nodes don't know the full core. State maintenance: . 'Increase' and 'Decrease' waves are used to adjust the bandwidth availability model. Simulation: . A custom simulator was written. NS2 is being considered for future simulation efforts. . Simulations have been written for 10s to 100s of nodes. They have a test bed with 6-8 nodes (with a real implementation). Q: How stable are the tunnels established by CEDAR? A: There's no set up and they are not arbitrarily long (as they only connect intra-core nodes via 2 or 3 hops). Arguably these are pretty stable. --------------------------------------------------------------------------- Ad Hoc Multicast Routing utilizing increasing Id Sequence Numbers (AMRIS) Chun Weh, University of Singapore Design philosophy: * Top down - independent of underlying unicast. * Distributed operation - survivability of delivery paths. * Observation - Discovery vs. Recovery. Assumption: Applications are long lived. Very fast route discovery was sacrificed for speedy recovery. Q: Is there an assumption about the kinds of applications which will use multicast in an AMRIS network? A: Just that they be long-lived. SID = Smallest ID. Every participating node has increasing ids. These IDs are used for tree maintenance and allowing localized repair. Beacons are used. There is an initialization phase in which a new session message is emitted. As the ring expands, it forms a DAG route based at the SID. To join, send a join request if neighbors can be found with a lower id. Do this after a move to graft in to an existing tree. Future work: They are working on simulation. They aren't sure if they'll use NS2 or PARSEC. They want to incorporate routing metrics and utilization of unidirectional links. They are looking at existing routing protocols to build on top of, for instance, DSR. However, as DSR doesn't do beacons so this may be a bad fit. They are interested in QoS issues. Q: Are SIDs unique? A: No. And they leave space as they grow for other nodes to slip in. Q: Does the current protocol require bidirectional links? A: Yes. Q: Does the protocol allow non-member senders? A: No. It was pointed out that this is a different model than classic IP multicast, which allows non-member sending. See: http://cram.comp.nus.edu.sg:8080/cram --------------------------------------------------------------------------- Simulation work with NS2 Dave Johnson, CMU This work is presented in "A Performance Comparison of Mobile Ad Hoc Routing Protocols" a paper which will appeared at MOBICOM '98. They used a model for radios which specified receive strength, propogation delay, duration of transmission, etc. Such factors as carrier sense, capture and attenuation were also figured in. Thus, more than 'range' was used to determine reception. A 'wavelan' like model for the radio was used. The experiment did not study congestion (i.e. light load). For this reason the simulation did not use TCP for traffic, as it backs off and tends to mask the scaling characteristics (more on this later.) The simulation code and protocol implementations are available at http://www.monarch.cs.cmu.edu A "scenario file" includes all start and stop positions. It includes what was sent and when. The same scenario can then be played for the different protocol implementations underneath. This scenario was "semi-realistic" in that mobility agents moved after a random wait period to a random position. The pause was tunable. Mobility agents communicated with random targets each transmission. The simulator ran for approximately one to two times the simulated time. A rectangular space was defined large enough to require multiple hops in many cases. The topology was rectangular, with a 300m-to-1500m ratio of width-to-length for the rectangular box. Each run contained 50 nodes with maximum transmission radii of 250m, creating a densely-connected topology which never partitioned during any of the simulations. # of hops: average 3, maximum 8. 4 protocols were run against exactly the same work load. DSDV-SQ, TORA, DSR and AODV-LL. DSDV was enhanced with a sequence number operation: It performed better in this sense. DSDV uses periodic messages. If the mobility factor is too great it fails to converge, as expected. AODV was enhanced to use link level sensing rather than keep alives, which would have incurred a constant cost. Note that DSR uses link level sensing also. TORA requires broadcast reliability from IMEP which proved problematic. TORA had longer paths than others. Why? Traces indicated that the link reversal algorithm has problems in congestion. There is a temporary loop at the moment of reversal, until a TTL expires and then the data goes away. TORA uses a reliable delivery service provided by IMEP. This protocol suffers congestion collapse. If a single ACK is lost, IMEP erroneously decides the target is unreachable and causes more updates. This leads to further congestion. AODV-LL requires more overhead than DSR as the number of sources increases. Why? DSR has more optimization. DSR has more aggressive caching, frequency of request limitations use of other's caches, etc. Conclusions: DSDV-SEQ fails to converge when mobility factor increases past a certain point. TORA link reversal leads to loss of data. IMEP's overhead is quite high, especially in congestion. DSR's aggressive caching and optimizations are critical. Byte overheads for source routes may be important and were not modeled here. AODV could benefit from optimizations as its protocol overhead currently grows faster than DSR. Q: What is the amount of data which is lost in communications due to the routing protocol? A: It is small in source routing protocols compared to the channel because the path lengths are short. (Average length 3, max 8). Q: What node densities were tried? One can imagine using larger space or more nodes. A: This wasn't done. Many tests were run, but there is a lot more possible! Q: The traffic pattern (4 packets per second to verify route/reachability) with tiny packets doesn't stress the network and create congestion. Why was this chosen? A: Congestion effects tended to mask the routing protocol behaviors. Q: Specifically, why was TCP not tested? A: Packet loss was big enough to cause back off with all protocols. The first hop was good, then behavior got worse the more hops away. TCP favored nodes where were close over those which were far. Q: What was the delay on the link (which was a simulated 802.11)? A: No explicit delay setting for the link was used during for the simulation. Q: Why was a rectangle space used? Doesn't that tend to concentrate the distribution and increase the performance of caches? Wouldn't a square or circle be a better geometry? A: Different geometries will be experimented with in the future. Q: What about a lower density test? A: This wasn't done. DSR will perform worst in partition. Note that AODV has no exponential backoff for retries, so it is also in bad shape. Q: Was IMEP's setting disadvantageous? (1 second was used for determining unreachability). A: This number was taken from an official source. All protocols have timing numbers. Some value has to be chosen unless there is a way to adjust them automatically. This is not currently possible with IMEP. Q: Physical link effects system in NS2 - which were used? A: The team did not try different values. Broadcasts were as likely to be delivered as unicasts (not true). Hiddent terminal effects are from 4-6% which is statistically significant. Q: No delay curves were present in the data, apparently. How did DSR compare with AODV for data delivery delay? A: This wasn't explicitely measured. There were variable queuing and acquisition delays and path lengths... Q: Should you count packets or count bytes? A: It is more important to count packets. While payload vs. overhead is important there's also the IP header, routing header, RTSC, etc. It is hard to count bytes. What are the relevent ones to count? Two things that would be useful to learn, in further studies, would be: 1. More discussion and analysis on saturation. Note that this will be the normal mode of operation. 2. Byte overhead. If a more uniformly distributed traffic pattern was used, a hop-by-hop routing protocol may perform better. Simulation models of the protocols need to be updated. Things have already changed since the paper's comparison was made. --------------------------------------------------------------------------- Possible network simulation scenarios Erik Guttman, Sun There are collections of mobility data out there. Some examples include those used in modeling traffic flow on roads, military maneuvers, crowds shopping or demonstrating. No one disagreed that it would be useful to try to get some of this sort of data to share. We'll have to be creative coming up with realistic applications in these settings. At least we can try to fix some degrees of freedom with realism. --------------------------------------------------------------------------- Simulations of MANET Protocols (Routing Evaluation of MANET Protocols) J. Strater, MITRE Several protocols were evaluated: WIRP TORA Link State Distance Vector version of WIRP ZRP (no results yet) They concentrated on evaluating proactive protocols. They looked at selective acknowledgments, considered header overhead and used multipriority queues. Q: What did you do to model lower layers? A: Our Link Layer simulation captures congestion but avoids collisions. Sources are restricted from transmitting till neighbors have processed previous transmissions. This captures congestion easily, without collisions. They tried lossy statistics, with 10^-5 error rates. Packet sizes were around 500 bytes. Physical layer: topology and link fluctuations were modeled. Q: Does reachability from the network protocol determine the availability of a host? A: No. The network layer has to discover this from lower layer events. MITRE used the MOSDAF simulator, a government force-on-force engagement simulation environment. They looked at three kinds of conditions: Up, Down and Fluctuation. Q: How did you choose whether a link was up or down? What was the spatial coupling model? A: Some randomness was used in modelling motion, highly correlated with terrain (as the simulation used infantry and vehicle models). Movement towards and away from objects correlated to taking certain links up and others down. Terrain effects dominated the movement scenario. MITRE used several traffic models (voice, routing control, Engagement Ops, etc.) In dense networks WIRP and TORA did best. In sparse networks WIRP and link state did better. In that case the delay for TORA was high. Saturation caused serious impact. Statistics emphasize data loss and suffering performance as network load increases. Contact: jstrater@mitre.org --------------------------------------------------------------------------- AODV (Ad Hoc Distance Vector Routing Protocol) Charles Perkins, Sun and Elizabeth Royer, UCSB The newest version of the AODV protocol specification includes features which were not included in the routing protocol comparison from CMU. In particular, AODV stops retrying transmissions now. There is a new extension for Route Request (RREQ) for QoS: The message includes the allowable delay. This value decrements each hop between source and destination and is stored in the routing entries. A 'subnet leader' includes a prefix length. This allows a tag to designate a route table entry with a sequence number. The nominated subnet leader increments the subnet sequence number for the entire group. The current draft reduces reliance on HELLO messages. Simulations (in PARSEC) of up to 1000 nodes using unicast and 64 byte packets have shown 90% goodput. The simulation will include a voice benchmark in the future. For link layers which do not require hello packets for neighbor discovery, hello packets will not be used. Parameters values have changed. 2 retries turned out to work better than 3, for example. Local repairs in routing are possible by sending a small TTL route request. Multicast in AODV (Elizabeth Royer presented this material) Multicast routing uses the base AODV routing messages RREQ and RRPLY with the addition of a 'J', 'P' and 'G' flags. Nodes that receive a route request with the 'J' (Join) flag set will react depending on their membership in the specified group. If a member, the host responds (depending further upon sequence numbers). The root of the tree may obtain more than one path to individual nodes. After discovery completes, the root prunes routes with lesser sequence numbers and eliminates potential loops and all redundencies. Nodes combine to form trees. A group leader periodically broadcasts a group hello. This allows the tree to reconnect after a partition. Partitions result in new leaders elected in the leaderless portions. After partitions rejoin, elections result in only a single leader again. This has been tested with multiple partitions. Recent changes: MINV (invalidate) and MACT (activation) flags, have new names P (prune) and G (group leader) flags. The tree pruning uses 'next hop' not 'shortest paths', as a star topology increases the total traffic in the network. Only a node which is a tree member can attempt repair. Experiments allowing non-tree members to attempt repair exhibited excessive repair costs. The 'U' flag in the 'hello' message indicates a group hello. This resets all group members from the previous leader before the partition. Q: What is the rationale of the removal of the hop count? A: Activation is only required on the next hop. Q: What is the state of simulation, especially of the multicast extensions? A: Simulation results for 50-1000 nodes have been obtained. Simulation for multicast is not done yet. Currently PARSEC has been used. NS2 will be used in the future. A comment for the WG: Visualation tools really help to understand what is going on in Ad Hoc network simulations. --------------------------------------------------------------------------- DSR (Dynamic Source Routing) Dave Johnson, CMU DSR supports route discovery and route maintenance. DSR's address model uses static IP addresses as in Mobile IP. It uses an 'interface index' if there is more than one interface. IMEP has a similar concept and uses 32-bit IP addresses. DSR employs a 7-bit value. This value gets assigned locally and remains with source route information. A new optimization has been added: Packet Salvaging. A -> B -> C - X D -> E (the C to D link fails!) In the past this would result in loss of the data sent from A to E. Now C will look for a route it has cached to E and possibly forward directly to the destination. A -> B -> C -> E Note that this can only be done once per trip (to avoid loops!) IF_Identifier: If a node receives a packet from somewhere, it is possible to identify the id of the interface it was received on and the id of the interface it is to be forwarded onto. Later packets can reverse this route. A bit indicates that packets may be forwarded on different paths than the received path (ie. on different interfaces.) This turns off certain optimizations. Route discovery of a router: The route reply includes a magic number indicating a gateway. There are two types of magic number defined so far: MA "Mobile Agent" (home or foreign agent) ROUTER (a gateway) Multicast data uses local area hop count flooding. This is still in development. A revised DSR specification exists, but Dave did not submit it in time to get it published as an internet draft. Experiments with real scenarios: . FreeBSD 2.2.7, wavelan PCMCIA cards, driving around with real radios . The protocol stands up but performs worse than in simulations. Much of this is due to observed radio effects. . They are experimenting with different types of traffic (UDP and TCP) --------------------------------------------------------------------------- ZRP (Zone Routing Protocol) Marc Perlmann, Cornell ZRP aims to derive proactive routing protocol benefits locally. Between zones reactive routing protocols are used. ZRP use 'bordercasting' to transmit data from any entity in a zone to the border routers of the zone. Route discovery, principally, relies upon this feature. ZRP has another feature, 'extended route lifetime.' ZRP includes a new feature 'Reduced source route specification.' This designates one node per zone instead of all nodes in the zone in routes between zones. ZRP's proactive routing protocol has switched to split horizon from Distance Vector in the reference design. The reactive protocol is currently a source routing protocol with optimizations. The presenter noted that AODV and TORA would also satisfy the reactive routing protocol requirements for ZRP. ZRP with radius one looks like DSR. With an infinite radius, it looks like the internet. The hybrid nature of the protocol makes it versatile. Q: Is ZRP suitable for large zones (ie. larger than radius 2)? A: Load population and node density determine the best zone radius. You want smaller radii for highly mobile zones. Many in the working group argued that the ZRP authors have to prove that ZRP overhead and complexity is worthwhile. Will ZRP scale up? What about millions or billions of nodes? Current simulation experience is limited. Data presented included 'U' curves, where the saddle indicated the benefit of the hybrid approach. Did the data include the cost of bordercasting? The presenter said it did. But it remained unclear whether the traffic data for overlapping zones bordercast had been adequately accounted for. Overlapping zones may also effect control traffic issues. --------------------------------------------------------------------------- Cluster Based Routing Protocol (CBRP) Mingliang, University of Singapore Simulations have been developed using NS2 from Berkeley and CMU. A brief presentation of the CBRP protocol followed. Q: Is scalability to large networks a stated goal of the WG? A: (Macker:) Not a WG goal. CBRP aims at good service for large, dense clouds. Other protocols (AODV and DSR, for instance) involve all nodes in routing. CBRP build routing trees to avoid this requirement. Local recovery allows route reestablishment without complete rediscovery. As DSR doesn't cache state, DSR could not be used as the core-to-core routing protocol very naturally by CBRP. CBRP builds tables which indicate the next hop to a given destination. Simulation results show that the hello message contributes to overhead. DSR doesn't perform well on unidirectional links. Q: How does CBRP obtain and maintain 'second hop' information? A: Hello messages and an IMEP-like mechanism support state maintenance inside the core. --------------------------------------------------------------------------- Implementation experience: The biggest insight they had at CMU was that it is difficult to do physical simulations. They are using Wavelan 1.1 as the 802.11-based stuff from Lucent is not yet available. CMU has successfully run gateways at the periphery of the ad hoc network for internet connectivity. Device drivers are lacking for the new stuff coming out. Scott Corson's experience is similar - they are using wavelan-equipped laptops running Linux.