CURRENT_MEETING_REPORT_


Reported by Claudio Topolcic/CNRI

CIP Minutes

Agenda


   o Status reports

      -  COIP-K
      -  ST-II
      -  VT and PVP
      -  SRI activities

   o Discussion

      -  Analysis of COIP approach vs other CL approaches


Meeting Report

Guru, Claudio and Steve gave overviews of the status of the
implementations that they are responsible for.

Barbara gave an overview of the activities at SRI.


   o Benchmarks on DARTnet.

   o SFQ (based on source & destination IP addresses only) - implemented
     but not debugged.

   o SFQ + resource reservation - to work with ST, for example.

   o Writing an annotated bibliography on congestion control.

   o tg currently uses TCP or UDP sockets; we need to add ST sockets and
     test.  Benchmark results to collect:  BW, loss, delay; fairness,
     path utilization.
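
For readers unfamiliar with SFQ, the idea in the second item above can be
sketched as follows.  This is a minimal illustration only, not the SRI
implementation: the class name, the bucket count, the salt and the use of
CRC32 as the hash are all assumptions made for the example.  Packets are
hashed into a fixed set of buckets using only the source and destination
addresses, and the buckets are served round-robin.

```python
import zlib
from collections import deque


class StochasticFairQueue:
    """Toy SFQ keyed on (source, destination) address only (hypothetical)."""

    def __init__(self, num_buckets=8, salt=0):
        self.num_buckets = num_buckets
        self.salt = salt  # changing the salt reshuffles colliding flows
        self.buckets = [deque() for _ in range(num_buckets)]
        self.next_bucket = 0  # round-robin pointer

    def _bucket_for(self, src, dst):
        # Hash the address pair into one of the buckets.
        key = f"{self.salt}|{src}|{dst}".encode()
        return zlib.crc32(key) % self.num_buckets

    def enqueue(self, src, dst, payload):
        self.buckets[self._bucket_for(src, dst)].append((src, dst, payload))

    def dequeue(self):
        # Serve the next non-empty bucket in round-robin order.
        for i in range(self.num_buckets):
            idx = (self.next_bucket + i) % self.num_buckets
            if self.buckets[idx]:
                self.next_bucket = (idx + 1) % self.num_buckets
                return self.buckets[idx].popleft()
        return None  # every bucket is empty
```

Two address pairs that hash to the same bucket share that bucket's
service, which is why the fairness is only stochastic.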


Discussion of CO vs.  CL Approaches

The purpose of this discussion was to understand the real differences
between the approach taken by this group versus other, ostensibly
connectionless, approaches that have been proposed, and where there are
differences, to identify analysis, measurements, or experiments that
would give us a better understanding of which approach is superior in
which situation.

Steve led a discussion of our understanding of an alternate CL approach.
The following is a diagram of the modules that would have to be
implemented in a router in order to support such an approach.


    ----------------------------------------------------------------
    |      ________________         _____________       ________   |
    |     |     Packet     |       |  Resource   |              |  |
  ------> | Identification |-----> | Enforcement |---->  Queue  |----->
    |     |________________|       |_____________|      ________|  |
    |             |          _____        ^                       |
    |             |         |     |       |                       |
    |             |         |_____|       |                       |
    |             --------> |_____|--------                       |
    |                      |     | Forwarding                     |
    |                      |_____| Table                         |
    |                         ^                                  |
    -------------------------- | -----------------------------------
                              |
                       _______________
                      |               |
                      |    Resource   |
                      |    Manager    |
                      |               |
                      |_______________|
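
The data path in the diagram can be sketched in code as follows.  This is
a rough illustration under stated assumptions, not a description of any
implementation discussed at the meeting: the class and method names, the
use of the (source, destination) pair as the class identifier, and the
packet-count budget are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    next_hop: str
    budget: int  # packets still allowed for this class (hypothetical unit)


class Router:
    def __init__(self):
        self.table = {}       # forwarding table: class id -> Entry
        self.queue = deque()  # single output queue

    def identify(self, packet):
        # Packet identification: classify by (src, dst) in this sketch.
        return (packet["src"], packet["dst"])

    def enforce(self, entry):
        # Resource enforcement: refuse packets beyond the granted budget.
        if entry.budget <= 0:
            return False
        entry.budget -= 1
        return True

    def receive(self, packet):
        entry = self.table.get(self.identify(packet))
        if entry is None or not self.enforce(entry):
            return False  # no table entry, or over budget: dropped here
        self.queue.append((entry.next_hop, packet))
        return True


class ResourceManager:
    """Sits outside the data path and only installs table entries."""

    def __init__(self, router):
        self.router = router

    def grant(self, class_id, next_hop, budget):
        self.router.table[class_id] = Entry(next_hop, budget)
```

The resource manager touches the router only through the forwarding
table, as in the diagram.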


We discussed what were believed to be differences in the approaches.


  1. Classes vs.  individual flows.

     A proposed CL approach may have ``classes'' that can carry traffic
     belonging to different flows.  However, Guru's MCHIP protocol has
     PICons and Lixia's Flow Protocol (FP) has Flow 0, either of which
     can carry packets from any flow, so these are equivalent concepts.
     When you use a PICon, you have to include more addressing
     information than just the logical channel number, perhaps the full
     addresses.  This raised the questions of whether the short headers
     that ST and MCHIP use are worthwhile and how often they would be
     used.

     We may have a different view of the future.  Will individual flows
     be small or large with respect to available bandwidth?  If they are
     large, then identifying individual flows will be more important.
     If they are small, then perhaps it is better to aggregate a number
     of flows together.  The answer may be different depending on
     whether we look at the short term or the long term.

2. Reservation request and the start of the data flow.

   There may be a difference in when the reservation is made relative
   to the time the flow begins.  We make the reservation at the time
   the flow begins.  An alternate approach might allow a reservation
   ahead of time.  There are some further issues; specifically, if the
   intent is to not do any work at the time the flow begins, then the
   system must be prepared to redo work as the topology changes.

3. Failure recovery.

   When a link goes down, connectionless protocols can reroute more
   easily if multiple paths exist.  But in the CO scheme, we could use
   Flow 0 or PICon (or encapsulate ST in IP) along the alternate path
   without guarantees during the recovery.  How fast will IP rerouting
   be compared to CO connection repair?  One RTT?

4. Location of resource manager.

   The alternate approach allows the resource manager to be in a
   separate box from the router.  A separate resource manager allows a
   hot standby for redundancy and possibly fewer resource managers than
   forwarders, which in turn allows the use of dumb, and therefore
   cheap, forwarders.  It may also simplify the transition from the
   current IP to an ``integrated services'' IP: the changes to the
   routers might be smaller, so it would be easier to get vendors to
   accept them.

   However, it needs a reliable protocol between the resource manager
   and the forwarder, which must be standardized to allow mixing
   vendors, and it introduces a number of tradeoffs, e.g., problems
   because the manager doesn't directly see connectivity changes.
   Further, we don't expect any difference in the setup time required
   with a separate resource manager vs. one combined in the router.

5. Transition path to the new system.

   A CL approach is presumed to allow an easier transition.  However,
   how significant is it whether the first 20 bytes look the same as
   an IP header?  In either case, new software must be installed in
   all routers that need to implement resource management.  Host
   software may not need to change if resource management used only IP
   options since the existing BSD software allows IP options to be
   specified by the application.

  6. Resource management.

     This is an issue regardless of the approach taken.  Furthermore, in
     general, the same mechanisms can be used in both approaches.

  7. Flywheel resource allocation.

     This is a scheme by which a router predicts the resource
     requirements of flows within a class implicitly, by monitoring past
     usage and assuming that the requirements will change slowly, that
     is, that usage has ``momentum''.  If a new flow is detected which
     would overuse a class's resources, that new flow could be blocked.
     This approach requires keep-alives, may require further feedback to
     the applications, and does not interact well with pre-scheduling of
     resources.

  8. Routing.

     A CO-oriented approach doesn't need smart routing because the
     routes are verified anyway.  It also allows alternate path routing
     based on load, whereas a datagram approach does not, because
     load-based datagram routing is unstable.  Further, we couldn't see
     how IP multicast would support dynamic flows efficiently.

  9. Explicit vs implicit setup.

     A CO scheme, which naturally incorporates explicit setup, allows
     coordinated call blocking, which would allow some set of related
     flows to succeed, rather than a random set.  However, in an
     implicit setup scheme, the cost (delay) is the same if the setup
     fails, but much lower if it succeeds, which is presumed to be most
     of the time.  On the other hand, doesn't this just pass the buck up
     a level (making the application decide whether the connection
     worked, vs. having explicit setup at a lower layer)?
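
The flywheel scheme of item 7 can be sketched as follows.  This is only
an illustration of the idea as described above, not a scheme presented at
the meeting: the class name, the exponential averaging with weight alpha,
the rate units and the timeout value are all assumptions.  Each class
keeps a per-flow usage estimate learned from observed traffic, blocks a
new flow that would push the class past its capacity, and forgets a flow
that has been silent for longer than the timeout (hence the need for
keep-alives).

```python
class FlywheelClass:
    """Toy implicit allocation for one traffic class (hypothetical)."""

    def __init__(self, capacity, alpha=0.5, timeout=3.0):
        self.capacity = capacity  # resources available to the class
        self.alpha = alpha        # weight given to the newest observation
        self.timeout = timeout    # silence after which a flow is forgotten
        self.flows = {}           # flow id -> [estimated rate, last seen]

    def expire(self, now):
        # Drop flows that have sent nothing (not even a keep-alive).
        stale = [f for f, (_, seen) in self.flows.items()
                 if now - seen > self.timeout]
        for fid in stale:
            del self.flows[fid]

    def committed(self):
        return sum(rate for rate, _ in self.flows.values())

    def observe(self, fid, rate, now):
        """Record observed usage; return False if the flow is blocked."""
        self.expire(now)
        if fid in self.flows:
            old, _ = self.flows[fid]
            # Momentum: the estimate moves slowly toward the new rate.
            est = self.alpha * rate + (1 - self.alpha) * old
            self.flows[fid] = [est, now]
            return True
        if self.committed() + rate > self.capacity:
            return False  # new flow would overuse the class
        self.flows[fid] = [rate, now]
        return True
```

In this sketch a flow that has stopped still holds its estimate until the
timeout expires, which is the held-allocation cost that the experiments
below propose to measure.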


Experiments

We identified a number of tests and experiments that could be conducted
to try to tell which approach may be better under what circumstances.


   o Questions

      -  Does blocking work?
      -  How much interference comes from outages?
      -  Do you honor scheduled calls?
      -  Utilization?

   o Types of experiments:

      -  Measure lost bandwidth due to flywheel approach as utilization
         approaches saturation.

      -  If CO implies enforcement per flow, and CL allows enforcement
         per class, which works better?

      -  Failure recovery.

          * What is the impact of an outage on flows over paths that
            haven't failed (as failed flows are rerouted)?

          * How long does it take to reconstruct and what mechanisms are
            required in each case?

          * Measure time required to detect failure with various
            schemes.


   o What is the setup time?

   o How well are pre-scheduled flows honored?

   o Flip-side of the first experiment above:  How much bandwidth is
     lost due to the momentum of the flywheel (the time the allocation
     is held after the flow stops), and what is the impact of reducing
     the timeout?

   o Which approach is better for correlated flows?


Attendees

Joe Blackmon             blackmon@ncsa.uiuc.edu
Andreas Bovopoulos       andreas@patti.wustl.edu
Helen Bowns              hbowns@bbn.com
Stephen Casner           casner@isi.edu
Barbara Denny            denny@sri.com
Zubin Dittia             zubin@dworkin.wustl.edu
Allison Mankin           mankin@gateway.mitre.org
Jay Melvin               infopath@well.sf.ca.us
Gary Mussar              mussar@bnr.ca
Andy Nicholson           droid@cray.com
Philippe Park            ppark@bbn.com
Gurudatta Parulkar       guru@flora.wustl.edu
Rehmi Post               rehmi@ftp.com
K.K. Ramakrishnan        rama@kalvi.enet.dec.com
Mark Schaefer            schaefer@davidsys.com
Brad Solomon             bsolomon@hobbes.msfc.nasa.gov
Martha Steenstrup        msteenst@bbn.com
Claudio Topolcic         topolcic@NRI.reston.va.us
Wing Fai Wong            wfwong@malta.sbi.com
