IP Performance Metrics WG (IPPM)
Tuesday, August 25 at 0900-1000
Chairs: Guy Almes, Will Leland

AGENDA:

1. Status report on I-Ds and RFC progress (G. Almes, W. Leland)
   (5 minutes)
2. Presentation on error bars (M. Zekauskas) (15 minutes)
   This short presentation summarizes the new material added to the
   I-Ds.
3. Discussion of error bars and confidence intervals (30 minutes)
4. Future directions for IPPM (G. Almes, W. Leland) (10 minutes)
5. A brief overview and discussion, if time permits.

Will Leland opened the WG meeting with a review of the agenda,
followed by a brief report on the WG's document status:

- Framework for IP Performance Metrics: published as RFC 2330
- Connectivity: WG final call
- One-way Packet Loss, One-way Packet Delay: discussed at the April
  IETF and revised; calibration error material added
- Delay Variation: revised
- Bulk Transfer: stalled
- Loss Patterns: no consensus

Matt Zekauskas then took the floor. He reviewed the changes made to
the drafts: "Calibration" was added to the "Errors" section, the path
parameter was removed, and a new "Reporting the Metric" section was
created. The goal of this effort is "... to be able to compare
metrics from different implementations." To that end, one needs to be
able to find and remove systematic errors and then
classify/characterize the random error. We want to be able to report,
for a singleton, a value +/- a "calibration error bar" with 95%
confidence. The idea is to ensure the metrics aren't dominated by the
error. The error as defined in One-way Delay is hard to analyze, so
we need to look for a simpler way.

We can characterize errors in measurement as

   measured = true + systematic + random

With suitable experience, we can remove the systematic component and
characterize the random error in a given measurement; that is,

   reported = measured - systematic

yielding

   true = reported - random

(A sketch of this bookkeeping follows the open-issues list below.)

As to the question of why 95% versus some other interval: we must
choose some specific value to allow comparison between measurements,
and, as experience suggests, "...[f]or user-level implementation, 95%
[is] tight enough to exclude outliers." This value is therefore the
specific proposal for IPPM confidence bounds. It is important to note
that the calibration error is a property of the measurement
instrument - in effect, "How accurate is the yardstick?" - and is not
a property of a series of measurements.

The problem of how to treat losses resolves into three possible
causes, the first two of which are false losses and are addressed:

- False loss due to the threshold value
- False loss due to resource limits on the measurement instrument
  (e.g., buffers)
- Packets truly lost, but reported as finite (not considered)

The first is addressed by trying to choose a threshold value large
enough that false loss is not a problem. We report the probability of
false loss, which depends on the local network and measurement
instrument loads.

Some potentially open issues; really, some things to think about:

- How does an error bar relate to percentile statistics on the
  metrics?
- False loss could make a large (e.g., 90th) percentile infinite (see
  the sketch at the end of these minutes).
- Due to the dispersion of values, low-to-middle percentiles don't
  change much.
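As an aside, the following minimal sketch (in Python; not from the
presentation - the function names, constants, and the use of a
median/percentile estimate are illustrative assumptions) shows the
kind of bookkeeping the error model above implies: calibrate the
instrument in the lab against a path of known delay, subtract the
systematic offset, and report a singleton with a 95% error bar.

   # Illustrative sketch of the error model above (assumed names and
   # numbers):
   #   measured = true + systematic + random
   #   reported = measured - systematic  =>  true = reported - random
   import statistics

   def calibrate(lab_samples, reference_delay):
       """Estimate the instrument's systematic offset and a 95% error
       bar for the random component, from lab measurements of a path
       whose true delay is known."""
       errors = [s - reference_delay for s in lab_samples]
       systematic = statistics.median(errors)  # bias of the "yardstick"
       residuals = sorted(abs(e - systematic) for e in errors)
       # The 95th percentile of |random error| serves as the
       # calibration error bar; it is a property of the instrument,
       # not of any stream of measurements.
       idx = min(len(residuals) - 1, int(round(0.95 * len(residuals))))
       return systematic, residuals[idx]

   def report_singleton(measured_ms, systematic_ms, error_bar_ms):
       reported = measured_ms - systematic_ms
       return f"{reported:.3f} ms +/- {error_bar_ms:.3f} ms (95% confidence)"

   # Example with made-up numbers: calibrate against a 10.000 ms
   # reference, then report one field measurement (a singleton).
   lab = [10.41, 10.38, 10.45, 10.40, 10.39, 10.52, 10.37, 10.43]
   systematic, error_bar = calibrate(lab, 10.0)
   print(report_singleton(12.71, systematic, error_bar))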
Following Matt's presentation, a Q&A session further discussed the
issues raised. Each topic is summarized below:

- How are calibration error bars assigned? By calibrating in a lab
  under similar conditions - that is, under simulated load. The error
  bars are a property of the singleton, not of the stream.
- The chairs welcome more comment on this - it seems to be the only
  open issue and, while all would like to get the draft out, no
  important contributions should be missed.
- The question of clipping/discarding data was raised. No data is
  discarded; rather, what is being attempted is to bound the amount
  of uncertainty - how much error is being introduced into the
  measurement. Again, these are a set of singletons, not a stream.
- Concern was expressed about multiple standards efforts in the area
  of metrics: for example, the IPPM work overlaps with the new Draft
  ITU-T Recommendation I.35IP, "Internet Protocol Data Communication
  Service - IP Packet Transfer and Availability Performance
  Parameters". Manufacturers value having a single standard; they are
  concerned that multiple standards may be inconsistent. Vern Paxson,
  the IPPM liaison with the T1A1 and ITU efforts in this area,
  explained that the IPPM had made presentations to those bodies
  prior to this Draft Recommendation, but that there are differences
  in scope and intention between the ITU and IPPM documents. The
  chairs agreed that the critical issue is for the standards to be
  consistent where they overlap, but that a wider scope in the ITU
  work is not a problem per se. Vern will look into the current
  situation and report to the IPPM mailing list.
- The idea was raised of a document, to be prepared by the WG, that
  quantified the metrics being developed - establishing what are good
  or bad values. Guy Almes replied that the WG's Area Directors had
  cautioned against just this - against any sort of interpretation of
  the metrics. However, a customer might very well use these metrics
  as tools in dealing with their providers on performance and
  reliability. The metrics will give the customer and provider a
  clearly understood set of measures; the two parties "only" have to
  agree on the values to be associated with them.
- Will Leland pointed out that the WG should document experience, as
  it is valuable to capture and pass on, perhaps as a "Best
  Practices" document.

Following the Q&A, Will Leland led a brief discussion of future
directions for the WG. He pointed out that the charter is much out of
date and needs to be revised. He also raised the question of what is
the best process for stringent scrutiny of metrics. Finally, several
metrics have been proposed, either in a session or on the mailing
list, but have elicited little response. He said "Orlando or else" -
that is, if the WG is to consider a metric, it must have gathered
some debate behind it before the next IETF, 7-11 December in Orlando.

Will also revisited the question of how to coordinate the WG's work
with that of other standards bodies. He will try to get more
resources applied before Orlando, and invited volunteers.

A speaker pointed out how the publication by various labs of router
"benchmark" figures had caused the manufacturers to take notice and
pay attention. The speaker hoped the WG's work would come to the same
end.
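As a closing note on the open issue recorded above - false loss
versus percentile statistics - the following small sketch (in Python,
with assumed numbers; not part of the meeting material) illustrates
why recording lost packets as infinite delay makes the p-th
percentile infinite once the false-loss rate exceeds (100 - p)%,
while leaving low-to-middle percentiles nearly unchanged.

   # Illustration (assumed numbers) of the false-loss open issue:
   # lost packets recorded as infinite delay blow up the 90th
   # percentile once more than 10% of packets are (falsely) lost,
   # but barely move the median.
   import math
   import random

   random.seed(1)
   true_delays = [random.gauss(50.0, 5.0) for _ in range(1000)]  # ms

   def percentile(values, p):
       ordered = sorted(values)
       return ordered[min(len(ordered) - 1, int(p / 100.0 * len(ordered)))]

   for false_loss_rate in (0.00, 0.05, 0.12):
       sample = [math.inf if random.random() < false_loss_rate else d
                 for d in true_delays]
       print(f"false loss {false_loss_rate:4.0%}: "
             f"median = {percentile(sample, 50):6.1f} ms, "
             f"90th = {percentile(sample, 90):6.1f} ms")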