SketchLearn - Relieving User Burdens in Approximate Measurement with Automated Statistical Inference

  • Link2Paper
  • [Author: Qun Huang et al.]
  • [Keywords: Network Measurement, Sketch]

Motivation

Network measurement must meet stringent resource requirements in the face of massive network traffic. While approximate measurement can trade accuracy for resource savings, it demands intensive manual effort to configure the right resource-accuracy trade-offs in real deployments. These user burdens stem from how existing approximate measurement approaches deal with resource conflicts when tracking massive network traffic with limited resources. In particular, they tightly couple resource configurations with accuracy parameters, so as to provision sufficient resources to bound the measurement errors. Such tight binding leads to several practical limitations:

  1. administrators need to specify tolerable error levels as input for resource configurations;
  2. each configuration is tied to specific parameters (e.g., thresholds);
  3. the theoretical analysis only provides worst-case analysis and fails to guide the resource configurations based on actual workloads;
  4. each configuration is fixed for specific flow definitions;
  5. administrators cannot readily quantify the extent of errors for measurement results.

Contribution

Problem

Design Requirement

  1. Small memory usage;
  2. Fast per-packet processing;
  3. Real-time response;
  4. Generality.

Limitations of Existing Approaches

Limitation

  1. Hard to specify expected errors (theoretical guarantees based on configurable parameters);
  2. Performance degrades with different thresholds (e.g., heavy hitter detection);
  3. Hard to tune configurations;
  4. Hard to re-define flowkeys (5-tuple, src-dst addr pairs, …);
  5. Hard to examine correctness.
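As a concrete illustration of limitations 1 and 3: in a classical Count-Min sketch (a generic example, not SketchLearn's own structure), memory is sized directly from a target error pair (ε, δ), so the administrator must specify tolerable errors up front, and the guarantee is only worst-case:

```python
import math

def count_min_dimensions(epsilon: float, delta: float):
    """Width/depth guaranteeing error <= epsilon * total_count
    with probability >= 1 - delta (worst-case bound)."""
    width = math.ceil(math.e / epsilon)      # columns per row
    depth = math.ceil(math.log(1 / delta))   # number of rows
    return width, depth

# Tightening the target error 10x inflates memory 10x,
# even if the actual workload would need far less.
print(count_min_dimensions(0.01, 0.01))   # (272, 5)
print(count_min_dimensions(0.001, 0.01))  # (2719, 5)
```

The configuration is derived from the error target, not from the workload, which is exactly the coupling the paper argues against.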

Solution

  1. Multi-level Sketch (makes it possible to quantify hash collisions and bit-level flowkey distributions);
  2. Model Inference (their assumptions require separating large and small flows).

Multi-level sketch

Model Inference

Large flow extraction
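A minimal Python sketch of the multi-level idea (simplified to one row per level with tiny, illustrative dimensions; the hash function and counter layout are assumptions for the demo): level 0 counts every packet, and level k counts only packets whose k-th flowkey bit is 1, so the per-stack ratios V[k]/V[0] expose bit-level distributions. A large flow pushes the ratios in its stack toward its own bit values, which is what lets extraction recover its flowkey bit by bit:

```python
import hashlib

L = 8       # flowkey bit width (illustrative; real keys are far wider)
COLS = 4    # columns per level (tiny for demo)

# levels 0..L, each a row of COLS counters
sketch = [[0] * COLS for _ in range(L + 1)]

def col(key: int) -> int:
    # single hash shared across levels (one row per level in this demo)
    return int(hashlib.md5(key.to_bytes(2, "big")).hexdigest(), 16) % COLS

def update(key: int) -> None:
    c = col(key)
    sketch[0][c] += 1              # level 0: all packets
    for k in range(L):
        if (key >> k) & 1:
            sketch[k + 1][c] += 1  # level k+1: packets with bit k set

# feed one large flow plus a little background noise
for _ in range(100):
    update(0b10110010)
for key in (3, 7, 12):
    update(key)

# in the large flow's stack, V[k]/V[0] is near 1 for its 1-bits
# and near 0 for its 0-bits, so its flowkey can be inferred
c = col(0b10110010)
ratios = [sketch[k + 1][c] / sketch[0][c] for k in range(L)]
guessed = sum(1 << k for k, r in enumerate(ratios) if r > 0.5)
print(bin(guessed))  # 0b10110010
```

Small colliding flows only nudge the ratios slightly, which is why the paper can model the no-large-flow case statistically and treat large deviations as evidence of a large flow.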

Evaluation

Testbed

OVS and P4. A simulator is used for larger-scale tests (it eliminates network transfer overhead).

Traces

CAIDA [1] and UN2 [2].

Experiments

  1. Memory usage
  2. Per-packet processing
  3. Testbed throughput
  4. Simulator throughput
  5. Inference time
  6. Generality
  7. Fitting theorems
  8. Robust to small thresholds
  9. Arbitrary field combinations
  10. Attaching error measures
  11. Network-wide coordination

Source code

Link to github repo. Respect! :)

Limitation

  • Default parameters are limited
  • Does the model inference algorithm converge, especially with a larger number of rows?

Ref