Commit 0a89dea0 authored by Brewer, Wes's avatar Brewer, Wes

refactor(network): replace empirical congestion threshold with topology-derived formula



The congestion onset threshold is now computed directly from the dragonfly
topology parameters in the system config:

    threshold = H / ((G-1) * P)
              = dragonfly_inter / ((dragonfly_groups - 1) * dragonfly_p)
              = 0.205 for Frontier (H=30, G=74, P=2)

Physical basis: NIC utilization at which aggregate per-node inter-group traffic
demand equals per-node global link supply. Derivable from first principles given
the topology; no fitted constants required.

Key changes:
- congestion_threshold(config) takes a config dict
- apply_job_slowdown takes threshold: float directly (caller pre-computes it)
- Engine computes self.congestion_threshold once at init
- Non-dragonfly topologies get threshold=1.0 (no congestion model)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent 581eb731
+268 −0
# Frontier Slingshot Network Telemetry Validation

## Goal
Validate the RAPS network congestion model (`raps/network/base.py`) against real Frontier Slingshot NIC telemetry.

## Telemetry Data Structure

```
slingshot_{metric}/{month}/{date}/{job_id}_{job_name}/{metric}_cassini.parquet
```

Four metrics, each with the same directory structure:
- `slingshot_txBW`         — TX bandwidth per NIC port (bytes/s, instantaneous)
- `slingshot_rxBW`         — RX bandwidth per NIC port (bytes/s, instantaneous)
- `slingshot_rxCongestion` — RX congestion counter (**cumulative** bytes, must diff to get rate)
- `slingshot_idle`         — Link idle percentage (%, e.g. 96–98% typical)

**Parquet schema:** columns are `Timestamp`, then `frontierNNNNNhP` per NIC port
(node NNNNN, port P ∈ {0,1,2,3}; each node has 4 × 200 Gb/s Slingshot ports)
**NaN = no traffic / no change** — treat as 0 for bandwidth, skip for cumulative counters.
**Sampling interval:** ~60 seconds.
**Scale:** ~540 job directories per day (~17–20 nodes/job average on 2025-08-23).

## Key Mapping: Slingshot → RAPS

| Telemetry | RAPS quantity | Formula |
|---|---|---|
| `txBW` sum h0–h3 per node | `net_tx` per node (bytes/s) | fill NaN→0, sum ports |
| `rxBW` sum h0–h3 per node | `net_rx` per node (bytes/s) | fill NaN→0, sum ports |
| `idle` % | `1 - network_utilization` | `util = (100 - idle) / 100` |
| `rxCongestion` delta / rxBW | stall fraction ≈ `stall_ratio` | `Δcongestion / Δt / rxBW` |
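
The first two mappings (fill NaN→0, sum ports h0–h3 per node) can be sketched in pandas. The frame below is a hypothetical stand-in for one `txBW` parquet; real files have one column per `frontierNNNNNhP` port:

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame mimicking a slingshot_txBW parquet (2 nodes x 2 ports).
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2025-08-23 00:00:00", "2025-08-23 00:01:00"]),
    "frontier00042h0": [1.0e9, np.nan],   # NaN = no traffic, treated as 0
    "frontier00042h1": [2.0e9, 3.0e9],
    "frontier00043h0": [np.nan, 4.0e9],
    "frontier00043h1": [0.5e9, np.nan],
})

ports = df.drop(columns=["Timestamp"]).fillna(0.0)   # NaN -> 0 for bandwidth
node_ids = [c[:-2] for c in ports.columns]           # drop the "hP" port suffix
per_node = ports.T.groupby(node_ids).sum().T         # sum h0..h3 per node (bytes/s)
per_node.insert(0, "Timestamp", df["Timestamp"])
```

The same aggregation applies unchanged to `rxBW`; for `rxCongestion` the NaNs must be skipped rather than zero-filled (see below).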

RAPS model functions to validate:
- `network_utilization()` in `base.py:146` — compare against `(100-idle)/100`
- `network_slowdown()` in `base.py:155` — compare against rxCongestion delta ratio
- `compute_stall_ratio()` in `base.py:72` — predicts `slowdown_factor - 1`

## Job ID Linkage

The **job_id is the numeric prefix** of the directory name:
```
3691196_AGMNPOEJMA  →  job_id = 3691196
```

This matches `job_id` in the existing `joblive` parquet (Frontier dataloader, `frontier.py:188`).

**Full join:**
```
slingshot dir → job_id

joblive parquet: job_id → xnames, time_start, time_end, node_count

xname_to_index() (frontier.py:558) → RAPS node indices

node_index_to_name() (frontier.py:582) → verify against frontierNNNNNhP columns
```

The `frontierNNNNNhP` column names give a second cross-check path (node number → xname),
but the `xname_to_index` / `node_index_to_name` round-trip needs to be verified against
the actual node numbering scheme used in the slingshot data.

## Validation Steps

1. **Parse job_id** from slingshot directory name (split on `_`, take first token)
2. **Look up job in joblive** → get xnames, time_start, time_end, node_count
3. **Load all 4 parquets** for that job, fill NaN→0
4. **Aggregate per-node bandwidth:** sum h0+h1+h2+h3 per node for tx and rx
5. **Compute observed utilization:** `(100 - idle_pct) / 100` per NIC
6. **Compute observed congestion rate:** diff consecutive rxCongestion rows, divide by interval and rxBW
7. **Run RAPS replay** for same job(s) with `--policy replay --net`
8. **Compare:** simulated vs observed utilization, tx/rx bandwidth magnitude, stall_ratio
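
Steps 1–2 can be sketched as follows. The `joblive` frame here is a stand-in with column names assumed from the mapping described above:

```python
from pathlib import Path
import pandas as pd

job_dir = Path("3691196_AGMNPOEJMA")
job_id = int(job_dir.name.split("_")[0])      # step 1: numeric prefix is the job_id

joblive = pd.DataFrame({                      # stand-in for the joblive parquet
    "job_id": [3691196, 3687816],
    "node_count": [18, 1920],
    "time_start": pd.to_datetime(["2025-08-23 01:00", "2025-08-23 03:00"]),
    "time_end": pd.to_datetime(["2025-08-23 02:00", "2025-08-23 04:00"]),
})
row = joblive.loc[joblive["job_id"] == job_id].iloc[0]   # step 2: job lookup
```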

## Findings from Initial Data Exploration

### Sample jobs (2025-08-23) are lightly loaded
- TX peak: ~27 MB/s on one port = **0.1% of 25 GB/s link capacity**
- RX peak: ~262 KB/s = **0.001% utilization**
- Neither sample job would trigger congestion in the RAPS model
- NaN values mean zero / below reporting threshold (not missing data)
- Port h3 consistently idle across samples — possibly reserved or on an unused path

### rxCongestion counter is cumulative from boot, not per-job
Values ~1.5e12 persist across jobs. Must diff consecutive non-NaN readings within a job
to get the per-job increment. Apparent decreases between readings are likely float
precision artifacts in the parquet storage.
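
A minimal sketch of the diffing procedure, with illustrative counter values (the ~1.5e12 baseline persists across jobs):

```python
import numpy as np
import pandas as pd

# Cumulative counter samples for one port; NaN = no change between readings.
counter = pd.Series([1.5e12, np.nan, 1.5e12 + 8.0e9, 1.5e12 + 2.0e10])

valid = counter.dropna()                      # skip NaN; do NOT zero-fill cumulatives
delta = valid.diff().clip(lower=0.0)          # drop float-precision negative dips
dt = valid.index.to_series().diff() * 60.0    # ~60 s sampling interval
rate = delta / dt                             # counter units per second
```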

### `max_delta` is not a useful ranking metric
Running `find_congested_jobs.py --metric rxCongestion` and sorting by `max_delta` shows
nearly identical values (~6.8e12) for all top-20 jobs regardless of size or duration.
The counter saturates on hot ports, so max_delta just reflects the baseline counter level.
**Use `sum_delta` or `sum_delta / n_nodes` instead.**

### Top congested jobs (by sum_delta, from `find_congested_jobs.py`)

| job_id | nodes | duration | sum_delta | congested% | notes |
|---|---|---|---|---|---|
| 3513339 | 1707 | **1 min** | 1.044e16 | 24% | suspicious — very short, huge delta |
| 3687816 | 1920 | 1h | 1.668e16 | 58% | good candidate |
| 3687555 | 1920 | 1h | 2.690e16 | 70% | good candidate |
| 3572284 | **9000** | 41min | 6.575e14 | 12% | best for model validation — spans many groups |
| 3897345 | 16 | 1.25h | 1.274e14 | **70%** | small job, hot links |

Job 3572284 (9000 nodes, 41 min) is the strongest candidate: large enough to span many
dragonfly groups, moderate congestion fraction, enough timestamps for time-series analysis.

### `rxCongestion` alone is insufficient for validation
It confirms congestion occurred but not the utilization that caused it. Need all three:

| Metric | Purpose |
|---|---|
| `txBW` + `rxBW` | Observed link utilization → what model *should* predict |
| `rxCongestion` delta | Ground truth that congestion actually occurred |
| `idle` | Independent utilization cross-check (optional but useful) |

The key validation plot: **utilization (from txBW+rxBW) vs congestion_delta** — checks
whether the model's congestion threshold and slowdown magnitude are correctly calibrated.

### Script: `scripts/find_congested_jobs.py`
Crawls all slingshot directories and ranks jobs by congestion or bandwidth.
- Sort by `sum_delta` not `max_delta` for rxCongestion
- Run with `--metric rxCongestion|txBW|rxBW|idle`; results saved in `results_congestion/`

### Script: `scripts/analyze_job_metrics.py`
Loads all four metrics for a single job, aligns on Timestamp, and produces a 4-subplot figure:
1. Utilization over time (`(100 - idle%) / 100`)
2. rx/tx bandwidth over time (total bytes/s, all ports)
3. rxCongestion rate over time (`diff(counter) / interval`)
4. **Utilization vs congestion ratio** scatter — key RAPS validation plot

`congestion_ratio = cong_rate / rxBW` is the dimensionless signal to compare against RAPS `stall_ratio`.
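
The two quantities for the scatter in subplot 4 can be sketched with illustrative arrays:

```python
import numpy as np

idle_pct = np.array([96.0, 80.0, 60.0])
util = (100.0 - idle_pct) / 100.0                 # subplot 1 quantity

rx_bw = np.array([2.0e9, 1.2e10, 2.0e10])         # bytes/s
cong_rate = np.array([0.0, 3.0e9, 1.5e10])        # counter units/s (not bytes)
congestion_ratio = np.divide(cong_rate, rx_bw,
                             out=np.zeros_like(cong_rate), where=rx_bw > 0)
```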

```bash
DATA=/lustre/orion/stf218/proj-shared/data/lake/frontier-data-campaign-2026/frontier-interconnect-fabric-telemetry
python scripts/analyze_job_metrics.py $DATA --job-id 3691034 --date 2025_08_23 --out results_congestion/
```

## Validation Results (from `results_congestion/summary.csv`)

Generated by `scripts/analyze_job_metrics.py --csv` across 8 jobs spanning five size regimes.
Bandwidth columns are **per node** (system total / n_nodes). `cong_onset_util` = mean utilization
at timestamps where any congestion was detected.

| job_id | nodes | duration | peak_util | mean_util | peak_rxBW/node | mean_rxBW/node | frac_congested | cong_onset_util |
|---|---|---|---|---|---|---|---|---|
| 3897345 | 16 | 79 min | 0.108 | 0.099 | 4.6 GB/s | 2.9 GB/s | 0.49 | **0.10** |
| 3691634 | 192 | 59 min | 0.430 | 0.415 | 19.5 GB/s | 11.7 GB/s | 0.47 | **0.42** |
| 3691034 | 1920 | 71 min | 0.713 | 0.408 | 20.5 GB/s | 5.8 GB/s | 0.40 | **0.46** |
| 3688454 | 1920 | 71 min | 0.699 | 0.400 | 20.2 GB/s | 5.6 GB/s | 0.38 | **0.45** |
| 3691160 | 1920 | 71 min | 0.767 | 0.353 | 14.2 GB/s | 4.6 GB/s | 0.42 | **0.39** |
| 3688392 | 1920 | 100 min | 0.712 | 0.283 | 20.4 GB/s | 5.3 GB/s | 0.44 | **0.30** |
| 3689621 | 6750 | 85 min | 0.058 | 0.040 | 0.65 GB/s | 0.24 GB/s | 0.40 | **0.044** |
| 3688655 | 9408 | 18 min | 0.070 | 0.013 | 0.41 GB/s | 0.04 GB/s | 0.26 | **0.024** |

### Key finding: congestion onset threshold depends on job size

`cong_onset_util` is not constant — it varies strongly with node count, consistent with
dragonfly inter-group bottleneck physics:

| regime | nodes | cong_onset_util | bottleneck |
|---|---|---|---|
| intra-rack / small | 16 | 0.10 | local router ports |
| intra-group | 192 | 0.42 | group-local links |
| multi-group | 1920 | 0.30–0.46 | inter-group global links starting to load |
| large multi-group | 6750 | 0.044 | most traffic is inter-group |
| near-full system | 9408 | 0.024 | nearly all links are inter-group |

**The original RAPS `network_slowdown()` triggered only above 100% utilization — it
misses congestion entirely for every job in this table.** A size-dependent threshold is
needed.

### Note on `peak_cong_rate_GBs_per_node`

This column's values exceed link line rate (100 GB/s/node max) for most jobs, confirming
the Cassini `rxCongestion` counter is **not in bytes** — it is likely a stall-cycle or
packet counter. Use `frac_congested` and `cong_onset_util` for quantitative validation;
treat `cong_rate` only as a presence/absence signal.

### Data artifacts to ignore
- Jobs 3678979 and 3690552 have `peak_bw=187096%` in idle file — counter overflow, not real.
- `max_delta` saturates at ~6.78e12 for top congestion jobs — use `sum_delta` for ranking.
- Job 3691634 was previously noted as "zero congestion baseline" — this was wrong; it shows
  `frac_congested=0.47`. The earlier finding was from `find_congested_jobs.py` using a
  different threshold.

## Topology Analysis: Why Bandwidth Conservation Doesn't Work

Before calibrating a threshold, it's important to understand what the threshold physically
represents. Analysis of Frontier's dragonfly parameters reveals a key constraint:

**Global link overprovisioning on Frontier (D=32, H=30, P=2, G=74):**
- NICs per group: D×P = 64
- Global link ports per group: D×H = 960 → **15× more capacity than NIC injection rate**

Because aggregate global bandwidth exceeds NIC injection bandwidth by 15×, no bandwidth
conservation model can predict congestion: the worst link utilization (`net_cong`) stays
<< 1 for any realistically achievable NIC utilization. The original `net_cong > 1`
trigger essentially never fired.
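
The overprovisioning arithmetic, using the parameters quoted above (D routers/group, H global links/router, P hosts/router, G groups):

```python
# Frontier dragonfly parameters (from the text above).
D, H, P, G = 32, 30, 2, 74

nics_per_group = D * P                   # 64 injection points per group
global_ports_per_group = D * H           # 960 global link ports per group
overprovision = global_ports_per_group / nics_per_group   # 15x
```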

The empirically observed low thresholds (e.g. 0.024 at full system) reflect **credit-based
head-of-line (HOL) blocking** — what Cassini `hni_tx_paused` actually measures. When a
large job spans many groups, Slingshot credit pools drain before any link hits 100%
bandwidth utilization. This mechanism is not derivable from bandwidth conservation alone;
it requires Slingshot hardware specs (credit pool size, per-VC buffers, link RTT) that
are not publicly available.

**What IS derivable from topology parameters alone:**

```
threshold = H / ((G-1) × P)
          = dragonfly_inter / ((dragonfly_groups - 1) × dragonfly_p)
          = 30 / (73 × 2) = 0.205   [Frontier]
```

Physical meaning: NIC utilization at which aggregate per-node global link demand equals
per-node global link supply, for a balanced full-system all-to-all job. Derived entirely
from config parameters, no fitted constants.

**Tradeoff**: This is size-independent. The empirical data shows `cong_onset_util` ranging
from 0.024 (n=9408) to 0.42 (n=192) — the topology-derived threshold of 0.205 falls in
the middle of that range. It overestimates congestion for small jobs and underestimates
for large jobs. A power-law fit matches the data better, but requires three empirical
constants and is only calibrated for Frontier 2025-08-23 data.

**Decision**: Use the topology-derived threshold for RAPS. It is physically motivated,
fully reproducible from config, portable to other dragonfly systems (Frontier, Polaris,
etc.), and defensible in a paper without relying on a curve fit.

## Model Update (implemented)

`raps/network/base.py` now has a topology-derived congestion threshold:

```python
def congestion_threshold(config: dict) -> float:
    # For dragonfly: H / ((G-1) * P)
    # For all other topologies: 1.0 (no congestion model)
    H = config['DRAGONFLY_INTER']   # global links per router
    G = config['DRAGONFLY_GROUPS']  # number of groups
    P = config['DRAGONFLY_P']       # hosts per router
    return H / ((G - 1) * P)
```

For Frontier: `30 / (73 × 2) = 0.205`. Computed once at `Engine.__init__` from the
system config and passed as a constant to `apply_job_slowdown` each tick.

- `apply_job_slowdown(threshold=...)` triggers on `net_util > threshold`
- Slowdown magnitude: `net_util / threshold` (linear above onset)
- No hardcoded empirical constants anywhere in the model
- Non-dragonfly topologies get `threshold=1.0` (effectively no congestion)
- **Important**: RAPS is an aggregate bandwidth model, not a flow model. The graph/routing
  layer computes `net_cong` (worst link util) but the slowdown is driven by NIC-aggregate
  `net_util`. Accurate label: "aggregate bandwidth model with topology-derived congestion
  threshold and topology-aware routing for adaptive path selection."
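
A minimal sketch of the linear-above-onset behavior described in the bullets, mirroring `network_slowdown(net_util, threshold)`:

```python
def slowdown(net_util: float, threshold: float) -> float:
    # 1.0 (no slowdown) at or below onset; linear in net_util above it.
    return net_util / threshold if net_util > threshold else 1.0

frontier_threshold = 30 / (73 * 2)   # 0.205
# e.g. 70% NIC utilization on Frontier -> ~3.4x dilation; 10% -> none.
```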

## Next Step

Add `--total-nodes` arg to `analyze_job_metrics.py` and output `threshold` +
`pred_frac_congested` columns to compare model vs observed congestion fraction across
all 8 jobs. This validates that the model predicts the right *amount* of congestion,
not just the onset. Note: threshold was calibrated on this same dataset so comparison
is internal consistency, not out-of-sample. For stronger validation: held-out dates
or RAPS replay comparison.

## Open Questions

- Does the `frontierNNNNN` node number correspond 1:1 to the xname index? The mapping needs verification.
- Is `rxCongestion` reset at job start or persistent across reboots? (values ~1.5e12 suggest long-running counter)
- Do you have `joblive` parquet for the same dates as the slingshot data (e.g. 2025-08-23)?
- What are the exact units of the Cassini `rxCongestion` counter? (stall cycles? packets? not bytes)
+3 −0
@@ -31,6 +31,7 @@ from raps.network import (
     NetworkModel,
     apply_job_slowdown,
     compute_system_network_stats,
+    congestion_threshold,
     simulate_inter_job_congestion
 )
 from raps.telemetry import Telemetry
@@ -351,6 +352,7 @@ class Engine:
             )
         else:
             self.network_model = None
+        self.congestion_threshold = congestion_threshold(self.config)

     def get_workload_data(self) -> WorkloadData:
         return WorkloadData(
@@ -634,6 +636,7 @@ class Engine:
                                                      net_cong=net_cong,
                                                      net_tx=net_tx,
                                                      net_rx=net_rx,
+                                                     threshold=self.congestion_threshold,
                                                      debug=self.debug)
                 slowdown_factors.append(slowdown_factor)
                 stall_ratios.append(getattr(job, 'stall_ratio', 0.0))
+2 −0
@@ -10,6 +10,7 @@ from .base import (
     aggregate_link_stall_stats,
     compute_stall_ratio,
     compute_system_network_stats,
+    congestion_threshold,
     link_loads_for_job,
     link_loads_for_job_stencil_3d,
     link_loads_for_pattern,
@@ -46,6 +47,7 @@ __all__ = [
     "aggregate_link_stall_stats",
     "compute_stall_ratio",
     "compute_system_network_stats",
+    "congestion_threshold",
     "network_congestion",
     "network_utilization",
     "network_slowdown",
+46 −13
@@ -69,6 +69,40 @@ def aggregate_link_stall_stats(link_stats):
     }


+def congestion_threshold(config: dict) -> float:
+    """
+    Topology-derived congestion onset threshold.
+
+    For dragonfly topologies: computes the NIC utilization at which aggregate
+    per-node inter-group traffic demand equals per-node global link supply.
+
+        threshold = H / ((G - 1) * P)
+
+    where H = global links per router, G = number of groups, P = hosts per router.
+    This ratio equals global link ports per group (D*H) divided by NICs per group
+    (D*P) divided by (G-1), expressing how much global bandwidth exists per NIC
+    relative to the full system.
+
+    For Frontier (H=30, G=74, P=2): threshold = 30 / (73 * 2) = 0.205
+
+    For non-dragonfly topologies returns 1.0 (no congestion model).
+
+    Args:
+        config: System configuration dict (keys: TOPOLOGY, DRAGONFLY_INTER,
+                DRAGONFLY_GROUPS, DRAGONFLY_P)
+
+    Returns:
+        Utilization threshold in (0, 1] above which congestion begins
+    """
+    if config.get('TOPOLOGY') == 'dragonfly':
+        H = config.get('DRAGONFLY_INTER', 0)
+        G = config.get('DRAGONFLY_GROUPS', 1)
+        P = config.get('DRAGONFLY_P', 1)
+        if H and G > 1 and P:
+            return float(H / ((G - 1) * P))
+    return 1.0
+
+
 def compute_stall_ratio(slowdown_factor):
     """
     Compute the stall/packet ratio from a slowdown factor.
@@ -89,18 +123,17 @@ def compute_stall_ratio(slowdown_factor):
     return max(0.0, float(slowdown_factor) - 1.0)


-def apply_job_slowdown(*, job, max_throughput, net_util, net_cong, net_tx, net_rx, debug: bool = False):
-    # Get the maximum allowed bandwidth from the configuration.
-    if net_cong > 1:
+def apply_job_slowdown(*, job, max_throughput, net_util, net_cong, net_tx, net_rx,
+                       threshold: float = 1.0, debug: bool = False):
+    if net_util > threshold:
         if debug:
-            print(f"congested net_cong: {net_cong}, max_throughput: {max_throughput}")
+            print(f"congested net_util={net_util:.3f} > threshold={threshold:.3f}")
             debug_print_trace(job, "before dilation")

-        throughput = net_tx + net_rx
-        slowdown_factor = network_slowdown(throughput, max_throughput)
+        slowdown_factor = network_slowdown(net_util, threshold)

         if debug:
-            print("***", hasattr(job, "dilated"), throughput, max_throughput, slowdown_factor)
+            print("***", hasattr(job, "dilated"), net_util, threshold, slowdown_factor)

         # Only apply slowdown once per job to avoid compounding the effect.
         if not job.dilated:
@@ -152,17 +185,17 @@ def network_utilization(tx, rx, max_throughput):
     return (tx_u + rx_u) / 2.0


-def network_slowdown(current_throughput, max_throughput):
+def network_slowdown(current, limit):
     """
-    Calculate a slowdown factor based on current network bandwidth usage.
+    Calculate a slowdown factor as current/limit.

-    If current_bw is within limits, the factor is 1.0 (no slowdown).
-    If current_bw exceeds max_bw, the factor is current_bw/max_bw.
+    Returns 1.0 if current <= limit (no slowdown), otherwise current/limit.
+    Called with (net_util, threshold) from apply_job_slowdown — both in [0,1].
     """
-    if current_throughput <= max_throughput:
+    if current <= limit:
         return 1.0
     else:
-        return current_throughput / max_throughput
+        return current / limit


 def all_to_all_paths(G, hosts):
+64 −23
@@ -11,6 +11,7 @@ from unittest.mock import MagicMock
 from raps.network.base import (
     compute_stall_ratio,
     apply_job_slowdown,
+    congestion_threshold,
     compute_link_stall_packet_stats,
     aggregate_link_stall_stats,
 )
@@ -50,7 +51,17 @@ def test_stall_ratio_exactly_one():
 # apply_job_slowdown sets job.stall_ratio
 # ---------------------------------------------------------------------------

-def _make_job(net_cong_response):
+# Frontier dragonfly config → threshold = H / ((G-1) * P) = 30 / (73 * 2) = 0.205
+FRONTIER_CONFIG = {
+    'TOPOLOGY': 'dragonfly',
+    'DRAGONFLY_INTER': 30,
+    'DRAGONFLY_GROUPS': 74,
+    'DRAGONFLY_P': 2,
+}
+THRESHOLD_TEST = congestion_threshold(FRONTIER_CONFIG)   # 0.205
+
+
+def _make_job():
     """Create a minimal mock job for testing apply_job_slowdown."""
     job = MagicMock()
     job.dilated = False
@@ -60,50 +71,80 @@ def _make_job(net_cong_response):


 def test_apply_job_slowdown_sets_stall_ratio_uncongested():
-    """Non-congested path sets stall_ratio=0."""
-    job = _make_job(None)
-    # net_cong=0.5 (below 1) → no slowdown
+    """Below-threshold utilization sets stall_ratio=0."""
+    job = _make_job()
     result = apply_job_slowdown(
         job=job,
         max_throughput=1000.0,
-        net_util=0.5,
-        net_cong=0.5,
-        net_tx=400.0,
+        net_util=0.1,
+        net_cong=0.1,
+        net_tx=100.0,
         net_rx=100.0,
+        threshold=THRESHOLD_TEST,
     )
     assert result == 1
     assert job.stall_ratio == 0.0


 def test_apply_job_slowdown_sets_stall_ratio_congested():
-    """Congested path (net_cong>1) sets stall_ratio = slowdown_factor - 1."""
-    job = _make_job(None)
-    # net_cong=1.5, net_tx+rx=3000, max=1000 → slowdown = 3000/1000 = 3.0
+    """Above-threshold utilization sets stall_ratio = slowdown_factor - 1."""
+    job = _make_job()
+    # net_util=0.7 > threshold=0.205 → slowdown = 0.7 / 0.205
+    expected_slowdown = 0.7 / THRESHOLD_TEST
     result = apply_job_slowdown(
         job=job,
         max_throughput=1000.0,
-        net_util=1.5,
-        net_cong=1.5,
-        net_tx=2000.0,
-        net_rx=1000.0,
+        net_util=0.7,
+        net_cong=0.7,
+        net_tx=500.0,
+        net_rx=200.0,
+        threshold=THRESHOLD_TEST,
     )
-    assert result == pytest.approx(3.0)
-    assert job.stall_ratio == pytest.approx(2.0)   # s - 1
+    assert result == pytest.approx(expected_slowdown)
+    assert job.stall_ratio == pytest.approx(expected_slowdown - 1.0)


 def test_apply_job_slowdown_already_dilated():
-    """Already-dilated job still gets stall_ratio set correctly."""
-    job = _make_job(None)
-    job.dilated = True   # already dilated — no further apply_dilation call
+    """Already-dilated job gets stall_ratio set but apply_dilation not called again."""
+    job = _make_job()
+    job.dilated = True
     result = apply_job_slowdown(
         job=job,
         max_throughput=1000.0,
-        net_util=1.5,
-        net_cong=1.5,
-        net_tx=2000.0,
-        net_rx=1000.0,
+        net_util=0.7,
+        net_cong=0.7,
+        net_tx=500.0,
+        net_rx=200.0,
+        threshold=THRESHOLD_TEST,
     )
     assert job.stall_ratio == pytest.approx(result - 1.0)
     job.apply_dilation.assert_not_called()


+# ---------------------------------------------------------------------------
+# congestion_threshold
+# ---------------------------------------------------------------------------
+
+def test_congestion_threshold_dragonfly_frontier():
+    """Frontier dragonfly config gives H/((G-1)*P) = 30/(73*2) = 0.205."""
+    assert congestion_threshold(FRONTIER_CONFIG) == pytest.approx(30 / (73 * 2))
+
+
+def test_congestion_threshold_non_dragonfly():
+    """Non-dragonfly topology returns 1.0 (no congestion model)."""
+    assert congestion_threshold({'TOPOLOGY': 'fat-tree'}) == pytest.approx(1.0)
+    assert congestion_threshold({}) == pytest.approx(1.0)
+
+
+def test_congestion_threshold_dragonfly_degenerate():
+    """Degenerate dragonfly config (missing params) returns 1.0 without crashing."""
+    assert congestion_threshold({'TOPOLOGY': 'dragonfly'}) == pytest.approx(1.0)
+
+
+def test_congestion_threshold_different_topology():
+    """Different dragonfly topology scales correctly."""
+    config = {'TOPOLOGY': 'dragonfly', 'DRAGONFLY_INTER': 10, 'DRAGONFLY_GROUPS': 11, 'DRAGONFLY_P': 1}
+    assert congestion_threshold(config) == pytest.approx(10 / (10 * 1))
+
+
 # ---------------------------------------------------------------------------