Orchestrator Pipeline

Each device has a dedicated orchestrator (run_rtx5090.py, run_jetson_agx.py, run_jetson_nano.py) that runs that device's benchmark pipeline as a fixed sequence of phases.

RTX 5090

ORCHESTRATOR (run_rtx5090.py)
│
├── Load experiment matrix from config
│   └── Produces: train_runs, export_runs, infer_runs
│
├── PHASE 1 — Training
│   └── FOR each model (arch × size × task × approach):
│       ├── IF report.txt already exists → SKIP (resume)
│       ├── train_model()
│       │   ├── Train YOLO from scratch / pretrained weights
│       │   ├── Run one validation pass
│       │   └── Write report.txt  (metrics + timing)
│       └── Mark run as done/failed in status JSON
│
├── PHASE 2 — TensorRT Export
│   └── FOR each model × precision (FP16, INT8):
│       ├── IF .engine already exists → SKIP
│       ├── IF .pt weights missing → FAIL
│       └── export_model()  →  writes .engine file
│
├── PHASE 3 — Inference Benchmark
│   └── FOR each model × format × precision × imgsz × batch:
│       ├── IF report_{format}_{prec}_img{sz}_b{bs}.txt exists → SKIP
│       ├── IF weights missing → FAIL
│       └── run_inference()
│           ├── N warm-up passes  (discard)
│           ├── M measurement passes  (timed)
│           └── Write report_{...}.txt  (FPS, latency, mAP)
│
└── PHASE 4 — Aggregation
    ├── Collect all report*.txt across results/rtx5090/
    └── Write benchmark_results.csv  (single summary table)
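
The aggregation phase can be sketched as below. The report file format is not described here, so this sketch assumes each report holds plain `key: value` lines; `parse_report` and `aggregate_reports` are hypothetical names, not the orchestrator's actual functions.

```python
import csv
from pathlib import Path


def parse_report(path: Path) -> dict:
    """Parse 'key: value' lines (assumed report format) into a dict."""
    row = {"report": str(path)}
    for line in path.read_text().splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            row[key.strip()] = value.strip()
    return row


def aggregate_reports(results_dir: Path, out_csv: Path) -> int:
    """Collect every report*.txt under results_dir into one CSV; return the row count."""
    rows = [parse_report(p) for p in sorted(results_dir.rglob("report*.txt"))]
    # Union of all keys in first-seen order, so reports with different metrics still fit.
    fields = list(dict.fromkeys(key for row in rows for key in row))
    with out_csv.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call such as `aggregate_reports(Path("results/rtx5090"), Path("benchmark_results.csv"))` would then produce the summary table. Metrics missing from a given report are left as empty cells.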

Jetson AGX Orin

This device has no training phase: it runs export, inference, and aggregation only. The trained .pt weights must be copied from the RTX 5090 first (scp results/rtx5090/ ...).

ORCHESTRATOR (run_jetson_agx.py)
│
├── Load experiment matrix from config
│   └── Produces: export_runs, infer_runs
│
├── PHASE 1 — TensorRT Export
│   └── FOR each model × precision (FP16, INT8):
│       ├── IF .engine already exists → SKIP
│       ├── IF .pt weights missing → FAIL
│       └── export_model()  →  writes .engine file
│           (built on Jetson hardware — not portable from RTX)
│
├── PHASE 2 — Inference Benchmark
│   └── FOR each model × format × precision × imgsz × batch:
│       ├── IF report_{...}.txt exists → SKIP
│       ├── IF weights missing → FAIL
│       └── run_inference()
│           ├── N warm-up passes  (discard)
│           ├── M measurement passes  (timed, + power via jtop)
│           └── Write report_{...}.txt  (FPS, latency, mAP, FPS/W)
│
└── PHASE 3 — Aggregation
    ├── Collect all report*.txt across results/jetson_agx/
    └── Write benchmark_results.csv
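
The warm-up and measurement passes, with optional power sampling, can be sketched as below. This is a minimal illustration, not the actual `run_inference()`: `benchmark`, `infer_once`, and `read_power_w` are hypothetical names, and the power reader is passed in as a plain callable (on a Jetson it would wrap a jtop reading) so that the sketch does not assume any particular jtop call.

```python
import time
from statistics import mean
from typing import Callable, Optional


def benchmark(
    infer_once: Callable[[], None],
    warmup: int,
    measure: int,
    batch_size: int = 1,
    read_power_w: Optional[Callable[[], float]] = None,
) -> dict:
    """Time `measure` passes after `warmup` discarded passes."""
    for _ in range(warmup):
        infer_once()  # warm-up: results and timings are discarded

    latencies, power = [], []
    for _ in range(measure):
        start = time.perf_counter()
        infer_once()
        latencies.append(time.perf_counter() - start)
        if read_power_w is not None:
            power.append(read_power_w())  # sampled outside the timed region

    mean_latency = mean(latencies)
    result = {
        "latency_ms": mean_latency * 1000.0,
        "fps": batch_size / mean_latency,
    }
    if power:
        result["power_w"] = mean(power)
        result["fps_per_w"] = result["fps"] / result["power_w"]
    return result
```

The sketch assumes `infer_once` blocks until the result is ready. With asynchronous GPU execution, the callable would need to synchronise the device before returning, otherwise the measured latency is too low.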

Jetson Orin Nano

Same pipeline as the AGX Orin, plus an out-of-memory (OOM) protection layer, because the Orin Nano's 8 GB of memory is shared between the CPU and the GPU.

ORCHESTRATOR (run_jetson_nano.py)
│
├── Load experiment matrix from config
│   └── Produces: export_runs, infer_runs
│
├── PHASE 1 — TensorRT Export
│   └── FOR each model × precision (FP16, INT8):
│       ├── IF likely to OOM → SKIP proactively
│       │     (large+b16, large+img1280+b8, medium+img1280+b16)
│       ├── IF .engine already exists → SKIP
│       ├── IF .pt weights missing → FAIL
│       ├── export_model()  →  writes .engine file
│       └── ON OOM error → SKIP (caught, not failed)
│
├── PHASE 2 — Inference Benchmark
│   └── FOR each model × format × precision × imgsz × batch:
│       ├── IF likely to OOM → SKIP proactively
│       ├── IF report_{...}.txt exists → SKIP
│       ├── IF weights missing → FAIL
│       ├── run_inference()
│       │   ├── N warm-up passes  (discard)
│       │   ├── M measurement passes  (timed, + power via jtop)
│       │   └── Write report_{...}.txt  (FPS, latency, mAP, FPS/W)
│       └── ON OOM error → SKIP (caught, not failed)
│
└── PHASE 3 — Aggregation
    ├── Collect all report*.txt across results/jetson_nano/
    └── Write benchmark_results.csv
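
The two OOM layers (proactive skip, then catching an OOM raised during the run) can be sketched as below. The names are hypothetical. The skip set copies the three combinations listed in the tree and matches them exactly, which is an assumption: the real rule may use thresholds. The sketch also assumes an OOM surfaces as `MemoryError` or as a `RuntimeError` whose message contains "out of memory", as PyTorch's CUDA OOM error does.

```python
from typing import Callable

# (size, imgsz, batch) combinations from the tree above; None means any image size.
# Exact matching is an assumption; the real rule may use thresholds instead.
LIKELY_OOM = {
    ("large", None, 16),
    ("large", 1280, 8),
    ("medium", 1280, 16),
}


def likely_oom(size: str, imgsz: int, batch: int) -> bool:
    """Return True for combinations that are skipped without being launched."""
    return (size, None, batch) in LIKELY_OOM or (size, imgsz, batch) in LIKELY_OOM


def run_guarded(size: str, imgsz: int, batch: int, run: Callable[[], None]) -> str:
    """Return 'skipped' or 'done'; errors that are not OOM are re-raised."""
    if likely_oom(size, imgsz, batch):
        return "skipped"  # proactive skip: nothing is launched
    try:
        run()
    except MemoryError:
        return "skipped"  # host-side OOM: recorded as skipped, not failed
    except RuntimeError as exc:
        if "out of memory" in str(exc).lower():
            return "skipped"  # device-side OOM: recorded as skipped, not failed
        raise  # any other error is a genuine failure
    return "done"
```

Keeping OOM separate from other errors means a run that cannot fit in memory is reported as skipped, while a broken engine or missing file still shows up as a failure.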

Resume behaviour

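Every training, export, and inference run is resumable. A run counts as complete when its output file exists, so re-running an orchestrator after an interruption skips the finished runs and continues with the remaining ones, without retraining, re-exporting, or repeating inference.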

| Phase | Skip condition |
| --- | --- |
| Training | `report.txt` exists |
| Export | `.engine` file exists |
| Inference | `report_{format}_{prec}_img{sz}_b{bs}.txt` exists |
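
The inference skip condition can be sketched as below, using only the file-naming pattern given above. The function names and the `run_dir` layout are hypothetical.

```python
from pathlib import Path


def inference_report_path(run_dir: Path, fmt: str, prec: str, imgsz: int, batch: int) -> Path:
    """Build the report name following report_{format}_{prec}_img{sz}_b{bs}.txt."""
    return run_dir / f"report_{fmt}_{prec}_img{imgsz}_b{batch}.txt"


def pending_runs(run_dir: Path, combos):
    """Yield only the combinations whose report file does not exist yet."""
    for fmt, prec, imgsz, batch in combos:
        if not inference_report_path(run_dir, fmt, prec, imgsz, batch).exists():
            yield fmt, prec, imgsz, batch
```

Because file existence is the only completion signal, a report that is half-written when the process is interrupted would be treated as done. One way to guard against that is to write to a temporary name and rename the file once the run has finished.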

Device differences

| Feature | RTX 5090 | Jetson AGX Orin | Jetson Orin Nano |
| --- | --- | --- | --- |
| Training | Yes | No (uses RTX weights) | No (uses RTX weights) |
| TensorRT export | Yes (FP16/INT8) | Yes (FP16/INT8) | Yes (FP16/INT8) |
| Inference | Yes | Yes | Yes |
| OOM protection | No | No | Yes |
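
The same differences can be expressed as a small capability table from which each orchestrator's phase list follows. This is an illustrative sketch only: the dictionary layout and `phases_for` are hypothetical, and the device keys reuse the results/ directory names shown above.

```python
# Per-device capabilities, mirroring the table above (hypothetical structure).
DEVICES = {
    "rtx5090": {"train": True, "oom_protection": False},
    "jetson_agx": {"train": False, "oom_protection": False},
    "jetson_nano": {"train": False, "oom_protection": True},
}


def phases_for(device: str) -> list:
    """Return the ordered phase names that a device's orchestrator runs."""
    caps = DEVICES[device]
    phases = ["train"] if caps["train"] else []
    # Export, inference, and aggregation run on every device.
    return phases + ["export", "infer", "aggregate"]
```

For example, `phases_for("rtx5090")` yields four phases and `phases_for("jetson_nano")` yields three, matching the phase counts in the diagrams.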

TensorRT engines are GPU-architecture specific

Engines built on the RTX 5090 cannot be used on Jetson devices and vice versa. Each orchestrator exports its own .engine files on the target hardware.