Experimental Design
Five experiments comprising 552 runs in total across three devices: 48 training, 96 export, and 408 inference runs.
Experiment 1 — Core Comparison
- Fixed: batch=1, imgsz=640, task=segment
- Varies: format (PyTorch FP32, TensorRT FP16, TensorRT INT8), approach (scratch, pretrained), architecture, model size
- RTX 5090: train + infer (PyTorch), export + infer (TensorRT FP16/INT8) | Jetsons: export + infer (TensorRT)
Experiment 2 — Input Size Impact
- Fixed: batch=1, format=PyTorch FP32, task=segment
- Varies: imgsz (320, 1280), approach (scratch, pretrained), architecture, model size
- All devices: inference only (reuses weights from Experiment 1)
Experiment 3 — Batch Throughput
- Fixed: imgsz=640, format=PyTorch FP32, approach=scratch, task=segment
- Varies: batch (4, 8, 16), architecture, model size
- All devices: inference only (reuses weights from Experiment 1)
Experiment 4 — Detection vs Segmentation
- Fixed: batch=1, imgsz=640, format=PyTorch FP32
- Varies: approach (scratch, pretrained), architecture, model size
- RTX 5090: train + infer | Jetsons: inference only
Experiment 5 — Class Imbalance Impact
- Fixed: batch=1, imgsz=640, format=PyTorch FP32, task=segment
- Varies: approach (scratch_balanced, pretrained_balanced), architecture, model size (nano, small, medium, large)
- RTX 5090: train + infer | Jetsons: inference only
- Compares per-class mAP against unbalanced baselines from Experiment 1
Weighted sampling
Images containing rare classes (IV-5, IV-6, IV-3) are sampled more frequently
during training via a WeightedRandomSampler. The validation set is left unchanged,
so mAP is measured on the original class distribution and stays comparable to the Experiment 1 baselines.
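The idea behind weighted sampling can be sketched with the standard library alone. This is a minimal illustration, not the project's training code: the `rare_boost` factor, the toy label lists, and the `common-A`/`common-B` class names are hypothetical, since the actual weighting scheme is not specified here. Only the rare class names come from the text above.

```python
import random

# Rare classes named in the text above.
RARE_CLASSES = {"IV-5", "IV-6", "IV-3"}


def image_weights(image_labels, rare_classes=RARE_CLASSES, rare_boost=3.0):
    """Return one sampling weight per image.

    An image gets `rare_boost` if it contains at least one rare class,
    otherwise 1.0. ASSUMPTION: `rare_boost` is a hypothetical value.
    """
    return [
        rare_boost if rare_classes & set(labels) else 1.0
        for labels in image_labels
    ]


def sample_epoch(weights, num_samples, seed=0):
    """Draw image indices with replacement, proportionally to `weights`."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)


# Toy training set: each entry lists the classes present in one image.
# "common-A" and "common-B" are placeholder names, not real dataset classes.
image_labels = [
    ["common-A"],
    ["common-A", "common-B"],
    ["IV-5"],
    ["common-B"],
    ["IV-3", "common-A"],
    ["common-A"],
]

weights = image_weights(image_labels)
indices = sample_epoch(weights, num_samples=6000)
rare_share = sum(weights[i] > 1.0 for i in indices) / len(indices)
```

In this toy set, two of six images contain a rare class, yet they make up roughly 60% of the draws because they carry 6 of the 10 total weight units. In a PyTorch pipeline the same per-image weights could be passed to `torch.utils.data.WeightedRandomSampler(weights, num_samples, replacement=True)` and handed to the training `DataLoader` through its `sampler` argument; the validation loader would keep its default sampling.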
Run Distribution
| Device | Training | Export | Inference | Total |
|---|---|---|---|---|
| RTX 5090 | 48 | 32 | 136 | 216 |
| Jetson Orin AGX | 0 | 32 | 136 | 168 |
| Jetson Orin Nano | 0 | 32 | 136 | 168 |
| Total | 48 | 96 | 408 | 552 |
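The table totals can be cross-checked against the experiment definitions with a short script. This is a sketch under stated assumptions: the value of 8 model variants (for example 2 architectures × 4 sizes) is inferred from the counts and is not stated in the design, and the per-device inference figure of 136 is reproduced only if every device is counted for all three formats in Experiment 1 and for one task in Experiment 4.

```python
# Run counts copied from the Run Distribution table.
TABLE = {
    "RTX 5090": {"training": 48, "export": 32, "inference": 136},
    "Jetson Orin AGX": {"training": 0, "export": 32, "inference": 136},
    "Jetson Orin Nano": {"training": 0, "export": 32, "inference": 136},
}

MODEL_VARIANTS = 8  # ASSUMPTION: architectures x model sizes, inferred from the counts
APPROACHES = 2      # scratch, pretrained (or their balanced counterparts)


def training_runs(variants=MODEL_VARIANTS):
    """Experiments 1, 4 and 5 each train every approach/variant pair."""
    return 3 * APPROACHES * variants


def export_runs_per_device(variants=MODEL_VARIANTS):
    """Experiment 1 exports TensorRT FP16 and INT8 for every trained model."""
    return 2 * APPROACHES * variants


def inference_runs_per_device(variants=MODEL_VARIANTS):
    """Sum inference runs over the five experiments for one device."""
    exp1 = 3 * APPROACHES * variants  # 3 formats (assumed on every device)
    exp2 = 2 * APPROACHES * variants  # imgsz 320 and 1280
    exp3 = 3 * variants               # batch 4, 8, 16; scratch only
    exp4 = APPROACHES * variants      # assumed: one task per model
    exp5 = APPROACHES * variants      # balanced variants
    return exp1 + exp2 + exp3 + exp4 + exp5


def column_totals(table):
    """Sum each run type over all devices."""
    totals = {"training": 0, "export": 0, "inference": 0}
    for counts in table.values():
        for kind, n in counts.items():
            totals[kind] += n
    return totals


totals = column_totals(TABLE)
grand_total = sum(totals.values())
```

Under these assumptions the derived counts match the table: 48 training runs, 32 exports and 136 inference runs per device, and 552 runs overall.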
Weight reuse
Experiments 2 and 3 reuse the weights trained in Experiment 1, so their inference-only runs require no retraining.