Validation and Benchmarking
Validating tractography results against ground truth — the ISMRM 2015 challenge, scoring methodology, and coordinate system considerations.
Why Validation Matters
Tractography reconstructs possible pathways — it does not directly observe fibers. Validation asks a critical question: do the reconstructed streamlines correspond to known anatomical structures? Without validation, there is no way to distinguish genuine pathways from algorithmic artifacts.
ISMRM 2015 Tractography Challenge
The ISMRM 2015 Tractography Challenge provides a standardized benchmark for evaluating fiber tracking algorithms. It includes a simulated diffusion dataset with known ground truth bundles, enabling quantitative comparison of tractography methods.
The scoring data (scoring_data_Renauld2023/) includes ground truth bundle files, ROI masks, endpoint definitions, and configuration files for both ROI-based and RecoBundles-based scoring.
Ground Truth Bundles
The challenge includes 21 major white matter bundles organized into four categories:
| Category | Bundles | Description |
|---|---|---|
| Commissural | CA, CC, CP, MCP | Cross-hemisphere connections through the corpus callosum and commissures |
| Association | Cingulum, ILF, OR, SLF, UF (L/R) | Long-range intra-hemispheric connections |
| Projection | BPS, ICP, SCP (L/R) | Connections between cortex and subcortical structures |
| Special | Fornix | Limbic system pathway critical for memory |
Scoring Methodology
ROI-Based Scoring
Filters user tractography using anatomical ROI constraints: streamlines must pass through required masks, intersect inclusion masks, and have endpoints in specified head/tail regions. Geometric length constraints may also apply.
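The constraints above can be sketched as a simple filter. This is an illustration, not the challenge's actual scoring scripts: streamlines are assumed to be lists of voxel-space (x, y, z) points, masks are boolean NumPy arrays, and the function name and length bounds are made up for the example.

```python
import numpy as np

def passes_roi_filter(streamline, head_mask, tail_mask,
                      inclusion_masks=(), min_len=10.0, max_len=200.0):
    """Illustrative ROI filter: endpoint, inclusion, and length constraints."""
    pts = np.asarray(streamline, float)

    # Geometric length as the sum of segment lengths
    length = np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()
    if not (min_len <= length <= max_len):
        return False

    vox = pts.round().astype(int)

    # Endpoints must land in the head and tail regions (either orientation)
    start, end = tuple(vox[0]), tuple(vox[-1])
    ends_ok = ((head_mask[start] and tail_mask[end]) or
               (head_mask[end] and tail_mask[start]))
    if not ends_ok:
        return False

    # Every inclusion mask must be intersected somewhere along the path
    visited = tuple(vox.T)
    return all(mask[visited].any() for mask in inclusion_masks)
```

A bundle's score then depends on how many of the submitted streamlines survive this kind of filter for the bundle's ROI definitions.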
RecoBundles Scoring
Uses shape-based bundle recognition to identify bundles in user tractography by comparing against ground truth streamline shapes. More forgiving of ROI misalignment but less direct than anatomical scoring.
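The core idea can be illustrated with the minimum average direct-flip (MDF) distance: a candidate streamline is assigned to a model bundle if it is close in shape to some model streamline. This is a toy sketch only; the real RecoBundles algorithm additionally performs clustering, streamline-based registration, and pruning, and all names and thresholds below are invented for the example.

```python
import numpy as np

def resample(streamline, n=12):
    """Resample a polyline to n equally spaced points (linear interpolation)."""
    pts = np.asarray(streamline, float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    return np.stack([np.interp(np.linspace(0.0, t[-1], n), t, pts[:, d])
                     for d in range(3)], axis=1)

def mdf(a, b):
    """Minimum average direct-flip distance between two resampled streamlines."""
    direct = np.linalg.norm(a - b, axis=1).mean()
    flipped = np.linalg.norm(a - b[::-1], axis=1).mean()
    return min(direct, flipped)

def matches_bundle(candidate, model_bundle, threshold=5.0):
    """True if the candidate is within `threshold` (mm) of any model streamline."""
    c = resample(candidate)
    return any(mdf(c, resample(m)) < threshold for m in model_bundle)
```

Because MDF considers both point orders, a streamline traced in the opposite direction still matches its bundle, which is one reason shape-based scoring tolerates misalignment better than strict ROI tests.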
Evaluation Metrics
| Metric | Abbreviation | Description |
|---|---|---|
| Valid Bundles | VB | Number of ground truth bundles recovered by at least one valid streamline |
| Valid Connections | VC | Percentage of streamlines connecting a correct pair of endpoint regions |
| Invalid Bundles | IB | Number of recovered bundles connecting region pairs not present in the ground truth |
| Bundle Coverage | BC | Percentage of ground truth covered by submission |
| Bundle Overreach | BO | Percentage of submission not in ground truth |
| Weighted Dice | WD | Spatial overlap weighted by streamline density |
An overall score combines these metrics, weighting the contributions of valid results against invalid ones.
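Of these metrics, Weighted Dice is the least self-explanatory; a common density-weighted formulation is sketched below. Whether the challenge scripts use exactly this form is an assumption — the sketch only illustrates what "spatial overlap weighted by streamline density" means.

```python
import numpy as np

def weighted_dice(density_a, density_b):
    """Dice overlap of two streamline density maps, weighting each voxel by
    its density instead of counting it binarily:
    2 * sum(min(a, b)) / (sum(a) + sum(b))."""
    a = np.asarray(density_a, float)
    b = np.asarray(density_b, float)
    denom = a.sum() + b.sum()
    return 2.0 * np.minimum(a, b).sum() / denom if denom else 0.0
```

Identical density maps score 1.0 and disjoint maps score 0.0; unlike binary Dice, a voxel crossed by many streamlines in one map but few in the other is only partially credited.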
Coordinate System Warning
The scoring tools expect .trk files with world-space coordinates and header-defined transforms. If you save HINEC tracks as TRK without proper header setup, the scoring scripts will misinterpret coordinates, resulting in complete misalignment and a score of zero.
HINEC-to-TRK Conversion
To score HINEC results against ISMRM ground truth, tracks must be converted to TRK format with proper spatial headers.
```python
# Convert HINEC tracks to TRK with proper header
import nibabel as nib
import scipy.io
from dipy.io.streamline import save_tractogram
from dipy.io.stateful_tractogram import StatefulTractogram, Space

# Load reference NIfTI for spatial information
ref_nii = nib.load('reference.nii.gz')

# Load HINEC tracks from .mat file (a MATLAB cell array loads as an
# object array; squeeze and convert it to a list of (N, 3) point arrays)
mat_data = scipy.io.loadmat('tracks.mat')
tracks = list(mat_data['tracks'].squeeze())

# Create StatefulTractogram with the correct space
sft = StatefulTractogram(tracks, ref_nii, space=Space.VOX)
save_tractogram(sft, 'output.trk')
```

Ensure the reference NIfTI matches the DWI space used during tracking.
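To see why the header matters, note that voxel indices only become world-space coordinates through the reference image's affine. The affine below (2 mm isotropic voxels with an offset origin) is a made-up example, not taken from any real dataset; scoring against a tractogram whose header carries a different affine shifts every point by the mismatch.

```python
import numpy as np

# Example affine: 2 mm isotropic voxels, translated origin (illustrative only)
affine = np.array([[2.0, 0.0, 0.0,  -90.0],
                   [0.0, 2.0, 0.0, -126.0],
                   [0.0, 0.0, 2.0,  -72.0],
                   [0.0, 0.0, 0.0,    1.0]])

def voxel_to_world(points, affine):
    """Apply a 4x4 affine to an (N, 3) array of voxel coordinates."""
    pts = np.asarray(points, float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (affine @ homog.T).T[:, :3]

world = voxel_to_world([[45, 63, 36]], affine)
# With this affine, voxel (45, 63, 36) maps to world (0, 0, 0)
```

This is the transform a correct TRK header encodes; saving with an identity or wrong affine is what produces the total misalignment described above.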
