Validation and Benchmarking
Validating tractography results against ground truth — the ISMRM 2015 challenge, scoring methodology, and coordinate system considerations.
Why Validation Matters
Tractography reconstructs possible pathways — it does not directly observe fibers. Validation asks a critical question: do the reconstructed streamlines correspond to known anatomical structures? Without validation, there is no way to distinguish genuine pathways from algorithmic artifacts.
ISMRM 2015 Tractography Challenge
The ISMRM 2015 Tractography Challenge provides a standardized benchmark for evaluating fiber tracking algorithms. It includes a simulated diffusion dataset with known ground truth bundles, enabling quantitative comparison of tractography methods.
The scoring data (scoring_data_Renauld2023/) includes ground truth bundle files, ROI masks, endpoint definitions, and configuration files for both ROI-based and RecoBundles-based scoring.
Ground Truth Bundles
The challenge includes 21 major white matter bundles organized into four categories:
| Category | Bundles | Description |
|---|---|---|
| Commissural | CA, CC, CP, MCP | Cross-hemisphere connections through the corpus callosum and commissures |
| Association | Cingulum, ILF, OR, SLF, UF (L/R) | Long-range intra-hemispheric connections |
| Projection | BPS, ICP, SCP (L/R) | Connections between cortex and subcortical structures |
| Special | Fornix | Limbic system pathway critical for memory |
Scoring Methodology
ROI-Based Scoring
Filters user tractography using anatomical ROI constraints: streamlines must pass through required masks, intersect inclusion masks, and have endpoints in specified head/tail regions. Geometric length constraints may also apply.
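The constraints above can be sketched as a simple filter. This is an illustration, not the challenge's actual scoring scripts: streamlines are assumed to be lists of voxel-space (x, y, z) points, masks are boolean NumPy arrays, and the function name and length bounds are made up for the example.

```python
import numpy as np

def passes_roi_filter(streamline, head_mask, tail_mask,
                      inclusion_masks=(), min_len=10.0, max_len=200.0):
    """Illustrative ROI filter: endpoint, inclusion, and length constraints."""
    pts = np.asarray(streamline, float)

    # Geometric length as the sum of segment lengths
    length = np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()
    if not (min_len <= length <= max_len):
        return False

    vox = pts.round().astype(int)

    # Endpoints must land in the head and tail regions (either orientation)
    start, end = tuple(vox[0]), tuple(vox[-1])
    ends_ok = ((head_mask[start] and tail_mask[end]) or
               (head_mask[end] and tail_mask[start]))
    if not ends_ok:
        return False

    # Every inclusion mask must be intersected somewhere along the path
    visited = tuple(vox.T)
    return all(mask[visited].any() for mask in inclusion_masks)
```

A bundle's score then depends on how many of the submitted streamlines survive this kind of filter for the bundle's ROI definitions.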
RecoBundles Scoring
Uses shape-based bundle recognition to identify bundles in user tractography by comparing against ground truth streamline shapes. More forgiving of ROI misalignment but less direct than anatomical scoring.
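The core idea can be illustrated with the minimum average direct-flip (MDF) distance: a candidate streamline is assigned to a model bundle if it is close in shape to some model streamline. This is a toy sketch only; the real RecoBundles algorithm additionally performs clustering, streamline-based registration, and pruning, and all names and thresholds below are invented for the example.

```python
import numpy as np

def resample(streamline, n=12):
    """Resample a polyline to n equally spaced points (linear interpolation)."""
    pts = np.asarray(streamline, float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    return np.stack([np.interp(np.linspace(0.0, t[-1], n), t, pts[:, d])
                     for d in range(3)], axis=1)

def mdf(a, b):
    """Minimum average direct-flip distance between two resampled streamlines."""
    direct = np.linalg.norm(a - b, axis=1).mean()
    flipped = np.linalg.norm(a - b[::-1], axis=1).mean()
    return min(direct, flipped)

def matches_bundle(candidate, model_bundle, threshold=5.0):
    """True if the candidate is within `threshold` (mm) of any model streamline."""
    c = resample(candidate)
    return any(mdf(c, resample(m)) < threshold for m in model_bundle)
```

Because MDF considers both point orders, a streamline traced in the opposite direction still matches its bundle, which is one reason shape-based scoring tolerates misalignment better than strict ROI tests.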
Evaluation Metrics
| Metric | Abbreviation | Description |
|---|---|---|
| Valid Bundles | VB | Number of ground truth bundles recovered by at least one valid streamline |
| Valid Connections | VC | Percentage of streamlines connecting a correct pair of endpoint regions |
| Invalid Bundles | IB | Number of recovered bundles connecting region pairs not present in the ground truth |
| Bundle Coverage | BC | Percentage of ground truth covered by submission |
| Bundle Overreach | BO | Percentage of submission not in ground truth |
| Weighted Dice | WD | Spatial overlap weighted by streamline density |
An overall score combines these metrics, weighting the contributions of valid results against invalid ones.
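Of these metrics, Weighted Dice is the least self-explanatory; a common density-weighted formulation is sketched below. Whether the challenge scripts use exactly this form is an assumption — the sketch only illustrates what "spatial overlap weighted by streamline density" means.

```python
import numpy as np

def weighted_dice(density_a, density_b):
    """Dice overlap of two streamline density maps, weighting each voxel by
    its density instead of counting it binarily:
    2 * sum(min(a, b)) / (sum(a) + sum(b))."""
    a = np.asarray(density_a, float)
    b = np.asarray(density_b, float)
    denom = a.sum() + b.sum()
    return 2.0 * np.minimum(a, b).sum() / denom if denom else 0.0
```

Identical density maps score 1.0 and disjoint maps score 0.0; unlike binary Dice, a voxel crossed by many streamlines in one map but few in the other is only partially credited.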
Coordinate System Warning
The scoring tools expect .trk files with world-space coordinates and header-defined transforms. If you save HINEC tracks as TRK without proper header setup, the scoring scripts will misinterpret coordinates, resulting in complete misalignment and a score of zero.
HINEC-to-TRK Conversion
To score HINEC results against ISMRM ground truth, tracks must be converted to TRK format with proper spatial headers.
```python
# Convert HINEC tracks to TRK with proper header
import nibabel as nib
import scipy.io
from dipy.io.streamline import save_tractogram
from dipy.io.stateful_tractogram import StatefulTractogram, Space

# Load reference NIfTI for spatial information
ref_nii = nib.load('reference.nii.gz')

# Load HINEC tracks from .mat file (a MATLAB cell array loads as an
# object array; squeeze and convert it to a list of (N, 3) point arrays)
mat_data = scipy.io.loadmat('tracks.mat')
tracks = list(mat_data['tracks'].squeeze())

# Create StatefulTractogram with the correct space
sft = StatefulTractogram(tracks, ref_nii, space=Space.VOX)
save_tractogram(sft, 'output.trk')
```

Ensure the reference NIfTI matches the DWI space used during tracking.
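To see why the header matters, note that voxel indices only become world-space coordinates through the reference image's affine. The affine below (2 mm isotropic voxels with an offset origin) is a made-up example, not taken from any real dataset; scoring against a tractogram whose header carries a different affine shifts every point by the mismatch.

```python
import numpy as np

# Example affine: 2 mm isotropic voxels, translated origin (illustrative only)
affine = np.array([[2.0, 0.0, 0.0,  -90.0],
                   [0.0, 2.0, 0.0, -126.0],
                   [0.0, 0.0, 2.0,  -72.0],
                   [0.0, 0.0, 0.0,    1.0]])

def voxel_to_world(points, affine):
    """Apply a 4x4 affine to an (N, 3) array of voxel coordinates."""
    pts = np.asarray(points, float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (affine @ homog.T).T[:, :3]

world = voxel_to_world([[45, 63, 36]], affine)
# With this affine, voxel (45, 63, 36) maps to world (0, 0, 0)
```

This is the transform a correct TRK header encodes; saving with an identity or wrong affine is what produces the total misalignment described above.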
