Metric (`pymia.evaluation.metric` package)

The metric package provides metrics for evaluation of image segmentation, image reconstruction, and regression.

All metrics implement the pymia.evaluation.metric.base.Metric interface and can be used with the pymia.evaluation.evaluator package to evaluate results (e.g., with the pymia.evaluation.evaluator.SegmentationEvaluator). To implement your own metric and use it with the pymia.evaluation.evaluator.Evaluator, inherit from pymia.evaluation.metric.base.Metric or one of its subclasses (pymia.evaluation.metric.base.ConfusionMatrixMetric, pymia.evaluation.metric.base.DistanceMetric, pymia.evaluation.metric.base.NumpyArrayMetric, or pymia.evaluation.metric.base.SpacingMetric) and implement pymia.evaluation.metric.base.Metric.calculate().
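The inheritance pattern can be sketched without pymia installed. The `Metric` class below is a stand-in that mimics only what this page states about the base class (an identification string and an abstract `calculate()`); it is not pymia's code. The `FalseDiscoveryRate` class and its `tp` and `fp` attributes are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class Metric(ABC):
    """Stand-in for pymia.evaluation.metric.base.Metric (illustration only)."""

    def __init__(self, metric: str = 'Metric'):
        self.metric = metric  # identification string of the metric

    @abstractmethod
    def calculate(self):
        """Calculates the metric."""


class FalseDiscoveryRate(Metric):
    """Hypothetical custom metric: FP / (FP + TP)."""

    def __init__(self, metric: str = 'FDR'):
        super().__init__(metric)
        # Hypothetical inputs; in practice an evaluator would supply the data.
        self.tp = 0
        self.fp = 0

    def calculate(self):
        denominator = self.fp + self.tp
        # Assumption: return 0.0 when nothing was predicted as positive.
        return self.fp / denominator if denominator else 0.0


fdr = FalseDiscoveryRate()
fdr.tp, fdr.fp = 8, 2
print(fdr.metric, fdr.calculate())
```

With pymia available, the stand-in would presumably be replaced by the real base class, and the evaluator would provide the inputs.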

Note

The segmentation metrics were selected based on the paper by Taha and Hanbury. We recommend consulting the paper for guidance on selecting appropriate metrics, as well as for the metric descriptions and the underlying math.

Taha, A. A., & Hanbury, A. (2015). Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool. BMC Medical Imaging, 15. https://doi.org/10.1186/s12880-015-0068-x

Base (`pymia.evaluation.metric.base`) module

The base module provides metric base classes.

class pymia.evaluation.metric.base.ConfusionMatrix(prediction: ndarray, reference: ndarray)[source]¶

Bases: object

Represents a confusion matrix (or error matrix).

Parameters:

prediction (np.ndarray) – The prediction binary array.
reference (np.ndarray) – The reference binary array.
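To make the four cells of a confusion matrix concrete, here is a minimal standalone sketch that counts them for two flattened binary sequences. It uses plain Python lists rather than `np.ndarray`, and the function name `confusion_counts` and the dictionary keys are assumptions for illustration, not the attributes of the class above.

```python
def confusion_counts(prediction, reference):
    """Counts TP, FP, TN, and FN for two flat binary sequences of equal length."""
    if len(prediction) != len(reference):
        raise ValueError('prediction and reference must have the same length')
    tp = fp = tn = fn = 0
    for p, r in zip(prediction, reference):
        if p and r:
            tp += 1  # predicted foreground, truly foreground
        elif p and not r:
            fp += 1  # predicted foreground, truly background
        elif not p and r:
            fn += 1  # predicted background, truly foreground
        else:
            tn += 1  # predicted background, truly background
    return {'tp': tp, 'fp': fp, 'tn': tn, 'fn': fn, 'n': len(prediction)}


prediction = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(prediction, reference)
print(counts)
```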

class pymia.evaluation.metric.base.ConfusionMatrixMetric(metric: str = 'ConfusionMatrixMetric')[source]¶

Bases: Metric, ABC

Represents a metric based on the confusion matrix.

Parameters:: metric (str) – The identification string of the metric.

class pymia.evaluation.metric.base.DistanceMetric(metric: str = 'DistanceMetric')[source]¶

Bases: Metric, ABC

Represents a metric based on distances.

Parameters:: metric (str) – The identification string of the metric.

class pymia.evaluation.metric.base.Distances(prediction: ndarray, reference: ndarray, spacing: tuple)[source]¶

Bases: object

Represents distances for distance metrics.

Parameters:

prediction (np.ndarray) – The prediction binary array.
reference (np.ndarray) – The reference binary array.
spacing (tuple) – The spacing in mm of each dimension.

See also

Nikolov, S., Blackwell, S., Mendes, R., De Fauw, J., Meyer, C., Hughes, C., … Ronneberger, O. (2018). Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. http://arxiv.org/abs/1809.04430
Original implementation

class pymia.evaluation.metric.base.Information(column_name: str, value: str)[source]¶

Bases: Metric

Represents an information “metric”.

Can be used to add an additional column of information to an evaluator.

Parameters:

column_name (str) – The identification string of the information.
value (str) – The information.

calculate()[source]¶: Outputs the value of the information.

class pymia.evaluation.metric.base.Metric(metric: str = 'Metric')[source]¶

Bases: ABC

Metric base class.

Parameters:: metric (str) – The identification string of the metric.

abstract calculate()[source]¶: Calculates the metric.

exception pymia.evaluation.metric.base.NotComputableMetricWarning[source]¶

Bases: RuntimeWarning

Warning class to raise if a metric cannot be computed.

class pymia.evaluation.metric.base.NumpyArrayMetric(metric: str = 'NumpyArrayMetric')[source]¶

Bases: Metric, ABC

Represents a metric based on numpy arrays.

Parameters:: metric (str) – The identification string of the metric.

class pymia.evaluation.metric.base.SpacingMetric(metric: str = 'SpacingMetric')[source]¶

Bases: NumpyArrayMetric, ABC

Represents a metric based on images with a physical spacing.

Parameters:: metric (str) – The identification string of the metric.

Metric (`pymia.evaluation.metric.metric`) module

The metric module provides a set of metrics.

pymia.evaluation.metric.metric.get_classical_metrics()[source]¶

Gets a list of classical metrics.

Returns:: A list of metrics.
Return type:: list[Metric]

pymia.evaluation.metric.metric.get_distance_metrics()[source]¶

Gets a list of distance-based metrics.

Returns:: A list of metrics.
Return type:: list[Metric]

pymia.evaluation.metric.metric.get_overlap_metrics()[source]¶

Gets a list of overlap-based metrics.

Returns:: A list of metrics.
Return type:: list[Metric]

pymia.evaluation.metric.metric.get_reconstruction_metrics()[source]¶

Gets a list of reconstruction metrics.

Returns:: A list of metrics.
Return type:: list[Metric]

pymia.evaluation.metric.metric.get_regression_metrics()[source]¶

Gets a list of regression metrics.

Returns:: A list of metrics.
Return type:: list[Metric]

pymia.evaluation.metric.metric.get_segmentation_metrics()[source]¶

Gets a list of segmentation metrics.

Returns:: A list of metrics.
Return type:: list[Metric]

Categorical metrics (`pymia.evaluation.metric.categorical`) module

The categorical module provides metrics to measure image segmentation performance.

class pymia.evaluation.metric.categorical.Accuracy(metric: str = 'ACURCY')[source]¶

Bases: ConfusionMatrixMetric

Represents an accuracy metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the accuracy.

class pymia.evaluation.metric.categorical.AdjustedRandIndex(metric: str = 'ADJRIND')[source]¶

Bases: ConfusionMatrixMetric

Represents an adjusted Rand index metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the adjusted Rand index.

class pymia.evaluation.metric.categorical.AreaMetric(metric: str = 'AREA')[source]¶

Bases: SpacingMetric, ABC

Represents an area metric base class.

Parameters:: metric (str) – The identification string of the metric.

class pymia.evaluation.metric.categorical.AreaUnderCurve(metric: str = 'AUC')[source]¶

Bases: ConfusionMatrixMetric

Represents an area under the curve metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the area under the curve.

class pymia.evaluation.metric.categorical.AverageDistance(metric: str = 'AVGDIST')[source]¶

Bases: SpacingMetric

Represents an average (Hausdorff) distance metric.

Calculates the distance between the sets of non-zero pixels of two images using the following equation:

$AVD(A,B) = \max(d(A,B), d(B,A)),$

where

$d(A,B) = \frac{1}{N} \sum_{a \in A} \min_{b \in B} \lVert a - b \rVert$

is the directed average Hausdorff distance, $N$ is the number of pixels in $A$, and $A$ and $B$ are the sets of non-zero pixels in the images.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the average (Hausdorff) distance.
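The equation above can be sketched directly on small point sets. This standalone example assumes the points are already given in physical coordinates (the image spacing is not applied here) and that both sets are non-empty; the function names are illustrative and this is not the library's implementation.

```python
import math


def directed_average_distance(a, b):
    """Mean over the points of `a` of the distance to the nearest point of `b`."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def average_distance(a, b):
    """Symmetric average distance: the larger of the two directed values."""
    return max(directed_average_distance(a, b), directed_average_distance(b, a))


points_a = [(0, 0), (0, 1)]
points_b = [(0, 0), (0, 3)]
print(directed_average_distance(points_a, points_b))
print(directed_average_distance(points_b, points_a))
print(average_distance(points_a, points_b))
```

The two directed values differ (0.5 and 1.0 for these points), which is why the symmetric form takes the maximum of both directions.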

class pymia.evaluation.metric.categorical.CohenKappaCoefficient(metric: str = 'KAPPA')[source]¶

Bases: ConfusionMatrixMetric

Represents a Cohen’s kappa coefficient metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the Cohen’s kappa coefficient.

class pymia.evaluation.metric.categorical.DiceCoefficient(metric: str = 'DICE')[source]¶

Bases: ConfusionMatrixMetric

Represents a Dice coefficient metric with empty target handling, defined as:

$\begin{cases} 1 & \left\vert{y}\right\vert = \left\vert{\hat y}\right\vert = 0 \\ Dice(y,\hat y) & \left\vert{y}\right\vert > 0 \\ \end{cases}$

where $\hat y$ is the prediction and $y$ the target.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the Dice coefficient.
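A minimal sketch of the empty-target handling, written in terms of confusion matrix counts. The function name and signature are assumptions. The case of an empty target with a non-empty prediction is not covered by the definition above; the sketch simply lets the usual Dice formula apply there, which yields 0.

```python
def dice_coefficient(tp, fp, fn):
    """Dice coefficient from confusion matrix counts with empty-target handling."""
    if tp + fp + fn == 0:
        # Both target and prediction are empty: defined as perfect agreement.
        return 1.0
    # Standard Dice: 2 * |intersection| / (|prediction| + |target|).
    return 2 * tp / (2 * tp + fp + fn)


print(dice_coefficient(0, 0, 0))  # both empty
print(dice_coefficient(2, 1, 1))  # partial overlap
print(dice_coefficient(0, 3, 0))  # empty target, non-empty prediction (assumed behavior)
```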

class pymia.evaluation.metric.categorical.FMeasure(beta: float = 1.0, metric: str = 'FMEASR')[source]¶

Bases: ConfusionMatrixMetric

Represents an F-measure metric.

Parameters:

beta (float) – The beta to trade-off precision and recall. Use 0.5 or 2 to calculate the F0.5 and F2 measure, respectively.
metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the F-measure for the configured beta (the F1 measure for the default beta of 1.0).
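The effect of `beta` can be illustrated with the standard F-beta definition, $F_\beta = (1 + \beta^2) \cdot P \cdot R / (\beta^2 \cdot P + R)$, where $P$ is precision and $R$ is recall. The sketch below is standalone; the function name is hypothetical, and returning 0.0 for undefined precision or recall is an assumption rather than documented library behavior.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta measure from confusion matrix counts."""
    if tp + fp == 0 or tp + fn == 0:
        return 0.0  # assumption: precision or recall is undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    beta_squared = beta * beta
    denominator = beta_squared * precision + recall
    if denominator == 0:
        return 0.0  # assumption: no true positives at all
    return (1 + beta_squared) * precision * recall / denominator


# Precision is 0.75 and recall is 0.6 for these counts.
for beta in (0.5, 1.0, 2.0):
    print(beta, f_measure(tp=3, fp=1, fn=2, beta=beta))
```

A beta below 1 weights precision more heavily, and a beta above 1 weights recall more heavily, so the F0.5 value here is the largest and the F2 value the smallest.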

class pymia.evaluation.metric.categorical.Fallout(metric: str = 'FALLOUT')[source]¶

Bases: ConfusionMatrixMetric

Represents a fallout (false positive rate) metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the fallout (false positive rate).

class pymia.evaluation.metric.categorical.FalseNegative(metric: str = 'FN')[source]¶

Bases: ConfusionMatrixMetric

Represents a false negative metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the false negatives.

class pymia.evaluation.metric.categorical.FalseNegativeRate(metric: str = 'FNR')[source]¶

Bases: ConfusionMatrixMetric

Represents a false negative rate metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the false negative rate.

class pymia.evaluation.metric.categorical.FalsePositive(metric: str = 'FP')[source]¶

Bases: ConfusionMatrixMetric

Represents a false positive metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the false positives.

class pymia.evaluation.metric.categorical.GlobalConsistencyError(metric: str = 'GCOERR')[source]¶

Bases: ConfusionMatrixMetric

Represents a global consistency error metric.

Implementation based on Martin 2001.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the global consistency error.

class pymia.evaluation.metric.categorical.HausdorffDistance(percentile: float = 100.0, metric: str = 'HDRFDST')[source]¶

Bases: DistanceMetric

Represents a Hausdorff distance metric.

Calculates the distance between the sets of non-zero pixels of two images using the following equation:

$H(A,B) = \max(h(A,B), h(B,A)),$

where

$h(A,B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert$

is the directed Hausdorff distance and $A$ and $B$ are the sets of non-zero pixels in the images.

Parameters:

percentile (float) – The percentile in the range (0, 100] to compute, i.e., 100 computes the Hausdorff distance and 95 computes the 95th percentile Hausdorff distance.
metric (str) – The identification string of the metric.

See also

Nikolov, S., Blackwell, S., Mendes, R., De Fauw, J., Meyer, C., Hughes, C., … Ronneberger, O. (2018). Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. http://arxiv.org/abs/1809.04430
Original implementation

calculate()[source]¶: Calculates the Hausdorff distance.
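The equation for $H(A,B)$ can be sketched on small point sets as follows. This standalone example covers only the plain Hausdorff distance (the 100th percentile case), assumes non-empty point sets given in physical coordinates, and does not reproduce the `percentile` parameter or the surface-based computation of the referenced implementation. The function names are illustrative.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point of `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed values."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


points_a = [(0, 0), (0, 1)]
points_b = [(0, 0), (0, 3)]
print(directed_hausdorff(points_a, points_b))
print(directed_hausdorff(points_b, points_a))
print(hausdorff(points_a, points_b))
```

Because a single outlying point determines the result, a lower percentile such as 95 is often preferred in practice.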

class pymia.evaluation.metric.categorical.InterclassCorrelation(metric: str = 'ICCORR')[source]¶

Bases: NumpyArrayMetric

Represents an interclass correlation metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the interclass correlation.

class pymia.evaluation.metric.categorical.JaccardCoefficient(metric: str = 'JACRD')[source]¶

Bases: ConfusionMatrixMetric

Represents a Jaccard coefficient metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the Jaccard coefficient.

class pymia.evaluation.metric.categorical.MahalanobisDistance(metric: str = 'MAHLNBS')[source]¶

Bases: NumpyArrayMetric

Represents a Mahalanobis distance metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the Mahalanobis distance.

class pymia.evaluation.metric.categorical.MutualInformation(metric: str = 'MUTINF')[source]¶

Bases: ConfusionMatrixMetric

Represents a mutual information metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the mutual information.

class pymia.evaluation.metric.categorical.Precision(metric: str = 'PRCISON')[source]¶

Bases: ConfusionMatrixMetric

Represents a precision metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the precision.

class pymia.evaluation.metric.categorical.PredictionArea(slice_number: int = -1, metric: str = 'PREDAREA')[source]¶

Bases: AreaMetric

Represents a prediction area metric.

Parameters:

slice_number (int) – The slice number to calculate the area. Defaults to -1, which will calculate the area on the intermediate slice.
metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the predicted area on a specified slice in mm².

class pymia.evaluation.metric.categorical.PredictionVolume(metric: str = 'PREDVOL')[source]¶

Bases: VolumeMetric

Represents a prediction volume metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the predicted volume in mm³.
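A volume in physical units is commonly obtained by multiplying the number of foreground voxels by the volume of one voxel. The sketch below assumes that approach and a flattened binary mask; the function name is hypothetical, and the page does not confirm that the library computes the volume in exactly this way.

```python
import math


def mask_volume_mm3(mask, spacing):
    """Volume of the non-zero voxels of a flat binary mask in cubic millimeters."""
    voxel_volume = math.prod(spacing)  # mm^3 per voxel, spacing given in mm
    voxel_count = sum(1 for value in mask if value)
    return voxel_count * voxel_volume


mask = [1, 0, 1, 1]
spacing = (0.5, 0.5, 2.0)  # mm along each image dimension
print(mask_volume_mm3(mask, spacing))
```

The same idea applies to an area on a single slice, using the two in-plane spacing values instead of all three.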

class pymia.evaluation.metric.categorical.ProbabilisticDistance(metric: str = 'PROBDST')[source]¶

Bases: NumpyArrayMetric

Represents a probabilistic distance metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the probabilistic distance.

class pymia.evaluation.metric.categorical.RandIndex(metric: str = 'RNDIND')[source]¶

Bases: ConfusionMatrixMetric

Represents a Rand index metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the Rand index.

class pymia.evaluation.metric.categorical.ReferenceArea(slice_number: int = -1, metric: str = 'REFAREA')[source]¶

Bases: AreaMetric

Represents a reference area metric.

Parameters:

slice_number (int) – The slice number to calculate the area. Defaults to -1, which will calculate the area on the intermediate slice.
metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the reference area on a specified slice in mm².

class pymia.evaluation.metric.categorical.ReferenceVolume(metric: str = 'REFVOL')[source]¶

Bases: VolumeMetric

Represents a reference volume metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the reference volume in mm³.

class pymia.evaluation.metric.categorical.Sensitivity(metric: str = 'SNSVTY')[source]¶

Bases: ConfusionMatrixMetric

Represents a sensitivity (true positive rate or recall) metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the sensitivity (true positive rate).

class pymia.evaluation.metric.categorical.Specificity(metric: str = 'SPCFTY')[source]¶

Bases: ConfusionMatrixMetric

Represents a specificity metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the specificity.
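Sensitivity, specificity, and fallout are closely related rates of the confusion matrix. The standalone sketch below uses their standard definitions; the function names are illustrative, and the handling of zero denominators (returning 0.0) is an assumption.

```python
def sensitivity(tp, fn):
    """True positive rate (recall): TP / (TP + FN)."""
    return tp / (tp + fn) if tp + fn else 0.0  # assumption for empty reference


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp) if tn + fp else 0.0  # assumption for no background


def fallout(fp, tn):
    """False positive rate: FP / (FP + TN), i.e. 1 - specificity."""
    return fp / (fp + tn) if fp + tn else 0.0  # assumption for no background


tp, fp, tn, fn = 2, 1, 2, 1
print(sensitivity(tp, fn), specificity(tn, fp), fallout(fp, tn))
```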

class pymia.evaluation.metric.categorical.SurfaceDiceOverlap(tolerance: float = 1, metric: str = 'SURFDICE')[source]¶

Bases: DistanceMetric

Represents a surface Dice coefficient overlap metric.

Parameters:

tolerance (float) – The tolerance of the surface distance in mm.
metric (str) – The identification string of the metric.

See also

Nikolov, S., Blackwell, S., Mendes, R., De Fauw, J., Meyer, C., Hughes, C., … Ronneberger, O. (2018). Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. http://arxiv.org/abs/1809.04430
Original implementation

calculate()[source]¶: Calculates the surface Dice coefficient overlap.

class pymia.evaluation.metric.categorical.SurfaceOverlap(tolerance: float = 1.0, prediction_to_reference: bool = True, metric: str = 'SURFOVLP')[source]¶

Bases: DistanceMetric

Represents a surface overlap metric.

Computes the overlap of the reference surface with the predicted surface and vice versa allowing a specified tolerance (maximum surface-to-surface distance that is regarded as overlapping). The overlapping fraction is computed by correctly taking the area of each surface element into account.

Parameters:

tolerance (float) – The tolerance of the surface distance in mm.
prediction_to_reference (bool) – Computes the prediction to reference if True, otherwise the reference to prediction.
metric (str) – The identification string of the metric.

See also

Nikolov, S., Blackwell, S., Mendes, R., De Fauw, J., Meyer, C., Hughes, C., … Ronneberger, O. (2018). Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. http://arxiv.org/abs/1809.04430
Original implementation

calculate()[source]¶: Calculates the surface overlap.

class pymia.evaluation.metric.categorical.TrueNegative(metric: str = 'TN')[source]¶

Bases: ConfusionMatrixMetric

Represents a true negative metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the true negatives.

class pymia.evaluation.metric.categorical.TruePositive(metric: str = 'TP')[source]¶

Bases: ConfusionMatrixMetric

Represents a true positive metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the true positives.

class pymia.evaluation.metric.categorical.VariationOfInformation(metric: str = 'VARINFO')[source]¶

Bases: ConfusionMatrixMetric

Represents a variation of information metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the variation of information.

class pymia.evaluation.metric.categorical.VolumeMetric(metric: str = 'VOL')[source]¶

Bases: SpacingMetric, ABC

Represents a volume metric base class.

Parameters:: metric (str) – The identification string of the metric.

class pymia.evaluation.metric.categorical.VolumeSimilarity(metric: str = 'VOLSMTY')[source]¶

Bases: ConfusionMatrixMetric

Represents a volume similarity metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the volume similarity.
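Taha and Hanbury define volume similarity as $VS = 1 - \frac{|FN - FP|}{2TP + FP + FN}$. The sketch below follows that definition; the function name is hypothetical, and raising an error for two empty segmentations is a choice made here, since this page does not state how the library handles that case.

```python
def volume_similarity(tp, fp, fn):
    """Volume similarity from confusion matrix counts."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # Both segmentations are empty; the ratio is undefined.
        raise ValueError('volume similarity is undefined for two empty segmentations')
    return 1 - abs(fn - fp) / denominator


print(volume_similarity(tp=2, fp=1, fn=3))
print(volume_similarity(tp=5, fp=2, fn=2))
```

Note that the metric compares volumes only: the second call returns 1.0 although the segmentations do not overlap perfectly, because the false positives and false negatives cancel out.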

Continuous metrics (`pymia.evaluation.metric.continuous`) module

The continuous module provides metrics to measure image reconstruction and regression performance.

class pymia.evaluation.metric.continuous.CoefficientOfDetermination(metric: str = 'R2')[source]¶

Bases: NumpyArrayMetric

Represents a coefficient of determination ($R^2$) metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the coefficient of determination ($R^2$).

See also

https://stackoverflow.com/a/45538060
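As a minimal sketch, the coefficient of determination can be computed with the common definition $R^2 = 1 - SS_{res} / SS_{tot}$, where $SS_{res}$ is the sum of squared differences between reference and prediction and $SS_{tot}$ is the sum of squared deviations of the reference from its mean. The function name and the error raised for a constant reference are assumptions for illustration.

```python
def r_squared(prediction, reference):
    """Coefficient of determination of `prediction` with respect to `reference`."""
    mean_reference = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(prediction, reference))
    ss_tot = sum((r - mean_reference) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError('R^2 is undefined for a constant reference')
    return 1 - ss_res / ss_tot


reference = [1.0, 2.0, 3.0, 4.0]
print(r_squared([1.0, 2.0, 3.0, 4.0], reference))  # perfect prediction
print(r_squared([2.5, 2.5, 2.5, 2.5], reference))  # always predicting the mean
```

The value is not symmetric in its arguments, so swapping prediction and reference generally changes the result.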

class pymia.evaluation.metric.continuous.MeanAbsoluteError(metric: str = 'MAE')[source]¶

Bases: NumpyArrayMetric

Represents a mean absolute error metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the mean absolute error.

class pymia.evaluation.metric.continuous.MeanSquaredError(metric: str = 'MSE')[source]¶

Bases: NumpyArrayMetric

Represents a mean squared error metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the mean squared error.

class pymia.evaluation.metric.continuous.NormalizedRootMeanSquaredError(metric: str = 'NRMSE')[source]¶

Bases: NumpyArrayMetric

Represents a normalized root mean squared error metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the normalized root mean squared error.

class pymia.evaluation.metric.continuous.PeakSignalToNoiseRatio(metric: str = 'PSNR')[source]¶

Bases: NumpyArrayMetric

Represents a peak signal to noise ratio metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the peak signal to noise ratio.
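A minimal sketch of the peak signal to noise ratio, assuming the common definition $PSNR = 10 \log_{10}(R^2 / MSE)$ in decibels, where $R$ is the data range. How the library chooses the data range is not stated on this page, so it is an explicit (hypothetical) argument here.

```python
import math


def psnr(prediction, reference, data_range):
    """Peak signal to noise ratio in dB for two flat sequences of equal length."""
    mse = sum((p - r) ** 2 for p, r in zip(prediction, reference)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise
    return 10 * math.log10(data_range ** 2 / mse)


reference = [0, 255]
prediction = [0, 245]
print(psnr(prediction, reference, data_range=255))
```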

class pymia.evaluation.metric.continuous.RootMeanSquaredError(metric: str = 'RMSE')[source]¶

Bases: NumpyArrayMetric

Represents a root mean squared error metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the root mean squared error.
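The mean absolute, mean squared, and root mean squared errors in this module follow from the element-wise differences between prediction and reference. A standalone sketch on flat sequences, with illustrative function names:

```python
import math


def mean_absolute_error(prediction, reference):
    """Mean of the absolute differences."""
    return sum(abs(p - r) for p, r in zip(prediction, reference)) / len(reference)


def mean_squared_error(prediction, reference):
    """Mean of the squared differences."""
    return sum((p - r) ** 2 for p, r in zip(prediction, reference)) / len(reference)


def root_mean_squared_error(prediction, reference):
    """Square root of the mean squared error, in the units of the data."""
    return math.sqrt(mean_squared_error(prediction, reference))


prediction = [1.0, 2.0, 3.0, 4.0]
reference = [1.0, 2.0, 3.0, 8.0]
print(mean_absolute_error(prediction, reference))
print(mean_squared_error(prediction, reference))
print(root_mean_squared_error(prediction, reference))
```

A single large deviation affects the squared errors more strongly than the absolute error, which is visible in the values 1.0, 4.0, and 2.0 for this example.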

class pymia.evaluation.metric.continuous.StructuralSimilarityIndexMeasure(metric: str = 'SSIM')[source]¶

Bases: NumpyArrayMetric

Represents a structural similarity index measure metric.

Parameters:: metric (str) – The identification string of the metric.

calculate()[source]¶: Calculates the structural similarity index measure.
