{ "cells": [ { "cell_type": "raw", "metadata": { "pycharm": { "name": "#%% raw\n" }, "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _example-evaluation1:\n" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Evaluation of results\n", "=====================\n", "\n", "This example shows how to use the `pymia.evaluation` package to evaluate predicted segmentations against reference ground truths.\n", "Common metrics in medical image segmentation are the Dice coefficient, an overlap-based metric, and the Hausdorff distance, a distance-based metric. Further, we also evaluate the volume similarity, a metric that does not consider the spatial overlap.\n", "The evaluation results are logged to the console and saved to a CSV file. Further, statistics (mean and standard deviation) are calculated over all evaluated segmentations, which are again logged to the console and saved to a CSV file.\n", "The CSV files could be loaded into any statistical software for further analysis and visualization.\n", "\n", "
\n", "\n", "Tip\n", "\n", "This example is available as Jupyter notebook at [./examples/evaluation/basic.ipynb](https://github.com/rundherum/pymia/blob/master/examples/evaluation/basic.ipynb) and Python script at [./examples/evaluation/basic.py](https://github.com/rundherum/pymia/blob/master/examples/evaluation/basic.py).\n", "\n", "
\n", "\n", "
\n", "\n", "Note\n", "\n", "To be able to run this example:\n", "\n", "- Get the example data by executing [./examples/example-data/pull_example_data.py](https://github.com/rundherum/pymia/blob/master/examples/example-data/pull_example_data.py).\n", "- Install pandas (`pip install pandas`).\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Import the required modules." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import glob\n", "import os\n", "\n", "import numpy as np\n", "import pymia.evaluation.metric as metric\n", "import pymia.evaluation.evaluator as eval_\n", "import pymia.evaluation.writer as writer\n", "import SimpleITK as sitk" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the paths to the data and the result CSV files." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "data_dir = '../example-data'\n", "\n", "result_file = '../example-data/results.csv'\n", "result_summary_file = '../example-data/results_summary.csv'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us create a list with the three metrics: the Dice coefficient, the Hausdorff distance, and the volume similarity. Note that we are interested in the outlier-robust 95th Hausdorff distance, and, therefore, pass the percentile as argument and adapt the metric's name." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "metrics = [metric.DiceCoefficient(), metric.HausdorffDistance(percentile=95, metric='HDRFDST95'), metric.VolumeSimilarity()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we need to define the labels we want to evaluate. In the provided example data, we have five labels for different brain structures. Here, we are only interested in three of them: white matter, grey matter, and the thalamus." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "labels = {1: 'WHITEMATTER',\n", " 2: 'GREYMATTER',\n", " 5: 'THALAMUS'\n", " }" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can initialize an evaluator with the metrics and labels." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "evaluator = eval_.SegmentationEvaluator(metrics, labels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now loop over the subjects of the example data. We will load the ground truth image as reference. An artificial segmentation (prediction) is created by eroding the ground truth. Both images, and the subject identifier are passed to the evaluator." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Evaluating Subject_2...\n", "Evaluating Subject_4...\n", "Evaluating Subject_3...\n", "Evaluating Subject_1...\n" ] } ], "source": [ "# get subjects to evaluate\n", "subject_dirs = [subject for subject in glob.glob(os.path.join(data_dir, '*')) if os.path.isdir(subject) and os.path.basename(subject).startswith('Subject')]\n", "\n", "for subject_dir in subject_dirs:\n", " subject_id = os.path.basename(subject_dir)\n", " print(f'Evaluating {subject_id}...')\n", "\n", " # load ground truth image and create artificial prediction by erosion\n", " ground_truth = sitk.ReadImage(os.path.join(subject_dir, f'{subject_id}_GT.mha'))\n", " prediction = ground_truth\n", " for label_val in labels.keys():\n", " # erode each label we are going to evaluate\n", " prediction = sitk.BinaryErode(prediction, [1] * prediction.GetDimension(), sitk.sitkBall, 0, label_val)\n", "\n", " # evaluate the \"prediction\" against the ground truth\n", " evaluator.evaluate(prediction, ground_truth, subject_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After we evaluated all subjects, we can use a CSV writer to write the evaluation results to a CSV file." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "writer.CSVWriter(result_file).write(evaluator.results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Further, we can use a console writer to display the results in the console." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Subject-wise results...\n", "SUBJECT LABEL DICE HDRFDST95 VOLSMTY\n", "Subject_1 GREYMATTER 0.313 9.165 0.313 \n", "Subject_1 THALAMUS 0.752 2.000 0.752 \n", "Subject_1 WHITEMATTER 0.642 6.708 0.642 \n", "Subject_2 GREYMATTER 0.298 10.863 0.298 \n", "Subject_2 THALAMUS 0.768 2.000 0.768 \n", "Subject_2 WHITEMATTER 0.654 6.000 0.654 \n", "Subject_3 GREYMATTER 0.287 8.718 0.287 \n", "Subject_3 THALAMUS 0.761 2.000 0.761 \n", "Subject_3 WHITEMATTER 0.641 6.164 0.641 \n", "Subject_4 GREYMATTER 0.259 8.660 0.259 \n", "Subject_4 THALAMUS 0.781 2.000 0.781 \n", "Subject_4 WHITEMATTER 0.649 6.000 0.649 \n" ] } ], "source": [ "print('\\nSubject-wise results...')\n", "writer.ConsoleWriter().write(evaluator.results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also report statistics such as the mean and standard deviation among all subjects using dedicated statistics writers. Note that you can pass any functions that take a list of floats and return a scalar value to the writers. Again, we will write a CSV file and display the results in the console." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Aggregated statistic results...\n", "LABEL METRIC STATISTIC VALUE\n", "GREYMATTER DICE MEAN 0.289\n", "GREYMATTER DICE STD 0.020\n", "GREYMATTER HDRFDST95 MEAN 9.351\n", "GREYMATTER HDRFDST95 STD 0.894\n", "GREYMATTER VOLSMTY MEAN 0.289\n", "GREYMATTER VOLSMTY STD 0.020\n", "THALAMUS DICE MEAN 0.766\n", "THALAMUS DICE STD 0.010\n", "THALAMUS HDRFDST95 MEAN 2.000\n", "THALAMUS HDRFDST95 STD 0.000\n", "THALAMUS VOLSMTY MEAN 0.766\n", "THALAMUS VOLSMTY STD 0.010\n", "WHITEMATTER DICE MEAN 0.647\n", "WHITEMATTER DICE STD 0.005\n", "WHITEMATTER HDRFDST95 MEAN 6.218\n", "WHITEMATTER HDRFDST95 STD 0.291\n", "WHITEMATTER VOLSMTY MEAN 0.647\n", "WHITEMATTER VOLSMTY STD 0.005\n" ] } ], "source": [ "functions = {'MEAN': np.mean, 'STD': np.std}\n", "writer.CSVStatisticsWriter(result_summary_file, functions=functions).write(evaluator.results)\n", "print('\\nAggregated statistic results...')\n", "writer.ConsoleStatisticsWriter(functions=functions).write(evaluator.results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we clear the results in the evaluator such that the evaluator is ready for the next evaluation." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "evaluator.clear()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let us have a look at the saved result CSV file." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": " SUBJECT LABEL DICE HDRFDST95 VOLSMTY\n0 Subject_1 GREYMATTER 0.313373 9.165151 0.313373\n1 Subject_1 THALAMUS 0.752252 2.000000 0.752252\n2 Subject_1 WHITEMATTER 0.642021 6.708204 0.642021\n3 Subject_2 GREYMATTER 0.298358 10.862780 0.298358\n4 Subject_2 THALAMUS 0.768488 2.000000 0.768488\n5 Subject_2 WHITEMATTER 0.654239 6.000000 0.654239\n6 Subject_3 GREYMATTER 0.287460 8.717798 0.287460\n7 Subject_3 THALAMUS 0.760978 2.000000 0.760978\n8 Subject_3 WHITEMATTER 0.641251 6.164414 0.641251\n9 Subject_4 GREYMATTER 0.258504 8.660254 0.258504\n10 Subject_4 THALAMUS 0.780754 2.000000 0.780754\n11 Subject_4 WHITEMATTER 0.649203 6.000000 0.649203", "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
SUBJECTLABELDICEHDRFDST95VOLSMTY
0Subject_1GREYMATTER0.3133739.1651510.313373
1Subject_1THALAMUS0.7522522.0000000.752252
2Subject_1WHITEMATTER0.6420216.7082040.642021
3Subject_2GREYMATTER0.29835810.8627800.298358
4Subject_2THALAMUS0.7684882.0000000.768488
5Subject_2WHITEMATTER0.6542396.0000000.654239
6Subject_3GREYMATTER0.2874608.7177980.287460
7Subject_3THALAMUS0.7609782.0000000.760978
8Subject_3WHITEMATTER0.6412516.1644140.641251
9Subject_4GREYMATTER0.2585048.6602540.258504
10Subject_4THALAMUS0.7807542.0000000.780754
11Subject_4WHITEMATTER0.6492036.0000000.649203
\n
" }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "pd.read_csv(result_file, sep=';')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And also at the saved statistics CSV file." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": " LABEL METRIC STATISTIC VALUE\n0 GREYMATTER DICE MEAN 0.289424\n1 GREYMATTER DICE STD 0.020083\n2 GREYMATTER HDRFDST95 MEAN 9.351496\n3 GREYMATTER HDRFDST95 STD 0.894161\n4 GREYMATTER VOLSMTY MEAN 0.289424\n5 GREYMATTER VOLSMTY STD 0.020083\n6 THALAMUS DICE MEAN 0.765618\n7 THALAMUS DICE STD 0.010458\n8 THALAMUS HDRFDST95 MEAN 2.000000\n9 THALAMUS HDRFDST95 STD 0.000000\n10 THALAMUS VOLSMTY MEAN 0.765618\n11 THALAMUS VOLSMTY STD 0.010458\n12 WHITEMATTER DICE MEAN 0.646678\n13 WHITEMATTER DICE STD 0.005355\n14 WHITEMATTER HDRFDST95 MEAN 6.218154\n15 WHITEMATTER HDRFDST95 STD 0.290783\n16 WHITEMATTER VOLSMTY MEAN 0.646678\n17 WHITEMATTER VOLSMTY STD 0.005355", "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
LABELMETRICSTATISTICVALUE
0GREYMATTERDICEMEAN0.289424
1GREYMATTERDICESTD0.020083
2GREYMATTERHDRFDST95MEAN9.351496
3GREYMATTERHDRFDST95STD0.894161
4GREYMATTERVOLSMTYMEAN0.289424
5GREYMATTERVOLSMTYSTD0.020083
6THALAMUSDICEMEAN0.765618
7THALAMUSDICESTD0.010458
8THALAMUSHDRFDST95MEAN2.000000
9THALAMUSHDRFDST95STD0.000000
10THALAMUSVOLSMTYMEAN0.765618
11THALAMUSVOLSMTYSTD0.010458
12WHITEMATTERDICEMEAN0.646678
13WHITEMATTERDICESTD0.005355
14WHITEMATTERHDRFDST95MEAN6.218154
15WHITEMATTERHDRFDST95STD0.290783
16WHITEMATTERVOLSMTYMEAN0.646678
17WHITEMATTERVOLSMTYSTD0.005355
\n
" }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.read_csv(result_summary_file, sep=';')\n" ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 1 }