Original objects

We differentiate between original and derived DICOM objects in the IDC portal and in discussions of IDC-hosted data. By original objects we mean DICOM objects produced by image acquisition equipment; MR, CT, and PET images fall into this category. By derived objects we mean objects generated by analysis or annotation of the original objects. Derived objects can contain, for example, volumetric segmentations of structures in the original images, or quantitative measurements of the objects in the image.

The IDC search portal aims to group attributes into categories specific to original and derived objects. Due to time constraints, the listing of attributes as of the IDC MVP does not yet correspond to the final desired organization and is expected to change.

Original objects

Most images in IDC are stored as series in which each slice of the image is a separate instance, with the pixel data of that slice stored in the PixelData attribute.

Open source libraries such as DCMTK, GDCM, ITK, and pydicom can be used to parse such files and load pixel data of the individual slices. Recovering geometry of the individual slices (spatial location and resolution) and reconstruction of the individual slices into a volume requires some extra consideration.

It is not safe to sort individual instances by file name, or by any attribute other than those that communicate image geometry (ImagePositionPatient, ImageOrientationPatient, and PixelSpacing). Doing so may result in incorrect geometric ordering of slices!
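One geometry-based way to order slices is to project each slice's ImagePositionPatient onto the slice normal, which is the cross product of the row and column direction cosines in ImageOrientationPatient. The following is a minimal sketch of that idea, not a complete implementation. It assumes every instance in the series shares the same orientation, and it uses plain dictionaries keyed by DICOM keyword as a stand-in for parsed datasets. The function names and the `name` key are hypothetical.

```python
import math


def slice_normal(iop):
    """Unit normal of the slice plane from the six ImageOrientationPatient values."""
    rx, ry, rz, cx, cy, cz = (float(v) for v in iop)
    # Cross product of the row and column direction cosines.
    normal = (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)
    length = math.sqrt(sum(c * c for c in normal))
    return tuple(c / length for c in normal)


def distance_along_normal(ipp, normal):
    """Signed distance of the slice origin along the slice normal."""
    return sum(float(p) * n for p, n in zip(ipp, normal))


def sort_slices(slices):
    """Sort slices geometrically; assumes all slices share one orientation."""
    normal = slice_normal(slices[0]["ImageOrientationPatient"])
    return sorted(
        slices,
        key=lambda s: distance_along_normal(s["ImagePositionPatient"], normal),
    )


# Three axial slices listed out of geometric order.
axial = [1, 0, 0, 0, 1, 0]
slices = [
    {"name": "c", "ImagePositionPatient": [0, 0, 5.0], "ImageOrientationPatient": axial},
    {"name": "a", "ImagePositionPatient": [0, 0, 0.0], "ImageOrientationPatient": axial},
    {"name": "b", "ImagePositionPatient": [0, 0, 2.5], "ImageOrientationPatient": axial},
]
ordered_names = [s["name"] for s in sort_slices(slices)]
```

With a library such as pydicom, the same two values would typically be read from the ImagePositionPatient and ImageOrientationPatient attributes of each dataset.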

We point this out because even some prominent examples use oversimplified approaches to recover image geometry from DICOM files. For example, this DICOM data preprocessing tutorial from Kaggle uses ImagePositionPatient alone to infer slice spacing. This approach computes an incorrect slice spacing for oblique acquisitions (see the example below).

Using ImagePositionPatient[2] to calculate slice spacing for an oblique acquisition leads to an incorrect result
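A small worked example may make the discrepancy concrete. The numbers below are made up for illustration: two adjacent slices of an acquisition tilted about the x axis, with slice origins 5 mm apart along the slice normal. The sketch assumes the ImageOrientationPatient direction cosines are unit length and orthogonal, so their cross product is already a unit normal. The helper names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def spacing_along_normal(ipp_a, ipp_b, iop):
    """Distance between two slice origins measured along the slice normal."""
    normal = cross(iop[:3], iop[3:])  # unit length if the cosines are orthonormal
    delta = [b - a for a, b in zip(ipp_a, ipp_b)]
    return abs(dot(delta, normal))


# Illustrative oblique orientation: the column direction is tilted out of the axial plane.
iop = [1.0, 0.0, 0.0, 0.0, 0.8, 0.6]
ipp_a = [0.0, 0.0, 0.0]
ipp_b = [0.0, -3.0, 4.0]

naive_spacing = abs(ipp_b[2] - ipp_a[2])                # z difference only: 4.0 mm
true_spacing = spacing_along_normal(ipp_a, ipp_b, iop)  # approximately 5.0 mm
```

In this example the z-only estimate underestimates the spacing by 20 percent, and the error grows with the tilt angle.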

Although in most cases a DICOM image series corresponds to a single traversal of a 3D volume, in general it may contain multiple slices for the same spatial location (e.g., for temporally resolved acquisitions). Unfortunately, it is also possible that a DICOM series, as available to you, has missing slices and, as a result, inconsistent spacing between slices.
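Both situations can be flagged before attempting to build a volume. The sketch below takes the position of each slice along the slice normal, in millimeters, and reports how many slices repeat a location and whether the remaining steps are uniform. It is a heuristic under stated assumptions: the smallest non-zero step is taken as the nominal spacing, and the tolerance `tol` is an arbitrary illustrative value. The function name and the returned keys are hypothetical.

```python
def check_slice_positions(positions, tol=1e-3):
    """Report repeated locations and non-uniform spacing.

    positions: one distance along the slice normal (mm) per instance.
    tol: tolerance in mm for treating two positions or steps as equal.
    """
    ordered = sorted(positions)
    steps = [b - a for a, b in zip(ordered, ordered[1:])]

    # Consecutive slices closer than tol occupy the same spatial location.
    duplicates = sum(1 for step in steps if step <= tol)
    real_steps = [step for step in steps if step > tol]

    if not real_steps:
        return {"duplicates": duplicates, "spacing": None, "uniform": True}

    # Assumption: the smallest step is the nominal spacing, so a missing
    # slice shows up as a larger step.
    spacing = min(real_steps)
    uniform = all(abs(step - spacing) <= tol for step in real_steps)
    return {"duplicates": duplicates, "spacing": spacing, "uniform": uniform}


# A series with one missing slice between 5.0 mm and 10.0 mm.
gap_report = check_slice_positions([0.0, 2.5, 5.0, 10.0])

# A temporally resolved series: every location appears twice.
repeat_report = check_slice_positions([0.0, 0.0, 2.5, 2.5, 5.0, 5.0])
```

A series that reports duplicates usually needs to be split by an additional attribute, such as acquisition time, before each subset can be treated as a single volume.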

Rather than implementing slice sorting on your own, you can use one of the existing tools to reconstruct the image volume.
