In the IDC portal and in discussions of IDC-hosted data, we distinguish between original and derived DICOM objects. By original objects we mean DICOM objects produced by image acquisition equipment; MR, CT, and PET images fall into this category. By derived objects we mean objects generated by analysis or annotation of the original objects. Derived objects can contain, for example, volumetric segmentations of structures in the original images, or quantitative measurements of the objects in the image.
Most of the images stored on IDC are saved as objects that store individual slices of the image in separate instances of a series, with the image stored in the
Open source libraries such as DCMTK, GDCM, ITK, and pydicom can be used to parse such files and load the pixel data of individual slices. Recovering the geometry of the individual slices (spatial location and resolution) and reconstructing the slices into a volume require some extra consideration.
We point this out because even some prominent examples use oversimplified approaches to recover image geometry from DICOM files. As an example, this DICOM data preprocessing tutorial from Kaggle uses ImagePositionPatient alone to infer slice spacing. This approach computes incorrect slice spacing for oblique acquisitions (see example below).
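To make the oblique case concrete, here is a minimal sketch that compares two ways of estimating slice spacing. It assumes the values of ImagePositionPatient and ImageOrientationPatient have already been read from each instance (for example with pydicom) and passed in as plain tuples. The helper names `slice_normal` and `distance_along_normal` and the synthetic 30-degree geometry are illustrative choices, not part of any library or of the tutorial mentioned above.

```python
import math

def slice_normal(iop):
    """Normal to the slice plane from ImageOrientationPatient (6 values)."""
    rx, ry, rz, cx, cy, cz = iop
    # Cross product of the row and column direction cosines.
    return (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)

def distance_along_normal(ipp, normal):
    """Project ImagePositionPatient (3 values) onto the slice normal."""
    return sum(p * n for p, n in zip(ipp, normal))

# Synthetic oblique series (an assumption for illustration): the slice plane
# is tilted 30 degrees about the x axis, with slices 5 mm apart along the normal.
angle = math.radians(30)
iop = (1.0, 0.0, 0.0, 0.0, math.cos(angle), math.sin(angle))
normal = slice_normal(iop)
positions = [tuple(5.0 * k * n for n in normal) for k in range(4)]

projected = [distance_along_normal(p, normal) for p in positions]
true_spacing = projected[1] - projected[0]
# Oversimplified estimate: difference of the z components only.
naive_spacing = abs(positions[1][2] - positions[0][2])

print(f"spacing along normal: {true_spacing:.3f} mm")
print(f"z-difference only:    {naive_spacing:.3f} mm")
```

For this synthetic geometry the projection recovers the 5 mm spacing, while the z-only difference underestimates it by a factor of cos(30°). For a purely axial acquisition the two estimates coincide, which is why the shortcut often goes unnoticed.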
Although in most cases a DICOM image series corresponds to a single traversal of a 3D volume, in general it may contain multiple slices for the same spatial location (e.g., for temporally resolved acquisitions). Unfortunately, it is also possible that a DICOM series, as available to you, has missing slices and, as a result, inconsistent spacing between slices.
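Both situations can be detected before attempting a reconstruction. The sketch below sorts slices by their position along the slice normal and reports repeated locations and irregular steps. It assumes all instances share one ImageOrientationPatient, takes the smallest non-zero step as the nominal spacing (a heuristic), and uses an arbitrary tolerance of 0.001 mm; `check_series_geometry` is a hypothetical helper name, not a library function.

```python
def slice_normal(iop):
    """Normal to the slice plane from ImageOrientationPatient (6 values)."""
    rx, ry, rz, cx, cy, cz = iop
    return (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)

def check_series_geometry(positions, iop, tol=1e-3):
    """Sort slices along the normal and report duplicates and spacing gaps.

    positions: ImagePositionPatient triples, one per instance.
    iop: ImageOrientationPatient, assumed identical for all instances.
    tol: tolerance in mm (an arbitrary choice for this sketch).
    """
    normal = slice_normal(iop)
    dist = sorted(sum(p * n for p, n in zip(ipp, normal)) for ipp in positions)
    steps_all = [b - a for a, b in zip(dist, dist[1:])]
    # Steps close to zero mean several slices share one spatial location.
    duplicates = sum(1 for s in steps_all if s < tol)
    steps = [s for s in steps_all if s >= tol]
    # Heuristic: treat the smallest non-zero step as the nominal spacing.
    spacing = min(steps) if steps else None
    # Steps that differ from the nominal spacing suggest missing slices.
    irregular = [s for s in steps if abs(s - spacing) > tol]
    return {"spacing": spacing, "duplicates": duplicates,
            "irregular_steps": irregular}

# Axial example: the slice at z = 7.5 is missing and z = 2.5 appears twice.
axial = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)
ipps = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0), (0.0, 0.0, 2.5),
        (0.0, 0.0, 10.0), (0.0, 0.0, 2.5)]
report = check_series_geometry(ipps, axial)
print(report)
```

A non-zero `duplicates` count hints at a multi-volume series (for example, several time points), and a non-empty `irregular_steps` list hints at missing slices; in either case a naive stack of the slices would not form a valid regular volume.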
Instead of implementing slice sorting on your own, you can use one of the existing tools to reconstruct the image volume: