Data model
IDC relies on the DICOM data model to organize images and image-derived data. IDC also includes certain attributes and data types that fall outside the DICOM data model. The Entity-Relationship (E-R) diagram and the examples below summarize a simplified view of the IDC data model (the Mermaid documentation explains how to interpret the notation used in this E-R diagram).
IDC content is organized in Collections: groups of DICOM files that were collected through a specific research activity.
Collections are organized into Programs, which group related collections, or collections that were contributed under the same funding initiative or consortium. Example: the TCGA program contains TCGA-GBM, TCGA-BRCA and other collections. You will see Collections nested under Programs in the upper left section of the IDC Portal. You will also see the list of collections that meet the filter criteria in the top table on the right-hand side of the portal interface.
Individual DICOM files included in the collection contain attributes that organize content according to the DICOM data model.
Each collection contains data for one or more cases (patients). Data for an individual patient is organized in DICOM studies, each of which groups the images collected in a single imaging exam/encounter session. Studies are composed of DICOM series, which in turn consist of DICOM instances. Each DICOM instance corresponds to a single file on disk. As an example, in radiology imaging, individual instances correspond to image slices in multi-slice acquisitions, and in digital pathology you will see a separate file/instance for each resolution layer of the image pyramid. When using the IDC Portal, you will never encounter individual instances; you will only see them if you download data to your computer.
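The nesting described above can be illustrated with a minimal Python sketch. It assumes each downloaded file is summarized by a flat record keyed by the standard DICOM attributes PatientID, StudyInstanceUID, SeriesInstanceUID and SOPInstanceUID. The `collection_id` key, the `build_hierarchy` helper and all identifier values are hypothetical and chosen only for illustration; this is not an IDC API.

```python
from collections import defaultdict

# Hypothetical flat records, one per DICOM instance (one file on disk).
# "collection_id" is an assumed label; the other keys are standard DICOM attributes.
# All identifier values are made-up placeholders, not real UIDs.
records = [
    {"collection_id": "TCGA-GBM", "PatientID": "patient-A",
     "StudyInstanceUID": "1.1", "SeriesInstanceUID": "1.1.1", "SOPInstanceUID": "1.1.1.1"},
    {"collection_id": "TCGA-GBM", "PatientID": "patient-A",
     "StudyInstanceUID": "1.1", "SeriesInstanceUID": "1.1.1", "SOPInstanceUID": "1.1.1.2"},
    {"collection_id": "TCGA-GBM", "PatientID": "patient-A",
     "StudyInstanceUID": "1.1", "SeriesInstanceUID": "1.1.2", "SOPInstanceUID": "1.1.2.1"},
    {"collection_id": "TCGA-BRCA", "PatientID": "patient-B",
     "StudyInstanceUID": "2.1", "SeriesInstanceUID": "2.1.1", "SOPInstanceUID": "2.1.1.1"},
]


def build_hierarchy(instance_records):
    """Nest records as collection -> patient -> study -> series -> [instance UIDs]."""
    tree = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    )
    for rec in instance_records:
        series = tree[rec["collection_id"]][rec["PatientID"]][
            rec["StudyInstanceUID"]][rec["SeriesInstanceUID"]]
        series.append(rec["SOPInstanceUID"])
    return tree


tree = build_hierarchy(records)
for collection, patients in tree.items():
    for patient, studies in patients.items():
        for study, series_map in studies.items():
            for series_uid, instances in series_map.items():
                print(collection, patient, study, series_uid, len(instances))
```

Each leaf list holds the instances of one series, which mirrors the fact that a series is the smallest unit visible in the portal while the individual files only appear after download.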
Analysis results collections are an important concept in IDC. They contain analysis results that were not contributed as part of any specific collection. Such analysis results might be contributed by investigators unrelated to those who submitted the analyzed images, and may span images across multiple collections.
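To make the "spanning multiple collections" point concrete, here is a small sketch of one way to model an analysis results collection. The `DerivedSeries` and `AnalysisResult` classes, their fields, and the example values are assumptions introduced for illustration and do not reflect the actual IDC schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DerivedSeries:
    """Hypothetical record for one derived series (e.g. a segmentation)."""
    series_uid: str         # placeholder identifier of the derived series
    source_collection: str  # collection that holds the analyzed images


@dataclass
class AnalysisResult:
    """Hypothetical analysis results collection made of derived series."""
    name: str
    series: list = field(default_factory=list)

    def collections_spanned(self):
        # Unique source collections, sorted for a stable order.
        return sorted({s.source_collection for s in self.series})


result = AnalysisResult(
    name="example-segmentations",  # made-up name
    series=[
        DerivedSeries("9.1", "TCGA-GBM"),
        DerivedSeries("9.2", "TCGA-GBM"),
        DerivedSeries("9.3", "TCGA-BRCA"),
    ],
)
print(result.collections_spanned())
```

Keeping the source collection on each derived series, rather than on the analysis result as a whole, is what allows a single analysis result to refer to images from more than one collection.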