IDC User Guide
Core functions


Last updated 1 month ago


Easy and efficient access to public cancer imaging data

We ingest and distribute datasets from a variety of sources and contributors, focusing primarily on large data collection initiatives sponsored by the US National Cancer Institute.

At this time, we do not have the resources to prioritize receipt of imaging data from individual PIs (but we do encourage submissions of annotations/analysis results for existing IDC data!). Nevertheless, if you feel you might have a compelling dataset, please email us at support+submissions@canceridc.dev.

On ingestion, we harmonize images and image-derived data into DICOM format for interoperability, whenever data is represented in a non-DICOM format.

Upon conversion, the data undergoes Extract-Transform-Load (ETL), which extracts DICOM metadata to make the data searchable, and ingests the DICOM files into public S3 storage buckets and a DICOMweb store. Once the data is released, we provide various interfaces to access data and metadata.
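As an illustration of the S3 access route mentioned above, the sketch below builds a manifest of `cp` commands of the kind that the s5cmd command-line tool can execute in batch. It is a minimal sketch under stated assumptions, not an official IDC utility: the function name `build_s5cmd_manifest`, the bucket name `example-bucket` and the series identifiers are hypothetical placeholders, and real S3 locations would need to come from IDC metadata.

```python
def build_s5cmd_manifest(series_prefixes, dest_dir="."):
    """Return text with one s5cmd `cp` command per S3 prefix.

    `series_prefixes` is a list of S3 URLs, each pointing at a folder
    of DICOM files (hypothetical values in this sketch).
    """
    # Make sure the destination is interpreted as a directory.
    dest = dest_dir.rstrip("/") + "/"
    lines = []
    for prefix in series_prefixes:
        if not prefix.startswith("s3://"):
            raise ValueError(f"not an S3 URL: {prefix!r}")
        # The trailing wildcard selects every object under the prefix.
        src = prefix.rstrip("/") + "/*"
        lines.append(f"cp {src} {dest}")
    return "".join(line + "\n" for line in lines)


# Placeholder prefixes; these are NOT real IDC locations.
example_manifest = build_s5cmd_manifest(
    ["s3://example-bucket/series-uuid-1", "s3://example-bucket/series-uuid-2/"],
    dest_dir="./idc_downloads",
)
print(example_manifest, end="")
```

Saved to a file (for example `manifest.s5cmd`), such a manifest would typically be executed with `s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com run manifest.s5cmd`, assuming the files reside in a public AWS bucket. The idc-index package mentioned later on this page wraps this kind of download in Python convenience functions.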

Tools to simplify the use of the data

We are actively developing a variety of capabilities that make it easier for users to work with the data in IDC. Examples of those tools include:

  • IDC Portal provides an interactive browser-based interface for exploration of IDC data

  • we are the maintainers of Slim, an open-source viewer of DICOM digital pathology images; Slim is integrated with IDC Portal for visualizing pathology images and image-derived data available in IDC

  • we are actively contributing to the OHIF Viewer, and rely on it for visualizing radiology images and image-derived data

  • idc-index is a Python package that provides convenience functions for accessing IDC data, including efficient download from IDC public S3 buckets

  • the SlicerIDCBrowser extension for 3D Slicer can be used for interactive download of IDC data

  • we are contributing to a variety of tools that aim to simplify the use of DICOM in cancer imaging research; these include OpenSlide and BioFormats bfconvert, which can be used for conversion between the DICOM Whole Slide Imaging (WSI) format and other slide microscopy formats, and the dcmqi library for converting image analysis results to and from DICOM representation

Support of continuous enrichment of data

We welcome you to apply to contribute analysis results and annotations of the images available in IDC! These can be expert manual annotations, analysis results generated using AI tools, segmentations, contours, metadata attributes describing the data (e.g., annotation of the scan type), or expert evaluations of the quality of existing AI-generated annotations in IDC.

If you would like your annotations/analysis results to be considered, you must establish the value of your contribution (e.g., describe the qualifications of the experts performing manual annotations, or demonstrate the robustness of the AI tool you are applying to images with a peer-reviewed publication or other type of evidence), and be willing to share your contribution under a permissive Creative Commons Attribution CC BY 4.0 license.

See our curation policy for more details, and reach out by sending email to support+submissions@canceridc.dev with any questions or inquiries. Every application will be reviewed by IDC stakeholders.

If your contribution is accepted by the IDC stakeholders:

  • we will work with you to choose the appropriate DICOM object type for your data and convert it into DICOM representation

  • upon conversion, we will create a Zenodo entry under the NCI Imaging Data Commons Zenodo community for your contribution so that you get a Digital Object Identifier (DOI), citation and recognition of your contribution

  • once published in IDC

    • your data will become searchable and viewable in IDC Portal, so it is easier for the users of your data to discover and work with your data

    • files can be downloaded very efficiently using the S3 interface and idc-index

Integration of cancer imaging data with other components of CRDC

IDC is a component of the broader NCI Cancer Research Data Commons (CRDC), giving you access to the following:

  • Cancer Data Aggregator (CDA) can be used to find data related to the images in IDC in Genomics Data Commons, Proteomics Data Commons, and Integrated Canine Data Commons

  • Broad FireCloud and Seven Bridges Cancer Genomics Cloud (SB-CGC) can be used to apply analysis tools to the data in IDC (you can read more about how this can be done in a preprint from the IDC team)

  • the MHub.AI platform curates a growing number of cancer imaging AI models that can be applied directly to the DICOM data available in IDC
Schematic summary of the IDC data ingestion and release process.
Although IDC data is stored in DICOM format, it can be converted into alternative research representations using open-source tools.