IDC User Guide
Frequently asked questions

Last updated 27 days ago

How to download data from IDC?

Check out the Downloading data documentation page!

How do I get my data into IDC?

IDC currently prioritizes submissions from NCI-funded driving projects and data from specially selected projects (see the Curation policy).

  • If you would like to submit images, you are responsible for de-identifying them first, documenting the de-identification process, and submitting that documentation for review by IDC stakeholders.

  • We welcome submissions of image-derived data (expert annotations, AI-generated segmentations) for images already in IDC. See the IDC Zenodo community to learn about the requirements for such submissions.

IDC works closely with The Cancer Imaging Archive (TCIA) and mirrors TCIA public collections. If you submit your DICOM data to TCIA and it is released as a public collection, it will automatically become available in IDC in a subsequent release.

If you are interested in making your data available within IDC, please contact us by email at support+submissions@canceridc.dev.

How much does it cost to use the cloud?

IDC data is stored in cloud buckets, and you can search and download data from IDC for free and without login.

If you would like to use the cloud to analyze the data, we recommend starting with the free tier of Google Colab, which gives you free access to a cloud-hosted VM with a GPU for experimenting with analysis workflows on IDC data. If you are an NIH-funded researcher, you may be eligible for a free allocation via NIH Cloud Lab. US-based researchers can also access free cloud-based computing resources via ACCESS program allocations.

What is the status of IDC?

The IDC pilot release took place in Fall 2020, followed by the production release in September 2021.

What data is available?

We host most of the public collections from The Cancer Imaging Archive (TCIA). We also host HTAN and other pathology images not hosted by TCIA. You can review the complete, up-to-date list of collections included in IDC.

How to acknowledge IDC?

Please cite the latest paper from the IDC team:

Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W. L., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence. Radiographics 43, (2023). https://doi.org/10.1148/rg.230180

Please also make sure you acknowledge the specific data collections you used in your analysis.

What is the difference between IDC and TCIA?

IDC and TCIA are partners in providing FAIR data for cancer imaging researchers. Some functions of the two resources are similar, but there are also key differences. The table below summarizes the similarities and differences.

Function | IDC | TCIA
--- | --- | ---
De-identification | no, IDC can only host data already de-identified | yes
Cloud-based data co-located with compute resources | yes | no
Conversion of pathology images and image-derived data into DICOM format | yes | no
Private data collections | no | yes
Public data collections | yes | yes
Version control of the data | yes | partial

Where do I learn more about other components of CRDC?

The main website for the Cancer Research Data Commons (CRDC) is https://datacommons.cancer.gov/

What about non-imaging data that accompanies IDC collections?

Clinical data shared by the submitters is available for a number of imaging collections in IDC. Please see this tutorial on how to search that data and how to link clinical data with imaging metadata.

Many of the imaging collections are also accompanied by genomics or proteomics data. The CRDC Cancer Data Aggregator (CDA) provides an API to locate such related datasets.

I want to search IDC content using an attribute not available in the portal

IDC Portal gives you access to only a small subset of the metadata accompanying IDC images. If you want to learn more about what is available, you have several options:

  • This notebook from our Getting Started tutorial series explains how to use idc-index, a Python package that aims to simplify access to IDC data.

  • This more advanced notebook will help you get started with searching IDC metadata in BigQuery, which gives you access to all of the DICOM metadata extracted from IDC-hosted files.

  • If you are not comfortable writing queries or coding in Python, you can use this DataStudio dashboard to search using some of the attributes that are not available through the portal. You can also extend this dashboard to include additional attributes.
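
As a rough illustration of searching by attributes that the portal does not expose, the sketch below builds a SQL query from attribute filters and runs it through idc-index. It is a minimal sketch and not an official recipe: the helper names `build_series_query` and `search_idc` are hypothetical, and the column names (`collection_id`, `PatientID`, `SeriesInstanceUID`, `Modality`, `BodyPartExamined`) and the `index` table name are assumptions about the idc-index metadata table that you should check against the idc-index documentation.

```python
import re

# Columns returned for every matching series (assumed to exist in the index).
SELECT_COLUMNS = ("collection_id", "PatientID", "SeriesInstanceUID")


def build_series_query(filters, table="index", limit=10):
    """Return SQL selecting series whose attributes equal the given values."""
    conditions = []
    for column, value in sorted(filters.items()):
        # Accept only plain identifiers as column names to avoid SQL injection.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", column):
            raise ValueError(f"invalid column name: {column!r}")
        # Escape single quotes inside the value by doubling them.
        escaped = str(value).replace("'", "''")
        conditions.append(f"{column} = '{escaped}'")
    where = " AND ".join(conditions) if conditions else "TRUE"
    return (
        f"SELECT {', '.join(SELECT_COLUMNS)} FROM {table} "
        f"WHERE {where} LIMIT {int(limit)}"
    )


def search_idc(filters, limit=10):
    """Run the query against the metadata table bundled with idc-index."""
    # Imported lazily so the query builder works without idc-index installed
    # (assumption: installed with `pip install idc-index`).
    from idc_index import IDCClient

    client = IDCClient()
    return client.sql_query(build_series_query(filters, limit=limit))
```

With idc-index installed, calling `search_idc({"Modality": "MR", "BodyPartExamined": "PROSTATE"})` should return the matching series as a table, assuming those columns are present. The same query text could also be pointed at a BigQuery table by passing a different `table` value, which is the route to the full DICOM metadata described above.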