IDC User Guide
  • Welcome!
  • 🚀Getting started
  • Core functions
  • Frequently asked questions
  • Support
  • Key pointers
  • Publications
  • IDC team
  • Acknowledgments
  • Jobs
  • Data
    • Introduction
    • Data model
    • Data versioning
    • Organization of data
      • Files and metadata
      • Resolving CRDC Globally Unique Identifiers (GUIDs)
      • Clinical data
      • Organization of data, v2 through V13 (deprecated)
        • Files and metadata
        • Resolving CRDC Globally Unique Identifiers (GUIDs)
        • Clinical data
      • Organization of data in v1 (deprecated)
    • Downloading data
      • Downloading data with s5cmd
      • Directly loading DICOM objects from Google Cloud or AWS in Python
    • Data release notes
    • Data known issues
  • Tutorials
    • Portal tutorial
    • Python notebook tutorials
    • Slide microscopy
      • Using QuPath for visualization
  • DICOM
    • Introduction to DICOM
    • DICOM data model
    • Original objects
    • Derived objects
      • DICOM Segmentations
      • DICOM Radiotherapy Structure Sets
      • DICOM Structured Reports
    • Coding schemes
    • DICOM-TIFF dual personality files
    • IDC DICOM white papers
  • Portal
    • Getting started
    • Exploring and subsetting data
      • Configuring your search
      • Exploring search results
      • Data selection and download
    • Visualizing images
    • Proxy policy
    • Viewer release notes
    • Portal release notes
  • API
    • Getting Started
    • IDC API Concepts
    • Manifests
    • Accessing the API
    • Endpoint Details
    • V1 API
      • Getting Started
      • IDC Data Model Concepts
      • Accessing the API
      • Endpoint Details
      • Release Notes
  • Cookbook
    • Colab notebooks
    • BigQuery
    • Looker dashboards
      • Dashboard for your cohort
      • More dashboard examples
    • ACCESS allocations
    • Compute engine
      • 3D Slicer desktop VM
      • Using a BQ Manifest to Load DICOM Files onto a VM
      • Using VS Code with GCP VMs
      • Security considerations
    • NCI Cloud Resources
Powered by GitBook
On this page
  • Highlights
  • Functionality

Was this helpful?

Edit on GitHub
Export as PDF

Welcome!

NextGetting started

Last updated 1 month ago

Was this helpful?

If you need support with IDC or have any questions, please open a new topic in IDC User Forum (preferred) or send email to support@canceridc.dev.

Would you rather discuss your questions in an meeting with an expert from the IDC team? Book a 1-on-1 support session here: https://tinyurl.com/idc-help-request

NCI Imaging Data Commons (IDC) is a cloud-based environment containing publicly available cancer imaging data co-located with analysis and exploration tools. IDC is a node within the broader NCI Cancer Research Data Commons (CRDC) infrastructure that provides secure access to a large, comprehensive, and expanding collection of cancer research data.

Highlights

  • >85 TB of data: IDC contains radiology, brightfield (H&E) and fluorescence slide microscopy images, along with image-derived data (annotations, segmentations, quantitative measurements) and accompanying clinical data

  • free: all of the data in IDC is publicly available: no registration, no access requests

  • commercial-friendly: >95% of the data in IDC is covered by the permissive CC-BY license, which allows commercial reuse (small subset of data is covered by the CC-NC license); each file in IDC is tagged with the license to make it easier for you to understand and follow the rules

  • cloud-based: all of the data in IDC is available from both Google and AWS public buckets: fast and free to download, no out-of-cloud egress fees

  • harmonized: all of the images and image-derived data in IDC is harmonized into standard DICOM representation

Functionality

IDC is as much about data as it is about what you can do with the data! We maintain and actively develop a variety of tools that are designed to help you efficiently navigate, access and analyze IDC data:

  • exploration: start with the IDC Portal to get an idea of the data available

  • visualization: examine images and image-derived annotations and analysis results from the convenience of your browser using integrated OHIF, VolView and Slim open source viewers

  • programmatic access: use idc-index python package to perform search, download and other operations programmatically

  • cohort building: use rich and extensive metadata to build subsets of data programmatically using idc-index or BigQuery SQL

  • download: use your favorite S3 API client or idc-index to efficiently fetch any of the IDC files from our public buckets

  • analysis: conveniently access IDC files and metadata from the tools that are cloud-native, such as Google Colab or Looker; fetch IDC data directly into 3D Slicer using SlicerIDCBrowser extension

The overview of IDC is available in this open access publication. If you use IDC, please acknowledge us by citing it!

Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180

IDC data release v20 summary; see live dashboard here.