Resolving CRDC Globally Unique Identifiers (GUIDs)
As described elsewhere in this documentation, a UUID identifies a particular version of an IDC data object. Thus, there is a UUID for every version of every DICOM instance in IDC-hosted data. An IDC BigQuery manifest optionally includes the UUID (called a crdc_instance_uuid) of each instance (version) in the cohort.
Each such UUID can be used to form a DRS ID that has been indexed by the Data Commons Framework (DCF), and that can be used to access data that defines that object. In particular, this data includes the GCS and AWS URLs of the DICOM instance file. Though the GCS or AWS URL of an instance might change over time, the UUID of an instance can always be resolved to obtain its current URLs. Thus, for long-term curation of data, it is recommended to record instance UUIDs.
The data object returned by the server is a GA4GH DRS DrsObject.
This is a typical IDC instance UUID, identifying one version of a DICOM instance:
641121f1-5ca0-42cc-9156-fb5538c14355
and this is the corresponding DRS ID:
dg.4DFC/641121f1-5ca0-42cc-9156-fb5538c14355
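The mapping from UUID to DRS ID shown above can be sketched in Python. This is a minimal illustration, not part of any IDC or CRDC library: the helper name drs_id_from_uuid and the constant IDC_DRS_PREFIX are hypothetical, and the prefix dg.4DFC is taken from the single example above on the assumption that it applies to all IDC instance UUIDs.

```python
import uuid

# Prefix copied from the example DRS ID above; assumed to be the same
# for every IDC instance UUID.
IDC_DRS_PREFIX = "dg.4DFC"


def drs_id_from_uuid(instance_uuid: str) -> str:
    """Return the DRS ID for an IDC instance UUID (hypothetical helper)."""
    # uuid.UUID raises ValueError on malformed input; str() yields the
    # canonical lowercase, hyphenated form.
    canonical = str(uuid.UUID(instance_uuid))
    return f"{IDC_DRS_PREFIX}/{canonical}"


print(drs_id_from_uuid("641121f1-5ca0-42cc-9156-fb5538c14355"))
# prints: dg.4DFC/641121f1-5ca0-42cc-9156-fb5538c14355
```

Validating the UUID before building the ID catches truncated or mistyped values from a manifest early, before any request is sent.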
A DRS ID can be resolved by appending it to the following URL, which is the resolution service within CRDC:
https://nci-crdc.datacommons.io/ga4gh/drs/v1/objects/
For example, the following curl command:
>> curl https://nci-crdc.datacommons.io/ga4gh/drs/v1/objects/dg.4DFC/641121f1-5ca0-42cc-9156-fb5538c14355
returns a DrsObject whose access_methods component includes a URL for each of the corresponding files in Google Cloud Storage (GCS) and AWS S3.
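The same resolution can be scripted. The sketch below uses only the Python standard library. The function names fetch_drs_object and access_urls are hypothetical, and the field layout it reads (each access_methods entry carrying a type and an access_url object with a url field) follows the GA4GH DRS specification; it is assumed here that the CRDC service returns that layout.

```python
import json
import urllib.request

# Resolution service URL given in the text above.
RESOLVER_BASE = "https://nci-crdc.datacommons.io/ga4gh/drs/v1/objects/"


def fetch_drs_object(drs_id: str, timeout: float = 30.0) -> dict:
    """GET the DrsObject for a DRS ID and parse the JSON response."""
    with urllib.request.urlopen(RESOLVER_BASE + drs_id, timeout=timeout) as resp:
        return json.load(resp)


def access_urls(drs_object: dict) -> list:
    """Return (type, url) pairs from the access_methods of a DrsObject.

    Entries without an access_url.url field are skipped.
    """
    pairs = []
    for method in drs_object.get("access_methods", []):
        url = (method.get("access_url") or {}).get("url")
        if url:
            pairs.append((method.get("type"), url))
    return pairs
```

Calling access_urls(fetch_drs_object("dg.4DFC/641121f1-5ca0-42cc-9156-fb5538c14355")) would then list the current storage URLs for that instance. Because those URLs can change over time, it is safer to resolve the DRS ID each time than to cache the result.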