Getting started with GCP
Whether you are new to the cloud or consider yourself an expert, we encourage you to apply for the free Google Cloud credits that we provide to our users to support cancer imaging research projects that work with Imaging Data Commons. All reasonable requests will receive a $300 allocation of credits that do not expire, and we will not require you to provide credit card information to verify your identity. All you have to do is fill out and submit this application form.
You are also encouraged to review the slides in the following presentation, which provides an introduction to GCP and shares some best practices for using it.
Google Cloud Platform provides a range of solutions to better understand and analyze data hosted by IDC. Depending on what you want to do (see the range of options here), you may need to complete one or more of the following steps.
Do you have a Google identity? If so, you can proceed to the next step.
- 2. Click the "Select a project" button in the upper left corner of the screen, and then click "New project".
- 3. Open the GCP Dashboard ( ≡ > Cloud overview > Dashboard) and take note of the "Project ID" value. You will need it to perform some of the operations.
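Since the Project ID is easy to confuse with the project name or project number, a small check can catch a wrong value early. The sketch below is only an illustration: it assumes you keep the Project ID in the `GOOGLE_CLOUD_PROJECT` environment variable (a common convention, not something this page requires), and the helper names are hypothetical.

```python
import os
import re

# GCP project IDs are 6 to 30 characters long, use only lowercase letters,
# digits, and hyphens, start with a letter, and do not end with a hyphen.
_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def is_valid_project_id(value: str) -> bool:
    """Return True if `value` has the shape of a GCP Project ID."""
    return _PROJECT_ID_RE.fullmatch(value) is not None


def project_id_from_env(var: str = "GOOGLE_CLOUD_PROJECT") -> str:
    """Read the Project ID from an environment variable (assumed name)."""
    value = os.environ.get(var, "")
    if not is_valid_project_id(value):
        raise ValueError(
            f"{var}={value!r} does not look like a Project ID; "
            "copy the 'Project ID' value from the GCP Dashboard."
        )
    return value
```

A display name such as "My Project" fails this check, which is usually a sign that the project name was copied instead of the Project ID.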
Additional reading materials:
IDC uses BigQuery to manage metadata for the hosted data. To locate the tables that contain this metadata, complete the following steps:
- 2. Click the "+ ADD" button, and select "Star a project by name" from the Additional Resource table.
- 3. Enter bigquery-public-data in the text box and click the "PIN" button.
- 4. In the left panel, expand the bigquery-public-data drop-down, and navigate to the IDC datasets, such as idc_current, which contain the metadata tables maintained by IDC. Numbered datasets correspond to the IDC data versions documented in Data Release Notes; idc_current is an alias that always points to the latest IDC version.
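When you later refer to these tables in a query, you will need their fully qualified names in the form project.dataset.table. The sketch below shows one way to build such names. It rests on two assumptions that are not stated on this page: that numbered datasets are named `idc_v<N>`, and that a table called `dicom_all` exists in the dataset. Check both in the BigQuery left panel before relying on them. The helper name `idc_table` is hypothetical.

```python
from typing import Optional

# Public project that hosts the IDC metadata datasets.
PUBLIC_PROJECT = "bigquery-public-data"


def idc_table(table: str, version: Optional[int] = None) -> str:
    """Build a fully qualified BigQuery table name for an IDC table.

    With no version, the name points at the idc_current alias (latest data).
    With a version, it points at a numbered dataset, assumed here to follow
    the idc_v<N> naming pattern.
    """
    dataset = "idc_current" if version is None else f"idc_v{version}"
    return f"{PUBLIC_PROJECT}.{dataset}.{table}"


# Latest version, via the alias (table name is an assumption):
latest = idc_table("dicom_all")
# A pinned version, useful when results must be reproducible:
pinned = idc_table("dicom_all", version=1)
```

Pinning a numbered dataset keeps a query reproducible, while the alias always follows the newest release.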
Navigate to the GCP BigQuery API page. If the BigQuery API has not been enabled, you will see a blue "ENABLE" button that you will need to click to enable that API. This is required to query IDC BigQuery tables using the Python API.
Note that you will need to install the Cloud SDK tools only if you want to interact with IDC data from your own computer. If you use Google Colab or Google Compute Engine VMs, the Cloud SDK tools will be pre-installed and ready to use.
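Once the API is enabled, a query from Python might look like the sketch below. It assumes the `google-cloud-bigquery` package is installed and that you are already authenticated (for example in Colab or on a Compute Engine VM). The project ID `my-project-id` and the table name are placeholders to replace with your own values, and `count_rows` is a hypothetical helper, not part of any IDC tooling.

```python
def count_rows(client, table: str) -> int:
    """Count the rows of a fully qualified BigQuery table.

    `client` is expected to behave like google.cloud.bigquery.Client:
    client.query(sql) starts a job and job.result() yields the rows.
    """
    sql = f"SELECT COUNT(*) AS n FROM `{table}`"
    rows = client.query(sql).result()
    return next(iter(rows))["n"]


def run_example(project_id: str = "my-project-id") -> None:
    """Run the query against BigQuery, billed to your own project."""
    # Imported here so the helper above can be used without the package.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    # Table name is an assumption; confirm it in the BigQuery console.
    print(count_rows(client, "bigquery-public-data.idc_current.dicom_all"))
```

The client is created with your own Project ID because queries against public datasets are still accounted to the project that runs them.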
You will need to set up project billing if you want to launch your own VMs, or use resources beyond the free usage tier.
Once you set up billing, we can't stress enough how important it is to be diligent in tracking your usage of GCP resources!
- Be sure to shut down anything you aren't using. Otherwise, resources that sit idle will still be charged against your free trial credits, your IDC-provided credits, or your credit card.
- Be careful with your login information. If someone takes over your account they could run up a huge bill that you will be responsible for paying.
- Unless billing is of no concern to you, remember to SHUT DOWN THE MACHINE when you aren't using it! You are billed continuously while the VM instance is running.
- Even after you stop the VMs, you keep paying for the disk storage attached to those machines! You can delete the VM instances to stop incurring those costs.
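To get a feel for how quickly credits can be used up, a rough back-of-the-envelope estimate helps. The sketch below is plain arithmetic, not a pricing tool: the rates in the example are made-up placeholders, so look up the actual prices for your machine type and disk in the GCP pricing pages. The function names are hypothetical.

```python
def hours_covered(credits_usd: float, hourly_rate_usd: float) -> float:
    """Hours a running VM can be paid for with the given credits."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly_rate_usd must be positive")
    return credits_usd / hourly_rate_usd


def stopped_vm_monthly_cost(disk_gb: float, rate_per_gb_month: float) -> float:
    """Monthly cost of the disk that remains attached to a stopped VM."""
    return disk_gb * rate_per_gb_month


# Hypothetical rate of $0.50 per hour, NOT an actual GCP price:
hours = hours_covered(300, 0.50)
print(f"{hours:.0f} hours, or about {hours / 24:.0f} days of continuous use")
# prints "600 hours, or about 25 days of continuous use"
```

At that placeholder rate, a VM left running around the clock would use up the entire $300 allocation in less than a month, which is why shutting down idle machines matters.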