
ODC Google Sandbox


Contributed by the CEOS SEO

How to use the ODC-Google Sandbox

The Open Data Cube (ODC) Google Sandbox is a free and open programming interface that connects users to Google Earth Engine datasets. This open source tool lets users run Python application algorithms in Google's Colab notebook environment. It demonstrates rapid creation of science products anywhere in the world without the need to download and process satellite data. Example applications include scene-based cloud statistics, custom cloud-filtered mosaics, spectral index products including vegetation fractional cover, historic water extent, and vegetation land change. Basic operation of the tool supports many users for small-scale demonstrations and training, and it can also be scaled in size and scope, using Google Cloud resources, to support more demanding user needs.

Earth Engine authorization is required to run the Google Colab notebooks. This requires that you have a Google account and Earth Engine user authorization. Users with a Gmail address already have a Google account. The links below explain how to get a Google account and Earth Engine user authorization. Earth Engine authorization may take several days.

 

Creating a new Google Account:  https://support.google.com/accounts/answer/27441

Getting Earth Engine Authorization: https://signup.earthengine.google.com/

 

Steps to Run a Notebook in the ODC-Google Sandbox

 

The quickest way to start running the ODC-Google Colab Sandbox is to click the following link or the button below, either of which opens Colab with a notebook called "Getting Started: Open Data Cube on Google Colab".

In this notebook, you will find an overview of ODC and Colab, simple interactions with the Data Cube, and links to explore several example applications.

For a more detailed Colab setup walkthrough, check out our setup instructions.

 

If you’d like to access the applications directly, you can start by opening the GitHub folder with a list of sample Python notebooks: https://github.com/ceos-seo/odc-colab/tree/master/notebooks

  • Right-click on any notebook and "Open Link in a New Tab". 

  • Click on "Open in Colab" at the top of the notebook. This will prepare the notebook to run. An example of the Colab view and menu is shown below.

 

  • NOTE: Each notebook must be run in a separate tab and will have its own dedicated Google processing instance. The Earth Engine authorization step is required for each separate notebook.

Picture2.png
  • From the menu, select "Runtime > Run all" to run the notebook for the first time.

  • Wait for the ODC and data index to load (first 2 cells of code).

  • Typically at cell 3, there is a Google Earth Engine authorization step where you will need to follow a code.earthengine.google link to authorize access. The steps are as follows:

  • Select your Google Account.

  • Click "Choose Project".

  • Create a new Cloud Project. Your project ID must be globally unique. To help guarantee the uniqueness of your project ID, it is recommended you follow the form: odc-colab-<your>-<name>

    • This only has to be performed once.

    • If you receive an error "A project with the same ID already exists" - please try another unique name. 

    • You may also have to accept "Google's Cloud Terms of Service" before continuing. 

  • Leave "Use read-only scopes" unchecked and click "Generate Token".

  • Select your Google account.

  • Click "Continue".

  • Select all checkboxes and click "Continue".

  • Copy the Authorization Code.

  • Return to the Notebook and paste the Authorization Code from the previous step into the text box and press enter. 

  • Please note: if during this process you see an error that says "Not signed up for Earth Engine or project is not registered", you will need to request access. Please see our full setup instructions for more details!

  • The rest of the code should run until completion.

  • As a notebook is running, cells are numbered as they are completed. A cell that is currently being executed will have a "circle" moving around the cell number label. Users can also view the execution time in the banner at the very bottom of the notebook and view the resources being consumed (RAM and Storage) using the banner in the top-right of the notebook screen.

  • Please see our detailed Colab Setup Instructions.
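The project ID convention recommended in the steps above (odc-colab-<your>-<name>) can be checked locally before typing it into the "Create a new Cloud Project" form. The sketch below is not part of the ODC-Google Sandbox; the helper names `suggest_project_id` and `is_valid_project_id` are hypothetical. It assumes Google Cloud's published project ID format: 6 to 30 characters, lowercase letters, digits, and hyphens only, starting with a letter and not ending with a hyphen.

```python
import re

# Assumed Google Cloud project ID format: 6-30 characters, lowercase letters,
# digits, and hyphens; must start with a letter and must not end with a hyphen.
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def suggest_project_id(first_name, last_name):
    """Build a candidate ID of the form odc-colab-<your>-<name> (hypothetical helper)."""
    parts = []
    for part in (first_name, last_name):
        # Lowercase the name and replace anything that is not a letter or digit.
        cleaned = re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        if cleaned:
            parts.append(cleaned)
    candidate = "-".join(["odc-colab"] + parts)
    # Trim to the 30-character limit and drop any trailing hyphen left behind.
    return candidate[:30].rstrip("-")


def is_valid_project_id(project_id):
    """Check the format only; this cannot tell whether the ID is already taken."""
    return PROJECT_ID_PATTERN.fullmatch(project_id) is not None


candidate = suggest_project_id("Ada", "Lovelace")
print(candidate, is_valid_project_id(candidate))
```

A format check like this only catches typing mistakes. Global uniqueness is decided by Google when the project is created, so the "A project with the same ID already exists" error can still appear and a different name may be needed.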

Running all or parts of a notebook after the first run

Once a notebook has run to completion, users can make changes to the notebook and run it again. This does not require a new Earth Engine authorization unless the notebook is restarted from the beginning using one of the "Runtime > Restart" commands. To run the notebook with new edits (e.g. new region, new time window, new plot configuration), users can select "Runtime > Run after" from within the block where the edits were made. This first runs the current block and then continues running blocks until the end of the notebook is reached. A user may also select "Runtime > Run all" to run the entire notebook again. This is not the same as a restart, so it keeps the Colab setup and the current authorization. Finally, a common way to run notebooks is one block at a time, by selecting the block and pressing "Shift+Return". This runs only a single block but can be quickly repeated to run successive blocks.

Notebooks can be saved and shared using Google Drive. Users can save a copy with the "Copy to Drive" link at the top of the notebook or with the menu option "File > Save a copy in Drive". The saved file is stored in a "Colab Notebooks" folder on the user's Google Drive with an assigned name and time stamp. The filename can be changed later from the user's Google Drive account. To share the notebook, users should click the "Share" button in the Drive menu or right-click the filename to find the sharing link, then enter the name or email address of another Google user to give them sharing rights. These notebooks are generally small files, and users receive up to 15 GB of free Google Drive storage.

Several export options are available: downloading the Jupyter notebook as a *.ipynb file to your local drive, saving the notebook to your personal GitHub account, printing the notebook on a local printer, or saving the notebook as a PDF (select "Save as PDF" as the print destination).

Output files are saved in the temporary Colab instance. Users can find these output files by clicking the folder icon in the far-left menu and browsing the folder structure, then download them to their local computer. Once the Colab instance is closed, the output files are lost.
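Because output files disappear when the Colab instance closes, it can be convenient to bundle them into one archive and download it before ending a session. The following is a minimal sketch, not part of the sample notebooks: the helper names, the `/content` folder, and the file extensions are assumptions for illustration and should be adjusted to whatever a given notebook actually writes. The `google.colab.files.download` call is only available inside a Colab runtime, so the import is guarded.

```python
import zipfile
from pathlib import Path


def bundle_outputs(output_dir, archive_path, patterns=("*.tif", "*.png", "*.csv")):
    """Zip files in output_dir that match patterns; return the file names added.

    The default patterns are assumed examples, not a list of what the
    sample notebooks produce.
    """
    output_dir = Path(output_dir)
    archive_path = Path(archive_path)
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for pattern in patterns:
            for path in sorted(output_dir.glob(pattern)):
                # Skip directories and the archive itself.
                if path.is_file() and path.resolve() != archive_path.resolve():
                    archive.write(path, arcname=path.name)
                    added.append(path.name)
    return added


def download_if_in_colab(path):
    """Start a browser download when running in Colab; return False elsewhere."""
    try:
        from google.colab import files  # only importable inside a Colab runtime
    except ImportError:
        return False
    files.download(str(path))
    return True


# Example use inside a notebook cell (paths are assumptions):
# bundle_outputs("/content", "/content/odc_outputs.zip")
# download_if_in_colab("/content/odc_outputs.zip")
```

Saving a copy to Google Drive, as described earlier, keeps the notebook itself but not these generated files, so a download step like this is one way to keep results.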

The background code for the ODC-Google Sandbox is located in these GitHub locations:

Colab Project: https://github.com/ceos-seo/odc-colab
GEE Project: https://github.com/ceos-seo/odc-gee

Summary of the Sample Notebooks and How to Make Modifications

Each of the baseline notebooks uses global Landsat-8 data. Users can change the region and time window to run sample cases anywhere in the world. It is suggested that users keep their regions and time windows similar in scale to those used in the baseline notebooks, as this allows the code to run to completion in a reasonable amount of time (e.g. less than 5 minutes). Larger regions and longer time windows are possible, but they may exceed the limits of the Google Colab environment (12 GB RAM, 100 GB storage) or take a long time to run to completion. In addition to modifying regions and times, users may also want to modify plot settings or add their own code. Comments (lines of code starting with "#") are used throughout the notebooks to describe the details of code blocks. Look for "# MODIFY HERE" statements to identify blocks of code that users can easily modify to yield new results.
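Before enlarging a region or time window, a rough size estimate can indicate whether the request is likely to fit in the Colab memory limit quoted above. The sketch below is a back-of-the-envelope calculation, not code from the sample notebooks. It assumes 30 m Landsat 8 pixels, one scene every 16 days, six bands stored as 2-byte integers, and about 111,320 m per degree of latitude. The coordinates, dates, and the function name `estimate_load_gb` are hypothetical.

```python
import math
from datetime import date

# MODIFY HERE: hypothetical region and time window for a small demonstration.
latitude = (-1.30, -1.20)    # (min, max) in degrees
longitude = (36.75, 36.85)   # (min, max) in degrees
time_window = (date(2020, 1, 1), date(2020, 12, 31))


def estimate_load_gb(latitude, longitude, time_window,
                     n_bands=6, bytes_per_value=2,
                     resolution_m=30.0, revisit_days=16):
    """Rough size, in GB, of an uncompressed data stack (all inputs are assumptions)."""
    meters_per_degree = 111_320.0  # approximate length of one degree of latitude
    mid_lat = math.radians((latitude[0] + latitude[1]) / 2)
    height_px = (latitude[1] - latitude[0]) * meters_per_degree / resolution_m
    # A degree of longitude shrinks with the cosine of the latitude.
    width_px = ((longitude[1] - longitude[0]) * meters_per_degree
                * math.cos(mid_lat) / resolution_m)
    n_days = (time_window[1] - time_window[0]).days + 1
    n_scenes = math.ceil(n_days / revisit_days)
    total_bytes = height_px * width_px * n_scenes * n_bands * bytes_per_value
    return total_bytes / 1e9


COLAB_RAM_GB = 12  # limit quoted in the text above
estimate = estimate_load_gb(latitude, longitude, time_window)
print(f"Estimated stack size: {estimate:.3f} GB of {COLAB_RAM_GB} GB RAM")
```

The estimate covers only the raw loaded values. Intermediate arrays created during cloud masking, mosaicking, or index calculations can need several times more memory, so it is sensible to leave a wide margin below the limit.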

If you have any questions, please reach us through our Contact Form!