Accessing Cloud Optimized GeoTIFFs in ArcGIS Pro

NOTE: if you are using ArcGIS Pro 2.9.1, this method will NOT work. For a work-around, see the post Connecting to Google Cloud Storage in ArcGIS Pro.

This is a continuation of my last post on hosting and accessing Cloud Optimized GeoTIFFs or COGs on AWS S3. In my last post, I showed how to access a COG on AWS S3 in QGIS; now I want to access them in ArcGIS Pro. ESRI supports COGs - see the documentation on supported file formats and creating a cloud storage connection.

Below are the steps for connecting to COGs on AWS S3 in ArcGIS Pro - I am using Pro version 2.7.

Private Data Option

Create ArcGIS Pro Connection File

  1. In ArcGIS Pro, add a New Map (optional)

  2. Go to the Insert menu >> Connections >> New Cloud Storage Connection

In the Create Cloud Storage Connection window, fill in the information for your AWS S3 bucket. Below is an example. Just make sure that for Service Provider you select AMAZON.

Add COG to Map

If the connection is successful, it will appear under Cloud Stores in the Catalog Pane. Here is an example of my COG on my private AWS S3 bucket.

Here is what my COG looks like in ArcGIS Pro.

Public Data Option

OpenAerialMap COG Example

Here I am using a publicly accessible COG from OpenAerialMap. You can use the link here if you want to use the same COG I am using for this example.

  1. Click on the Image or Tile you want to use.

  2. Then right-click on the Download button >> Copy Link Address

  3. Paste the link address into Notepad or something else - you will need the bucket name and subfolders to add the connection in Pro

Here I have pasted the link address I copied into Notepad.

Fig06.png

Create ArcGIS Pro Connection File

  1. Go to the Insert menu >> Connections >> New Cloud Storage Connection, or in the Catalog Pane, right-click Cloud Stores >> New Cloud Storage Connection

  2. In Create Cloud Storage Connection window:

    • Connection File Name: Give your connection file a name (e.g. TestPublicCOG)

    • Service Provider: Select AMAZON

    • Bucket (Container) Name: Copy and paste the bucket name and subfolders (e.g. oin-hotosm/5cf6cf7cc25f7e00059bac86/0) from the link you copied in the previous step

    • Leave the Access Key ID and the other options blank (this is fine since the data is public)

Fig07.png
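As an optional sanity check before (or after) creating the connection, you can confirm that the public bucket path is reachable from the command line. This is just a sketch and assumes you have the AWS CLI installed; the --no-sign-request flag skips credentials, which works for public buckets:

    REM List the public OpenAerialMap prefix used in this example
    aws s3 ls s3://oin-hotosm/5cf6cf7cc25f7e00059bac86/0/ --no-sign-request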

Add COG to Map

If your connection is successful, then you should see it under Cloud Stores in the Catalog Pane. Here is the example COG from OpenAerialMap added to the map.

Sentinel 2 COG Example

Sentinel-2 Cloud-Optimized GeoTIFFs are available on the AWS Open Data Registry. Searching for and downloading Sentinel imagery is a bit more complex, so I’m not going to go over it in this post, but you can find more information on using Earth Search to find the data on AWS.

For my example connection file in ArcGIS Pro, I am using the Search for recent imagery of Alexandria, VA example given on the Earth Search site.

This is the href link I copied from the Alexandria, VA example: https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/18/S/UH/2021/4/S2B_18SUH_20210423_0_L2A

In the New Cloud Storage Connection window I fill in the info and connect to the Sentinel imagery.

This is the Cloud Storage Connection Properties for the connection file I created.

My connection was successful and I can see all the available imagery in the Catalog Pane. The image shown below is TCI.tif (the True Color Image). Imagery for the individual bands (e.g. B01.tif, B02.tif) is also available to use.
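If you want to peek at one of these Sentinel COGs outside of ArcGIS Pro, GDAL can read the metadata straight over HTTP using its /vsicurl/ handler. This is just a sketch and assumes TCI.tif sits directly under the prefix from the href above:

    REM Read the Sentinel-2 true color COG metadata over HTTP without downloading the whole file
    gdalinfo /vsicurl/https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/18/S/UH/2021/4/S2B_18SUH_20210423_0_L2A/TCI.tif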

Before I end this post: for those who are interested, ESRI has a nice Sentinel Imagery Explorer that you can use to explore Sentinel 2 imagery - snapshot shown below.

That’s it for this post. Thanks for reading.

Hosting and Accessing Cloud Optimized GeoTIFFs on AWS S3

This post is about hosting and accessing Cloud Optimized GeoTIFFs or COGs on AWS S3. I first learned about COGs while trying to find a solution for storing/hosting large raster images (e.g. aerial imagery at 2-4cm resolution) that my team and clients can easily access and use over the web.

But first, what are COGs? It isn’t a new format; COGs are simply GeoTIFFs with an internal organization that supports efficient access via HTTP. This internal organization, combined with an HTTP feature called GET range requests (also known as byte serving), allows only the portion of the file that is needed to be retrieved. Think about the way a video or music file is streamed online - you can skip forward or backward and start at a specific point without downloading the full file. The COG format works the same way but for raster files - you can access the parts of the GeoTIFF as you need them instead of having to download the whole file. COGs provide an open data format that enables an efficient, cloud-ready workflow. Many data agencies and companies have started using it and it is maturing at a rapid pace. For example, the USGS has switched over to providing DEMs and Landsat data as COGs on their websites. You can find out more about COGs at cogeo.org.
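As a quick illustration of byte serving, you can ask a web server for only the first chunk of a file with curl. This is just a sketch - the URL is a placeholder, and the -r/--range flag is what makes curl send an HTTP Range header:

    REM Request only bytes 0-16383 of the file (for a COG, this is where the headers/IFDs live)
    curl -s -r 0-16383 -o first_chunk.bin https://example-bucket.s3.amazonaws.com/example_cog.tif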

This is the workflow that I’ve come up with for creating COGs using GDAL and hosting and accessing them on AWS S3.

Step 1: Check for GDAL installation and version

The first thing to do is to make sure you have GDAL installed and check which version you have. If you’ve installed QGIS using the OSGeo4W installer, then you most likely already have GDAL installed. Open the OSGeo4W Shell and see which version you have. To do this, just type: gdalinfo --version

The GDAL version is important to know, as version 3.1 has a built-in COG driver that supports COG creation.
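As a small sketch, you can run both checks from the OSGeo4W Shell (the findstr filter is just a Windows convenience; the COG driver line only appears with GDAL 3.1 or later):

    REM Print the installed GDAL version
    gdalinfo --version

    REM Check that the COG driver is available (requires GDAL 3.1 or later)
    gdal_translate --formats | findstr COG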

Step 2: Take a look at metadata

It’s always a good idea to take a look at the metadata of your GeoTIFF. Go into the directory where your GeoTIFF is stored and use the gdalinfo command on your file (e.g. gdalinfo testimage.tif).

Looking at the metadata, I can tell that my GeoTIFF file doesn’t have overviews and is not tiled. Internal tiling allows rendering applications to quickly select, decompress and display only the portion or tile(s) of the image that they need. Overviews also allow for fast access to zoomed-out views of the image when needed. GDAL version 3.1 has a built-in COG driver that supports COG creation, so tiling and overview creation are applied by default.
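Here is a sketch of what to look for in the gdalinfo output (the file name and sizes are just placeholders):

    REM Inspect the GeoTIFF metadata
    gdalinfo testimage.tif

    REM In the output, look for lines like these:
    REM   Band 1 Block=512x512 ...              <- internal tiling (a non-tiled file shows strips, e.g. Block=10000x1)
    REM   Overviews: 5000x5000, 2500x2500, ...  <- overview levels are present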

Step 3: Create COG

In this step, I’ll use the gdal_translate command with the COG output format. This will make a copy of my original GeoTIFF file that is COG compliant. Below is an example for a single GeoTIFF file. If I had more files that I wanted to make COG compliant, then I could create a batch file and run it through all the files in my directory.

Fig03.png

See the GDAL COG creation page for all the different options available. In my case, I am using JPEG compression, which works fine for aerial imagery. For older versions of GDAL, see the GDAL wiki page for more information on creating COGs.
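For reference, here is a minimal sketch of the single-file conversion, plus a simple loop you could put in a .bat file to convert every GeoTIFF in a folder. The file names are placeholders, and JPEG compression assumes standard RGB imagery:

    REM Convert a single GeoTIFF to a COG with JPEG compression (GDAL 3.1+)
    gdal_translate testimage.tif testimage_cog.tif -of COG -co COMPRESS=JPEG

    REM In a .bat file: convert every .tif in the current folder, writing the COGs to a COG subfolder
    if not exist COG md COG
    for %%f in (*.tif) do gdal_translate "%%f" "COG\%%~nf_cog.tif" -of COG -co COMPRESS=JPEG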

Step 4: Validate COG

After creating the COG, I need to check to make sure that it is valid. Several sources (e.g. cogeo.org and the GDAL wiki) say to use validate_cloud_optimized_geotiff.py to check the COG, but I found that this Python script doesn’t work on COGs created with GDAL 3.1. It didn’t matter what options I tried - the script just said invalid COG. Other people seem to have the same issue, so it would seem this validation script does not work with GDAL 3.1 - see issue 151.

Instead of using the validation script, I just check the metadata of my COG to make sure that it has overviews and tiles. In the metadata, look for the Image Structure Metadata section - this will tell you if your COG is compliant. At least I am assuming it is valid since it indicates that the layout structure of my image is a COG.

Fig04.png
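A quick way to do this check from the OSGeo4W Shell is to filter the gdalinfo output for the layout and overview lines (findstr is the Windows equivalent of grep; the file name is a placeholder):

    REM Files written by the GDAL 3.1+ COG driver report LAYOUT=COG under Image Structure Metadata
    gdalinfo testimage_cog.tif | findstr /i "LAYOUT OVERVIEW"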

For comparison, my original aerial imagery is about 3.5 GB but my COG is only about 260 MB - that’s a huge difference in file size. The COG is only ~8% of the original aerial.

Step 5: Upload COG to AWS S3 Bucket

After I validate that my COG is good, I can upload it to a web server. I’m using an AWS S3 bucket to store my image. There are already plenty of tutorials and information available on setting up and using AWS; see signing up for an AWS account and the free usage tier. The sign-up process requires a credit card, but I think the free tier is good for testing the setup, which is what I’m using. I just followed the tutorial on Amazon’s site for creating an S3 bucket.

Here are the steps I went through (an AWS CLI sketch of the equivalent commands follows the list):

  1. Sign up for an AWS account and see getting started with AWS

  2. Create an IAM user as recommended by AWS and save the access keys (access key ID and secret access key). These access keys are important for private data access.

  3. Create my bucket (e.g. cogaerials)

  4. Create a folder (e.g. uhm-project) to help with the organization of my files in my bucket

  5. Upload my COG aerial into the folder I just created.

  6. Also, I set the object (i.e. my COG aerial) in my bucket to public read to see if I could easily access it via HTTPS in QGIS.

  7. Copy the URL of my object (the COG aerial) so I can access it in QGIS in the next step.
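If you prefer the command line over the AWS console, the bucket/folder/upload/public-read steps above look roughly like this with the AWS CLI. This is only a sketch - the bucket, folder and file names are the examples from this post, and making an object public assumes your bucket settings allow public ACLs:

    REM Create the bucket (bucket names must be globally unique)
    aws s3 mb s3://cogaerials

    REM Upload the COG into a folder (prefix) and make it publicly readable
    aws s3 cp UHM_orthoCOG.tif s3://cogaerials/uhm-project/UHM_orthoCOG.tif --acl public-read

    REM Confirm the upload
    aws s3 ls s3://cogaerials/uhm-project/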

Step 6: Visualize COG in QGIS

Using QGIS 3.18, I’m going to load my COG and view it. Here’s a good tutorial on using COG in QGIS. The method I’m using below works for QGIS version 3.2 or higher.

Public Data Option:

  1. In QGIS, open the Data Source Manager window

  2. Go to Raster tab

  3. Source Type: Select Protocol: HTTP(S), Cloud, etc

  4. Type: HTTP/HTTPS/FTP (NOTE: this may be the easiest way to access the data if it’s public)

  5. URI: Paste in the object URL (i.e. COG image URL) from your AWS S3 Object

  6. Authentication: leave as default option

  7. Options: I left this section at the default options

  8. Click Add

If I look at the source properties, I can tell that the raster layer is coming from my AWS S3 bucket.
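You can run the same kind of check outside of QGIS with GDAL’s /vsicurl/ handler. This is just a sketch - the object URL below mirrors the bucket/folder/file names used in this post and will differ for your own bucket and region:

    REM Read the COG metadata directly from the public S3 object URL
    gdalinfo /vsicurl/https://cogaerials.s3.amazonaws.com/uhm-project/UHM_orthoCOG.tif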

This is how my COG aerial looks in QGIS. I am really happy with it. I can zoom in/out and pan around, it displays pretty fast, and the resolution is still good - at least I can’t tell the difference between the COG aerial and the original aerial.

Private Data Option:

If your data is private but you still want to visualize it in QGIS, you will need to set the environment variables AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID, using the values provided by AWS when you created your IAM user.

  1. In QGIS, go to the Settings menu >> Options

  2. In the Options window: go to System tab

  3. Expand the Environment section: check the Use custom variables box

  4. Click Add to add the variables AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID. For Apply, select Overwrite

  5. Click OK and restart QGIS

After you restart QGIS, then use the Data Source Manager window to add in the COG image.

  1. In the Data Source Manager: Select Raster tab

  2. Select Protocol: HTTP(S), cloud, etc.

  3. Type: Select AWS S3

  4. Bucket or container: enter your bucket name (e.g. cogaerials)

  5. Object key: enter your object name, including subfolders if any (e.g. uhm-project/UHM_orthoCOG.tif)

  6. Options: I left the default options

  7. Click Add

If all is successful, then you should be able to see your COG added in QGIS like the example below.
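QGIS uses GDAL under the hood for raster access, so you can test the same private-bucket access from the OSGeo4W Shell by setting the two environment variables and using GDAL’s /vsis3/ handler. This is just a sketch - replace the placeholder values with your IAM user’s keys and your own bucket/object path:

    REM Set the AWS credentials for this shell session (placeholders shown)
    set AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID
    set AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY

    REM Read the private COG via GDAL's S3 handler
    gdalinfo /vsis3/cogaerials/uhm-project/UHM_orthoCOG.tif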

There are other options for accessing AWS S3, such as the Requester Pays option, but that’s not covered here. I haven’t tried this option yet, but my guess is that you will need to enable Requester Pays on your bucket and then also set the corresponding environment variable in QGIS.

Anyway, I hope you find this useful. I’ve learned a lot working through this workflow myself and I am happy to share it with others.

Thanks for reading and until next time.