Heat and Trees#
Urban Heat Islands and Street Trees#
In this notebook we’ll be exploring the urban heat island effect by looking at the impact on surface temperature of roof color and street trees. We’ll be replicating the process described here: http://urbanspatialanalysis.com/urban-heat-islands-street-trees-in-philadelphia/ but using Python tools rather than ESRI.
Extra packages: To run this notebook, you’ll need the PyViz tools and rio-toa, a library of top-of-atmosphere calculations: pip install rio-toa
Data sources: This notebook uses Landsat data from Google Cloud Storage as well as some geographic data from OpenDataPhilly.
import warnings
import intake
import xarray as xr
import pandas as pd
import numpy as np
import geopandas as gpd
import cartopy.crs as ccrs
import hvplot.xarray # noqa
import hvplot.pandas # noqa
from geoviews.tile_sources import EsriImagery
from pyproj import CRS
warnings.simplefilter('ignore')
For reference, here is some extra information about the Landsat 8 bands:
band_info = pd.DataFrame([
(1, "Aerosol", " 0.43 - 0.45", 0.440, "30", "Coastal aerosol"),
(2, "Blue", " 0.45 - 0.51", 0.480, "30", "Blue"),
(3, "Green", " 0.53 - 0.59", 0.560, "30", "Green"),
(4, "Red", " 0.64 - 0.67", 0.655, "30", "Red"),
(5, "NIR", " 0.85 - 0.88", 0.865, "30", "Near Infrared (NIR)"),
(6, "SWIR1", " 1.57 - 1.65", 1.610, "30", "Shortwave Infrared (SWIR) 1"),
(7, "SWIR2", " 2.11 - 2.29", 2.200, "30", "Shortwave Infrared (SWIR) 2"),
(8, "Panc", " 0.50 - 0.68", 0.590, "15", "Panchromatic"),
(9, "Cirrus", " 1.36 - 1.38", 1.370, "30", "Cirrus"),
(10, "TIRS1", "10.60 - 11.19", 10.895, "100 * (30)", "Thermal Infrared (TIRS) 1"),
(11, "TIRS2", "11.50 - 12.51", 12.005, "100 * (30)", "Thermal Infrared (TIRS) 2")],
columns=['Band', 'Name', 'Wavelength Range (µm)', 'Nominal Wavelength (µm)', 'Resolution (m)', 'Description']).set_index(["Band"])
band_info
Band | Name | Wavelength Range (µm) | Nominal Wavelength (µm) | Resolution (m) | Description |
---|---|---|---|---|---|
1 | Aerosol | 0.43 - 0.45 | 0.440 | 30 | Coastal aerosol |
2 | Blue | 0.45 - 0.51 | 0.480 | 30 | Blue |
3 | Green | 0.53 - 0.59 | 0.560 | 30 | Green |
4 | Red | 0.64 - 0.67 | 0.655 | 30 | Red |
5 | NIR | 0.85 - 0.88 | 0.865 | 30 | Near Infrared (NIR) |
6 | SWIR1 | 1.57 - 1.65 | 1.610 | 30 | Shortwave Infrared (SWIR) 1 |
7 | SWIR2 | 2.11 - 2.29 | 2.200 | 30 | Shortwave Infrared (SWIR) 2 |
8 | Panc | 0.50 - 0.68 | 0.590 | 15 | Panchromatic |
9 | Cirrus | 1.36 - 1.38 | 1.370 | 30 | Cirrus |
10 | TIRS1 | 10.60 - 11.19 | 10.895 | 100 * (30) | Thermal Infrared (TIRS) 1 |
11 | TIRS2 | 11.50 - 12.51 | 12.005 | 100 * (30) | Thermal Infrared (TIRS) 2 |
Loading data#
For this example, we will be using Landsat data stored on Google Cloud Storage. Since these data are accessed via HTTPS, there is no guaranteed directory structure, so we need to specify the URL pointing to each file and then iterate over the files to create a concatenated dataset. We use Jinja template notation in intake to pass parameters to the urlpath.
cat = intake.open_catalog('catalog.yml')
list(cat)
['google_landsat_band']
Let’s take a look at what the google_landsat_band entry looks like:
google_landsat_band:
  description: Landsat bands from Google Cloud Storage
  driver: rasterio
  parameters:
    path:
      description: landsat path
      type: int
    row:
      description: landsat row
      type: int
    product_id:
      description: landsat file id
      type: str
    band:
      description: band
      type: int
  args:
    urlpath: https://storage.googleapis.com/gcp-public-data-landsat/LC08/01/{{ '%03d' % path }}/{{ '%03d' % row }}/{{ product_id }}/{{ product_id }}_B{{ band }}.TIF
    chunks:
      band: 1
      x: 256
      y: 256
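To see what the Jinja template in urlpath expands to, the same formatting can be reproduced in plain Python. This is only a sketch: build_band_url is a hypothetical helper, not part of intake, and it simply mirrors the '%03d' zero-padding used in the catalog entry.

```python
BASE_URL = "https://storage.googleapis.com/gcp-public-data-landsat/LC08/01"


def build_band_url(path, row, product_id, band):
    """Hypothetical helper that mirrors the catalog's Jinja template."""
    # path and row are zero-padded to three digits, like '%03d' in the template
    return f"{BASE_URL}/{path:03d}/{row:03d}/{product_id}/{product_id}_B{band}.TIF"


url = build_band_url(14, 32, "LC08_L1TP_014032_20160727_20170222_01_T1", 4)
print(url)
```

Each band gets its own URL, which is why the files have to be opened one by one and concatenated along the band dimension.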
The following might feel a bit arbitrary, but we have chosen the path and row corresponding to the area over Philadelphia using the USGS EarthExplorer. We also found the product ID for the date of interest using the same tool. With these values in hand, we can access parts of each file on Google Cloud Storage.
path = 14
row = 32
product_id = 'LC08_L1TP_014032_20160727_20170222_01_T1'
The first step to using intake is to initialize the catalog entry with user parameters to create a data source.
data_source = cat.google_landsat_band(path=path, row=row, product_id=product_id)
From this data source we can get a lazily loaded xarray object backed by dask. To inspect what dask is up to, it can be helpful to create a dask client.
ds = data_source.to_dask()
ds.name = 'value'
Next we load the metadata for these particular Landsat images from the associated MTL.txt file.
import ast

def load_google_landsat_metadata(path, row, product_id):
    """Load Landsat metadata for path, row, product_id from Google Cloud Storage
    """
    def parse_type(x):
        # Parse numbers and quoted strings safely; leave anything else as text
        x = x.strip()
        try:
            return ast.literal_eval(x)
        except (ValueError, SyntaxError):
            return x

    baseurl = 'https://storage.googleapis.com/gcp-public-data-landsat/LC08/01'
    suffix = f'{path:03d}/{row:03d}/{product_id}/{product_id}_MTL.txt'

    df = intake.open_csv(
        urlpath=f'{baseurl}/{suffix}',
        csv_kwargs={'sep': '=',
                    'header': None,
                    'names': ['index', 'value'],
                    'skiprows': 2,
                    'converters': {'index': (lambda x: x.strip()),
                                   'value': parse_type}}).read()
    metadata = df.set_index('index')['value']
    return metadata
metadata = load_google_landsat_metadata(path, row, product_id)
metadata.head()
index
ORIGIN Image courtesy of the U.S. Geological Survey
REQUEST_ID 0501702206266_00020
LANDSAT_SCENE_ID LC80140322016209LGN01
LANDSAT_PRODUCT_ID LC08_L1TP_014032_20160727_20170222_01_T1
COLLECTION_NUMBER 01
Name: value, dtype: object
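The MTL file is a plain-text list of KEY = value lines, which is why it can be read as a CSV with '=' as the separator. As a rough, self-contained sketch of the same idea without intake, the snippet below parses a small made-up excerpt. SAMPLE_MTL, EXAMPLE_FLOAT, parse_value and parse_mtl are all hypothetical names for illustration, and unlike the notebook function this sketch skips the GROUP lines instead of skipping the first two rows.

```python
import ast

# Made-up excerpt in the style of an MTL file (assumption, not the real file)
SAMPLE_MTL = """\
GROUP = L1_METADATA_FILE
  GROUP = METADATA_FILE_INFO
    ORIGIN = "Image courtesy of the U.S. Geological Survey"
    COLLECTION_NUMBER = 01
    EXAMPLE_FLOAT = 3.5
  END_GROUP = METADATA_FILE_INFO
END
"""


def parse_value(text):
    """Turn numbers and quoted strings into Python values; keep the rest as text."""
    text = text.strip()
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def parse_mtl(text):
    """Collect KEY = value pairs into a dict, ignoring grouping lines."""
    result = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in ("GROUP", "END_GROUP"):
            continue
        result[key] = parse_value(value)
    return result


parsed = parse_mtl(SAMPLE_MTL)
print(parsed)
```

Values that are not valid Python literals, such as 01 with its leading zero, fall through to the except branch and stay as strings.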
Sub-setting to area of interest#
So far we haven’t downloaded any band data. Since we know that we are interested in Philadelphia, we can just take a smaller square of data that covers the extents of the city. First we need to know the projection of the dataset:
ds.crs
'+init=epsg:32618'
We’ll convert that into something directly usable for later:
proj_crs = CRS.from_epsg(32618)
proj_crs
<Projected CRS: EPSG:32618>
Name: WGS 84 / UTM zone 18N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: World - N hemisphere - 78°W to 72°W - by country
- bounds: (-78.0, 0.0, -72.0, 84.0)
Coordinate Operation:
- name: UTM zone 18N
- method: Transverse Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
crs = ccrs.UTM(zone=18)
Now if we were just looking for one particular point we could use that point, converted to the coordinate system of the data, and then select the data nearest to it:
x_center, y_center = crs.transform_point(-75.1652, 39.9526, ccrs.PlateCarree())
nearest_to_center = ds.sel(x=x_center, y=y_center, method='nearest')
print(nearest_to_center.compute())
nearest_to_center.hvplot.line(x='band')
<xarray.DataArray 'value' (band: 4)>
array([11465, 12774, 30347, 26335], dtype=uint16)
Coordinates:
* band (band) int64 4 5 10 11
y float64 4.423e+06
x float64 4.859e+05
Attributes:
transform: (30.0, 0.0, 395385.0, 0.0, -30.0, 4582215.0)
crs: +init=epsg:32618
res: (30.0, 30.0)
is_tiled: 1
nodatavals: (nan,)
scales: (1.0,)
offsets: (0.0,)
AREA_OR_POINT: Point
In this case, though, we are interested in a subset of data that covers the city of Philadelphia, so we need some geometry to specify the bounds of the city. We can get a GeoJSON of neighborhood data from OpenDataPhilly.
url = 'https://github.com/azavea/geo-data/raw/master/Neighborhoods_Philadelphia/Neighborhoods_Philadelphia.geojson'
geoms = gpd.read_file(url)
We can compute the bounds of each of these neighborhoods and then take the minimum and maximum of those bounds to get a rectangle that encompasses all of Philly.
bounds = geoms.geometry.bounds
lower_left_corner_lat_lon = bounds.minx.min(), bounds.miny.min()
upper_right_corner_lat_lon = bounds.maxx.max(), bounds.maxy.max()
print(lower_left_corner_lat_lon, upper_right_corner_lat_lon)
(-75.280266, 39.867004) (-74.955763, 40.137992)
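The min/max step can be illustrated without GeoPandas. The sketch below uses a few made-up bounding boxes (the values in neighborhood_bounds are hypothetical, not the real neighborhoods) and a hypothetical helper named bounding_box.

```python
# Hypothetical per-neighborhood bounds as (minx, miny, maxx, maxy) in lon/lat
neighborhood_bounds = [
    (-75.20, 39.90, -75.15, 39.95),
    (-75.28, 39.87, -75.22, 39.93),
    (-75.05, 40.05, -74.96, 40.13),
]


def bounding_box(bounds):
    """Return the (lower-left, upper-right) corners enclosing all boxes."""
    minx = min(b[0] for b in bounds)
    miny = min(b[1] for b in bounds)
    maxx = max(b[2] for b in bounds)
    maxy = max(b[3] for b in bounds)
    return (minx, miny), (maxx, maxy)


lower_left, upper_right = bounding_box(neighborhood_bounds)
print(lower_left, upper_right)
```

Note that each corner is ordered (longitude, latitude), that is (x, y). GeoPandas also exposes a total_bounds attribute that returns the same four numbers in a single array.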
Using the crs defined above, we can transform these longitude/latitude corners into map coordinates.
ll_corner = crs.transform_point(*lower_left_corner_lat_lon, ccrs.PlateCarree())
ur_corner = crs.transform_point(*upper_right_corner_lat_lon, ccrs.PlateCarree())
print(ll_corner, ur_corner)
(476030.2131105056, 4413033.712613869) (503768.4451229853, 4443074.1017382825)
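Then we can use those corners to slice the data. Note that y is sliced from the upper bound down to the lower bound, because the y coordinate of this dataset is stored in descending order. If the subset comes back empty along x or y, the ordering of the coordinates is probably not what you anticipated; try swapping the order of the arguments in the slice.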
subset = ds.sel(x=slice(ll_corner[0], ur_corner[0]), y=slice(ur_corner[1], ll_corner[1]))
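One way to avoid guessing the argument order is to look at the direction of the coordinate first. The following is a minimal sketch with a hypothetical helper, ordered_slice, and short made-up coordinate lists standing in for the dataset's x and y coordinates.

```python
def ordered_slice(coord_values, low, high):
    """Build a slice from low to high that follows the coordinate's direction."""
    ascending = coord_values[0] <= coord_values[-1]
    return slice(low, high) if ascending else slice(high, low)


# Stand-ins for the dataset coordinates: x ascends, y descends
x_coords = [476040.0, 476070.0, 476100.0]
y_coords = [4443060.0, 4443030.0, 4443000.0]

x_slice = ordered_slice(x_coords, 476030.0, 503768.0)
y_slice = ordered_slice(y_coords, 4413033.0, 4443074.0)
print(x_slice, y_slice)
```

With the notebook's objects this would presumably be used as ds.sel(x=ordered_slice(ds.x.values, ll_corner[0], ur_corner[0]), y=ordered_slice(ds.y.values, ll_corner[1], ur_corner[1])).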
We can persist this slice of the dataset in memory for easy use later.
subset = subset.persist()
subset
<xarray.DataArray 'value' (band: 4, y: 1001, x: 925)>
dask.array<getitem, shape=(4, 1001, 925), dtype=uint16, chunksize=(1, 256, 256), chunktype=numpy.ndarray>
Coordinates:
  * band     (band) int64 4 5 10 11
  * y        (y) float64 4.443e+06 4.443e+06 4.443e+06 ... 4.413e+06 4.413e+06
  * x        (x) float64 4.76e+05 4.761e+05 4.761e+05 ... 5.037e+05 5.038e+05
Attributes:
    transform:      (30.0, 0.0, 395385.0, 0.0, -30.0, 4582215.0)
    crs:            +init=epsg:32618
    res:            (30.0, 30.0)
    is_tiled:       1
    nodatavals:     (nan,)
    scales:         (1.0,)
    offsets:        (0.0,)
    AREA_OR_POINT:  Point
To check that we got the right area, we can do a simple plot of the mean across bands and overlay the neighborhoods on top of it. We’ll use hvplot to quickly create a holoviews object rendered in bokeh.
band_plot = subset.mean('band').hvplot(x='x', y='y', datashade=True, project=True, crs=crs, cmap='gray')
hood_plot = geoms.hvplot(geo=True, alpha=.5, c='mapname', legend=False, frame_height=450)
band_plot * hood_plot
Calculate NDVI#
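We’ll calculate the Normalized Difference Vegetation Index (NDVI), defined as (NIR - Red) / (NIR + Red), from band 5 (NIR) and band 4 (Red). We won’t do any computations yet: our bands are actually dask arrays, which allow for lazy computation.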
subset = subset.where(subset > 0)
NDVI = (subset.sel(band=5) - subset.sel(band=4)) / (subset.sel(band=5) + subset.sel(band=4))
NDVI = NDVI.where(NDVI < np.inf)
NDVI
<xarray.DataArray 'value' (y: 1001, x: 925)>
dask.array<where, shape=(1001, 925), dtype=float64, chunksize=(256, 256), chunktype=numpy.ndarray>
Coordinates:
  * y        (y) float64 4.443e+06 4.443e+06 4.443e+06 ... 4.413e+06 4.413e+06
  * x        (x) float64 4.76e+05 4.761e+05 4.761e+05 ... 5.037e+05 5.038e+05
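To make the NDVI arithmetic concrete, here is a small eager version on plain NumPy arrays. The pixel values are made up for illustration; zeros stand in for missing data and are masked to NaN first, as the notebook does with where.

```python
import numpy as np

# Made-up digital numbers for a 2x2 patch; 0 marks missing data
red = np.array([[1000.0, 2000.0], [0.0, 1500.0]])
nir = np.array([[3000.0, 2000.0], [0.0, 500.0]])

# Mask missing pixels so they do not produce a 0/0 division
red = np.where(red > 0, red, np.nan)
nir = np.where(nir > 0, nir, np.nan)

# NDVI = (NIR - Red) / (NIR + Red), ranging from -1 to 1
ndvi = (nir - red) / (nir + red)
print(ndvi)
```

Pixels where NIR is much brighter than red, which is typical of healthy vegetation, end up close to 1, while bare or built-up surfaces sit near or below 0.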
In order to visualize NDVI, the data will need to be loaded and the NDVI computed. We can expect this to take some non-trivial amount of time (on the order of 20 sec on my machine).
NDVI.hvplot(x='x', y='y', crs=crs, datashade=True, project=True, cmap='viridis', frame_height=450)
Calculate land surface temperature#
Given the NDVI calculated above, we can determine land surface temperature. For ease, we’ll use some top-of-atmosphere calculations that have already been written up as Python functions as part of rasterio work in the rio_toa module. We’ll also need to define one more function for transforming at-satellite temperature (brightness temperature) to land surface temperature. For these calculations we’ll use both Thermal Infrared bands, 10 and 11.
from rio_toa import brightness_temp, toa_utils
def land_surface_temp(BT, w, NDVI):
    """Calculate land surface temperature of Landsat 8

    temp = BT / (1 + w * (BT / p) * ln(e))

    BT = At Satellite temperature (brightness)
    w = wavelength of emitted radiance (μm)
    e = land surface emissivity, estimated from NDVI

    where p = h * c / s (1.439e-2 m K)
    h = Planck's constant (J s)
    s = Boltzmann constant (J/K)
    c = velocity of light (m/s)
    """
    h = 6.626e-34
    s = 1.38e-23
    c = 2.998e8
    p = (h * c / s) * 1e6  # converted from m K to µm K to match w

    # proportion of vegetation: NDVI rescaled to [0, 1], then squared
    Pv = ((NDVI - NDVI.min()) / (NDVI.max() - NDVI.min()))**2
    # emissivity estimated from the proportion of vegetation
    e = 0.004 * Pv + 0.986

    return BT / (1 + w * (BT / p) * np.log(e))
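A single-pixel worked example helps show the size of the emissivity correction. The sketch below uses the commonly published form LST = BT / (1 + w * (BT / p) * ln(e)) with the same constants as above. The input values (a 300 K brightness temperature and a vegetation proportion of 0.5) are made up for illustration, and emissivity and lst_kelvin are hypothetical helper names.

```python
import math

H = 6.626e-34   # Planck's constant (J s)
S = 1.38e-23    # Boltzmann constant (J/K)
C = 2.998e8     # velocity of light (m/s)
P = H * C / S * 1e6  # h*c/s in µm K, so it matches a wavelength in µm


def emissivity(pv):
    """Emissivity from proportion of vegetation, same linear form as above."""
    return 0.004 * pv + 0.986


def lst_kelvin(bt, wavelength_um, pv):
    """Land surface temperature (K) from brightness temperature (K)."""
    return bt / (1 + wavelength_um * (bt / P) * math.log(emissivity(pv)))


# Made-up inputs: 300 K at the band 10 nominal wavelength, Pv = 0.5
lst = lst_kelvin(300.0, 10.895, 0.5)
print(round(P, 1), round(lst, 2))
```

Because the emissivity is slightly below 1, its logarithm is negative, the denominator is slightly below 1, and the surface temperature comes out a little under 1 K warmer than the brightness temperature.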
Now we’ll set up a helper function to retrieve all the parameters from the metadata and general Landsat info table, and calculate the land surface temperature for bands 10 and 11.
def land_surface_temp_for_band(band, data, units='F'):
    # params from general Landsat info
    w = band_info.loc[band, 'Nominal Wavelength (µm)']

    # params from specific Landsat data text file for these images
    ML = metadata[f'RADIANCE_MULT_BAND_{band}']
    AL = metadata[f'RADIANCE_ADD_BAND_{band}']
    K1 = metadata[f'K1_CONSTANT_BAND_{band}']
    K2 = metadata[f'K2_CONSTANT_BAND_{band}']

    # convert raw digital numbers to at-satellite brightness temperature
    at_satellite_temp = brightness_temp.brightness_temp(data.sel(band=band).values, ML, AL, K1, K2)

    # NDVI is the lazily computed array defined above
    temp = land_surface_temp(at_satellite_temp, w, NDVI).compute()
    return toa_utils.temp_rescale(temp, units)
temp_10_f = land_surface_temp_for_band(10, subset)
temp_11_f = land_surface_temp_for_band(11, subset)
temp_f = xr.concat([temp_10_f, temp_11_f],
dim=xr.DataArray([10,11], name='band', dims=['band']))
temp_f
<xarray.DataArray 'value' (band: 2, y: 1001, x: 925)>
array([[[83.10190773, 82.91038144, 82.67288906, ..., 97.35640666, 97.2117854 , 96.91421684],
        [82.62258043, 82.52232388, 82.37209341, ..., 96.97288603, 97.08645353, 97.08646979],
        [82.26775089, 82.28015277, 82.26777614, ..., 96.63993145, 97.01599608, 97.156813  ],
        ...,
        [88.05508442, 87.47605607, 86.78928099, ..., 85.07908909, 84.91395172, 84.42642427],
        [88.10400659, 87.44762668, 86.74031197, ..., 85.20723062, 84.98407478, 84.53367952],
        [87.99411402, 87.16168886, 86.31773458, ..., 85.281523  , 85.19033866, 85.04210212]],

       [[76.12944534, 75.9028247 , 75.7064429 , ..., 86.37502271, 86.23002247, 86.07993912],
        [75.86254688, 75.62539663, 75.35801325, ..., 86.16211012, 86.13313788, 86.17671575],
        [75.59035087, 75.47415302, 75.27727089, ..., 86.0315293 , 86.31695341, 86.31209095],
        ...,
        [79.19698226, 78.8634322 , 78.5047179 , ..., 76.37078288, 76.08386227, 75.64569457],
        [79.33132965, 78.85358094, 78.37023776, ..., 76.59738753, 76.33553042, 75.96802995],
        [79.34624269, 78.68921338, 78.03036428, ..., 76.74314736, 76.67230545, 76.56716019]]])
Coordinates:
  * y        (y) float64 4.443e+06 4.443e+06 4.443e+06 ... 4.413e+06 4.413e+06
  * x        (x) float64 4.76e+05 4.761e+05 4.761e+05 ... 5.037e+05 5.038e+05
  * band     (band) int64 10 11
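The brightness temperature step relies on the four metadata parameters read above. As a rough illustration of the standard Landsat conversion (radiance L = ML * DN + AL, then BT = K2 / ln(K1 / L + 1)), which is assumed here to be what the rio_toa function computes, the sketch below works through one pixel. The calibration constants are typical published Landsat 8 band 10 values used as assumptions rather than read from this scene's MTL file; the digital number is the band 10 value printed earlier for the city-center pixel.

```python
import math

# Assumed typical Landsat 8 band 10 calibration values (not read from the MTL)
ML = 3.3420e-04   # RADIANCE_MULT_BAND_10
AL = 0.1          # RADIANCE_ADD_BAND_10
K1 = 774.8853     # K1_CONSTANT_BAND_10
K2 = 1321.0789    # K2_CONSTANT_BAND_10


def dn_to_radiance(dn):
    """Scale a raw digital number to top-of-atmosphere spectral radiance."""
    return ML * dn + AL


def brightness_temp_kelvin(dn):
    """At-satellite brightness temperature (K) from a raw digital number."""
    radiance = dn_to_radiance(dn)
    return K2 / math.log(K1 / radiance + 1)


def kelvin_to_fahrenheit(kelvin):
    return (kelvin - 273.15) * 9 / 5 + 32


bt = brightness_temp_kelvin(30347)
bt_f = kelvin_to_fahrenheit(bt)
print(round(bt, 1), round(bt_f, 1))
```

The result is a brightness temperature only; the emissivity correction in land_surface_temp is applied on top of it.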
Compare the results from the two different bands, noticing that the colorbars are different.
temp_f.hvplot(x='x', y='y', groupby='band', cmap='fire_r',
crs=crs, rasterize=True, project=True, frame_height=350).layout()
We’ll take the mean of the calculated land surface temperature for each of the bands and mimic the colormap used in the project that we are duplicating.
mean_temp_f = temp_f.mean(dim='band')
mean_temp_f.hvplot(x='x', y='y', title='Mean Surface Temp (F)', crs=crs, tiles='EsriImagery',
frame_height=450, project=True, cmap='rainbow', alpha=.5, legend=False)
Notice how the hot spots are located over warehouse roofs and parking lots. This becomes even more visible when just the temperatures greater than 90F are displayed. To show this, we’ll make a special colormap that just includes high intensity reds that are found at the top of the fire_r colormap.
import colorcet as cc
special_cmap = cc.fire[::-1][90:]
thresholded_temp_p = (mean_temp_f.where(mean_temp_f > 90)
.hvplot(x='x', y='y', title='Mean Temp (F) > 90',
cmap=special_cmap, crs=crs, frame_width=400,
frame_height=450,
colorbar=False, legend=False)
.redim(value='Temperature (F)'))
thresholded_temp_p + thresholded_temp_p.opts(alpha=.3, data_aspect=None) * EsriImagery
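The colormap trick is ordinary list slicing: reverse the palette, then drop its first 90 entries. The sketch below shows the indexing on a stand-in list; the 256 placeholder entries are an assumption used in place of the real colorcet hex colors.

```python
# Stand-in for the fire palette: 256 placeholder entries (assumption)
palette = [f"color_{i:03d}" for i in range(256)]

# Reverse the palette, as fire_r does relative to fire
reversed_palette = palette[::-1]

# Drop the first 90 entries of the reversed palette and keep the rest
special = reversed_palette[90:]
print(len(special), special[0], special[-1])
```

Combined with where(mean_temp_f > 90), every pixel that survives the threshold is drawn with one of the remaining colors, and cooler pixels are left transparent.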