Carbon Flux#

Carbon Monitoring Project#

FluxNet is a worldwide collection of sensor stations that record a number of local variables relating to atmospheric conditions, solar flux and soil moisture. This notebook visualizes the data used in the NASA Goddard/University of Alabama carbon monitoring project NEE Data Fusion (Grey Nearing et al., 2018), but using Python tools rather than Matlab.

The scientific goals of this notebook are to:

  • examine the carbon flux measurements from each site (net CO2 ecosystem exchange, or NEE)

  • determine the feasibility of using a model to predict the carbon flux at one site from every other site

  • generate and explain a model

The “meta” goal is to show how Python tools let you solve the scientific goals, so that you can apply these tools to your own problems.

import sys
import dask
import numpy as np
import pandas as pd

import holoviews as hv

import hvplot.pandas  # noqa
import geoviews.tile_sources as gts

pd.options.display.max_columns = 10
hv.extension('bokeh')

Open the intake catalog#

This notebook uses intake to set up a data catalog with instructions for loading data for various projects. Before we read in any data, we’ll open that catalog file and inspect the various data sources:

import intake

cat = intake.open_catalog('./catalog.yml')
list(cat)
['fluxnet_daily', 'fluxnet_metadata']

Load metadata#

First we will load fluxnet_metadata, which contains site information for each of the FluxNet sites. These data include the latitude and longitude of each site and its vegetation encoding (more on this below). In the next cell we will read in these data and look at a few random rows:

metadata = cat.fluxnet_metadata().read()
metadata.sample(5)

       site      lat        lon igbp
307  US-ORv  40.0201   -83.0183  WET
275  US-AR1  36.4267   -99.4200  GRA
25   US-An3  68.9300  -150.2700  OSH
337  US-Wi9  46.6188   -91.0814  ENF
334  US-Wi6  46.6249   -91.2982  OSH

The vegetation type is classified according to the categories set out by the International Geosphere–Biosphere Programme (IGBP), with several additional categories defined on the FluxNet website.

igbp_vegetation = {
    'WAT': '00 - Water',
    'ENF': '01 - Evergreen Needleleaf Forest',
    'EBF': '02 - Evergreen Broadleaf Forest',
    'DNF': '03 - Deciduous Needleleaf Forest',
    'DBF': '04 - Deciduous Broadleaf Forest',
    'MF' : '05 - Mixed Forest',
    'CSH': '06 - Closed Shrublands',
    'OSH': '07 - Open shrublands',
    'WSA': '08 - Woody Savannas',
    'SAV': '09 - Savannas',
    'GRA': '10 - Grasslands',
    'WET': '11 - Permanent Wetlands',
    'CRO': '12 - Croplands',
    'URB': '13 - Urban and Built-up',
    'CNV': '14 - Cropland/Natural Vegetation Mosaics',
    'SNO': '15 - Snow and Ice',
    'BSV': '16 - Barren or Sparsely Vegetated'
}

We can use the dictionary above to map from igbp codes to longer labels - creating a new column on our metadata. We will make this column an ordered categorical to improve visualizations.

from pandas.api.types import CategoricalDtype

dtype = CategoricalDtype(ordered=True, categories=sorted(igbp_vegetation.values()))
metadata['vegetation'] = (metadata['igbp']
                          .apply(lambda x: igbp_vegetation[x])
                          .astype(dtype))
metadata.sample(5)
       site      lat       lon igbp                        vegetation
198  CN-Sw2  41.7902  111.8971  GRA                   10 - Grasslands
48   US-Dix  39.9712  -74.4346   MF                 05 - Mixed Forest
279  US-ARc  35.5465  -98.0400  GRA                   10 - Grasslands
65   US-Ho2  45.2091  -68.7470  ENF  01 - Evergreen Needleleaf Forest
269  RU-SkP  62.2550  129.1680  DNF  03 - Deciduous Needleleaf Forest
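To see what the ordered categorical buys us, here is a minimal sketch on a toy table. The three-row `toy` frame and the shortened `labels` dictionary are hypothetical stand-ins for the real metadata and mapping; only the pandas calls mirror the cell above.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Toy subset of the code-to-label mapping (labels copied from the dictionary above)
labels = {
    'ENF': '01 - Evergreen Needleleaf Forest',
    'GRA': '10 - Grasslands',
    'WET': '11 - Permanent Wetlands',
}

# Hypothetical three-site table standing in for the real metadata
toy = pd.DataFrame({'site': ['A', 'B', 'C'], 'igbp': ['WET', 'ENF', 'GRA']})

# Same recipe as above: map codes to labels, then cast to an ordered categorical
dtype = CategoricalDtype(categories=sorted(labels.values()), ordered=True)
toy['vegetation'] = toy['igbp'].map(labels).astype(dtype)

# Sorting follows the category order, and min/max are defined for ordered categories
toy_sorted = toy.sort_values('vegetation')
first_label = toy['vegetation'].min()
```

Because the labels start with a zero-padded number, sorting them alphabetically gives the same order as the IGBP numbering, which is what keeps legends and axes in a predictable sequence.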

Visualize the fluxdata sites#

The PyViz ecosystem aims to make it straightforward to visualize your data at every stage of a workflow, so that you stay aware of what it contains. Here we will use OpenStreetMap tiles from geoviews to make a quick map of where the different sites are located and the vegetation at each site.

metadata.hvplot.points('lon', 'lat', geo=True, color='vegetation',
                       height=420, width=800, cmap='Category20') * gts.OSM

Loading FluxNet data#

The data in the nee_data_fusion repository is stored as a collection of CSV files, with the site name encoded in each filename.
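As an illustration of how a site name can be recovered from a filename, here is a small sketch. The pattern is an assumption inferred from names such as FLX_AR-SLu_FLUXNET2015_FULLSET_DD_2009-2011_1-3.csv, and `parse_flx_name` is a hypothetical helper, not part of the notebook; reading the two four-digit numbers as first and last year is likewise an assumption.

```python
import re

# Pattern inferred from the FLUXNET2015 daily filenames; treat it as an assumption
FLX_PATTERN = re.compile(
    r'FLX_(?P<site>[A-Za-z]{2}-[A-Za-z0-9]+)_FLUXNET2015_FULLSET_DD_'
    r'(?P<start>\d{4})-(?P<end>\d{4})_'
)

def parse_flx_name(path):
    """Return (site, first_year, last_year) for a FLX daily filename, or None."""
    match = FLX_PATTERN.search(path)
    if match is None:
        return None
    return match['site'], int(match['start']), int(match['end'])

info = parse_flx_name('FLX_AR-SLu_FLUXNET2015_FULLSET_DD_2009-2011_1-3.csv')
```

Using `search` rather than `match` lets the same helper accept a full S3 path as well as a bare filename.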

This cell defines a function to:

  • read in the data from all sites

  • discard columns that we don’t need

  • calculate day of year

  • calculate the season (spring, summer, fall, winter)

data_columns = ['P_ERA', 'TA_ERA', 'PA_ERA', 'SW_IN_ERA', 'LW_IN_ERA', 'WS_ERA',
                'VPD_ERA', 'TIMESTAMP', 'site', 'NEE_CUT_USTAR50']
soil_data_columns = ['SWC_F_MDS_1', 'SWC_F_MDS_2', 'SWC_F_MDS_3',
                     'TS_F_MDS_1', 'TS_F_MDS_2', 'TS_F_MDS_3']

keep_from_csv = data_columns + soil_data_columns

y_variable = 'NEE_CUT_USTAR50'

def season(df, metadata):
    """Add season column based on lat and month
    """
    site = df['site'].cat.categories.item()
    lat = metadata[metadata['site'] == site]['lat'].item()
    if lat > 0:
        seasons = {3: 'spring',  4: 'spring',  5: 'spring',
                   6: 'summer',  7: 'summer',  8: 'summer',
                   9: 'fall',   10: 'fall',   11: 'fall',
                  12: 'winter',  1: 'winter',  2: 'winter'}
    else:
        seasons = {3: 'fall',    4: 'fall',    5: 'fall',
                   6: 'winter',  7: 'winter',  8: 'winter',
                   9: 'spring', 10: 'spring', 11: 'spring',
                  12: 'summer',  1: 'summer',  2: 'summer'}
    return df.assign(season=df.TIMESTAMP.dt.month.map(seasons))

def clean_data(df):
    """
    Clean data columns:
    
    * add NaN col for missing columns
    * throw away un-needed columns
    * add day of year, year, and season
    """
    df = df.assign(**{col: np.nan for col in keep_from_csv if col not in df.columns})
    df = df[keep_from_csv]
    
    df = df.assign(DOY=df.TIMESTAMP.dt.dayofyear)
    df = df.assign(year=df.TIMESTAMP.dt.year)
    df = season(df, metadata)
    
    return df
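The hemisphere logic in `season` can be checked in isolation with plain Python. The sketch below is an alternative formulation, not the notebook's code: `season_for` and the two lookup tables are hypothetical names, and it flips the northern-hemisphere season rather than keeping two separate month tables.

```python
from datetime import date

# Meteorological seasons for the northern hemisphere, keyed by month number
NORTHERN = {12: 'winter', 1: 'winter', 2: 'winter',
            3: 'spring', 4: 'spring', 5: 'spring',
            6: 'summer', 7: 'summer', 8: 'summer',
            9: 'fall', 10: 'fall', 11: 'fall'}

# South of the equator each season is swapped for its opposite
OPPOSITE = {'winter': 'summer', 'summer': 'winter',
            'spring': 'fall', 'fall': 'spring'}

def season_for(month, lat):
    """Return the season for a month at a latitude (lat > 0 means northern)."""
    season = NORTHERN[month]
    return season if lat > 0 else OPPOSITE[season]

# Day of year for a single date, the scalar analogue of TIMESTAMP.dt.dayofyear
doy = date(2009, 3, 1).timetuple().tm_yday
```

As in the original function, a site exactly on the equator (lat of 0) falls into the southern-hemisphere branch.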

Read and clean data#

This will take a few minutes if the data is not cached yet. First we will get a list of all the files on the S3 bucket, then we will iterate over those files and cache, read, and munge the data in each one. This is necessary since the columns in each file don’t necessarily match the columns in the other files. Before we concatenate across sites, we need to do some cleaning.
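The cleaning step matters because the site files do not share one schema. The sketch below shows the idea on toy data, using in-memory pandas frames rather than dask and S3; the column shortlists and the three `site_*` frames are hypothetical, and `align` is a simplified stand-in for `clean_data`.

```python
import numpy as np
import pandas as pd

keep = ['TIMESTAMP', 'site', 'TA_ERA', 'SWC_F_MDS_1']  # hypothetical shortlist
required = ['TIMESTAMP', 'site', 'TA_ERA']             # must be present to use a site

day = pd.to_datetime(['2009-01-01'])
site_a = pd.DataFrame({'TIMESTAMP': day, 'site': ['A'], 'TA_ERA': [1.5],
                       'SWC_F_MDS_1': [20.0], 'EXTRA': [0]})
site_b = pd.DataFrame({'TIMESTAMP': day, 'site': ['B'], 'TA_ERA': [3.0]})
site_c = pd.DataFrame({'TIMESTAMP': day, 'site': ['C']})  # lacks a required column

def align(df):
    """Add NaN columns for anything missing, then keep only the shortlist."""
    missing = {col: np.nan for col in keep if col not in df.columns}
    return df.assign(**missing)[keep]

# Skip sites whose columns are not a superset of the required ones
usable = [df for df in (site_a, site_b, site_c)
          if set(df.columns) >= set(required)]
combined = pd.concat([align(df) for df in usable], ignore_index=True)
```

After alignment every frame has the same columns in the same order, so the concatenation cannot silently introduce extra columns or misaligned ones.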

from s3fs import S3FileSystem
s3 = S3FileSystem(anon=True)
s3_paths = s3.glob('earth-data/carbon_flux/nee_data_fusion/FLX*')
datasets = []
skipped = []
used = []

for i, s3_path in enumerate(s3_paths):
    sys.stdout.write('\r{}/{}'.format(i+1, len(s3_paths)))
    
    try:
        dd = cat.fluxnet_daily(s3_path=s3_path).to_dask()
    except FileNotFoundError:
        try:
            dd = cat.fluxnet_daily(s3_path=s3_path.split('/')[-1]).to_dask()
        except FileNotFoundError:
            continue
    site = dd['site'].cat.categories.item()
    
    if not set(dd.columns) >= set(data_columns):
        skipped.append(site)
        continue

    datasets.append(clean_data(dd))
    used.append(site)

print()
print('Found {} fluxnet sites with enough data to use - skipped {}'.format(len(used), len(skipped)))
4/|/FLX_NL-Loo_FLUXNET2015_FULLSET_DD_1996-2014_1-3.csv:  27%|
12/|/FLX_NL-Loo_FLUXNET2015_FULLSET_DD_1996-2014_1-3.csv:  80%|
15/|/FLX_NL-Loo_FLUXNET2015_FULLSET_DD_1996-2014_1-3.csv: 100%|
                                                               

130/209
0/|/FLX_NO-Adv_FLUXNET2015_FULLSET_DD_2011-2014_1-3.csv:   0%|
3/|/FLX_NO-Adv_FLUXNET2015_FULLSET_DD_2011-2014_1-3.csv: 100%|
                                                              

131/209
0/|/FLX_NO-Blv_FLUXNET2015_FULLSET_DD_2008-2009_1-3.csv:   0%|
1/|/FLX_NO-Blv_FLUXNET2015_FULLSET_DD_2008-2009_1-3.csv: 100%|
                                                              

132/209
0/|/FLX_PA-SPn_FLUXNET2015_FULLSET_DD_2007-2009_1-3.csv:   0%|
2/|/FLX_PA-SPn_FLUXNET2015_FULLSET_DD_2007-2009_1-3.csv: 100%|
                                                              

133/209
0/|/FLX_PA-SPs_FLUXNET2015_FULLSET_DD_2007-2009_1-3.csv:   0%|
2/|/FLX_PA-SPs_FLUXNET2015_FULLSET_DD_2007-2009_1-3.csv: 100%|
                                                              

134/209
0/|/FLX_RU-Che_FLUXNET2015_FULLSET_DD_2002-2005_1-3.csv:   0%|
3/|/FLX_RU-Che_FLUXNET2015_FULLSET_DD_2002-2005_1-3.csv: 100%|
                                                              

135/209
0/|/FLX_RU-Cok_FLUXNET2015_FULLSET_DD_2003-2014_2-3.csv:   0%|
4/|/FLX_RU-Cok_FLUXNET2015_FULLSET_DD_2003-2014_2-3.csv:  44%|
                                                              

136/209
0/|/FLX_RU-Fyo_FLUXNET2015_FULLSET_DD_1998-2014_2-3.csv:   0%|
4/|/FLX_RU-Fyo_FLUXNET2015_FULLSET_DD_1998-2014_2-3.csv:  29%|
12/|/FLX_RU-Fyo_FLUXNET2015_FULLSET_DD_1998-2014_2-3.csv:  86%|
                                                               

137/209
0/|/FLX_RU-Ha1_FLUXNET2015_FULLSET_DD_2002-2004_1-3.csv:   0%|
2/|/FLX_RU-Ha1_FLUXNET2015_FULLSET_DD_2002-2004_1-3.csv: 100%|
                                                              

138/209
0/|/FLX_RU-Sam_FLUXNET2015_FULLSET_DD_2002-2014_1-3.csv:   0%|
4/|/FLX_RU-Sam_FLUXNET2015_FULLSET_DD_2002-2014_1-3.csv:  40%|
                                                              

139/209
0/|/FLX_RU-SkP_FLUXNET2015_FULLSET_DD_2012-2014_1-3.csv:   0%|
2/|/FLX_RU-SkP_FLUXNET2015_FULLSET_DD_2012-2014_1-3.csv: 100%|
                                                              

140/209
0/|/FLX_RU-Tks_FLUXNET2015_FULLSET_DD_2010-2014_1-3.csv:   0%|
4/|/FLX_RU-Tks_FLUXNET2015_FULLSET_DD_2010-2014_1-3.csv: 100%|
                                                              

141/209
0/|/FLX_RU-Vrk_FLUXNET2015_FULLSET_DD_2008-2008_1-3.csv: |
                                                          

142/209
0/|/FLX_SD-Dem_FLUXNET2015_FULLSET_DD_2005-2009_2-3.csv:   0%|
4/|/FLX_SD-Dem_FLUXNET2015_FULLSET_DD_2005-2009_2-3.csv: 100%|
                                                              

143/209
0/|/FLX_SE-St1_FLUXNET2015_FULLSET_DD_2012-2014_1-3.csv:   0%|
2/|/FLX_SE-St1_FLUXNET2015_FULLSET_DD_2012-2014_1-3.csv: 100%|
                                                              

144/209
0/|/FLX_SN-Dhr_FLUXNET2015_FULLSET_DD_2010-2013_1-3.csv:   0%|
3/|/FLX_SN-Dhr_FLUXNET2015_FULLSET_DD_2010-2013_1-3.csv: 100%|
                                                              

145/209
0/|/FLX_US-AR1_FLUXNET2015_FULLSET_DD_2009-2012_1-3.csv:   0%|
3/|/FLX_US-AR1_FLUXNET2015_FULLSET_DD_2009-2012_1-3.csv: 100%|
                                                              

146/209
0/|/FLX_US-AR2_FLUXNET2015_FULLSET_DD_2009-2012_1-3.csv:   0%|
3/|/FLX_US-AR2_FLUXNET2015_FULLSET_DD_2009-2012_1-3.csv: 100%|
                                                              

147/209
0/|/FLX_US-ARM_FLUXNET2015_FULLSET_DD_2003-2012_1-3.csv:   0%|
4/|/FLX_US-ARM_FLUXNET2015_FULLSET_DD_2003-2012_1-3.csv:  50%|
                                                              

148/209
0/|/FLX_US-ARb_FLUXNET2015_FULLSET_DD_2005-2006_1-3.csv:   0%|
1/|/FLX_US-ARb_FLUXNET2015_FULLSET_DD_2005-2006_1-3.csv: 100%|
                                                              

149/209
0/|/FLX_US-ARc_FLUXNET2015_FULLSET_DD_2005-2006_1-3.csv:   0%|
1/|/FLX_US-ARc_FLUXNET2015_FULLSET_DD_2005-2006_1-3.csv: 100%|
                                                              

150/209
0/|/FLX_US-Atq_FLUXNET2015_FULLSET_DD_2003-2008_1-3.csv:   0%|
4/|/FLX_US-Atq_FLUXNET2015_FULLSET_DD_2003-2008_1-3.csv: 100%|
                                                              

151/209
0/|/FLX_US-Blo_FLUXNET2015_FULLSET_DD_1997-2007_1-3.csv:   0%|
4/|/FLX_US-Blo_FLUXNET2015_FULLSET_DD_1997-2007_1-3.csv:  50%|
                                                              

152/209
0/|/FLX_US-CRT_FLUXNET2015_FULLSET_DD_2011-2013_1-3.csv:   0%|
2/|/FLX_US-CRT_FLUXNET2015_FULLSET_DD_2011-2013_1-3.csv: 100%|
                                                              

153/209
0/|/FLX_US-Cop_FLUXNET2015_FULLSET_DD_2001-2007_1-3.csv:   0%|
4/|/FLX_US-Cop_FLUXNET2015_FULLSET_DD_2001-2007_1-3.csv:  80%|
                                                              

154/209
0/|/FLX_US-GBT_FLUXNET2015_FULLSET_DD_1999-2006_1-3.csv:   0%|
4/|/FLX_US-GBT_FLUXNET2015_FULLSET_DD_1999-2006_1-3.csv:  80%|
                                                              

155/209
0/|/FLX_US-GLE_FLUXNET2015_FULLSET_DD_2004-2014_1-3.csv:   0%|
4/|/FLX_US-GLE_FLUXNET2015_FULLSET_DD_2004-2014_1-3.csv:  44%|
                                                              

156/209
0/|/FLX_US-Goo_FLUXNET2015_FULLSET_DD_2002-2006_1-3.csv:   0%|
4/|/FLX_US-Goo_FLUXNET2015_FULLSET_DD_2002-2006_1-3.csv: 100%|
                                                              

157/209
0/|/FLX_US-Ha1_FLUXNET2015_FULLSET_DD_1991-2012_1-3.csv:   0%|
4/|/FLX_US-Ha1_FLUXNET2015_FULLSET_DD_1991-2012_1-3.csv:  24%|
12/|/FLX_US-Ha1_FLUXNET2015_FULLSET_DD_1991-2012_1-3.csv:  71%|
15/|/FLX_US-Ha1_FLUXNET2015_FULLSET_DD_1991-2012_1-3.csv:  88%|
                                                               

158/209
0/|/FLX_US-IB2_FLUXNET2015_FULLSET_DD_2004-2011_1-3.csv:   0%|
4/|/FLX_US-IB2_FLUXNET2015_FULLSET_DD_2004-2011_1-3.csv:  67%|
                                                              

159/209
0/|/FLX_US-Ivo_FLUXNET2015_FULLSET_DD_2004-2007_1-3.csv:   0%|
3/|/FLX_US-Ivo_FLUXNET2015_FULLSET_DD_2004-2007_1-3.csv: 100%|
                                                              

160/209
0/|/FLX_US-KS1_FLUXNET2015_FULLSET_DD_2002-2002_1-3.csv: |
                                                          

161/209
0/|/FLX_US-KS2_FLUXNET2015_FULLSET_DD_2003-2006_1-3.csv:   0%|
3/|/FLX_US-KS2_FLUXNET2015_FULLSET_DD_2003-2006_1-3.csv: 100%|
                                                              

162/209
0/|/FLX_US-LWW_FLUXNET2015_FULLSET_DD_1997-1998_1-3.csv:   0%|
                                                              

163/209
0/|/FLX_US-Lin_FLUXNET2015_FULLSET_DD_2009-2010_1-3.csv: |
                                                          

164/209
0/|/FLX_US-Los_FLUXNET2015_FULLSET_DD_2000-2014_2-3.csv:   0%|
4/|/FLX_US-Los_FLUXNET2015_FULLSET_DD_2000-2014_2-3.csv:  36%|
10/|/FLX_US-Los_FLUXNET2015_FULLSET_DD_2000-2014_2-3.csv:  91%|
                                                               

165/209
0/|/FLX_US-MMS_FLUXNET2015_FULLSET_DD_1999-2014_1-3.csv:   0%|
4/|/FLX_US-MMS_FLUXNET2015_FULLSET_DD_1999-2014_1-3.csv:  31%|
11/|/FLX_US-MMS_FLUXNET2015_FULLSET_DD_1999-2014_1-3.csv:  85%|
                                                               

166/209
0/|/FLX_US-Me1_FLUXNET2015_FULLSET_DD_2004-2005_1-3.csv: |
0/|/FLX_US-Me1_FLUXNET2015_FULLSET_DD_2004-2005_1-3.csv: |
                                                          
167/209

0/|/FLX_US-Me2_FLUXNET2015_FULLSET_DD_2002-2014_1-3.csv:   0%|
4/|/FLX_US-Me2_FLUXNET2015_FULLSET_DD_2002-2014_1-3.csv:  40%|
9/|/FLX_US-Me2_FLUXNET2015_FULLSET_DD_2002-2014_1-3.csv:  90%|
                                                              

168/209
0/|/FLX_US-Me3_FLUXNET2015_FULLSET_DD_2004-2009_1-3.csv:   0%|
4/|/FLX_US-Me3_FLUXNET2015_FULLSET_DD_2004-2009_1-3.csv:  80%|
                                                              

169/209
0/|/FLX_US-Me4_FLUXNET2015_FULLSET_DD_1996-2000_1-3.csv:   0%|
3/|/FLX_US-Me4_FLUXNET2015_FULLSET_DD_1996-2000_1-3.csv: 100%|
                                                              

170/209
0/|/FLX_US-Me5_FLUXNET2015_FULLSET_DD_2000-2002_1-3.csv:   0%|
2/|/FLX_US-Me5_FLUXNET2015_FULLSET_DD_2000-2002_1-3.csv: 100%|
                                                              

171/209
0/|/FLX_US-Me6_FLUXNET2015_FULLSET_DD_2010-2014_2-3.csv:   0%|
4/|/FLX_US-Me6_FLUXNET2015_FULLSET_DD_2010-2014_2-3.csv: 100%|
                                                              

172/209
0/|/FLX_US-Myb_FLUXNET2015_FULLSET_DD_2010-2014_2-3.csv:   0%|
3/|/FLX_US-Myb_FLUXNET2015_FULLSET_DD_2010-2014_2-3.csv: 100%|
                                                              

173/209
0/|/FLX_US-NR1_FLUXNET2015_FULLSET_DD_1998-2014_1-3.csv:   0%|
4/|/FLX_US-NR1_FLUXNET2015_FULLSET_DD_1998-2014_1-3.csv:  29%|
12/|/FLX_US-NR1_FLUXNET2015_FULLSET_DD_1998-2014_1-3.csv:  86%|
                                                               

174/209
0/|/FLX_US-Ne1_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:   0%|
4/|/FLX_US-Ne1_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:  40%|
9/|/FLX_US-Ne1_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:  90%|
                                                              

175/209
0/|/FLX_US-Ne2_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:   0%|
4/|/FLX_US-Ne2_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:  40%|
9/|/FLX_US-Ne2_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:  90%|
                                                              

176/209
0/|/FLX_US-Ne3_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:   0%|
4/|/FLX_US-Ne3_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:  40%|
9/|/FLX_US-Ne3_FLUXNET2015_FULLSET_DD_2001-2013_1-3.csv:  90%|
                                                              

177/209
0/|/FLX_US-ORv_FLUXNET2015_FULLSET_DD_2011-2011_1-3.csv: |
                                                          

178/209
0/|/FLX_US-Oho_FLUXNET2015_FULLSET_DD_2004-2013_1-3.csv:   0%|
4/|/FLX_US-Oho_FLUXNET2015_FULLSET_DD_2004-2013_1-3.csv:  50%|
                                                              

179/209
0/|/FLX_US-PFa_FLUXNET2015_FULLSET_DD_1995-2014_1-3.csv:   0%|
4/|/FLX_US-PFa_FLUXNET2015_FULLSET_DD_1995-2014_1-3.csv:  25%|
12/|/FLX_US-PFa_FLUXNET2015_FULLSET_DD_1995-2014_1-3.csv:  75%|
16/|/FLX_US-PFa_FLUXNET2015_FULLSET_DD_1995-2014_1-3.csv: 100%|
                                                               

180/209
0/|/FLX_US-Prr_FLUXNET2015_FULLSET_DD_2010-2014_1-3.csv:   0%|
4/|/FLX_US-Prr_FLUXNET2015_FULLSET_DD_2010-2014_1-3.csv: 100%|
                                                              

181/209
0/|/FLX_US-SRC_FLUXNET2015_FULLSET_DD_2008-2014_1-3.csv:   0%|
4/|/FLX_US-SRC_FLUXNET2015_FULLSET_DD_2008-2014_1-3.csv:  67%|
                                                              

182/209
0/|/FLX_US-SRG_FLUXNET2015_FULLSET_DD_2008-2014_1-3.csv:   0%|
4/|/FLX_US-SRG_FLUXNET2015_FULLSET_DD_2008-2014_1-3.csv:  67%|
                                                              

183/209
0/|/FLX_US-SRM_FLUXNET2015_FULLSET_DD_2004-2014_1-3.csv:   0%|
4/|/FLX_US-SRM_FLUXNET2015_FULLSET_DD_2004-2014_1-3.csv:  44%|
                                                              

184/209
0/|/FLX_US-Sta_FLUXNET2015_FULLSET_DD_2005-2009_1-3.csv:   0%|
4/|/FLX_US-Sta_FLUXNET2015_FULLSET_DD_2005-2009_1-3.csv: 100%|
                                                              

185/209
0/|/FLX_US-Syv_FLUXNET2015_FULLSET_DD_2001-2014_1-3.csv:   0%|
4/|/FLX_US-Syv_FLUXNET2015_FULLSET_DD_2001-2014_1-3.csv:  36%|
9/|/FLX_US-Syv_FLUXNET2015_FULLSET_DD_2001-2014_1-3.csv:  82%|
                                                              

186/209
0/|/FLX_US-Ton_FLUXNET2015_FULLSET_DD_2001-2014_1-3.csv:   0%|
4/|/FLX_US-Ton_FLUXNET2015_FULLSET_DD_2001-2014_1-3.csv:  36%|
10/|/FLX_US-Ton_FLUXNET2015_FULLSET_DD_2001-2014_1-3.csv:  91%|
                                                               

187/209
0/|/FLX_US-Tw1_FLUXNET2015_FULLSET_DD_2012-2014_1-3.csv:   0%|
2/|/FLX_US-Tw1_FLUXNET2015_FULLSET_DD_2012-2014_1-3.csv: 100%|
                                                              

188/209
0/|/FLX_US-Tw2_FLUXNET2015_FULLSET_DD_2012-2013_1-3.csv: |
                                                          

189/209
0/|/FLX_US-Tw3_FLUXNET2015_FULLSET_DD_2013-2014_2-3.csv:   0%|
1/|/FLX_US-Tw3_FLUXNET2015_FULLSET_DD_2013-2014_2-3.csv: 100%|
                                                              

190/209
0/|/FLX_US-Tw4_FLUXNET2015_FULLSET_DD_2013-2014_1-3.csv: |
0/|/FLX_US-Tw4_FLUXNET2015_FULLSET_DD_2013-2014_1-3.csv: |
                                                          

191/209
0/|/FLX_US-Twt_FLUXNET2015_FULLSET_DD_2009-2014_1-3.csv:   0%|
4/|/FLX_US-Twt_FLUXNET2015_FULLSET_DD_2009-2014_1-3.csv: 100%|
                                                              

192/209
0/|/FLX_US-UMB_FLUXNET2015_FULLSET_DD_2000-2014_1-3.csv:   0%|
4/|/FLX_US-UMB_FLUXNET2015_FULLSET_DD_2000-2014_1-3.csv:  33%|
10/|/FLX_US-UMB_FLUXNET2015_FULLSET_DD_2000-2014_1-3.csv:  83%|
                                                               

193/209
0/|/FLX_US-UMd_FLUXNET2015_FULLSET_DD_2007-2014_1-3.csv:   0%|
4/|/FLX_US-UMd_FLUXNET2015_FULLSET_DD_2007-2014_1-3.csv:  67%|
                                                              

194/209
0/|/FLX_US-Var_FLUXNET2015_FULLSET_DD_2000-2014_1-3.csv:   0%|
4/|/FLX_US-Var_FLUXNET2015_FULLSET_DD_2000-2014_1-3.csv:  33%|
11/|/FLX_US-Var_FLUXNET2015_FULLSET_DD_2000-2014_1-3.csv:  92%|
                                                               

195/209
0/|/FLX_US-WCr_FLUXNET2015_FULLSET_DD_1999-2014_1-3.csv:   0%|
4/|/FLX_US-WCr_FLUXNET2015_FULLSET_DD_1999-2014_1-3.csv:  33%|
11/|/FLX_US-WCr_FLUXNET2015_FULLSET_DD_1999-2014_1-3.csv:  92%|
                                                               

196/209
0/|/FLX_US-WPT_FLUXNET2015_FULLSET_DD_2011-2013_1-3.csv:   0%|
2/|/FLX_US-WPT_FLUXNET2015_FULLSET_DD_2011-2013_1-3.csv: 100%|
                                                              

197/209
0/|/FLX_US-Whs_FLUXNET2015_FULLSET_DD_2007-2014_1-3.csv:   0%|
4/|/FLX_US-Whs_FLUXNET2015_FULLSET_DD_2007-2014_1-3.csv:  67%|
                                                              

198/209
0/|/FLX_US-Wi0_FLUXNET2015_FULLSET_DD_2002-2002_1-3.csv: |
                                                          

199/209
0/|/FLX_US-Wi1_FLUXNET2015_FULLSET_DD_2003-2003_1-3.csv: |
                                                          

200/209
0/|/FLX_US-Wi2_FLUXNET2015_FULLSET_DD_2003-2003_1-3.csv: |
0/|/FLX_US-Wi2_FLUXNET2015_FULLSET_DD_2003-2003_1-3.csv: |
                                                          
201/209

0/|/FLX_US-Wi3_FLUXNET2015_FULLSET_DD_2002-2004_1-3.csv:   0%|
2/|/FLX_US-Wi3_FLUXNET2015_FULLSET_DD_2002-2004_1-3.csv: 100%|
                                                              

202/209
0/|/FLX_US-Wi4_FLUXNET2015_FULLSET_DD_2002-2005_1-3.csv:   0%|
3/|/FLX_US-Wi4_FLUXNET2015_FULLSET_DD_2002-2005_1-3.csv: 100%|
                                                              

203/209
0/|/FLX_US-Wi5_FLUXNET2015_FULLSET_DD_2004-2004_1-3.csv: |
0/|/FLX_US-Wi5_FLUXNET2015_FULLSET_DD_2004-2004_1-3.csv: |
                                                          
204/209

0/|/FLX_US-Wi6_FLUXNET2015_FULLSET_DD_2002-2003_1-3.csv:   0%|
                                                              

205/209
0/|/FLX_US-Wi7_FLUXNET2015_FULLSET_DD_2005-2005_1-3.csv: |
0/|/FLX_US-Wi7_FLUXNET2015_FULLSET_DD_2005-2005_1-3.csv: |
                                                          

206/209
0/|/FLX_US-Wi8_FLUXNET2015_FULLSET_DD_2002-2002_1-3.csv: |
0/|/FLX_US-Wi8_FLUXNET2015_FULLSET_DD_2002-2002_1-3.csv: |
                                                          

207/209
0/|/FLX_US-Wi9_FLUXNET2015_FULLSET_DD_2004-2005_1-3.csv: |
0/|/FLX_US-Wi9_FLUXNET2015_FULLSET_DD_2004-2005_1-3.csv: |
                                                          

208/209
0/|/FLX_ZA-Kru_FLUXNET2015_FULLSET_DD_2000-2013_1-3.csv:   0%|
4/|/FLX_ZA-Kru_FLUXNET2015_FULLSET_DD_2000-2013_1-3.csv:  36%|
10/|/FLX_ZA-Kru_FLUXNET2015_FULLSET_DD_2000-2013_1-3.csv:  91%|
                                                               

209/209
0/|/FLX_ZM-Mon_FLUXNET2015_FULLSET_DD_2000-2009_2-3.csv:   0%|
4/|/FLX_ZM-Mon_FLUXNET2015_FULLSET_DD_2000-2009_2-3.csv:  57%|
                                                              

Found 179 fluxnet sites with enough data to use - skipped 30

Now that we have a list of datasets, we will concatenate them row-wise. Since dask loads the data lazily, we need to call compute explicitly to bring the result into memory.

data = dask.dataframe.concat(datasets).compute()
data.columns
Index(['P_ERA', 'TA_ERA', 'PA_ERA', 'SW_IN_ERA', 'LW_IN_ERA', 'WS_ERA',
       'VPD_ERA', 'TIMESTAMP', 'site', 'NEE_CUT_USTAR50', 'SWC_F_MDS_1',
       'SWC_F_MDS_2', 'SWC_F_MDS_3', 'TS_F_MDS_1', 'TS_F_MDS_2', 'TS_F_MDS_3',
       'DOY', 'year', 'season'],
      dtype='object')

We’ll also set the data type of 'site' to 'category'. This will come in handy later.

data['site'] = data['site'].astype('category')
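
As a small illustration of why the categorical dtype is handy, the sketch below uses an invented frame (the values are made up and the column name NEE is only a placeholder) to show that a categorical column exposes one integer code per distinct site. Codes of this kind are what the notebook later passes to the splitter as group labels.

```python
import pandas as pd

# Toy frame standing in for the daily data; all values are invented.
toy = pd.DataFrame({'site': ['US-Ha1', 'IT-Col', 'US-Ha1', 'ZA-Kru'],
                    'NEE': [0.1, -0.4, 0.3, -0.2]})
toy['site'] = toy['site'].astype('category')

# Categories are the sorted unique labels; each code indexes into them.
categories = list(toy['site'].cat.categories)
codes = toy['site'].cat.codes.tolist()
print(categories)
print(codes)
```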

Visualizing Data Available at Sites#

We can look at the sites for which we have data. We’ll plot the sites on a world map again - this time using a custom colormap to denote sites with valid data, sites where data exist but were not loaded because too many fields were missing, and sites where no data was available. In addition to this map we’ll get the count of different vegetation types at the sites.

def mapper(x):
    if x in used:
        return 'valid'
    elif x in skipped:
        return 'skipped'
    else:
        return 'no data'
    
cmap = {'valid': 'green', 'skipped': 'red', 'no data': 'darkgray'}

QA = metadata.copy()
QA['quality'] = QA['site'].map(mapper)

all_points = QA.hvplot.points('lon', 'lat', geo=True, color='quality', 
                              cmap=cmap, hover_cols=['site', 'vegetation'],
                              height=420, width=600).options(tools=['hover', 'tap'], 
                                                             legend_position='top')

def veg_count(data):
    veg_count = data['vegetation'].value_counts().sort_index(ascending=False)
    return veg_count.hvplot.barh(height=420, width=500)

hist = veg_count(QA[QA.quality=='valid']).relabel('Vegetation counts for valid sites')

all_points * gts.OSM + hist
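
To make the three-way classification concrete, here is a minimal sketch of the same mapping applied to a handful of made-up site codes. The contents of `used` and `skipped` below are hypothetical stand-ins for the lists built while loading the data.

```python
import pandas as pd

# Hypothetical stand-ins for the lists built during loading.
used = ['US-Ha1', 'IT-Col']
skipped = ['US-KS1']

def mapper(x):
    if x in used:
        return 'valid'
    elif x in skipped:
        return 'skipped'
    else:
        return 'no data'

# 'XX-000' is an invented code that appears in neither list.
sites = pd.Series(['US-Ha1', 'US-KS1', 'XX-000', 'IT-Col'])
quality = sites.map(mapper)
counts = quality.value_counts()
print(quality.tolist())
```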

We’ll make a couple of functions that generate plots on the full set of data or a subset of the data. We will use these in a dashboard below.

def site_timeseries(data):
    """Timeseries plot showing the mean carbon flux at each DOY as well as the min and max"""
    
    tseries = hv.Overlay([
        # Named aggregation fixes the output column names as 'amin'/'amax'
        (data.groupby(['DOY', 'year'])[y_variable]
             .mean().groupby('DOY').agg(amin='min', amax='max')
             .hvplot.area('DOY', 'amin', 'amax', alpha=0.2, fields={'amin': y_variable})),
        data.groupby('DOY')[y_variable].mean().hvplot()])
    
    return tseries.options(width=800, height=400)

def site_count_plot(data):
    """Plot of the number of observations of each of the non-mandatory variables."""
    return data[soil_data_columns + ['site']].count().hvplot.bar(rot=90, width=300, height=400)

timeseries = site_timeseries(data)
count_plot = site_count_plot(data)
timeseries + count_plot
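
The shaded band in the timeseries comes from a two-step aggregation, which may be easier to follow on a tiny example. The sketch below uses an invented two-year record (the column name NEE stands in for whatever `y_variable` holds) and leaves out the plotting.

```python
import pandas as pd

# Invented record: two years, three days of year each.
toy = pd.DataFrame({
    'year': [2001, 2001, 2001, 2002, 2002, 2002],
    'DOY':  [1, 2, 3, 1, 2, 3],
    'NEE':  [1.0, 2.0, 3.0, 3.0, 2.0, 5.0],
})

# Step 1: one value per (DOY, year).
per_year = toy.groupby(['DOY', 'year'])['NEE'].mean()
# Step 2: for each DOY, the spread of those yearly values.
envelope = per_year.groupby('DOY').agg(['min', 'max'])

# The line drawn over the band: mean over all observations for each DOY.
climatology = toy.groupby('DOY')['NEE'].mean()
print(envelope)
print(climatology)
```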

Dashboard#

Using the plots and functions defined above, we can make a Panel dashboard of sites: clicking on a site shows the timeseries and variable count for that particular site.

from holoviews.streams import Selection1D
import panel as pn
stream = Selection1D(source=all_points)
empty = timeseries.relabel('No selection') + count_plot.relabel('No selection')

def site_selection(index):
    if not index:
        return empty
    i = index[0]
    if i in QA[QA.quality=='valid'].index:
        site = QA.iloc[i].site
        ts = site_timeseries(data[data.site == site]).relabel(site)
        ct = site_count_plot(data[data.site == site]).relabel(site)
        return ts + ct
    else:
        return empty

one_site = hv.DynamicMap(site_selection, streams=[stream])

pn.Column(pn.Row(all_points * gts.OSM, hist), pn.Row(one_site))

Merge data#

Now that the data are loaded in we can merge the daily data with the metadata from before.

In order to use the categorical igbp field with machine-learning tools, we will create a one-hot encoding where each column corresponds to one of the igbp types, the rows correspond to observations and all the cells are filled with 0 or 1. This can be done using the method pd.get_dummies:

onehot_metadata = pd.get_dummies(metadata, columns=['igbp'])
onehot_metadata.sample(5)
       site      lat        lon                        vegetation  igbp_BSV  ...  igbp_SAV  igbp_SNO  igbp_WAT  igbp_WET  igbp_WSA
305  US-Ne2  41.1649   -96.4701                    12 - Croplands         0  ...         0         0         0         0         0
300  US-Me5  44.4372  -121.5668  01 - Evergreen Needleleaf Forest         0  ...         0         0         0         0         0
310  US-Prr  65.1237  -147.4876  01 - Evergreen Needleleaf Forest         0  ...         0         0         0         0         0
 36  US-Br1  41.9749   -93.6906                    12 - Croplands         0  ...         0         0         0         0         0
253  IT-SRo  43.7279    10.2844  01 - Evergreen Needleleaf Forest         0  ...         0         0         0         0         0

5 rows × 19 columns
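
For readers new to one-hot encoding, a minimal sketch on an invented three-site table shows what pd.get_dummies produces. The site names here are placeholders. Passing dtype=int is an addition in this sketch: it gives 0/1 columns, whereas recent pandas versions default to booleans.

```python
import pandas as pd

# Invented metadata: three placeholder sites, two vegetation classes.
toy_meta = pd.DataFrame({'site': ['A', 'B', 'C'],
                         'igbp': ['WET', 'GRA', 'WET']})

# Each distinct igbp value becomes its own indicator column.
onehot = pd.get_dummies(toy_meta, columns=['igbp'], dtype=int)
print(onehot)
```

Each row has exactly one indicator set to 1, so the indicator columns of a row always sum to 1.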

We’ll do the same for season - keeping season as a column.

data = pd.get_dummies(data, columns=['season']).assign(season=data['season'])

We’ll merge the metadata with all our daily observations - creating a tidy dataframe.

df = pd.merge(data, onehot_metadata, on='site')
df.sample(5)
        P_ERA  TA_ERA  PA_ERA  SW_IN_ERA  LW_IN_ERA  ...  igbp_SAV  igbp_SNO  igbp_WAT  igbp_WET  igbp_WSA
192513  0.304  -4.332  96.991     51.409    230.432  ...         0         0         0         0         0
444673  0.291  10.006  71.059    314.236    275.579  ...         0         0         0         0         0
 95251  0.000  -1.383  94.776    119.297    221.178  ...         0         0         0         0         0
120767  4.892  20.151  98.418    206.128    370.650  ...         0         0         0         0         0
462282  0.292  21.780  98.164    198.436    364.736  ...         0         0         0         0         0

5 rows × 41 columns
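
The merge repeats each site's metadata on every daily row for that site. A small sketch with invented frames illustrates this. The `validate` argument is an addition in this sketch and is not used in the notebook: it asks pandas to raise an error if a site appears more than once in the metadata.

```python
import pandas as pd

# Invented daily observations and site metadata.
daily = pd.DataFrame({'site': ['A', 'A', 'B'], 'TA_ERA': [1.0, 2.0, 3.0]})
meta = pd.DataFrame({'site': ['A', 'B', 'C'], 'lat': [10.0, 20.0, 30.0]})

# The default how='inner' drops sites missing from either side (here 'C').
merged = pd.merge(daily, meta, on='site', validate='many_to_one')
print(merged)
```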

Visualizing Soil Data Availability at Sites#

Now that all of our observations are merged with the site metadata, we can take a look at which sites have soil data. Some sites have soil moisture and temperature data at one depth and others have the data at all 3 depths. We’ll look at the distribution of availability across sites.

partial_soil_data = df[df[soil_data_columns].notnull().any(axis=1)]
partial_soil_data_sites = metadata[metadata.site.isin(partial_soil_data.site.unique())]
full_soil_data = df[df[soil_data_columns].notnull().all(axis=1)]
full_soil_data_sites = metadata[metadata.site.isin(full_soil_data.site.unique())]
args = dict(geo=True, hover_cols=['site', 'vegetation'], height=420, width=600)

partial = partial_soil_data_sites.hvplot.points('lon', 'lat', **args).relabel('partial soil data')
full    =    full_soil_data_sites.hvplot.points('lon', 'lat', **args).relabel('full soil data')

(partial * full * gts.OSM).options(legend_position='top') +  veg_count(partial_soil_data_sites) * veg_count(full_soil_data_sites)
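
The distinction between "partial" and "full" soil data comes down to `any` versus `all` across the soil columns of each row. A minimal sketch with invented values (SWC_1 and SWC_2 are shortened placeholder column names):

```python
import numpy as np
import pandas as pd

# Invented rows: A has both soil columns, B has one, C has none.
toy = pd.DataFrame({'site': ['A', 'B', 'C'],
                    'SWC_1': [0.2, 0.3, np.nan],
                    'SWC_2': [0.1, np.nan, np.nan]})
cols = ['SWC_1', 'SWC_2']

has_any = toy[cols].notnull().any(axis=1)   # at least one soil value present
has_all = toy[cols].notnull().all(axis=1)   # every soil value present

partial_sites = toy.loc[has_any, 'site'].tolist()
full_sites = toy.loc[has_all, 'site'].tolist()
print(partial_sites, full_sites)
```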

Since there seems to be a strong geographic pattern in the availability of soil moisture and soil temperature data, we won’t use those columns in our model.

df = df.drop(columns=soil_data_columns)

Now we will restrict df to only the rows where there are no null values:

df = df[df.notnull().all(axis=1)].reset_index(drop=True)
df['site'] = df['site'].astype('category')

Assigning roles to variables#

Before we train a model to predict carbon flux globally we need to choose which variables will be included in the input to the model. For those we should only use variables that we expect to have some relationship with the variable that we are trying to predict.

explanatory_cols = ['lat']
data_cols = ['P_ERA', 'TA_ERA', 'PA_ERA', 'SW_IN_ERA', 'LW_IN_ERA', 'WS_ERA', 'VPD_ERA']
season_cols = [col for col in df.columns if col.startswith('season_')]
igbp_cols = [col for col in df.columns if col.startswith('igbp_')]
x = df[data_cols + igbp_cols + explanatory_cols + season_cols].values
y = df[y_variable].values

Scaling the Data#

from sklearn.preprocessing import StandardScaler

# transform data matrix so 0 mean, unit variance for each feature
X = StandardScaler().fit_transform(x)
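
For reference, the transform amounts to a per-column z-score. The sketch below reproduces it with plain NumPy on an invented matrix. It relies on StandardScaler's default behaviour of centring each column and dividing by the population standard deviation (ddof=0), which is also NumPy's default for `std`.

```python
import numpy as np

# Invented feature matrix: three observations, two features on different scales.
x_toy = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 60.0]])

# Column-wise z-score: subtract the mean, divide by the standard deviation.
mean = x_toy.mean(axis=0)
std = x_toy.std(axis=0)
X_toy = (x_toy - mean) / std
print(X_toy)
```

After scaling, both columns have mean 0 and standard deviation 1, so no feature dominates simply because of its units.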

Now we are ready to train a model to predict carbon flux globally.

Training and Testing#

We’ll shuffle the sites and select 10% of them to be used as a test set. The rest we will use for training. Note that you might get better results using leave-one-out, but since we have a large amount of data, classical validation will be much faster.

from sklearn.model_selection import GroupShuffleSplit

sep = GroupShuffleSplit(train_size=0.9, test_size=0.1)
train_idx, test_idx = next(sep.split(X, y, df.site.cat.codes.values))
train_sites = df.site.iloc[train_idx].unique()
test_sites = df.site.iloc[test_idx].unique()

train_site_metadata = metadata[metadata.site.isin(train_sites)]
test_site_metadata = metadata[metadata.site.isin(test_sites)]
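
Because the split is made on groups (sites) rather than on individual rows, no site should contribute observations to both sets. A minimal sketch on invented data checks that property. It assumes 20 hypothetical sites with 5 rows each, and `random_state` is added here only to make the sketch repeatable.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# `groups` plays the role of the site codes: 20 invented sites, 5 rows each.
groups = np.repeat(np.arange(20), 5)
X_toy = np.arange(len(groups), dtype=float).reshape(-1, 1)
y_toy = np.zeros(len(groups))

splitter = GroupShuffleSplit(n_splits=1, train_size=0.9, test_size=0.1,
                             random_state=0)
train_idx, test_idx = next(splitter.split(X_toy, y_toy, groups))

train_groups = set(groups[train_idx])
test_groups = set(groups[test_idx])
print(sorted(test_groups))
```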

Let’s make a world map showing the sites that will be used in training and those that will be used in testing:

train = train_site_metadata.hvplot.points('lon', 'lat', **args).relabel('training sites')
test  = test_site_metadata.hvplot.points( 'lon', 'lat', **args).relabel('testing sites') 

(train * test * gts.OSM).options(legend_position='top') +  veg_count(train_site_metadata) * veg_count(test_site_metadata)

This distribution seems reasonably uniform and unbiased, though a different random sampling might have allowed testing for each continent and all vegetation types.

Training the Regression Model#

We’ll construct a linear regression model, fitting it on the randomly selected training sites and holding back the test sites for evaluation.

from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X[train_idx], y[train_idx]);
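
One advantage of a linear model is that the fitted coefficients can be inspected directly. As a sketch on synthetic, noise-free data (the true weights 3, -2 and the offset 1 are invented for this example), the fit recovers the values used to generate `y_toy`:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 2))
# Invented relationship with known weights and intercept, no noise.
y_toy = 3.0 * X_toy[:, 0] - 2.0 * X_toy[:, 1] + 1.0

toy_model = LinearRegression().fit(X_toy, y_toy)
print(toy_model.coef_, toy_model.intercept_)
```

With standardized inputs, as in this notebook, the magnitude of each coefficient gives a rough indication of how strongly that feature influences the prediction.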

We’ll create a little function to look at observed vs. predicted values:

from holoviews.operation.datashader import datashade

def result_plot(predicted, observed, title, corr=None, res=0.1):
    """Plot datashaded observed vs predicted"""
    
    corr = corr if corr is not None else np.corrcoef(predicted, observed)[0][1]
    title = '{} (correlation: {:.02f})'.format(title, corr)
    scatter = hv.Scatter((predicted, observed), 'predicted', 'observed')\
                .redim.range(predicted=(observed.min(), observed.max()))
    
    return datashade(scatter, y_sampling=res, x_sampling=res).relabel(title)

(result_plot(model.predict(X[train_idx]), y[train_idx], 'Training') + \
 result_plot(model.predict(X[test_idx ]), y[test_idx],  'Testing')).options('RGB', axiswise=True, width=500)
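
The score shown in the plot titles is the Pearson correlation coefficient. The sketch below, on invented numbers, computes it with np.corrcoef and again from the definition. It also shows that the coefficient is unchanged when the predictions are rescaled or shifted, so a high correlation does not by itself rule out a biased prediction.

```python
import numpy as np

# Invented predictions and observations.
predicted = np.array([1.0, 2.0, 3.0, 4.0])
observed = np.array([1.5, 1.9, 3.2, 3.9])

corr = np.corrcoef(predicted, observed)[0, 1]

# Same quantity from the definition: covariance over the product of spreads.
p = predicted - predicted.mean()
o = observed - observed.mean()
corr_manual = (p * o).sum() / np.sqrt((p ** 2).sum() * (o ** 2).sum())

# A (positively) rescaled and shifted prediction has the same correlation.
corr_shifted = np.corrcoef(2.0 * predicted + 5.0, observed)[0, 1]
print(corr, corr_manual, corr_shifted)
```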

Prediction at test sites#

We can see how well the prediction does at each of our testing sites by making another dashboard.

results = []

for site in test_sites:
    site_test_idx = df[df.site == site].index
    y_hat_test = model.predict(X[site_test_idx])
    corr =  np.corrcoef(y_hat_test, y[site_test_idx])[0][1]
    
    results.append({'site': site,
                    'observed': y[site_test_idx], 
                    'predicted': y_hat_test, 
                    'corr': corr})

test_site_results = pd.merge(test_site_metadata, pd.DataFrame(results), 
                             on='site').set_index('site', drop=False)

Now we can set up another dashboard with just the test sites, where tapping on a given site produces a plot of the predicted vs. observed carbon flux.

First we’ll set up a timeseries function.

def timeseries_observed_vs_predicted(site=None):
    """
    Make a timeseries plot showing the predicted/observed 
    mean carbon flux at each DOY as well as the min and max
    """
    if site:
        data = df[df.site == site].assign(predicted=test_site_results.loc[site, 'predicted'])
        corr = test_site_results.loc[site, 'corr']
        title = 'Site: {}, correlation coefficient: {:.02f}'.format(site, corr)
    else:
        data = df.assign(predicted=np.nan)
        title = 'No Selection'

    # Named aggregation fixes the output column names as 'amin'/'amax'
    spread = data.groupby(['DOY', 'year'])[y_variable].mean().groupby('DOY').agg(amin='min', amax='max') \
             .hvplot.area('DOY', 'amin', 'amax', alpha=0.2, fields={'amin': 'observed'})
    observed  = data.groupby('DOY')[y_variable ].mean().hvplot().relabel('observed')
    predicted = data.groupby('DOY')['predicted'].mean().hvplot().relabel('predicted')
    
    return (spread * observed * predicted).options(width=800).relabel(title)

timeseries_observed_vs_predicted(test_sites[0])

Then we’ll set up the points colored by correlation coefficient.

test_points = test_site_results.hvplot.points('lon', 'lat', geo=True, c='corr', legend=False,
                                              cmap='coolwarm_r', s=150, height=420, width=800, 
                                              hover_cols=['vegetation', 'site']).options(
                                              tools=['tap', 'hover'], line_color='black')

And put it together into a dashboard. This will look very similar to the one above.

test_stream = Selection1D(source=test_points)

def test_site_selection(index):
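    # The selection index refers to rows of test_site_results (the plotted data),
    # whose order can differ from test_sites, so look the site up there.
    site = None if not index else test_site_results.index[index[0]]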
    return timeseries_observed_vs_predicted(site)

one_test_site = hv.DynamicMap(test_site_selection, streams=[test_stream])
title = 'Test sites colored by correlation: tap on site to plot long-term-mean timeseries'

dash = pn.Column((test_points * gts.OSM).relabel(title), one_test_site)
dash.servable()

Optional: Seasonal Prediction#

Clicking on some of the sites above suggests that prediction often works well for some months and not for others. Perhaps different variables are important for prediction, depending on the season? We might be able to achieve better results if we generate separate models for each season. First we’ll set up a function that computes prediction stats for a given training index, test index, array of X, array of y and array of seasons.

seasons = ['summer', 'fall', 'spring', 'winter']

def prediction_stats(train_idx, test_idx, X, y, season):
    """
    Compute prediction stats for equal length arrays X, y, and season
    split into train_idx and test_idx
    """
    pred = {}

    for s in seasons:
        # Row positions belonging to this season, split into train and test
        season_idx = np.flatnonzero(season == s)
        season_train_idx = np.intersect1d(season_idx, train_idx, assume_unique=True)
        season_test_idx = np.intersect1d(season_idx, test_idx, assume_unique=True)

        # Fit a separate linear model for each season
        model = LinearRegression()
        model.fit(X[season_train_idx], y[season_train_idx])

        y_hat = model.predict(X[season_test_idx])
        y_test = y[season_test_idx]
        pred[s] = {'predicted': y_hat,
                   'observed': y_test,
                   'corrcoef': np.corrcoef(y_hat, y_test)[0, 1],
                   # Store the rows actually predicted so the index aligns with y_hat
                   'test_index': season_test_idx}
    return pred
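
The index bookkeeping inside the function is the part most easily misread, so here is a minimal sketch of it on its own. The toy `season`, `train_idx`, and `test_idx` arrays and the `season_split` helper are hypothetical and exist only for illustration; the sketch shows how intersecting a season's row positions with a split's indices yields the per-season training and test rows.

```python
import numpy as np

# Hypothetical toy data: 8 rows, each tagged with a season label.
season = np.array(['summer', 'summer', 'fall', 'fall',
                   'summer', 'fall', 'summer', 'fall'])
train_idx = np.array([0, 1, 2, 3, 4, 5])   # rows used for fitting
test_idx = np.array([6, 7])                # rows held out for testing

def season_split(season, train_idx, test_idx, s):
    """Return the train and test row indices that fall in season `s`."""
    season_idx = np.flatnonzero(season == s)
    return (np.intersect1d(season_idx, train_idx, assume_unique=True),
            np.intersect1d(season_idx, test_idx, assume_unique=True))

summer_train, summer_test = season_split(season, train_idx, test_idx, 'summer')
fall_train, fall_test = season_split(season, train_idx, test_idx, 'fall')
print(summer_train, summer_test)
print(fall_train, fall_test)
```

Each season's model is then fit and evaluated only on its own rows, which is why a season with few observations in the test split can produce a noisy correlation coefficient.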

Setup Dask#

With dask, we can distribute tasks over cores and do parallel computation. For more information see https://dask.org/

from distributed import Client

client = Client()
client

Client

Client-26dc9f0d-7736-11ee-8ab4-3b8312c40499

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Now we’ll scatter our data to the workers using dask and generate 50 different train/test splits. For each split we’ll compute the prediction stats for each season.

futures = []
sep = GroupShuffleSplit(n_splits=50, train_size=0.9, test_size=0.1)

X_future = client.scatter(X)
y_future = client.scatter(y)
season_future = client.scatter(df['season'].values)

# Group by site so that all rows from one site land on the same side of each split
for train_index, test_index in sep.split(X, y, df.site.cat.codes.values):
    train_future = client.scatter(train_index)
    test_future = client.scatter(test_index)
    futures.append(client.submit(prediction_stats, train_future, test_future,
                                 X_future, y_future, season_future))
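
Passing the site codes as the third argument to `split` is what keeps every site entirely in either the training set or the test set. The following is a small sketch of that guarantee on made-up data; the 10 synthetic sites, the random `X` and `y`, and the `random_state` value are assumptions chosen only to make the example reproducible.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical toy data: 10 sites with 20 observations each.
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(10), 20)      # site code for every row
X = rng.normal(size=(groups.size, 3))
y = rng.normal(size=groups.size)

splitter = GroupShuffleSplit(n_splits=5, train_size=0.9, test_size=0.1,
                             random_state=0)

overlaps = []
for train_index, test_index in splitter.split(X, y, groups):
    train_sites = set(groups[train_index])
    test_sites = set(groups[test_index])
    # Count the sites that appear on both sides of this split
    overlaps.append(len(train_sites & test_sites))

print(overlaps)
```

Because no site contributes rows to both sides, the test scores measure how well the model transfers to sites it has never seen, which matches the notebook's goal of predicting flux at one site from the others.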

Now that we have our computations set up in dask, we can gather the results:

results = client.gather(futures)
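
For readers new to futures, the submit-then-gather pattern can be sketched with the standard library alone. This example uses `concurrent.futures` as a stand-in for the dask client, not dask itself; the `split_stat` function and the `shared` list are hypothetical placeholders for `prediction_stats` and the scattered arrays.

```python
from concurrent.futures import ThreadPoolExecutor

def split_stat(split_id, values):
    """Hypothetical stand-in for prediction_stats: one result per split."""
    return {'split': split_id, 'mean': sum(values) / len(values)}

shared = [1.0, 2.0, 3.0, 4.0]   # plays the role of the scattered X and y arrays

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately with a future for each task ...
    futures = [pool.submit(split_stat, i, shared) for i in range(5)]
    # ... and collecting the results blocks until every task has finished
    results = [f.result() for f in futures]

print(results)
```

As in the dask version, the results come back in the order the tasks were submitted, so each entry can be matched to its split.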

And consolidate the results for each season.

output = {
    s: {
        'predicted': np.concatenate([i[s]['predicted'] for i in results]),
        'observed': np.concatenate([i[s]['observed'] for i in results]),
        'test_index': np.concatenate([i[s]['test_index'] for i in results]),
        'corrcoef': np.array([i[s]['corrcoef'] for i in results])
    } for s in seasons}

hv.Layout([
    result_plot(output[s]['predicted'], output[s]['observed'], s, output[s]['corrcoef'].mean())
    for s in seasons]).cols(2).options('RGB', axiswise=True, width=400)

def helper(s):
    """Tabulate the per-split correlation coefficients for season s."""
    return pd.DataFrame({'corr': output[s]['corrcoef'], 'season': s})

corr = pd.concat(map(helper, seasons)).reset_index(drop=True)
corr.hvplot.hist(y='corr', groupby='season', bins=np.arange(0, .9, .05).tolist(), dynamic=False, width=500)

# Average only the numeric column; 'season' holds labels and cannot be averaged
corr[['corr']].astype(float).mean()
corr    0.342822
dtype: float64
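
The overall mean hides any difference between seasons. A per-season breakdown could be sketched as follows; the `corr_demo` frame and its values are made up for illustration and are not results from this notebook.

```python
import pandas as pd

# Hypothetical correlation values, two splits per season.
corr_demo = pd.DataFrame({
    'corr': [0.50, 0.30, 0.20, 0.40],
    'season': ['summer', 'summer', 'winter', 'winter'],
})

# Mean correlation within each season rather than across all of them
per_season = corr_demo.groupby('season')['corr'].mean()
print(per_season)
```

Applying the same `groupby` to the `corr` frame built above would give one mean correlation per season.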

Suggested Next Steps#

  • Can we predict carbon flux better for certain vegetation types than for others?

  • Calculate fraction of explained variance.

  • Replace each FluxNet input variable with a remotely sensed (satellite imaged) quantity to predict carbon flux globally.
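
As a starting point for the second suggestion, one common reading of "fraction of explained variance" is the coefficient of determination, R² = 1 − SS_res / SS_tot. The sketch below assumes that definition; the `r_squared` helper and the toy arrays are hypothetical and not part of the original analysis.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)         # unexplained variation
    ss_tot = np.sum((observed - observed.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot

observed = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r_squared(observed, observed)                  # model matches exactly
baseline = r_squared(observed, np.full(4, observed.mean()))  # always predicts the mean
print(perfect, baseline)
```

Unlike the correlation coefficient used above, this measure penalizes predictions that are biased or wrongly scaled, so it can be negative for a model that does worse than predicting the mean.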