Quick Start Demo of Ocean Data Gateway for finding data

Goal: to search for ocean datasets and read them in easily. The package we’ve written for this is called ocean_data_gateway; here we show a short demo.

[17]:
import ocean_data_gateway as odg
import pandas as pd
pd.set_option('display.max_rows', 5)

Find Data in a Region

Here we will search for data in the Bering Sea region.

[2]:
kw = {
    "min_lon": -180,
    "max_lon": -158,
    "min_lat": 50,
    "max_lat": 66,
    "min_time": '2021-4-1',
    "max_time": '2021-4-2',
}
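Before sending a query, it can help to sanity-check the bounding box and time range. The following is a minimal sketch using only the standard library. `check_region_kw` is a hypothetical helper written for this illustration; it is not part of ocean_data_gateway, and the package may apply its own checks.

```python
from datetime import datetime


def check_region_kw(kw):
    """Validate a region search dict and return the time span it covers.

    Hypothetical helper for illustration; not part of ocean_data_gateway.
    """
    if not -180 <= kw["min_lon"] < kw["max_lon"] <= 180:
        raise ValueError("need -180 <= min_lon < max_lon <= 180")
    if not -90 <= kw["min_lat"] < kw["max_lat"] <= 90:
        raise ValueError("need -90 <= min_lat < max_lat <= 90")
    # strptime accepts months and days without zero padding, such as '2021-4-1'
    start = datetime.strptime(kw["min_time"], "%Y-%m-%d")
    end = datetime.strptime(kw["max_time"], "%Y-%m-%d")
    if end <= start:
        raise ValueError("max_time must be later than min_time")
    return end - start


kw = {
    "min_lon": -180,
    "max_lon": -158,
    "min_lat": 50,
    "max_lat": 66,
    "min_time": '2021-4-1',
    "max_time": '2021-4-2',
}
span = check_region_kw(kw)
print(span)  # 1 day, 0:00:00
```

The search above therefore covers a single day, which keeps the downloads in the rest of the demo small.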

All the servers

Set up the search object, data, then run an initial metadata search to find the dataset_ids of the relevant datasets. We are currently searching for all variables.

[3]:
%%time

# set up the Gateway search object
data = odg.Gateway(kw=kw, approach='region')

# find dataset_ids to make sure it works
data.dataset_ids[0][:5]

CPU times: user 5.69 s, sys: 411 ms, total: 6.1 s
Wall time: 1min 38s
[3]:
['gov_noaa_nws_papb',
 'gov_noaa_uscrn_1143',
 'noaa_nos_co_ops_snda2',
 'org_mxak_nikolski',
 'nelson-lagoon-1']

The search checked each of the 5 readers for dataset_ids and found the following number of datasets in each:

[4]:
for dataset_ids in data.dataset_ids:
    print(len(dataset_ids))
217
178
1
19
0

This searches through 2 ERDDAP servers (but more can be added by the user), 2 Axiom databases, and any known local files.
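Because data.dataset_ids holds one list per reader, a per-reader summary is a short loop. The sketch below uses stand-in lists rather than a live search: the placeholder IDs are invented, and only the list lengths mirror the output above.

```python
# Stand-in for data.dataset_ids: one list of IDs per reader.
# The IDs are invented placeholders; only the lengths come from the output above.
counts = [217, 178, 1, 19, 0]
dataset_ids = [
    [f"reader{i}_dataset{j}" for j in range(n)] for i, n in enumerate(counts)
]

# Number of datasets found by each reader, keyed by the reader's position.
per_reader = {i: len(ids) for i, ids in enumerate(dataset_ids)}
# Readers that returned nothing for this region and time range.
empty_readers = [i for i, n in per_reader.items() if n == 0]
total = sum(per_reader.values())

print(per_reader)
print("total:", total, "empty readers:", empty_readers)
```

With the counts shown above this gives 415 datasets in total, with the last reader returning none.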

Just one server

Since that search took more than 1.5 min just for the dataset_ids, let’s narrow which databases are checked.

[5]:
%%time

# set up the Gateway search object
data = odg.Gateway(kw=kw, approach='region', readers=odg.erddap, erddap={'known_server': 'ioos'})

# look up dataset_ids
print(data.dataset_ids[0][:5], len(data.dataset_ids[0]))
['gov_noaa_nws_papb', 'gov_noaa_uscrn_1143', 'noaa_nos_co_ops_snda2', 'org_mxak_nikolski', 'nelson-lagoon-1'] 217
CPU times: user 18.1 ms, sys: 6.85 ms, total: 24.9 ms
Wall time: 1.1 s
[6]:
%%time
data.meta[0]
CPU times: user 315 ms, sys: 136 ms, total: 450 ms
Wall time: 11.2 s
[6]:
database download_url geospatial_lat_min geospatial_lat_max geospatial_lon_min geospatial_lon_max time_coverage_start time_coverage_end defaultDataQuery subsetVariables keywords id infoUrl institution featureType source sourceUrl variable names
org_mxak_dutch_harbor_port_of http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 53.902729 53.902729 -166.528400 -166.528400 2015-05-05T14:10:00Z 2021-05-06T18:20:00Z wind_gust_from_direction,wind_speed_qc_agg,rel... NA NA 103343 https://sensors.ioos.us/#metadata/103343/station Marine Exchange of Alaska (MXAK) TimeSeries NA https://evision.mxak.org/mxakwx/DUTCH_HARBOR_P... None
noaa_nos_co_ops_9461162 http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 51.778300 51.778300 -177.800000 -177.800000 2015-05-05T21:28:00Z 2021-05-13T09:13:00Z sea_surface_height_amplitude_due_to_geocentric... NA NA 15540 https://sensors.ioos.us/#metadata/15540/station NOAA Center for Operational Oceanographic Prod... TimeSeries NA https://sensors.axds.co/api/ None
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
gov_noaa_uscrn_1143 http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 57.160000 57.160000 -170.210000 -170.210000 2015-08-23T14:00:00Z 2021-05-06T12:00:00Z lwe_thickness_of_precipitation_amount_cm_time_... NA NA 14115 https://sensors.ioos.us/#metadata/14115/station US Climate Research Network TimeSeries NA https://sensors.axds.co/api/ None
gov_noaa_nws_papb http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 56.583333 56.583333 -169.666667 -169.666667 2015-05-05T11:53:00Z 2021-05-06T14:53:00Z air_temperature,z,wind_speed,time,relative_hum... NA NA 14088 https://sensors.ioos.us/#metadata/14088/station NOAA National Weather Service (NWS) TimeSeries NA https://sensors.axds.co/api/ None

217 rows × 18 columns
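As a quick check on the metadata, the station positions can be compared against the search box. This is a minimal sketch that copies the four visible rows by hand instead of reading them from data.meta[0]; `in_box` is a hypothetical helper, not a package function.

```python
# (latitude, longitude) for the four metadata rows visible above.
stations = {
    "org_mxak_dutch_harbor_port_of": (53.902729, -166.528400),
    "noaa_nos_co_ops_9461162": (51.778300, -177.800000),
    "gov_noaa_uscrn_1143": (57.160000, -170.210000),
    "gov_noaa_nws_papb": (56.583333, -169.666667),
}

# Same bounding box as the kw dict used for the search.
box = {"min_lon": -180, "max_lon": -158, "min_lat": 50, "max_lat": 66}


def in_box(lat, lon, box):
    """Return True if the point lies inside the box (edges included)."""
    return (
        box["min_lat"] <= lat <= box["max_lat"]
        and box["min_lon"] <= lon <= box["max_lon"]
    )


outside = [name for name, (lat, lon) in stations.items() if not in_box(lat, lon, box)]
print(outside)  # []
```

An empty list means every station shown lies inside the requested Bering Sea region.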

[7]:
%%time
data_out = data.data[0]()
CPU times: user 704 ms, sys: 57.9 ms, total: 762 ms
Wall time: 17.3 s
[8]:
data_out['noaa_nos_co_ops_nmta2']
[8]:
                           latitude (degrees_north)  longitude (degrees_east)  z (m)  air_pressure (millibars)  air_temperature (degree_Celsius)  wind_speed_of_gust (mile.hour-1)  wind_speed (m.s-1)  wind_from_direction (degrees)                                   station
time (UTC)
2021-04-02 00:00:00+00:00                      64.5                   -165.43    0.0                    1024.6                             -15.8                           10.2900                 4.1                          100.0  NMTA2 - 9468756 - Nome, Norton Sound, AK
2021-04-01 23:54:00+00:00                      64.5                   -165.43    0.0                    1024.6                             -15.9                            9.1714                 3.6                           90.0  NMTA2 - 9468756 - Nome, Norton Sound, AK
...                                             ...                       ...    ...                       ...                               ...                               ...                 ...                            ...                                       ...
2021-04-01 00:06:00+00:00                      64.5                   -165.43    0.0                    1018.1                             -14.8                           24.1590                 8.2                          320.0  NMTA2 - 9468756 - Nome, Norton Sound, AK
2021-04-01 00:00:00+00:00                      64.5                   -165.43    0.0                    1017.9                             -14.8                           26.3960                 9.8                          320.0  NMTA2 - 9468756 - Nome, Norton Sound, AK

234 rows × 9 columns
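Note that the two wind columns in this table use different units: gusts are in miles per hour and mean wind speed is in metres per second. A small conversion makes them comparable. The sketch below hard-codes the four visible rows; `mph_to_ms` is a helper written for this example.

```python
# 1 international mile = 1609.344 m and 1 hour = 3600 s, so 1 mph = 0.44704 m/s.
MPH_TO_MS = 1609.344 / 3600


def mph_to_ms(speed_mph):
    """Convert a speed from miles per hour to metres per second."""
    return speed_mph * MPH_TO_MS


# Values copied from the four rows shown above.
gusts_mph = [10.2900, 9.1714, 24.1590, 26.3960]
wind_ms = [4.1, 3.6, 8.2, 9.8]

for gust, wind in zip(gusts_mph, wind_ms):
    print(f"gust {mph_to_ms(gust):.1f} m/s, mean wind {wind:.1f} m/s")
```

After conversion the gusts come out at 4.6, 4.1, 10.8, and 11.8 m/s, each above the corresponding mean wind speed, as expected.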

One variable in one server

[9]:
%%time

# set up the Gateway search object
data = odg.Gateway(kw=kw, approach='region', readers=odg.erddap,
                   erddap={'known_server': 'ioos', 'variables': 'sea_water_temperature'})

# look up dataset_ids
print(data.dataset_ids[0][:5], len(data.dataset_ids[0]))
['noaa_nos_co_ops_9461380', 'noaa_nos_co_ops_snda2', 'noaa_nos_co_ops_9459450', 'gov_usgs_waterdata_15297610', 'gov_usgs_waterdata_15302000'] 23
CPU times: user 82.1 ms, sys: 11.8 ms, total: 93.8 ms
Wall time: 779 ms
[10]:
%%time
data.meta[0]
CPU times: user 31.2 ms, sys: 4.76 ms, total: 36 ms
Wall time: 898 ms
[10]:
database download_url geospatial_lat_min geospatial_lat_max geospatial_lon_min geospatial_lon_max time_coverage_start time_coverage_end defaultDataQuery subsetVariables keywords id infoUrl institution featureType source sourceUrl variable names
wmo_46073 http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 55.031000 55.031000 -172.00100 -172.00100 2015-05-05T12:50:00Z 2021-05-06T16:00:00Z wind_speed_of_gust,sea_surface_swell_wave_to_d... NA NA 41997 https://sensors.ioos.us/#metadata/41997/station NOAA National Data Buoy Center (NDBC) TimeSeries NA https://sensors.axds.co/api/ [sea_water_temperature]
yugayu-lake-bethel-ak http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 60.799950 60.799950 -161.76575 -161.76575 2020-10-24T18:15:00Z 2021-04-23T18:15:00Z air_temperature,sea_water_temperature,z,time&t... NA NA 105532 https://sensors.ioos.us/#metadata/105532/station Fresh Eyes on Ice TimeSeriesProfile NA https://app.beadedstream.com/projects/7604/sit... [sea_water_temperature]
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
noaa_nos_co_ops_snda2 http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 55.337000 55.337000 -160.50200 -160.50200 2015-05-05T12:06:00Z 2021-05-06T13:48:00Z air_temperature,wind_speed_of_gust,sea_water_t... NA NA 13824 https://sensors.ioos.us/#metadata/13824/station NOAA Center for Operational Oceanographic Prod... TimeSeries NA https://sensors.axds.co/api/ [sea_water_temperature]
noaa_nos_co_ops_9461380 http://erddap.sensors.ioos.us/erddap http://erddap.sensors.ioos.us/erddap/tabledap/... 51.863306 51.863306 -176.63200 -176.63200 1950-03-17T01:00:00Z 2021-05-13T19:00:00Z wind_speed_qc_agg,sea_surface_height_amplitude... NA NA 12011 https://sensors.ioos.us/#metadata/12011/station NOAA Center for Operational Oceanographic Prod... TimeSeries NA https://tidesandcurrents.noaa.gov/api/ [sea_water_temperature]

23 rows × 18 columns
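Conceptually, restricting the search to one variable keeps only the datasets whose variable list contains it. The sketch below illustrates that idea on a made-up catalog; the dataset names and variable lists are hypothetical, and the real filtering is done by the reader and the server, not by code like this.

```python
# Hypothetical catalog of dataset_id -> variable names, for illustration only.
catalog = {
    "station_a": ["air_temperature", "wind_speed"],
    "station_b": ["sea_water_temperature", "air_temperature"],
    "station_c": ["sea_water_temperature"],
}


def datasets_with(variable, catalog):
    """Return the sorted dataset IDs whose variable list contains `variable`."""
    return sorted(ds for ds, names in catalog.items() if variable in names)


matches = datasets_with("sea_water_temperature", catalog)
print(matches)  # ['station_b', 'station_c']

# Share of the 217 datasets from the earlier search that remain (23 of them).
share = 23 / 217
print(f"{share:.1%}")  # 10.6%
```

In the demo, the variable filter cut the result from 217 datasets to 23, which also shortens the metadata lookup.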

[11]:
%%time
data_out = data.data[0]()
CPU times: user 56.1 ms, sys: 5.74 ms, total: 61.9 ms
Wall time: 15.3 s
[12]:
data_out['noaa_nos_co_ops_9459450']
[12]:
                           longitude (degrees_east)  latitude (degrees_north)  sea_water_temperature (degree_Celsius)
time (UTC)
2021-04-01 00:00:00+00:00                 -160.5043                  55.33172                                     3.8
2021-04-01 00:06:00+00:00                 -160.5043                  55.33172                                     3.8
...                                             ...                       ...                                     ...
2021-04-01 23:54:00+00:00                 -160.5043                  55.33172                                     3.6
2021-04-02 00:00:00+00:00                 -160.5043                  55.33172                                     3.6

245 rows × 3 columns
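The station reports every 6 minutes, so a common next step is to aggregate to hourly means. Here is a minimal standard-library sketch of that grouping, using only the four rows visible above; with the full DataFrame a pandas resample would be the more usual route.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# (timestamp, sea_water_temperature in degree_Celsius) from the rows shown above.
rows = [
    ("2021-04-01 00:00:00+00:00", 3.8),
    ("2021-04-01 00:06:00+00:00", 3.8),
    ("2021-04-01 23:54:00+00:00", 3.6),
    ("2021-04-02 00:00:00+00:00", 3.6),
]

# Group the readings by the hour they fall in.
by_hour = defaultdict(list)
for stamp, temp in rows:
    hour = datetime.fromisoformat(stamp).replace(minute=0, second=0, microsecond=0)
    by_hour[hour].append(temp)

# Average each hour's readings, in chronological order.
hourly_mean = {hour: mean(temps) for hour, temps in sorted(by_hour.items())}

for hour, value in hourly_mean.items():
    print(hour.isoformat(), round(value, 2))
```

The four sample rows fall into three distinct hours, so the result has three entries.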

Use Local Files

Local files can be read into the gateway, which uses the Python package intake under the hood. It is set up to automatically recognize csv and netcdf files and read them in.

[13]:
filenames = ['/Users/kthyng/Downloads/ANIMIDA_III_BeaufortSea_2014-2015/kasper-netcdf/ANIMctd14.nc',
             '/Users/kthyng/Downloads/Harrison_Bay_CTD_MooringData_2014-2015/Harrison_Bay_data/SBE16plus_01604787_2015_08_09_final.csv']

data = odg.Gateway(readers=odg.local, local={'filenames': filenames})
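To illustrate what recognizing a file type can look like, here is a minimal sketch that classifies files by extension. This is an assumption made for illustration only: `guess_file_kind` and `KNOWN_SUFFIXES` are hypothetical names, and neither ocean_data_gateway nor intake is claimed to detect file types this way.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to file kind, for illustration only.
KNOWN_SUFFIXES = {".csv": "csv", ".nc": "netcdf"}


def guess_file_kind(filename):
    """Guess the file kind from the suffix; return 'unknown' if unrecognized."""
    suffix = Path(filename).suffix.lower()
    return KNOWN_SUFFIXES.get(suffix, "unknown")


# Base names of the two files used in this demo.
for name in ["ANIMctd14.nc", "SBE16plus_01604787_2015_08_09_final.csv"]:
    print(name, "->", guess_file_kind(name))
```

A suffix check is cheap but can be fooled by misnamed files, so a real reader may also inspect the file contents.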
[14]:
data.meta[0]
[14]:
geospatial_lon_max variables coords geospatial_lat_max time_coverage_end time_coverage_start geospatial_lat_min time_variable download_url catalog_dir geospatial_lon_min lon_variable lat_variable
ANIMctd14.nc -141.717438 [station_name, sal, tem, fluoro, turbidity, PA... [time, lat, lon, pressure] 71.488255 2014-08-07T21:35:54.000004381 2014-07-31T15:33:33.999999314 69.850874 time /Users/kthyng/Downloads/ANIMIDA_III_BeaufortSe... /Users/kthyng/.ocean_data_gateway/catalogs/ -152.581114 lon lat
SBE16plus_01604787_2015_08_09_final.csv -150.237 [time, latitude, longitude, water_depth, Condu... NaN 70.6349 2015-08-09T06:00:05Z 2014-08-01T12:00:05Z 70.6349 NaN /Users/kthyng/Downloads/Harrison_Bay_CTD_Moori... /Users/kthyng/.ocean_data_gateway/catalogs/ -150.237 NaN NaN
[15]:
data.data[0]()['ANIMctd14.nc']
[15]:
<xarray.Dataset>
Dimensions:              (nzmax: 1587, profile: 57)
Coordinates:
    time                 (profile) datetime64[ns] 2014-08-07T02:02:34.0000028...
    lat                  (profile) float64 71.27 71.23 71.18 ... 70.45 70.46
    lon                  (profile) float64 -152.2 -152.3 ... -145.8 -145.8
    pressure             (profile, nzmax) float64 2.187 2.399 ... -9.999e+03
Dimensions without coordinates: nzmax, profile
Data variables:
    station_name         (profile) |S12 b'1.01        ' ... b'T-XA        '
    sal                  (profile, nzmax) float64 24.85 24.85 ... -9.999e+03
    tem                  (profile, nzmax) float64 1.625 1.589 ... -9.999e+03
    fluoro               (profile, nzmax) float64 0.6842 0.7452 ... -9.999e+03
    turbidity            (profile, nzmax) float64 0.604 0.6895 ... -9.999e+03
    PAR                  (profile, nzmax) float64 9.596 9.097 ... -9.999e+03
    platform_variable    float64 9.969e+36
    instrument_variable  float64 9.969e+36
    crs                  float64 9.969e+36
Attributes: (12/35)
    Conventions:                CF-1.6
    Metadata_Conventions:       Unidata Dataset Discovery v1.0
    featureType:                profile
    cdm_data_type:              Station
    nodc_template_version:      NODC_NetCDF_Profile_Incomplete_Templete_v1.1
    standard_name_vocabulary:   NetCDF Climate and Forecast(CF) Metadata Conv...
    ...                         ...
    keywords:                   OCEAN TEMPERATURE,SALINITY,TURBIDITY,WATER PR...
    acknowledgement:            Kasper, J., CTD measurements collected from s...
    publisher_name:             Tim Whiteaker
    publisher_email:            whiteaker@utexas.edu
    publisher_url:              http://arcticstudies.org/animida_iii
    license:                    Creative Commons Attribution 3.0 United State...
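Several variables in this dataset end in values displayed as -9.999e+03, which look like fill values padding the shorter profiles. The sketch below shows one way to mask them before computing statistics. It assumes the sentinel is exactly -9999.0, which the rounded display does not confirm, and works on a plain list rather than the xarray object.

```python
import math

# Assumption: the sentinel behind the displayed -9.999e+03 is exactly -9999.0.
FILL_VALUE = -9999.0


def mask_fill(values, fill=FILL_VALUE):
    """Replace fill values with NaN so they are easy to exclude later."""
    return [math.nan if v == fill else v for v in values]


# Illustrative profile: two salinity values shown above, then padding.
profile = [24.85, 24.85, FILL_VALUE, FILL_VALUE]
masked = mask_fill(profile)
valid = [v for v in masked if not math.isnan(v)]

print(masked)
print("valid samples:", len(valid))
```

With xarray the same idea is usually expressed with the dataset's `where` method, once the actual fill value has been checked against the file's attributes.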
[16]:
data.data[0]()['SBE16plus_01604787_2015_08_09_final.csv']
[16]:
                      time  latitude  longitude  water_depth  Conductivity_[S/m]  Pressure_[db]  Temperature_ITS90_[deg C]  Salinity_Practical_[PSU]  Voltage0_[volts]  Instrument_Time_[juliandays]  flag
0     2014-08-01T12:00:05Z   70.6349   -150.237         13.0            2.495646         12.687                    -1.4619                   31.0905            0.3091                    213.500058   0.0
1     2014-08-01T13:00:05Z   70.6349   -150.237         13.0            2.495454         12.699                    -1.4595                   31.0854            0.3265                    213.541725   0.0
...                    ...       ...        ...          ...                 ...            ...                        ...                       ...               ...                           ...   ...
8945  2015-08-09T05:00:05Z   70.6349   -150.237         13.0            2.591448         12.777                     0.3619                   30.5086            0.3873                    586.208391   0.0
8946  2015-08-09T06:00:05Z   70.6349   -150.237         13.0            2.585462         12.754                     0.2862                   30.5062            0.2441                    586.250058   0.0

8947 rows × 11 columns
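The Instrument_Time_[juliandays] column appears to count days from the start of 2014, with day 1.0 at 00:00 on 1 January and the count continuing past 365 into 2015. That reading is inferred from the first and last rows above, not from documentation of the file. Under that assumption, the conversion to a timestamp can be sketched as follows; `day_count_to_datetime` is a helper written for this example.

```python
from datetime import datetime, timedelta, timezone


def day_count_to_datetime(day, year):
    """Convert a fractional day count to a UTC datetime.

    Assumes day 1.0 is 00:00 UTC on 1 January of `year`.
    """
    return datetime(year, 1, 1, tzinfo=timezone.utc) + timedelta(days=day - 1)


# Instrument times from the first and last rows of the table above.
first = day_count_to_datetime(213.500058, 2014).replace(microsecond=0)
last = day_count_to_datetime(586.250058, 2014).replace(microsecond=0)

print(first.isoformat())  # 2014-08-01T12:00:05+00:00
print(last.isoformat())   # 2015-08-09T06:00:05+00:00
```

Both results agree with the time column of the same rows, which supports the assumed convention for this file.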
