ocean_data_gateway.readers.axds.region

class ocean_data_gateway.readers.axds.region(kwargs)

Bases: ocean_data_gateway.readers.axds.AxdsReader

Inherits from AxdsReader to search over a region of space and time.

kw

Contains space and time search constraints: min_lon, max_lon, min_lat, max_lat, min_time, max_time.

Type

dict
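A minimal sketch of assembling the kw dictionary described above. Only the six key names come from the description; the helper name make_region_kw, the bounds check, the coordinate values, and the date-string format are assumptions made for illustration.

```python
def make_region_kw(min_lon, max_lon, min_lat, max_lat, min_time, max_time):
    """Build a search-constraint dict with the six documented keys.

    Hypothetical helper: the bounds check is an assumption added for
    illustration and is not taken from the library.
    """
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("minimum bound exceeds maximum bound")
    return {
        "min_lon": min_lon,
        "max_lon": max_lon,
        "min_lat": min_lat,
        "max_lat": max_lat,
        "min_time": min_time,
        "max_time": max_time,
    }


# Illustrative values only; the time-string format is assumed.
kw = make_region_kw(-124.0, -123.0, 39.0, 40.0, "2021-04-01", "2021-04-02")
```

A dict like this would be passed as the kw entry of kwargs, alongside variables and any AxdsReader options.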

variables

Variable names, if you want to limit the search to them. Behavior depends on axds_type:

  • 'platform2': the variable name or names must come from the list available in odg.all_variables('axds') and pass the check in odg.check_variables('axds', variables).

  • 'layer_group': the variable name or names are searched for as a free-text query, so there is no fixed list to choose from; experiment with names to see what matches.

Alternatively, if the user inputs criteria, variables can be a list of the keys from criteria.

Type

string or list

criteria

A dictionary describing how to recognize variables by their name and attributes, using regular expressions, for use with cf-xarray. It can be a local dictionary or a URL pointing to a nonlocal gist. This is required for running QC in Gateway. For example:

>>> my_custom_criteria = {"salt": {
... "standard_name": "sea_water_salinity$|sea_water_practical_salinity$",
... "name": "(?i)sal$|(?i)s.sea_water_practical_salinity$"}}

Type

dict, str, optional
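To show what regular-expression criteria like these select, here is a small sketch using only Python's re module. It is not how cf-xarray is implemented. The function name match_keys is hypothetical, and case-insensitivity is passed as a flag (applied to every pattern for simplicity) rather than written inline as (?i), since recent Python versions reject inline global flags that are not at the start of a pattern.

```python
import re

# Patterns modeled on the example above, with the inline (?i) removed.
criteria = {
    "salt": {
        "standard_name": r"sea_water_salinity$|sea_water_practical_salinity$",
        "name": r"sal$|s.sea_water_practical_salinity$",
    }
}


def match_keys(criteria, name=None, standard_name=None):
    """Return the criteria keys whose regexes match the given attributes."""
    candidates = {"name": name, "standard_name": standard_name}
    matched = []
    for key, patterns in criteria.items():
        for attr, pattern in patterns.items():
            value = candidates.get(attr)
            # A key matches as soon as any one of its attribute patterns matches.
            if value is not None and re.search(pattern, value, flags=re.IGNORECASE):
                matched.append(key)
                break
    return matched


by_name = match_keys(criteria, name="Sal")
by_standard_name = match_keys(criteria, standard_name="sea_water_practical_salinity")
no_match = match_keys(criteria, standard_name="sea_water_temperature")
```

Here by_name and by_standard_name both contain "salt", while no_match is empty.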

var_def

A dictionary with the same keys as criteria (criteria can have more) that describes QC definitions and units. It should include the variable units, fail_span, and suspect_span. For example:

>>> var_def = {"salt": {"units": "psu",
... "fail_span": [-10, 60], "suspect_span": [-1, 45]}}

Type

dict, optional
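The two spans in var_def can be read as nested acceptance ranges. The sketch below classifies a value under assumed semantics: outside fail_span fails, inside fail_span but outside suspect_span is suspect, and anything else passes. The function name classify and the string labels are hypothetical; the QC actually run by Gateway may differ.

```python
var_def = {"salt": {"units": "psu", "fail_span": [-10, 60], "suspect_span": [-1, 45]}}


def classify(value, spans):
    """Label a value 'fail', 'suspect', or 'pass' against the two spans.

    Assumed semantics, not taken from the library's QC code.
    """
    fail_lo, fail_hi = spans["fail_span"]
    suspect_lo, suspect_hi = spans["suspect_span"]
    if value < fail_lo or value > fail_hi:
        return "fail"
    if value < suspect_lo or value > suspect_hi:
        return "suspect"
    return "pass"


labels = [classify(v, var_def["salt"]) for v in (35.0, 50.0, 75.0)]
```

With the spans above, labels is ["pass", "suspect", "fail"].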

approach

approach is defined as ‘region’ for this class.

Type

string

num_variables

Number of variables stored in self.variables. This is set initially and if self.variables is modified, this is updated accordingly. If variables is None, num_variables==0.

Type

int
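One way a count can stay in step with a variables attribute is a property setter. The class below is a hypothetical stand-in, not the reader's implementation; wrapping a single string in a list is also an assumption.

```python
class VariableHolder:
    """Hypothetical stand-in: num_variables tracks assignments to variables."""

    def __init__(self, variables=None):
        self.variables = variables  # routed through the setter below

    @property
    def variables(self):
        return self._variables

    @variables.setter
    def variables(self, value):
        # Accept a single name or a list of names.
        if isinstance(value, str):
            value = [value]
        self._variables = value
        # None means no variable filter, so the count is zero.
        self.num_variables = 0 if value is None else len(value)


holder = VariableHolder()
holder.variables = ["salt", "temp"]
```

In this sketch only reassignment updates the count; an in-place change such as holder.variables.append(...) would not trigger the setter.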

Attributes
catalog

Write then open the catalog.

dataset_ids

Find dataset_ids for server.

meta

Rearrange the individual metadata into a dataframe.

search_results

Loop over self.urls to read in search results.

urls

Return a list of search urls.

Methods

clear()

data([dataset_ids])

Read in data for some or all dataset_ids.

data_by_dataset(dataset_id)

Return the data for a single dataset_id.

get(k[,d])

items()

keys()

Regular dict-like way to return keys.

meta_by_dataset(dataset_id)

Return the catalog metadata for a single dataset_id.

pop(k[,d])

Remove the specified key and return its value. If key is not found, d is returned if given, otherwise KeyError is raised.

popitem()

Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.

save()

Save datasets locally.

setdefault(k[,d])

update([E, ]**F)

If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k, v in F.items(): D[k] = v

url_builder(url_base[, dataset_id, …])

Build an individual search url.

url_dataset_id(dataset_id)

url modification to search for known dataset_id.

url_query(query)

url modification to add query field.

url_region()

url modification to add spatial search box.

url_time()

url modification to add time filtering.

url_variable(variable)

url modification to add variable search.

values()

Regular dict-like way to return values.

write_catalog()

Write catalog file.

write_catalog_layer_group_entry(dataset, …)

Write part of catalog in case of layer_group.

__init__(kwargs)
Parameters

kwargs (dict) –

Can contain arguments to pass onto the base AxdsReader class (catalog_name, parallel, axds_type). The dict entries to initialize this class are:

  • kw: dict Contains space and time search constraints: min_lon, max_lon, min_lat, max_lat, min_time, max_time.

  • variables: string or list, optional. Variable names, if you want to limit the search to them. Behavior depends on axds_type:

    • 'platform2': the variable name or names must come from the list available in odg.all_variables('axds') and pass the check in odg.check_variables('axds', variables).

    • 'layer_group': the variable name or names are searched for as a free-text query, so there is no fixed list to choose from; experiment with names to see what matches.

    Alternatively, if the user inputs criteria, variables can be a list of the keys from criteria.

  • criteria: dict, optional. A dictionary describing how to recognize variables by their name and attributes, using regular expressions, for use with cf-xarray. It can be a local dictionary or a URL pointing to a nonlocal gist. This is required for running QC in Gateway. For example:

    >>> my_custom_criteria = {"salt": {
    ... "standard_name": "sea_water_salinity$|sea_water_practical_salinity$",
    ... "name": "(?i)sal$|(?i)s.sea_water_practical_salinity$"}}

  • var_def: dict, optional. A dictionary with the same keys as criteria (criteria can have more) that describes QC definitions and units. It should include the variable units, fail_span, and suspect_span. For example:

    >>> var_def = {"salt": {"units": "psu",
    ... "fail_span": [-10, 60], "suspect_span": [-1, 45]}}

_abc_impl = <_abc_data object>
property catalog

Write then open the catalog.

clear() → None.  Remove all items from D.
data(dataset_ids=None)

Read in data for some or all dataset_ids.

NOT USED CURRENTLY

Once data has been read in for a dataset_id, it is remembered.

See full documentation in utils.load_data().

data_by_dataset(dataset_id)

Return the data for a single dataset_id.

Returns

A tuple of (dataset_id, data), where the type of data depends on self.axds_type:

  • If self.axds_type=='platform2': a pandas DataFrame

  • If self.axds_type=='layer_group': an xarray Dataset

Notes

Read behavior depends on axds_type:

  • If self.axds_type=='platform2': data is read into memory with dask.

  • If self.axds_type=='layer_group': data is pointed to with dask but nothing is read in except metadata associated with the xarray Dataset.

property dataset_ids

Find dataset_ids for server.

Notes

The dataset_ids are read from the catalog, so the catalog is created before this can happen.

The number of dataset_ids can change if a variable is removed from the list of variables and this is rerun.

get(k[, d]) → D[k] if k in D, else d.  d defaults to None.
items() → a set-like object providing a view on D’s items
keys()

Regular dict-like way to return keys.

property meta

Rearrange the individual metadata into a dataframe.

meta_by_dataset(dataset_id)

Return the catalog metadata for a single dataset_id.

TO DO: Should this return intake-style or a row of the metadata dataframe?

pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() → (k, v), remove and return some (key, value) pair

as a 2-tuple; but raise KeyError if D is empty.

save()

Save datasets locally.

property search_results

Loop over self.urls to read in search results.

Notes

The logic removes duplicate searches. This returns a dict of the datasets from the search results, keyed by dataset_id. The meaning of dataset_id depends on self.axds_type:

  • self.axds_type == “platform2”: dataset_id is the uuid

  • self.axds_type == “layer_group”: dataset_id is the module_uuid since multiple layer_groups can be linked under one module_uuid
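The keyed result dict described above could be assembled along these lines. This is a sketch under stated assumptions: result_pages is a hypothetical list of per-URL result lists, the "uuid" and "module_uuid" dict fields mirror the names in the notes but their exact location in a real response is assumed, and collect_results is not a library function.

```python
def collect_results(result_pages, axds_type):
    """Merge per-URL search results into one dict keyed by dataset_id.

    Hypothetical helper; field names are assumed for illustration.
    """
    id_field = "uuid" if axds_type == "platform2" else "module_uuid"
    merged = {}
    for page in result_pages:
        for result in page:
            # setdefault keeps the first entry seen for each dataset_id,
            # so a dataset returned by several searches appears once.
            merged.setdefault(result[id_field], result)
    return merged


pages = [
    [{"uuid": "a1", "label": "first"}, {"uuid": "b2", "label": "second"}],
    [{"uuid": "a1", "label": "repeat"}],
]
results = collect_results(pages, "platform2")
```

Here results has the two keys "a1" and "b2", and the repeated "a1" entry from the second page is dropped.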

setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
update([E, ]**F) → None.  Update D from mapping/iterable E and F.

If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k, v in F.items(): D[k] = v
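The three cases above can be seen with a plain dict, used here as a stand-in on the assumption that the reader's inherited update behaves the same way.

```python
d = {"a": 1}

# E has a .keys() method: entries are copied key by key.
d.update({"b": 2})

# E lacks .keys(): it is treated as an iterable of (key, value) pairs.
d.update([("c", 3), ("a", 10)])

# Keyword arguments F are applied last, so they override E.
d.update({"b": 20}, b=200)
```

After these calls d is {"a": 10, "b": 200, "c": 3}.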

url_builder(url_base, dataset_id=None, add_region=False, add_time=False, variable=None, query=None)

Build an individual search url.

Parameters
  • url_base (string) – There are 2 possible bases for the url:

    • self.url_axds_type, for searching

    • self.url_docs_base, for selecting a known dataset by dataset_id

  • dataset_id (string, optional) – dataset_id of station, if known.

  • add_region (boolean, optional) – True to filter the search by lon/lat box. Requires self.kw that contains keys min_lon, max_lon, min_lat, max_lat.

  • add_time (boolean, optional) – True to filter the search by time range. Requires self.kw that contains keys min_time and max_time.

  • variable (string, optional) – String of variable description to filter by, if desired. If axds_type=='platform2', list the available variable names with odg.all_variables('axds'), search them by string with odg.search_variables('axds', variables), and check your variable list with odg.check_variables('axds', variables). If axds_type=='layer_group', there is no official variable list; pass a basic variable name, which is used as a free-text search and may not match.

  • query (string, optional) – This could be any search query you want, but it is used in the code to search for station names (not dataset_ids).

Returns

Url for search.
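The way optional filters combine into one search url can be sketched as follows. Everything specific here is an assumption: the function build_search_url, the example.com base, and the query-parameter names (bbox, start, end, variable, query) are hypothetical and are not the parameters the real search service uses.

```python
from urllib.parse import urlencode


def build_search_url(url_base, kw=None, add_region=False, add_time=False,
                     variable=None, query=None):
    """Compose a search url from optional filters (hypothetical parameter names)."""
    params = {}
    if add_region:
        # Needs the four spatial keys in kw.
        params["bbox"] = ",".join(
            str(kw[k]) for k in ("min_lon", "min_lat", "max_lon", "max_lat")
        )
    if add_time:
        # Needs the two time keys in kw.
        params["start"] = kw["min_time"]
        params["end"] = kw["max_time"]
    if variable is not None:
        params["variable"] = variable
    if query is not None:
        params["query"] = query
    if not params:
        return url_base
    return f"{url_base}?{urlencode(params)}"


kw = {"min_lon": -124.0, "max_lon": -123.0, "min_lat": 39.0, "max_lat": 40.0,
      "min_time": "2021-04-01", "max_time": "2021-04-02"}
url = build_search_url("https://example.com/search", kw, add_region=True, add_time=True)
```

Each flag adds its own fragment independently, which mirrors how the url_region, url_time, url_variable, and url_query helpers each contribute one modification to the url.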

url_dataset_id(dataset_id)

url modification to search for known dataset_id.

Parameters

dataset_id (string) – String of dataset_id to exactly match.

Returns

Modification for url to search for dataset_id.

url_query(query)

url modification to add query field.

Parameters

query (string) – String to query for. Can be multiple words.

Returns

Modification for url to add query field.

url_region()

url modification to add spatial search box.

Returns

Modification for url to add lon/lat filtering.

Notes

Uses the kw dictionary already stored in the class object to access the spatial limits of the box.

url_time()

url modification to add time filtering.

Returns

Modification for url to add time filtering.

Notes

Uses the kw dictionary already stored in the class object to access the time limits of the search.

url_variable(variable)

url modification to add variable search.

Parameters

variable (string) – String to search for.

Returns

Modification for url to add variable search.

Notes

This variable search is specifically by parameter group and only works for axds_type='platform2'. For axds_type='layer_group', use url_query with the variable name.

property urls

Return a list of search urls.

Notes

Use this through the class methods region or stations to put together the search urls to represent the basic reader setup.

values()

Regular dict-like way to return values.

write_catalog()

Write catalog file.

write_catalog_layer_group_entry(dataset, dataset_id, urlpath, layer_groups)

Write part of catalog in case of layer_group.

Notes

This is used to manage the logic for axds_type=’layer_group’ in which the module is being linked to the set of layer_groups.