Grid Workflow
- class datacube.api.GridWorkflow(index, grid_spec=None, product=None)
GridWorkflow deals with cell- and tile-based processing using a grid defining a projection and resolution.
Use GridWorkflow to specify your desired output grid. The methods list_cells() and list_tiles() query the index and return a dictionary of cell or tile keys, each mapping to a Tile object.
The Tile object can then be used to load the data without needing the index, and can be serialized for use with the distributed package.
Create a grid workflow tool.
Either grid_spec or product must be supplied.
- Parameters:
index (AbstractIndex) – The database index to use.
grid_spec (GridSpec | None) – The grid projection and resolution.
product (Product | str | None) – The name of an existing product, if no grid_spec is supplied.
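The workflow described above (build the tool, list tiles, then load one) can be sketched as follows. This is a minimal illustration, not a canonical recipe: it assumes a configured datacube database, reuses the product name and time range from the examples on this page, and the function name `load_first_tile` and the app label are made up for the example. The datacube imports sit inside the function so the snippet can be defined without a database connection.

```python
def load_first_tile(product="ls5_nbar_albers",
                    time=("2001-1-1 00:00:00", "2001-3-31 23:59:59"),
                    measurements=None):
    """Hypothetical helper: list tiles for a product and load one of them."""
    # Imported lazily so that defining this function needs no datacube install.
    from datacube import Datacube
    from datacube.api import GridWorkflow

    dc = Datacube(app="gridworkflow-sketch")  # app name is an arbitrary label

    # No grid_spec is given, so the grid is taken from the named product.
    gw = GridWorkflow(dc.index, product=product)

    # Keys are tile indexes, values are Tile objects.
    tiles = gw.list_tiles(product=product, time=time)
    if not tiles:
        return None

    # Take any one tile; load() is static and does not need the index.
    tile = next(iter(tiles.values()))
    return GridWorkflow.load(tile, measurements=measurements)
```

Calling `load_first_tile()` requires a reachable index that actually contains the named product; with no matching tiles it returns None.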
- Members:
- cell_observations(cell_index=None, geopolygon=None, tile_buffer=None, **indexers)
List datasets, grouped by cell.
- Parameters:
geopolygon (Geometry | None) – Only return observations with data inside polygon.
tile_buffer (tuple[float, float] | None) – Buffer tiles by (y, x) in CRS units.
cell_index (tuple[int, int] | None) – The cell index. E.g. (14, -40)
indexers (str | float | int | Range | datetime | Not) – Query to match the datasets, see datacube.api.query.Query
- Returns:
A dictionary of cell index (int, int) mapping to a dict containing two keys, “datasets”, with a list of datasets, and “geobox”, containing the geobox for the cell.
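To make the return structure concrete, the sketch below walks over a dictionary shaped like the one described above. The data is a hand-built stand-in (plain strings in place of real dataset and geobox objects), and `count_datasets_per_cell` is a hypothetical helper, not part of datacube.

```python
def count_datasets_per_cell(observations):
    """Return {cell_index: number of datasets} for a cell_observations()-style dict."""
    return {cell: len(entry["datasets"]) for cell, entry in observations.items()}


# Hand-built stand-in for the result of cell_observations().
# Real values would hold dataset objects and a geobox, not strings.
fake_observations = {
    (14, -40): {"datasets": ["ds_a", "ds_b"], "geobox": "geobox for (14, -40)"},
    (15, -40): {"datasets": ["ds_c"], "geobox": "geobox for (15, -40)"},
}

counts = count_datasets_per_cell(fake_observations)
print(counts)
```

The same loop shape applies to a real result, since only the two documented keys, "datasets" and "geobox", are used.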
- static group_into_cells(observations, group_by)
Group observations into a stack of source tiles.
- Parameters:
observations – datasets grouped by cell index, like from cell_observations()
group_by (GroupBy) – grouping method, as returned by datacube.api.query.query_group_by()
- Returns:
tiles grouped by cell index
- list_cells(cell_index=None, **query)
List cells that match the query.
Returns a dictionary of cell indexes to Tile objects.
Cells are included if they contain any datasets that match the query, which uses the same format as datacube.Datacube.load().
E.g.:
gw.list_cells(product='ls5_nbar_albers', time=('2001-1-1 00:00:00', '2001-3-31 23:59:59'))
- list_tiles(cell_index=None, **query)
List tiles of data, sorted by cell.
tiles = gw.list_tiles(product='ls5_nbar_albers', time=('2001-1-1 00:00:00', '2001-3-31 23:59:59'))
The values can be passed to load().
- Parameters:
cell_index (tuple[int, int] | None) – The cell index (optional). E.g. (14, -40)
query – see datacube.api.query.Query
- Return type:
dict[tuple[int, int, datetime64], Tile]
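Because each key returned by list_tiles() combines a cell index with a time, it is often convenient to regroup the result by cell. The sketch below does this on a hand-built dictionary. It is an illustration only: the keys use `datetime.date` as a stand-in for datetime64, the values are placeholder strings rather than Tile objects, and `tiles_by_cell` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date


def tiles_by_cell(tiles):
    """Regroup {(i, j, time): tile} into {(i, j): [(time, tile), ...]}."""
    by_cell = defaultdict(list)
    # Sorting the keys keeps each cell's entries in time order.
    for key in sorted(tiles):
        cell, time = key[:2], key[2]
        by_cell[cell].append((time, tiles[key]))
    return dict(by_cell)


# Stand-in for the output of list_tiles(); real values are Tile objects.
fake_tiles = {
    (14, -40, date(2001, 2, 1)): "tile_b",
    (14, -40, date(2001, 1, 1)): "tile_a",
    (15, -40, date(2001, 1, 1)): "tile_c",
}

grouped = tiles_by_cell(fake_tiles)
print(grouped[(14, -40)])
```

Each regrouped value can then be handed to load() one tile at a time, for example by a separate worker per cell.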
- static load(tile, measurements=None, dask_chunks=None, fuse_func=None, resampling=None, skip_broken_datasets=False)
Load data for a cell/tile.
The data to be loaded is defined by the output of list_tiles().
This is a static function and does not use the index. This can be useful when running as a worker in a distributed environment and you wish to minimize database connections.
- Parameters:
tile (Tile) – The tile to load.
measurements (Iterable[str] | None) – The names of measurements to load.
dask_chunks (dict[str, str | int] | None) – If the data should be loaded as needed using dask.array.Array, specify the chunk size in each output direction. See the documentation on using xarray with dask for more information.
fuse_func – Function to fuse together a tile that has been pre-grouped by calling list_cells() with a group_by parameter.
resampling (str | dict | None) – The resampling method to use if re-projection is required. It can be configured per band using a dictionary (see load_data()). Valid values are: 'nearest', 'cubic', 'bilinear', 'cubic_spline', 'lanczos', 'average'. Defaults to 'nearest'.
skip_broken_datasets (bool) – If True, ignore broken datasets and continue processing with the data that can be loaded. If False, an exception will be raised on a broken dataset. Defaults to False.
- Return type:
xarray.Dataset
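Since resampling may be either a single method name or a per-band dictionary, a small pre-flight check can catch misspelled names before any data is read. The sketch below is an assumption-laden illustration: `check_resampling` is a hypothetical helper, the band names are invented, and the set of names is copied from the list on this page, so the library itself may accept values not shown here.

```python
# Method names as listed in this page's description of the resampling parameter.
VALID_RESAMPLING = {"nearest", "cubic", "bilinear", "cubic_spline", "lanczos", "average"}


def check_resampling(resampling):
    """Return the method names not found in VALID_RESAMPLING (empty list if none)."""
    if resampling is None:
        return []  # nothing to check; the documented default is 'nearest'
    if isinstance(resampling, str):
        methods = [resampling]
    else:
        methods = list(resampling.values())  # per-band dictionary
    return [name for name in methods if name not in VALID_RESAMPLING]


# Band names here are invented for the example.
per_band = {"red": "cubic", "pixelquality": "nearest"}
print(check_resampling(per_band))
print(check_resampling("bicubic"))
```

An empty list means every requested method appears in the documented set; anything else names the entries worth double-checking before calling load().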
- static tile_sources(observations, group_by)
Split observations into tiles and group into source tiles.
- Parameters:
observations – datasets grouped by cell index, like from cell_observations()
group_by (GroupBy) – grouping method, as returned by datacube.api.query.query_group_by()
- Return type:
dict[tuple[int, int, datetime64], Tile]
- Returns:
tiles grouped by cell index and time