API Reference

Modules

class common.bestiapop_utils.MyUtilityBeast(input_path=None)[source]

This class will provide methods to perform generic or shared operations on data

Parameters: logger (str) – A pointer to an initialized Argparse logger
download_nc4_file_from_cloud(year, climate_variable, output_path=Path().cwd(), data_source='silo', skip_certificate_checks=False)[source]

Downloads a file from an AWS S3 bucket or another cloud API

Parameters:
  • year (int) – the year we require data for. SILO stores climate data as separate years like so: daily_rain.2018.nc
  • climate_variable (str) – the climate variable short name as per SILO nomenclature, see https://www.longpaddock.qld.gov.au/silo/about/climate-variables/
  • output_path (str, optional) – The target folder where files should be downloaded. Defaults to Path().cwd().
  • skip_certificate_checks (bool, optional) – ask the requests library to skip certificate checks, useful when attempting to download files behind a proxy. Defaults to False.
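As a rough illustration of how these parameters fit together, the sketch below composes the per-year file name following the daily_rain.2018.nc pattern quoted above and resolves the target folder. The helper name plan_download and the dictionary it returns are hypothetical and not part of BestiaPop. No download is performed. The "verify" entry only mirrors how the requests library is usually told to skip certificate checks.

```python
from pathlib import Path


def plan_download(year, climate_variable, output_path=None,
                  skip_certificate_checks=False):
    """Hypothetical helper: describe a download without performing it."""
    # Fall back to the current working directory, like the documented default
    target_dir = Path(output_path) if output_path is not None else Path.cwd()
    # File name pattern taken from the example in the parameter description
    file_name = f"{climate_variable}.{year}.nc"
    return {
        "file_name": file_name,
        "target": target_dir / file_name,
        # requests accepts verify=False to skip TLS certificate checks
        "verify": not skip_certificate_checks,
    }


plan = plan_download(2018, "daily_rain", output_path="/tmp/silo")
print(plan["file_name"])  # daily_rain.2018.nc
```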
generate_climate_dataframe_from_disk(year_range, climate_variables, lat_range, lon_range, input_dir, data_source='silo')[source]

This function generates a dataframe containing (a) climate values (b) for every variable requested (c) for every day of the year (d) for every year passed in as argument. The values will be sourced from disk.

Parameters:
  • year_range (numpy.ndarray) – a numpy array with all the years for which we are seeking data.
  • climate_variables (str) – the climate variable short name as per SILO or NASAPOWER nomenclature. For SILO check https://www.longpaddock.qld.gov.au/silo/about/climate-variables/. For NASAPOWER check: XXXXX.
  • lat_range (numpy.ndarray) – a numpy array of latitude values to extract data from
  • lon_range (numpy.ndarray) – a numpy array of longitude values to extract data from
  • input_dir (str) – when selecting the option to generate Climate Data Files from local directories, this parameter must be specified, otherwise data will be fetched directly from the cloud either via an available API or S3 bucket.

Returns: a tuple consisting of (a) the final dataframe containing values for all years, latitudes and longitudes for a particular climate variable, (b) the curated list of longitude ranges (which excludes all those lon values where there were no actual data points). The tuple is ordered as follows: (final_dataframe, final_lon_range)
Return type: tuple
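The second element of the returned tuple can be pictured with a small stand-in. The sketch below is not BestiaPop code: it uses plain Python tuples in place of a dataframe and a hypothetical curate_lon_range helper to show how longitudes without any data points might be dropped, and how a caller would unpack the (final_dataframe, final_lon_range) tuple.

```python
def curate_lon_range(records, lon_range):
    """Hypothetical helper: keep only longitudes that have at least one record.

    `records` stands in for the final dataframe and is a list of
    (year, day, lat, lon, value) tuples.
    """
    lons_with_data = {lon for (_year, _day, _lat, lon, _value) in records}
    final_lon_range = [lon for lon in lon_range if lon in lons_with_data]
    return records, final_lon_range


records = [
    (2018, 1, -41.15, 145.50, 0.4),
    (2018, 2, -41.15, 145.50, 1.2),
    (2018, 1, -41.15, 145.60, 0.0),
]
# 145.55 has no records, so it is excluded from the curated range
final_dataframe, final_lon_range = curate_lon_range(
    records, [145.50, 145.55, 145.60]
)
print(final_lon_range)  # [145.5, 145.6]
```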
load_cdf_file(sourcepath, data_category, year=None)[source]

This function loads a NetCDF4 file either from the cloud or locally

Parameters:
  • sourcepath (str) – when loading a NetCDF4 file locally, this specifies the source folder. Only the “folder” must be specified; the actual file name will be further qualified by BestiaPop using the year and climate variable parameters passed to the SILO class.
  • data_category (str) – the short name variable, examples: daily_rain, max_temp, etc.
  • year (int, optional) – the year we want to extract data from, it is used to compose the final AWS S3 URL or to qualify the full path to the local NetCDF4 file we would like to load. Defaults to None.
Returns:

a dictionary containing two items: “value_array”, which is an xarray Dataset object, and “data_year”, which is the year that the NetCDF4 file contains data for, extracted by looking at the contents of the NetCDF4 file itself.

Return type:

dict

class connectors.silo_connector.SILOClimateDataConnector(climate_variables, data_source='silo', input_path=None)[source]

This class will provide methods that query and parse data from the SILO climate database

Parameters:
  • logger (str) – A pointer to an initialized Argparse logger
  • data_source (str) – The climate database where the values are being extracted from: SILO or NASAPOWER
generate_climate_dataframe_from_silo_cloud_api(year_range, climate_variables, lat_range, lon_range, input_dir)[source]

This function generates a dataframe containing (a) climate values (b) for every variable requested (c) for every day of the year (d) for every year passed in as argument. It uses the SILO API to do so.

Parameters:
  • year_range (numpy.ndarray) – a numpy array with all the years for which we are seeking data.
  • climate_variables (str) – the climate variable short name as per SILO or NASAPOWER nomenclature. For SILO check https://www.longpaddock.qld.gov.au/silo/about/climate-variables/. For NASAPOWER check: XXXXX.
  • lat_range (numpy.ndarray) – a numpy array of latitude values to extract data from
  • lon_range (numpy.ndarray) – a numpy array of longitude values to extract data from
  • input_dir (str) – when selecting the option to generate Climate Data Files from local directories, this parameter must be specified, otherwise data will be fetched directly from the cloud either via an available API or S3 bucket.
Returns:

a tuple consisting of (a) the final dataframe containing values for all years, latitudes and longitudes for a particular climate variable, (b) the curated list of longitude ranges (which excludes all those lon values where there were no actual data points). The tuple is ordered as follows: (final_dataframe, final_lon_range)

Return type:

tuple

get_yearly_data(lat, lon, value_array, year, year_range, climate_variable)[source]

Extract values from an API endpoint in the cloud or an xarray.Dataset object

Parameters:
  • lat (float) – the latitude that values should be returned for
  • lon (float) – the longitude that values should be returned for
  • value_array (xarray.Dataset) – the xarray Dataset object to extract values from
  • year (string) – the year of the file
  • climate_variable (string) – the climate variable short name as per SILO nomenclature, see https://www.longpaddock.qld.gov.au/silo/about/climate-variables/
Raises:

ValueError – if no data was available for any day under a particular combination of lat & lon, so that the total number of values collected equals 0 (meaning there was no data for that point in the grid). In this case the function returns with a “no_values” message, signalling the calling function to ignore this particular year-lat-lon combination.

Returns:

a dataframe containing 5 columns: the Julian day, the grid data value for that day, the year, the latitude, the longitude.

Return type:

pandas.core.frame.DataFrame
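The shape of the result and the no-data case can be sketched without pandas or xarray. The example below is a simplified stand-in under stated assumptions: build_yearly_rows is a hypothetical name, a list of tuples replaces the dataframe, and None marks a day without data. It only illustrates the five documented columns and the “no_values” signal, not how BestiaPop actually reads the grid.

```python
import calendar


def build_yearly_rows(lat, lon, daily_values, year):
    """Hypothetical stand-in for the documented behaviour.

    `daily_values` holds one entry per day of the year; None marks a day
    without data. Each row is (julian_day, value, year, lat, lon).
    """
    rows = [
        (day, value, int(year), lat, lon)
        for day, value in enumerate(daily_values, start=1)
        if value is not None
    ]
    if not rows:
        # No data at all for this year-lat-lon combination
        raise ValueError("no_values")
    return rows


days_in_year = 366 if calendar.isleap(2016) else 365
rows = build_yearly_rows(-41.15, 145.5, [0.0] * days_in_year, "2016")
print(len(rows), rows[0])
```

A caller would catch the ValueError and skip that grid point rather than treat it as a fatal error.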

class connectors.nasapower_connector.NASAPowerClimateDataConnector(climate_variables, data_source='silo', input_path=None)[source]

This class will provide methods that query and parse data from the NASA POWER climate database

Parameters:
  • logger (str) – A pointer to an initialized Argparse logger
  • data_source (str) – The climate database where the values are being extracted from: SILO or NASAPOWER
generate_climate_dataframe_from_nasapower_cloud_api(year_range, climate_variables, lat_range, lon_range, input_dir)[source]

This function generates a dataframe containing (a) climate values (b) for every variable requested (c) for every day of the year (d) for every year passed in as argument. It uses the NASAPOWER API to do so.

Parameters:
  • year_range (numpy.ndarray) – a numpy array with all the years for which we are seeking data.
  • climate_variables (str) – the climate variable short name as per SILO nomenclature. For SILO check https://www.longpaddock.qld.gov.au/silo/about/climate-variables/. Variable names are automatically translated from SILO to NASAPOWER codes.
  • lat_range (numpy.ndarray) – a numpy array of latitude values to extract data from
  • lon_range (numpy.ndarray) – a numpy array of longitude values to extract data from
  • input_dir (str) – when selecting the option to generate Climate Data Files from local directories, this parameter must be specified, otherwise data will be fetched directly from the cloud either via an available API or S3 bucket.
Returns:

a tuple consisting of (a) the final dataframe containing values for all years, latitudes and longitudes for a particular climate variable, (b) the curated list of longitude ranges (which excludes all those lon values where there were no actual data points). The tuple is ordered as follows: (final_dataframe, final_lon_range)

Return type:

tuple

get_yearly_data(lat, lon, value_array, year, year_range, climate_variable)[source]

Extract values from an API endpoint in the cloud or an xarray.Dataset object

Parameters:
  • lat (float) – the latitude that values should be returned for
  • lon (float) – the longitude that values should be returned for
  • value_array (xarray.Dataset) – the xarray Dataset object to extract values from
  • year (string) – the year of the file
  • climate_variable (string) – the climate variable name
Raises:

ValueError – if no data was available for any day under a particular combination of lat & lon, so that the total number of values collected equals 0 (meaning there was no data for that point in the grid). In this case the function returns with a “no_values” message, signalling the calling function to ignore this particular year-lat-lon combination.

Returns:

a dataframe containing 5 columns: the Julian day, the grid data value for that day, the year, the latitude, the longitude.

Return type:

pandas.core.frame.DataFrame

The NASA POWER database is a global database of daily weather data specifically designed for agrometeorological applications. The spatial resolution of the database is 0.5x0.5 degrees (as of 2018). For more information on the NASA POWER database see the documentation at: http://power.larc.nasa.gov/common/AgroclimatologyMethodology/Agro_Methodology_Content.html

The NASAPowerClimateDataConnector is used by BestiaPop to retrieve data from the NASA POWER database and provides functions to parse and extract relevant information from it.

Important NOTE: as per https://power.larc.nasa.gov/docs/services/api/v1/temporal/daily/, any latitude-longitude combination within a 0.5x0.5 degree grid box will yield the same weather data. Thus, there is no difference in the data returned for lat/lon -41.5/145.3 and lat/lon -41.8/145.7.

When BestiaPop requests data from NASA POWER, it automatically creates coordinate series with 1 degree jumps. So if you pass in -lat “-41.15 -55.05” the resulting series will be: [-55.05, -54.05, -53.05, -52.05, -51.05, -50.05, -49.05, -48.05, -47.05, -46.05, -45.05, -44.05, -43.05, -42.05, -41.05]. Please bear in mind that there is no difference between -41.15 and -41.05.
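The 1 degree series from the note above can be reproduced with a few lines of Python. This is only a sketch that matches the documented example: the helper name one_degree_series is hypothetical, and the rule it applies (start at the lower bound and step by 1 degree until the upper bound is reached or passed) is an assumption inferred from the example, not a description of BestiaPop's internal code. Decimal is used so that the steps do not pick up floating point noise.

```python
import math
from decimal import Decimal


def one_degree_series(first, second):
    """Hypothetical helper reproducing the documented coordinate series."""
    low, high = sorted((Decimal(first), Decimal(second)))
    # Number of 1-degree steps needed to reach or pass the upper bound
    steps = math.ceil(high - low)
    return [float(low + i) for i in range(steps + 1)]


series = one_degree_series("-41.15", "-55.05")
print(series[0], series[-1], len(series))  # -55.05 -41.05 15
```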

Base Classes

class producers.output.DATAOUTPUT(data_source)[source]

This class will provide different methods for data output from climate dataframes

Parameters:
  • logger (str) – A pointer to an initialized Argparse logger
  • data_source (str) – The climate database where the values are being extracted from: SILO or NASAPOWER
Returns:

A class object with access to DATAOUTPUT methods

Return type:

DATAOUTPUT

generate_met(outputdir, met_dataframe, lat, lon)[source]

Generate APSIM MET File

Parameters:
  • outputdir (str) – the folder where the generated MET files will be stored
  • met_dataframe (pandas.core.frame.DataFrame) – the pandas dataframe slice to convert to MET file
  • lat (float) – the latitude for which this MET file is being generated
  • lon (float) – the longitude for which this MET file is being generated
generate_output(final_daily_df, lat_range, lon_range, outputdir=None, output_type='met')[source]

Generate required Output based on Output Type selected

Parameters:
  • final_daily_df (pandas.core.frame.DataFrame) – the pandas dataframe containing all the values that are going to be parsed into a specific output
  • lat_range (numpy.ndarray) – an array of latitude values to select from the final_daily_df
  • lon_range (numpy.ndarray) – an array of longitude values to select from the final_daily_df
  • outputdir (str) – the folder that will be used to store the output files
  • output_type (str, optional) – the output type: csv (not implemented yet), json (not implemented yet), met. Defaults to “met”.
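One way to picture the role of output_type is as a small dispatch step. The sketch below is an assumption-laden illustration, not the DATAOUTPUT implementation: select_output_writer is a hypothetical function, and treating “wth” as a selectable type is an assumption based only on the existence of generate_wth in this class. It maps an output type to the name of the method that would handle it and rejects the types listed as not implemented yet.

```python
def select_output_writer(output_type="met"):
    """Hypothetical dispatcher mirroring the documented output types."""
    # Assumption: "wth" is selectable because a generate_wth method exists
    writers = {"met": "generate_met", "wth": "generate_wth"}
    # Listed in the documentation as not implemented yet
    not_implemented = {"csv", "json"}
    if output_type in not_implemented:
        raise NotImplementedError(f"{output_type} output is not implemented yet")
    if output_type not in writers:
        raise ValueError(f"unknown output type: {output_type}")
    return writers[output_type]


print(select_output_writer())  # generate_met
```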
generate_wth(outputdir, wth_dataframe, lat, lon)[source]

Generate WTH File

Parameters:
  • outputdir (str) – the folder where the generated WTH files will be stored
  • wth_dataframe (pandas.core.frame.DataFrame) – the pandas dataframe slice to convert to WTH file
  • lat (float) – the latitude for which this WTH file is being generated
  • lon (float) – the longitude for which this WTH file is being generated