API Reference
Modules
-
class
common.bestiapop_utils.
MyUtilityBeast
(input_path=None)
This class will provide methods to perform generic or shared operations on data
Parameters: logger (str) – A pointer to an initialized Argparse logger
-
download_nc4_file_from_cloud
(year, climate_variable, output_path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/bestiapop/checkouts/latest/docs'), data_source='silo', skip_certificate_checks=False)
Downloads a file from an AWS S3 bucket or another cloud API
Parameters: - year (int) – the year we require data for. SILO stores climate data as separate years like so: daily_rain.2018.nc
- climate_variable (str) – the climate variable short name as per SILO nomenclature, see https://www.longpaddock.qld.gov.au/silo/about/climate-variables/
- output_path (str, optional) – The target folder where files should be downloaded. Defaults to Path().cwd().
- skip_certificate_checks (bool, optional) – ask the requests library to skip certificate checks, useful when attempting to download files behind a proxy. Defaults to False.
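As a minimal sketch of the naming convention described above (one file per variable and year, such as daily_rain.2018.nc), the snippet below composes the file name and the local target path. The helper names silo_file_name and local_target are hypothetical and are not part of BestiaPop; only the file name pattern and the current-working-directory default come from the text above.

```python
from pathlib import Path


def silo_file_name(year, climate_variable):
    """Compose a per-year file name such as 'daily_rain.2018.nc'."""
    return f"{climate_variable}.{year}.nc"


def local_target(year, climate_variable, output_path=None):
    """Return the path a downloaded file would be saved to.

    Falls back to the current working directory when no folder is given,
    mirroring the documented default of output_path.
    """
    folder = Path(output_path) if output_path is not None else Path.cwd()
    return folder / silo_file_name(year, climate_variable)


name = silo_file_name(2018, "daily_rain")
target = local_target(2018, "daily_rain", "/tmp/silo")
```

The sketch only builds names and paths; it performs no network request.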
-
generate_climate_dataframe_from_disk
(year_range, climate_variables, lat_range, lon_range, input_dir, data_source='silo')
This function generates a dataframe containing (a) climate values (b) for every variable requested (c) for every day of the year (d) for every year passed in as argument. The values will be sourced from disk.
Parameters: - year_range (numpy.ndarray) – a numpy array with all the years for which we are seeking data.
- climate_variables (str) – the climate variable short name as per SILO or NASAPOWER nomenclature. For SILO check https://www.longpaddock.qld.gov.au/silo/about/climate-variables/. For NASAPOWER check: XXXXX.
- lat_range (numpy.ndarray) – a numpy array of latitude values to extract data from
- lon_range (numpy.ndarray) – a numpy array of longitude values to extract data from
- input_dir (str) – when selecting the option to generate Climate Data Files from local directories, this parameter must be specified, otherwise data will be fetched directly from the cloud either via an available API or S3 bucket.
Returns: a tuple consisting of (a) the final dataframe containing values for all years, latitudes and longitudes for a particular climate variable, (b) the curated list of longitude ranges (which excludes all those lon values where there were no actual data points). The tuple is ordered as follows: (final_dataframe, final_lon_range) Return type: tuple
-
load_cdf_file
(sourcepath, data_category, year=None)
This function loads a NetCDF4 file either from the cloud or locally
Parameters: - sourcepath (str) – when loading a NetCDF4 file locally, this specifies the source folder. Only the folder must be specified; BestiaPop derives the actual file name from the year and climate variable parameters passed to the SILO class.
- data_category (str) – the short name variable, examples: daily_rain, max_temp, etc.
- year (int, optional) – the year we want to extract data from, it is used to compose the final AWS S3 URL or to qualify the full path to the local NetCDF4 file we would like to load. Defaults to None.
Returns: a dictionary containing two items: “value_array”, which is an xarray Dataset object, and “data_year”, which is the year that the NetCDF4 file contains data for, extracted from the contents of the NetCDF4 file itself.
Return type: dict
-
-
class
connectors.silo_connector.
SILOClimateDataConnector
(climate_variables, data_source='silo', input_path=None)
This class will provide methods that query and parse data from the SILO climate database
Parameters: - logger (str) – A pointer to an initialized Argparse logger
- data_source (str) – The climate database where the values are being extracted from: SILO or NASAPOWER
-
generate_climate_dataframe_from_silo_cloud_api
(year_range, climate_variables, lat_range, lon_range, input_dir)
This function generates a dataframe containing (a) climate values (b) for every variable requested (c) for every day of the year (d) for every year passed in as argument. It will leverage the SILO API to do it.
Parameters: - year_range (numpy.ndarray) – a numpy array with all the years for which we are seeking data.
- climate_variables (str) – the climate variable short name as per SILO or NASAPOWER nomenclature. For SILO check https://www.longpaddock.qld.gov.au/silo/about/climate-variables/. For NASAPOWER check: XXXXX.
- lat_range (numpy.ndarray) – a numpy array of latitude values to extract data from
- lon_range (numpy.ndarray) – a numpy array of longitude values to extract data from
- input_dir (str) – when selecting the option to generate Climate Data Files from local directories, this parameter must be specified, otherwise data will be fetched directly from the cloud either via an available API or S3 bucket.
Returns: a tuple consisting of (a) the final dataframe containing values for all years, latitudes and longitudes for a particular climate variable, (b) the curated list of longitude ranges (which excludes all those lon values where there were no actual data points). The tuple is ordered as follows: (final_dataframe, final_lon_range)
Return type: tuple
-
get_yearly_data
(lat, lon, value_array, year, year_range, climate_variable)
Extract values from an API endpoint in the cloud or an xarray.Dataset object
Parameters: - lat (float) – the latitude that values should be returned for
- lon (float) – the longitude that values should be returned for
- value_array (xarray.Dataset) – the xarray Dataset object to extract values from
- year (string) – the year of the file
- variable_short_name (string) – the climate variable short name as per SILO nomenclature, see https://www.longpaddock.qld.gov.au/silo/about/climate-variables/
Raises: ValueError – if there was no data available for any day under a particular combination of lat & lon, the total number of values collected equals 0 (meaning there was no data for that point in the grid). In that case the function simply returns with a “no_values” message and signals the calling function that it should ignore this particular year-lat-lon combination.
Returns: a dataframe containing 5 columns: the Julian day, the grid data value for that day, the year, the latitude, the longitude.
Return type: pandas.core.frame.DataFrame
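The behaviour described for get_yearly_data (five values per day, and a “no_values” signal when a lat/lon point has no data at all) can be illustrated with plain Python. This is a sketch under assumptions, not BestiaPop's implementation: it uses lists of dictionaries instead of a pandas dataframe, and the names build_yearly_rows, collect and the column keys are hypothetical.

```python
import math


def build_yearly_rows(daily_values, year, lat, lon):
    """Return one row per day that has data: day of year, value, year, lat, lon."""
    rows = [
        {"day": day, "value": value, "year": year, "lat": lat, "lon": lon}
        for day, value in enumerate(daily_values, start=1)
        # None and NaN are both treated as "no data for this day"
        if value is not None and not math.isnan(value)
    ]
    if not rows:
        # No data for any day at this grid point
        raise ValueError("no_values")
    return rows


def collect(points, year):
    """Gather rows for every (lat, lon) point, skipping points with no data."""
    collected, skipped = [], []
    for (lat, lon), daily_values in points.items():
        try:
            collected.extend(build_yearly_rows(daily_values, year, lat, lon))
        except ValueError:
            skipped.append((lat, lon))
    return collected, skipped


points = {
    (-41.15, 145.5): [0.0, 1.2, float("nan")],
    (-41.15, 150.0): [None, None, None],
}
rows, skipped = collect(points, 2018)
```

Here the calling function catches the error and records the skipped point, which is one way a caller could ignore a year-lat-lon combination that has no data.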
-
class
connectors.nasapower_connector.
NASAPowerClimateDataConnector
(climate_variables, data_source='silo', input_path=None)
This class will provide methods that query and parse data from the NASA POWER climate database
Parameters: - logger (str) – A pointer to an initialized Argparse logger
- data_source (str) – The climate database where the values are being extracted from: SILO or NASAPOWER
-
generate_climate_dataframe_from_nasapower_cloud_api
(year_range, climate_variables, lat_range, lon_range, input_dir)
This function generates a dataframe containing (a) climate values (b) for every variable requested (c) for every day of the year (d) for every year passed in as argument. It will leverage the NASAPOWER API to do it.
Parameters: - year_range (numpy.ndarray) – a numpy array with all the years for which we are seeking data.
- climate_variables (str) – the climate variable short name as per SILO nomenclature. For SILO check https://www.longpaddock.qld.gov.au/silo/about/climate-variables/. Variable names are automatically translated from SILO to NASAPOWER codes.
- lat_range (numpy.ndarray) – a numpy array of latitude values to extract data from
- lon_range (numpy.ndarray) – a numpy array of longitude values to extract data from
- input_dir (str) – when selecting the option to generate Climate Data Files from local directories, this parameter must be specified, otherwise data will be fetched directly from the cloud either via an available API or S3 bucket.
Returns: a tuple consisting of (a) the final dataframe containing values for all years, latitudes and longitudes for a particular climate variable, (b) the curated list of longitude ranges (which excludes all those lon values where there were no actual data points). The tuple is ordered as follows: (final_dataframe, final_lon_range)
Return type: tuple
-
get_yearly_data
(lat, lon, value_array, year, year_range, climate_variable)
Extract values from an API endpoint in the cloud or an xarray.Dataset object
Parameters: - lat (float) – the latitude that values should be returned for
- lon (float) – the longitude that values should be returned for
- value_array (xarray.Dataset) – the xarray Dataset object to extract values from
- year (string) – the year of the file
- variable_short_name (string) – the climate variable name
Raises: ValueError – if there was no data available for any day under a particular combination of lat & lon, the total number of values collected equals 0 (meaning there was no data for that point in the grid). In that case the function simply returns with a “no_values” message and signals the calling function that it should ignore this particular year-lat-lon combination.
Returns: a dataframe containing 5 columns: the Julian day, the grid data value for that day, the year, the latitude, the longitude.
Return type: pandas.core.frame.DataFrame
The NASA POWER database is a global database of daily weather data specifically designed for agrometeorological applications. The spatial resolution of the database is 0.5x0.5 degrees (as of 2018). For more information on the NASA POWER database see the documentation at: http://power.larc.nasa.gov/common/AgroclimatologyMethodology/Agro_Methodology_Content.html
The NASAPowerClimateDataConnector is used by BestiaPop to retrieve data from the NASA POWER database and provides functions to parse and extract relevant information from it.
Important NOTE: as per https://power.larc.nasa.gov/docs/services/api/v1/temporal/daily/, any latitude-longitude combinations within a 0.5x0.5 degree grid box will yield the same weather data. Thus, there is no difference in the data returned for lat/lon -41.5/145.3 and lat/lon -41.8/145.7. When BestiaPop requests data from NASA POWER, it automatically creates coordinate series with 1 degree jumps. So if you pass in -lat “-41.15 -55.05” the resulting series will be: [-55.05, -54.05, -53.05, -52.05, -51.05, -50.05, -49.05, -48.05, -47.05, -46.05, -45.05, -44.05, -43.05, -42.05, -41.05]. Please bear in mind that there is no difference between -41.15 and -41.05.
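The 1 degree coordinate series quoted above can be reproduced with a few lines of Python. This is only a sketch chosen to match that documented output, not BestiaPop's own code: the function name one_degree_series, the rounding to two decimals and the small tolerance are assumptions.

```python
import math


def one_degree_series(first, second, step=1.0, decimals=2):
    """Return an ascending series starting at the lower bound.

    The series advances in `step` jumps and is extended until it reaches
    or passes the upper bound, so the last value may lie beyond it.
    """
    low, high = min(first, second), max(first, second)
    # Small tolerance so float noise does not add a spurious extra jump
    jumps = math.ceil((high - low) / step - 1e-9)
    return [round(low + i * step, decimals) for i in range(jumps + 1)]


series = one_degree_series(-41.15, -55.05)
```

For the bounds -41.15 and -55.05 the series starts at -55.05 and ends at -41.05, which is why the last value differs from the -41.15 that was passed in.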
Base Classes
-
class
producers.output.
DATAOUTPUT
(data_source)
This class will provide different methods for data output from climate dataframes
Parameters: - logger (str) – A pointer to an initialized Argparse logger
- data_source (str) – The climate database where the values are being extracted from: SILO or NASAPOWER
Returns: A class object with access to DATAOUTPUT methods
Return type:
-
generate_met
(outputdir, met_dataframe, lat, lon)
Generate APSIM MET File
Parameters: - outputdir (str) – the folder where the generated MET files will be stored
- met_dataframe (pandas.core.frame.DataFrame) – the pandas dataframe slice to convert to MET file
- lat (float) – the latitude for which this MET file is being generated
- lon (float) – the longitude for which this MET file is being generated
-
generate_output
(final_daily_df, lat_range, lon_range, outputdir=None, output_type='met')
Generate the required output based on the output type selected
Parameters: - final_daily_df (pandas.core.frame.DataFrame) – the pandas dataframe containing all the values that are going to be parsed into a specific output
- lat_range (numpy.ndarray) – an array of latitude values to select from the final_daily_df
- lon_range (numpy.ndarray) – an array of longitude values to select from the final_daily_df
- outputdir (str) – the folder that will be used to store the output files
- output_type (str, optional) – the output type: csv (not implemented yet), json (not implemented yet), met. Defaults to “met”.
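Because generate_met and generate_wth each receive a dataframe slice for a single lat/lon pair, generate_output presumably hands one slice per location to the writer. The sketch below shows that slicing step with plain Python rows; it is an assumption about the flow, not the library's code, and the names slice_by_location, generate_output_sketch and the writer callback are hypothetical.

```python
from itertools import product


def slice_by_location(rows, lat_range, lon_range):
    """Yield (lat, lon, rows_for_that_point) for every point that has rows."""
    for lat, lon in product(lat_range, lon_range):
        location_rows = [r for r in rows if r["lat"] == lat and r["lon"] == lon]
        if location_rows:
            yield lat, lon, location_rows


def generate_output_sketch(rows, lat_range, lon_range, writer):
    """Call `writer(slice, lat, lon)` once per location and report what was written."""
    written = []
    for lat, lon, location_rows in slice_by_location(rows, lat_range, lon_range):
        writer(location_rows, lat, lon)
        written.append((lat, lon))
    return written


rows = [
    {"day": 1, "value": 0.0, "year": 2018, "lat": -41.15, "lon": 145.5},
    {"day": 2, "value": 1.2, "year": 2018, "lat": -41.15, "lon": 145.5},
    {"day": 1, "value": 3.4, "year": 2018, "lat": -41.20, "lon": 145.5},
]
calls = []
written = generate_output_sketch(
    rows,
    lat_range=[-41.15, -41.20],
    lon_range=[145.5, 146.0],
    writer=lambda chunk, lat, lon: calls.append((lat, lon, len(chunk))),
)
```

Passing the writer as a callback keeps the slicing independent of the output format, so the same loop could serve a MET or a WTH writer.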
-
generate_wth
(outputdir, wth_dataframe, lat, lon)
Generate WTH File
Parameters: - outputdir (str) – the folder where the generated WTH files will be stored
- wth_dataframe (pandas.core.frame.DataFrame) – the pandas dataframe slice to convert to WTH file
- lat (float) – the latitude for which this WTH file is being generated
- lon (float) – the longitude for which this WTH file is being generated