pydiva
DivaSite
Bases: Enum
DivaType
Bases: Enum
DivaLevel
load_diva_data
load_diva_data(site_name: str | DivaSite, diva_type: DivaType, level: DivaLevel | None = None, start_date: str | date | None = None, end_date: str | date | None = None, base_path: str | Path | None = None) -> xr.Dataset
find_diva_dates_in_archive
find_diva_dates_in_archive(site_name: str | DivaSite, diva_type: DivaType, level: DivaLevel | None = None, start_date: str | date | None = None, end_date: str | date | None = None, base_path: str | Path | None = None) -> list[datetime.date]
ceilometer
CeilometerLevel
find_ceilometer_archive_dates
find_ceilometer_archive_dates(site_name: str, level: CeilometerLevel, start_date: str | date | None = None, end_date: str | date | None = None, base_path: str | Path | None = None) -> list[datetime.date]
load_ceilometer_geoms
load_ceilometer_geoms(site_name: str, level: CeilometerLevel, start_date: str | date | None = None, end_date: str | date | None = None, base_path: str | Path | None = None) -> xr.Dataset
Loads ceilometer GEOMS files stored in the cloud server's archive without needing to know where they are located.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `site_name` | `str` | Full name of the site you are looking for. | required |
| `level` | `CeilometerLevel` | Level of the data, one of the values enumerated in `CeilometerLevel`. | required |
| `start_date` | `str \| date \| None` | Measurement start date of the data you are looking for. | `None` |
| `end_date` | `str \| date \| None` | Measurement end date of the data you are looking for. | `None` |
| `base_path` | `str \| Path \| None` | Defaults to the ceilometer GEOMS path on the cloud server; can be adjusted if needed. | `None` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | An `xarray.Dataset` of the data in the loaded GEOMS file. |
correlation
gathering
actris_ares
ACTRIS_ARES_API_URL
module-attribute
ACTRIS_ARES_KIND
module-attribute
ACTRIS_KWARG_ALTERNATIVES
module-attribute
ACTRIS_KWARG_ALTERNATIVES = [('ewls', ['wavelength']), ('from_date', ['fromDate']), ('from_day_time', ['fromDayTime']), ('to_date', ['toDate']), ('to_day_time', ['toDayTime']), ('file_types', ['fileTypes', 'opticaltype', 'optical_type']), ('quality_control_version', ['qualityControlVersion', 'qa_version']), ('scc_version', ['sccVersion'])]
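The list above pairs each canonical keyword with the alternative spellings accepted for it. How the gatherer applies this list is not shown on this page, so the following is only a minimal sketch of one way such a mapping could be used. `normalize_kwargs` is a hypothetical helper, not part of pydiva, and rejecting duplicate spellings is an assumption.

```python
from typing import Any

# Copied from the module attribute documented above.
ACTRIS_KWARG_ALTERNATIVES = [
    ("ewls", ["wavelength"]),
    ("from_date", ["fromDate"]),
    ("from_day_time", ["fromDayTime"]),
    ("to_date", ["toDate"]),
    ("to_day_time", ["toDayTime"]),
    ("file_types", ["fileTypes", "opticaltype", "optical_type"]),
    ("quality_control_version", ["qualityControlVersion", "qa_version"]),
    ("scc_version", ["sccVersion"]),
]


def normalize_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: rename alternative keys to their canonical name."""
    # Reverse lookup table: alternative spelling -> canonical name.
    canonical_for = {
        alternative: canonical
        for canonical, alternatives in ACTRIS_KWARG_ALTERNATIVES
        for alternative in alternatives
    }
    normalized: dict[str, Any] = {}
    for key, value in kwargs.items():
        name = canonical_for.get(key, key)
        if name in normalized:
            # Assumption: giving two spellings of one parameter is an error.
            raise TypeError(f"'{name}' was given more than once")
        normalized[name] = value
    return normalized


result = normalize_kwargs(
    {"wavelength": [355, 532], "fromDate": "2020-04-01", "stations": "waw"}
)
```

Keys that have no entry in the list, such as `stations`, pass through unchanged.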
ActrisAresParameters
Bases: BaseModel
quality_control_version
class-attribute
instance-attribute
ActrisAresGatherer
Bases: Gatherer
Wrapper for the ACTRIS Ares Rest API (https://data.earlinet.org/api/services/restapi?_wadl)
Example

    from datetime import datetime, time

    ares_gatherer = ActrisAresGatherer()
    params = {
        "kind": ["cloudmask", "optical"],
        "from_date": "2020-04-01",
        "to_date": datetime.strptime("2020-04-07", "%Y-%m-%d"),
        "from_day_time": "22:00:00",
        "to_day_time": time(22, 30, 0),
        "stations": "waw",
        "ewls": [355, 532],
        "file_types": "b",
        "levels": 1.0,
    }
    ares_gatherer.fetch(**params)

Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `kind` | `str \| list[str]` | Kind of data to fetch. If `'optical'`, uses the specific endpoint, otherwise the generic one. | `'optical'` |
| `from_date` | `str \| date \| datetime` | Start of sensing time. | required |
| `to_date` | `str \| date \| datetime` | End of sensing time. | required |
| `stations` | `str \| list[str]` | ACTRIS stations from which to fetch data. | required |
| `ewls` (or `wavelength`) | `str \| int \| float \| list[Any]` | Wavelengths for which to fetch data. | required |
| `file_types` (or `optical_type`) | `str \| list[str]` | File types to fetch. | required |
| `tag` | `str \| list[str] \| None` | | `None` |
| `overwrite_cache` | `bool` | Set to `True` to redownload and overwrite existing cached files. | `False` |
Other Parameters:
| Name | Type | Description |
|---|---|---|
| `from_day_time` | `str \| time \| datetime` | Start of sensing time, as a time of day. |
| `to_day_time` | `str \| time \| datetime` | End of sensing time, as a time of day. |
| `levels` | `str \| int \| float \| list[Any]` | File levels to fetch. |
| `quality_control_version` | `str \| int` | QA version for which to fetch files. |
Note

    ActrisAresGatherer()
    ActrisAresGatherer.STATIONS
    ActrisAresGatherer.WAVELENGTHS
    ActrisAresGatherer.FILETYPES
    ActrisAresGatherer.LEVELS
    ActrisAresGatherer.QA_VERSIONS
    ActrisAresGatherer.TAGS

actris_cloudnet
ActrisCloudnetParameters
Bases: BaseModel
updated_at_from
class-attribute
instance-attribute
ActrisCloudnetGatherer
Bases: Gatherer
Wrapper for the ACTRIS Cloudnet api client (https://pypi.org/project/cloudnet-api-client/)
Example

    from datetime import datetime

    cloudnet_gatherer = ActrisCloudnetGatherer()
    params = {
        "site_id": "hyytiala",
        "date": "2021-01-01",
        "product": ["mwr", "radar"],
        "updated_at_to": datetime.strptime(
            "2025-01-01T12:00:00", "%Y-%m-%dT%H:%M:%S"
        ),
    }
    cloudnet_gatherer.fetch(**params)

Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `date` | `str \| date \| datetime` | Sensing time at this date. | required |
| `date_from` | `str \| date \| datetime` | Sensing time after this date. | required |
| `date_to` | `str \| date \| datetime` | Sensing time before this date. | required |
| `site_id` | `str \| list[str]` | ID of the site from which to fetch data. | required |
| `updated_at` | `str \| date \| datetime` | Fetch files updated on this date. | required |
| `updated_at_from` | `str \| date \| datetime` | Fetch files updated after this date. | required |
| `updated_at_to` | `str \| date \| datetime` | Fetch files updated before this date. | required |
| `product` | `str \| list[str]` | Which products to fetch. | required |
| `instrument_id` | `str \| list[str]` | ID of the instruments from which to fetch. | required |
| `instrument_pid` | `str \| list[str]` | PID of the instruments from which to fetch. | required |
| `overwrite_cache` | `bool` | Set to `True` to redownload and overwrite existing cached files. | `False` |
Note

    ActrisCloudnetGatherer()
    ActrisCloudnetGatherer.SITES
    ActrisCloudnetGatherer.PRODUCTS
    ActrisCloudnetGatherer.INSTRUMENTS

aeronet
AERONET_URL_PATTERN
module-attribute
AERONET_URL_PATTERN = '{source}/cgi-bin/{url_segment}site={site}&year={start_year}&month={start_month}&day={start_day}&year2={end_year}&month2={end_month}&day2={end_day}&{data_type}=1&AVG={averaging}&if_no_html=1'
DATA_TYPE_URLS
module-attribute
DATA_TYPE_URLS = {'ALM00': 'print_web_data_raw_sky_v3?', 'ALM15': 'print_web_data_inv_v3?product=ALL&', 'ALM20': 'print_web_data_inv_v3?product=ALL&', 'ALP00': 'print_web_data_raw_sky_v3?', 'AOD10': 'print_web_data_v3?', 'AOD15': 'print_web_data_v3?', 'AOD20': 'print_web_data_v3?', 'HYB00': 'print_web_data_raw_sky_v3?', 'HYB15': 'print_web_data_inv_v3?product=ALL&', 'HYB20': 'print_web_data_inv_v3?product=ALL&', 'HYP00': 'print_web_data_raw_sky_v3?', 'LWN10': 'print_web_data_v3?', 'LWN15': 'print_web_data_v3?', 'LWN20': 'print_web_data_v3?', 'PPL00': 'print_web_data_raw_sky_v3?', 'PPP00': 'print_web_data_raw_sky_v3?', 'SDA10': 'print_web_data_v3?', 'SDA15': 'print_web_data_v3?', 'SDA20': 'print_web_data_v3?', 'TOT10': 'print_web_data_v3?', 'TOT15': 'print_web_data_v3?', 'TOT20': 'print_web_data_v3?', 'ZEN00': 'print_web_data_zenith_radiance_v3?'}
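The two module attributes above combine into a request URL: the data type selects the endpoint segment from `DATA_TYPE_URLS`, and the remaining fields fill `AERONET_URL_PATTERN`. The sketch below shows that substitution under stated assumptions. `build_aeronet_url` is a hypothetical helper, the `source` host and the `AVG` codes (10 for all points, 20 for daily averages) are assumptions that do not appear on this page, and only a subset of the mapping is copied.

```python
from datetime import date

# Copied from the module attribute above, split across lines for readability.
AERONET_URL_PATTERN = (
    "{source}/cgi-bin/{url_segment}site={site}"
    "&year={start_year}&month={start_month}&day={start_day}"
    "&year2={end_year}&month2={end_month}&day2={end_day}"
    "&{data_type}=1&AVG={averaging}&if_no_html=1"
)

# Subset of DATA_TYPE_URLS, enough for the illustration.
DATA_TYPE_URLS = {
    "AOD15": "print_web_data_v3?",
    "ALM15": "print_web_data_inv_v3?product=ALL&",
}


def build_aeronet_url(
    site: str,
    data_type: str,
    start: date,
    end: date,
    daily_average: bool = False,
    source: str = "https://aeronet.gsfc.nasa.gov",  # assumed host
) -> str:
    """Hypothetical helper: fill the URL pattern for one request."""
    return AERONET_URL_PATTERN.format(
        source=source,
        url_segment=DATA_TYPE_URLS[data_type],
        site=site,
        start_year=start.year,
        start_month=start.month,
        start_day=start.day,
        end_year=end.year,
        end_month=end.month,
        end_day=end.day,
        data_type=data_type,
        # Assumed AVG codes: 10 = all points, 20 = daily averages.
        averaging=20 if daily_average else 10,
    )


url = build_aeronet_url("Magurele_Inoe", "AOD15", date(2023, 5, 1), date(2023, 5, 21))
```

Note that each `url_segment` already ends in `?` or `&`, which is why the pattern appends `site=` directly after it.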
AeronetGatherer
Bases: Gatherer
Fetches data for the specified arguments using the AERONET Web Service.
Example

    aeronet_gatherer = AeronetGatherer()
    aeronet_gatherer.fetch(
        site="Magurele_Inoe",
        data_type="AOD15",
        start_date="2023-05-01",
        end_date="2023-05-21",
    )

The fetched data is also stored in the caching_location by default.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `site` | | The exact site name as found in AERONET, e.g. `"Magurele_Inoe"`. | required |
| `data_type` | | The data type as found in AERONET, e.g. `"AOD20"`. | required |
| `start_date` | | Start date of the measurement points, e.g. `"2022-05-16"`. | required |
| `end_date` | | End date of the measurement points, e.g. `"2022-08-25"`. | required |
| `averaging` | | `False` for all points, `True` for daily averages. | `False` |
Other Parameters:
| Name | Type | Description |
|---|---|---|
| `overwrite_cache` | | Set to `True` to redownload and overwrite existing cached files; default is `False`. |
base
Gatherer
Bases: ABC
Abstract base class for data gathering operations.
This class defines the interface for all pydiva Gatherers that collect data from external sources like APIs, websites, or other remote services.
caching_location
instance-attribute
__init__
__init__(caching_base_location: Path | str = Path('/workspace/.cache/pydiva'), caching_name: str = 'pydiva_data', caching_sub_location: Path | str = '')
fetch
Function to call to start the data gathering for the given parameters.
Returns the collected data and writes it to disk in the caching location. To redownload existing files, set the boolean parameter `overwrite_cache` to `True`.
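The caching behaviour described for `fetch` can be illustrated with a small stand-in class. This is only a sketch under stated assumptions, not the pydiva implementation: `SketchGatherer`, `CountingGatherer`, the `_download` hook, the cache file naming, and the way the three constructor arguments are joined into `caching_location` are all hypothetical.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SketchGatherer(ABC):
    """Hypothetical stand-in for the Gatherer base class."""

    def __init__(self, caching_base_location, caching_name="pydiva_data",
                 caching_sub_location=""):
        # Assumption: the three parts are simply joined into one directory.
        self.caching_location = (
            Path(caching_base_location) / caching_name / caching_sub_location
        )

    @abstractmethod
    def _download(self, **params) -> bytes:
        """Hypothetical hook: retrieve the data from the remote source."""

    def fetch(self, overwrite_cache: bool = False, **params) -> bytes:
        self.caching_location.mkdir(parents=True, exist_ok=True)
        # Assumption: the cache file name is derived from the sorted parameters.
        file_name = "_".join(f"{k}-{v}" for k, v in sorted(params.items())) + ".bin"
        target = self.caching_location / file_name
        if target.exists() and not overwrite_cache:
            return target.read_bytes()  # cache hit, no download
        data = self._download(**params)
        target.write_bytes(data)  # write the result to the caching location
        return data


class CountingGatherer(SketchGatherer):
    """Fake gatherer that counts how often a download actually happens."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.downloads = 0

    def _download(self, **params) -> bytes:
        self.downloads += 1
        return f"payload {self.downloads}".encode()


with tempfile.TemporaryDirectory() as tmp:
    gatherer = CountingGatherer(tmp, caching_sub_location="demo")
    first = gatherer.fetch(site="waw")
    cached = gatherer.fetch(site="waw")  # served from the cache
    refreshed = gatherer.fetch(site="waw", overwrite_cache=True)
    download_count = gatherer.downloads
```

The second call reuses the cached file, while the third call downloads again because `overwrite_cache` is set.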
earthcare
Gatherer wrapper for oads-download https://github.com/koenigleon/oads-download/
FILE_TYPES
module-attribute
FILE_TYPES = ['ATL_NOM_1B', 'ATL_DCC_1B', 'ATL_CSC_1B', 'ATL_FSC_1B', 'MSI_NOM_1B', 'MSI_BBS_1B', 'MSI_SD1_1B', 'MSI_SD2_1B', 'BBR_NOM_1B', 'BBR_SNG_1B', 'BBR_SOL_1B', 'BBR_LIN_1B', 'CPR_NOM_1B', 'MSI_RGR_1C', 'AUX_MET_1D', 'AUX_JSG_1D', 'ATL_FM__2A', 'ATL_AER_2A', 'ATL_ICE_2A', 'ATL_TC__2A', 'ATL_EBD_2A', 'ATL_CTH_2A', 'ATL_ALD_2A', 'ATL_CLA_2A', 'MSI_CM__2A', 'MSI_COP_2A', 'MSI_AOT_2A', 'MSI_CLP_2A', 'CPR_FMR_2A', 'CPR_CD__2A', 'CPR_TC__2A', 'CPR_CLD_2A', 'CPR_APC_2A', 'CPR_ECO_2A', 'CPR_CLP_2A', 'AM__MO__2B', 'AM__CTH_2B', 'AM__ACD_2B', 'AC__TC__2B', 'AC__CLP_2B', 'BM__RAD_2B', 'BMA_FLX_2B', 'ACM_CAP_2B', 'ACM_COM_2B', 'ACM_RT__2B', 'ACM_CLP_2B', 'ALL_DF__2B', 'ALL_3D__2B', 'ALL_RAD_2B', 'MPL_ORBSCT', 'AUX_ORBPRE', 'AUX_ORBRES']
COLLECTIONS
module-attribute
COLLECTIONS = ['EarthCAREL1Validated', 'EarthCAREL2Validated', 'EarthCAREXMETL1DProducts10', 'JAXAL2Validated', 'EarthCAREAuxiliary', 'EarthCAREL1InstChecked', 'EarthCAREL2InstChecked', 'JAXAL2InstChecked', 'EarthCAREL0L1Products', 'EarthCAREL2Products', 'JAXAL2Products']
EarthCAREParameters
Bases: BaseModel
Pydantic model for EarthCARE download parameters. Uses validator to pass the expected types to the OADS downloader.
product_types
class-attribute
instance-attribute
radius_search
class-attribute
instance-attribute
bounding_box
class-attribute
instance-attribute
is_found_files_list_to_txt
class-attribute
instance-attribute
validate_collections
classmethod
Validate collections are in the valid list.
EarthCAREGatherer
Bases: Gatherer
Flexible wrapper for EarthCARE OADS download script.
Example

    ec_gatherer = EarthCAREGatherer(username="user", password="pass")
    file_paths = ec_gatherer.fetch(
        product_types=["ATL_NOM_1B"],
        collections=["EarthCAREL1InstChecked"],
        start_time="2024-07-31T13:00:00Z",
        ...
    )

Fetches EarthCARE satellite data from ESA's Online Access and Distribution System (OADS) using the OpenSearch API. Downloaded files are stored in a temporary directory and file paths are returned.
Note
- Standard Users: EarthCAREL1Validated, EarthCAREL2Validated, JAXAL2Validated, EarthCAREXMETL1DProducts10, EarthCAREOrbitData
- Cal/Val Users: All standard collections plus InstChecked variants and EarthCAREAuxiliary
- Commissioning Team: All standard collections plus Products and *L0L1Products variants
For more details check the EarthCARE Online Dissemination Service: https://ec-pdgs-dissemination2.eo.esa.int/oads/access/collection
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `product_types` | | List of EarthCARE products, e.g. `["ATL_NOM_1B", "MSI_NOM_1B"]`. Supports short names, e.g. `"ANOM"`, and version specification, e.g. `"ANOM:AC"`. | required |
| `collections` | | List of accessible collections, e.g. `["EarthCAREL1Validated", "EarthCAREL2Validated"]`. Must be from the valid collections list for the user's access level. | required |
Other Parameters:
| Name | Type | Description |
|---|---|---|
| `start_time` | | Start of sensing time, e.g. `"2024-07-31T13:00:00Z"`. |
| `end_time` | | End of sensing time, e.g. `"2024-07-31T14:00:00Z"`. |
| `timestamps` | | List of specific timestamps to search for. |
| `orbit_numbers` | | List of orbit numbers, e.g. `[981, 982, 983]`. |
| `frame_ids` | | List of frame IDs (A-H), e.g. `["A", "B", "C"]`. |
| `radius_search` | | Spatial search around a point, `[radius_m, lat, lon]`, e.g. `[25000, 51.35, 12.43]`. |
| `bounding_box` | | Spatial search in a box, `[latS, lonW, latN, lonE]`, e.g. `[14.9, 37.7, 14.99, 37.78]`. |
| `product_version` | | Two-letter processor baseline, e.g. `"AC"`. |
| `download_idx` | | Select a single file by index from the results. |
| `orbit_and_frames` | | Combined orbit/frame strings, e.g. `["00981E", "00982A"]`. |
| `start_orbit_number` / `end_orbit_number` | | Orbit range bounds. |
| `start_orbit_and_frame` / `end_orbit_and_frame` | | Combined orbit/frame range bounds. |
| `is_download` | | Download files; default `True`. |
| `is_unzip` | | Extract downloaded archives; default `True`. |
| `is_delete` | | Delete archives after extraction; default `True`. |
| `is_overwrite` | | Overwrite existing files; default `False`. |
| `is_create_subdirs` | | Create an organized subdirectory structure; default `True`. |
| `is_debug` | | Enable debug logging; default `False`. |
| `overwrite_cache` | | Set to `True` to redownload and overwrite existing cached files; default `False`. |
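The combined orbit/frame strings accepted by `orbit_and_frames` pair an orbit number with one frame ID from A to H. The sketch below shows how such strings could be split and rebuilt. Both helpers are hypothetical and not part of pydiva or oads-download, and the five-digit zero padding is an assumption inferred from the examples `"00981E"` and `"00982A"`.

```python
import re

# Up to five orbit digits followed by one frame ID in the range A-H.
_ORBIT_AND_FRAME = re.compile(r"(\d{1,5})([A-H])")


def parse_orbit_and_frame(value: str) -> tuple[int, str]:
    """Hypothetical helper: split e.g. '00981E' into (981, 'E')."""
    match = _ORBIT_AND_FRAME.fullmatch(value.strip().upper())
    if match is None:
        raise ValueError(f"not a valid orbit/frame string: {value!r}")
    return int(match.group(1)), match.group(2)


def format_orbit_and_frame(orbit: int, frame: str) -> str:
    """Hypothetical helper: inverse of parse_orbit_and_frame."""
    # Assumption: orbit numbers are zero-padded to five digits.
    return f"{orbit:05d}{frame}"


parsed = [parse_orbit_and_frame(item) for item in ["00981E", "00982A"]]
rebuilt = [format_orbit_and_frame(orbit, frame) for orbit, frame in parsed]
```

Validating the strings before a request makes typos such as a frame ID outside A-H fail early with a clear error.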
geoms
geoms_loader
load_grasp_geoms
load_grasp_geoms(site_name: str, start_date: str | date | None = None, end_date: str | date | None = None, base_path: str | Path | None = None) -> xr.Dataset
Loads GRASP results GEOMS files stored in the cloud server's archive without needing to know where they are located.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `site_name` | `str` | Full name of the site you are looking for. | required |
| `start_date` | `str \| date \| None` | Measurement start date of the data you are looking for. | `None` |
| `end_date` | `str \| date \| None` | Measurement end date of the data you are looking for. | `None` |
| `base_path` | `str \| Path \| None` | Defaults to the GRASP results GEOMS path on the cloud server; can be adjusted if needed. | `None` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | An `xarray.Dataset` of the data in the loaded GEOMS file. |
geoms_writer
TYPE_MAPPINGS
module-attribute
TYPE_MAPPINGS = {dtype('float32'): ('f4', 'REAL'), dtype('float64'): ('f8', 'DOUBLE'), dtype('int32'): ('i4', 'INTEGER')}
GeomsWriter
write
Write a results object to a file in the GEOMS format.
This is done in multiple steps:
1. Constant values are written
2. Dimensions and their coordinates are written
3. Data arrays are written
4. File metadata is written
write_dimensions
Create the named dimensions that will later be used for arrays in the dataset
Each dimension has a unique name and is just an integer number describing the size
Each dimension also has coordinates, which are set here as a 1D array of the dimension's size with the proper values.
datetime_to_mjd2k
Convert a datetime object to Modified Julian Date 2000 (MJD2K).
According to the GEOMS documentation:
The Modified Julian Date, MJD2K, used throughout this document is defined as follows: MJD2K is 0.000000 on January 1, 2000 at 00:00:00 UTC
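Given that definition, the conversion amounts to counting fractional days since 2000-01-01 00:00:00 UTC. The following is a minimal sketch of that arithmetic, not the pydiva implementation. The function name `datetime_to_mjd2k_sketch` is hypothetical, and treating naive datetimes as UTC is an assumption.

```python
from datetime import datetime, timezone

# MJD2K is 0.0 at this instant, per the GEOMS definition quoted above.
MJD2K_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)
SECONDS_PER_DAY = 86400.0


def datetime_to_mjd2k_sketch(moment: datetime) -> float:
    """Hypothetical helper: fractional days since 2000-01-01 00:00:00 UTC."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return (moment - MJD2K_EPOCH).total_seconds() / SECONDS_PER_DAY


epoch_value = datetime_to_mjd2k_sketch(datetime(2000, 1, 1))
noon_next_day = datetime_to_mjd2k_sketch(datetime(2000, 1, 2, 12, 0, 0))
```

Timezone-aware inputs are handled by the subtraction itself, since Python converts both operands to a common reference before taking the difference.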
plot
processing
DEFAULT_PROCESSING_SETTINGS
module-attribute
GENERAL_DEFAULT_SETTINGS
module-attribute
TYPE_MAPPINGS
module-attribute
find_available_processing_dates
find_available_processing_dates(site: str | DivaSite, start_date: str | date | None = None, end_date: str | date | None = None, photometer_base_path: str | Path | None = None, lidar_base_path: str | Path | None = None) -> list[datetime.date]
process_diva_data
process_diva_data(site: str | DivaSite, day: str | date, height_range: Sequence[int], output_path: str | Path | None = None, settings_path: str | Path | None = None, photometer_base_path: str | Path | None = None, lidar_base_path: str | Path | None = None, skip_retrieval: bool = False, sdata_dump: str | Path | None = None, print_grasp_output: bool = False) -> PixelResults | None
Run processing and GRASP retrieval for DIVA data for one site and day.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `site` | `str \| DivaSite` | Measuring site for which data should be processed. Using `pydiva.DivaSite` values is recommended. | required |
| `day` | `str \| date` | Date, or string in the format `"YYYY-MM-DD"`, of the day for which data should be processed. | required |
| `height_range` | `Sequence[int]` | Lower and upper bounds for the lidar height selection, in meters. | required |
| `output_path` | `str \| Path \| None` | (Optional) Output path for a GEOMS file created from the retrieved GRASP results. If not set, no output file is created. | `None` |
| `settings_path` | `str \| Path \| None` | (Optional) Path to a settings file for the GRASP retrieval; defaults to a DIVA template. | `None` |
| `photometer_base_path` | `str \| Path \| None` | (Optional) Base path for photometer data; defaults to its location on the cloud server. | `None` |
| `lidar_base_path` | `str \| Path \| None` | (Optional) Base path for lidar data; defaults to its location on the cloud server. | `None` |
| `skip_retrieval` | `bool` | (Optional) Set to `True` to skip the GRASP retrieval and return an empty results object. | `False` |
| `sdata_dump` | `str \| Path \| None` | (Optional) Set to a file path to dump the created sdata to that file. | `None` |
| `print_grasp_output` | `bool` | (Optional) Set to `True` to print screen output during the GRASP retrieval. | `False` |
Returns:
| Name | Type | Description |
|---|---|---|
| `PixelResults` | `PixelResults \| None` | An object which allows random access to the internal structure containing the output of the retrieval. |