-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor 654 zonal mean xy #752
Refactor 654 zonal mean xy #752
Conversation
The components to refactor include the driver and the plotter.
File 2, zonal_mean_xy_plot.py
|
0dfc01d
to
d1aec82
Compare
With the new commit 341ac3c, there are two problems:
|
2808fb1
to
25747e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @chengzhuzhang, I addressed your recent issues in this comment.
I also rebased this branch on the latest version of cdat-migration-fy24
, which I squashed down into less commits. If you have any un-committed changes, can you stash then, check out the latest version of this branch, then pop them back on top? Thanks
e101244
to
6bbc864
Compare
@tomvothecoder thank you for fixing this branch! For the excessive
I think I will try dropping the |
yep, I have checked out the latest version. Thanks for the heads-up! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some updates to your test script to pass diags.cfg
using sys.argv.extend
. This allows you to perform Python debugging with parameters defined in diags.cfg
.
@tomvothecoder I made plotting work, but will try consolidate the codes a little better. |
Other than |
340d3b3
to
56c7ffe
Compare
08b3ab0
to
553fea7
Compare
56c7ffe
to
59781c8
Compare
553fea7
to
09fbe13
Compare
I listed the errors from the log below: Issue 1 File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/zonal_mean_xy_driver.py", line 322, in regrid_to_lower_res_1d
if len(ds_test_1d.lat) > len(ds_ref_1d.lat):
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/common.py", line 278, in __getattr__
raise AttributeError(
AttributeError: 'DataArray' object has no attribute 'lat'
Issue 2packages/e3sm_diags/plot/zonal_mean_xy_plot.py:115: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`). Consider using `matplotlib.pyplot.close()`.
fig = plt.figure(figsize=parameter.figsize, dpi=parameter.dpi)
Issue 3 File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/plot/zonal_mean_xy_plot.py", line 130, in plot
ax1.set_ylabel(da_test.long_name + " (" + da_test.units + ")")
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/common.py", line 278, in __getattr__
raise AttributeError(
AttributeError: 'DataArray' object has no attribute 'long_name'
Issue 4Traceback (most recent call last):
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/coding/times.py", line 213, in _decode_cf_datetime_dtype
result = decode_cf_datetime(example_value, units, calendar, use_cftime)
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/coding/times.py", line 321, in decode_cf_datetime
dates = _decode_datetime_with_cftime(
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/coding/times.py", line 237, in _decode_datetime_with_cftime
cftime.num2date(num_dates, units, calendar, only_use_cftime_datetimes=True)
File "src/cftime/_cftime.pyx", line 580, in cftime._cftime.num2date
File "src/cftime/_cftime.pyx", line 98, in cftime._cftime._dateparse
ValueError: 'months since' units only allowed for '360_day' calendar
Issue 5Traceback (most recent call last):
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/conventions.py", line 432, in decode_cf_variables
new_vars[k] = decode_cf_variable(
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/conventions.py", line 283, in decode_cf_variable
var = times.CFDatetimeCoder(use_cftime=use_cftime).decode(var, name=name)
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/coding/times.py", line 832, in decode
dtype = _decode_cf_datetime_dtype(data, units, calendar, self.use_cftime)
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/coding/times.py", line 223, in _decode_cf_datetime_dtype
raise ValueError(msg)
ValueError: unable to decode time units 'months since 1983-06' with 'the default calendar'. Try opening your dataset with decode_times=False or installing cftime if it is not installed.
Issue 6 File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/driver/utils/dataset_xr.py", line 254, in _get_global_attr_from_climo_dataset
ds = xr.open_dataset(filepath)
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/backends/api.py", line 572, in open_dataset
backend_ds = backend.open_dataset(
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 607, in open_dataset
ds = store_entrypoint.open_dataset(
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/backends/store.py", line 58, in open_dataset
ds = Dataset(vars, attrs=attrs)
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/dataset.py", line 697, in __init__
variables, coord_names, dims, indexes, _ = merge_data_and_coords(
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/dataset.py", line 426, in merge_data_and_coords
return merge_core(
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/merge.py", line 724, in merge_core
dims = calculate_dimensions(variables)
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/variable.py", line 3007, in calculate_dimensions
raise ValueError(
ValueError: dimension 'time' already exists as a scalar variable
Issue 7 File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/derivations/utils.py", line 175, in cosp_bin_sum
prs: FileAxis = cld.getAxis(0)
File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/xarray/core/common.py", line 278, in __getattr__
raise AttributeError(
AttributeError: 'DataArray' object has no attribute 'getAxis'
Issue 8 File "/global/cfs/cdirs/e3sm/zhang40/conda_envs/e3sm_diags_dev_654_zonal_mean_xy/lib/python3.10/site-packages/e3sm_diags/derivations/formulas.py", line 81, in pminuse_convert_units
var.attrs["units"] == "kg/m2/s"
KeyError: 'units'
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @chengzhuzhang, I did some code clean up code and fixed the mypy warnings (which were all definitely useful).
59781c8
to
dabb728
Compare
f2e8b16
to
6a9a9d0
Compare
Got it, we can transfer these to another GitHub issue. |
Actually, I'll take a brief look at these issues and try to debug while I'm at it. |
4a7336e
to
b7195da
Compare
b7195da
to
46a10ed
Compare
Thank you! |
- Add new parameters for cfg_path and multiprocessing
- Needed to sort the `prs_range` and `tau_range` in ascending order before creating `cond` to pass to `.where()`, otherwise if out of order then there is a possibility that the condition will fail entirely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matching_target_var_map = self._get_matching_climo_src_vars(ds, target_var_map) | ||
|
||
# NOTE: Since there's only one set of vars, we get the first and only set | ||
# of vars from the derived variable dictionary. | ||
# 1. Get the derivation function. | ||
derivation_func = list(matching_target_var_map.values())[0] | ||
if matching_target_var_map is not None: | ||
# NOTE: Since there's only one set of vars, we get the first and only set | ||
# of vars from the derived variable dictionary. | ||
# 1. Get the derivation function. | ||
derivation_func = list(matching_target_var_map.values())[0] | ||
|
||
# 2. Get the derivation function arguments using source variable keys. | ||
# Example: [xr.DataArray(name="PRECC",...), xr.DataArray(name="PRECL",...)] | ||
src_var_keys = list(matching_target_var_map.keys())[0] | ||
# 2. Get the derivation function arguments using source variable keys. | ||
# Example: [xr.DataArray(name="PRECC",...), xr.DataArray(name="PRECL",...)] | ||
src_var_keys = list(matching_target_var_map.keys())[0] | ||
|
||
# 3. Use the derivation function to derive the variable. | ||
ds_final = self._get_dataset_with_derivation_func( | ||
ds, derivation_func, src_var_keys, target_var | ||
) | ||
# 3. Use the derivation function to derive the variable. | ||
ds_derived = self._get_dataset_with_derivation_func( | ||
ds, derivation_func, src_var_keys, target_var | ||
) | ||
|
||
return ds_final | ||
return ds_derived | ||
|
||
# None of the entries in the derived variables dictionary worked, | ||
# so try to get the variable directly from he dataset. | ||
if target_var in ds.data_vars.keys(): | ||
return ds | ||
|
||
raise IOError( | ||
f"The dataset file has no matching source variables for {target_var}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE: Issue 1 in #752 (comment)
This fixes the missing outputs for the following variables: FLUT
, FLUTC
, SOLIN
, FLDS
, FLNS
, FLNSC
, FSDS
, FLDSC
, FSNS
, FSNSC
, ERA5-TMQ
, MERRA2-TMQ
The issue was that I did not include a conditional to try to return the variable directly from the climatology dataset if it cannot be derived (lines 661-664)
# 5. Get the axes mask conditional and subset the variable on it. | ||
cond = ( | ||
(prs >= prs_range[0]) | ||
& (prs <= prs_range[-1]) | ||
& (tau >= tau_range[0]) | ||
& (tau <= tau_range[-1]) | ||
) | ||
|
||
var_sub = var.where(cond, drop=True) | ||
|
||
# 5. Sum on axis=0 and axis=1 (tau and prs) | ||
# 7. Sum on axis=0 and axis=1 (tau and prs) | ||
var_sum = var_sub.sum(dim=[prs.name, tau.name]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE: Issue 1 in #752 (comment)
The changes in cosp_bin_sum()
now fix the missing outputs for the remaining variables: ISCCPCOSP-CLDTOT_TAU1.3_9.4_ISCCP
, ISCCPCOSP-CLDTOT_TAU1.3_ISCCP
, MODISCOSP-CLDHGH/TOT_TAU1.3_9.4_MODIS
, MODISCOSP-CLDHGH/TOT_TAU1.3_MODIS
Root Cause
- The
prs_range
can be in descending order (e.g.,(90000.0, 9000.0)
) - This caused the condition (
cond
) to not match any of the data values when subsetting with.where()
- The condition being used is
>=90000.0 and <=9000.0
which is incorrect - The correct condition should to be the other way around,
>= 9000 and <= 90000
- The condition being used is
- Since there are no matching values, the below
numpy
error is thrown withdrop=True
(drop allnan
values that don't match):ValueError: Cannot apply_along_axis when any iteration dimensions are 0
Solution
Sort prs_range
and tau_range
before subsetting
Side-note
The CDMS2 version of sub-setting a variable is pretty confusing. Notice below how prs_low
and prs_high
are used to subset, but prs_low
is actually the larger value with my example, (90000.0, 9000.0)
I think CDMS2 swaps both values around and subsets to include all values within the range. It uses the correct sub-setting condition, >= 9000 and <= 90000
.
e3sm_diags/e3sm_diags/derivations/acme.py
Lines 488 to 490 in 434ab8c
if cld.id == "FISCCP1_COSP": # ISCCP model | |
cld_bin = cld(cosp_prs=(prs_low, prs_high), cosp_tau=(tau_low, tau_high)) | |
simulator = "ISCCP" |
@tomvothecoder Thank you for the perseverance to fix the derived variable logic and those COSP variable calculation!! The PR is in a much better shape after the fix! |
Co-authored-by: Tom Vo <[email protected]>
Co-authored-by: Tom Vo <[email protected]>
Co-authored-by: Tom Vo <[email protected]>
Co-authored-by: Tom Vo <[email protected]>
Co-authored-by: Tom Vo <[email protected]>
Co-authored-by: Tom Vo <[email protected]>
Description
zonal_mean_xy
set #654 to refactor zonal mean xy to use xcdat/xarrayChecklist
xgcm
If applicable: