QCEFF Table #545

mjs2369 · 2023-09-26T01:08:35Z

Description:

Previously the QCF code required an algorithm_info_mod specific to the model, which meant editing algorithm_info_mod.f90 to specify which distribution should be used for which quantity.

This code implements a QCF input table, which reads in the algorithm info choices (QCF options) at runtime and stores them in algorithm_info_mod module storage.

The table information replaces the if statements that algorithm_info_mod previously used.

The observation, state, and inflation variables are read in from a single table. Each field has its own column, for a total of 28 columns in the table.

The full list of QCF input options and a description of the table structure can be found in the documentation at DART/guide/qcf_table.rst

More info on the background of the issue can be read in the specification here: https://docs.google.com/document/d/1MnvEFfgj5SfFbnIahGHwjy1XJ5IWBvPS8NB1nrIjc8k/edit

Fixes issue

Fixes #503

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update

Documentation changes needed?

My change requires a change to the documentation.
- I have updated the documentation accordingly.

While I have included new documentation on how to use the input table at DART/guide/qcf_table.rst, we may want to edit Jeff’s documentation at https://docs.dart.ucar.edu/en/quantile_methods/models/lorenz_96_tracer_advection/work/readme.html to reflect the difference in workflow for the tests listed. I think we should also note on that page that &probit_transform_nml must be included when running on quantile_methods.

Tests

Compiled and ran filter with full debugging flags with Intel, CCE, gfortran
Bitwise identical to quantile_methods, tested with Intel

Information on how to use the QCF input table with the quantile code is in the documentation at DART/guide/qcf_table.rst

build_everything now passes for all models. Six observation converters (GOES, gps, var, GMI, quikscat, AIRS) are reported as failed:

RESULT: 0  models/null_model/work/ finished
RESULT: 1  models/POP/work/ finished
RESULT: 2  models/lorenz_63/work/ finished
RESULT: 3  models/9var/work/ finished
RESULT: 4  models/gitm/work/ finished
RESULT: 5  models/simple_advection/work/ finished
RESULT: 6  models/lorenz_96/work/ finished
RESULT: 7  models/ikeda/work/ finished
RESULT: 8  models/ROMS/work/ finished
RESULT: 9  models/lorenz_84/work/ finished
RESULT: 10 models/cam-fv/work/ finished
RESULT: 11  models/mpas_atm/work/ finished
RESULT: 12  models/forced_lorenz_96/work/ finished
RESULT: 13  models/wrf/work/ finished
RESULT: 14  models/cice/work/ finished
RESULT: 15  models/cm1/work/ finished
RESULT: 16  models/lorenz_04/work/ finished
RESULT: 17  models/bgrid_solo/work/ finished
RESULT: 18  models/noah/work/ finished
RESULT: 19  models/wrf_hydro/work/ finished
RESULT: 20  models/lorenz_96_2scale/work/ finished
RESULT: 21 observations/obs_converters/GOES/work/ failed
RESULT: 22  observations/obs_converters/tec/work/ finished
RESULT: 23  observations/obs_converters/GRACE/work/ finished
RESULT: 24  observations/obs_converters/CNOFS/work/ finished
RESULT: 25  observations/obs_converters/GPSPW/work/ finished
RESULT: 26  observations/obs_converters/SSEC/work/ finished
RESULT: 27  observations/obs_converters/text_GITM/work/ finished
RESULT: 28  observations/obs_converters/GTSPP/work/ finished
RESULT: 29 observations/obs_converters/gps/work/ failed
RESULT: 30  observations/obs_converters/GSI2DART/work/ finished
RESULT: 31  observations/obs_converters/SABER/work/ finished
RESULT: 32  observations/obs_converters/SIF/work/ finished
RESULT: 33  observations/obs_converters/WOD/work/ finished
RESULT: 34  observations/obs_converters/tpw/work/ finished
RESULT: 35  observations/obs_converters/ROMS/work/ finished
RESULT: 36  observations/obs_converters/COSMOS/work/ finished
RESULT: 37 observations/obs_converters/var/work/ failed
RESULT: 38  observations/obs_converters/tropical_cyclone/work/ finished
RESULT: 39  observations/obs_converters/CONAGUA/work/ finished
RESULT: 40  observations/obs_converters/Ameriflux/work/ finished
RESULT: 41  observations/obs_converters/CHAMP/work/ finished
RESULT: 42  observations/obs_converters/cice/work/ finished
RESULT: 43 observations/obs_converters/GMI/work/ failed
RESULT: 44  observations/obs_converters/DWL/work/ finished
RESULT: 45  observations/obs_converters/MIDAS/work/ finished
RESULT: 46  observations/obs_converters/USGS/work/ finished
RESULT: 47  observations/obs_converters/SST/work/ finished
RESULT: 48  observations/obs_converters/MPD/work/ finished
RESULT: 49  observations/obs_converters/even_sphere/work/ finished
RESULT: 50  observations/obs_converters/MODIS/work/ finished
RESULT: 51  observations/obs_converters/NCEP/prep_bufr/work/ finished
RESULT: 52  observations/obs_converters/NCEP/ascii_to_obs/work/ finished
RESULT: 53  observations/obs_converters/NCEP/netcdf/work/ finished
RESULT: 54  observations/obs_converters/gnd_gps_vtec/work/ finished
RESULT: 55  observations/obs_converters/SSUSI/work/ finished
RESULT: 56  observations/obs_converters/ok_mesonet/work/ finished
RESULT: 57  observations/obs_converters/snow/work/ finished
RESULT: 58  observations/obs_converters/text/work/ finished
RESULT: 59  observations/obs_converters/AURA/work/ finished
RESULT: 60  observations/obs_converters/radar/work/ finished
RESULT: 61  observations/obs_converters/MADIS/work/ finished
RESULT: 62 observations/obs_converters/quikscat/work/ failed
RESULT: 63 observations/obs_converters/AIRS/work/ failed
RESULT: 64  observations/obs_converters/AVISO/work/ finished

A developer test for the table read is in progress.

Checklist for merging

Updated changelog entry
Documentation updated
Update conf.py

Checklist for release

Merge into main
Create release from the main branch with appropriate tag
Delete feature-branch

Testing Datasets

Dataset needed for testing available upon request
Dataset download instructions included
No dataset needed

Open Issues/Questions

There are two input options for obs_inc_info (rectangular_quadrature and gaussian_likelihood_tails) that are only compatible with the original RHF implementation. Currently, these variables are unused in algorithm_info_mod.f90. These could either be removed from alg_info_mod and the namelist, or we can implement a conditional:
if (filter_kind == RHF) then
   rectangular_quadrature = .true.
   gaussian_likelihood_tails = .false.
endif

DART/assimilation_code/modules/assimilation/algorithm_info_mod.f90

Line 233 in fc48d9d

! Only need to set these two for options the original RHF implementation

Currently, the only check implemented on the bounds is a simple one that ensures the upper bound is not less than the lower bound. Do we want to put more explicit limits on the bounds?
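The upper-versus-lower comparison is only meaningful when a quantity is bounded on both sides, since an unset bound holds the missing_r8 sentinel (see commit 22f86c4 in this PR). A minimal sketch of that guard follows. The function name bounds_are_valid, the standalone program wrapper, and the sentinel value used here are assumptions for illustration, not the code in algorithm_info_mod.

```fortran
program bounds_check_sketch
   implicit none

   integer, parameter :: r8 = selected_real_kind(12)
   ! Stand-in for DART's missing_r8 sentinel; the exact value is an assumption.
   real(r8), parameter :: missing_r8 = -888888.0_r8

   ! Lower bound only: the upper bound is left at the sentinel, so no comparison is made.
   print *, bounds_are_valid(.true., .false., 0.0_r8, missing_r8)
   ! Both bounds set, but reversed: this is the case the check should reject.
   print *, bounds_are_valid(.true., .true., 1.0_r8, 0.0_r8)

contains

   ! Returns .false. only when both bounds are active and upper_bound < lower_bound.
   logical function bounds_are_valid(bounded_below, bounded_above, lower_bound, upper_bound)
      logical,  intent(in) :: bounded_below, bounded_above
      real(r8), intent(in) :: lower_bound, upper_bound

      bounds_are_valid = .true.
      if (bounded_below .and. bounded_above) then
         if (upper_bound < lower_bound) bounds_are_valid = .false.
      endif
   end function bounds_are_valid

end program bounds_check_sketch
```

Any stricter limits on the bounds could be added inside the same guarded branch, so that one-sided quantities are never compared against the sentinel.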

There are differences in the formatting of log_qcf_info output in dart_log.out with the cce compiler. PR #491 describes this general issue with cce.

Currently, I am logging the headers of the QCF table to dart_log.out. Do we want these in the log? Similarly, is writing the data straight to the dart_log sufficient, or do we want to format this more cleanly (i.e. make it look like a table)?

assimilation_code/modules/assimilation/algorithm_info_mod.f90

hkershaw-brown · 2023-10-02T16:38:41Z

Code to be removed:

models/cam-fv/work/algorithm_info_mod.f90
assimilation_code/modules/assimilation/all_eakf_algorithm_info_mod
assimilation_code/modules/assimilation/neg_algorithm_info_mod
assimilation_code/modules/assimilation/state_eakf_tracer_bnrhf_algorithm_info_mod
assimilation_code/modules/assimilation/one_above_algorithm_info_mod

hkershaw-brown · 2023-10-02T16:45:13Z

obs_kind is outdated terminology. qty (quantity) is the term that has replaced kind.
I'll go ahead and switch this out.

kind is outdated terminolgy for quantity #545 (comment)

assimilation_code/modules/assimilation/algorithm_info_mod.f90

mjs2369 · 2023-10-02T17:29:37Z

@hkershaw-brown great suggestions. I'll commit them after we change the author on the previous commits

assimilation_code/modules/assimilation/algorithm_info_mod.f90

hkershaw-brown · 2023-10-02T17:50:43Z

assimilation_code/modules/assimilation/algorithm_info_mod.f90

+
 ! The information arguments are all intent (inout). This means that if they are not set
 ! here, they retain the default values from the assim_tools_mod namelist. Bounds don't exist 
 ! in that namelist, so default values are set in assim_tools_mod just before the call to here.


Do all these arguments need to be intent(inout)?

obs_inc_info now sets the defaults for

filter_kind,
~~rectangular_quadrature~~
~~gaussian_likelihood_tails~~
sort_obs_inc
spread_restoration
bounded_below
bounded_above
lower_bound
upper_bound

I don't think they need to be anymore. Before these changes, the defaults for all of these were set in assim_tools_mod, and the bounds info was set right before the call to obs_inc_info (https://github.com/NCAR/DART/blob/6a5fbb6126d472a48a449ab6f13ff671c59bfb41/assimilation_code/modules/assimilation/assim_tools_mod.f90#L986C1-L988C51). I think we can definitely remove the lines linked above from assim_tools_mod and make the bounds info intent(out).

The other control options for obs_inc_info are a bit more difficult. filter_kind, sort_obs_inc, spread_restoration, rectangular_quadrature, and gaussian_likelihood_tails are being set to the values specified in the &assim_tools_nml section of input.nml, as this comment describes:
(https://github.com/NCAR/DART/blob/b46a922aacd9f2226d15eccb0de22368196f9078/assimilation_code/modules/assimilation/algorithm_info_mod.f90#L438C1-L443C88).

The defaults for these variables are used if they are not specified in the nml (https://github.com/NCAR/DART/blob/6a5fbb6126d472a48a449ab6f13ff671c59bfb41/assimilation_code/modules/assimilation/assim_tools_mod.f90#L146C1-L153C48)

Then, when we get to the call to obs_inc_info, the values are changed to the defaults specified in obs_inc_info OR whatever is set in the qcf table. I think we can still set them to intent(out), but I think it is a bit weird to have our users set these options in both the qcf table and the assim_tools_nml. What do you think @hkershaw-brown?

After thinking more about this comment, I don't know if it's bad that users can set these options in both the &assim_tools_nml and the qcf table.

While we are setting the values for filter_kind in both the &assim_tools_nml and the qcf table, the filter_kind read in from the qcf table is specific to observation space increments.

But this leads me to a new question: if a value for filter_kind (or any of the other obs_inc_info options) is set in the &assim_tools_nml and is not the default (EAKF = 1), do we want obs_inc_info to use that value as its default instead of EAKF? This would mean that, by default, the filter_kind used for the observation space increments would match the filter_kind being used in assim_tools_mod.

For example, if a user has filter_kind = 8 set in their &assim_tools_nml, this would be passed in as the filter_kind argument to obs_inc_info, which would then use this value as the 'default' filter.

I could see there being issues where a user sets rectangular_quadrature = .false. or something in the namelist, but doesn't include a qcf table. The call to obs_inc_info would then change rectangular_quadrature to .true. (the default) without the user expecting or wanting this to happen.
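To make the option under discussion concrete, here is a rough sketch of how an intent(inout) argument lets the namelist value act as the default whenever the table has no entry for a quantity. The subroutine obs_inc_info_sketch and its in_table and table_filter_kind arguments are hypothetical simplifications, not the actual obs_inc_info interface.

```fortran
program inout_default_sketch
   implicit none

   ! Integer values as defined in algorithm_info_mod
   integer, parameter :: EAKF = 1
   integer, parameter :: UNBOUNDED_RHF = 8

   integer :: filter_kind

   ! Pretend this value was read from &assim_tools_nml.
   filter_kind = UNBOUNDED_RHF

   ! Quantity not found in the table: the namelist value is left untouched.
   call obs_inc_info_sketch(.false., EAKF, filter_kind)
   print *, 'no table entry:   filter_kind =', filter_kind

   ! Quantity found in the table: the table value overrides the namelist value.
   call obs_inc_info_sketch(.true., EAKF, filter_kind)
   print *, 'with table entry: filter_kind =', filter_kind

contains

   ! Hypothetical stand-in for obs_inc_info, reduced to a single option.
   subroutine obs_inc_info_sketch(in_table, table_filter_kind, filter_kind)
      logical, intent(in)    :: in_table           ! was the quantity found in the table?
      integer, intent(in)    :: table_filter_kind  ! value stored in the table row
      integer, intent(inout) :: filter_kind        ! arrives holding the namelist value

      if (in_table) filter_kind = table_filter_kind
   end subroutine obs_inc_info_sketch

end program inout_default_sketch
```

With intent(out) instead, the subroutine would have to assign a value on every path, so the namelist setting could not survive the call.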

hkershaw-brown · 2023-10-02T17:56:22Z

assimilation_code/modules/assimilation/algorithm_info_mod.f90

+possible_filter_kind_ints(1) = 1
+possible_filter_kind_ints(2) = 2
+possible_filter_kind_ints(3) = 8
+possible_filter_kind_ints(4) = 11
+possible_filter_kind_ints(5) = 101


Definite change:

These filter_kind parameters are already defined in this module

DART/assimilation_code/modules/assimilation/algorithm_info_mod.f90

Lines 30 to 38 in a4723e7

! Defining parameter strings for different observation space filters
! For now, retaining backwards compatibility in assim_tools_mod requires using
! these specific integer values and there is no point in using these in assim_tools.
! That will change if backwards compatibility is removed in the future.
integer, parameter :: EAKF = 1
integer, parameter :: ENKF = 2
integer, parameter :: UNBOUNDED_RHF = 8
integer, parameter :: GAMMA_FILTER = 11
integer, parameter :: BOUNDED_NORMAL_RHF = 101

Don't redefine them here.

Convert the strings to the parameter values once, when reading the table, rather than doing the string comparison every time.

@hkershaw-brown I made this suggested change for both the dist_type and filter_kind. They are both now converted from string to integer in read_qcf_table and are set to integers in the types.
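For reference, a minimal sketch of a one-time string-to-integer conversion of this kind. The integer parameters are the ones already defined in algorithm_info_mod. The function name filter_kind_from_string, the UNKNOWN_FILTER value, and the assumption that the table strings are spelled like the parameter names are hypothetical, and this is not the code in read_qcf_table.

```fortran
program filter_kind_lookup_sketch
   implicit none

   ! Integer values as defined in algorithm_info_mod
   integer, parameter :: EAKF = 1
   integer, parameter :: ENKF = 2
   integer, parameter :: UNBOUNDED_RHF = 8
   integer, parameter :: GAMMA_FILTER = 11
   integer, parameter :: BOUNDED_NORMAL_RHF = 101
   ! Hypothetical marker for a string that matches no known filter
   integer, parameter :: UNKNOWN_FILTER = -1

   print *, filter_kind_from_string('BOUNDED_NORMAL_RHF')
   print *, filter_kind_from_string('not_a_filter')

contains

   ! Map a table string to its integer parameter once, at read time,
   ! so later calls compare integers instead of strings.
   integer function filter_kind_from_string(name)
      character(len=*), intent(in) :: name

      select case (trim(name))
         case ('EAKF')
            filter_kind_from_string = EAKF
         case ('ENKF')
            filter_kind_from_string = ENKF
         case ('UNBOUNDED_RHF')
            filter_kind_from_string = UNBOUNDED_RHF
         case ('GAMMA_FILTER')
            filter_kind_from_string = GAMMA_FILTER
         case ('BOUNDED_NORMAL_RHF')
            filter_kind_from_string = BOUNDED_NORMAL_RHF
         case default
            filter_kind_from_string = UNKNOWN_FILTER
      end select
   end function filter_kind_from_string

end program filter_kind_lookup_sketch
```

In the real module, an unrecognized string would more likely be reported through error_handler than returned as a marker value.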

hkershaw-brown · 2023-10-02T17:58:56Z

assimilation_code/modules/assimilation/algorithm_info_mod.f90

+      lower_bound = qcf_table_data(QTY_loc(1))%obs_inc_info%lower_bound
+      upper_bound = qcf_table_data(QTY_loc(1))%obs_inc_info%upper_bound
+
+endif

 ! Only need to set these two for options the original RHF implementation
 !!!rectangular_quadrature = .true.


Do we have enough information to set appropriate values for these:

! Only need to set these two for options the original RHF implementation
!!!rectangular_quadrature = .true.
!!!gaussian_likelihood_tails = .false.

Not yet. I discussed this with Jeff, and he said to just make a note of it for now. I have it mentioned in the open issues section of this PR

@jlaucar this is the second one - what should be done with rectangular_quadrature and gaussian_likelihood_tails?

assimilation_code/modules/assimilation/algorithm_info_mod.f90

hkershaw-brown · 2023-10-02T18:09:22Z

assimilation_code/modules/assimilation/algorithm_info_mod.f90

+
+!use default values if qcf_table_filename is not in namelist
+if (.not. qcf_table_listed) then
+   dist_type = BOUNDED_NORMAL_RH_DISTRIBUTION


This is just a question for my own understanding. Why is the default BOUNDED_NORMAL_RH_DISTRIBUTION with no bounds? Why not have the default be NORMAL_DISTRIBUTION?

@jlaucar there are two comments/questions on this pull request that you need to have a look at and answer. This is one of them.

…bution Currently the algorithm_info_mod does not catch these on table read

hkershaw-brown · 2023-10-02T18:35:08Z

Note on:

RESULT: 10 models/cam-fv/work/ finished

Possibly the local algorithm_info_mod.f90 was removed for your test but the removal was not committed. On the branch as-is, the cam-fv build fails because there is still an algorithm_info_mod.f90 in cam-fv/work:

/Users/hkershaw/DART/pull_requests/pull_545/assimilation_code/modules/assimilation/filter_mod.f90(93): error #6580: Name in only-list does not exist or is not accessible.   [INIT_ALGORITHM_INFO_MOD]
use algorithm_info_mod, only : probit_dist_info, init_algorithm_info_mod, end_algorithm_info_mod
-------------------------------------------------^
/Users/hkershaw/DART/pull_requests/pull_545/assimilation_code/modules/assimilation/filter_mod.f90(93): error #6580: Name in only-list does not exist or is not accessible.   [END_ALGORITHM_INFO_MOD]
use algorithm_info_mod, only : probit_dist_info, init_algorithm_info_mod, end_algorithm_info_mod
--------------------------------------------------------------------------^
compilation aborted for /Users/hkershaw/DART/pull_requests/pull_545/assimilation_code/modules/assimilation/filter_mod.f90 (code 1)
make: *** [filter_mod.o] Error 1

mjs2369 · 2023-10-02T18:42:07Z

@hkershaw-brown yes, cam-fv finished for me because I removed the algorithm_info_mod in work locally, but forgot to commit and push this change.

This has now been pushed to the remote qcf_table branch.

hkershaw-brown · 2023-10-05T14:31:12Z

Closing this pull request in favour of #553, which is the same request but with commits correctly attributed to Marlee.

Do not merge this; use #553.

Marlee Smith and others added 30 commits August 14, 2023 14:11

draft program to experiment with reading table values into correspond…

ef152a1

…ing types

prototype table data file that uses CAM-FV QTYs

ebdbf31

adding new subroutine init_qcf_table to return number of rows in table

262965b

Adding a new namelist variable to the assim_tools_nml

715875d

Adding QCF table type definitions to algorithm_info_mod

1f9dcc3

adding type defs to use statement for algorithm_info_mod in assim_too…

e8ff87f

…ls_mod

Adding allocatable variables for table data, allocating after determi…

7024e2b

…ning size of table

New subroutine to read through the values in the QCF table and assign…

935bcb5

… them to the variables in the qcf_table_data_type

Removing qcf table data types from assim_tools_mod and reorganizing s…

be4ee63

…o that these type structs are only used in algorithm_info_mod

Fixing small inconsistencies/typos

66356a6

moving the location of draft program outside /assimilation_code/modul…

b62339e

…es/assimilation

Adding draft subroutine write_qcf_table to test that values are being…

7659472

… read in correctly; removed rowheaders argument from subroutines where not needed

replaicing conditionals and hardcoded values in probit_dist_info

e995563

using get_name_for_quantity to get generic quantity from integer index

fa514c2

Replacing conditionals and hard coded values with qcf_table_data in o…

f291025

…bs_inc_info subroutine

Replacing conditionals and hard coded values with qcf_table_data in o…

5a26816

…bs_error_info subroutine

add subroutine to deallocate qcf table data structures

2376bbc

making dealloc subroutine available to assim_tools_mod

90c9beb

Merge branch 'quantile_methods' of https://github.com/NCAR/DART into …

b1c0658

…qceff_table

removing comment blocks of old code

6de99ba

Adding call to deallocate routine, removing unused var and old commen…

c5e0eb1

…ted code

Fixing typo in subroutine names

d61382e

Moving the allocation and deallocation of qcf table data from assim_t…

e842a2b

…ools_mod to filter_main in filter_mod

uncommenting call to end_alg_info_mod

a78be83

moving call to init_algortihm_info_mod out of conditional

dc0c617

Reorganizing the subroutines so that init_algorithm_info_mod is at th…

2d6808e

…e top of algorithm_info_mod

Adding qcf_table_listed logical and module_initialized checks

19b6150

Moving location of qcf_table_listed check to before data access from …

efa929b

…findloc

Using error_handler from utilities_mod; adding check for correct tabl…

16354ab

…e version

adding qcf_table_file_listed logical to two remaining subrountines; w…

8595523

…ork with log_qcf_info

hkershaw-brown added 5 commits September 28, 2023 16:38

remove stray /dev/null left in accidentally

200a8f4

add tests for various bounds options

46f238f

currently the "lower bound only" test is failing because the upper < lower check happens always rather then only when you have two bounds

test for bounds set to false, but bounds values in the table

ca99e8b

fix: need to check that a qty is bounded above and below before check…

22f86c4

…ing invalid bounds otherwise missing_r8 -88888 value for the upper bound is "less than" the lower bound

fix: remove extra call to test_table read from runall.sh

38353bb