
Disaggregated glossic horizons unsupported? #122

Open
phytoclast opened this issue Jan 22, 2020 · 17 comments
Labels: NASIS-local

Comments

@phytoclast

The fetchNASIS_pedons(rmHzErrors=TRUE) function filters out pedons 2012MI139004-2 and 2012MI139015, presumably due to horizon overlap. Indeed they have their B/E horizons separated out as 2 completely overlapping horizons, but with total volume distributed among them for balance. When I modified the records to force separate horizons, the error went away.
This method of representing glossic horizons seems to be unsupported.
[image: screenshot of the two overlapping B/E horizon records]
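For context, here is a minimal sketch of the kind of depth-logic check behind rmHzErrors (assuming hzDepthTests() from aqp >= 1.19 and its convention of returning TRUE for tests that fail; the depths below are made up to mimic two completely overlapping B/E records):

library(aqp)

# hypothetical depths: the 35-60 cm interval appears twice, once for the
# E part and once for the Bt part of a B/E horizon
top    <- c(0, 20, 35, 35, 60)
bottom <- c(20, 35, 60, 60, 90)

# hzDepthTests() returns a named logical vector of failed tests; the
# duplicated interval is flagged as an overlap/gap, which is what causes
# the pedon to be dropped when rmHzErrors = TRUE
hzDepthTests(top, bottom)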

@phytoclast commented Jan 22, 2020

Looks like I might have populated them incorrectly according to the guidance below, except that it seems to be applicable only to components.

The national instructions for population of what they call “combination” horizons like E/B, B/E, etc. are listed in the NSSH in section 618.38.C.3 and read as follows:

Combination horizons (E and Bt, Btn/E, E/Bt, etc.) should be entered as two separate horizon records, such as one for the E part of the horizon and the second for the Bt part of the horizon. Both records must have the same horizon designations assigned (e.g., E/Bt). But these separate horizon records must have different RV depth values for the top and bottom depths. The RV horizon depths must be completely in sync with no duplication, overlaps, or gaps. For example, the E part of a E/Bt horizon could have RV depths of 20 to 35 cm and the Bt part of the E/Bt horizon could have RV depths of 35 to 50 cm. The depth values for the “Low” and “High” columns of the horizon top and bottom depths may be populated to identify the overlapping nature of the horizon (e.g., both records may have the same low value for the top depth of 10 cm). Soil property data elements would be populated for each part to describe the characteristics of that separate part of the combination horizon.
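To make the quoted guidance concrete, here is an illustrative layout of the two records for an E/Bt combination horizon, using the RV depths from the example above (the column names and the "part" label are placeholders for readability, not NASIS field names):

# hypothetical illustration of NSSH 618.38.C.3: two records with the same
# designation, contiguous RV depths, and a shared "low" top depth of 10 cm
# conveying the overlapping nature of the horizon
combo <- data.frame(
  hzname    = c("E/Bt", "E/Bt"),
  part      = c("E part", "Bt part"),
  top_low   = c(10, 10),
  top_rv    = c(20, 35),
  bottom_rv = c(35, 50)
)
combo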

@phytoclast commented Jan 22, 2020

This could be a workaround, but it requires bogus horizon depths, since the actual variability in the Bt is expressed more horizontally than vertically.
[image attachment]

@brownag commented May 5, 2020

Hey Greg,

This issue has been in the back of my mind for some time -- I am sorry that we have not prioritized getting some sort of a solution in place.

If you want to chat some time about how you, or others using this data, would like to see these types of records interpreted by soilDB, I am open to suggestions. I agree that putting in bogus top/bottom depths and fudging thickness to portray overlap leaves a lot to be desired. It really isn't useful in terms of aggregating properties, IMO; it just allows the pedon to be retained in the set.

Since we don't have a lot of combination horizons, we have not grappled with the unique topologic considerations that occur in, e.g., glossic horizons. That is not by choice, but simply practicality/what has been needed so far.

I think we could come up with a variety of ways of "flattening" "overlapping" horizons that are populated with a designation containing a virgule (/) -- I am just not sure what would be most useful. Such a routine could be implemented in the various flavors of fetchNASIS, and would have a couple of options for the choice of flattening method: for instance, combining texture classes and other categorical variables into hybrids (similar to the NASIS Texture Group concept), weighted-averaging of numeric properties, etc.
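For illustration only (not soilDB code), one way the weighted-averaging flavor could look, assuming a horizon table with fetchNASIS-style peiid/hzdept/hzdepb/hzname/clay/sand columns plus a hypothetical pct column for the user-determined proportion of each part:

library(data.table)

# collapse records that share the same pedon and top/bottom depths into one
# record: paste the designations together and weight numeric properties by
# the hypothetical `pct` column
flatten_overlap <- function(h) {
  h <- as.data.table(h)
  h[, .(
    hzname = paste(unique(hzname), collapse = "/"),
    clay   = weighted.mean(clay, w = pct, na.rm = TRUE),
    sand   = weighted.mean(sand, w = pct, na.rm = TRUE)
  ), by = .(peiid, hzdept, hzdepb)]
}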

In a more complex view, we could start to consider the soil profile as a matrix of horizon record IDs, rather than a data.frame of horizontally-bedded "rows" of horizon data.

The matrix representation, in the simplest sense, could just be an n x 2 representation, where n is the number of layers (pairs of top and bottom depths) and the two "columns" each represent a mixture of up to two contrasting soil materials. The cells of the matrix point to a horizon ID, which can be joined to a data.frame that contains the relevant horizon data.

This could be further generalized to an X x Z representation, where each cell represents, e.g., a 1 cm x 1 cm portion of the profile in the horizontal (X) and vertical (Z) direction. Photographs or other digital methods could be processed to represent lateral and horizontal variability. We obviously do not have data structures in NASIS capable of handling this (yet), but it could be useful for representing things like cryoturbated soils and Vertisols, and I presume glossic horizons -- all of which can have tremendous short-range lateral variability.
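A toy example of the simple n x 2 form, with made-up layer depths and horizon record IDs (the IDs and column names are placeholders, not NASIS fields):

# rows are depth layers; the two columns hold horizon record IDs for up to
# two contrasting materials, NA where only one material is present
layers <- data.frame(top = c(0, 20, 35, 60), bottom = c(20, 35, 60, 90))

hz_ids <- matrix(
  c("hz1", NA,     # A horizon: single material
    "hz2", NA,     # E horizon
    "hz3", "hz4",  # B/E horizon: E part and Bt part share the same depths
    "hz5", NA),    # Bt horizon
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("material_1", "material_2"))
)

# join the IDs back to a horizon data.frame to recover properties per part
cbind(layers, hz_ids)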

Let me know what you think. I don't like when legitimate pedon data gets removed -- and I would be happy to help implement a fix sometime in the future -- but I need some perspective on how you would like to use the data.

@dylanbeaudette

This related issue (ncss-tech/aqp#46) comes to mind. There is a real need to accurately represent 2D horizonation.

@brownag commented May 5, 2020

More on that:

I think the vertical "scanline" (slice) approach Nic shows here is a really interesting and probably useful way of integrating lateral variability.

https://github.com/umnpedology/2D-Morphology/issues/1

@phytoclast

Andrew, in terms of how I would use the soil attributes in R, I will probably end up just rendering a weighted average of soil properties by depth to derive my interpretations. Conceptualizing the profile is probably not on my mind, but I can see this problem as a recurring issue that defies fitting into a simplistic one-dimensional (depth) data structure. My favored solution is to avoid having to enter extra sub-tables or bogus depths, and to just have overlapping depths and apportion the properties by their user-determined percentages. Where the percentages either don't add up to, or exceed, 100%, my solution (short of fixing the error in NASIS) would be to rescale the percentages proportionally to the entered values.
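A minimal sketch of that rescaling rule (illustrative only; pct stands in for the user-determined percentages of the overlapping parts):

# rescale percentages proportionally to the entered values so they sum to 100
rescale_pct <- function(pct) {
  100 * pct / sum(pct, na.rm = TRUE)
}

rescale_pct(c(60, 30))  # entered total of 90%  -> 66.7 and 33.3
rescale_pct(c(70, 50))  # entered total of 120% -> 58.3 and 41.7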

Documenting the horizontal structure in a more sophisticated way for a single pedon would still need to have a common quantifiable parameter to represent multiple pedons, or to display as a raster map. Each level of complexity is generally less important, and less likely to be recorded on a regular basis. The sweet spot is data with just enough detail to be diagnostic while having enough replicates to be useful spatially.

It reminds me of quantifying leaf shape for numerical plant taxonomy, where length and width are primary, followed by the angles of the base and tip, then the depth and angles of any teeth and lobes. Each level of description is more complex, like a fractal.

Microtopography is another issue that I see as sometimes being important to describe in horizontal and vertical terms, much as one describes waves. An even bigger one is vegetation (for me), with species cover by stratum, wherein strata may be either predetermined heights or the average live canopy heights of individuals within the strata, themselves estimated parameters.

@brownag commented May 5, 2020

Agreed. I suspected that was the main avenue you would want to take -- as I would do the same for my routine DMU / ES analyses.

In the near future I'll start tinkering around with a prototype flattening function capable of the weighted averaging of the horizon data. There will be a toggle argument to fetchNASIS to turn it on. I am thinking I might generalize it for arbitrary vertical aggregation -- not just for the overlapping case. More to come on that soon.

@dylanbeaudette

Excellent discussion. @phytoclast: are there ~10 pedons that you know of which could be used as a test set? I'd like to add them to the aqp example data so that we can develop / document / test without the need for a connection to NASIS and a populated selected set. Thanks!

@phytoclast

Dylan, I will have to dig into that. I had been mining NASIS to fill in missing pedons for some of my veg plots based on proximity, as I sometimes collected vegetation data without digging. A NASIS query spanning older pedon data sometimes jams fetchNASIS, even after the recent fixes. Otherwise, there is the one used above as an example.

@phytoclast commented May 14, 2020 via email

@brownag commented May 15, 2020

I am having some issues replicating this problem. Here is what I tried.

library(aqp)
library(soilDB)

# try your call exactly as is
fp <-  fetchNASIS(from = 'pedons', SS=FALSE, rmHzErrors=FALSE)

# I loaded just your problem pedons into my SS
f <- fetchNASIS()

pids <- c("94OH063003", "94OH063004", "1982OH065009", "1982OH065006", "1982OH065007", "1983OH065008", "1983OH065003", "1983OH065013", "1983OH065002", "1983OH065007", "1983OH065004", "1983OH065009", "S1993MI069002", "S1993MI069003", "2009MI139F190", "C1103P07-1", "C1103P07-2", "C1103P07-3", "C1103P07-4", "C1103P07-5", "S1980MI081008", "2013MI163023", "2018OH039004", "TR1999IN011SH016119", "RP1983IN159SH028001")

# in both cases, all of the above pedon IDs are in the fetchNASIS result
all(pids %in% f$pedon_id)
all(pids %in% fp$pedon_id)

Console output:

> library(soilDB)
> f <- fetchNASIS()
multiple horizontal datums present, consider using WGS84 coordinates (x_std, y_std)
mixing moist colors ... [5 of 161 horizons]
Loading required namespace: farver
replacing missing lower horizon depths with top depth + 1cm ... [2 horizons]
converting IDs from integer to character
-> QC: duplicate pedons: use `get('dup.pedon.ids', envir=soilDB.env)` for related peiid values
-> QC: pedons missing bottom hz depths: use `get('missing.bottom.depths', envir=soilDB.env)` for related pedon IDs
Warning messages:
1: some records are missing rock fragment volume, these have been removed 
2: some records are missing artifact volume, these have been removed 
> length(f)
[1] 28
> f <- fetchNASIS(rmHzErrors = F)
multiple horizontal datums present, consider using WGS84 coordinates (x_std, y_std)
mixing moist colors ... [5 of 161 horizons]
replacing missing lower horizon depths with top depth + 1cm ... [2 horizons]
converting IDs from integer to character
-> QC: duplicate pedons: use `get('dup.pedon.ids', envir=soilDB.env)` for related peiid values
-> QC: pedons missing bottom hz depths: use `get('missing.bottom.depths', envir=soilDB.env)` for related pedon IDs
Warning messages:
1: some records are missing rock fragment volume, these have been removed 
2: some records are missing artifact volume, these have been removed 
> length(f)
[1] 28
> get('dup.pedon.ids', envir=soilDB.env)
[1] "RP1983IN159SH028001" "S1980MI081008"      
> get('missing.bottom.depths', envir=soilDB.env)
[1] "1983OH065008"        "TR1999IN011SH016119"
> fp <-  fetchNASIS(from = 'pedons', SS=FALSE, rmHzErrors=FALSE)
multiple horizontal datums present, consider using WGS84 coordinates (x_std, y_std)
mixing dry colors ... [132 of 5820 horizons]
mixing moist colors ... [158 of 7732 horizons]
-> QC: some fragsize_h values == 76mm, may be mis-classified as cobbles [1755 / 13550 records]
replacing missing lower horizon depths with top depth + 1cm ... [115 horizons]
top/bottom depths equal, adding 1cm to bottom depth ... [16 horizons]
converting IDs from integer to character
-> QC: sites without pedons: use `get('sites.missing.pedons', envir=soilDB.env)` for related usersiteid values
-> QC: duplicate pedons: use `get('dup.pedon.ids', envir=soilDB.env)` for related peiid values
-> QC: horizon errors detected, use `get('bad.pedon.ids', envir=soilDB.env)` for related userpedonid values or `get('bad.horizons', envir=soilDB.env)` for related horizon designations
-> QC: pedons missing bottom hz depths: use `get('missing.bottom.depths', envir=soilDB.env)` for related pedon IDs
-> QC: equal hz top and bottom depths: use `get('top.bottom.equal', envir=soilDB.env)` for related pedon IDs
Warning messages:
1: 'the_value' has been rounded to the nearest integer. 
2: some records are missing rock fragment volume, these have been removed 
3: some records are missing artifact volume, these have been removed 
> get('sites.missing.pedons', envir=soilDB.env)
 [1] "03CA632004"        "07CA630DWB018"     "08CA630JCR005"     "T06CA630SMM014-10" "T06CA630SMM014-09" "T06CA630SMM014-07"
 [7] "T06CA630SMM014-05" "T06CA630SMM014-02" "T06CA630SMM014-03" "T09CA630CKS002-01" "T09CA630CKS002-02" "T09CA630CKS002-03"
[13] "T09CA630CKS002-04" "T09CA630CKS002-05" "T09CA630CKS002-06" "T09CA630CKS002-07" "T09CA630CKS002-08" "T09CA630CKS002-09"
[19] "T09CA630CKS002-10" "T06CA630SKC005-01" "T06CA630SKC005-03" "T06CA630SKC005-04" "T06CA630SKC005-05" "T06CA630SKC005-06"
[25] "T06CA630SKC005-07" "T06CA630SKC005-08" "T06CA630SKC005-10" "2017CA6306087N"   
> get('dup.pedon.ids', envir=soilDB.env)
 [1] "09BJM030"            "R11CA009003"         "R11CA109001"         "RP1983IN159SH028001" "S07CA009003"        
 [6] "S07CA009004"         "S07CA009005"         "S07CA009006"         "S07CA009008"         "S08CA009001"        
[11] "S08CA009002"         "S08CA009003"         "S08CA109001"         "S08CA109002"         "S09CA009001"        
[16] "S09CA009002"         "S09CA009003"         "S09CA009005"         "S09CA009006"         "S09CA009007"        
[21] "S09CA109001"         "S09CA109002"         "S09CA109003"         "S09CA109004"         "S09CA109005"        
[26] "S1980MI081008"       "S2004CA099001"       "S2007CA009001"       "S2007CA009002"       "S2007CA109001"      
[31] "S2007CA109002"       "S2007CA109004"       "S2007CA109005"       "S2007CA109006"       "S2007DWB005"        
[36] "S2007DWB007"         "S2007DWB015"         "S2008DWB020"         "S2008DWB022"         "S2009BAH003"        
[41] "S2009BAH004"         "S2009BJM027"         "S2009BJM035"         "S2009BJM052"         "S2009CKS040"        
[46] "S2009CKS043"         "S2009CKS044"         "S2009CKS049"         "S2009CKS054"         "S2009SMM006"        
[51] "S2009SMM007"         "S2009SMM009"         "S2009SMM011"         "S2009SMM014"         "S2011CA009001"      
[56] "S2011CA009002"       "S2011CA009003"       "S2011CA009004"       "S2011CA009005"       "S2011DEB094N"       
> get('bad.pedon.ids', envir=soilDB.env)
 [1] "S2007CA109002" "S2007CA109002" "S09CA009005"   "S09CA009007"   "S2007DWB007"   "S2008DWB022"   "09BJM053N"    
 [8] "10BJM044"      "09SMM016"      "2015CA6303027"
> get('missing.bottom.depths', envir=soilDB.env)
  [1] "06SMM012"            "07DWB001"            "07DWB002"            "07RJV015"            "07RJV016"           
  [6] "07RJV018"            "07RJV025"            "07RJV026"            "07RJV027"            "07RJV028"           
 [11] "07RJV033"            "07RJV037"            "07RJV131"            "07SKC004"            "07SKC007"           
 [16] "07SKC008"            "07SKC009"            "07SMM027"            "07SMM047"            "08AMS002"           
 [21] "08AMS003"            "08AMS006"            "08AMS007"            "08AMS018"            "08AMS019"           
 [26] "08AMS020"            "08BJM002"            "08BJM004"            "08SMM005"            "08SMM008"           
 [31] "09AMS002"            "09AMS003"            "09AMS008"            "09AMS009"            "09AMS011"           
 [36] "09AMS020"            "09AMS021"            "09BAH010"            "09BAH011"            "09BAH012"           
 [41] "09BJM011"            "09BJM016"            "09BJM021"            "09BJM022"            "09BJM023"           
 [46] "09BJM024"            "09BJM025"            "09BJM026"            "09BJM046"            "09BJM047"           
 [51] "09BJM092"            "09BJM128"            "09CKS007"            "09SMM001"            "09SMM002"           
 [56] "10MJE010"            "10MJE026"            "10MJE037"            "10MJE052"            "10PDM123"           
 [61] "11BJM021"            "11BJM032"            "11BJM039"            "11CKS032"            "11KWJ016"           
 [66] "11KWJ019"            "11KWJ021"            "11MJE014"            "11MJE019N"           "1983OH065008"       
 [71] "2002CA630002"        "2012CA6302014"       "2012CA6303011"       "2013CA6302002"       "2013CA6302005"      
 [76] "2013CA6302007"       "2013CA6302008"       "2013CA6302009"       "2013CA6302010"       "2013CA6302012"      
 [81] "2013CA6302023"       "2013CA6302038"       "2013CA6302045"       "2013CA6303002"       "2013CA6303006"      
 [86] "2013CA6303017"       "2013CA6303022"       "2013CA6303027"       "2013CA6303036"       "2013CA6303039"      
 [91] "2013CA6303044"       "2013CA6303045"       "2013CA6304011"       "2013CA6304012"       "2015CA6302005"      
 [96] "2015CA6302008"       "2015CA6302009"       "2015CA6302010"       "2015CA6302011"       "2015CA6302012"      
[101] "2015CA6302016"       "2015CA6302020"       "2015CA6302037"       "2017CA6306085N"      "2017CA6306092N"     
[106] "75-CA-55-023x"       "S04CA077-001"        "S04CA099-003"        "S08CA109001"         "S09CA009003"        
[111] "S2007DWB005"         "S2007DWB015"         "S2009BAH004"         "TR1999IN011SH016119"
> pids <- c("94OH063003", "94OH063004", "1982OH065009", "1982OH065006", "1982OH065007", "1983OH065008", "1983OH065003", "1983OH065013", "1983OH065002", "1983OH065007", "1983OH065004", "1983OH065009", "S1993MI069002", "S1993MI069003", "2009MI139F190", "C1103P07-1", "C1103P07-2", "C1103P07-3", "C1103P07-4", "C1103P07-5", "S1980MI081008", "2013MI163023", "2018OH039004", "TR1999IN011SH016119", "RP1983IN159SH028001")
> all(pids %in% f$pedon_id)
[1] TRUE
> all(pids %in% fp$pedon_id)
[1] TRUE

@brownag commented May 15, 2020

Also, I queried the list of user pedon IDs you provided using the Region 2 query: "Pedon/Site/Transect by pedonUserIDs (Multiple)".

As you can see in the output above, a couple of pedons in your list trigger the "duplicate" and "missing hz bottom depth" QC warnings -- but even when I run against my entire local database (which has a lot more problematic data), those pedon IDs are in the collection.

> get('dup.pedon.ids', envir=soilDB.env)
[1] "RP1983IN159SH028001" "S1980MI081008"      
> get('missing.bottom.depths', envir=soilDB.env)
[1] "1983OH065008"        "TR1999IN011SH016119"

@phytoclast commented May 15, 2020 via email

@phytoclast commented May 15, 2020 via email

@brownag commented May 19, 2020

All 22 of those site/pedon IDs have no horizon data, so they get excluded from the result. Most look like they are correlated to miscellaneous areas (which presumably actually are soils, at least in some cases).

[image: screenshot of the site/pedon records in question]

We don't currently have a fill argument set up for pedons like we do for components, but that is what you would need to have these horizonless profiles included in the fetchNASIS result.

I personally am not opposed to having fill for pedons, especially since we provide a single interface to all of the fetchNASIS sources through fetchNASIS, and it is sort of odd to offer that option for one source and not the other. I would use it frequently, as I often have note observations that do not have horizon data but might have site/taxonomic history populated, among other things.

That said, fill can cause issues so we will need to be careful about implementing it.
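For reference, the component-side behavior mentioned above looks roughly like this (a hedged sketch based on the description in this thread, not a specification of the argument):

library(soilDB)

# fill = TRUE keeps components that have no horizon records, padding them
# with empty horizon rows so they stay in the resulting SoilProfileCollection;
# at the time of this discussion there was no analogous option for pedons
fc <- fetchNASIS(from = 'components', fill = TRUE)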

@dylanbeaudette

Thanks for all of the follow-up testing. As for the fill argument: I'd prefer that be described / explored in another issue (#131) and in the general case (ncss-tech/aqp#134).

@phytoclast commented May 19, 2020 via email

@brownag added the NASIS-local label Jan 16, 2021
@brownag added this to the soilDB 3.0 milestone Jul 22, 2022