JSON string vs markup-line or markup-multiline #2007

Open
swanky-oscal opened this issue Apr 11, 2024 · 7 comments

@swanky-oscal

Question

It appears that all instances of markup-line and markup-multiline in the JSON model are converted to strings in the schema.

For instance the metadata title element references markup-line in the v1.1.2 reference model.
And the JSON metaschema defines MarkupLineDatatype. However complete-schema.json in the v1.1.2 release defines #assembly_metadata:title as a String.

So, my question is two part:

  1. Why does the published JSON schema replace markup-line and markup-multiline with String?
  2. Should this be clearly explained in the guidance? (Apologies if it already is; I just can't find it.)

@iMichaela
Contributor

@swanky-oscal - JSON was not designed for human readability and string was the best approach our team came up with. One can craft the string in such a way that accomplishes their expected output. If you have a proposal for a better representation, we are listening. The reason for not explicitly documenting it was our wrong assumption that all our OSCAL developers working with JSON will understand the reason for this choice.

@swanky-oscal
Author

@iMichaela, thanks for your response! I'm thinking of this from a strongly typed schema perspective. Many of the metaschema types are represented in JSON as string. But being able to type the string as a date-time vs base64 vs markup-line vs string is useful for validation.

I am developing a Rust based OSCAL schema lib. So, strong typing is pretty core. The standard JSON serialization crate (serde, serde_json) is very well suited for managing the validation at a schema level. So I will just type the appropriate metaschema elements as MarkupLineDatatype and MarkupMultilineDatatype anyways. But it would be very cool if complete-schema.json helped me with this.
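To make the strong-typing point concrete, here is a minimal, dependency-free sketch of what typing these values separately from a plain string could look like. The type names `MarkupLine` and `MarkupMultiline`, the `parse`/`new` constructors, and the rule that a markup-line may not contain a line break are all assumptions made for illustration; none of them come from the OSCAL schema or from an existing crate.

```rust
/// Hypothetical wrapper for a metaschema `markup-line` value.
/// In JSON it is still a string; the wrapper only records the intended type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MarkupLine(String);

/// Hypothetical wrapper for a metaschema `markup-multiline` value.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MarkupMultiline(String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MarkupError {
    /// Assumed rule for this sketch: a markup-line holds a single line.
    ContainsLineBreak,
}

impl MarkupLine {
    /// Accepts the raw JSON string only if it has no line breaks.
    pub fn parse(raw: &str) -> Result<MarkupLine, MarkupError> {
        if raw.contains('\n') || raw.contains('\r') {
            Err(MarkupError::ContainsLineBreak)
        } else {
            Ok(MarkupLine(raw.to_string()))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl MarkupMultiline {
    /// Multiline markup is accepted as-is in this sketch.
    pub fn new(raw: &str) -> MarkupMultiline {
        MarkupMultiline(raw.to_string())
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Because the two wrappers are distinct types, a function that expects a `MarkupLine` cannot be handed a `MarkupMultiline` or a bare `String` by accident, which is the kind of check a schema that only says `"type": "string"` cannot drive.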

Incidentally, I wrote a simple code generator that generates Rust code from complete-schema.json. Very rudimentary. But it helps to show how I could take advantage of markup-multiline being included in the metaschema data types.
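As a rough illustration of why the datatype reference matters to a generator like this, the sketch below maps a schema reference to the Rust type a generator might emit for a field. It is not the generator mentioned above. The function names, the `#/definitions/...` reference layout, and the emitted type names are assumptions for the example.

```rust
/// Maps a schema reference to the Rust type name a generator might emit.
/// The reference layout and the emitted names are assumptions of this sketch.
pub fn rust_type_for_ref(reference: &str) -> String {
    // Keep only the last path segment, e.g. "MarkupLineDatatype".
    let name = reference.rsplit('/').next().unwrap_or(reference);
    match name {
        "MarkupLineDatatype" => "MarkupLine".to_string(),
        "MarkupMultilineDatatype" => "MarkupMultiline".to_string(),
        "StringDatatype" => "String".to_string(),
        // Unknown datatypes keep their schema name.
        other => other.to_string(),
    }
}

/// Emits one struct field line. Without a reference, the only information
/// left is `"type": "string"`, so the field falls back to `String`.
pub fn field_line(field: &str, reference: Option<&str>) -> String {
    let ty = match reference {
        Some(r) => rust_type_for_ref(r),
        None => "String".to_string(),
    };
    format!("    pub {}: {},", field, ty)
}
```

With a reference present, `field_line("title", Some("#/definitions/MarkupLineDatatype"))` yields a `MarkupLine` field; with `None`, the same field can only be typed as `String`, which is the situation described in this issue.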

@iMichaela
Contributor

@swanky-oscal - Very exciting to learn of your effort. Please note that the JSON schema also falls short of documenting the current constraints. I hope you are also aware of @gborough's work: ROSCAL :) See #1986

@wendellpiez
Contributor

wendellpiez commented May 1, 2024

@swanky-oscal this scrutiny is actually timely. Over in another repo we are looking at a different JSON Schema bug. If there is an obvious enhancement to the schema suggested by your observation, we could fold that in with the repair.

Until we look harder I guess we won't know, but suggestions are welcome (the more concrete the better).

Of course, the fundamental problem is that the "Markdown datatypes" (markup-line and markup-multiline), or so they are in the JSON, are not exactly easy either to specify or to validate (even if only lexically). Where we are stuck, we essentially try to follow the 'do no harm' rule and pass the problem along. This doesn't mean we can't do better (either validating the data, or simply providing hooks), and community suggestions have already led to improvements in this schema. (Same goes for other artifacts, of course.)

If we can take this to usnistgov/metaschema-xslt#105 (or to https://github.com/usnistgov/metaschema-xslt/pulls/108, addressing it) perhaps we could close it here?

@wendellpiez
Contributor

Having looked again I am not sure the root of the problem is simply that JSON Schema syntax doesn't give us enough 'play' to express what we want. We want to describe the field being defined, but it also may have a datatype (whether a markup type or other) that reduces to a string (if that's the most the JSON Schema can say) or is otherwise 'bobbled'.

One workaround would be to extend the use of description to include the designated type.

So

    "title": {
      "title": "Part Title",
      "description": "An optional name given to the part, which may be used by a tool for display and navigation. [MarkupLineDatatype]",
      "type": "string"
    },

Then at least the info would be there in an annotation where it could be found.
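If the datatype were recorded this way, a consumer could recover it with a small amount of string handling. The following is a sketch under that assumption only: `datatype_hint` is a hypothetical helper, and the convention of a trailing `[Name]` tag is the workaround floated above, not something the published schema is known to do.

```rust
/// Returns the datatype named in a trailing `[Name]` tag of a description.
/// Assumes the tag, if present, is the last thing in the description and
/// that the name is purely alphanumeric.
pub fn datatype_hint(description: &str) -> Option<&str> {
    let trimmed = description.trim_end();
    // The description must end with ']' for a tag to be present.
    let inner = trimmed.strip_suffix(']')?;
    // Find the matching opening bracket, searching from the end.
    let open = inner.rfind('[')?;
    let name = &inner[open + 1..];
    if !name.is_empty() && name.chars().all(|c| c.is_ascii_alphanumeric()) {
        Some(name)
    } else {
        None
    }
}
```

A description without a trailing tag returns `None`, so a tool would fall back to treating the field as a plain string. The obvious weakness of this design is that it overloads human-readable text with machine-readable data, so any edit to the description can silently drop the type.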

I am looking at https://json-schema.org/draft-07/draft-handrews-json-schema-validation-01 for JSON Schema vocabulary.

@wendellpiez
Contributor

Actually, let me retract that ... there is indeed something more going on here, inasmuch as, although some data types are handled by the defined types (StringDataType), others (notably MarkupLineDatatype) appear to be falling through a crack.

Thanks @swanky-oscal for picking this up. It can take some digging to see, and there is almost certainly an improvement to be made to the schema generation.

@wendellpiez
Contributor

A correction is online in a working branch and has been lightly tested, so this bug should go away.

Status: Needs Triage