Metaschema XSLT Inspector (#73)
Several months of work building InspectorXSLT using test-driven-development.
Also includes considerable work on XSpec support including XProc and Saxon runtimes, scripts and CI/CD support.

---------

Co-authored-by: A.J. Stein <[email protected]>
wendellpiez and aj-stein-nist committed Feb 6, 2024
1 parent 2515000 commit d8c759d
Showing 77 changed files with 11,175 additions and 94 deletions.
45 changes: 45 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,45 @@
name: CI
on:
  push:
    branches:
      - main
      - dev
  pull_request: {}
env:
  JAVA_VERSION: "17"
  JAVA_DISTRIBUTION: "temurin"
jobs:
  deploy:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
        with:
          submodules: true
          fetch-depth: 0
      - uses: actions/setup-java@v3
        with:
          distribution: "${{ env.JAVA_DISTRIBUTION }}"
          java-version: "${{ env.JAVA_VERSION }}"
      - name: Run unit tests
        run: |
          make -C src unit-test
        id: unit-tests
      - name: Run integration tests
        run: |
          make -C src smoke-test
        id: integration-tests
      - name: Run specification tests
        run: |
          make -C src spec-test
        id: spec-tests

      # Publish the test summary as comment on the PR
      - name: Publish XSpec Test Results Summary
        uses: EnricoMi/publish-unit-test-result-action@8885e273a4343cd7b48eaa72428dea0c3067ea98
        if: runner.os == 'Linux'
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          check_name: XSpec Test Results
          files: "**/*_junit-report.xml"
          report_individual_runs: true
          deduplicate_classes_by_file_name: false
4 changes: 3 additions & 1 deletion .gitignore
@@ -2,7 +2,9 @@
*.xpr

# other generated files
xspec/*-result.html
**/*-xspec/*-result.html
**/xspec/*-result.html
**/*xspec*-result.html

# test outputs
src/**/test_output/
4 changes: 4 additions & 0 deletions README.md
@@ -175,6 +175,10 @@ See the [Metaschema Repository](https://github.com/usnistgov/metaschema) and its

See the Project [Wiki](https://github.com/usnistgov/metaschema-xslt/wiki) for documentation maintained on this site.

## Acknowledgements

This work, and especially the work on testing XSLT represented by this project, would have been impossible without the examples and leadership provided by persons including AG, AJS, DW and NW.

## Required outline

This page includes all the following, as described by guidelines at https://raw.githubusercontent.com/usnistgov/opensource-repo/main/README.md
54 changes: 36 additions & 18 deletions TESTING.md
@@ -16,47 +16,65 @@ If code legibility and consistency become an issue, we can consider more stringe

## *Test Everything*

TBD - a TEST EVERYTHING subroutine.
With `make`, `bash`, and Maven installed, `make test -C src` runs all the tests (tested under Ubuntu) in the `src` directory relative to the current working directory.

## Testing technologies
The top-level Makefile in this directory collects commands from Makefiles distributed throughout the repo.

You can also use `make` in isolation from the top-level testing, to focus on your application. As a developer you only need to worry about the folder containing your application, binding test tasks to the targets `smoke-test`, `spec-test` and `unit-test` as needed.

Model such a Makefile, which calls XSpec for testing XSLT, after the example `src/schema-gen/InspectorXSLT/Makefile`.

Experiment using different Makefile targets as configured in the various directories.

For any directory, `make` with no arguments should offer tips.
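
As a rough illustration of how a top-level Makefile can collect work from Makefiles elsewhere in the repo, the following shell sketch lists every subdirectory holding a `Makefile` and runs one target in each. The function names `list_make_dirs` and `run_target_everywhere` are hypothetical and not part of this repository; the actual logic lives in `src/Makefile` and `testing/make_common.mk` and may differ.

```shell
#!/bin/sh
# Sketch only: these helper names are illustrative, not part of the repository.

# Print each directory below ROOT that holds a Makefile, skipping ROOT's own.
list_make_dirs() {
  root="${1:-.}"
  find "$root" -type f -name Makefile ! -path "$root/Makefile" \
    -exec dirname {} \; | sort
}

# Run TARGET in every directory found, stopping at the first failure.
# This sketch makes no attempt to skip directories that lack the target.
run_target_everywhere() {
  target="$1"
  root="${2:-.}"
  list_make_dirs "$root" | while IFS= read -r dir; do
    make -C "$dir" "$target" || exit 1
  done
}
```

For example, `run_target_everywhere unit-test src` would attempt `make unit-test` in each directory under `src` that has its own Makefile.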

### Testing technologies

[XSpec](https://github.com/xspec/xspec/) is the preferred testing harness for XSLT in this initiative. XSpec test suites can be either self-contained, or can reference testing artifacts maintained externally. The repository contains XSpec examples demonstrating a range of usage patterns that can be applied.

Script-driven testing should rely on the same dependencies as the runtimes they test, as documented.

## Global functional testing

`src/testing` includes resources for global-level testing. This folder or its contents should not be moved or edited without fully testing *all* test runtimes, as resources inside this directory are sometimes dependencies.
### Extensions to XSpec

Do not commit anything to this folder that you do not wish to stay there indefinitely; instead, copy into a sibling (temporary) directory that can be deleted freely.
Currently we are emulating and re-engineering some specific XSpec capabilities in the [support/xspec-dev](support/xspec-dev) folder.

## Application component-level (functional) unit testing
These efforts are focused on producing and refining XSpec runtimes for various use cases and scenarios with specialized requirements faced by this project, such as arbitrary batching and iXML support. Tools we develop here are released under the same terms as Metaschema-XSLT (as open-source software).

`src/**/testing` includes (functional) testing for utilities supported in a given folder.

When developing applications, feel free to add and modify any `testing` folder or its contents within the scope of work.
## Test-driven development

Unit tests are expected to run successfully when committed - both completing, and passing all applicable tests. Keep in mind that most testing frameworks support marking tests as not applicable (in XSpec, [flag a scenario or `expect` as `pending`](https://github.com/xspec/xspec/wiki/Focusing-Your-Efforts#marking-scenario-or-expectation-as-pending)), so it is possible to write tests ahead of an implementation and still pass.
Almost all testing in this repository targets either XSLT transformations or runtimes that embed transformations.

### Test-driven development
### The approach

While this project began as an experimental proof of concept, it now aims for higher levels of assurance and confidence than are necessary or appropriate for applications intended only to produce findings regarding feasibility and levels of effort. Accordingly, our development approach has shifted from rapid prototyping to a more explicit and traceable process of design, specification and implementation.

If you touch a particular unit of code that doesn't have tests, write tests for it in the same PR as your change. If you touch a particular unit of code that has tests, update or augment them to test the change you are making. In general, push the tests ahead of the code, not the other way around, aligning the tests with [the Metaschema specification(s)](https://pages.nist.gov/metaschema/specification/) first.

This expenditure of effort prevents bugs (easier than repairing them) and guards against regression, opening opportunities to do more interesting things. So it is not so much "extra" as an investment in future stability and sustainability.

## Test applications
The approach can require changing some habits. Looking for inspiration and "striking while the iron is hot" no longer works as well (since the forge must be warmed up first). Sometimes immediate gratification has to be set aside. Yet the payoffs are substantial, and come early.

`examples` (tbd) includes top-level independent metaschema examples made for testing and demonstration.
## Global functional testing

This location is available for lightweight and <q>toy</q> applications, useful for evaluation, demonstration and learning. Fully built-out applications of Metaschema can also call this repository in as a submodule (like [OSCAL](https://github.com/usnistgov/oscal)).
`src/testing` includes resources for global-level testing. This folder or its contents should not be moved or edited without fully testing *all* test runtimes, as resources inside this directory are sometimes dependencies.

Do not commit anything to this folder that you do not wish to stay there indefinitely; instead, copy into a sibling (temporary) directory that can be deleted freely.

## Application component-level (functional) unit testing

`src/**/testing` includes (functional) testing for utilities supported in a given folder.

When developing applications, feel free to add and modify any `testing` folder or its contents within the scope of work.

Unit tests are expected to run successfully when committed - both completing, and passing all applicable tests. Keep in mind that most testing frameworks support marking tests as not applicable (in XSpec, [flag a scenario or `expect` as `pending`](https://github.com/xspec/xspec/wiki/Focusing-Your-Efforts#marking-scenario-or-expectation-as-pending)), so it is possible to write tests ahead of an implementation and still pass.

## Testing under CI/CD

Also tbd
GitHub Actions is configured in the file [.github/workflows/test.yml](.github/workflows/test.yml).

Note that since this workflow enters the `Makefile` logic from the top, `make` executes the specified subroutines recursively.

Accordingly, adding a test subroutine to a `spec-test` Makefile configuration anywhere in the repository has the effect of enabling it (turning it on) for CI/CD as well.
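
To check a change before pushing, the three `make` invocations used by the workflow can be repeated locally in the same order. The sketch below assumes it is run from the repository root; the function name `run_ci_targets` and the `MAKE_CMD` override are hypothetical conveniences, not something the repository provides.

```shell
#!/bin/sh
# Sketch only: mirrors the three test steps in .github/workflows/test.yml.
# MAKE_CMD is an illustrative override (useful for dry runs), not a repo setting.

run_ci_targets() {
  src_dir="${1:-src}"
  for target in unit-test smoke-test spec-test; do
    echo "==> make -C $src_dir $target"
    # Stop at the first failing target, as a failed CI step would.
    ${MAKE_CMD:-make} -C "$src_dir" "$target" || return 1
  done
}
```

Calling `run_ci_targets` with no arguments runs the targets against `src`. The step that publishes the XSpec summary to the pull request is specific to GitHub Actions and is not reproduced here.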

Links of interest:

- https://github.com/nkutsche/xspec-maven-plugin
- https://github.com/galtm/xslt-accumulator-tools/blob/db1c6b2a/pom.xml#L68
7 changes: 5 additions & 2 deletions src/Makefile
@@ -1,7 +1,10 @@
include testing/make_common.mk

# Each subdirectory that has a makefile
dirs:=$(dir $(wildcard ./*/Makefile))
# Each subdirectory (recursively) that has a makefile
# Makefile wildcard function does not support that, so we use the shell
# function with the find utility and look for every Makefile in a child dir
# relative to this one, but exclude this one to use with the FOREACH macro.
dirs:=$(shell find '.' ! -wholename ./Makefile -name 'Makefile' -printf "%h\n")

.PHONY: test
test: ## Run all tests
28 changes: 28 additions & 0 deletions src/schema-gen/InspectorXSLT/Makefile
@@ -0,0 +1,28 @@
include ../../testing/make_common.mk

module_path:=$(shell dirname $(realpath $(firstword $(MAKEFILE_LIST))))
output_folder:=$(module_path)/test_output
xspec_script=$(realpath $(module_path)/../../../support/xspec-dev/mvn-saxon-xspec-batch.sh)

.PHONY: test
test: unit-test smoke-test ## Run all tests

.PHONY: spec-test
spec-test: ## Run all specification-tests
	LOGFILE="$(output_folder)/inspector-functional-tests.log" $(xspec_script) \
		"folder=$(module_path)/testing/tests/inspector-functional-xspec" \
		"report-to=$(output_folder)/inspector-functional-tests_report.html" \
		"junit-to=$(output_folder)/inspector-functional-tests_junit-report.xml" \
		"recurse=yes"

.PHONY: smoke-test
smoke-test: ## Run all smoke-tests
	LOGFILE="$(output_folder)/integration-tests.log" $(xspec_script) \
		"folder=$(module_path)/testing/tests/inspector-generation-xspec" \
		"report-to=$(output_folder)/integration-tests_report.html" \
		"junit-to=$(output_folder)/integration-tests_junit-report.xml" \
		"recurse=yes"

.PHONY: clean
clean: ## Remove test output
	rm -fr $(output_folder)/*
107 changes: 107 additions & 0 deletions src/schema-gen/InspectorXSLT/TESTING.md
@@ -0,0 +1,107 @@
# Testing the XSLT Metaschema Inspector

Produced from a metaschema, an Inspector is an XSLT transformation which produces, when applied to an XML document, error and warning messages from that document respecting its conformance to the rules dictated by that metaschema.

In other words the Inspector is a Schema Emulator, as it aims to return (or expose) effectively the same information as schema validation.

Broadly, this presents at least three areas for testing (each of which has complexities):

- Generation of the Inspector XSLT from controlled (metaschema) sources
- Functionality of the Inspector XSLT
- Interfaces and outputs / runtime options for Inspector XSLT

Currently we focus on the *first two* of these, producing functional results in a simplified format that we can build to later - testing interface targets at that time as appropriate. An example would be producing HTML or Markdown reports: for now we test only generic reports in a format we control.

Find resources for testing the XSLT Inspector and its production in the [testing](testing) subdirectory.

## Model metaschemas for testing

### `current` resource cache

The testing/current directory contains copies of resources produced by the Inspector generator and related tools for testing, including:

- Current-best Inspector implementation for any metaschemas, as generated
- Current-best XSD expressing (a subset of) the same rules as the corresponding Inspector, for a metaschema module, for testing
- Current-best 'composed' metaschema instance for each testing metaschema, i.e. a fully assembled and linked metaschema module, useful for debugging

Within these artifacts, initial comments should give information regarding date of creation.

### "Computer Model" metaschema

This suite uses an extended variant of the 'computer metaschema' model by AJ Stein and team for testing, maintained here as [testing/computer_metaschema.xml](testing/computer_metaschema.xml).

The Inspector XSLT feature set can be tested by generating XSLT and schemas and creating instances (e.g. `valid` and `invalid` instances) from this metaschema, which exercise its feature set.

See the original in [the specification's repository](https://github.com/usnistgov/metaschema/blob/develop/examples/).

#### Refresh the 'computer model' XSD

Use a script such as [../mvn-xsd-schema-xsl.sh](../mvn-xsd-schema-xsl.sh) or the XSLT [../nist-metaschema-MAKE-XSD.xsl](../nist-metaschema-MAKE-XSD.xsl) to produce an XSD file for the [testing/computer_metaschema.xml](testing/computer_metaschema.xml).

This XSD should validate the same set of rules as the Inspector (excluding Metaschema query constraints) and can be used to cross check functionality. Note that this XSD is also dynamically generated and might itself have bugs or issues. (If only in principle. In reality, the schema generators are also tested both in the lab and the field.) Irrespective of this question, the requirements are that both processes (schema validation and Inspector-XSLT validation) are effectively congruent, compatible and "the same" inasmuch as they detect all the same problems in data.

A copy of the current-best schema (to be refreshed as necessary) is also here: [testing/current/computer_metaschema-xmlschema.xsd](testing/current/computer_metaschema-xmlschema.xsd)

#### Refresh the 'computer model' Inspector XSLT

Before testing the Computer Inspector XSLT, the copy kept for testing must be refreshed.

First, build `current/computer_inspector.xsl` from `computer_metaschema.xml` using `generate-inspector-xslt`:

- Use ../METASCHEMA-INSPECTOR-XSLT.xpl runtime or script to provide metaschema composition, then apply the 'generator' stylesheet to produce the Inspector XSLT
- The top-level ../nist-metaschema-MAKE-INSPECTOR-XSLT.xsl applies the same XSLT pipeline
- Either test metaschema, or any correctly tagged metaschema, can be refreshed this way
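
One way to apply the top-level stylesheet from the command line is with Saxon's standard `-s:`, `-xsl:` and `-o:` options. The sketch below only prints such a command for review rather than running it. It assumes it is invoked from the `InspectorXSLT` directory, that a Saxon jar is available at the path given in a hypothetical `SAXON_JAR` variable, and that the stylesheet needs no further parameters; none of this is confirmed by the repository, whose own scripts run Saxon through Maven.

```shell
#!/bin/sh
# Sketch only: SAXON_JAR and this function name are assumptions for illustration.
# Prints a Saxon command line that would regenerate an Inspector XSLT.

inspector_refresh_command() {
  metaschema="${1:-testing/computer_metaschema.xml}"
  output="${2:-testing/current/computer_inspector.xsl}"
  saxon_jar="${SAXON_JAR:-saxon-he.jar}"
  stylesheet="../nist-metaschema-MAKE-INSPECTOR-XSLT.xsl"
  printf 'java -jar %s -s:%s -xsl:%s -o:%s\n' \
    "$saxon_jar" "$metaschema" "$stylesheet" "$output"
}
```

Running `inspector_refresh_command` with no arguments prints a command targeting the Computer Model metaschema; pass a different source and output path to refresh another test metaschema the same way.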

### Tiny Data mini-model

An additional small metaschema is provided specifically for the purpose of isolating markup-based datatypes (`markup-line` and `markup-multiline`) in their various configurations and testing the correctness of validations of this markup (passing valid markup and reporting invalid markup).

Use it and test with it the same way as the Computer metaschema.

"Tiny data" supports term bases (controlled vocabularies) and documents using controlled terminology, using a very few tags. With a little creative extension-by-restriction it can be used for glossaries and arbitrary structured prose in a lightweight XML format supportive of further improvement, enhancement, and conversion.

## Testing the Inspector XSLT

Question: *Is the XSLT produced from a metaschema instance capable of addressing its functional requirements?*

To address this question, functional requirements can be isolated and illustrated both in standalone complete documents, and in document fragments maintained as XSpec test suites.

### Standalone document-level tests

Question: *Can test samples including nominally-valid and invalid test cases be known to be valid or invalid, as described?*

Within `testing`, `computers-valid` contains Computer Model instances expected to test as valid.

Examples within `computers-invalid`, when tested by the Computer Model Inspector (or any validator), are expected to return appropriate warnings and errors. They may be commented with notes indicating their lapses.

For testing the InspectorXSLT transformation, the XSpec file [testing/validations-in-batch.xspec](testing/validations-in-batch.xspec) runs both valid and invalid sets through the Inspector and ensures results are correct - reports for the invalid cases, no reports for the valid cases.

[An XSD schema](testing/computer_metaschema-xmlschema.xsd) can also be used to confirm validity or failure to validate for sets of examples, as given. Any other metaschema-based validator, or a metaschema-derived validation that supports XML, can also be used, such as a validator produced using [metaschema-java](https://github.com/usnistgov/metaschema-java).

Also, examples within `tinydata` may be valid or invalid to the Tiny metaschema, as indicated.
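
A cross-check of this kind can be scripted by running any validator over a folder and comparing each result with the expectation for that folder. The sketch below is one way to do so; the function name `check_samples` is hypothetical, and it assumes the samples are `.xml` files and that the validator exits with status 0 for a valid document. Since the XSD does not cover Metaschema query constraints, some documents in `computers-invalid` may still pass XSD validation.

```shell
#!/bin/sh
# Sketch only: check_samples is an illustrative helper, not part of the repository.
# Usage: check_samples valid|invalid DIRECTORY VALIDATOR [VALIDATOR-ARGS...]
# The validator is called once per file, with the file path as its last argument.

check_samples() {
  expectation="$1"
  dir="$2"
  shift 2
  failures=0
  for doc in "$dir"/*.xml; do
    [ -e "$doc" ] || continue   # directory holds no .xml files
    if "$@" "$doc" >/dev/null 2>&1; then
      result=valid
    else
      result=invalid
    fi
    if [ "$result" != "$expectation" ]; then
      echo "UNEXPECTED: $doc is $result, expected $expectation"
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

With `xmllint` installed, a call such as `check_samples valid testing/computers-valid xmllint --noout --schema testing/current/computer_metaschema-xmlschema.xsd` would report any sample in the valid set that the XSD rejects.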

### Templates and functions

Question: *How do I know a specific report is being produced correctly by Inspector XSLT for a given error condition in 'computer XML' data?*

Individual templates and defined functions can also be targeted and tested in XSpec.

XSpec tests breaking out these cases, both 'go' and 'no-go', are located in [the inspector-functional testing directory](testing/tests/inspector-functional/).

## Testing Inspector XSLT production

Question: *Is the XSLT produced from a metaschema instance correct not only with respect to its capabilities (addressing functional requirements) but also other requirements such as legibility, exception handling or post-processing features?*

To the extent that 'correctly' is currently defined, it is in reference to functionality (see above) and relevant Metaschema specifications, not to an abstract design.

However, a target for this transformation - generation of XSLT from correct Metaschema source data - can be defined and codified as a 'canonical form' of Inspector XSLT. To the extent this has been done, XSpec demonstrating conformance to the expressed requirements is given in [the inspector-generation testing directory](testing/tests/inspector-generation/).

The [Inspector XSLT Generator Pipeline](../METASCHEMA-INSPECTOR-XSLT.xpl) includes a step that applies the generated XSLT and reports a finding of `OKAY` or `ERROR` as a pipeline result (on output port `OUT_xslt-prooftest`), as a convenience.

## Testing the <q>costuming</q> post-processing pipelines

Inspector XSLT first produces MX outputs. These are further processed, first by being filtered, then into HTML and Markdown results.

These transformations can be tested. An HTML-to-Markdown XSpec could also be useful elsewhere.

If these are not already to be found among the tests, adding them remains a TODO item.