Metaschema XSLT Inspector (#73)

Several months of work building InspectorXSLT using test-driven development. Also includes considerable work on XSpec support, including XProc and Saxon runtimes, scripts and CI/CD support.

Co-authored-by: A.J. Stein <[email protected]>
usnistgov · Feb 6, 2024 · d8c759d
1 parent 2515000
commit d8c759d
Showing 77 changed files with 11,175 additions and 94 deletions.
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
@@ -0,0 +1,45 @@
+name: CI
+on:
+  push:
+    branches:
+      - main
+      - dev
+  pull_request: {}
+env:
+  JAVA_VERSION: "17"
+  JAVA_DISTRIBUTION: "temurin"
+jobs:
+  deploy:
+    runs-on: ubuntu-20.04
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          submodules: true
+          fetch-depth: 0
+      - uses: actions/setup-java@v3
+        with:
+          distribution: "${{ env.JAVA_DISTRIBUTION }}"
+          java-version: "${{ env.JAVA_VERSION }}"
+      - name: Run unit tests
+        run: |
+          make -C src unit-test
+        id: unit-tests
+      - name: Run integration tests
+        run: |
+          make -C src smoke-test
+        id: integration-tests
+      - name: Run specification tests
+        run: |
+          make -C src spec-test
+        id: spec-tests
+
+      # Publish the test summary as comment on the PR
+      - name: Publish XSpec Test Results Summary
+        uses: EnricoMi/publish-unit-test-result-action@8885e273a4343cd7b48eaa72428dea0c3067ea98
+        if: runner.os == 'Linux'
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          check_name: XSpec Test Results
+          files: "**/*_junit-report.xml"
+          report_individual_runs: true
+          deduplicate_classes_by_file_name: false
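The three test steps in the workflow above can be reproduced locally in the same order. The following is a minimal sketch, not part of the commit: the function name `run_ci_locally` and the `RUN_CI` switch are hypothetical, introduced only for illustration. By default it prints the commands rather than running them.

```shell
#!/usr/bin/env bash
set -eu

# Targets copied from the workflow, in the order the job runs them.
ci_targets="unit-test smoke-test spec-test"

# Print each make invocation; with RUN_CI=1 (hypothetical switch for this
# sketch) execute it instead. Assumes the repository root is the working directory.
run_ci_locally() {
  for target in $ci_targets; do
    if [ "${RUN_CI:-0}" = "1" ]; then
      make -C src "$target"
    else
      echo "make -C src $target"
    fi
  done
}

run_ci_locally
```

Running it as `RUN_CI=1 ./script.sh` from a checkout would invoke the same targets as the CI job, assuming `make`, `bash`, Java and Maven are installed as TESTING.md describes.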
diff --git a/.gitignore b/.gitignore
@@ -2,7 +2,9 @@
 *.xpr
 
 # other generated files
-xspec/*-result.html
+**/*-xspec/*-result.html
+**/xspec/*-result.html
+**/*xspec*-result.html
 
 # test outputs
 src/**/test_output/
diff --git a/README.md b/README.md
@@ -175,6 +175,10 @@ See the [Metaschema Repository](https://github.com/usnistgov/metaschema) and its
 
 See the Project [Wiki](https://github.com/usnistgov/metaschema-xslt/wiki) for documentation maintained on this site.
 
+## Acknowledgements
+
+This work and especially work on testing XSLT represented by this project would have been impossible without examples and leadership provided by persons including AG; AJS; DW; NW.
+
 ## Required outline
 
 This page includes all the following, as described by guidelines at https://raw.githubusercontent.com/usnistgov/opensource-repo/main/README.md

diff --git a/TESTING.md b/TESTING.md
@@ -16,47 +16,65 @@ If code legibility and consistency become an issue, we can consider more stringe
 
 ## *Test Everything*
 
-TBD - a TEST EVERYTHING subroutine.
+With `make`, `bash`, and Maven installed, `make test -C src` runs all the tests (tested under Ubuntu) in the `src` directory relative to the current working directory.
 
-## Testing technologies
+The top-level Makefile in this directory collects commands from Makefiles distributed throughout the repo.
+
+You can also run `make` within a single folder, apart from the top-level testing, to focus on your application. As a developer you only need to worry about the folder containing your application, binding test tasks to the targets `smoke-test`, `spec-test` and `unit-test` as needed.
+
+Model such a Makefile, which calls XSpec for testing XSLT, after the example `src/schema-gen/InspectorXSLT/Makefile`.
+
+Experiment using different Makefile targets as configured in the various directories.
+
+For any directory, `make` with no arguments should offer tips.
+
+### Testing technologies
 
 [XSpec](https://github.com/xspec/xspec/) is the preferred testing harness for XSLT in this initiative. XSpec test suites can be either self-contained, or can reference testing artifacts maintained externally. The repository contains XSpec examples demonstrating a range of usage patterns that can be applied.
 
 Script-driven testing should rely on the same dependencies as the runtimes they test, as documented.
 
-## Global functional testing
-
-`src/testing` includes resources for global-level testing. This folder or its contents should not to be moved or edited without fully testing *all* test runtimes, as resources inside this directory are sometimes dependencies.
+### Extensions to XSpec
 
-Do not commit anything to this folder that you do not wish to stay there indefinitely; instead, copy into a sibling (temporary) directory that can be deleted freely.
+Currently we are emulating and re-engineering some specific XSpec capabilities in the [support/xspec-dev](support/xspec-dev) folder.
 
-## Application component-level (functional) unit testing
+These efforts are focused on producing and refining XSpec runtimes for various use cases and scenarios with specialized requirements faced by this project, such as arbitrary batching and iXML support. Tools we develop here are released under the same terms as Metaschema-XSLT (as open-source software).
 
-`src/**/testing` includes (functional) testing for utilities supported in a given folder.
 
-When developing applications, feel free to add and modify any `testing` folder or its contents within the scope of work.
+## Test-driven development
 
-Unit tests are expected to run successfully when committed - both completing, and passing all applicable tests. Keep in mind that most testing frameworks support marking tests as not applicable (in XSpec, [flag a scenario or `expect` as `pending`](https://github.com/xspec/xspec/wiki/Focusing-Your-Efforts#marking-scenario-or-expectation-as-pending)), so it is possible to write tests ahead of an implementation and still pass.
+Almost all testing in this repository falls into the category of either XSLT transformations, or runtimes that embed transformations.
 
-### Test-driven development
+### The approach
 
 While this project began as an experimental proof of concept, it now aims for higher levels of assurance and confidence than are necessary or appropriate for applications intended only to produce findings regarding feasibility and levels of effort. Accordingly, our development approach has shifted from rapid prototyping to a more explicit and traceable process of design, specification and implementation.
 
 If you touch a particular unit of code that doesn't have tests, write tests for it in the same PR as your change. If you touch a particular unit of code that has tests, update or augment them to test the change you are making. In general, push the tests ahead of the code, not the other way around, aligning the tests with [the Metaschema specification(s)](https://pages.nist.gov/metaschema/specification/) first.
 
 This expenditure of effort prevents bugs (easier than repairing them) and guards against regression, opening opportunities to do more interesting things. So it is not so much "extra" as an investment in future stability and sustainability.
 
-## Test applications
+The approach can require changing some habits. Looking for inspiration and "striking while the iron is hot" no longer works as well (since the forge must be warmed up first). Sometimes immediate gratification has to be set aside. Yet the payoffs are substantial, and come early.
 
-`examples` (tbd) includes top-level independent metaschema examples made for testing and demonstration.
+## Global functional testing
 
-This location is available for lightweight and <q>toy</q> applications, useful for evaluation, demonstration and learning. Fully built-out applications of Metaschema can also call this repository in as a submodule (like [OSCAL](https://github.com/usnistgov/oscal)).
+`src/testing` includes resources for global-level testing. This folder or its contents should not be moved or edited without fully testing *all* test runtimes, as resources inside this directory are sometimes dependencies.
+
+Do not commit anything to this folder that you do not wish to stay there indefinitely; instead, copy into a sibling (temporary) directory that can be deleted freely.
+
+## Application component-level (functional) unit testing
+
+`src/**/testing` includes (functional) testing for utilities supported in a given folder.
+
+When developing applications, feel free to add and modify any `testing` folder or its contents within the scope of work.
+
+Unit tests are expected to run successfully when committed - both completing, and passing all applicable tests. Keep in mind that most testing frameworks support marking tests as not applicable (in XSpec, [flag a scenario or `expect` as `pending`](https://github.com/xspec/xspec/wiki/Focusing-Your-Efforts#marking-scenario-or-expectation-as-pending)), so it is possible to write tests ahead of an implementation and still pass.
 
 ## Testing under CI/CD
 
-Also tbd
+GitHub Actions is configured in the file [.github/workflows/test.yml](.github/workflows/test.yml).
+
+Note that since the workflow enters the `Makefile` logic from the top (`make -C src`), `make` executes the specified targets recursively.
+
+Accordingly, adding a test subroutine to a `spec-test` Makefile target anywhere in the repository has the effect of enabling it (turning it on) for CI/CD as well.
 
-Links of interest: 
 
-- https://github.com/nkutsche/xspec-maven-plugin
-- https://github.com/galtm/xslt-accumulator-tools/blob/db1c6b2a/pom.xml#L68
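TESTING.md states that `make` with no arguments should offer tips, and the Makefiles in this commit annotate targets with `## ` comments. How `testing/make_common.mk` produces those tips is not shown in this diff. One common way to build such a listing, sketched here as an assumption rather than as the project's implementation, is to extract the annotated targets with `awk` (the function name `list_targets` and the sample Makefile are hypothetical):

```shell
#!/usr/bin/env bash
set -eu

# A throwaway Makefile using the same "target: ## description" convention.
demo_makefile="$(mktemp)"
cat > "$demo_makefile" <<'EOF'
.PHONY: spec-test
spec-test: ## Run all specification-tests

.PHONY: clean
clean: ## Remove test output
EOF

# Print "target  description" for every line of the form "name: ... ## text".
# Lines such as ".PHONY: ..." are skipped because they do not start with a
# plain target name followed by a "## " annotation.
list_targets() {
  awk -F ':.*## ' '/^[A-Za-z0-9_-]+:.*## / { printf "%-12s %s\n", $1, $2 }' "$1"
}

help_text="$(list_targets "$demo_makefile")"
echo "$help_text"

rm -f "$demo_makefile"
```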
diff --git a/src/Makefile b/src/Makefile
@@ -1,7 +1,10 @@
 include testing/make_common.mk
 
-# Each subdirectory that has a makefile
-dirs:=$(dir $(wildcard ./*/Makefile))
+# Each subdirectory (recursively) that has a makefile
+# Makefile wildcard function does not support that, so we use the shell
+# function with the find utility to look for every Makefile in a child dir
+# relative to this one, excluding this one, for use with the FOREACH macro.
+dirs:=$(shell find '.' ! -wholename ./Makefile -name 'Makefile' -printf "%h\n")
 
 .PHONY: test
 test: ## Run all tests
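The effect of the new `find` expression can be checked outside of `make`. The sketch below builds a throwaway directory tree (the directory names are hypothetical) and runs the same expression. It assumes GNU `find`, since `-printf` is a GNU extension.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway tree standing in for src/: a top-level Makefile plus nested ones.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/a/b" "$demo_root/c"
touch "$demo_root/Makefile" "$demo_root/a/Makefile" "$demo_root/a/b/Makefile"
touch "$demo_root/c/README.md"   # a directory without a Makefile is ignored

# Same expression as in src/Makefile: every Makefile except ./Makefile,
# printing only the containing directory (%h).
dirs="$(cd "$demo_root" && find '.' ! -wholename ./Makefile -name 'Makefile' -printf "%h\n" | sort)"
echo "$dirs"
# prints:
# ./a
# ./a/b

rm -rf "$demo_root"
```

Unlike the previous `$(wildcard ./*/Makefile)`, which only saw direct children, this also picks up `./a/b`, two levels down.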

diff --git a/src/schema-gen/InspectorXSLT/Makefile b/src/schema-gen/InspectorXSLT/Makefile
@@ -0,0 +1,28 @@
+include ../../testing/make_common.mk
+
+module_path:=$(shell dirname $(realpath $(firstword $(MAKEFILE_LIST))))
+output_folder:=$(module_path)/test_output
+xspec_script=$(realpath $(module_path)/../../../support/xspec-dev/mvn-saxon-xspec-batch.sh)
+
+.PHONY: test
+test: unit-test smoke-test ## Run all tests
+
+.PHONY: spec-test
+spec-test: ## Run all specification-tests
+	LOGFILE="$(output_folder)/inspector-functional-tests.log" $(xspec_script) \
+		"folder=$(module_path)/testing/tests/inspector-functional-xspec" \
+		"report-to=$(output_folder)/inspector-functional-tests_report.html" \
+		"junit-to=$(output_folder)/inspector-functional-tests_junit-report.xml" \
+		"recurse=yes"
+
+.PHONY: smoke-test
+smoke-test: ## Run all smoke-tests
+	LOGFILE="$(output_folder)/integration-tests.log" $(xspec_script) \
+		"folder=$(module_path)/testing/tests/inspector-generation-xspec" \
+		"report-to=$(output_folder)/integration-tests_report.html" \
+		"junit-to=$(output_folder)/integration-tests_junit-report.xml" \
+		"recurse=yes"
+
+.PHONY: clean
+clean: ## Remove test output
+	rm -fr $(output_folder)/*
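The `junit-to` file names in this Makefile end in `_junit-report.xml`, which is what the `files: "**/*_junit-report.xml"` setting in the CI workflow looks for. The sketch below illustrates that naming link on a throwaway tree. It approximates the workflow's glob with `find`, which is an assumption: the publishing action's own glob matching is not reproduced here.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway tree standing in for the repository; only the file names matter.
demo_root="$(mktemp -d)"
out_dir="$demo_root/src/schema-gen/InspectorXSLT/test_output"
mkdir -p "$out_dir"
touch "$out_dir/inspector-functional-tests_junit-report.xml" \
      "$out_dir/integration-tests_junit-report.xml" \
      "$out_dir/integration-tests_report.html"

# Approximate "**/*_junit-report.xml": any depth, name ending in _junit-report.xml.
junit_reports="$(cd "$demo_root" && find . -type f -name '*_junit-report.xml' | sort)"
echo "$junit_reports"

rm -rf "$demo_root"
```

Both JUnit reports are listed while the HTML report is not, so a new Makefile elsewhere in the repository would need to keep the `_junit-report.xml` suffix for its results to appear in the published summary.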
diff --git a/src/schema-gen/InspectorXSLT/TESTING.md b/src/schema-gen/InspectorXSLT/TESTING.md
@@ -0,0 +1,107 @@
+# Testing the XSLT Metaschema Inspector
+
+Produced from a metaschema, an Inspector is an XSLT transformation which, when applied to an XML document, produces error and warning messages reporting on that document's conformance to the rules dictated by that metaschema.
+
+In other words, the Inspector is a Schema Emulator, as it aims to return (or expose) effectively the same information as schema validation.
+
+Broadly, this presents at least three areas for testing (each of which has complexities):
+
+- Generation of the Inspector XSLT from controlled (metaschema) sources
+- Functionality of the Inspector XSLT
+- Interfaces and outputs / runtime options for Inspector XSLT
+
+Currently we focus on the *first two* of these, producing functional results in a simplified format that we can build on later, testing interface targets at that time as appropriate. An example would be producing HTML or Markdown reports: for now we test only generic reports in a format we control.
+
+Find resources for testing the XSLT Inspector and its production in the [testing](testing) subdirectory.
+
+## Model metaschemas for testing
+
+### `current` resource cache
+
+The testing/current directory contains copies of resources produced by the Inspector generator and related tools for testing, including:
+
+- Current-best Inspector implementation for any metaschemas, as generated
+- Current-best XSD expressing (a subset of) the same rules as the corresponding Inspector, for a metaschema module, for testing
+- Current-best 'composed' metaschema instance for each testing metaschema, i.e. a fully assembled and linked metaschema module, useful for debugging
+
+Within these artifacts, initial comments should give information regarding date of creation.
+
+### "Computer Model" metaschema
+
+This suite uses an extended variant of the 'computer metaschema' model by AJ Stein and team for testing, maintained here as [testing/computer_metaschema.xml](testing/computer_metaschema.xml).
+
+The Inspector XSLT feature set can be tested by generating XSLT and schemas and creating instances (e.g. `valid` and `invalid` instances) from this metaschema, which exercise its feature set.
+
+See the original in [the specification's repository](https://github.com/usnistgov/metaschema/blob/develop/examples/).
+
+#### Refresh the 'computer model' XSD
+
+Use a script such as [../mvn-xsd-schema-xsl.sh](../mvn-xsd-schema-xsl.sh) or the XSLT [../nist-metaschema-MAKE-XSD.xsl](../nist-metaschema-MAKE-XSD.xsl) to produce an XSD file for the [testing/computer_metaschema.xml](testing/computer_metaschema.xml).
+
+This XSD should validate the same set of rules as the Inspector (excluding Metaschema query constraints) and can be used to cross check functionality. Note that this XSD is also dynamically generated and might itself have bugs or issues. (If only in principle. In reality, the schema generators are also tested both in the lab and the field.) Irrespective of this question, the requirements are that both processes (schema validation and Inspector-XSLT validation) are effectively congruent, compatible and "the same" inasmuch as they detect all the same problems in data.
+
+A copy of the current-best schema is also here (to be refreshed as necessary): [testing/current/computer_metaschema-xmlschema.xsd](testing/current/computer_metaschema-xmlschema.xsd)
+
+#### Refresh the 'computer model' Inspector XSLT
+
+Before testing the Computer Inspector XSLT, the copy kept for testing must be refreshed.
+
+First, build `current/computer_inspector.xsl` from `computer_metaschema.xml` using `generate-inspector-xslt`:
+
+  - Use ../METASCHEMA-INSPECTOR-XSLT.xpl runtime or script to provide metaschema composition, then apply the 'generator' stylesheet to produce the Inspector XSLT
+  - The top-level ../nist-metaschema-MAKE-INSPECTOR-XSLT.xsl applies the same XSLT pipeline
+  - Either test metaschema, or any correctly tagged metaschema, can be refreshed this way
+
+### Tiny Data mini-model
+
+An additional small metaschema is provided specifically for the purpose of isolating markup-based datatypes (`markup-line` and `markup-multiline`) in their various configurations and testing the correctness of validations of this markup (passing valid markup and reporting invalid markup).
+
+Use it and test with it the same way as the Computer metaschema.
+
+"Tiny data" supports term bases (controlled vocabularies) and documents using controlled terminology, using a very few tags. With a little creative extension-by-restriction it can be used for glossaries and arbitrary structured prose in a lightweight XML format supportive of further improvement, enhancement, and conversion.
+
+## Testing the Inspector XSLT
+
+Question: *Is the XSLT produced from a metaschema instance capable of addressing its functional requirements?*
+
+To address this question, functional requirements can be isolated and illustrated both in standalone complete documents, and in document fragments maintained as XSpec test suites.
+
+### Standalone document-level tests
+
+Question: *Can test samples including nominally-valid and invalid test cases be known to be valid or invalid, as described?*
+
+Within `testing`, `computers-valid` contains Computer Model instances expected to test as valid.
+
+Examples within `computers-invalid`, when tested by the Computer Model Inspector (or any validator), are expected to return appropriate warnings and errors. They may be commented with notes indicating their lapses.
+
+For testing the InspectorXSLT transformation, the XSpec file [testing/validations-in-batch.xspec](testing/validations-in-batch.xspec) runs both valid and invalid sets through the Inspector and ensures results are correct - reports for the invalid cases, no reports for the valid cases.
+
+[An XSD schema](testing/computer_metaschema-xmlschema.xsd) can also be used to confirm validity or failure to validate for sets of examples, as given. Any other metaschema-based validator, or a metaschema-derived validation that supports XML, can also be used, such as a validator produced using [metaschema-java](https://github.com/usnistgov/metaschema-java).
+
+Also, examples within `tinydata` may be valid or invalid to the Tiny metaschema, as indicated.
+
+### Templates and functions
+
+Question: *How do I know a specific report is being produced correctly by Inspector XSLT for a given error condition in 'computer XML' data?*
+
+Individual templates and defined functions can also be targeted and tested in XSpec.
+
+XSpec tests breaking out these cases, both 'go' and 'no-go', are located in [the inspector-functional testing directory](testing/tests/inspector-functional/).
+
+## Testing Inspector XSLT production
+
+Question: *Is the XSLT produced from a metaschema instance correct not only with respect to its capabilities (addressing functional requirements) but also other requirements such as legibility, exception handling or post-processing features?*
+
+To the extent that 'correctly' is currently defined, it is in reference to functionality (see above) and relevant Metaschema specifications, not to an abstract design.
+
+However, a target for this transformation - generation of XSLT from correct Metaschema source data - can be defined and codified as a 'canonical form' of Inspector XSLT. To the extent this has been done, XSpec demonstrating conformance to the expressed requirements is given in [the inspector-generation testing directory](testing/tests/inspector-generation/).
+
+The [Inspector XSLT Generator Pipeline](../METASCHEMA-INSPECTOR-XSLT.xpl) includes a step that applies the generated XSLT and reports a finding of `OKAY` or `ERROR` as a pipeline result (on output port `OUT_xslt-prooftest`), as a convenience.
+
+## Testing the <q>costuming</q> post-processing pipelines
+
+Inspector XSLT first produces MX outputs. These are further processed, first by being filtered, then into HTML and Markdown results.
+
+These transformations can be tested. An HTML-to-Markdown XSpec could also be useful elsewhere.
+
+If tests for these are not already to be found among the tests, adding them remains a TODO item.