Skip to content

Flatworm is a Java library intended to allow a developer to describe the format of a flat file using an XML definition file, and then to be able to automatically read lines from that file, and have one or more beans be instantiated for each logical record (James M. Turner). This is a revitalization of the flatworm parsing library originally crea…

License

Notifications You must be signed in to change notification settings

ahenson/flatworm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

41 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

flatworm: Flat File Parser

Flatworm is a Java library intended to allow a developer to describe the format of a flat file using an XML definition file, and then to be able to automatically read lines from that file, and have one or more beans be instantiated for each logical record (original description from James M. Turner).

Requires JDK 1.8 or higher (as 4.0.0-SNAPSHOT).

Latest release

The most recent release is flatworm 3.0.2, which was released by Josh Brackett. Work on flatworm 4.0.0-SNAPSHOT is currently underway, but is in SNAPSHOT release so it is not yet available via Maven Central repository.

To add a dependency on Guava using Maven, use the following:

<dependency>
  <groupId>com.blackbear</groupId>
  <artifactId>flatworm</artifactId>
  <version>3.0.3</version>
</dependency>

The actively developed release is flatworm 4.0.0-SNAPSHOT. flatworm 4.0.0 will not be backwards compatible with as there was far too much new functionality added to maintain the original structure and support, but the change will be worth it.

Background

The flatworm project is inherited from the efforts of multiple others. The original source code was authored by James M. Turner in what appears to be 2004. The last contributions from Mr. Turner appear to be 2.0.x and from there it appears that Josh Brackett took over and iterated it to the 3.0.x releases. There are also remnants of edits from another developer, Dave Berry, but it's unclear at what point those contributions were made.

Alan Henson has since taken over the code base and has modified it to be compatible with Maven from a build and package management sense. Additionally, where applicable, code has been converted to use Java 8 constructs.

License

flatworm is open source and is licensed under the Apache 2.0 license.

Documentation

There is an *.html file in the docs folder that explains usage, but it's a bit out of date. The hope is to get this updated soon. In the mean time, have a look at the test cases for the best examples of usage. The examples are also out of date, there is a TODO item for that as well.

Work Completed

  • Added support for Maven
  • Cleaned up exceptions to have two main checked-exceptions:
  • FlatwormConfigurationException - used for indicating issues with parsing configuration data
  • FlatwormParserException - used for indicating issues with parsing the actual data
  • Removed Generics type declaration where the type could be inferred: List<String> result = new ArrayList<>();
  • Added use of the foreach construct for(String item : collection)
  • Added use of Streams and Lambdas where it mae code cleaners
  • Cleaned up some comments
  • Removed com.blackbear.flatworm.Callback - use ExceptionCallback or RecordCallback instead (moved to new callbacks package).
  • Added ability to use a JavaScript snippet to see if a line should be processed by a given Record
  • Added ability to specify ignore-field on a record-element to explicitly ignore it.
  • Changed the record-element "type" attribute to "converter-name" as that's what it's really linked to.
  • Changed the minlength/maxlength attributes for the length-ident element to min-length/max-length for consistency.
  • On Field Identity (field-ident) - added ignore-case tag to indicate whether or not case should play a factor in comparison.
  • Added support for single segment-element configurations where the child doesn't have to be a collection
  • Added support for non-delimited segment-elements - a "child" line can be a non-delimited line
  • Added line identifiers

TODOs

  • Update the "Field Guide" to reflect the latest changes and usages
  • Update the examples to reflect the latest capabilities and provide more guidance within the examples
  • Add annotations support to all for beans to define their record makeup
  • Add the ability to specify a type in addition-to or instead-of a converter
  • Introduce constants where string literals are used
  • Complete checklist for deploying production jar to Maven Central repository
  • Add default converters based upon reflection
  • Add more broad support for the script-ident so that CDATA isn't required
  • Add support for other script types and engines
  • Refactor the parsing logic from the beans that hold the configuration data
  • Add missing JavaDocs
  • Add ability for folks to write their own Identity implementations and make them annotation enabled. Right now they would have to build their own annotation configuration loader and extend the relevant parts - which is fine, but this can be done more cleanly I think.
  • Add ability for the Script Identity approach to load scripts from files
  • Add more verbose logging
  • Is the field-length attribute really needed on Field Identity? We know the length by the matching strings. They should all be the same length else the match will always fail.

About

Flatworm is a Java library intended to allow a developer to describe the format of a flat file using an XML definition file, and then to be able to automatically read lines from that file, and have one or more beans be instantiated for each logical record (James M. Turner). This is a revitalization of the flatworm parsing library originally crea…

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published