Proj4j should not use the Apache License if it contains the EPSG data set #90

Closed
julianhyde opened this issue Nov 26, 2022 · 16 comments

@julianhyde

Proj4j should not use the Apache License while it continues to contain the EPSG data set. The licensing terms are not compatible. Use of the EPSG data set implies the user's acceptance of EPSG's conditions (such as 'no commercial use'), but the Apache License implies that there are no such conditions.

You can't say 'the users should have read the fine print', because the whole point of the Apache License is that there is no fine print.

In Apache Calcite we used Proj4j, and took at face value Proj4j's assertion that it was under the Apache license. In so doing, we have placed our users in legal jeopardy because they may have used Calcite's ST_Transform function for commercial purposes. See CALCITE-5399 for more details.

I think it would be irresponsible for the authors of Proj4j to continue to license under the Apache License. Everyone who uses Proj4j is getting a time bomb, and may not know it. (Sure, it sucks that EPSG is not available under an open source license. But you can't hide that fact under a layer of software abstraction.)

May I suggest the following solution: split the EPSG data into a separate library (jar file) that is NOT under an open source license, say proj4j-data. Keep the rest of the library the same, but make the dependency optional. If the user explicitly downloads proj4j-data and places it on the classpath (thereby signaling their consent to the terms), then proj4j will work as normal. If they do not, proj4j will work in a reduced capacity. Maybe that capacity is massively reduced, I don't know.
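The optional-dependency idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Proj4j's actual mechanism: the class name `OptionalEpsgData` and the marker resource `proj4j-data/epsg.marker` are hypothetical, and the only real API used is the JDK's `ClassLoader.getResource`.

```java
import java.net.URL;

// Hypothetical helper: decides whether the optional EPSG data jar is present.
final class OptionalEpsgData {
    // Assumed marker file that a separate data jar would ship; not a real Proj4j path.
    static final String MARKER = "proj4j-data/epsg.marker";

    // Returns true if the named resource is visible on the classpath.
    static boolean isOnClasspath(String resource) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        URL url = cl.getResource(resource);
        return url != null;
    }

    // "full" when the user has added the data jar, "reduced" otherwise.
    static String mode(String resource) {
        return isOnClasspath(resource) ? "full" : "reduced";
    }

    public static void main(String[] args) {
        System.out.println("EPSG lookup mode: " + mode(MARKER));
    }
}
```

The point of the design is that nothing under the EPSG terms ships with the core jar; the user opts in by adding the data jar themselves.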

Another possible solution: in the discussion on CALCITE-5399, Martin Desruisseaux suggests that Proj4j could implement GeoAPI interfaces. If so, downstream projects like Calcite could depend on only those interfaces, and the user could download Proj4j and "plug it in" as an implementation of those interfaces. Calcite would ship with a 'dumb' implementation of those interfaces that did not use the EPSG data set, but would contain instructions on its web site for people to download Proj4j if they agreed with the EPSG conditions.

@julianhyde changed the title from "Proj4j should use the Apache License if it contains the EPSG data set" to "Proj4j should not use the Apache License if it contains the EPSG data set" on Nov 26, 2022
@echeipesh
Contributor

Catching up with the discussion in the links you have posted. For now here are the links to the IP review tickets under which EPSG datasets have been approved by Eclipse legal review:

Name: EPSG Geodetic Parameter Dataset
Version: 7.9
Description: The EPSG database provides common identifiers for measurements
(size and shape of the earth), mathematical models (of the shape of the earth),
common reference points (such as the prime meridian) and units (of distance)
required for the use of geospatial information.

To facilitate communication between GIS systems (and alleviate a long history
of loss of property and life) these identifiers and definitions are required
for any and all interoperability between systems. To go further these
identifiers and definitions are required to responsibly use any and all data
published or information captured.

These definitions are often the responsibility of individual industry, national
and international authorities. The assignment of an identifier is the
responsibility of the International Association of Oil & Gas Producers (OGP).
Formerly European Petroleum Standards Groups (EPSG).

This standards body publishes their registry as an access database, with terms
permitting commercial use.

The terms restrict "distribution for profit" for the identifiers as they form
a freely available standard the wide distribution of which is in the interest
of safety. Distribution in a commercial context requires that value charged be
based on the software, and not by virtue of including this freely available
dataset.

The identifiers and definitions are used by a GIS system in a similar fashion
as Metric or Imperial systems of measurements provided by the UOMo project.

Cryptography: No

License(s): Historical Permission Notice and Disclaimer
Other
Project URL: http://www.epsg-registry.org/

My reading of the above is that, precisely because the projects themselves add functionality (coordinate reprojection) and do not try to extract commercial value from the database itself, they are within the terms of use for EPSG. This would also necessarily be the case for any project that uses Proj4J or GeoMesa as a dependency, since it would be adding further functionality.

Also, paging @jodygarnett and @jnh5y for deeper answers and insights.

@desruisseaux

desruisseaux commented Nov 26, 2022

Yes, PROJ4J is allowed to bundle the EPSG dataset. This is not the issue. The issue is that the license shall be Apache + EPSG Terms of Use, not Apache alone. Then, it is up to projects using PROJ4J to decide what to do with the "EPSG Terms of Use" part of the license. In the particular case of the Apache Software Foundation, this is classified as Category X, which means that EPSG data can not be part of official Apache releases (but can be part of releases made by users of Apache releases).

Eclipse has approved the use of the EPSG Terms of Use in PROJ4J, indeed maybe on the basis that PROJ4J adds value that frees it from the "no commercial use" clause. However, this is an authorization to use, not an authorization to relicense. Nobody can relicense except the copyright owner.

In summary the issues are:

  • With PROJ4J in its current form, license shall be Apache + EPSG Terms of Use, not Apache alone (same argument applies to PROJ).
  • For organisations that cannot accept the EPSG Terms of Use part, it would be convenient to separate the EPSG data from the software so that they can take only the part under the Apache license. Then we can let users add the part under the EPSG Terms of Use themselves if they want.

@jnh5y

jnh5y commented Nov 28, 2022

From what I can see, @julianhyde and @desruisseaux are correct here.

Eclipse's permission is to depend on the EPSG dataset. Eclipse cannot give permission to relicense the dataset, so the project should either update the license to reflect the current situation or as suggested spin out the data into a separate jar which can be used or not as desired.

In terms of interfaces, the GeoAPI interfaces are an option. One could also get the EPSG data from various GeoTools jars (which admittedly are LGPL licensed + EPSG; I imagine Apache projects would be in the same situation of needing to provide scripts to get them without bundling the jars themselves).

@pomadchin
Member

pomadchin commented Nov 28, 2022

Just to clarify, what portion of these files falls under the EPSG license? All of them?

I see no technical issues in splitting the project into two: proj4j and proj4j-epsg.

@desruisseaux

Files containing EPSG data are listed below. This can be verified by entering a code (first column) at https://epsg.org/ and comparing the result with the data in the CSV file. A different tab needs to be selected depending on the file; the tab name is given after each file name.

  • src/main/resources/ellipsoid.csv: Ellipsoids tab.
  • src/main/resources/gcs.csv: CRS tab.
  • src/main/resources/gdal_datum.csv: Datums tab. Despite the name, I see EPSG data in it. Maybe GDAL applied some modifications or additions.
  • src/main/resources/pcs.csv: CRS tab.
  • src/main/resources/prime_meridian.csv: Prime Meridians tab.
  • src/main/resources/projop_wparm.csv: Conversions tab. Seems to be a mix of EPSG data and data from other sources.
  • src/main/resources/unit_of_measure.csv: Units tab.
  • src/main/resources/wkt/epsg.properties: same data as above but encoded differently.

I'm not sure what src/main/resources/pcs.override.csv is. But it seems to be a few modifications applied to pcs.csv, so it may be safer to keep it together with pcs.csv. The same would apply to gcs.override.csv, but since that file seems practically empty anyway, it would not matter where it is located.

Regarding GeoTools, the situation is the same. The license shall be (whatever GeoTools chooses) + EPSG Terms of Use, unless they provide the EPSG data in a separate JAR. The same shall apply to PROJ (the C/C++ library) as well.

@julianhyde
Author

If you are able to remove all EPSG files from the Proj4J jar file, as in @pomadchin's PR, I believe that would solve Calcite's problems.

Thank you for finding a solution. As soon as there is a Proj4J release available with the EPSG files removed we would love to incorporate it in a Calcite release.

I apologize if my words were harsh. I know that we are all acting in good faith, doing our best to make awesome software available under a permissive open source license.

@pomadchin
Member

pomadchin commented Nov 30, 2022

@julianhyde I'm waiting for some extra thumbs up, your review is very much appreciated as well!

I'll cut a release once we have all signed off on the PR 👍

@jodygarnett

This issue is a real pain, as the above is data available under one of the first open data licenses, with the key distinction that additions should not claim to be from the EPSG authority (many national governments make additional codes; the Open Geospatial Consortium, for example, defines one called CRS:84 for WGS84 in lon/lat order).

Consider this a data license; and not a software license?

@jodygarnett

jodygarnett commented Nov 30, 2022

The EPSG "no commercial use" restriction is not strictly true: the goal is to include this dataset in as many GIS applications as possible. They just do not want you charging your customers additional money for the inclusion of the dataset (which you obtained for free).

Aside: we had a difficult time and got an exemption from the Eclipse Foundation to distribute these files. One reason it is difficult is that the license is so old that it does not have a lot in common with modern data licenses.

@desruisseaux

Indeed, this is a kind of data license. But this issue does not question the right to distribute EPSG data. It just says that any software distributing EPSG data must include the EPSG Terms of Use in its list of licenses. Because not every software foundation accepts those Terms of Use (Apache does not at this time), the ability to separate the EPSG data is a convenience for them.

@pomadchin just curious: if PROJ4J does not use those CSV files, where does it take its data from when e.g. the "EPSG:3395" CRS is requested?

@pomadchin
Member

pomadchin commented Nov 30, 2022

@desruisseaux #92 contains the whole split; proj4j relies on nad files only https://github.com/locationtech/proj4j/tree/master/src/main/resources/proj4/nad

I'm having trouble finding out whether these files are okay; I could not find any mentions of them on the EPSG website. Are these files clear? I've seen contributions to them by proj4 contributors, adjusting and adding missing data, e.g. here: https://github.com/OSGeo/PROJ/tree/5.2.0/nad

@desruisseaux

It does not seem to be EPSG data (or at least I do not recognize it). I do not know the provenance of those files, but "NAD27" and "NAD83" suggest that they are North American Datum 1927 and 1983, which are defined (I think) by the U.S. National Geodetic Survey, a U.S. federal agency. As such, those data should be in the public domain.

@pomadchin
Member

Proj4j 1.2.0 is released with no epsg files in the resource folder; all the old epsg files are in the proj4j-epsg module now.

Let me know if it does not resolve this issue and / or feel free to reopen it / create a new one.
I'm closing it for now!

@desruisseaux

Thanks. Just for the record (in case not everyone is familiar with the relationship between those two projects), it resolves the issue for PROJ4J. It is independent of the PROJ project, for which the issue is still open to my knowledge (it was raised on their mailing list maybe one or two years ago).

@julianhyde
Author

julianhyde commented Dec 5, 2022

Thank you @pomadchin! We have logged CALCITE-5417 to migrate to the stripped-down proj4j 1.2.0 artifact, and restore it to a runtime dependency under the Apache License, and expect to do it shortly.

@julianhyde
Author

@pomadchin There's a problem with the 1.2.0 release. The pom file deployed to Maven Central declares the version as 1.2.0-SNAPSHOT, not 1.2.0.

See https://search.maven.org/remotecontent?filepath=org/locationtech/proj4j/proj4j/1.2.0/proj4j-1.2.0.pom and also https://central.sonatype.dev/artifact/org.locationtech.proj4j/proj4j/1.2.0-SNAPSHOT/versions

This caused problems when I tried to upgrade Calcite to use it: https://github.com/apache/calcite/actions/runs/3666724770/jobs/6198671745
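A mismatch like the one described above could be caught before publishing with a small check along these lines. This is a rough sketch, not part of Proj4j's or Calcite's build: the class `PomVersionCheck` and its methods are hypothetical, the sample POM is a trimmed illustration rather than the deployed file, and a POM that inherits its version from a parent would need extra handling.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: compares the <version> declared in a POM with the expected release.
final class PomVersionCheck {
    // Returns the text of the <version> element directly under <project>, or null if absent.
    static String projectVersion(String pomXml) {
        try {
            Element project = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            NodeList children = project.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                // Only direct children count, so dependency versions are ignored.
                if (n.getNodeType() == Node.ELEMENT_NODE && "version".equals(n.getNodeName())) {
                    return n.getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse POM", e);
        }
    }

    // True only when the POM declares exactly the version it was deployed under.
    static boolean matches(String pomXml, String deployedVersion) {
        return deployedVersion.equals(projectVersion(pomXml));
    }

    public static void main(String[] args) {
        String pom = "<project><groupId>org.locationtech.proj4j</groupId>"
                + "<artifactId>proj4j</artifactId><version>1.2.0-SNAPSHOT</version></project>";
        System.out.println("declared=" + projectVersion(pom)
                + " matches 1.2.0? " + matches(pom, "1.2.0"));
    }
}
```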

TomasJohansson added a commit to TomasJohansson/crsTransformations that referenced this issue Jan 5, 2023
… version 1.2.2, and generated a new output csv file "CrsTransformationAdapterProj4jLocationtech_version_1.2.2.csv" with lots of coordinate transformations created by this new version,

which can be compared (e.g. by using WinMerge) with the csv file from the previously used version i.e. the csv file "CrsTransformationAdapterProj4jLocationtech_version_1.1.4.csv"
(and there were no differences)
Also added a new dependency "proj4j-epsg" and updated the license text because of this new dependency.
Below is some text copied from https://github.com/locationtech/proj4j :
"Important! As of 1.2.2 version, proj4-core contains no EPSG Licensed files. In order to make proj4j properly operate, it makes sense to consider proj4-epsg dependency usage."

Some related "locationtech/proj4j" issues and pull requests:

"Proj4j should not use the Apache License if it contains the EPSG data set"
locationtech/proj4j#90

"Split projects into proj4j and proj4j-epsg"
locationtech/proj4j#92

"Move all core resources to epsg submodule"
locationtech/proj4j#95