Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for version-controlled point clouds #565

Closed
olsen232 opened this issue Mar 2, 2022 · 10 comments
Closed

Add support for version-controlled point clouds #565

olsen232 opened this issue Mar 2, 2022 · 10 comments
Assignees
Labels
meta Meta issue for a larger set of issues/PRs/tasks

Comments

@olsen232
Copy link
Collaborator

olsen232 commented Mar 2, 2022

Tracking bug - will add PRs here, and make notes of functionality that is missing. Later when point clouds gets closer to launch, this can be used to file individual point cloud issues.

Right now all functionality is missing - point clouds can't be imported, checked out, diffed, modified, committed, etc.

@olsen232 olsen232 self-assigned this Mar 2, 2022
@olsen232 olsen232 added the meta Meta issue for a larger set of issues/PRs/tasks label Mar 2, 2022
@olsen232
Copy link
Collaborator Author

olsen232 commented Jun 8, 2022

If you use a recent build of Kart, and get PDAL and Git LFS installed, such that they are linked in properly with their existing dependencies such as sqlite, and if you set environment variable X_KART_POINT_CLOUDS=1 then you should be able to do the following:

  • import point clouds, including automatically converting to COPC
  • push and pull
  • show, diff and commit tile changes
    As of Point-cloud working-copy part. #645, branch switches and resets and the working copy side of things is more robust.

Not done:

  • meta changes - these are not detected and not supported
  • merges + conflicts
  • spatial filtering
  • the build (PDAL and Git LFS)
  • copy-on-write in order to save disk space (?)

@olsen232
Copy link
Collaborator Author

olsen232 commented Jun 13, 2022

Meta

  • Detect meta changes

Right now we don’t store LAZ version or COPC version except in the tiles themselves, and in the pointer files to those tiles (unlike the schema, which we store per-dataset in schema.json). This means that a dataset with no tiles (eg if you delete every tile it has) still has a schema, but no longer has a well-defined LAS version or COPC version.

  • Commit meta changes
    o Commit tile changes and discard meta changes
    o Commit tile and meta changes: update dataset meta items to match new tile(s)

Make sure we support adding non-copc files to copc dataset, since, we auto convert non-copc to copc during import, and it will confuse users if they then can't commit additional non-copc files to the imported dataset.

Merges/conflicts

Make sure repo is correctly moved into merge-state in the case of conflicts.
Add a way to resolve conflicts.
(For vector, this is only possible by providing the resolution to the CLI using a stdin or a file. There is no way to provide the resolution using a working copy. For PC, it would probably make most sense to provide it using the working copy. We could - if we wanted - add support for providing vector resolutions at the working copy level too).

Spatial filtering

Support spatial filtering.

Build issues

PDAL
Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • Some more work on reset() PC: Restores and resets #652
  • Naming bug - don't name things .copc.copc.laz
  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • Fix the bug where we extract only the horizontal CRS
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)
  • Calling git lfs fetch to fetch from HEAD won't work for a partial reset from another branch
  • git lfs fetch seems to be fetching missing+promised blobs, which breaks spatial filtering.
  • Putting LFS tiles into workdir-index is also causing them to be copied to the ODB, which is a huge waste of space.
  • Handle finer-grained soft-resets than entire datasets

@olsen232
Copy link
Collaborator Author

olsen232 commented Jul 19, 2022

Meta changes - done ✅

  • Detect and commit meta changes ✅
  • Commit meta changes ✅
  • Commit new tiles with --convert-to-dataset-format ✅

Merges/conflicts

  • Make sure repo is correctly moved into merge-state in the case of conflicts.
  • Make sure conflicts can be resolved (in a basic way)
  • Write all versions conflicting tiles to the workdir somewhere, to let the user view them and figure out how to merge them.
  • Let the user merge conflicting tiles by reading them from the workdir
  • Relatedly: Let the user merge conflicting vector features by reading from the working copy
  • Clean up any conflicts that are in the workdir when the merge is complete.

Spatial filtering

Support spatial filtering.

Build issues

PDAL
Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)
  • Calling git lfs fetch to fetch from HEAD won't work for a partial reset from another branch
  • git lfs fetch seems to be fetching missing+promised blobs, which breaks spatial filtering.

@olsen232
Copy link
Collaborator Author

olsen232 commented Aug 1, 2022

Spatial filtering vs Git LFS

(this is mentioned briefly in Odds and Ends above but makes more sense as as part of Spatial Filtering)
Git LFS doesn't seem to currently support missing+promised blobs - it seems to deal with them in two different ways, both of which are undesirable for us:

  • it fetches them, which makes things simple from its point of view - now it can check if they are pointer files - but this means that content that is outside the spatial filter can leak into your repo, causing it to grow in size towards the repo's unfiltered size, which is undesirable
  • it doesn't fetch them and crashes instead, complaining that they are missing

Probably we will need to patch it. Possibly such patches can be upstreamed, since partial-clones are an official git concept which other LFS users might want to use without Git LFS crashing.

A Kart repo need not have any LFS tiles to hit problems with Git LFS - it is enough that the repo was created with experimental flag X_KART_POINT_CLOUDS=1 in the environment, which causes the LFS hooks to be installed, and the repo has missing+promised blobs.

@olsen232
Copy link
Collaborator Author

olsen232 commented Aug 9, 2022

Meta changes - done ✅

  • Detect and commit meta changes ✅
  • Commit meta changes ✅
  • Commit new tiles with --convert-to-dataset-format ✅

Merges/conflicts - done ✅

  • Make sure repo is correctly moved into merge-state in the case of conflicts. ✅
  • Make sure conflicts can be resolved (in a basic way) ✅
  • Write all versions conflicting tiles to the workdir somewhere, to let the user view them and figure out how to merge them. ✅
  • Let the user merge conflicting tiles by reading them from the workdir ✅
  • Relatedly: Let the user merge conflicting vector features by reading from the working copy ✅
  • Clean up any conflicts that are in the workdir when the merge is complete. ✅

Spatial filtering

  • Part 1 - don't accidentally fetch missing+promised blobs when scanning for pointer files, as this breaks existing spatial filtering
  • Part 2 - don't fetch LFS tiles that are outside the spatial filter

Build issues

PDAL
Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)

@olsen232
Copy link
Collaborator Author

Meta changes - done ✅

Merges/conflicts - done ✅

Spatial filtering ✅

  • Part 1 - don't accidentally fetch missing+promised blobs when scanning for pointer files, as this breaks existing spatial filtering ✅
  • Part 2 - don't fetch LFS tiles that are outside the spatial filter ✅

Build issues

PDAL
Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent ✅
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?) ❌ no success so far
  • kart lfs+ push is fetching a potentially large number of files using command-line arguments right now - use stdin instead so as not to overflow the terminal.

@olsen232
Copy link
Collaborator Author

olsen232 commented Sep 8, 2022

Bugs

  • Test on windows - pre-push script is not working there:

@craigds
Copy link
Member

craigds commented Oct 13, 2022

there's an import bug:

@olsen232
Copy link
Collaborator Author

olsen232 commented Dec 1, 2022

Build issues ❌/✅
Entire build system is reworked. Done but still not merged
https://github.com/koordinates/kart/tree/rc-vendor-vcpkg-pyremove

Copy-on-write
Should work fairly simply (on macos and linux only) using reflink package. Waiting for build issues to settle.
(Maybe need a windows copy-on-write solution for launch?)

Odds and ends
Changes to kart status output format
Rename detection? If you rename a tile, right now it is treated as a delete + insert.
Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?) ❌ no success so far. Can launch without this
kart lfs+ push is fetching a potentially large number of files using command-line arguments right now - use stdin instead so as not to overflow the terminal. ✅
Merge import command with point-cloud-import command
Remove X_KART_POINT_CLOUDS feature flag / default it on.
Better error recovery if operations are interrupted or fail partway through - see #741

Docs
Write point cloud specific docs
Skim through existing docs and websites, make sure they are still up-to-date

@olsen232
Copy link
Collaborator Author

Build issues ✅ - #752

Copy-on-write ✅ - #757
Not done on windows (or on any platform that doesn't support reflink) - see #772

Odds and ends
Changes to kart status output format ✅
Rename detection? If you rename a tile, right now it is treated as a delete + insert. ❌ skipped.
Merge import command with point-cloud-import command ✅ - #746
Remove X_KART_POINT_CLOUDS feature flag / default it on. ✅
Better error recovery if operations are interrupted or fail partway through - ✅ #751

Docs ✅ just finishing these up

Will close out this tracking issue since it's not being used any more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta Meta issue for a larger set of issues/PRs/tasks
Projects
None yet
Development

No branches or pull requests

2 participants