Move sno repository into .git folder (working trees) #147

Closed
olsen232 opened this issue Jul 2, 2020 · 6 comments

@olsen232
Collaborator

olsen232 commented Jul 2, 2020

It seems like the entire sno repository - excepting the working copy - could be moved into a .git folder. Thoughts?

  • Seems like all existing code would Just Work™, since a git repo is either a bare repo or a folder with a .git folder in it. pygit2 still finds all the refs, blobs, etc.
  • We only need to change the code where a sno repository is initialised, and there shouldn't be any compatibility issues
  • Tools that recognise git repositories still work, eg command line tools that show the git repo state.
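
A minimal sketch of the "bare repo, or a folder with a .git folder in it" distinction mentioned above. The function name `find_git_dir` and the HEAD/objects/refs heuristic are illustrative assumptions; real git (and pygit2) discovery also handles `.git` files, environment variables, and parent directories, none of which is modelled here.

```python
from pathlib import Path
from typing import Optional


def looks_like_git_dir(path: Path) -> bool:
    """Heuristic (assumption): a git dir contains HEAD, objects/ and refs/."""
    return (
        (path / "HEAD").is_file()
        and (path / "objects").is_dir()
        and (path / "refs").is_dir()
    )


def find_git_dir(path: Path) -> Optional[Path]:
    """Return the folder holding git internals for `path`, or None.

    Checks the proposed layout (path/.git) first, then the current
    bare layout (internals directly in path).
    """
    dot_git = path / ".git"
    if dot_git.is_dir() and looks_like_git_dir(dot_git):
        return dot_git
    if looks_like_git_dir(path):
        return path
    return None
```

Code that only ever asks "where are the git internals?" in this way would not care which of the two layouts a repository uses.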

Pros:

  • sno repository is less cluttered, easy to find your working copy, easy to not break things you shouldn't be messing with

Cons:

  • sno repository doesn't look very much like anything any more, a user might not realise when they are in one.

@olsen232
Collaborator Author

olsen232 commented Jul 2, 2020

Current structure

myrepo/
    HEAD
    config
    description
    hooks
    info
    myrepo.gpkg
    myrepo.gpkg-journal
    objects
    refs

Proposed

myrepo/
    .git/
        HEAD
        config
        description
        hooks
        info
        objects
        refs
    myrepo.gpkg
    myrepo.gpkg-journal
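
As a rough sketch, converting the current layout to the proposed one amounts to moving the git entries listed above into a new `.git` folder. The function name `move_internals_into_dot_git` is hypothetical, and only the entries shown in the listing are handled. A real migration would presumably also need to set `core.bare` to false in the config, which this sketch does not do.

```python
import shutil
from pathlib import Path

# Entries from the "Current structure" listing that belong to git itself.
GIT_ENTRIES = ("HEAD", "config", "description", "hooks", "info", "objects", "refs")


def move_internals_into_dot_git(repo_root: Path) -> Path:
    """Move git's own files from repo_root into repo_root/.git.

    Working copy files such as myrepo.gpkg are left where they are.
    """
    dot_git = repo_root / ".git"
    dot_git.mkdir()  # raises FileExistsError if already converted
    for name in GIT_ENTRIES:
        src = repo_root / name
        if src.exists():
            shutil.move(str(src), str(dot_git / name))
    return dot_git
```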

@olsen232 olsen232 added this to the 0.5 milestone Jul 2, 2020
@rcoup
Member

rcoup commented Jul 2, 2020

Medium-term, we want to be able to support files in repos as well. These could be documentation, map legends, or other supporting information that goes with the datasets, along with some folder structure (which we already kinda have internally).

eg:

myproject/
  myproject.gpkg
  myproject.style
  dataset1/
    doc.pdf
  base_datasets/
    dataset2/
      style.style
    dataset3/
      notes.txt

The concept was that this would be an actual worktree/working-copy and exclude dataset objects from being populated into it via sparse-checkouts, specifically the newer core.sparseCheckoutCone mechanism. Then dropping doc.pdf into the right place would follow a normal-ish git workflow for committing files, combined with any data changes.

There are some UX issues to work through here with respect to how to effectively stop people from committing GPKG/SHP/CSV/whatever data as "files" instead of getting them imported into actual Sno datasets.

→ In the meantime, no problem with bare repo + .git/ (.sno/?), there are a couple of issues we need to work through though:

If the working copy doesn't have local files (e.g. it's a PostGIS database), then we end up with:

myproject/
  .sno/

As you identified, since .sno/ is hidden, the whole thing will look like an empty directory, which I feel would confuse people. The same applies to subdirectories / actual working copies.

💡 generate a placeholder file into each folder (maybe with human-readable info on where the working copy/dataset is?) so the folders don't appear empty, then effectively git-ignore it (via .git/info/exclude?)
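
One way the placeholder idea could look, as a sketch only. The file name `WORKING-COPY.txt`, the message text, and the function `write_placeholder` are all made up for illustration. The only git behaviour relied on is that patterns listed in `info/exclude` inside the git dir are ignored locally, and that a pattern without a slash matches at any depth.

```python
from pathlib import Path

PLACEHOLDER_NAME = "WORKING-COPY.txt"  # hypothetical file name


def write_placeholder(folder: Path, git_dir: Path, location: str) -> Path:
    """Write a human-readable placeholder into `folder` and exclude it from git."""
    placeholder = folder / PLACEHOLDER_NAME
    placeholder.write_text(
        "This folder is part of a sno repository.\n"
        f"The working copy for it lives at: {location}\n"
    )

    # Add the file name to info/exclude once, so git never offers to commit it.
    exclude = git_dir / "info" / "exclude"
    exclude.parent.mkdir(parents=True, exist_ok=True)
    text = exclude.read_text() if exclude.exists() else ""
    if PLACEHOLDER_NAME not in text.splitlines():
        if text and not text.endswith("\n"):
            text += "\n"
        exclude.write_text(text + PLACEHOLDER_NAME + "\n")
    return placeholder
```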

@rcoup
Member

rcoup commented Jul 2, 2020

(.sno/?)

From a very quick look through the git/libgit2 source, seems like .git is very much hardcoded. Which is fine.

@olsen232
Collaborator Author

olsen232 commented Jul 2, 2020

Yeah, .sno wouldn't be recognised by other tools either eg https://git-scm.com/book/id/v2/Appendix-A%3A-Git-in-Other-Environments-Git-in-Bash

I missed out a con earlier - a bare repository doesn't let you check in random files by putting them in the working tree, but if we move the git files into .git then you would be able to, if I understand git correctly. I guess this should wait until we get sparse checkouts working.

@olsen232 olsen232 removed this from the 0.5 milestone Jul 2, 2020
@rcoup
Member

rcoup commented Jul 16, 2020

So I did some testing of sparse-checkouts, and it's fairly functional and seems to work well.

For a new repo:

  • set core.sparseCheckout=true in the config.
  • populate .git/info/sparse-checkout with:

    /*
    !**/.sno-table/

If cloning... either:

$ git clone --sparse ...
$ git sparse-checkout set '/*' '!**/.sno-table/'

or:

$ git clone --no-checkout ...
$ git sparse-checkout init  # sets core.sparseCheckout
$ git sparse-checkout set '/*' '!**/.sno-table/'  # updates .git/info/sparse-checkout
$ git checkout HEAD

(which do the same thing)
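
For the new-repo case, the second step could be scripted roughly as follows. This is a sketch: `write_sparse_checkout` is a hypothetical helper, and it only writes the pattern file. Setting `core.sparseCheckout` still has to happen separately (for example with `git config core.sparseCheckout true`).

```python
from pathlib import Path

# Patterns from the comment above: check out everything except .sno-table trees.
SPARSE_PATTERNS = ("/*", "!**/.sno-table/")


def write_sparse_checkout(git_dir: Path, patterns=SPARSE_PATTERNS) -> Path:
    """Write the patterns, one per line, to <git_dir>/info/sparse-checkout."""
    info_dir = git_dir / "info"
    info_dir.mkdir(parents=True, exist_ok=True)
    target = info_dir / "sparse-checkout"
    target.write_text("\n".join(patterns) + "\n")
    return target
```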

@rcoup rcoup changed the title Move sno repository into .git folder Move sno repository into .git folder (working trees) Jul 17, 2020
@olsen232 olsen232 added this to the 0.5 milestone Aug 14, 2020
@olsen232 olsen232 self-assigned this Aug 20, 2020
@olsen232 olsen232 modified the milestones: 0.5, 0.6 Aug 20, 2020
@olsen232
Collaborator Author

This is done, in that git internals are now (generally) hidden in a .sno folder.

There is a plan to do sparse checkouts of the git objects into a working tree - with ".sno-dataset" trees filtered out since they get put into the DB working copy, but checking out everything else, and allowing any extra user files to be checked into this structure. This has not been started and still needs more design work. It can be tracked separately.
