Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-37936: Systematically distinguish rooted vs. unrooted in .gitignore. #15823

Merged
merged 10 commits into from
Sep 11, 2019

Conversation

gnprice
Copy link
Contributor

@gnprice gnprice commented Sep 10, 2019

A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
pyconfig.h generated by ./configure) but actually applies to all
similarly-named files in the tree (e.g., PC/pyconfig.h.)

Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree. This means that if we
write the rules in the most obvious-looking way, then

  • for specific files we want to ignore that happen to be in
    subdirectories (like Modules/config.c), the rule will work
    as intended, staying "rooted" to the top of the tree; but

  • when a specific file we want to ignore happens to be at the root of
    the repo (like platform), then the obvious rule (platform) will
    apply much more broadly than intended: if someone tries to add a
    file or directory named platform somewhere else in the tree, it
    will unexpectedly get ignored.

That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.

To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.

Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.

For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed. In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash. If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.

While here, also made some rules more general where that seems
most natural, and deleted some rules that appear long obsolete.

Co-Authored-By: Zachary Ware [email protected]

https://bugs.python.org/issue37936

A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
`pyconfig.h` generated by `./configure`) but actually applies to all
similarly-named files in the tree (e.g., `PC/pyconfig.h`.)

Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree.  This means that if we
write the rules in the most obvious-looking way, then

 * for specific files we want to ignore that happen to be in
   subdirectories (like `Modules/config.c`), the rule will work
   as intended, staying "rooted" to the top of the tree; but

 * when a specific file we want to ignore happens to be at the root of
   the repo (like `platform`), then the obvious rule (`platform`) will
   apply much more broadly than intended: if someone tries to add a
   file or directory named `platform` somewhere else in the tree, it
   will unexpectedly get ignored.

That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.

To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.

Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.

For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed.  In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash.  If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.
Copy link
Member

@zware zware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good, but I have several suggestions for further cleanup.

.gitignore Outdated
/config.status
/config.status.lineno
/db_home
/.hg/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can safely be ignored globally; we're not going to check in a mercurial repository anywhere :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid prediction, I think :)

.gitignore Outdated
/python-gdb.py
/python.exe-gdb.py
/reflog.txt
/.svn/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be globally ignored.

.gitignore Outdated
/python.exe-gdb.py
/reflog.txt
/.svn/
/.coverage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this.

.gitignore Outdated
/config.status.lineno
/db_home
/.hg/
/ipch/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was meant to be rooted at /; IIRC it's intermediate files in the Windows build rooted in PCbuild/<something>, and thus can probably just go away since those entire directories are ignored now.

.gitignore Outdated
/libpython*.a
/libpython*.so*
/libpython*.dylib
/libpython*.dll
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each of these 4 can be generalized to ignore anything with these extensions; we don't have anything with these extensions checked in and likely never will.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

.gitignore Show resolved Hide resolved
.gitignore Outdated
pybuilddir.txt
/autom4te.cache
/build/
/buildno
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is an ancient leftover from before the buildno section of sys.version referred to a VCS hash; it can just go away.

.gitignore Outdated
/python-config
/python-config.py
/python.bat
/python.exe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could globally ignore *.exe, with a specific exception for !Lib/distutils/command/*.exe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, thanks!

.gitignore Outdated
/config.log
/config.status
/config.status.lineno
/db_home
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a holdover from the bsddb package, and can go away (so can the explicit callout of db_home in Lib/test/libregrtest/runtest.py).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed!

Looks like that disappeared in b98eb87 , in the Python 3.0 period (and the bit that made a db_home directory went away just before that, in d098ff2 .)

.gitignore Show resolved Hide resolved
@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

Copy link
Contributor Author

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zware for the many helpful suggestions!

Just pushed an update making all your suggested changes, with the exception of the one on config.status; details below.

.gitignore Show resolved Hide resolved
.gitignore Outdated
/config.log
/config.status
/config.status.lineno
/db_home
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed!

Looks like that disappeared in b98eb87 , in the Python 3.0 period (and the bit that made a db_home directory went away just before that, in d098ff2 .)

.gitignore Outdated
/libpython*.a
/libpython*.so*
/libpython*.dylib
/libpython*.dll
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

.gitignore Outdated
/python-config
/python-config.py
/python.bat
/python.exe
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, thanks!

.gitignore Show resolved Hide resolved
.gitignore Outdated
/config.status
/config.status.lineno
/db_home
/.hg/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid prediction, I think :)

@gnprice
Copy link
Contributor Author

gnprice commented Sep 11, 2019

@zware Also just edited the PR description (aka draft commit message) to credit you as a coauthor, because that seems appropriate given all the new substantive changes you contributed 🙂

(If you disagree, though, please feel free to take it out when merging.)

Copy link
Member

@zware zware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@miss-islington
Copy link
Contributor

Thanks @gnprice for the PR, and @zware for merging it 🌮🎉.. I'm working now to backport this PR to: 3.7, 3.8.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @gnprice and @zware, I could not cleanly backport this to 3.8 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 455122a0094c8cfdf7e062eccc5e5b5885c75e8b 3.8

@miss-islington
Copy link
Contributor

Sorry @gnprice and @zware, I had trouble checking out the 3.7 backport branch.
Please backport using cherry_picker on command line.
cherry_picker 455122a0094c8cfdf7e062eccc5e5b5885c75e8b 3.7

zware pushed a commit to zware/cpython that referenced this pull request Sep 11, 2019
…itignore (pythonGH-15823)

A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
`pyconfig.h` generated by `./configure`) but actually applies to all
similarly-named files in the tree (e.g., `PC/pyconfig.h`.)

Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree.  This means that if we
write the rules in the most obvious-looking way, then

 * for specific files we want to ignore that happen to be in
   subdirectories (like `Modules/config.c`), the rule will work
   as intended, staying "rooted" to the top of the tree; but

 * when a specific file we want to ignore happens to be at the root of
   the repo (like `platform`), then the obvious rule (`platform`) will
   apply much more broadly than intended: if someone tries to add a
   file or directory named `platform` somewhere else in the tree, it
   will unexpectedly get ignored.

That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.

To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.

Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.

For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed.  In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash.  If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top..
(cherry picked from commit 455122a)

Co-authored-by: Greg Price <[email protected]>
@bedevere-bot
Copy link

GH-15900 is a backport of this pull request to the 3.8 branch.

@zware
Copy link
Member

zware commented Sep 11, 2019

The 3.7 backport is just complicated enough that I'm going to leave it alone.

Thanks for the patch, @gnprice!

zware added a commit that referenced this pull request Sep 11, 2019
…itignore (GH-15823) (GH-15900)

A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
`pyconfig.h` generated by `./configure`) but actually applies to all
similarly-named files in the tree (e.g., `PC/pyconfig.h`.)

Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree.  This means that if we
write the rules in the most obvious-looking way, then

 * for specific files we want to ignore that happen to be in
   subdirectories (like `Modules/config.c`), the rule will work
   as intended, staying "rooted" to the top of the tree; but

 * when a specific file we want to ignore happens to be at the root of
   the repo (like `platform`), then the obvious rule (`platform`) will
   apply much more broadly than intended: if someone tries to add a
   file or directory named `platform` somewhere else in the tree, it
   will unexpectedly get ignored.

That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.

To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.

Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.

For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed.  In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash.  If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.

(cherry picked from commit 455122a)

Co-authored-by: Greg Price <[email protected]>
@gnprice gnprice deleted the pr-gitignore-rooted branch September 12, 2019 03:36

*.cover
*.iml
*.o
*.a
*.so*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, this is causing a minor inconvenience in packaging for debian as this now matches debian/README.source

the original pattern was libpython*.so* could that be restored?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debian/README.source does not exist in the cpython repo, so presumably you can unignore it wherever you're patching it in. I would think you would want to explicitly unignore anything added to avoid issues like this.

I prefer to keep *.so*, because we we don't want to accidentally check in any compiled extension modules (even though that shouldn't happen anyway and any *.so other than libpython* should theoretically only appear in the ignored build directory, ignoring them anyway seems better to me).

However, I could be persuaded to expand that rule to

*.so
*.so.*

If that works for you @asottile, please submit a PR and ping me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will send a PR 👍

asottile added a commit to deadsnakes/python3.9 that referenced this pull request Nov 21, 2019
asottile added a commit to deadsnakes/python3.9 that referenced this pull request Nov 21, 2019
asottile added a commit to asottile/cpython that referenced this pull request Nov 21, 2019
In python#15823 the pattern was changed from `libpython*.so*` to `*.so*` which
matches a bit too greedily for some packagers.  For instance this trips up
`debian/README.source`.  A more specific pattern fixes this issue.
zware pushed a commit that referenced this pull request Nov 27, 2019
In GH-15823 the pattern was changed from `libpython*.so*` to `*.so*` which
matches a bit too greedily for some packagers.  For instance this trips up
`debian/README.source`.  A more specific pattern fixes this issue.
jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
In pythonGH-15823 the pattern was changed from `libpython*.so*` to `*.so*` which
matches a bit too greedily for some packagers.  For instance this trips up
`debian/README.source`.  A more specific pattern fixes this issue.
shihai1991 pushed a commit to shihai1991/cpython that referenced this pull request Jan 31, 2020
In pythonGH-15823 the pattern was changed from `libpython*.so*` to `*.so*` which
matches a bit too greedily for some packagers.  For instance this trips up
`debian/README.source`.  A more specific pattern fixes this issue.
websurfer5 pushed a commit to websurfer5/cpython that referenced this pull request Jul 20, 2020
…re (pythonGH-15823)

A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
`pyconfig.h` generated by `./configure`) but actually applies to all
similarly-named files in the tree (e.g., `PC/pyconfig.h`.)

Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree.  This means that if we
write the rules in the most obvious-looking way, then

 * for specific files we want to ignore that happen to be in
   subdirectories (like `Modules/config.c`), the rule will work
   as intended, staying "rooted" to the top of the tree; but

 * when a specific file we want to ignore happens to be at the root of
   the repo (like `platform`), then the obvious rule (`platform`) will
   apply much more broadly than intended: if someone tries to add a
   file or directory named `platform` somewhere else in the tree, it
   will unexpectedly get ignored.

That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.

To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.

Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.

For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed.  In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash.  If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants