Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-90102: Remove isatty call during regular open #124922

Merged
merged 10 commits into from
Oct 8, 2024

Conversation

cmaloney
Copy link
Contributor

@cmaloney cmaloney commented Oct 3, 2024

TTYs are always character devices. If the interpreter knows a file is not a character device when it would call isatty, skip the isatty call. Inside open() in the same python library call there is a fresh stat result that contains that information. Use the stat result to skip a system call.

This shortcut was suggested by @eryksun in 2021. isatty is not necessarily constant over time, but inside the python library call which opened the fd and has yet to return the fd, it seems reasonable to rely on the fd not changing. The fd may be visible to caller code which passes in an opener or intercepts specific calls.

For gh-120754, with this change reading text with open('README.md').read() is down to 6 system calls:

openat(AT_FDCWD, "README.rst", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=8921, ...}) = 0
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "This is Python version 3.14.0 al"..., 8922) = 8921
read(3, "", 1)                          = 0
close(3)                                = 0

When reading bytes with buffering=0 (disabling BufferedIO which remoevs the lseek), reading a regular file is down to 5 system calls (strace python -c "from pathlib import Path; Path('README.rst').read_bytes()")

Benchmark

Run on a 2024 MacBook Air running Squoia 15.0

Benchmark code
import pyperf
from pathlib import Path


def read_all(all_paths):
    for p in all_paths:
        # note: Using open rather than pathlib since 
        # pathlib.read_bytes() skips BufferedIO and that makes a
        # significant performance delta around the reduced isatty 
        # requirement. Reading bytes rather than text since at 
        # least one `.py` file in tree results in UnicodeDecodeError
        open(p, 'rb').read()


def read_file(path_obj):
    path_obj.read_text()


all_rst = list(Path("Doc").glob("**/*.rst"))
all_py = list(Path(".").glob("**/*.py"))
assert all_rst, "Should have found rst files"
assert all_py, "Should have found python source files"


runner = pyperf.Runner()
runner.bench_func("read_file_small", read_file, Path("Doc/howto/clinic.rst"))
runner.bench_func("read_file_large", read_file, Path("Doc/c-api/typeobj.rst"))

runner.bench_func("read_all_rst", read_all, all_rst)
runner.bench_func("read_all_py", read_all, all_py)

Before:

.....................
read_file_small: Mean +- std dev: 8.30 us +- 0.09 us
.....................
read_file_large: Mean +- std dev: 21.1 us +- 0.2 us
.....................
read_all_rst: Mean +- std dev: 4.42 ms +- 0.05 ms
.....................
read_all_py: Mean +- std dev: 19.5 ms +- 0.2 ms

After:

.....................
read_file_small: Mean +- std dev: 7.78 us +- 0.07 us
.....................
read_file_large: Mean +- std dev: 20.8 us +- 0.4 us
.....................
read_all_rst: Mean +- std dev: 4.16 ms +- 0.05 ms
.....................
read_all_py: Mean +- std dev: 18.5 ms +- 0.3 ms

For files which are recently read / in OS filecache (or on fast devices), there is a ~15% performance overhead with BufferedIO (see: #122111 where I updated Path.read_bytes() and measured the change). Unfortunately multiple attempts I've made to do small reworks to make the lseek unnecessary in BufferedIO have resulted in no performance change and a lot of complexity.

I have been developing a more broad refactoring that could reduce the BufferedIO overhead as well as several other I/O overheads while meeting current API expectations (ex. each layer of the stack re-figures out readable and writeable, every call must re-validate with many branches the fd state and arguments, _pyio needs to copy at least once in user-space because os.read can't be passed a buffer / always allocates a new one, etc.).

I'm planning to put together a talk on "Journey to the center of open().read()" and submit it to present at a San Francisco bay area python meetup since as I've been working on this I've found it's a very intricate set of operations which didn't match my mental image in some interesting ways. Hope is to then do a second talk with sample working implementation which shows how could rework internals while keeping the existing Python I/O API to reduce overheads, increase readability, solve some longstanding bugs, and possibly enable usage of io_uring for more performance improvement. Overarching goal would be to get down to one largely python native I/O implementation with improved performance from the optimizations as well as opening new performance improvement avenues.

@vstinner This replaces #121593 on top of #123412

Lib/_pyio.py Outdated Show resolved Hide resolved
Lib/_pyio.py Outdated Show resolved Hide resolved
Lib/_pyio.py Outdated Show resolved Hide resolved
Modules/_io/fileio.c Show resolved Hide resolved
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vstinner vstinner merged commit cc9b9be into python:main Oct 8, 2024
37 checks passed
@vstinner
Copy link
Member

vstinner commented Oct 8, 2024

Merged, thank you. At the first read, I didn't understand the relationship between S_ISCHR() and isatty(). But your comment makes sense and explains it correctly.

@cmaloney cmaloney deleted the cmaloney/stat_atopen_skip_isatty_t3 branch October 8, 2024 06:52
"""
if (self._stat_atopen is not None
and not stat.S_ISCHR(self._stat_atopen.st_mode)):
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be:

Suggested change
return True
return False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, Py_RETURN_FALSE; I got right in the C but not the pyio. Making a new PR to update....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

efimov-mikhail pushed a commit to efimov-mikhail/cpython that referenced this pull request Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants