Add PyBytes_Writer() API #39

vstinner · 2024-08-05T13:25:09Z

This API is basically an efficient memory manager for functions creating Python bytes objects.

It avoids the creation of incomplete/inconsistent Python bytes objects: allocate an object with uninitialized bytes, fill the object, resize it. See issues:

API:

// Create a bytes writer instance.
PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size, char **str);

// Return the final Python bytes object and destroy the writer instance.
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer, char *str);

// Discard the internal bytes buffer and destroy the writer instance.
void PyBytesWriter_Discard(PyBytesWriter *writer);

// Allocate *size* bytes to prepare writing *size* bytes into *writer*. 
int PyBytesWriter_Prepare(PyBytesWriter *writer, char **str, Py_ssize_t size);

// Write a the bytes string *bytes* of *size* bytes into *writer*.
int PyBytesWriter_WriteBytes(PyBytesWriter *writer, char **str, const void *bytes, Py_ssize_t size);

The str parameter is what's being used to write bytes in the writer.

Example creating the string "abc":

static PyObject* create_abc(void)
{
    char *str;
    PyBytesWriter *writer = PyBytesWriter_Create(3, &str);
    if (writer == NULL) return NULL;

    // Write "abc"
    memcpy(str, "abc", 3);
    str += 3;

    return PyBytesWriter_Finish(writer, str);
}

The "hot code" which writes bytes can only use str.

The PyBytesWriter_WriteBytes() is just a small convenient helper function (for PyBytesWriter_Prepare() + memcpy() calls).

You can find more documentation and examples in the Pull Request.

This proposed public API is based on existing _PyBytesWriter API which is quite old. I wrote an article about it in 2016: https://vstinner.github.io/pybyteswriter.html

The text was updated successfully, but these errors were encountered:

picnixz · 2024-08-06T08:23:11Z

So, my 2 cent after Victor explained to me how it's being used: it's good and helpful..

I have one question though. Is there a preference of having

PyBytesWriter_Create(Py_ssize_t size, char **str);

instead of

PyBytesWriter_Create(char **str, Py_ssize_t size);

?

Since I'm on a phone, it's hard to check the files but shouldn't we mimic the usual order of having the "output" variable on the left? (for instance the Prepare method also puts the str before its length).

vstinner · 2024-08-06T09:50:47Z

shouldn't we mimic the usual order of having the "output" variable on the left?

Ah? For me, the trend in the Python C API is more to put output parameters at the end ("right").

picnixz · 2024-08-06T09:57:19Z

Ah? Er... actually it's probably the case for the C API now that I think about with all the Getters functions.

I think I used the mem/alloc C functions as the baseline and thought as the Create as something close to the realloc() function. I don't really have a strong opinion on the matter though so let's keep your signature.

encukou · 2024-08-07T09:00:59Z

We probably should have a public discussion on Discourse before getting to the WG decision.

AFAIK, there are two main use cases for a factory like this:

Allocate writer of known size, then immediately fill it, and convert to bytes. (As in the PyLongWriter proposal.)
Build the object incrementally, which involves over-allocation and re-allocation.

For the first use case, IMO an efficient implementation should mean that Create does one memory allocation, and Finish does none (it turns the writer to bytes in place).

For the second, why not use bytearray, which already is explicitly mutable?
It seems that the advantage of a non-PyObject is that it can only have one “owner”, so:

Other code can't mutate it, which would presumably invalidate the str pointer. (The docs for the new public API should be very explicit about how long str is valid.)
One could expect that it's not thread-safe. (If that's the case, public API docs should also be explicit about that.)

It seems to me that PyBytesWriter is close to a generic “growable buffer”. Is there a special reason that it's tied to PyBytes? From the API it seems that Finish could be replaced by passing str to PyBytes_FromStringAndSize + calling Discard -- and then you could do the same to make a bytearray or any other type.

The private API has an optimization that we lose here: it puts a small initial buffer on the stack. It seems that performance-minded users will want to use their own implementation anyway.
Also, C++, Rust, etc. surely have a growable buffers already; they should be fine with PyBytes_FromStringAndSize

In CPython itself, there are quite few uses of _PyBytesWriter.
Is it worth exposing at all?

vstinner · 2024-08-07T09:13:59Z

@encukou:

For the second, why not use bytearray, which already is explicitly mutable?

It's slower since you have to copy bytearray memory to create a bytes object, and destroy the bytearray, to create the final bytes object. PyBytesWriter works in the nanoseconds area.

One could expect that it's not thread-safe.

Correct, the code is not thread safe, and expected to have a single user.

It seems to me that PyBytesWriter is close to a generic “growable buffer”. Is there a special reason that it's tied to PyBytes? From the API it seems that Finish could be replaced by passing str to PyBytes_FromStringAndSize + calling Discard -- and then you could do the same to make a bytearray or any other type.**

PyBytesWriter contains a temporary bytes object. The purpose of the API is to only expose the bytes object when it's fully initialized.

The private API has an optimization that we lose here: it puts a small initial buffer on the stack.

The small buffer is also used in the proposed public API. It's just that it's allocated on the heap memory.

Also, C++, Rust, etc. surely have a growable buffers already; they should be fine with PyBytes_FromStringAndSize

The problem of using a buffer and then calling PyBytes_FromStringAndSize() is that you have to copy the memory, and then destroy the temporary buffer. Again, PyBytesWriter uses a bytes object internally, so calling PyBytesWriter_Finish() is cheap.

vstinner · 2024-08-07T09:42:14Z

PyBytesWriter works in the nanoseconds area.

A more concrete micro-benchmark shows that the public PyBytesWriter (23.4 ns) is 4x faster than bytearray (93.0 ns): python/cpython#121726 (comment).

encukou · 2024-08-08T13:40:56Z

Thank you, I see now! Your explanations help with reading the implementation.

A hard part of making this public will be documenting what str is, when it is valid and how much of the memory it points to is valid. It's a bit hard to explain :)

vstinner · 2024-08-08T18:14:40Z

When you call PyBytesWriter_Create(size, &str), you can write size bytes into str.

When you call PyBytesWriter_Prepare(writer, &str, size), you can write size bytes into str.

str is valid until PyBytesWriter_Finish() or PyBytesWriter_Discard() is called. Would you prefer that these functions set str to NULL?

encukou · 2024-08-09T06:54:54Z

That, and the str you pass to PyBytesWriter_Create or …Finish needs to be the str you got from …Create or …Prepare, incremented by (up to?) the size given to that call. Right?

Would you prefer that these functions set str to NULL?

Probably not necessary, they don't clear writer either.

vstinner · 2024-08-09T10:05:38Z

That, and the str you pass to PyBytesWriter_Create or …Finish needs to be the str you got from …Create or …Prepare, incremented by (up to?) the size given to that call. Right?

Correct.

encukou · 2024-08-19T15:57:35Z

Except you can call Prepare() several times in succession. I'm not entirely clear on what should happen in that case.

I'd expect that each Prepare reserves size bytes after *str, overriding any previous Prepare.
Instead, AFAICS each Prepare adds size bytes to an internal “minimum size” field. IMO, that's harder to explain. But more importantly, I'm not sure on what you should do if you call Prepare with a size that's too big -- is there a good way to “decrement” that if it turns out you needed fewer bytes than you Prepared? Or are users expected to provide exact sizes (which might be expensive to calculate?) The current uses seems to rely on API that's not exposed here, like writer.min_size -= 2.

erlend-aasland · 2024-08-23T06:05:11Z

I find the Prepare API confusing. From the API names in the OP, it is not entirely obvious to me how I'm supposed to use the proposed APIs.

vstinner · 2024-09-05T12:49:16Z

I find the Prepare API confusing. From the API names in the OP, it is not entirely obvious to me how I'm supposed to use the proposed APIs.

Oh. Can you propose a better API in this case?

Prepare(n) adds n bytes to the internal buffer. Example:

PyBytesWriter_Create(3) allocates a buffer of 3 bytes
PyBytesWriter_Prepare(&writer, &str, 3) reallocates the buffer to 6 bytes
PyBytesWriter_Prepare(&writer, &str, 4) reallocates the buffer to 10 bytes

corona10 · 2024-09-06T12:37:24Z

For naming,
Why not simply

PyBytesWriter_Create -> PyBytesWriter_Init
PyBytesWriter_Prepare -> PyBytesWriter_Allocate

It will be the less confusing :)

encukou · 2024-09-06T15:35:36Z

For some brainstorming --

I would expect that Prepare looks at the current position, i.e. str, like this:

Prepare(n) clears any previous reservations, and ensures n bytes can be written after the current str:

PyBytesWriter_Create(3) allocates a buffer of 3 bytes
str += 3
PyBytesWriter_Prepare(&writer, &str, 3) reallocates the buffer to 6 bytes
str += 2 // str is now (<start of buffer> + 5)
PyBytesWriter_Prepare(&writer, &str, 4) reallocates the buffer to 9 bytes
PyBytesWriter_Prepare(&writer, &str, 2) could reallocate the buffer to 7 bytes (but in practice we only shrink in _Finish, so it stays at 9)

That said, this is still a weird API; PyUnicodeWriter, where the “current position” is handled entirely by the writer, is much more understandable.

The proposed API allows user code to keep track of the “current write position” using str (like str += 3); but str is also a pointer that's updated by the writer's realloc calls. I think this dual usage is confusing.

Why not let the user handle the “current write position” on their own, as an offset from “start of buffer” that's handled by the writer, and have for example:

// Create a bytes writer instance.
// Set `*str` to a writable buffer of at least the given `size`.
// `*str` is valid until the next `PyBytesWriter_*` call on the result.
PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size, char **str);

// Reallocate to `size` bytes.
// Set `*str` to a writable buffer of at least the given `size`;
//   `min(<old_size>, size)` bytes are copied to this buffer.
// `*str` is valid until the next `PyBytesWriter_*` call on this `writer`.
// (Detail: this will never shrink the internal buffer, but if it needs to grow,
//  it will not overallocate. The user can choose an overallocation strategy
//  themself.)
int PyBytesWriter_Realloc(PyBytesWriter *writer, Py_ssize_t size, char **str);

// Return the final Python bytes object with the given size,
//   and destroy the writer instance.
// `size` must not be greater than in the previous `_Create`/`_Realloc` call.
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer, Py_ssize_t size);

// Discard the internal bytes buffer and destroy the writer instance.
void PyBytesWriter_Discard(PyBytesWriter *writer);

That would make the convenience helper slightly more complicated, though,
as it needs to update the “current write position”:

// Write `data_size` bytes from `data` at the given `*offset`.
// Set `*str` to a writable buffer of at least `*offset + data_size` bytes;
//   `*offset` bytes are copied to this buffer.
// Increment `*offset` by `data_size`.
// Equivalent to:
//  - PyBytesWriter_Realloc(writer, *offset + data_size, str)
//  - memcpy((*str) + offset, data, data_size)
//  - *offset += data_size
// (Detail: unlike PyBytesWriter_Realloc, this *will* overallocate, so
//  creating bytes using `PyBytesWriter_Write` would be amortized `O(N)`.)
void PyBytesWriter_Write(PyBytesWriter *writer, char **str, Py_ssize_t *offset, Py_ssize_t data_size, char *data);

But, looking at the big picture, I see only two use cases:

You know the final size in advance. That only needs:
- PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size, char **str);
- PyObject* PyBytesWriter_Finish(PyBytesWriter *writer);
- void PyBytesWriter_Discard(PyBytesWriter *writer).
You don't know the size in advance, and so PyBytesWriter_Finish would likely need to reallocate to the final size, and copy data. In this case, for the preparation you can use any “generic resizable buffer” implementation; Python doesn't need to provide one for you. Pass the result to PyBytes_FromStringAndSize.

erlend-aasland · 2024-09-11T09:49:23Z

Petr:

That said, this is still a weird API; PyUnicodeWriter, where the “current position” is handled entirely by the writer, is much more understandable.

I agree. I worry that the complexity of this API will lead to misuse.

Victor [regarding my complaints about the "prepare" API]:

Oh. Can you propose a better API in this case?

I don't know yet. I can try to think up something.

vstinner · 2024-10-07T13:09:50Z

It seems like this API is too low-level and too error-prone. I prefer to abandon promoting this API as a public API for now. We can revisit this API later if needed.

vstinner mentioned this issue Aug 5, 2024

gh-121710: Add PyBytesWriter API python/cpython#121726

Closed

vstinner closed this as completed Oct 7, 2024

vstinner mentioned this issue Oct 7, 2024

[C API] Add PyBytesWriter API python/cpython#121710

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add PyBytes_Writer() API #39

Add PyBytes_Writer() API #39

vstinner commented Aug 5, 2024 •

edited

Loading

picnixz commented Aug 6, 2024 •

edited

Loading

vstinner commented Aug 6, 2024

picnixz commented Aug 6, 2024 •

edited

Loading

encukou commented Aug 7, 2024

vstinner commented Aug 7, 2024

vstinner commented Aug 7, 2024

encukou commented Aug 8, 2024

vstinner commented Aug 8, 2024

encukou commented Aug 9, 2024 •

edited

Loading

vstinner commented Aug 9, 2024

encukou commented Aug 19, 2024

erlend-aasland commented Aug 23, 2024

vstinner commented Sep 5, 2024

corona10 commented Sep 6, 2024

encukou commented Sep 6, 2024

erlend-aasland commented Sep 11, 2024

vstinner commented Oct 7, 2024

Add PyBytes_Writer() API #39

Add PyBytes_Writer() API #39

Comments

vstinner commented Aug 5, 2024 • edited Loading

picnixz commented Aug 6, 2024 • edited Loading

vstinner commented Aug 6, 2024

picnixz commented Aug 6, 2024 • edited Loading

encukou commented Aug 7, 2024

vstinner commented Aug 7, 2024

vstinner commented Aug 7, 2024

encukou commented Aug 8, 2024

vstinner commented Aug 8, 2024

encukou commented Aug 9, 2024 • edited Loading

vstinner commented Aug 9, 2024

encukou commented Aug 19, 2024

erlend-aasland commented Aug 23, 2024

vstinner commented Sep 5, 2024

corona10 commented Sep 6, 2024

encukou commented Sep 6, 2024

erlend-aasland commented Sep 11, 2024

vstinner commented Oct 7, 2024

vstinner commented Aug 5, 2024 •

edited

Loading

picnixz commented Aug 6, 2024 •

edited

Loading

picnixz commented Aug 6, 2024 •

edited

Loading

encukou commented Aug 9, 2024 •

edited

Loading