Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PyBytes_Writer() API #39

Closed
vstinner opened this issue Aug 5, 2024 · 17 comments
Closed

Add PyBytes_Writer() API #39

vstinner opened this issue Aug 5, 2024 · 17 comments

Comments

@vstinner
Copy link

vstinner commented Aug 5, 2024

This API is basically an efficient memory manager for functions creating Python bytes objects.

It avoids the creation of incomplete/inconsistent Python bytes objects: allocate an object with uninitialized bytes, fill the object, resize it. See issues:

API:

// Create a bytes writer instance.
PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size, char **str);

// Return the final Python bytes object and destroy the writer instance.
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer, char *str);

// Discard the internal bytes buffer and destroy the writer instance.
void PyBytesWriter_Discard(PyBytesWriter *writer);

// Allocate *size* bytes to prepare writing *size* bytes into *writer*. 
int PyBytesWriter_Prepare(PyBytesWriter *writer, char **str, Py_ssize_t size);

// Write a the bytes string *bytes* of *size* bytes into *writer*.
int PyBytesWriter_WriteBytes(PyBytesWriter *writer, char **str, const void *bytes, Py_ssize_t size);

The str parameter is what's being used to write bytes in the writer.

Example creating the string "abc":

static PyObject* create_abc(void)
{
    char *str;
    PyBytesWriter *writer = PyBytesWriter_Create(3, &str);
    if (writer == NULL) return NULL;

    // Write "abc"
    memcpy(str, "abc", 3);
    str += 3;

    return PyBytesWriter_Finish(writer, str);
}

The "hot code" which writes bytes can only use str.

The PyBytesWriter_WriteBytes() is just a small convenient helper function (for PyBytesWriter_Prepare() + memcpy() calls).


You can find more documentation and examples in the Pull Request.

This proposed public API is based on existing _PyBytesWriter API which is quite old. I wrote an article about it in 2016: https://vstinner.github.io/pybyteswriter.html

@picnixz
Copy link

picnixz commented Aug 6, 2024

So, my 2 cent after Victor explained to me how it's being used: it's good and helpful..

I have one question though. Is there a preference of having

PyBytesWriter_Create(Py_ssize_t size, char **str);

instead of

PyBytesWriter_Create(char **str, Py_ssize_t size);

?

Since I'm on a phone, it's hard to check the files but shouldn't we mimic the usual order of having the "output" variable on the left? (for instance the Prepare method also puts the str before its length).

@vstinner
Copy link
Author

vstinner commented Aug 6, 2024

shouldn't we mimic the usual order of having the "output" variable on the left?

Ah? For me, the trend in the Python C API is more to put output parameters at the end ("right").

@picnixz
Copy link

picnixz commented Aug 6, 2024

Ah? Er... actually it's probably the case for the C API now that I think about with all the Getters functions.

I think I used the mem/alloc C functions as the baseline and thought as the Create as something close to the realloc() function. I don't really have a strong opinion on the matter though so let's keep your signature.

@encukou
Copy link
Collaborator

encukou commented Aug 7, 2024

We probably should have a public discussion on Discourse before getting to the WG decision.


AFAIK, there are two main use cases for a factory like this:

  • Allocate writer of known size, then immediately fill it, and convert to bytes. (As in the PyLongWriter proposal.)
  • Build the object incrementally, which involves over-allocation and re-allocation.

For the first use case, IMO an efficient implementation should mean that Create does one memory allocation, and Finish does none (it turns the writer to bytes in place).

For the second, why not use bytearray, which already is explicitly mutable?
It seems that the advantage of a non-PyObject is that it can only have one “owner”, so:

  • Other code can't mutate it, which would presumably invalidate the str pointer. (The docs for the new public API should be very explicit about how long str is valid.)
  • One could expect that it's not thread-safe. (If that's the case, public API docs should also be explicit about that.)

It seems to me that PyBytesWriter is close to a generic “growable buffer”. Is there a special reason that it's tied to PyBytes? From the API it seems that Finish could be replaced by passing str to PyBytes_FromStringAndSize + calling Discard -- and then you could do the same to make a bytearray or any other type.

The private API has an optimization that we lose here: it puts a small initial buffer on the stack. It seems that performance-minded users will want to use their own implementation anyway.
Also, C++, Rust, etc. surely have a growable buffers already; they should be fine with PyBytes_FromStringAndSize

In CPython itself, there are quite few uses of _PyBytesWriter.
Is it worth exposing at all?

@vstinner
Copy link
Author

vstinner commented Aug 7, 2024

@encukou:

For the second, why not use bytearray, which already is explicitly mutable?

It's slower since you have to copy bytearray memory to create a bytes object, and destroy the bytearray, to create the final bytes object. PyBytesWriter works in the nanoseconds area.

One could expect that it's not thread-safe.

Correct, the code is not thread safe, and expected to have a single user.

It seems to me that PyBytesWriter is close to a generic “growable buffer”. Is there a special reason that it's tied to PyBytes? From the API it seems that Finish could be replaced by passing str to PyBytes_FromStringAndSize + calling Discard -- and then you could do the same to make a bytearray or any other type.**

PyBytesWriter contains a temporary bytes object. The purpose of the API is to only expose the bytes object when it's fully initialized.

The private API has an optimization that we lose here: it puts a small initial buffer on the stack.

The small buffer is also used in the proposed public API. It's just that it's allocated on the heap memory.

Also, C++, Rust, etc. surely have a growable buffers already; they should be fine with PyBytes_FromStringAndSize

The problem of using a buffer and then calling PyBytes_FromStringAndSize() is that you have to copy the memory, and then destroy the temporary buffer. Again, PyBytesWriter uses a bytes object internally, so calling PyBytesWriter_Finish() is cheap.

@vstinner
Copy link
Author

vstinner commented Aug 7, 2024

PyBytesWriter works in the nanoseconds area.

A more concrete micro-benchmark shows that the public PyBytesWriter (23.4 ns) is 4x faster than bytearray (93.0 ns): python/cpython#121726 (comment).

@encukou
Copy link
Collaborator

encukou commented Aug 8, 2024

Thank you, I see now! Your explanations help with reading the implementation.

A hard part of making this public will be documenting what str is, when it is valid and how much of the memory it points to is valid. It's a bit hard to explain :)

@vstinner
Copy link
Author

vstinner commented Aug 8, 2024

When you call PyBytesWriter_Create(size, &str), you can write size bytes into str.

When you call PyBytesWriter_Prepare(writer, &str, size), you can write size bytes into str.

str is valid until PyBytesWriter_Finish() or PyBytesWriter_Discard() is called. Would you prefer that these functions set str to NULL?

@encukou
Copy link
Collaborator

encukou commented Aug 9, 2024

That, and the str you pass to PyBytesWriter_Create or …Finish needs to be the str you got from …Create or …Prepare, incremented by (up to?) the size given to that call. Right?

Would you prefer that these functions set str to NULL?

Probably not necessary, they don't clear writer either.

@vstinner
Copy link
Author

vstinner commented Aug 9, 2024

That, and the str you pass to PyBytesWriter_Create or …Finish needs to be the str you got from …Create or …Prepare, incremented by (up to?) the size given to that call. Right?

Correct.

@encukou
Copy link
Collaborator

encukou commented Aug 19, 2024

Except you can call Prepare() several times in succession. I'm not entirely clear on what should happen in that case.

I'd expect that each Prepare reserves size bytes after *str, overriding any previous Prepare.
Instead, AFAICS each Prepare adds size bytes to an internal “minimum size” field. IMO, that's harder to explain. But more importantly, I'm not sure on what you should do if you call Prepare with a size that's too big -- is there a good way to “decrement” that if it turns out you needed fewer bytes than you Prepared? Or are users expected to provide exact sizes (which might be expensive to calculate?) The current uses seems to rely on API that's not exposed here, like writer.min_size -= 2.

@erlend-aasland
Copy link

I find the Prepare API confusing. From the API names in the OP, it is not entirely obvious to me how I'm supposed to use the proposed APIs.

@vstinner
Copy link
Author

vstinner commented Sep 5, 2024

I find the Prepare API confusing. From the API names in the OP, it is not entirely obvious to me how I'm supposed to use the proposed APIs.

Oh. Can you propose a better API in this case?

Prepare(n) adds n bytes to the internal buffer. Example:

  • PyBytesWriter_Create(3) allocates a buffer of 3 bytes
  • PyBytesWriter_Prepare(&writer, &str, 3) reallocates the buffer to 6 bytes
  • PyBytesWriter_Prepare(&writer, &str, 4) reallocates the buffer to 10 bytes

@corona10
Copy link

corona10 commented Sep 6, 2024

For naming,
Why not simply

  • PyBytesWriter_Create -> PyBytesWriter_Init
  • PyBytesWriter_Prepare -> PyBytesWriter_Allocate

It will be the less confusing :)

@encukou
Copy link
Collaborator

encukou commented Sep 6, 2024

For some brainstorming --

I would expect that Prepare looks at the current position, i.e. str, like this:

Prepare(n) clears any previous reservations, and ensures n bytes can be written after the current str:

  • PyBytesWriter_Create(3) allocates a buffer of 3 bytes
  • str += 3
  • PyBytesWriter_Prepare(&writer, &str, 3) reallocates the buffer to 6 bytes
  • str += 2 // str is now (<start of buffer> + 5)
  • PyBytesWriter_Prepare(&writer, &str, 4) reallocates the buffer to 9 bytes
  • PyBytesWriter_Prepare(&writer, &str, 2) could reallocate the buffer to 7 bytes (but in practice we only shrink in _Finish, so it stays at 9)

That said, this is still a weird API; PyUnicodeWriter, where the “current position” is handled entirely by the writer, is much more understandable.


The proposed API allows user code to keep track of the “current write position” using str (like str += 3); but str is also a pointer that's updated by the writer's realloc calls. I think this dual usage is confusing.

Why not let the user handle the “current write position” on their own, as an offset from “start of buffer” that's handled by the writer, and have for example:

// Create a bytes writer instance.
// Set `*str` to a writable buffer of at least the given `size`.
// `*str` is valid until the next `PyBytesWriter_*` call on the result.
PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size, char **str);

// Reallocate to `size` bytes.
// Set `*str` to a writable buffer of at least the given `size`;
//   `min(<old_size>, size)` bytes are copied to this buffer.
// `*str` is valid until the next `PyBytesWriter_*` call on this `writer`.
// (Detail: this will never shrink the internal buffer, but if it needs to grow,
//  it will not overallocate. The user can choose an overallocation strategy
//  themself.)
int PyBytesWriter_Realloc(PyBytesWriter *writer, Py_ssize_t size, char **str);

// Return the final Python bytes object with the given size,
//   and destroy the writer instance.
// `size` must not be greater than in the previous `_Create`/`_Realloc` call.
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer, Py_ssize_t size);

// Discard the internal bytes buffer and destroy the writer instance.
void PyBytesWriter_Discard(PyBytesWriter *writer);

That would make the convenience helper slightly more complicated, though,
as it needs to update the “current write position”:

// Write `data_size` bytes from `data` at the given `*offset`.
// Set `*str` to a writable buffer of at least `*offset + data_size` bytes;
//   `*offset` bytes are copied to this buffer.
// Increment `*offset` by `data_size`.
// Equivalent to:
//  - PyBytesWriter_Realloc(writer, *offset + data_size, str)
//  - memcpy((*str) + offset, data, data_size)
//  - *offset += data_size
// (Detail: unlike PyBytesWriter_Realloc, this *will* overallocate, so
//  creating bytes using `PyBytesWriter_Write` would be amortized `O(N)`.)
void PyBytesWriter_Write(PyBytesWriter *writer, char **str, Py_ssize_t *offset, Py_ssize_t data_size, char *data);

But, looking at the big picture, I see only two use cases:

  • You know the final size in advance. That only needs:
    • PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size, char **str);
    • PyObject* PyBytesWriter_Finish(PyBytesWriter *writer);
    • void PyBytesWriter_Discard(PyBytesWriter *writer).
  • You don't know the size in advance, and so PyBytesWriter_Finish would likely need to reallocate to the final size, and copy data. In this case, for the preparation you can use any “generic resizable buffer” implementation; Python doesn't need to provide one for you. Pass the result to PyBytes_FromStringAndSize.

@erlend-aasland
Copy link

Petr:

That said, this is still a weird API; PyUnicodeWriter, where the “current position” is handled entirely by the writer, is much more understandable.

I agree. I worry that the complexity of this API will lead to misuse.

Victor [regarding my complaints about the "prepare" API]:

Oh. Can you propose a better API in this case?

I don't know yet. I can try to think up something.

@vstinner
Copy link
Author

vstinner commented Oct 7, 2024

It seems like this API is too low-level and too error-prone. I prefer to abandon promoting this API as a public API for now. We can revisit this API later if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants