Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve and complete implementation of new/delete #361

Merged
merged 8 commits into from
Oct 15, 2020

Conversation

matthijskooijman
Copy link
Collaborator

@matthijskooijman matthijskooijman commented Sep 17, 2020

Since #340, the Arduino core exposes a new header (in addition to the previously used new.h), but it was incomplete, causing code that previously relied on a more complete new header (such as provided by uclibc++) to break. This breakage, and various considerations around improving new/delete were discussed in #287.

This PR fixes this by (almost) completing the new header.

This also prepares for letting new terminate, rather than returning null on allocation failure. The standard dictates an exception should be thrown, but that is not possible on AVR, so terminating is more similar to raising an exception that is never caught (and thus probably more standards compliant). Code that needs to (portably) handle allocation failure can (after merging this PR) already start using the nothrow version of new instead of the regular one. For compatibility reasons, the code to terminate instead of returning null is disabled yet, but can be enabled using a macro now already, and should be enabled by default in the future.

This also makes some related improvements to the __cxa_pure_virtual and __cxa_deleted_virtual.

This PR fixes #287 and fixes #47.

Originally, the Arduino core used "new.h", rather than the standard
"new", probably because the implementation was incomplete, and for the
most commonly used new and delete operators, no include is needed at all
(they are defined implicitly by the compiler). However, now Arduino
does expose the "new" name, as an alias for the older "new.h". Given
that the standard name is "new", it makes more sense to put the actual
content in "new", and make "new.h" a compatibility header that includes
"new" instead of the other way around.
This makes this header complete up to including C++14, except two
exception classes that cannot be defined without `<exception>`.

The functions related to the "new_handler" are declared but not actually
defined, to prevent overhead and complexity. They are still declared to
allow implementing them in user code if needed.

This makes the implementation of all operator new and delete functions
comply with the C++11/C++14 specification in terms of which should be
actually implemented and which should be delegate to other functions.

There are still some areas where these implementations are not entirely
standards-compliant, which will be fixed in subsequent commits.

This fixes part of arduino#287 and fixes arduino#47.
This makes these functions weak, so that a sketch or library can replace
them. This does not apply to all of these operators, only for the ones
that the C++ standard specifies as replaceable.
This allows calling it from other places later. The default
implementation calls `abort()`, but making it weak allows user code to
override this function (either directly, or by including a library like
uclibc++ that implements `std::set_terminate()`).

Note that this does not add a declaration for this function, since the
standard dictates this to be in `<exception>`, but we cannot
meaningfully or completely implement that header, so better leave it to
be overridden by e.g. libraries like uclibc++.
These are special functions that are presumably put into vtables for
deleted or pure virtual functions. Previously, this would call `abort()`
directly, but calling `std::terminate()` achieves the same effect, but
allows user code to change the behavior (e.g. to print to serial, blink
leds or whatever makes sense).
This is currently disabled, keeping the old behavior of returning
NULL on failure, but should probably be enabled in the future as code
that does want to do a null check has had a chance to switch to the
more portable nothrow versions.

When enabled, allocation failure calls the weak `std::terminate()`,
which calls `abort()` by default, but can be replaced by user code to do
more specific handling.

To enable this, a macro must be defined (in new.cpp or on the compiler
commandline).

This fixes part of arduino#287.
The standard dictates that `std::size_t` is used, rather than the plain
`size_t` type.

Even though these types are usually, if not always, exactly the same
type, other code might assume that `std::size_t` is actually used and thus
also available under that name after including `<new>`.

This fixes that by using the right type. One challenge is that it is
usually declared in headers that we do not have available, so this just
defines the `std::size_t` type in the `<new>` header to work around
that.
@matthijskooijman
Copy link
Collaborator Author

For testing this, I've used the following sketch (manually varying the constant s, the NEW_TERMINATES_ON_FAILURE macro in the "new" header and whether or not std::terminate was defined in the sketch):

#include <new>

constexpr size_t s = 8000;
struct foo {
  char bar[s];
};

struct C {
  int v;
  C() : v (1234) { Serial.println("    constructor"); Serial.flush(); }
  ~C() { Serial.println("    destructor"); Serial.flush(); }
};

namespace std {
  /* void terminate() {
    Serial.println("    terminate");
    Serial.flush();
  }*/
}

void setup() {
  Serial.begin(115200);
  Serial.println("Starting");
  Serial.flush();

  char* arr;
  foo* f;
  C* c;
  
  /***** nothrow new *****/
  
  // put your setup code here, to run once:
  arr = new (std::nothrow) char[s];
  Serial.print("  arr=0x");
  Serial.println((uintptr_t)arr, HEX);
  Serial.flush();
  delete[] arr;
  
  f = new (std::nothrow) foo();
  Serial.print("  f=0x");
  Serial.println((uintptr_t)f, HEX);
  Serial.flush();
  delete f;

  c = new (std::nothrow) C();
  Serial.print("  c=0x");
  Serial.println((uintptr_t)c, HEX);
  if (c) {
    Serial.print("    c.v=");
    Serial.println(c->v);
  }
  Serial.flush();
  delete c;
  
  /***** regular new *****/
  
  arr = new char[s];
  Serial.print("  arr=0x");
  Serial.println((uintptr_t)arr, HEX);
  Serial.flush();
  delete[] arr;
  
  f = new foo;
  Serial.print("  f=0x");
  Serial.println((uintptr_t)f, HEX);
  Serial.flush();
  delete f;
  
  c = new C();
  Serial.print("  c=0x");
  Serial.println((uintptr_t)c, HEX);
  if (c) {
    Serial.print("    c.v=");
    Serial.println(c->v);
  }
  Serial.flush();
  delete c;

  /***** placement (non-allocating) new *****/
  char buf[sizeof(C)];
  
  c = new (buf) C();
  Serial.print("  c=0x");
  Serial.println((uintptr_t)c, HEX);
  if (c) {
    Serial.print("    c.v=");
    Serial.println(c->v);
  }
  Serial.flush();
  c->~C();

  Serial.println("Done");
  Serial.flush();
}

void loop() {}

@matthijskooijman
Copy link
Collaborator Author

It would be good if we could get his reviewed and merged before 1.8.14, since it fixes a regression in 1.8.13.

@facchinm facchinm merged commit 6ec8015 into arduino:master Oct 15, 2020
@matthijskooijman
Copy link
Collaborator Author

I recently wondered about the change we made to (optionally) call std::terminate() on allocation failures, and had a look at how gcc's libstdc++ implements this (which I think we haven't considered before, we've just been looking at the C++ spec).

Turns out that libstdc++ does a similar thing, except that it does not call std::terminate() (which can be replaced by user code), but instead aborts (using the seemingly undocumented __builtin_abort() function see docs and code). Since other Arduino architectures (like ARM) use libstdc++, this means that replacing std::terminate() to catch allocation errors (which was what I was looking for) would work on AVR only (when NEW_TERMINATES_ON_FAILURE is defined in the core), but would not be portable to other cores.

For future reference, what I think will work portably, is completely replacing operator new (regular and array version) with one that calls malloc and then handles allocation failures there. This should work, since these are (since this PR) declared weak and can be overridden (as required by the C++ spec, and also implemented by libstdc++).

However, reading back my own code, it seems like this PR might have made the function declarations weak, rather than the definitions, causing references to operator new become weak, rather than the definitions... I'm going to try the override approach soon, and will report on the results here.

@matthijskooijman
Copy link
Collaborator Author

matthijskooijman commented Jun 21, 2022

However, reading back my own code, it seems like this PR might have made the function declarations weak, rather than the definitions, causing references to operator new become weak, rather than the definitions... I'm going to try the override approach soon, and will report on the results here.

Seems this is correct. However:

  • This does not create a "weak reference" as I feared, turns out there is a different weakref attribute for that.
  • This does make all implementations (that #include <new>) weak, not just the Arduino-provided one.
  • Linking order seems to make sketch-supplied operator new preferred over the Arduino-supplied ones, so that is good (this also works when overriding some and still requiring some other operators from new.cpp).

So the weak attribute is in the wrong place (should be fixed - see #487), but fortunately overriding these operators does still work as expected currently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Operator new improvements support for notrow
2 participants