Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocate space for arrays when deserializing JSON #217

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ebroder
Copy link
Contributor

@ebroder ebroder commented Sep 8, 2024

Previously, when using dynamic mode and deserializing an incoming JSON message from MQTT, the library would allocate space based on the number of colons - i.e. the number of object key-value pairs. However, if there are arrays in the JSON message (such as when a shared attribute gets deleted), that wouldn't allocate enough space.

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Sep 8, 2024

Previously, when using dynamic mode and deserializing an incoming JSON message from MQTT, the library would allocate space based on the number of colons - i.e. the number of object key-value pairs. However, if there are arrays in the JSON message (such as when a shared attribute gets deleted), that wouldn't allocate enough space.

I'm not sure this works either because what if the array has multiple values. Like for example this payload. The previous version would measure 3 : and therefore allocate 3 * 16 = 48 but it actually requires 80 and your version would allocate 3 * 16 + 1 * 16 = 64, which is also not enough.

{
  "sensor": "gps",
  "time": 1351824120,
  "data": [
    48.75608,
    2.302038
  ]
}

@ebroder
Copy link
Contributor Author

ebroder commented Sep 8, 2024

Oh hmm, that's true. I was testing with the "deleted" message that gets sent when an attribute gets deleted, and that typically only results in a single key.

Can you do number of commas + number of [ + number of {?

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Sep 8, 2024

Oh hmm, that's true. I was testing with the "deleted" message that gets sent when an attribute gets deleted, and that typically only results in a single key.

Can you do number of commas + number of [ + number of {?

I thought of that as well, numbers of commas is probably the only way, but we would have to add 1, because the last comma is optional in the overlying structure and in the array.

So something like amount of , + amount of [ * 2 + amount of { * 2, might work.

@ebroder
Copy link
Contributor Author

ebroder commented Sep 8, 2024

Ah yeah I guess for each [ or { you need to allocate both for the array and the last element. I'll make that adjustment

@MathewHDYT
Copy link
Contributor

What I just though of is that this calculation with user input is pretty risky. Because if you send a malicious string containing a lot of ",,,,,,,,,,,,,,,,,,,,,,,,," you could crash the device.

Perhaps it would be useful to add an optional cap for the amount of memory the DynamicJsonDocument is allowed to allcoate with that call and per default it is 0, meaning we do not cap the allocation. What do you think?

@ebroder
Copy link
Contributor Author

ebroder commented Sep 8, 2024

Ugh this memory management strategy is rough. I don't think I have the time to try the upgrade now, but this would be a lot easier on ArduinoJson 7.

In fairness, that was already an issue if a user sent a malicious string of ":::::::::::::::::::::". For my use case I'm fine trusting my users, so I'm probably going to run with this approach but if it's not upstreamable that's fine.

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Sep 8, 2024

Ugh this memory management strategy is rough. I don't think I have the time to try the upgrade now, but this would be a lot easier on ArduinoJson 7.

In fairness, that was already an issue if a user sent a malicious string of ":::::::::::::::::::::". For my use case I'm fine trusting my users, so I'm probably going to run with this approach but if it's not upstreamable that's fine.

I know the problem is that it might be easier for this specific use case in the library, but we also use the JsonDocument a lot in the rest of the code and in all other occurences we know the size beforehand because the API of course expectes a predefined amount of arguments.

Allocating 1KB on the heap everytime we send a request which might not even require 100 bytes on the stack is questionable. But perhaps I'll implement it someday with an optional toogle, like the THINGSBOARD_ENABLE_DYNAMIC mode.

Additionally I'll implement the memory safety feature in my fork no problem. I'll also include your changes once their done in my current fork as well, because I'am already working on the next release and am currently close to completion.

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Sep 9, 2024

Merged and tested in my fork. The aforementioned potential memory issue does not exist. Because if the allocation is small enough to succeed we will delete the DynamicJsonDocument pretty soon anyway and if it fails we return from the method at this point with an error message.

Additionally the calculation seems to work as well.

@ebroder
Copy link
Contributor Author

ebroder commented Sep 9, 2024

Actually, now that I've slept on it, even that isn't a risk. The parameter to DynamicJsonDocument specifies the maximum amount of space that it's willing to allocate, not the amount of space it actually allocates. So even if a user does pass an argument with a lot of commas or colons, it raises the ceiling, but not the actual allocation. This should be totally fine.

@MathewHDYT
Copy link
Contributor

Actually, now that I've slept on it, even that isn't a risk. The parameter to DynamicJsonDocument specifies the maximum amount of space that it's willing to allocate, not the amount of space it actually allocates. So even if a user does pass an argument with a lot of commas or colons, it raises the ceiling, but not the actual allocation. This should be totally fine.

That is not the case the memoryUsage() method might return only the actual memory used, but the underlying allocation still allocates as much as is given in the constructor.

That can be read with the capacity() method. I've tested it with a test script that allocates a lot of memory on the heap and then receives a malicous payload that causes the allocation to fail and it calls the esp_heap_caps_alloc_failed method.

Previously, when using dynamic mode and deserializing an incoming JSON
message from MQTT, the library would allocate space based on the number
of colons - i.e. the number of object key-value pairs. However, if there
are arrays in the JSON message (such as when a shared attribute gets
deleted), that wouldn't allocate enough space.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants