Different memory usage behavior with Nim 2.0.0 vs 1.6.x #22510
Comments
The allocator in 2.0 did change to use a shared heap, and I think this change is only in the 1.9–2.0 line, but I cannot remember for sure.
I created an "artificial" problem showing a leak; let's see if I can find it again... |
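To illustrate what the shared-heap change is about, here is a minimal sketch (my own illustration, not code from this issue) in which one thread allocates a message and another thread frees it, so allocation and deallocation happen on different threads:

```nim
# Sketch: cross-thread allocation/deallocation through a Channel.
# The main thread allocates the seq; the worker receives the deep
# copy and frees it when `data` goes out of scope. This is the
# pattern the 2.0 shared-heap allocator has to support.
var
  chan: Channel[seq[int]]
  worker: Thread[void]

proc consume() {.thread.} =
  let data = chan.recv()
  echo "worker received ", data.len, " ints"

chan.open()
createThread(worker, consume)
chan.send(newSeq[int](1000)) # allocated by the main thread
joinThread(worker)           # freed by the worker thread
chan.close()
```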
I want to chime in that I am experiencing a similar issue when moving from 1.6.4 to 2.0. I am having a hard time reproducing the bug in a minimal example. The code is part of a larger computational model; it does not show an increased memory footprint when running under 1.6.x versions, but it does on 2.0. |
@cvanelteren Does `-d:useMalloc` help? |
It does @Araq! |
@hamidb80 just to be sure, does it also leak with ORC? |
Turned out my test program was invalid and had no leak. Any small examples reproducing the problem? |
Unfortunately my code is part of a simulator. Not sure exactly where the error originated from. |
I just ran into this with channels and (wrongly) opened an issue for it (#23078), thinking it to be a channel-related problem. I have since learned it was instead a memory allocation issue, because the example code provided there ate through 16GB of memory, crashed, and freed it again in under a second. Based on @beef331's and @griffith1deadly's advice I'm thus linking to it here. For context: this problem does not occur when compiling with `-d:useMalloc`.

As a secondary problem, even with `sink` provided, you'll still see crashes (when running the example in the linked issue) that do not occur under `-d:useMalloc`:

```
/home/philipp/dev/threadbutler/src/threadButler/channelHub.nim(42) serverLoop
/usr/lib/nim/system/alloc.nim(1052) alloc
/usr/lib/nim/system/alloc.nim(890) rawAlloc
/usr/lib/nim/system/alloc.nim(810) freeDeferredObjects
/usr/lib/nim/system/alloc.nim(767) addToSharedFreeListBigChunks
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
```

I don't have a solid understanding here myself, as memory interaction this low-level still feels beyond me, but apparently the allocator is not reusing memory blocks or something? |
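One way to check whether blocks are being reused (a sketch of my own, not from this thread) is to compare `getOccupiedMem` with `getTotalMem` across iterations: if occupied memory stays flat while total memory keeps climbing, the allocator is taking new blocks from the OS rather than reusing freed ones:

```nim
import std/strutils

proc reportMem(tag: string) =
  # These counters track Nim's own allocator, so they are only
  # meaningful when NOT compiling with -d:useMalloc.
  # getOccupiedMem: bytes currently owned by live objects.
  # getTotalMem: bytes the process has obtained from the OS.
  echo tag, ": occupied ", formatSize(getOccupiedMem()),
       ", total ", formatSize(getTotalMem())

for i in 1..5:
  var data = newSeq[int](1_000_000) # ~8 MB, freed at the end of each iteration
  data[0] = i
  reportMem("iteration " & $i)
```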
So we now have a small test program?

```nim
import std/[sequtils]

type
  BackendMessage* = object
    field*: seq[int]

type Container = ref object
  chan1: Channel[BackendMessage]
  chan2: Channel[BackendMessage]

proc routeMessage*(msg: BackendMessage; container: ptr Container) =
  discard container[].chan2.trySend(msg)

proc setupChannelReceiver(cont: var Container): Thread[ptr Container] =
  proc recvMsg(container: ptr Container) =
    while true:
      let resp = container[].chan1.tryRecv()
      if resp.dataAvailable:
        routeMessage(resp.msg, container)

  createThread(result, recvMsg, cont.addr)

const MESSAGE_COUNT = 100

proc main() =
  var cont = Container()
  cont.chan1.open()
  cont.chan2.open()
  let msg: BackendMessage = BackendMessage(field: (0..500).toSeq())
  let channelReceiverThread = setupChannelReceiver(cont)
  while true:
    echo "New iteration"
    var counter = 0
    for _ in 1..MESSAGE_COUNT:
      discard cont.chan1.trySend(msg)
    echo "After sending"
    while counter < MESSAGE_COUNT:
      let resp = cont.chan2.tryRecv()
      if resp.dataAvailable:
        counter.inc
    echo "After receiving"
  joinThreads(channelReceiverThread)

main()
``` |
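For reference, the `sink` workaround mentioned earlier amounts to a one-parameter change to the program above (a sketch of my understanding of the advice from #23078, not a verified fix):

```nim
# Marking the message parameter `sink` lets the compiler move the
# message into the channel instead of deep-copying it on every send.
proc routeMessage*(msg: sink BackendMessage; container: ptr Container) =
  discard container[].chan2.trySend(msg)
```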
Aye. Though the latter bug (the crash without the insane memory consumption) is pretty flaky in its occurrence, the memory issue occurs reliably. The stack trace you see when crashing due to the memory spike looks identical to the one I get when crashing without it (at least I believe it's without the memory spike, as my system monitor doesn't detect memory consumption spikes within the 500ms between checks), so maybe that also helps.

Edit: I'll likely need to derive a second example from my own code that triggers the second error more reliably. That other code (not the example above) reproduces the error nicely, but has a ton of "noise" associated with it that would make debugging harder. |
Does the example I provided suffice to start troubleshooting at least the first issue, the memory allocation one? Just trying to avoid communication errors: deriving a new example from my code specifically for the second issue (the segfault without it eating 15GB of RAM) will likely take a bit, as cutting down the first example was a multiple-hour process that hopefully goes quicker this time around ^^' |
Better test program that doesn't misuse the threading API:

```nim
import std/[atomics, strutils, sequtils]

type
  BackendMessage* = object
    field*: seq[int]

var
  chan1: Channel[BackendMessage]
  chan2: Channel[BackendMessage]

chan1.open()
chan2.open()

proc routeMessage*(msg: BackendMessage) =
  discard chan2.trySend(msg)

var
  recv: Thread[void]
  stopToken: Atomic[bool]

proc recvMsg() =
  while not stopToken.load(moRelaxed):
    let resp = chan1.tryRecv()
    if resp.dataAvailable:
      routeMessage(resp.msg)
      echo "child consumes ", formatSize getOccupiedMem()

createThread[void](recv, recvMsg)

const MESSAGE_COUNT = 100

proc main() =
  let msg: BackendMessage = BackendMessage(field: (0..500).toSeq())
  for j in 0..10:
    echo "New iteration"
    var counter = 0
    for _ in 1..MESSAGE_COUNT:
      discard chan1.trySend(msg)
    echo "After sending"
    while counter < MESSAGE_COUNT:
      let resp = chan2.tryRecv()
      if resp.dataAvailable:
        counter.inc
    echo "After receiving ", formatSize getOccupiedMem()
  stopToken.store(true, moRelaxed)
  joinThreads(recv)

main()
``` |
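To compare the two behaviors with either test program, one would build the same file both ways (assuming it is saved as `repro.nim`; the file name is my placeholder):

```
nim c -r --mm:orc --threads:on repro.nim               # Nim's own allocator: memory climbs
nim c -r --mm:orc --threads:on -d:useMalloc repro.nim  # malloc/free: behaves as expected
```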
I believe I'm also running into this bug. While I don't have experience with Nim 1.6, I am experiencing slowly increasing memory usage (where there should not be any) while using threads in Nim 2.0.2. While `-d:useMalloc` does seem to solve the memory problem, it also runs 25%–50% slower :( |
Related issue: #23361. |
Description
I have an HTTP server written in Nim using Mummy and several other libs I wrote (some info here). It is multi-threaded and built using `--mm:orc` and `--threads:on` when building with 1.6.x. I have been running this server for months now, built with Nim 1.6.10, 1.6.12 and now 1.6.14, and have seen consistent memory behavior and nothing that appears to be a memory leak.

Yesterday I built the server using Nim 2.0.0 and deployed it into production to see how things went. Unfortunately, I noticed a difference in memory usage behavior right away. The Nim 2.0.0 build had steadily increasing memory usage: within a handful of minutes it grew to 3x what the Nim 1.6.x build would use after being stable for hours, and it was still rising.

I took the Nim 2.0.0 build offline and then tested with Nim 2.0.0 + `-d:useMalloc`. In this case, the memory behavior was as expected based on my previous 1.6.x experience, so it seemed to "fix" the behavior. I wanted to report this. Each build is a clean build using the same deps and code, where the only change is the Nim version (and possibly adding `-d:useMalloc`).

That the memory usage behavior changes only with Nim 2.0.0 and without `-d:useMalloc`, combined with a long history of 1.6.x working well, seems to indicate a decent probability that the leak is not in the server code itself, though being sure of this is not straightforward. Are there any known behavior differences with the new Nim 2.0.0 memory management that could explain this, or are there known or reported issues that could be a cause?

Nim Version
Built using this Docker image: https://hub.docker.com/layers/nimlang/nim/2.0.0-alpine/images/sha256-94cfb2d2d31e23759dfb02b50995b9e24c2cde8cfe3c07298addb3d6b4755457
Current Output
No response
Expected Output
No response
Possible Solution
No response
Additional Information
No response