This repository has been archived by the owner on Sep 10, 2021. It is now read-only.

Huge memory usage when reading large files #36

Open
divaltor opened this issue Jul 26, 2020 · 4 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@divaltor
Contributor

  • Yandex Market Language (YML) for Python version: 0.6.0
  • Python version: 3.8.3
  • Operating System: Windows 10 2004 (for testing) and Debian 10 (for production)

What I Did

I tried to process a 100 MB file, and the machine doing the processing exceeded its memory limit (~2 GB). Is there any way to optimize file reading or processing so that it doesn't run out of memory?

image

@stefanitsky
Owner

@divaltor Hi! Hm, I think you can try this:
[image]

If it works, we can add it to the project; that would be great.

@stefanitsky stefanitsky added enhancement New feature or request good first issue Good for newcomers labels Jul 26, 2020
@divaltor
Contributor Author

I will try to implement this, but I think I will have to rewrite most of the library.

@divaltor
Contributor Author

@stefanitsky I reworked the processing to use iterparse, but it made no difference: processing the file still takes up the same amount of RAM.
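
One common reason iterparse alone does not lower memory use is that the parser still builds the full tree behind the scenes unless each element is released after it has been handled. Below is a minimal sketch of that pattern using the standard library's `xml.etree.ElementTree`, which may differ from the parser the library actually uses. The `sum_prices` helper and the sample document are hypothetical, and the `offer` and `price` tag names are assumed for illustration.

```python
import io
import xml.etree.ElementTree as ET


def sum_prices(source):
    """Stream over <offer> elements, releasing each one after it is read."""
    total = 0.0
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "offer":
            total += float(elem.findtext("price", default="0"))
            # Drop the offer's children, text and attributes. Only an empty
            # shell stays attached to the parent element.
            elem.clear()
    return total


# Hypothetical sample feed, used only to exercise the function.
sample = io.BytesIO(
    b"<yml_catalog><shop><offers>"
    b"<offer id='1'><price>10.5</price></offer>"
    b"<offer id='2'><price>4.5</price></offer>"
    b"</offers></shop></yml_catalog>"
)
print(sum_prices(sample))
```

Without the `elem.clear()` call, every parsed offer stays reachable from the root element, so peak memory ends up close to that of a full parse.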

@divaltor divaltor reopened this Jul 27, 2020
@stefanitsky
Owner

> @stefanitsky I reworked the processing via iterparse, but it didn't give any result. In the end, the file takes up the same amount of RAM

Then I think the problem is that the entire package is built on classes, and with such a large file all of those objects end up in memory at once. You could try producing the class instances from a generator, so that the file is parsed into memory gradually, as needed, rather than all at once.
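
The generator idea could look roughly like the sketch below. It is an illustration under assumptions, not the library's API: the `Offer` dataclass and `iter_offers` function are hypothetical stand-ins, the `offer` and `price` tag names are assumed, and it uses the standard library's `xml.etree.ElementTree` rather than whatever parser the package relies on.

```python
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Offer:
    """Hypothetical lightweight stand-in for one of the package's classes."""
    id: str
    price: float


def iter_offers(source) -> Iterator[Offer]:
    """Yield one Offer at a time instead of building the whole list."""
    parents = []  # stack of currently open elements
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            parents.append(elem)
            continue
        parents.pop()  # elem is closed; its parent is now on top of the stack
        if elem.tag == "offer":
            yield Offer(
                id=elem.get("id", ""),
                price=float(elem.findtext("price", default="0")),
            )
            # Detach the processed element so the tree does not keep growing.
            if parents:
                parents[-1].remove(elem)


# Hypothetical sample feed, used only to exercise the generator.
sample = io.BytesIO(
    b"<yml_catalog><shop><offers>"
    b"<offer id='1'><price>10.5</price></offer>"
    b"<offer id='2'><price>4.5</price></offer>"
    b"</offers></shop></yml_catalog>"
)
for offer in iter_offers(sample):
    print(offer)
```

A caller that only needs one offer at a time (for example, to write each one to a database) never holds more than a single `Offer` plus the currently open elements. Collecting the results with `list(iter_offers(...))` would bring the memory cost back.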


2 participants