compact dlt pipeline state table in filesystem destination #1657

Closed
rudolfix opened this issue Aug 2, 2024 · 0 comments · Fixed by #1838
rudolfix commented Aug 2, 2024

Background
We started storing pipeline state and schemas in the filesystem destination so that they can be restored. For the state table, each load may produce a new file. We should truncate old states, e.g. keep only the 100 newest.

Requirements

  1. when a load is completed, delete old files in the _dlt_pipeline_state table
  2. keep the N last state files
  3. delete only the state files that correspond to finished loads (those with a corresponding completed entry). this prevents the rare case where we have 100 unsuccessful partial loads and delete the last correct state
  4. make this mechanism on by default and allow it to be disabled with a flag
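
The requirements above could be sketched roughly as follows. This is only an illustration of the selection logic, not the dlt implementation: `StateFile`, `select_state_files_to_delete`, `max_state_files` and `enabled` are hypothetical names, and it assumes each state file can be mapped to a load id that sorts numerically by load time.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class StateFile:
    """Hypothetical descriptor of one file in the _dlt_pipeline_state table."""
    path: str
    load_id: str  # assumed to be a numeric, timestamp-like string


def select_state_files_to_delete(
    state_files: Iterable[StateFile],
    completed_load_ids: Set[str],
    max_state_files: int = 100,  # hypothetical option: N newest files to keep
    enabled: bool = True,        # hypothetical flag: compaction is on by default
) -> List[StateFile]:
    """Return state files that are safe to delete, newest first."""
    # Requirement 4: the mechanism can be switched off.
    # A non-positive limit is also treated as "do nothing" in this sketch.
    if not enabled or max_state_files < 1:
        return []

    # Requirement 3: only files whose load has a completed entry are candidates,
    # so files from partial loads never push out the last good state.
    completed = [f for f in state_files if f.load_id in completed_load_ids]

    # Requirement 2: keep the N newest completed files, delete the rest.
    completed.sort(key=lambda f: float(f.load_id), reverse=True)
    return completed[max_state_files:]


if __name__ == "__main__":
    files = [StateFile(f"state_{i}.jsonl", str(i)) for i in range(1, 6)]
    # Load "4" never completed, so its file is never selected.
    doomed = select_state_files_to_delete(files, {"1", "2", "3", "5"}, max_state_files=2)
    print([f.path for f in doomed])
```

Actual deletion would then run over the returned list once the load is completed (requirement 1). Files from unfinished loads are neither counted towards N nor deleted here, which is one possible reading of requirement 3.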
@rudolfix rudolfix added the support This issue is monitored by Solution Engineer label Aug 2, 2024
@rudolfix rudolfix assigned donotpush and unassigned sh-rp Sep 17, 2024
Projects
Status: Done