Add index for GUN on tuf_files #1628

stefan-zh

At work my team uses Notary, and we ran into a serious inefficiency in database queries against the tuf_files table. The table lacks an index on the gun field, so the MySQL/MariaDB engine performs a full table scan each time gun is used in a WHERE clause (which most Notary server endpoints do). We have a large number of repositories in tuf_files, and at that scale the slowdown became noticeable.

A default BTree index on gun makes sense since GUNs are tree-like in nature (they are even represented like directory structures on disk). Making this change in our environment sped up the Notary server significantly.

Note: To avoid confusion, there is already an index on tuf_files called gun, but it is a composite UNIQUE KEY index on gun, role and version. This index doesn't help when there is a WHERE clause involving the gun column alone, so the DB engine performs a full table scan anyway.

Signed-off-by: Stefan Zhelyazkov <[email protected]>
@stefan-zh stefan-zh force-pushed the improvement/add-gun-index-tuf-files branch from 03dd881 to 1c58b4b Compare January 13, 2022 12:37
@jonnystoten
Contributor

I'm a bit confused by this - MySQL certainly should use the compound index for conditions on the 'left'-most columns, e.g. the gun index on gun, role, and version should be used for queries that have conditions on gun, on gun & role, and on gun, role, & version. It's interesting that adding this new index makes a difference. Perhaps MySQL thinks the query will have so many results that the existing index isn't worth using 🤔
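The leftmost-prefix behaviour described above can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL/MariaDB, so it only illustrates the general principle for a B-tree composite index; the table definition is a simplified assumption and not Notary's actual schema, and a real MySQL server may plan these queries differently.

```python
import sqlite3

# In-memory SQLite database; a stand-in for MySQL, used only to illustrate
# how a composite B-tree index serves leftmost-prefix conditions.
conn = sqlite3.connect(":memory:")

# Simplified, assumed table layout (not the real Notary schema).
conn.execute("""
    CREATE TABLE tuf_files (
        id INTEGER PRIMARY KEY,
        gun TEXT NOT NULL,
        role TEXT NOT NULL,
        version INTEGER NOT NULL,
        data BLOB
    )
""")

# Composite unique index named "gun", mirroring the one discussed here.
conn.execute("CREATE UNIQUE INDEX gun ON tuf_files (gun, role, version)")


def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); keep the detail text.
    return " | ".join(row[3] for row in rows)


# Condition on the leftmost column only: the composite index is usable.
gun_only = plan("SELECT data FROM tuf_files WHERE gun = ?", ("example.com/repo",))

# Condition on the two leftmost columns: the composite index is usable.
gun_and_role = plan(
    "SELECT data FROM tuf_files WHERE gun = ? AND role = ?",
    ("example.com/repo", "root"),
)

# Condition that skips the leftmost column: the planner scans the table.
role_only = plan("SELECT data FROM tuf_files WHERE role = ?", ("root",))

print("gun only:    ", gun_only)
print("gun and role:", gun_and_role)
print("role only:   ", role_only)
```

On MySQL or MariaDB the equivalent check is to run EXPLAIN on the slow query and look at which key, if any, the optimizer reports using.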

Which queries are you seeing that have a condition on gun only? AFAIK the only query that should do that is the one for deleting an entire repo.

I wonder if #1639 might help for your case? We found that the index wasn't being used when selecting the current file for a given gun and role because of multiple order by clauses that were sorting in opposite directions.

@stefan-zh
Author

stefan-zh commented Nov 14, 2022

@jonnystoten I found a reference for what you meant when you said "MySQL certainly should use the compound index for conditions on the 'left'-most columns": https://stackoverflow.com/a/9764392/9698467 Honestly, I never knew this was the case and I learned something new.

Something peculiar that I found, and that explains the problem, is that our gun index is listed as type HASH, even though UNIQUE KEY creates a BTree index by default. At one point our team started signing GUNs that were longer than 255 characters, so we increased the size of the gun column to 1024. At that point the composite gun index (gun+role+version) apparently silently converted from BTree to Hash because the key length exceeded the 3072-byte limit.
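A rough back-of-the-envelope check of that key length, as a sketch only: the thread does not state the character set or the widths of role and version, so the 255-character role column, the 4-byte integer version, and the 3 or 4 bytes per character used below are all assumptions. Per-column length overhead is ignored.

```python
# Index key size limit cited in this comment, in bytes.
KEY_LIMIT_BYTES = 3072


def max_key_bytes(columns, bytes_per_char):
    """Approximate the widest possible index key for the given columns.

    columns is a list of (kind, size) pairs: "varchar" sizes are in
    characters, "fixed" sizes are already in bytes.
    """
    total = 0
    for kind, size in columns:
        if kind == "varchar":
            total += size * bytes_per_char
        else:
            total += size
    return total


# Assumed layouts: gun at 255 vs. 1024 characters; role and version widths
# are hypothetical values chosen for illustration.
before = [("varchar", 255), ("varchar", 255), ("fixed", 4)]
after = [("varchar", 1024), ("varchar", 255), ("fixed", 4)]

for bytes_per_char in (3, 4):  # e.g. a 3-byte or 4-byte UTF-8 encoding
    small = max_key_bytes(before, bytes_per_char)
    large = max_key_bytes(after, bytes_per_char)
    print(
        f"{bytes_per_char} bytes/char: "
        f"gun(255) key = {small} bytes (over limit: {small > KEY_LIMIT_BYTES}), "
        f"gun(1024) key = {large} bytes (over limit: {large > KEY_LIMIT_BYTES})"
    )
```

Under either assumed encoding the original 255-character layout stays under the limit, while the 1024-character layout exceeds it, which is consistent with the conversion described above.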

Given the Hash type, the index cannot serve SELECT queries that filter on the gun column alone, so we need to keep the additional gun_index in our instance. But I will close this pull request.

@stefan-zh stefan-zh closed this Nov 14, 2022
@stefan-zh stefan-zh deleted the improvement/add-gun-index-tuf-files branch November 14, 2022 15:47