Adds in ability to cache low res logits from prompts #582
Conversation
Not ready for review yet
Ready for review @PawelPeczek-Roboflow
inference/core/env.py (outdated)

```diff
@@ -283,6 +283,8 @@
 # Maximum embedding cache size for SAM, default is 10
 SAM_MAX_EMBEDDING_CACHE_SIZE = int(os.getenv("SAM_MAX_EMBEDDING_CACHE_SIZE", 10))
+# The sam2 low_res_masks are the biggest memory user; 1000 of them take 256*256*4*1000/1024/1024 bytes ≈ 250 MB
+SAM2_MAX_CACHE_SIZE = int(os.getenv("SAM_MAX_EMBEDDING_CACHE_SIZE", 1000))
```
I propose decoupling SAM from SAM2 in terms of the env variables configuring the model (a rough sketch is below); plus, let's list the config variable on this page.
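A minimal sketch of that decoupling, where `SAM2_MAX_LOGITS_CACHE_SIZE` is an assumed name for illustration, not necessarily what was merged:

```python
import os

# SAM keeps its existing knob:
SAM_MAX_EMBEDDING_CACHE_SIZE = int(os.getenv("SAM_MAX_EMBEDDING_CACHE_SIZE", 10))
# SAM2 gets its own variable instead of silently reusing the SAM one:
SAM2_MAX_LOGITS_CACHE_SIZE = int(os.getenv("SAM2_MAX_LOGITS_CACHE_SIZE", 1000))
```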
There is a fundamental problem I detected: the algorithm used in `find_prior_prompt_in_cache(...)` is computationally intractable when there is no hit in the cache.
Code to reproduce:

```python
import time

from inference.core.entities.requests.sam2 import (
    Box,
    Point,
    Sam2Prompt,
    Sam2PromptSet,
)
from inference.models.sam2.segment_anything2 import find_prior_prompt_in_cache

initial_prompt_set = Sam2PromptSet(
    prompts=[
        Sam2Prompt(
            box=Box(x=10, y=10, height=10, width=10),
            points=[Point(x=10, y=10, positive=True)] * 5,
        )
    ]
    * 3
)

start = time.time()
find_prior_prompt_in_cache(
    initial_prompt_set=initial_prompt_set,
    image_id="some",
    cache={},
)
print(f"Duration: {(time.time() - start) * 1000}ms")
```
Basically, you find the stack growing exponentially: on a cache miss the search keeps expanding smaller and smaller variants of the prompt set, so the number of candidates blows up combinatorially before the function can conclude there is no hit.
…into sam2-id-free-caching
@PawelPeczek-Roboflow ready for re-review. The problem we're trying to solve is this: suppose you have a prompt with n points. You get a mask back, and you want to negative-prompt based on that mask, so you have to feed the mask back in when adding the (n+1)th point. When we receive the request with n+1 points, we need to load the mask produced by the n-point prompt. The reason we want to cache these masks is that each one takes ~300 KB, and we don't want to incur the latency of serializing/deserializing them into np arrays, nor the network latency of shipping them around. This matters because Smart Polygon is trying to run in real time for image previews.
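Here is a minimal sketch of that round trip, not the PR's actual code: the cache layout, the key scheme, and the `predictor.predict(...)` interface are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

import numpy as np

# (image_id, prompt_key) -> low-res logits; each entry is 256*256*4 bytes ≈ 256 KB
logits_cache: Dict[Tuple[str, str], np.ndarray] = {}


def segment_with_cache(predictor, image_id, coords, labels, prompt_key, prev_prompt_key=None):
    # On the (n+1)-point request, reuse the low-res mask cached for the
    # n-point prompt instead of round-tripping ~300 KB arrays over the network.
    mask_input: Optional[np.ndarray] = logits_cache.get((image_id, prev_prompt_key))
    masks, scores, low_res_logits = predictor.predict(
        point_coords=coords,
        point_labels=labels,
        mask_input=mask_input,
    )
    # Save the fresh logits so the (n+2)-point request can pick them up.
    logits_cache[(image_id, prompt_key)] = low_res_logits
    return masks
```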
LGTM, three minor changes:
- please add a global env flag disabling the functionality at the server level (see the sketch after this list), and add descriptions to the changed request fields noting that the functionality may be disabled by server config:
  `save_logits_to_cache: bool = Field(default=False)`
  `load_logits_from_cache: bool = Field(default=False)`
- the typing of `self.low_res_logits_cache: LogitsCacheType = {}` is wrong; this is `Dict[Tuple[str, str], LogitsCacheType]`
- the return type of `find_prior_prompt_in_cache(...)` is probably `Optional[np.ndarray]`
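A minimal sketch of such a server-level kill switch, following the repo's pattern of reading env variables in `inference/core/env.py`; the name `DISABLE_SAM2_LOGITS_CACHE` is an assumption, not necessarily what was merged:

```python
import os

# Illustrative server-level flag; the variable name is an assumption.
# When true, the server ignores save_logits_to_cache / load_logits_from_cache
# on incoming requests.
DISABLE_SAM2_LOGITS_CACHE = os.getenv("DISABLE_SAM2_LOGITS_CACHE", "False").lower() in ("true", "1")
```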
@PawelPeczek-Roboflow thanks for helping me implement the fixes you suggested.
93241f3
Description
Performs a breadth-first search to find the most similar prior prompt for loading cached logits; a rough sketch of the idea follows.
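A hedged sketch of what such a search can look like, assuming a cache keyed by `(image_id, prompt_hash)`; the names below are illustrative, not the PR's actual functions:

```python
from collections import deque
from typing import Callable, Dict, Optional, Tuple

import numpy as np


def find_most_similar_cached_logits(
    points: Tuple,
    image_id: str,
    cache: Dict[Tuple[str, str], np.ndarray],
    hash_fn: Callable[[Tuple], str],
) -> Optional[np.ndarray]:
    # Breadth-first: try the full prompt, then every prompt with one point
    # dropped, then two, and so on; the first hit is the most similar prompt.
    start = tuple(points)
    seen = {start}
    queue = deque([start])
    while queue:
        candidate = queue.popleft()
        hit = cache.get((image_id, hash_fn(candidate)))
        if hit is not None:
            return hit
        for i in range(len(candidate)):
            smaller = candidate[:i] + candidate[i + 1:]
            if smaller and smaller not in seen:
                seen.add(smaller)
                queue.append(smaller)
    return None
```

The `seen` set is what keeps a cache miss tractable: without deduplication the number of visited candidates grows exponentially, which is exactly the blow-up flagged in the review above.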
Performance appears to be better when using cached logits. [Benchmark screenshots: timings without cached logits vs. with cached logits.]
Additionally, adds logic to pad the input points to the SAM model so that two different prompts with differing numbers of points can be used.
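A minimal sketch of that padding, assuming the SAM convention that padding points are given coordinates (0, 0) and label -1 so the prompt encoder ignores them; `pad_points` is an illustrative helper, not the PR's function:

```python
import numpy as np


def pad_points(coords: np.ndarray, labels: np.ndarray, target_len: int):
    """Pad (n, 2) point coords and (n,) labels up to target_len."""
    pad = target_len - coords.shape[0]
    if pad <= 0:
        return coords, labels
    coords = np.concatenate([coords, np.zeros((pad, 2), dtype=coords.dtype)])
    labels = np.concatenate([labels, np.full(pad, -1, dtype=labels.dtype)])
    return coords, labels
```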
Type of change
How has this change been tested? Please provide a test case or an example of how you tested the change.
Locally, integration tests
Any specific deployment considerations
For example, documentation changes, usability, usage/costs, secrets, etc.
Docs