ESQL: Support for String and IP and boolean in the TOP aggregate function
#110346
Comments
Pinging @elastic/es-analytical-engine (Team:Analytics)
An example of how
This gets you a list of which hosts are hosting what service. Except if there are more than 10 hosts with the service, then it just gets you the first 10. Which is probably good for a UI that'll display a bunch of services. It can just say In this case we really don't care too much about the sorting aspect of
- Added support for Booleans on Max and Min
- Added some helper methods to BitArray (`set(index, value)` and `fill(from, to, value)`). This way, the container is more similar to other BigArrays, and it's easier to work with

Part of #110346, as Max and Min are dependencies of Top.
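To make the two helper methods concrete, here is a minimal sketch of what `set(index, value)` and `fill(from, to, value)` could look like on a growable bit container. `SimpleBitArray` is a hypothetical stand-in backed by a plain `long[]`; it is not the Elasticsearch `BitArray` class, and the real one may differ in storage and signatures.

```java
import java.util.Arrays;

/** Hypothetical growable bit container; NOT the Elasticsearch BitArray. */
final class SimpleBitArray {
    private long[] words = new long[1];

    /** Sets or clears the bit at index, growing the backing array only when setting. */
    void set(long index, boolean value) {
        int word = Math.toIntExact(index >>> 6);
        if (value) {
            if (words.length <= word) {
                words = Arrays.copyOf(words, Math.max(word + 1, words.length * 2));
            }
            words[word] |= 1L << index;      // a long shift uses only the low 6 bits of index
        } else if (word < words.length) {
            words[word] &= ~(1L << index);   // bits beyond the array are already false
        }
    }

    /** Returns the bit at index; anything never set reads as false. */
    boolean get(long index) {
        int word = Math.toIntExact(index >>> 6);
        return word < words.length && (words[word] & (1L << index)) != 0;
    }

    /** Sets every bit in [from, to) to value. Bit by bit for clarity, not speed. */
    void fill(long from, long to, boolean value) {
        for (long i = from; i < to; i++) {
            set(i, value);
        }
    }
}
```

With an explicit `value` argument, callers can write `bits.set(group, seen)` instead of branching between a set and a clear call, which is the kind of convenience the description mentions.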
Support Version, Keyword and Text in Max and Min aggregations.

The current implementation of both max and min does:

For non-grouping:
- Store a BytesRef
- When there's a new max/min, copy it to the internal array. Grow it if needed

For grouping:
- Keep an array of BytesRef (null by default: there's no "initial/default value" here, as there's no "MAX" value for a string)
- Each BytesRef stores its own array, which will be grown as needed to copy the new max/min

Some notes:
- It's not shrinking the arrays, to avoid having to copy and potentially grow them again
- It's using raw arrays. But maybe it should use BigArrays so the memory is counted by the circuit breaker?

Part of #110346
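As an illustration of the non-grouping case described above, here is a small sketch that keeps the current maximum in a reusable buffer which grows but never shrinks. `MaxBytesState` and its methods are hypothetical names, and a plain `byte[]` stands in for `BytesRef`; the unsigned byte-wise comparison is an assumption about the intended ordering, not something taken from the PR.

```java
import java.util.Arrays;

/** Hypothetical non-grouping "max" state for byte strings; not the actual ESQL class. */
final class MaxBytesState {
    private byte[] buffer = new byte[0]; // reused across updates, never shrunk
    private int length;
    private boolean seen;                // false until the first value: there is no default max

    void add(byte[] value) {
        add(value, 0, value.length);
    }

    void add(byte[] value, int offset, int len) {
        // Assumption: values are ordered by unsigned lexicographic byte comparison.
        if (seen && Arrays.compareUnsigned(value, offset, offset + len, buffer, 0, length) <= 0) {
            return; // not a new max
        }
        if (buffer.length < len) {
            // Grow only when needed; the old content is about to be overwritten anyway.
            buffer = new byte[Math.max(len, buffer.length * 2)];
        }
        System.arraycopy(value, offset, buffer, 0, len);
        length = len;
        seen = true;
    }

    /** Returns a copy of the current max, or null if no value was added. */
    byte[] max() {
        return seen ? Arrays.copyOf(buffer, length) : null;
    }

    /** Size of the backing buffer, exposed only to show that it never shrinks. */
    int capacity() {
        return buffer.length;
    }
}
```

The grouping variant would keep one such buffer per group, left null until the group sees its first value.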
The ESQL aggregate function `TOP` supports numerics and dates. We'd love to get support for `ip`, `boolean`, and `keyword`/`text` fields. These are all unique and different.

`ip` fields are a fixed length string type. We could use exactly the same technique to support `ip`.

`boolean` fields are single bit fields and their sorting is funky. In this case I think having a small counter for the number of `false`s and the number of `true`s you've seen is probably fine. That's kind of how `VALUES` works for booleans - though it deduplicates so instead of a counter it's just two `boolean` values.

`text`/`keyword` fields have a variable length. They are super useful to support, but they are probably the most difficult. Maybe we'd store "ordinal" values in the min heap and we'd use a side car structure to assign ordinals. Or something.
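The counter idea for `boolean` fields could look roughly like the sketch below. `BooleanTopState`, its constructor parameters, and the ordering `false < true` are all assumptions made for illustration; this is not the actual ESQL implementation.

```java
import java.util.Arrays;

/** Hypothetical TOP state for booleans: two counters instead of a heap. */
final class BooleanTopState {
    private final int limit;         // how many values TOP should return
    private final boolean ascending; // assumption: false sorts before true
    private long falses;
    private long trues;

    BooleanTopState(int limit, boolean ascending) {
        this.limit = limit;
        this.ascending = ascending;
    }

    void add(boolean value) {
        if (value) {
            trues++;
        } else {
            falses++;
        }
    }

    /** Rebuilds the top values from the counters, in sort order. */
    boolean[] result() {
        long firstCount = ascending ? falses : trues;   // count of the value that sorts first
        long secondCount = ascending ? trues : falses;
        int nFirst = (int) Math.min(limit, firstCount);
        int nSecond = (int) Math.min(limit - nFirst, secondCount);
        boolean[] out = new boolean[nFirst + nSecond];
        Arrays.fill(out, 0, nFirst, !ascending);         // false when ascending, true when descending
        Arrays.fill(out, nFirst, out.length, ascending);
        return out;
    }
}
```

For example, with a limit of 3 in ascending order, adding `true, false, true, true, false` yields `[false, false, true]`. Since only two distinct values exist, the counters carry everything a heap would.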