-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
esdsl #69254
esdsl #69254
Conversation
Pinging @elastic/kibana-app-arch (Team:AppArch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just did a quick code review pass, added a couple of minor comments below.
Other than that, we need unit tests on esRawResponse
and esdsl
.
902b768
to
bd89491
Compare
bd89491
to
0ffe81b
Compare
@elasticmachine merge upstream |
expressions.registerFunction(esdsl); | ||
expressions.registerType(esRawResponse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a note for myself here - currently we are registering esaggs from plugin.ts
, but would be nice to move that here instead for consistency. I'll try to do that as we work toward server migration of esaggs.
src/plugins/data/public/search/expressions/__snapshots__/es_raw_response.test.ts.snap
Outdated
Show resolved
Hide resolved
* under the License. | ||
*/ | ||
|
||
import { EsRawResponse, esRawResponse } from './es_raw_response'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Could also add a few tests for .to.kibana_datatable()
bb2a4f3
to
968fa8c
Compare
@elasticmachine merge upstream |
@elasticmachine merge upstream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested the latest changes, and was so far unable to find another way to break it. Like we discussed earlier, I have a feeling we will need to address any other edge cases as they come up. But since there will be nobody initially using this function, and we are confident we have covered the most common query cases, I am comfortable merging as-is since I expect this will address the majority of cases.
dsl: string; | ||
} | ||
|
||
export type EsdslExpressionFunctionDefinition = ExpressionFunctionDefinition< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also: This should get added to the mapping of all function definitions provided by the expressions plugin in src/plugins/expressions/common/expression_functions/types.ts
Never mind; forgot this lives in the data plugin 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API looks pretty rough, and I've left some comments about it based on the spacetime work that I did. The way I would define "ready to use" here is: can I create a visualization in Canvas from flattening multiple aggregations?
{ inspectorAdapters: {} } as any | ||
); | ||
|
||
expect(result).toMatchSnapshot(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These snapshot tests are hard to understand from a reviewing perspective. Can you make specific assertions instead of using snapshots?
|
||
expect(result).toMatchSnapshot(); | ||
}); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests look incomplete to me:
- What happens if the JSON is invalid?
- What if the JSON has line breaks?
- What if the context includes query, filters, and timerange?
- What if the JSON includes a
bool
query, how is that query merged with the context?
async fn(input, args, { inspectorAdapters, abortSignal }) { | ||
const searchService = getSearchService(); | ||
|
||
const dsl = JSON.parse(args.dsl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could throw, please add a test
}, | ||
{ | ||
dsl: | ||
'{"index": "kibana_sample_data_logs", "size": 4, "body": { "_source": false, "query": { "term": { "machine.os.keyword": "osx"}}}}', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't the DSL... the section labelled body
is the DSL. index
should be a separate parameter to the function.
const dsl = JSON.parse(args.dsl); | ||
if (!dsl.body) dsl.body = {}; | ||
|
||
const dslBody = dsl.body; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument that you're calling body
should be the only argument. Everything else should be top-level arguments to the expression function, such as:
esdsl index="my_index" dsl='{ "query": { "match_all": {} }, "size": 0, "aggs": { } }'
}; | ||
|
||
const convertResult = (body: SearchResponse<unknown>) => { | ||
return body.hits.hits.length ? parseRawDocs(body.hits) : flatten(body.aggregations); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented on this earlier, but because of the way the query DSL will return hits
by default, I think aggregations
should take priority.
id: key, | ||
name: key, | ||
meta: { | ||
type: typeof rows[0][key], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a stricter definition of this type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we have anything at the moment, but hopefully elasticsearch can provide this kind of meta data on the response.
"meta": Object { | ||
"field": "products.created_on", | ||
"params": Object {}, | ||
"type": "object", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the JS type, but it's not actually useful for our purposes. I would have expected that arrays would be flattened more.
skipped: 0, | ||
failed: 0, | ||
}, | ||
hits: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a copy+paste response with the previous? Can it be stored as a variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, they are different, the second one having aggs and docs
"products.created_on": Array [ | ||
"2016-12-12T09:27:22.000Z", | ||
"2016-12-12T09:27:22.000Z", | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This row value is not useful for visualizations, since it's an array. In the spacetime work that Joe and I did, this was the most important case to figure out. Both of us chose to unroll these nested arrays into multiple rows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this response handling is copied from joe's space time. I am hoping we can get elasticsearch to return meta information with responses and thus address this, but if you have a simple solution that would make this parsing better i am open for suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that we should follow up with this later. It might need special handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this looks good to me, but if I understood correctly, there is no way to use date fields at the moment because they won't be converted to actual date
objects. At least for canvas this limits possible use cases.
One option would be to add an auto-detection of ISO dates and converting them automatically or to have a separate function for that. Doesn't have to be part of this PR though.
# Conflicts: # src/plugins/data/public/public.api.md
} | ||
|
||
const parseRawDocs = (hits: SearchResponse<unknown>['hits']) => { | ||
return hits.hits.map((hit) => hit.fields).filter((hit) => hit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to map the fields
? In most cases when we are working with raw documents we rely on the stuff in _source
(at least that's how I used the results in most cases). Another option would be to use both and prefer fields if available.
# Conflicts: # src/plugins/data/public/public.api.md # src/plugins/data/public/search/search_service.ts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making these API changes, I found it fairly easy to work with in Canvas. The main changes that I want to clarify is how we want fields
to behave: it's the new standard, and I think it's wrong to ignore it. I would be comfortable merging this sooner and making those behavior changes later, but only if you add the tests that indicate missing support for fields
.
"products.created_on": Array [ | ||
"2016-12-12T09:27:22.000Z", | ||
"2016-12-12T09:27:22.000Z", | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that we should follow up with this later. It might need special handling
}); | ||
|
||
describe('esRawResponse', () => { | ||
describe('converts aggregations to table', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}); | ||
}); | ||
|
||
describe('converts raw docs to table', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that there was previous discussion about how to handle the fields
parameter if it's present, but the current behavior of just ignoring it is probably not the long-term behavior. I'm fine with changing the behavior for this, but I'd like to see a test that is only for the fields parameter. This test would indicate that it's intentionally ignored.
In the long term, I think fields
and _source
should be merged when creating the table response. If you want a shorter term solution, you could something like with aggregations
and create a hierarchy of aggs > fields > _source
.
} | ||
|
||
const parseRawDocs = (hits: SearchResponse<unknown>['hits']) => { | ||
return hits.hits.map((hit) => hit._source).filter((hit) => hit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ignoring the hits.fields
parameter, which I know you previously discussed. I would encourage you to add support for this because it's now being documented as the primary way to retrieve more specific information. https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-fields.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An easy way would be to flatten out source fields by _source.<field name>
and fields
fields by fields.<field name>
. This stays close to the format of Elasticsearch and gives the user all they need (renaming the stuff they need shouldn't be hard)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this state. There are things to improve, but IMHO they shouldn't be part of the very first version of this. I left one suggestion about the fields vs source debate, otherwise I think this is a great addition.
Thanks for working on this!
# Conflicts: # src/plugins/data/public/index.ts
6c1fb40
to
66d0e85
Compare
💚 Build SucceededBuild metrics@kbn/optimizer bundle module count
page load bundle size
History
To update your PR or re-run it, just comment with: |
closes #70772
Summary
Adds expression function to access app arch search service from expressions.
esdsl
function takes a singledsl
parameter which should be stringified elasticsearch search request jsonit will try to merge any filters,query,timerange present on the input context into the request.
it will output
raw_es_response
format, which can be automatically converted todatatable
andkibana_table
.if raw es response contains docs and aggregation results the conversion to datatable will only take the docs.
it will log the request and response to inspector
followup:
parse_es_response
function which allows you to better control conversion from es_raw_response to datatable.Checklist
Delete any items that are not applicable to this PR.