[search source] ES Query rule loads fewer fields on query execution #183694

Merged
merged 45 commits into from
Jun 2, 2024

@mattkime mattkime commented May 17, 2024

Summary

tl;dr: ES Query alert execution creates less field_caps traffic. Date fields accessed in the alert message via fields.* might not render, aside from the timestamp field.

--

This PR reduces the number of fields loaded via field caps to the minimum required to run a query, rather than the full field list. It adds a createLazy method to the Search Source Service, which internally loads fields via a DataViewLazy object and then adds them to a DataView object. This approach minimizes changes and lets us ship quickly: SearchSource objects expose the DataView object they use, and Kibana apps may rely on it. Migrating away from this will take time, since the DataView object is both used internally and referenced externally. A key element of this code is the ability to extract a field list from a query so that a limited (rather than complete) set of fields can be loaded.*

One side effect of loading fewer fields is that date fields available via fields.* in the alert message may no longer work. Previously, all fields were loaded including all date fields. Now, date fields are only loaded if they're part of the query. This has been determined to be a small corner case and an acceptable tradeoff.

Only the ES Query rule is using this new method of loading fields. While further work is needed before wider adoption, this should yield significant data transfer savings via a reduction in field_caps usage.

Depends upon #183573


* We don't need to load all fields to create a query; rather, we need to load the fields where some attribute will change the output of the query. Sometimes the translation from KQL to DSL is the same no matter the field type (or any other attribute), and sometimes the translation depends on the field type and other attributes. Generally speaking, we need the latter.

There are additional complexities: we need to know which fields are dates (and date nanos) when their values are displayed so their values can be made uniform. In some circumstances we need to load a set of fields due to source field exclusion. It's not supported in ES, so Kibana submits a list of individual field names.
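The source field exclusion case can be sketched roughly as follows. This is a minimal illustration, not Kibana's implementation: toIncludedFields is a hypothetical helper, and the wildcard handling is an assumption about how exclusion patterns might look.

```typescript
// Hypothetical helper (not part of Kibana): turn "exclude these fields" into an
// explicit list of field names to request, since the exclusion itself is not
// sent to ES according to the description above.
function toIncludedFields(allFields: string[], excluded: string[]): string[] {
  // Convert each exclusion pattern into an anchored RegExp; `*` is assumed to
  // act as a wildcard, every other regex metacharacter is escaped.
  const matchers = excluded.map((pattern) => {
    const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
  });
  // Keep only the fields that no exclusion pattern matches.
  return allFields.filter((name) => !matchers.some((re) => re.test(name)));
}

const included = toIncludedFields(
  ['@timestamp', 'host.name', 'host.ip', 'message'],
  ['host.*']
);
console.log(included);
```

Note that building the include list needs the full set of field names, which is why this case forces a larger field load than the query alone would.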

Finally, there are times where we solve a simpler problem rather than the problem as defined. It's easier to get a list of all fields referenced in a KQL statement instead of only getting the subset we need. A couple of extra fields is unlikely to result in performance degradation.
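To make the "all fields referenced in a KQL statement" idea concrete, here is a deliberately simplified sketch. It is an assumption-laden stand-in: referencedFields is a hypothetical function that scans the query text with a regular expression, whereas the PR works with Kibana's own KQL handling in packages/kbn-es-query. It ignores nested queries, escaping rules, and other KQL features.

```typescript
// Hypothetical, simplified scan (not Kibana's parser): collect every name that
// appears to the left of a KQL operator, without deciding whether the field's
// type actually affects the generated DSL.
function referencedFields(kql: string): string[] {
  // Blank out quoted phrases so a colon inside a value is not read as an operator.
  const withoutQuotes = kql.replace(/"(?:[^"\\]|\\.)*"/g, '""');
  // A field name followed by ':' or a range operator.
  const fieldPattern = /([A-Za-z_@][\w.@*-]*)\s*(?::|<=|>=|<|>)/g;
  const fields: string[] = [];
  let match: RegExpExecArray | null = fieldPattern.exec(withoutQuotes);
  while (match !== null) {
    // De-duplicate while preserving first-seen order.
    if (fields.indexOf(match[1]) === -1) {
      fields.push(match[1]);
    }
    match = fieldPattern.exec(withoutQuotes);
  }
  return fields;
}

const needed = referencedFields(
  'host.name: "web:01" and bytes > 1000 or response.code: 200'
);
console.log(needed);
```

The resulting list is what would be passed to a restricted field_caps request in place of a wildcard; it over-approximates the fields that are strictly required, which matches the tradeoff described above.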


Places where the field list is inspected -

packages/kbn-es-query/src/es_query/filter_matches_index.ts
packages/kbn-es-query/src/es_query/from_nested_filter.ts
packages/kbn-es-query/src/es_query/migrate_filter.ts
packages/kbn-es-query/src/kuery/functions/exists.ts
packages/kbn-es-query/src/kuery/functions/is.ts
packages/kbn-es-query/src/kuery/functions/utils/get_fields.ts

This looks like it's worth closer examination since it looks at the length of the field list - https://github.com/elastic/kibana/blob/main/packages/kbn-es-query/src/kuery/functions/is.ts#L110

Next steps -

  • Discuss above usage and make sure all cases are covered in this PR
  • Add statement to PR on lack of date formatting
  • Add test to verify reduction of fields requested

@mattkime mattkime self-assigned this May 20, 2024
@mattkime mattkime changed the title Dataview lazy alert hack mattk [search source] ES Query rule loads fewer fields on query execution May 20, 2024
@mattkime mattkime added Team:DataDiscovery Discover App Team (Document Explorer, Saved Search, Surrounding documents, Graph) Feature:Search Querying infrastructure in Kibana release_note:enhancement labels May 20, 2024
@mattkime mattkime marked this pull request as ready for review May 30, 2024 04:25
@mattkime mattkime requested review from a team as code owners May 30, 2024 04:25
@elasticmachine
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

Contributor

@dej611 dej611 left a comment

Left some comments on the code review, mostly about rewriting some lengthy code more concisely.
I've tested this PR locally and tested some dashboards: while I think this is a great approach for search source, from my measures the performance impact on a dashboard of this optimization is mostly negligible as any Lens chart using the same dataView will load the entire fields list no matter what.
From that, I have two follow-ups:

  • create a new Lens follow-up issue to optimize the flow within the embeddable context as well
  • note that @elastic/kibana-presentation controls will additionally load the default dataView with all its fields no matter what.

Comment on lines 49 to 77
  if (!useDataViewLazy) {
    if (typeof searchSourceFields.index === 'string') {
      fields.index = await indexPatterns.get(searchSourceFields.index);
    } else {
      fields.index = await indexPatterns.create(searchSourceFields.index);
    }
  } else {
    fields.index = await indexPatterns.create(searchSourceFields.index);
    if (typeof searchSourceFields.index === 'string') {
      dataViewLazy = await indexPatterns.getDataViewLazy(searchSourceFields.index);
      const dataView = new DataView({
        spec: await dataViewLazy.toSpec(),
        // field format functionality is not used within search source
        fieldFormats: {} as FieldFormatsStartCommon,
        shortDotsEnable: await searchSourceDependencies.dataViews.getShortDotsEnable(),
        metaFields: await searchSourceDependencies.dataViews.getMetaFields(),
      });

      fields.index = dataView;
    } else {
      dataViewLazy = await indexPatterns.createDataViewLazy(searchSourceFields.index);
      const dataView = new DataView({
        spec: await dataViewLazy.toSpec(),
        // field format functionality is not used within search source
        fieldFormats: {} as FieldFormatsStartCommon,
        shortDotsEnable: await searchSourceDependencies.dataViews.getShortDotsEnable(),
        metaFields: await searchSourceDependencies.dataViews.getMetaFields(),
      });
      fields.index = dataView;
    }
Contributor

I wonder if it's possible to rewrite this in a concise and clearer way.
Some idea:

const isIndexName = typeof searchSourceFields.index === 'string';
...
  if (!useDataViewLazy) {
    fields.index = isIndexName
      ? await indexPatterns.get(searchSourceFields.index)
      : await indexPatterns.create(searchSourceFields.index);
  } else {
    dataViewLazy = isIndexName
      ? await indexPatterns.getDataViewLazy(searchSourceFields.index)
      : await indexPatterns.createDataViewLazy(searchSourceFields.index);
    // prefetch async data all at once
    const [spec, shortDotsEnable, metaFields] = await Promise.all([
      dataViewLazy.toSpec(),
      searchSourceDependencies.dataViews.getShortDotsEnable(),
      searchSourceDependencies.dataViews.getMetaFields(),
    ]);
    const dataView = new DataView({
      spec,
      // field format functionality is not used within search source
      fieldFormats: {} as FieldFormatsStartCommon,
      shortDotsEnable,
      metaFields,
    });

    fields.index = dataView;
  }

Contributor Author

For some reason, using isIndexName in the conditional causes TypeScript to lose the more specific searchSourceFields.index type. Otherwise I've adopted these improvements.
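One plausible explanation for the lost narrowing (an assumption; the thread does not confirm it): since TypeScript 4.4, a typeof check stored in a const only narrows when the checked reference is a const variable, a readonly property, or a parameter that is never reassigned. searchSourceFields.index is a mutable property access, so the alias would not narrow it. The sketch below uses a hypothetical SearchSourceFieldsLike interface and hypothetical function names to show the behavior and one workaround.

```typescript
// Hypothetical stand-in for the real search source fields type.
interface SearchSourceFieldsLike {
  // Mutable property holding either a saved object id or an inline spec.
  index?: string | { title: string };
}

function describeIndex(fields: SearchSourceFieldsLike): string {
  const isIndexName = typeof fields.index === 'string';
  if (isIndexName) {
    // `fields.index` is NOT narrowed here because it is a mutable property
    // access checked through an alias, so a cast is needed to use it as a string.
    return 'id:' + (fields.index as string);
  }
  // An inline check on the same expression does narrow, so no cast is needed.
  if (typeof fields.index === 'object') {
    return 'spec:' + fields.index.title;
  }
  return 'none';
}

function describeIndexViaLocal(fields: SearchSourceFieldsLike): string {
  // Copying the property into a const makes the aliased check narrow
  // (requires TypeScript 4.4 or later).
  const index = fields.index;
  const isIndexName = typeof index === 'string';
  if (isIndexName) {
    return 'id:' + index.toLowerCase();
  }
  return index !== undefined && typeof index !== 'string' ? 'spec:' + index.title : 'none';
}

console.log(describeIndex({ index: 'abc' }), describeIndexViaLocal({ index: { title: 'logs-*' } }));
```

Copying the property into a local const keeps the concise isIndexName style from the suggestion while preserving the narrowed type.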

Contributor

@ymao1 ymao1 left a comment

response ops changes LGTM

rshen91 pushed a commit to rshen91/kibana that referenced this pull request May 30, 2024
…3939)

## Summary

Simply changing the type on `ISearchOptions` to work with DataViewLazy.
Nothing that uses `ISearchOptions` uses the field list, so these are
type-only changes.

Part of elastic#183694
@mattkime
Contributor Author

while I think this is a great approach for search source, from my measures the performance impact on a dashboard of this optimization is mostly negligible as any Lens chart using the same dataView will load the entire fields list no matter what.

@dej611 You're correct. The rollout is going to be a long process, and the initial focus is on alerts since they run frequently, increasing our COGS, but we'll eventually make our way to Lens and dashboards.

@dej611
Contributor

dej611 commented May 30, 2024

Actually, as an afterthought, it might not be convenient to optimize per chart, as that may have the side effect of multiple partial field requests. Perhaps that should be lifted up to the dashboard level, which is the only one that has the global context in view.

@kertal
Member

kertal commented May 31, 2024

Actually, as an afterthought, it might not be convenient to optimize per chart, as that may have the side effect of multiple partial field requests. Perhaps that should be lifted up to the dashboard level, which is the only one that has the global context in view.

@mattkime @dej611 I don't see much benefit from DataViewLazy on a dashboard, given we already have browser caching in place, because we should avoid issuing many different field_caps requests for several visualizations. So I think we have already applied the biggest improvement in this area.

Member

@lukasolson lukasolson left a comment

Did a bunch of different tests and it seems to be working correctly. Added a couple of comments, but after adding the unit test this can be merged!

@elastic elastic deleted a comment from kertal Jun 1, 2024
@mattkime
Contributor Author

mattkime commented Jun 2, 2024

/ci

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
data 2576 2585 +9

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
data 420.3KB 421.5KB +1.2KB
Unknown metric groups

API count

id before after diff
data 3185 3194 +9

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @mattkime

@mattkime mattkime merged commit 28bef65 into elastic:main Jun 2, 2024
18 of 19 checks passed
@kibanamachine kibanamachine added v8.15.0 backport:skip This commit does not require backporting labels Jun 2, 2024
Labels
backport:skip This commit does not require backporting ci:collect-apm Feature:Search Querying infrastructure in Kibana release_note:enhancement Team:DataDiscovery Discover App Team (Document Explorer, Saved Search, Surrounding documents, Graph) v8.15.0

10 participants