Multiple Bulk Updates Being Made After 1 Record Change #518

Open
jvanderen1 opened this issue Jan 7, 2024 · 0 comments

I am noticing PGSync indexing multiple times when a single record changes.

Example

Given the following schema.json file:

[
  {
    "database": "default",
    "index": "my_index",
    "nodes": {
      "table": "my_table",
      "columns": [
        "foo"
      ]
    }
  }
]

I then execute the following query:

UPDATE my_table
SET foo = 'hello'
WHERE id = 1;

What is happening?

Within the following snippet, 1 bulk request happens:

pgsync/pgsync/sync.py

Lines 1228 to 1231 in 1e4c3c7

# forward pass sync
self.search_client.bulk(
    self.index, self.sync(txmin=txmin, txmax=txmax)
)

Then, within this next snippet, 1 more bulk request happens:

pgsync/pgsync/sync.py

Lines 1232 to 1233 in 1e4c3c7

# now sync up to txmax to capture everything we may have missed
self.logical_slot_changes(txmin=txmin, txmax=txmax, upto_nchanges=None)
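To make the behavior concrete, here is a toy model of the two steps above. It is only a sketch and not PGSync's actual code: `CountingClient` and `ToySync` are hypothetical stand-ins, and it assumes the slot replay sees the same row change that the forward pass already indexed.

```python
class CountingClient:
    """Hypothetical stand-in for the search client; records every bulk call."""

    def __init__(self):
        self.calls = []

    def bulk(self, index, docs):
        # Materialize the documents so generators can be inspected later.
        self.calls.append((index, list(docs)))


class ToySync:
    """Toy model of the two-step pull described above (not PGSync itself)."""

    def __init__(self, client, index, pending):
        self.search_client = client
        self.index = index
        self.pending = pending  # rows changed since the last checkpoint

    def _doc(self, row):
        return {"_id": row["id"], "_source": {"foo": row["foo"]}}

    def sync(self):
        # Forward pass: yield a document for every pending row.
        for row in self.pending:
            yield self._doc(row)

    def logical_slot_changes(self):
        # Assumption: the slot replay contains the same change again.
        docs = [self._doc(row) for row in self.pending]
        if docs:
            self.search_client.bulk(self.index, docs)

    def pull(self):
        # Step 1: forward pass sync (first bulk request).
        self.search_client.bulk(self.index, self.sync())
        # Step 2: sync up to txmax (second bulk request).
        self.logical_slot_changes()


client = CountingClient()
ToySync(client, "my_index", [{"id": 1, "foo": "hello"}]).pull()
print(len(client.calls))
```

In this model a single changed row produces two bulk calls carrying the same document, which matches what I am observing.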

What do I expect to happen instead?

I would expect only 1 bulk request for a single record change. Is there a valid reason to index the same record multiple times in this case?
