MDEV-34057 Inconsistent FTS state in concurrent scenarios #3250

Open
wants to merge 1 commit into
base: 10.5
from

Conversation

Thirunarayanan
Member

@Thirunarayanan Thirunarayanan commented May 11, 2024

  • The Jira issue number for this PR is: MDEV-34057

Description

  • InnoDB FTS can be left in an inconsistent state if the server is terminated during a sync operation, before that operation has been committed. This could lead to an incorrect synced doc id and incorrect query results.
    Fix:
  • During sync commit operation, InnoDB should pass the sync transaction to update the max doc id
    in the config table.
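
A minimal sketch of why the fix helps, using a toy in-memory model and not InnoDB code. The `Store` and `Txn` types, the `run_sync` function, and the `synced_doc_id` / `indexed_up_to` keys are all hypothetical names chosen for illustration. The sketch assumes that the problematic path commits the config update in its own transaction before the sync transaction commits.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy model, not InnoDB code: durable state is a key/value map,
// and a transaction buffers its writes until commit() is called.
using Store = std::map<std::string, std::uint64_t>;

class Txn {
public:
    explicit Txn(Store& durable) : durable_(durable) {}

    void put(const std::string& key, std::uint64_t value) {
        pending_[key] = value;
    }

    // Make all buffered writes durable.
    void commit() {
        for (const auto& kv : pending_) {
            durable_[kv.first] = kv.second;
        }
        pending_.clear();
    }
    // Destroying an uncommitted Txn discards its buffered writes,
    // which models the server being killed before commit.

private:
    Store& durable_;
    Store pending_;
};

// Models one sync that writes index data up to max_doc_id.
// same_trx == false: the synced doc id is committed in a separate
//                    transaction (assumed problematic ordering).
// same_trx == true:  the synced doc id is written by the sync transaction.
// crash_before_commit == true: the sync transaction is never committed.
void run_sync(Store& durable, std::uint64_t max_doc_id,
              bool same_trx, bool crash_before_commit) {
    Txn sync_trx(durable);
    sync_trx.put("indexed_up_to", max_doc_id);
    if (same_trx) {
        sync_trx.put("synced_doc_id", max_doc_id);
    } else {
        Txn config_trx(durable);
        config_trx.put("synced_doc_id", max_doc_id);
        config_trx.commit();
    }
    if (!crash_before_commit) {
        sync_trx.commit();
    }
}

void demo() {
    // Separate transaction plus a crash: the doc id advanced, the data did not.
    Store separate;
    run_sync(separate, 10, false, true);
    assert(separate.count("synced_doc_id") == 1);
    assert(separate.count("indexed_up_to") == 0);

    // Same transaction plus a crash: nothing became durable, state is consistent.
    Store shared;
    run_sync(shared, 10, true, true);
    assert(shared.empty());
}
```

In this model, writing the doc id through the sync transaction makes both updates succeed or fail together, which is the property the fix is after.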

How can this PR be tested?

./mtr innodb_fts.fts_sync_commit_resiliency

Basing the PR against the correct MariaDB version

  • This is a new feature and the PR is based against the latest MariaDB development branch.
  • This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@Thirunarayanan Thirunarayanan requested a review from dr-m May 11, 2024 16:33

@Thirunarayanan Thirunarayanan force-pushed the 10.4-MDEV-34057 branch 2 times, most recently from 05b6468 to 1c20084 Compare May 15, 2024 04:38
Contributor

@dr-m dr-m left a comment

This needs to be rebased on 10.5, because 10.4 is EOL. Is the patch for 10.6 identical?

@Thirunarayanan Thirunarayanan changed the base branch from 10.4 to 10.5 May 31, 2024 08:29
@Thirunarayanan Thirunarayanan force-pushed the 10.4-MDEV-34057 branch 2 times, most recently from 60e22f3 to 166f64b Compare May 31, 2024 09:07
Problem:
=======
- This commit is a merge of mysql commit 129ee47ef994652081a11ee9040c0488e5275b14.
InnoDB FTS can be in inconsistent state when sync operation
terminates the server before committing the operation. This
could lead to incorrect synced doc id and incorrect query results.

Solution:
========
- During sync commit operation, InnoDB should pass
the sync transaction to update the max doc id
in the config table.

fts_read_synced_doc_id() : This function is used
to read only synced doc id from the config table.
Contributor

@dr-m dr-m left a comment

Thank you, this is OK after addressing my comments.

fts_table.table = table;
fts_cache_t* cache= table->fts->cache;
dberr_t error = DB_SUCCESS;
const trx_t* const created_trx = trx;
Contributor

The name created_trx is somewhat misleading; it could be interpreted as "created in this function", but also as "created before calling this function". A less ambiguous name would be caller_trx.

Comment on lines +6385 to +6398
trx_t *trx = trx_create();
if (srv_read_only_mode) {
	trx_start_internal_read_only(trx);
} else {
	trx_start_internal(trx);
}
if (fts_read_synced_doc_id(table, &start_doc, trx)) {
	fts_sql_rollback(trx);
	trx->free();
	goto func_exit;
}
fts_sql_commit(trx);
if (start_doc) start_doc-= 1;
trx->free();
Contributor

Can we simplify this as follows?

trx_t* trx = trx_create();
trx_start_internal_read_only(trx);
dberr_t err = fts_read_synced_doc_id(table, &start_doc, trx);
fts_sql_commit(trx);
trx->free();
if (err != DB_SUCCESS) {
	goto func_exit;
}
if (start_doc) {
	start_doc--;
}

Comment on lines +2683 to +2687
if (srv_read_only_mode) {
	trx_start_internal_read_only(trx);
} else {
	trx_start_internal(trx);
}
Contributor

If we are creating a transaction here, it would always appear to be a read-only operation, hence we should not call trx_start_internal(trx).

Comment on lines +3 to +5
source include/have_innodb.inc;
source include/have_debug.inc;
source include/not_valgrind.inc;
Contributor

Also add have_debug_sync.inc and not_embedded.inc. not_valgrind.inc should not be strictly needed, because we are not killing the server with DBUG_SUICIDE but with an external signal while the process is blocked in DEBUG_SYNC.
