Fix value fetch and convert issue in binlog insert path#236
Fix value fetch and convert issue in binlog insert path#236shlomi-noach merged 10 commits intogithub:masterfrom
Conversation
This catches a bug in `sql.BuildDMLInsertQuery` where we are using the target column's Charset to drive the value conversion. In the case where we are altering the charset, the Charset used for conversion will be different than the original column's charset, resulting in an erroneous conversion.
This catches a bug in `sql.BuildDMLInsertQuery` where we we are fetching the insert values using the renamed column's name, and end up fetching the value of the wrong column. The test in `localtests/rename` did not catch this because binlog update events were "correcting" the error, as they follow a different code path that does not contain the bug.
When processing binlog insert statements, we want to use `sharedColumns` to decide which values to fetch and convert from the insert DML event. We only want to `mappedShareColumns` to define the column names in the `replace into ...` statement.
cd91b84 to
2f80c9d
Compare
|
Thank you for working on this! I believe your solution will not solve migrating a |
|
I am still trying to learn the details behind mysql's charset conversion semantics, but it seems to me that mysql handles all the conversion of charsets when sending to/receiving from the client. So since our connection charset is So the only place where we have to do explicit decoding is when parsing binlog events, since mysql won't be helping us there. I still have to better confirm the above, however, I did the following test: in therefore converting the columns I am still digging deeper to get a better explanation for this behavior, but if my understanding is correct we only need to worry about decoding the source value. |
|
I've added more target charsets to the test, so we have:
|
|
That's interesting! I'll check it out tomorrow. BTW, let's try a recent GH feature, if you don't mind. Since you are PRing onto |
|
I don't mind at all. In fact, it was already enabled. :) Feel free to push to this branch. |
|
No, there's no badge that I can see. |
|
Cool. :) Happy to contribute. This is a great tool. |
|
There's a mystery plot involved here. |
|
Added tests for
|
| begin | ||
| insert into gh_ost_test values (null, 11, 23); | ||
| insert into gh_ost_test values (null, 13, 23); | ||
| insert into gh_ost_test values (null, rand(), rand()); |
There was a problem hiding this comment.
rand() is a value [0..1), and you assign it onto an integer. It will always be 0 and the test is not reliable. Fixing.
|
As much as I wrap my head around it this just seems fantastic and correct. I've added a bunch of tests to validate this with, and all goes well! It also applies to |
|
Next thing is the added |
|
Added fix to I'm sending this for production testing. |
|
Good catch. I had missed the issue in |
|
Production testing is good. Merging. |
This fixes #235. In the process I found a bug triggered when inserting to a renamed column via the binlog.
The source of the issue is that we were using
mappedSharedColumnsto drive the fetching and conversion of the insert values. For this we want to usesharedColumnssince these columns are based on the original table, which is where the insert values are coming from.We continue to use
mappedSharedColumnsto generate the column names in thereplacestatement.I realize we didn't discuss the issue much in #235, but I was using lots of code examples, I figured it would be easier to continue in the PR if needed.
gofmt(please avoidgoimports)./build.sh./test.sh