Wednesday, September 25, 2019

Upgrading from MySQL 5.7 to MySQL 8.0 Part II

Reserved Words and UTF8MB4 Issues

The upgrade checker utility in the new MySQL Sell will warn you about potential problems.  The two most common problems I have seen are finding out one of your column names is a reserved word and character sets.

Reserved Words

You will spot reserved word issues on the second step of the upgrade check:
2) Usage of db objects with names conflicting with reserved keywords in 8.0
  Warning: The following objects have names that conflict with reserved
    keywords that are new to 8.0. Ensure queries sent by your applications use
    `quotes` when referring to them or they will result in errors.
  More information:
    https://dev.mysql.com/doc/refman/en/keywords.html

  davetest.reserved_word.over - Column name

In the above we are being told that in the schema 'davetest' in the table 'reserved_word' that there is a column named 'over' that conflicts with the newly added reserved word OVER.

There are two possible fixes.  One is to quote with back ticks (` not ') the column name for the rest of the lifetime of that table. You can use the back ticked item but have to remember to ALWAYS back tick the column name. Two is to rename 'over' to something else that is not a reserved word. 

UTF8 May Not Equal UTF8MB4

One of the big changes in MySQL 8.0 is the support for UTF8MB4 which includes support for version 9 of the Unicode character set. Before 8.0 UTF8 really meant UTF8MB3.  3 does not have full support for the forth plane which means no Chinese, Japanese, or Korean language support or support for emojis.

What We Used To Do


The past several years had most folks creating databases & schemas with UTF8 like the following:

CREATE DATABASE mydb  CHARACTER SET utf8  COLLATE utf8_general_ci;

This provides better support for CJK than say Latin1 but still falls short.  And if you run the util.checkForServerUpgrade(), the third step will tell you the following:

util.checkForServerUpgrade('root@localhost:3306') output for #3
And it tell you which schemas and tables should be updated to UTF8MB4.
The schema and table that need to be updated to UTF8MB4

The Fix

The good news is that is relatively easy, if not a little tedious, to convert.  First start with the database/schema.

ALTER SCHEMA davetest CHARACTER SET UTFMB4;

And for each table you will have to change the default and then change each column that uses a character set collation:

ALTER TABLE mytest DEFAULT CHARACTER SET UTF8MBF,
MODIFY COLUMN name CHARACTER SET utf8mb4 COLLATE utf8mb4_900_ai_ci;

Yes, this conversion is a little painful and tedious. But it does let you have access to all the Unicode characters.