Data migration from Oracle to MariaDB with Docker and Connect SE: A Step-by-Step Guide

In this blog we are going to learn how to migrate data from Oracle to MariaDB.

To begin, we’ll learn the basics about Oracle database to have an understanding about the steps that are done on the demo example. After, we will create a table in Oracle and migrate it to MariaDB.

To migrate data from Oracle there are 2 ways:

  1. Dump Oracle data to CSV and load data in MariaDB.
  2. Use the Connect Storage Engine to create or insert into a table from Oracle’s source definition.

For demonstration, we are going to use a docker container with an Oracle Express Edition (XE) image.

To make it simple, on the same container we will start a MariaDB instance and migrate data to it.

I had a hard time to make it work, mostly because of not knowing the basics of Oracle and Oracle images (will explain below). I will try to describe briefly what a reader needs to know about Oracle to understand commands, as well what may happen in future MariaDB development.

Oracle terminology

To have a clear picture what changes are going to be done, let’s explain basic terminology we are going to use in later examples.

SID

In Oracle, SID stands for system identifier, and it is a unique name that identifies a specific Oracle database instance on a particular server or system. Each Oracle database instance has a unique SID assigned to it which distinguishes it from other instances on the same system. One can check the specific database instance with the ORACLE_SID environment variable. SID is only used if you want an alternative way to connect to a container database, or what is called catalog in MariaDB. In our example it will be named by the XE.

Listener

Listener listens for incoming client connections to the Oracle database. The listener is responsible for accepting client connections and routing them to the appropriate Oracle database instance. By default, it listens on port `1521`. Changing the port and adding new services and networks addresses which the listener should listen for can be added in the listener.ora file. There is a lsnrctl utility.

Services

A service name is a unique identifier for a specific service that is used to establish a connection between the client and the database. Each service name is associated with a connect descriptor located in the tnsnames.ora file, used by the client to resolve the network address of the listener that hosts the service. Testing the service name can be done using the tnsping utility.

CDB & PDB

Oracle has introduced the concept of multitenant architecture with the container database (CDB) storing zero or more plugabble databases (PDB). Each pluggable database is essentially a self-contained database that can be managed independently, but is hosted within the context of the container database. In the example below we will have as default the pluggable database XEPDB1. MariaDB currently doesn’t have such a feature, but is developing the ability to create catalogs, that will be an analogy to CDB.

Users

There are default users sys, system, pbadmin that can be used to connect with the specified with ORACLE_PWD environment variable. The pdbadmin user has privileges that allow it to manage PDBs, create new PDBs, and perform other administrative tasks related to the multitenant architecture.

Oracle source data

The playground that should be set is the docker container started on the Oracle image on some version of Oracle database.

As a first attempt, I tried to use images from oracle registry, but after days of trying I couldn’t make the ODBC connector work on my AMD platform.

After that I found a setup with Oracle’s docker-images repository that worked. So let’s start with the demo.

Start the container

Build the image (it will take time)


$ git clone https://github.com/oracle/docker-images.git
$ cd docker-images/OracleDatabase/SingleInstance/dockerfiles
# It will take some time
$ ./buildContainerImage.sh -x -v 18.4.0 -o '--build-arg SLIMMING=false'
Build completed in 390 seconds.
$ docker images|grep oracle
REPOSITORY                          TAG         IMAGE ID       CREATED          SIZE
oracle/database                     18.4.0-xe   4a141cc0a851   10 seconds ago   6.03GB

Start the container (wait until healthy) based on the Express Edition (XE).

$ docker run --name oracle18xe --rm -d -p 1521:1521 -p 5500:5500 -e ORACLE_PWD=oracle oracle/database:18.4.0-xe
$ docker ps --format "table {{.Status}}"
STATUS
Up About an hour (healthy)

Connect to the container

The Docker image comes with two service names XE and XEPDB1 that we may use to connect with the Oracle client, to CDB and PDB respectively. There is also a default listener port and network address that the client will use in order to connect to the listener. Example of tnsnames.ora file for XEPDB1.

$ cat $ORACLE_HOME/network/admin/tnsnames.ora
XEPDB1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 0.0.0.0)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = XEPDB1)
    )
  )

The listener configuration in listener.ora defines the TCP port and host and IPC connection on which the listener listens.

$ cat $ORACLE_HOME/network/admin/listener.ora
LISTENER =
           (DESCRIPTION_LIST =
             (DESCRIPTION =
               (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC_FOR_XE))
               (ADDRESS = (PROTOCOL = TCP)(HOST = 0.0.0.0)(PORT = 1521))
             )
           ) 

The lsnrctl utility can be used to check lsnrctl status or lsnrctl services. The tnsping utility can be used to check service names, like

$ docker exec -it oracle18xe tnsping XEPDB1
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 0.0.0.0)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = XEPDB1)))
OK (0 msec)

Create Oracle data

For data we are going to connect with Oracle’s sqlplus client, create the table and insert some values.

From here there are multiple ways to connect to the database using different accounts and/or services:

1. Using the sys user (super user) that needs to be connected as the sysdba or sysoper alias to the PDB or service. sysdba and sysoper are system privileges to work on the root (CDB) database.

$ docker exec -it oracle18xe sqlplus sys/oracle@XEPDB1 as sysdba
$ docker exec -it oracle18xe sqlplus sys/oracle@XEPDB1 as sysoper
$ docker exec -it oracle18xe sqlplus sys/oracle@XE as sysdba

2. The system user, a standard user that doesn’t have the super user privileges, can use the same service names

$ docker exec -it oracle18xe sqlplus system/oracle@XE

3. The pdbadmin user can only be connected only to the XEPDB1 pluggable database.

$ docker exec -it oracle18xe sqlplus pdbadmin/oracle@XEPDB1

Knowing how to connect with the client, let’s create data on the pluggable database using the standard user.

$ docker exec -it oracle18xe sqlplus system/oracle@XEPDB1
Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0

SQL> show con_name;
CON_NAME
------------------------------
XEPDB1
SQL> -- Use simple table
SQL> create table t(t number);
SQL> desc t;
 Name					   Null?    Type
 ----------------------------------------- -------- ----------------------------
 T						    NUMBER
SQL> insert all into t(t) values (1) into t(t) values (2) into t(t) values(3) select 1 from dual;
SQL> select * from t;

	 T
----------
	 1
	 2
	 3

Great! We have data that we want to migrate to MariaDB.

MariaDB setup

Install MariaDB server

One can use different containers for MariaDB and Oracle, but since the purpose is to show how migration of data works, let’s use the already created Oracle container for both databases. The principle should be the same for containers on the network.

The first step is to install MariaDB Server, but before installing we will need some dependencies, like the epel repository and editor in the Oracle container.

$ docker exec -it oracle18xe bash
$ yum update
$ yum install http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm vi

For the server let’s choose the latest 11.0 RC version MariaDB Foundation repo-config repository.

$ cat /etc/yum.repos.d/mariadb.repo 
[mariadb]
name = MariaDB
baseurl = https://ftp.bme.hu/pub/mirrors/mariadb/yum/11.0/centos/$releasever/$basearch
module_hotfixes = 1
gpgkey = https://ftp.bme.hu/pub/mirrors/mariadb/yum/RPM-GPG-KEY-MariaDB
gpgcheck = 1
# check  yum repolist
$ yum update

Ready to install the MariaDB server

$ yum install MariaDB-server
$ rpm -q -a|grep -e Maria
MariaDB-common-11.0.1-1.el7.centos.x86_64
MariaDB-client-compat-11.0.1-1.el7.centos.noarch
MariaDB-server-compat-11.0.1-1.el7.centos.noarch
MariaDB-compat-11.0.1-1.el7.centos.x86_64
MariaDB-client-11.0.1-1.el7.centos.x86_64
MariaDB-server-11.0.1-1.el7.centos.x86_64

Now from this step there are 2 options.

  1. dump data in Oracle in CSV form,
  2. use Connect SE.

For the former case, I have tried dump data in CSV form, but Oracle’s XE doesn’t support this and I haven’t tried with the Enterprise edition. I have also tried to use the expdp utility to create a dump file, but it generated a binary file and as such I don’t see how it can be used for migration. The command for this case should be straightforward. Create the Oracle script and execute it, to get a CSV file.

SQL> edit dumpOracle
# Write the content
SET MARKUP CSV ON
SET HEADING OFF
SET FEEDBACK OFF
SET COLSEP ','
SET TRIMSPOOL ON
SPOOL table_oracle.csv
SELECT t FROM t;
# execute the script
SQL> @dumpOracle
SQL>  SPOOL OFF

On the MariaDB side you should create the table and load data from file

MariaDB [test]> create table t(t int);
MariaDB [test]> LOAD DATA INFILE '/table_oracle.csv'
INTO TABLE t
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

If you have managed this step, let me know in the comments.

Let’s proceed further with second case, using Connect SE.

Migrate data to MariaDB using Connect SE

Here Connect SE and ODBC will be used.

We will need to install the Connect SE from package, install the ODBC driver manager, set the ODBC configuration, and do the migration. All steps are meant to be executed from within the Oracle container.

Install Connect SE

We will need the mariadb-connect-engine package.

$ yum install MariaDB-connect-engine
$ rpm -q -a|grep -e MariaDB-connect
MariaDB-connect-engine-11.0.1-1.el7.centos.x86_64
# Check shared library
$ ls /usr/lib64/mysql/plugin/ha_connect*
ha_connect.so

ODBC configuration

To create the ODBC connection we need the unixODBC driver manager. It is a dependency on the mariadb-connect-engine package, so it is already installed.

$ rpm -q -a|grep -e ODBC
unixODBC-2.3.1-14.0.1.el7.x86_64

unixODBC has utilities like isql (CLI SQL tool) and odbcinst (CLI for ODBC configuration) that we are going to use below.

After installing the driver manager, we need to connect to Oracle’s database, using Oracle’s ODBC driver called libsqora.so, and add its path to LD_LIBRARY_PATH.

$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib/

Setup ODBC driver

Create Oracle’s ODBC driver by writing it in the odbcinst.ini file. The default location for drivers is in /etc/odbcinst.ini, that you can check by invoking odbcinst -j.

There are already ODBC connection settings for MySQL and PostgreSQL, so we need to update it to use the OracleODBC configuration and check it by applying odbcinst CLI.

# update the configuration file (no need to install)
$ cat /etc/odbcinst.ini 
[OracleODBC]
Description    = ODBC for Oracle
Driver         = /opt/oracle/product/18c/dbhomeXE/lib/libsqora.so.18.1

# query for the driver
$ odbcinst -q -d
[PostgreSQL]
[MySQL]
[OracleODBC]

Setup ODBC data source name (DSN)

We need to add configuration to access the Oracle database using the driver created in the above step by setting the configuration in ~/.odbc.ini just like we have connected to Oracle. The file is the default file for the user data source, but since we are using the root user in the container, one could use the system data source /etc/odbc.ini as well.

$ cat ~/.odbc.ini 
[oracle]
Driver = OracleODBC
DSN = Oracle ODBC connection
ServerName = XEPDB1
UserID = system
Password = oracle

Validate your DSN with odbcinst:

odbcinst -q -s
[oracle]

Check DSN connection

Check the DSN named oracle, created above using the isql utility

 isql -v oracle
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> select * from t;
+-----------------------------------------+
| T                                       |
+-----------------------------------------+
| 1                                       |
| 2                                       |
| 3                                       |
+-----------------------------------------+
SQLRowCount returns -1
3 rows fetched

If this step works, you have one step left to migrate to MariaDB.

Migrate

Start MariaDB Server and load the Connect SE plugin

$ mariadbd --user=root --plugin-load-add=ha_connect.so &
2023-04-27 12:31:03 0 [Note] Starting MariaDB 11.0.1-MariaDB source revision 158a58245813b1959d6ee912d77734620c7cf3ba as process 1061
2023-04-27 12:31:03 0 [Note] CONNECT: Version 1.07.0002 March 22, 2021
Version: '11.0.1-MariaDB'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server

Connect with the mariadb client and create the table with Connect SE that will do autodiscovery of columns, create them and load data.

$ mariadb -uroot test
# Migration is done here
MariaDB [test]> create table table_maria engine=connect table_type=ODBC tabname='t' Connection='DSN=oracle' SRCDEF='select * from t';
Query OK, 0 rows affected (0.059 sec)

# Check data
MariaDB [test]> select * from table_maria;
+------+
| T    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.052 sec)

MariaDB [test]> show create table table_maria \G
*************************** 1. row ***************************
       Table: table_maria
Create Table: CREATE TABLE `table_maria` (
  `T` double(40,0) DEFAULT NULL
) ENGINE=CONNECT DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci CONNECTION='DSN=oracle' `TABLE_TYPE`='ODBC' `TABNAME`='t' `SRCDEF`='select * from t'
1 row in set (0.000 sec)

That’s it! When ODBC configuration is established once, it is pretty effortless to migrate data from Oracle to MariaDB.

Read more

Feedback Welcome

If you come across any problems in this blog, with the design, or edge cases that don’t work as expected, please let us know. You are welcome to chat about it on Zulip. As always you can use our JIRA bug/feature request in the MDEV project for any bug/feature request you may encounter.