Percona XtraDB Cluster Scheduler Handler

Working on a real-world scenario in a five-node Percona XtraDB Cluster (PXC), we were forced to use wsrep_sync_wait = 1 because the app does reads-after-writes and we send reads to all the nodes. We had the idea to leave some nodes in DESYNC mode to reduce the flow control messages during peak load, expecting a steadier write throughput while keeping read consistency.

We decided to test Percona's new PXC Scheduler Handler, an application that manages the integration between ProxySQL and Galera/PXC (its scope is to maintain the ProxySQL mysql_servers table when a negative scenario occurs, such as failures, service degradation, or maintenance). However, we realized that when a node is in DESYNC mode, it is kicked out of the read hostgroup. That is why we asked Marco Tusa to implement a new feature that removes a node from the read hostgroup only if its wsrep_local_recv_queue is higher than max_replication_lag.
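For reference, max_replication_lag is a per-server column in ProxySQL's mysql_servers table. A minimal sketch of setting it for the readers, assuming the default admin credentials on port 6032 and a reader hostgroup id of 101 (both assumptions, adjust to your setup):

    # Set the threshold the scheduler compares against wsrep_local_recv_queue
    # (hostgroup 101 and admin/admin credentials are assumptions).
    mysql -h127.0.0.1 -P6032 -uadmin -padmin -e "
      UPDATE mysql_servers SET max_replication_lag=16 WHERE hostgroup_id=101;
      LOAD MYSQL SERVERS TO RUNTIME;
      SAVE MYSQL SERVERS TO DISK;"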

Environment

  • 5 PXC nodes
  • 1 ProxySQL server

In db01, we run sysbench to simulate write traffic:
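The exact invocation is not reproduced here; a minimal sketch using the standard oltp_write_only Lua script (credentials, table counts, and durations are assumptions):

    # Sketch: write-only workload against db01
    # (assumes the sbtest tables were prepared beforehand with "... prepare").
    sysbench oltp_write_only \
      --mysql-host=db01 --mysql-user=sysbench --mysql-password=sysbench \
      --mysql-db=sbtest --tables=10 --table-size=1000000 \
      --threads=16 --time=3600 --report-interval=1 run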

In ProxySQL we run sysbench to simulate read-only traffic:
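Again as a sketch, the same tool with the read-only script, pointed at ProxySQL's traffic port (6033 is the ProxySQL default; the rest of the parameters are assumptions):

    # Sketch: read-only workload routed through ProxySQL.
    sysbench oltp_read_only \
      --mysql-host=proxysql --mysql-port=6033 \
      --mysql-user=sysbench --mysql-password=sysbench \
      --mysql-db=sbtest --tables=10 --table-size=1000000 \
      --threads=32 --time=3600 --report-interval=1 run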

DESYNC Test

The goal of this test is to see the differences between wsrep_desync ON/OFF and wsrep_sync_wait = 1.
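As a side note, wsrep_desync is a dynamic variable, so a node can be toggled without a restart. A minimal sketch (db04 is just an example node):

    # Put a node in DESYNC mode (it stops participating in flow control)...
    mysql -h db04 -e "SET GLOBAL wsrep_desync=ON;"
    # ...and bring it back in sync.
    mysql -h db04 -e "SET GLOBAL wsrep_desync=OFF;"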

The next graph shows both scenarios side by side: the left half of each panel is with wsrep_desync ON, and the right half is with it OFF.

DESYNC Test 

As you can see, there are dips in the read traffic when DESYNC is OFF. They occur during the same periods in which the flow control messages are sent:

This is expected and is not new. The number of queries executed was:

Basically, you can execute about 8% more queries if flow control is not triggered.

Consistency Test

Now, we are going to simulate a scenario where the cluster receives CPU-intensive queries. We are going to execute the following on db03 (or any other node), as sketched below:
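The original command is not reproduced here; one way to generate this load that matches the description below is mysqlslap with an illustrative GROUP BY query (the query text and credentials are assumptions):

    # Sketch: 40 concurrent clients, 250 iterations of a CPU-intensive GROUP BY.
    mysqlslap --host=db03 --user=sysbench --password=sysbench \
      --create-schema=sbtest --no-drop \
      --concurrency=40 --iterations=250 \
      --query="SELECT c, COUNT(*) FROM sbtest1 GROUP BY c"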

This starts 40 threads that execute the same GROUP BY query 250 times, which is enough for our testing.

And we are going to add a timestamp column to sbtest1 to monitor the lag:
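A hedged example of such a column (the name ts is an assumption):

    # A timestamp refreshed on every write, so the difference between NOW()
    # on a reader and MAX(ts) approximates how far behind the reader is.
    mysql -h db01 -e "ALTER TABLE sbtest.sbtest1
      ADD COLUMN ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;"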

Now we have these four scenarios, which we are going to monitor with the query below:
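A sketch of the monitoring loop, comparing the reader's clock against the last replicated timestamp (it assumes the ts column added above):

    # Poll a reader node once per second and print the replication delay.
    while true; do
      mysql -h db03 -Nse "SELECT NOW(), MAX(ts), TIMEDIFF(NOW(), MAX(ts)) FROM sbtest.sbtest1;"
      sleep 1
    done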

And the status variable: wsrep_local_recv_queue.
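That variable can be polled on each node, for example:

    # Length of the apply (receive) queue on this node;
    # a growing value means the node is falling behind.
    mysql -h db03 -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';"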

This will be the header of the tables:

– With wsrep_desync=OFF and wsrep_sync_wait=1

– With wsrep_desync=OFF and wsrep_sync_wait=0

– With wsrep_desync=ON and wsrep_sync_wait=1

– With wsrep_desync=ON and wsrep_sync_wait=0

With wsrep_desync=OFF, the behavior is similar in both cases, which means that the node first needs to be in sync and only then checks wsrep_sync_wait.

With wsrep_desync=ON and wsrep_sync_wait=1, we can see that the query is delayed because the node needs to apply the transactions in the apply queue first. That is not the case with wsrep_sync_wait=0, where the data is far behind the writer node and the query is answered immediately.

The two cases that matter are those with wsrep_sync_wait=1, and both are read-consistent even if they show different timediff values, as the query time measures the flow control lag when wsrep_desync=OFF and the apply queue time when wsrep_desync=ON.

Cluster Behavior During Load Increases

It is time to merge both tests and simulate ProxySQL spreading the CPU-intensive queries over the cluster. We are going to execute the same script through ProxySQL:
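As a sketch, this is the same illustrative mysqlslap command from before, now pointed at ProxySQL instead of a single node:

    # Same CPU-intensive workload, routed through ProxySQL (port 6033).
    mysqlslap --host=proxysql --port=6033 --user=sysbench --password=sysbench \
      --create-schema=sbtest --no-drop \
      --concurrency=40 --iterations=250 \
      --query="SELECT c, COUNT(*) FROM sbtest1 GROUP BY c"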

In the next graph, we can see how the active connections on the different nodes went up and down as the status of each node changed:

MySQL Node Change

And when the script finished, traffic went back to normal.

In ProxySQL, you will see how the status of the servers changes from ONLINE to OFFLINE_SOFT and back to ONLINE because of the PXC Scheduler Handler's intervention, like this:
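One way to watch these transitions from the ProxySQL admin interface (runtime_mysql_servers is ProxySQL's runtime view; the admin credentials are assumptions):

    # Watch server status transitions once per second.
    watch -n1 'mysql -h127.0.0.1 -P6032 -uadmin -padmin -Nse \
      "SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers ORDER BY hostgroup_id, hostname;"'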

This can also be reviewed in PMM:

Percona Monitoring and Management

This means that PXC Scheduler Handler is helping us to spread the load across DESYNC nodes, improving the response time for the read-only traffic.

Architecture

With traditional replication we had an abstract diagram like this:

traditional replication

We didn’t have any production-ready option to guarantee Read Consistency on the Read Replicas.

With Galera/PXC we don’t need the Replication Manager as it will be replaced with PXC Scheduler Handler: 

PXC Scheduler Handler

We have the same number of nodes with read consistency (thanks to wsrep_sync_wait) and a synced secondary writer node. What is missing, but not difficult to add, is a tool that monitors the WriterNodes, as in a failure scenario we might want to keep no fewer than two synced nodes.

Conclusion

I think that Marco Tusa did a great job with the PXC Scheduler Handler, which allowed us to implement this new architecture that might help people who need to scale reads, need consistent write throughput, need consistent reads, and don't want reads to affect the flow of the replication process.

If you’re interested in learning more about Percona XtraDB Cluster and ProxySQL, be sure to check out Percona’s Training Services. We offer an advanced two-day hands-on tutorial for XtraDB Cluster, in addition to our one-day ProxySQL intensive. Contact us today to learn more!
