A beginner’s guide to YugabyteDB

Introduction

In this article, we are going to see what YugabyteDB is, how to install it and manage it using PostgreSQL tools, and how you can connect to it using JDBC, JPA, or Hibernate.

I became curious about Yugabyte when Franck Pachot joined the company as a Developer Advocate. Having followed Franck for a long time and learned a lot about SQL and database systems from him, I decided to investigate the PostgreSQL-compatible database they are developing.

What is YugabyteDB

YugabyteDB is an open-source distributed SQL database that combines the benefits of using a relational database (e.g., ACID transactions) with the advantages of globally-distributed auto-sharded stores (e.g., NoSQL document databases).

First of all, it’s an open-source database, and you can find it on GitHub. Only the cloud management part is proprietary, but the engine itself is community-driven.

Second, YugabyteDB builds on top of PostgreSQL, so tools that work with PostgreSQL generally work with Yugabyte as well. Not only will you be able to use PgAdmin to connect to Yugabyte, but you can also use any software framework or library that works with the PostgreSQL drivers. As you will see in this article, it’s extremely easy to make an existing PostgreSQL application work on Yugabyte.

Third, YugabyteDB is versatile when it comes to data and traffic volumes. Because it provides auto-scaling, auto-sharding, and auto-balancing, you won’t have to rearchitect your system the moment it becomes too successful for the initial architecture to cope with.

How to install YugabyteDB

Depending on your application needs, there are multiple ways to install Yugabyte.

However, in this article, I’m going to show you how to run Yugabyte in a Docker container.

The first step is to pull the Docker image:

docker pull yugabytedb/yugabyte:2.15.1.0-b175

2.15.1.0-b175: Pulling from yugabytedb/yugabyte
2d473b07cdd5: Pull complete
5954b7a9c5ea: Pull complete
5b00001786bb: Pull complete
c43e6bd8eb6c: Pull complete
99ad07cc1c7c: Pull complete
b9331fac7e42: Pull complete
a7e3630fe335: Pull complete
05b42b4417c9: Pull complete
d97501a5f6ad: Pull complete
06158813861c: Pull complete
736eaefc97b2: Pull complete
c45ea0648626: Pull complete
2843bee931d8: Pull complete
808b5e86368d: Pull complete

Digest: sha256:b340163bdd55bf6b3653224460eb93f71782b331804d2f9655194e2b135ba72f
Status: Downloaded newer image for yugabytedb/yugabyte:2.15.1.0-b175
docker.io/yugabytedb/yugabyte:2.15.1.0-b175

Afterward, we can create a new container using the following docker run command:

docker run -d --name yugabyte -p7000:7000 -p9000:9000 -p5433:5433 -p9042:9042 yugabytedb/yugabyte:2.15.1.0-b175 bin/yugabyted start --daemon=false --ui=false

If you are running macOS Monterey, you have to replace -p7000:7000 with -p7001:7000.

This is necessary because, by default, AirPlay listens on port 7000. This conflicts with YugabyteDB and causes yugabyted start to fail unless you forward the port as shown. Alternatively, you can disable AirPlay receiving, then start YugabyteDB normally, and then, optionally, re-enable AirPlay receiving.
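
If you are not sure whether port 7000 is already taken on your machine, you can check before running the container. The following is a minimal sketch that uses only the JDK. The class and method names (PortCheckSketch, isPortFree, chooseHostPort) are made up for this illustration, and the check only tells you whether a local server socket can be bound at that moment, which is an approximation of what Docker needs when publishing the port.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of YugabyteDB or Docker.
final class PortCheckSketch {

    /** Returns true if a server socket can currently be bound to the given local port. */
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Binding failed, typically because another process listens on the port.
            return false;
        }
    }

    /** Returns the preferred host port if it is free, otherwise the fallback. */
    static int chooseHostPort(int preferred, int fallback) {
        return isPortFree(preferred) ? preferred : fallback;
    }

    public static void main(String[] args) {
        // 7000 is the port used in the article; 7001 is the suggested alternative.
        int hostPort = chooseHostPort(7000, 7001);
        System.out.println("Use -p" + hostPort + ":7000 in the docker run command");
    }
}
```

The container port stays 7000 in both cases. Only the host side of the mapping changes.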

Notice that the newly created container is called yugabyte, and we can see it listed by the docker ps -a command:

docker ps -a

CONTAINER ID   IMAGE                                                COMMAND                  CREATED          STATUS                       PORTS                                                                                                                                                                     NAMES
88feaa0a2942   yugabytedb/yugabyte:2.15.1.0-b175                    "/sbin/tini -- bin/y…"   27 seconds ago   Up 24 seconds                0.0.0.0:5433->5433/tcp, 6379/tcp, 7100/tcp, 0.0.0.0:7000->7000/tcp, 0.0.0.0:9000->9000/tcp, 7200/tcp, 9100/tcp, 10100/tcp, 11000/tcp, 0.0.0.0:9042->9042/tcp, 12000/tcp   yugabyte

With the container in place, the next time we boot our system, we can start the Yugabyte database using the docker start command:

docker start yugabyte

That’s it!

How to connect to Yugabyte

Once the Yugabyte database server is started, you can connect to it using any PostgreSQL-compatible tool. For instance, I can use the PgAdmin UI tool to connect to both my local PostgreSQL server and the YugabyteDB server running on Docker:

Connecting to YugabyteDB using PgAdmin

From your favorite programming language, you can connect to Yugabyte just like you’d do for PostgreSQL. For instance, if you’re using Java, you can use the PGSimpleDataSource from the PostgreSQL JDBC Driver, as illustrated by the following example:

PGSimpleDataSource dataSource = new PGSimpleDataSource();
dataSource.setURL(
    "jdbc:postgresql://127.0.0.1:5433/high_performance_java_persistence"
);
dataSource.setUser("yugabyte");
dataSource.setPassword("admin");
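
To check that the connection actually works, you could run a trivial query. The following is a minimal sketch that uses the plain java.sql API instead of PGSimpleDataSource. It assumes the PostgreSQL JDBC Driver is on the classpath, the Docker container is running, and the high_performance_java_persistence database already exists. The class name YugabyteConnectionSketch and its methods are made up for this illustration, while the URL and credentials are the ones from the example above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class for illustration only.
final class YugabyteConnectionSketch {

    /** Builds a PostgreSQL-style JDBC URL for the given host, port, and database. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    /** Opens a connection, runs SELECT version(), and returns the reported version string. */
    static String serverVersion(String url, String user, String password)
            throws SQLException {
        try (Connection connection = DriverManager.getConnection(url, user, password);
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery("SELECT version()")) {
            resultSet.next();
            return resultSet.getString(1);
        }
    }

    public static void main(String[] args) {
        // 5433 is the YSQL port published by the docker run command.
        String url = jdbcUrl("127.0.0.1", 5433, "high_performance_java_persistence");
        try {
            System.out.println(serverVersion(url, "yugabyte", "admin"));
        } catch (SQLException e) {
            // Reached when the driver is missing or the server is not available.
            System.err.println("Could not connect: " + e.getMessage());
        }
    }
}
```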

Awesome, right?

Running the High-Performance repository on Yugabyte

For me, the best way to test a database system that has a JDBC Driver and a Hibernate Dialect is to use the High-Performance Java Persistence GitHub repository since it provides a massive collection of integration tests that can verify tons of JPA, Hibernate, JDBC, and database features.

find . -name '*Test.java' | wc -l
709

With 709 integration test classes available, I have a lot of ways I could test a given relational database, so I’m going to integrate Yugabyte into my High-Performance Java Persistence GitHub repository and test how it works using the existing PostgreSQL-compatible integration tests.

As illustrated by this commit, adding support for Yugabyte was just a matter of creating a new YugabyteDBDataSourceProvider.

I didn’t even have to add the Yugabyte-specific JDBC Driver since I’m using a single Docker database server instance. YugabyteDB provides its own JDBC Driver, which is needed if you want to benefit from auto-balancing or enable other cool features they offer.

Testing time

Assuming we have the following JPA entity:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Column(name = "created_on")
    private LocalDateTime createdOn;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }

    public LocalDateTime getCreatedOn() {
        return createdOn;
    }

    public Post setCreatedOn(LocalDateTime createdOn) {
        this.createdOn = createdOn;
        return this;
    }
}

When persisting three Post entities:

entityManager.persist(
    new Post()
        .setTitle("High-Performance Java Persistence, Part 1")
        .setCreatedOn(today.minusDays(2).atStartOfDay())
);

entityManager.persist(
    new Post()
        .setTitle("High-Performance Java Persistence, Part 2")
        .setCreatedOn(today.minusDays(1).atStartOfDay())
);

entityManager.persist(
    new Post()
        .setTitle("High-Performance Java Persistence, Part 3")
        .setCreatedOn(today.atStartOfDay())
);

Hibernate executes the following INSERT statements on YugabyteDB:

INSERT INTO post (
    created_on, 
    title, 
    id
) 
VALUES (
    '2022-09-05 00:00:00.0', 
    'High-Performance Java Persistence, Part 1', 
    1
)

INSERT INTO post (
    created_on, 
    title, 
    id
) 
VALUES (
    '2022-09-06 00:00:00.0', 
    'High-Performance Java Persistence, Part 2', 
    2
)

INSERT INTO post (
    created_on, 
    title, 
    id
) 
VALUES (
    '2022-09-07 00:00:00.0', 
    'High-Performance Java Persistence, Part 3', 
    3
)

And querying works just like on any relational database system:

List<Post> posts = entityManager.createNativeQuery("""
    SELECT *
    FROM post
    WHERE
        created_on >= :startTimestamp AND
        created_on < :endTimestamp
    """, Post.class)
.setParameter("startTimestamp", today.minusDays(2))
.setParameter("endTimestamp", today)
.getResultList();

assertEquals(2, posts.size());
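
To see why the assertion expects two rows, here is a minimal in-memory sketch of the same half-open range filter, using only the JDK. It assumes that today is 2022-09-07, as implied by the INSERT statements above, and that the date parameters are compared as midnight timestamps. The class name HalfOpenRangeSketch and the between method are made up for this illustration and are not part of the repository.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory analogue of the WHERE clause; no database involved.
final class HalfOpenRangeSketch {

    /** Keeps the timestamps t for which start <= t < end. */
    static List<LocalDateTime> between(
            List<LocalDateTime> values, LocalDateTime start, LocalDateTime end) {
        return values.stream()
            .filter(t -> !t.isBefore(start) && t.isBefore(end))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed value of "today", taken from the logged INSERT statements.
        LocalDate today = LocalDate.of(2022, 9, 7);

        // The created_on values of the three persisted Post entities.
        List<LocalDateTime> createdOn = List.of(
            today.minusDays(2).atStartOfDay(),
            today.minusDays(1).atStartOfDay(),
            today.atStartOfDay()
        );

        List<LocalDateTime> matches = between(
            createdOn,
            today.minusDays(2).atStartOfDay(),
            today.atStartOfDay()
        );

        // Part 3 is excluded because the upper bound is exclusive.
        System.out.println(matches.size()); // prints 2
    }
}
```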

Conclusion

This was the first time I ever used YugabyteDB, and I’m really impressed because it allows me to reuse lots of tools I’m already familiar with.

The fact that I didn’t have to do anything special to make it work with JPA and Hibernate is great because I can easily migrate an existing Spring Boot project from PostgreSQL to YugabyteDB and benefit from its auto-scaling capabilities.

This research was funded by Yugabyte and conducted in accordance with the blog ethics policy.

While the article was written independently and reflects entirely my opinions and conclusions, the amount of work involved in making this article happen was compensated by Yugabyte.