DEV Community

Yev
Yev

Posted on

Ask dev.to - struggle with flask+sqlalchemy

Hello! Need some help from the bright minds of dev.to πŸ™ Recently I started to implement a backend for my project using flask + sqlalchemy (postgres + pgbouncer) + gunicorn. The problem is, that it performs only ~1 request per second... I started to play with the gunicorn config and set 5 gevent workers, πŸ™ˆ monkey patched psycorp2 (the driver for postgres). Didn't work. Okay, maybe sqlalchemy blocks the main thread and the app performs poorly. So I switched to 5 sync workers. Now at least it should give me 5 RPS... Not this time. After some investigation, I found out that the bottleneck was between my web app and the db. pg bouncer (47 connections) and postgres are hosted on digital ocean, so I believe the guys made the right settings for me. The only thing that is left is sqlalchemy and how it handles sessions (and my hands which wrote the code). Maybe somebody had a similar issue or can give advice where to look further? Thank you in advance, you're awesome πŸ’š

Top comments (10)

Collapse
 
rhymes profile image
rhymes

What happens if you remove pgbouncer from the equation? 1 request per second seems a configuration problem.

You're saying your DB and pg bouncer are on Digital Ocean, but where the rest of the app is?

Collapse
 
yev profile image
Yev

Hi, the rest of the app is also in digital ocean (droplet). I was yesterday playing with the gunicorn config and made a blazing 5 request per second... (with a fail rate ~20%) BTW, the testing software is locust. I could not believe that pgbouncer cannot handle more than that. Without pg bouncer the result is the same, that's super weird. I am not using raw sqlalchemy, just the flask-sqlalchemy, which closes connections and handles all the stuff under the hood.

Collapse
 
rhymes profile image
rhymes • Edited

It's pretty weird. What happens on localhost? Have you tried running both app and the DB locally and load test it?

You need a process of elimination :D

Thread Thread
 
yev profile image
Yev

Yep, tried running on localhost.
Interesting observation - gunicorn 9 sync workers - the query time to db is around ~500ms, RPS is 5.
gunicorn + gevent - query time gets from 1 sec to 100 sec and it has the same RPS result as sync. I think psycorp2 monkey patching is doing a poor job and with gevent it blocks the main thread. sigh.

Thread Thread
 
rhymes profile image
rhymes • Edited

I honestly don't understand how the app is setup, can you describe it better?

You have a flask web app which runs on production with gunicorn but why did you choose to use gevent as well? Have you tried without it?

With psycorp2 you mean psycopg2? Why is it monkey patched? Because of gevent? Which version of PostgreSQL are you using?

What happens if you just start the app without gunicorn and gevent locally and you benchmark it? 5 requests per seconds is pretty low, what is your app doing?

Sorry but it's hard to help without information

Thread Thread
 
yev profile image
Yev • Edited

Thank you rhymes for feedback! Basically, I found out that the bottleneck was the db, the query performance is 5 transactions per second. After query optimization to 10 rps I decided to cache some info in redis and update the cache in background jobs. But another problem came out (I think it's good karma πŸ˜‡). Redis benchmark shows 50k rps, but in my endpoint it makes only 50rps 😫 Soo, I'm currently profiling the sh*t of the code.

The app setup is next ->

  • postgresql 11 with 47 connections with pgbouncer
  • redis
  • flask
  • gunicorn with gevent worker (why gevent because it's good for I/O bound work and for traffic spikes)
Thread Thread
 
rhymes profile image
rhymes • Edited

Sorry if I re-iterate: have you tried without pgbouncer? Because 10 rps is ridicously low. How many RPS do you measure without it?

You shouldn't need redis either, at least not as a patch for the problem.

The easiest way you can get out of this is to isolate each component.

What happens if you remove pgbouncer AND gevent?

Thread Thread
 
yev profile image
Yev • Edited

I measure on localhost without any pgbouncer, using pgbench, so no gevent and pgbouncer.

Thread Thread
 
rhymes profile image
rhymes

Let's recap because you're giving out information in small pieces and I'm having difficulties following what's happening:

  • you have a Flask/Flask-Alchemy/PostgreSQL app
  • this app runs on production with gunicorn and gevent
  • PostgreSQL is on a digital ocean
  • using pgbench from your localhost you're getting 5 RPS

There a few aspects to consider:

  • pgbench tests PostgreSQL, not your app. It also tells you transactions per second, not requests per second.
  • I'm not sure testing a remote PostgreSQL from your local machine is useful at all. Your app if deployed in the same network as PostgreSQL is not going to behave the same way anway. You should probably test it from the server that hosts the app
  • Are you running pgbench correctly? Benchmarking can introduce methodology errors. How many connections are you opening? Can you provide the command line you run the test with?
  • What happens to said PostgreSQL's memory when benchmarked?

Finally, have you tried to actually load test the production app? Or at least a copy of it. With tools like wrk for example.

Collapse
 
jin09 profile image
Gautam Jain

Hey,
Will it be possible for you to share your code base? Delete any business critical logic that you might have written.