this post was submitted on 23 Jul 2023
239 points (99.2% liked)

Lemmy Server Performance

1 readers
1 users here now

Lemmy Server Performance

lemmy_server uses the Diesel ORM that automatically generates SQL statements. There are serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, Client/Server Code/SQL Data/server operator apps/sever operator API (performance and storage monitoring), etc.

founded 1 year ago
MODERATORS
 

Details here: https://github.com/LemmyNet/lemmy/issues/3165

This will VASTLY decrease the server load of I/O for PostgreSQL, as this mistaken code is doing writes of ~1700 rows (each known Lemmy instance in the database) on every single comment & post creation. This creates record-locking issues given it is writes, which are harsh on the system. Once this is fixed, some site operators will be able to downgrade their hardware! ;)

top 27 comments
sorted by: hot top controversial new old
[–] _Rho_@lemmy.world 37 points 1 year ago (1 children)

This is fascinating

My biggest takeaway from reading through the GitHub comments though is that it seems like no one actually knows where much of the SQL comes from? As in it's possible that the bug in question is just one manifestation of old, handwritten Postres code that may or may not be optimized (Or even logical?).

I don't mean this in a critical way, as things like this are bound to happen in an open-source, federated world. However, I would think a comprehensive audit of the Lemmy Postgres triggers, queries, etc could potentially save us all from some future headaches.

[–] xtremeownage@lemmyonline.com 35 points 1 year ago (1 children)

Holy hell. Post this to one of the programming-related communities. That is interesting.

[–] lemmyvore@feddit.nl 1 points 1 year ago (1 children)

Not that interesting when it turns out it was an ORM artifact.

[–] xtremeownage@lemmyonline.com 1 points 1 year ago

True. I have experienced quite a few of those... But, still interesting nonetheless.

[–] Shadow@lemmy.ca 29 points 1 year ago* (last edited 1 year ago) (1 children)

It's not on every comment, it's mostly triggered on deletions and edits. The problem is actually infintesimally worse and 1700 rows are updated if you delete 3 comments. If you delete more it's exponential and just straight up fails and locks your database.

I'll probably put a patch in there later tonight and then see about a PR unless someone else does.

[–] RoundSparrow@lemmy.ml 28 points 1 year ago* (last edited 1 year ago) (1 children)

It’s not on every comment,

My testing with latest code is that it is indeed on every single comment INSERT, every new comment. I have the ability to view my live data while using Lemmy: https://lemmyadmin.bulletintree.com/query/raw_site_aggregates?output=table

Every one of the 1486 rows on that table gets +1 on comment when I post a new comment on my instance.

it’s mostly triggered on deletions and edits

That is not correct. Edits do not change the count of comments column on site_aggregates - because the number isn't changing. Deletes (of a comment or post) in Lemmy are also not SQL DELETE statements, they are just a delete data column in the table. That DELETE PostgreSQL trigger only gets run when a end-user cancels their Lemmy account in their profile.

[–] Shadow@lemmy.ca 11 points 1 year ago

Ah Gotcha. That's true, but the cascading issue that causes thousands of inserts happens on a delete.

That table update you're looking at is blazing fast due to the relatively low number of sites, until you run it thousands of times on a delete.

[–] PochoHipster@lemmy.ml 24 points 1 year ago (2 children)
[–] Mechaguana@programming.dev 21 points 1 year ago

Man that is some bug, no wonder lemmy had such a rough start performance wise during the reddit migration!

[–] BeigeAgenda@lemmy.ca 15 points 1 year ago (1 children)

Get some DBA's on the job and Lemmy will be blazing fast.

[–] RoundSparrow@lemmy.ml 14 points 1 year ago (2 children)

We have had DBA's, the problem is the Rust code uses ORM and an auto JSON framework that makes tracing the code time-consuming to learn.

[–] BeigeAgenda@lemmy.ca 3 points 1 year ago

Okay so you may need to refactor here and there to get more performance.

[–] AnarchoYeasty@beehaw.org 3 points 1 year ago (1 children)

Honestly, ORMs are a waste of time. Why not use sqlx and just hand write the SQL to avoid issues like this.

[–] RoundSparrow@lemmy.ml 1 points 1 year ago

In this one case, it was hand-written SQL inside a PostgreSQL FUNCTION that the ORM knows nothing about. But there is a approach in the entire application to have live-data from PostgreSQL for every little thing.

[–] eleitl@lemmy.ml 10 points 1 year ago

Wonderful. Time to start running my own instance.

[–] sebinspace@lemmy.world 8 points 1 year ago

Big victories :D

load more comments
view more: next ›