this post was submitted on 14 Jun 2023

Hourly 502 errors (aussie.zone)

submitted 1 year ago* (last edited 1 year ago) by lodion@aussie.zone to c/meta@aussie.zone

Due to this bug, which stops the Hot listing from updating, I've scheduled Lemmy to restart automatically every hour, on the hour.

When this occurs you may briefly receive 502 errors.

[–] lodion@aussie.zone 2 points 1 year ago

I've disabled the restart, as it risks losing federated messages.

[–] Fashtas@aussie.zone 2 points 1 year ago (1 children)

I did see a mention on Git that restarting the server will break federation, since outbound changes are held in RAM.

[–] lodion@aussie.zone 3 points 1 year ago (1 children)

Hmm, good call-out. I hadn't thought about it, but I would have assumed there was a DB table with pending outbound sync tasks. Kind of makes sense for this not to be the case.

I think I'll disable the restart cronjob when I can. Better to have a frozen "Hot" timeline than federation issues.

[–] Fashtas@aussie.zone 2 points 1 year ago (1 children)

Damned if you do, damned if you don't - I don't know enough about Lemmy to say whether the RAM federation thing is true, but no one on Git has disagreed with it yet.

(Actually, checking issue #2142, it seems it is the failed outgoing federation activity syncs that remain in RAM for retry.)
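To make the concern concrete, here is a toy Python sketch of the difference between a retry queue held only in memory and one written to disk. It is not Lemmy's code (Lemmy is written in Rust) and every name in it, including `InMemoryRetryQueue`, `FileBackedRetryQueue` and `simulate_restart`, is hypothetical. It only assumes what the thread describes: failed outbound activities wait in RAM for a retry.

```python
import json
import os
from collections import deque


class InMemoryRetryQueue:
    """Failed outbound activities live only in process memory."""

    def __init__(self):
        self.pending = deque()

    def record_failure(self, activity):
        self.pending.append(activity)


class FileBackedRetryQueue:
    """Same idea, but every change is written to disk so it survives a restart."""

    def __init__(self, path):
        self.path = path
        self.pending = deque()
        if os.path.exists(path):
            with open(path) as f:
                self.pending = deque(json.load(f))

    def record_failure(self, activity):
        self.pending.append(activity)
        with open(self.path, "w") as f:
            json.dump(list(self.pending), f)


def simulate_restart(make_queue):
    """Queue one failed activity, then 'restart' by building a fresh queue object."""
    before = make_queue()
    # Made-up activity, purely for illustration.
    before.record_failure({"type": "Create", "to": "other.instance"})
    after = make_queue()  # stands in for the process starting up again
    return len(after.pending)
```

Calling `simulate_restart(InMemoryRetryQueue)` shows the pending item gone after the simulated restart, while `simulate_restart(lambda: FileBackedRetryQueue(path))` with a fresh file path shows it still there. That is the trade-off discussed here: an hourly restart discards whatever was waiting in memory.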

[–] lodion@aussie.zone 3 points 1 year ago

Yeah I've stopped the restart cronjob.

[–] lodion@aussie.zone 1 points 1 year ago

And again

[–] lodion@aussie.zone 1 points 1 year ago

And again just now.

[–] lodion@aussie.zone 1 points 1 year ago

Looks like "hot" has been stuck for a while again... won't be home to kick it for a while.

Don't worry... Lemmy hasn't suddenly gone quiet, you can still view by New :)

[–] lodion@aussie.zone 1 points 1 year ago

Restarted just before this comment.

[–] lodion@aussie.zone 1 points 1 year ago

again

[–] lodion@aussie.zone 0 points 1 year ago (1 children)

Ok, this is now scheduled to occur hourly, on the hour. Will cause a few seconds of 502s while lemmy is restarted.

[–] hikarulsi@aussie.zone 1 points 1 year ago (1 children)

What is the instance running on at the moment? It might help to have a load balancer in front of the instance.

[–] lodion@aussie.zone 2 points 1 year ago* (last edited 1 year ago) (1 children)

Details on the server setup are over here.
This issue isn't load related, it's simply a bug. Load balancing wouldn't help in this case.

[–] hikarulsi@aussie.zone 2 points 1 year ago (1 children)

thank you for the information

Let us know the error/warning message and we'll see if anyone happens to know what it means. Remember to redact any key info.

[–] lodion@aussie.zone 2 points 1 year ago (1 children)

If you get the error, it will literally be "502 error" on a plain white page. Chances are unless you've just clicked a link or hit reply/submit, you'll never see it.
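For anyone scripting against the instance, a short retry is usually enough to ride out a restart window of a few seconds. The sketch below is a generic pattern, not part of any Lemmy client. `fetch_with_retry` and its parameters are hypothetical, and `fetch` stands for whatever function performs your request and returns a `(status_code, body)` pair.

```python
import time


def fetch_with_retry(fetch, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch() until it returns a status other than 502, or attempts run out.

    fetch is any zero-argument callable returning (status_code, body).
    The last response is returned even if it is still a 502.
    """
    status, body = fetch()
    for _ in range(attempts - 1):
        if status != 502:
            break
        sleep(delay_seconds)  # wait before trying again
        status, body = fetch()
    return status, body
```

This is only safe for reads. Automatically retrying a reply or submit could post the same thing twice if the first request actually went through.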

[–] hikarulsi@aussie.zone 2 points 1 year ago (1 children)

The likely reason would be server overload, though I haven't looked into the Lemmy stack. Perhaps monitor CPU, RAM and cache usage (if any) to see if there is any correlation with the 502s.

[–] lodion@aussie.zone 2 points 1 year ago (1 children)

No... the 502 is caused by me restarting the Lemmy process to work around the known bug I linked.
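One way to tell restart-induced 502s from load-induced ones is to check when they happen. With an hourly restart on the hour, the errors should cluster within seconds of each hour boundary, whereas overload would spread them across the hour. A minimal sketch, assuming you have collected the timestamps of the 502s yourself. The function names and the 30-second window are hypothetical choices.

```python
from datetime import datetime


def seconds_from_hour(ts):
    """Distance in seconds from ts to the nearest top of the hour."""
    into_hour = ts.minute * 60 + ts.second
    return min(into_hour, 3600 - into_hour)


def fraction_near_hour(timestamps, window_seconds=30):
    """Share of timestamps that fall within window_seconds of an hour boundary."""
    if not timestamps:
        return 0.0
    near = sum(1 for ts in timestamps if seconds_from_hour(ts) <= window_seconds)
    return near / len(timestamps)
```

A result close to 1.0 points at the scheduled restart. A low value would suggest looking at something else, such as load.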

[–] hikarulsi@aussie.zone 2 points 1 year ago

Understood. I guess it is up to the devs, or a pull request, to fix it for now.

[–] cool_pebble@aussie.zone 0 points 1 year ago (1 children)

Would it be worth chucking in an hourly restart cron job/systemd timer for the time being? Not an ideal solution for sure, but maybe that's better than having to jump into the server every few hours (and users experiencing a seemingly stale feed).
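As a rough illustration of the "every hour, on the hour" timing, here is a Python sketch of such a scheduler. In practice a cron entry or systemd timer, as suggested above, would do this job. The loop below just makes the timing explicit. `RESTART_COMMAND` is an assumption, not taken from the thread: the real command depends on how Lemmy is deployed on the server.

```python
import subprocess
import time
from datetime import datetime, timedelta

# Hypothetical command; replace with whatever restarts Lemmy on your setup.
RESTART_COMMAND = ["systemctl", "restart", "lemmy"]


def seconds_until_next_hour(now):
    """Seconds from now until the next top of the hour."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return (next_hour - now).total_seconds()


def run_hourly(command=RESTART_COMMAND):
    """Sleep until each hour boundary, then run the restart command. Never returns."""
    while True:
        time.sleep(seconds_until_next_hour(datetime.now()))
        # check=False: a failed restart is not raised here, the loop just continues
        subprocess.run(command, check=False)
```

Calling `run_hourly()` starts the loop. Keep in mind the federation concern raised elsewhere in this thread: each restart may drop outbound activities that were waiting in memory for a retry.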

[–] lodion@aussie.zone 1 points 1 year ago

Yeah, on my todo list this weekend. Devs have stated there will be no further 0.17.x releases, though the fix is in 0.18... with no ETA currently.