this post was submitted on 26 Aug 2024

Meta (slrpnk.net)

Here we can discuss anything about this Lemmy instance/server itself.

Our XMPP support chat: Movim or XMPP client.

Please also refer to our Wiki

founded 2 years ago
We seem to have run into some issues with one of the NVMe drives the main Lemmy database runs on. It's still being investigated, but we might have to take down the site for a few hours later today to temporarily transfer the database to another SSD RAID.

Let's see...

Edit: Did some preparation work today to minimize the impact/downtime, but I will do the main work, which will require a hopefully short downtime, tomorrow. The NVMe drive in question is probably toast, quite literally, as it seems to have been an overheating issue. The new one will need better cooling, I guess.

Edit2: Decided to wait for some spare parts to arrive first.

Edit3: Got the first spare part today, and the other one should have arrived by mail yesterday... so probably tomorrow (edit: it has arrived). Currently a bit busy with other things, but the new parts will go into the server by the weekend at the latest.

top 4 comments
[–] Nemo 8 points 3 weeks ago

Thanks for the heads-up!

[–] KryptonNerd 3 points 3 weeks ago

Good luck!!! Thanks for all the hard work

[–] poVoq 2 points 1 week ago

Looks like the hardware replacement went mostly as expected. Only the additional cooler for the still-working SSD in the RAID pool didn't fit because a plastic part on the mainboard was in the way, so that drive will have to keep running at somewhat elevated temperatures. But I removed an old network adapter that was producing a lot of heat and improved the overall airflow in the case, so it should be OK. The new NVMe SSD came with a small cooler, and so far its temperature looks much better.

[–] poVoq 1 points 1 week ago

Just a heads-up: the replacement parts have arrived, and I will probably find some time on the weekend to install them.

This will mean a downtime of approximately 30 minutes if everything goes well.