this post was submitted on 08 Jun 2023
175 points (99.4% liked)
Lemmy
12568 readers
1 users here now
Everything about Lemmy; bugs, gripes, praises, and advocacy.
For discussion about the lemmy.ml instance, go to !meta@lemmy.ml.
founded 4 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
I was thinking this the other day. Without having read the spec, it seems like mirroring should be fairly straightforward - but then once an instance has gone down, how do the users find which mirror is promoted to the new main? Or should the mirrors be treated like backups, and just used to populate a new community on whatever instance is chosen (and then mirror from the new source)?
I'd like to see a live replication kind of thing. So if you're on !games@lemmy.ml it can merge with !games@behaw.meh and they super federate and advertise that this group exists, replicated, on four or five lemmy servers and the client tracks that every X hours and knows what the failovers are.
Solves some of the fragmentation issues and the backup/archive issues at the same time. Might even help with load balancing a bit if we have some kind of routing algo on the endpoints.
That sounds really smart. Let communities decide which instances they federate with. The mod team owns the community, not the instance admins.
Good question. Not sure what the best procedure might be here. Could be as simple promoting them in order of initial mirror deployment dates and the others become mirrors for the newly activated instance.
Triggering the activation could be a part of an instance decommissioning procedure where the operator selects the mirror to become the successor. Maybe there could be some basic system specs and network performance reporting so they could choose the optimal instance. Users would receive a message that their account is being moved to another instance and domain.
In the event of an unexpected outage, there could be a deadman switch style timeout where the fastest mirror activates automatically after the original instance is out for long enough, but also a process for the operator of the downed instance to delay the takeover by signaling, "I'm working on it." In the event of automatic takeover, since users wouldn't be able to receive messages, there would have to be some sort of global lemmy notice system so users of the downed instance know where to go, like a sticky post on the front page or maybe just a separate "notices" page.
Definitely need some kind of replication/mirroring to occur between instances for DR, I was looking at how other decentralized systems for inspiration. Something like RAID where it's tolerant of one or two drive failures could be translated to Lemmy. When you set up a new instance it allows you to opt-in to a peer network where you host backup content from several other instances and your instance is backed up to several other (non-overlapping) instances.
When you say "go down" do you mean what happens if an instance shuts down its servers for good? I think the answer to that is not a technical one. If a sever is owned by an organization (not-for-profit) and it pays it's cloud provider bills, it'll stay up forever.
If you mean what happens if there's a technical issue and the server data is lost, that's a different and solved issue. Create database backups. Easy peasy.