crashdoom

joined 1 year ago
[–] crashdoom@pawb.social 1 points 15 hours ago

Looks like I had an off-by-one error: it's actually fixed in the beta for 0.19.6. I'll look at upgrading to that to fix the issue; I'm just reviewing the changes to make sure there's nothing else major in there.

 

We've successfully upgraded Lemmy from v0.19.3 to v0.19.5 and have a bunch of changes to share!

Bug Fixes

  • Previously users were reporting that the web client kept logging them out. This was suspected to be due to a misconfiguration of the cookie storing the session information and should be fixed.

  • Some communities were slow to load. Federation changes in v0.19.4 and v0.19.5 should rectify some federation issues. Additionally, we increased the number of database connections for Lemmy and will gradually increase this if we identify any further slowdowns.

(Longer term, we're planning to upgrade the database hardware which should improve the performance for all services)

Local Only Communities

When creating or editing your communities, you'll now see a "Visibility" option that allows you to choose between "Public" and "Local Only".

The "Public" setting will continue to operate as normal and federate your community to all other instances we federate with.

"Local Only" will disable federation for your community and restrict it to being viewed by local, logged-in users only. Logged-out users may not be able to see your community until they log in.

Image Proxying

Once we've established that the database is no longer under substantial load and the performance issues previously experienced are fixed, we'll be enabling an image proxying mode which will allow us to locally cache images from remote instances rather than serving them directly.

This will allow us to improve the performance of loading images without applying additional load to other instances, and prevent remote bad actors from trying to scrape IP addresses.

Other changes

There are a handful of other miscellaneous changes that you can view in the full change logs:

[–] crashdoom@pawb.social 2 points 4 days ago

👋 This isn't just you! We're aware of the issue and believe that the latest version of Lemmy should fix it. It seems to be caused by the session cookie being set with SameSite Strict instead of Lax, which stops your browser from sending it in some cases.

Full announcement for maintenance to perform the upgrade: https://pawb.social/post/14286683
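For anyone curious about the Strict vs. Lax difference: with Strict, the browser leaves the cookie off any request that starts from another site (for example, following a link to us from elsewhere), so you can look logged out. Lax still sends it on ordinary top-level link clicks. Below is a minimal sketch using Python's standard library. The cookie name "session" and the token value are made up for illustration, and this is not Lemmy's actual code.

```python
from http.cookies import SimpleCookie


def session_cookie(token: str, samesite: str = "Lax") -> str:
    """Build a Set-Cookie header value for a hypothetical session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = token              # "session" is an assumed name
    cookie["session"]["path"] = "/"
    cookie["session"]["secure"] = True     # only send over HTTPS
    cookie["session"]["httponly"] = True   # hide from page scripts
    cookie["session"]["samesite"] = samesite
    return cookie["session"].OutputString()


# Compare the two policies side by side.
print(session_cookie("example-token", "Strict"))
print(session_cookie("example-token", "Lax"))
```

The only difference between the two header values is the SameSite attribute, which is why a small configuration slip like this can be hard to spot.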

 

We’re aware of an issue affecting our Lemmy service (pawb.social) that seems to be causing users to be randomly logged out. We believe we’ve identified the issue thanks to a bug fix in a newer version of the UI code.

We need to verify there are no other breaking changes in the upgrade process, and expect to perform maintenance this Sunday, Sept 22nd at Noon US Mountain Time (https://everytimezone.com/s/7abd4955). We’ll provide an update tomorrow to confirm the maintenance window.

Thank you to everyone who brought this to our attention and we apologize for the inconvenience.

[–] crashdoom@pawb.social 2 points 7 months ago

Looks really, really well done! :D

 

tl;dr summary: furry.engineer and pawb.fun will be down for several hours this evening (5 PM Mountain Time onward) as we migrate data from the cloud to local storage. We'll post updates via our announcements channel at https://t.me/pawbsocial.


In order to reduce costs and expand our storage pool, we'll be migrating data from our existing Cloudflare R2 buckets to local replicated network storage, and from Proxmox-based LXC containers to Kubernetes pods.

Currently, according to Mastodon, we're using about 1 TB of media storage, but according to Cloudflare, we're using nearly 6 TB. This appears to be due to Cloudflare R2's implementation of the underlying S3 protocol that Mastodon uses for cloud-based media storage, which is preventing Mastodon from properly cleaning up files that are no longer used.

As part of the move, we'll be creating / using new Docker-based images for Glitch-SOC (the fork of Mastodon we use) and hooking that up to a dedicated set of database nodes and replicated storage through Longhorn. This should allow us to seamlessly move the instances from one Kubernetes node to another for performing routine hardware and system maintenance without taking the instances offline.

We're planning to roll out the changes in several stages:

  1. Taking furry.engineer and pawb.fun down for maintenance to prevent additional media being created.

  2. Initiating a transfer from R2 to the new local replicated network storage for locally generated user content first, then remote media. (This will happen in parallel to the other stages, so some media may be unavailable until the transfer fully completes).

  3. Exporting and re-importing the databases from their LXC containers to the new dedicated database servers.

  4. Creating and deploying the new Kubernetes pods, and bringing one of the two instances back online, pointing at the new database and storage.

  5. Monitoring for any media-related issues, and bringing the second instance back online.

We'll be beginning the maintenance window at 5 PM Mountain Time (4 PM Pacific Time) and have no ETA at this time. We'll provide updates through our existing Telegram announcements channel at https://t.me/pawbsocial.

During this maintenance window, furry.engineer and pawb.fun will be unavailable until the maintenance has concluded. Our Lemmy instance at pawb.social will remain online, though you may experience longer than normal load times due to high network traffic.


Finally and most importantly, I want to thank those who have been donating through our Ko-Fi page as this has allowed us to build up a small war chest to make this transfer possible through both new hardware and the inevitable data export fees we'll face bringing content down from Cloudflare R2.

Going forward, we're looking into providing additional fediverse services (such as Pixelfed) and extending our data retention length to allow us to maintain more content for longer, but none of this would be possible if it weren't for your generous donations.

Lemmy v0.19.3 (pawb.social)
submitted 8 months ago* (last edited 8 months ago) by crashdoom@pawb.social to c/pawbsocial_announcements@pawb.social
 

We've updated to Lemmy v0.19.3!

For a full change log, see the updates below:

Major changes

Improved Post Ranking

There is a new scaled sort which takes into account the number of active users in a community, and boosts posts from less-active communities to the top. Additionally there is a new controversial sort which brings posts and comments to the top that have similar amounts of upvotes and downvotes. Lemmy’s sorts are detailed here.

Instance Blocks for Users

Users can now block instances. Similar to community blocks, it means that any posts from communities which are hosted on that instance are hidden. However the block doesn’t affect users from the blocked instance, their posts and comments can still be seen normally in other communities.

Two-Factor Auth Rework

Previously 2FA was enabled in a single step which made it easy to lock yourself out. This is now fixed by using a two-step process, where the secret is generated first, and then 2FA is enabled by entering a valid 2FA token. It also fixes the problem where 2FA can be disabled without passing any 2FA token. As part of this change, 2FA is disabled for all users. This allows users who are locked out to get into their account again.
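As a rough illustration of the two-step flow described above, here is a minimal sketch in Python: a secret is generated first, and 2FA only becomes active once a valid token for that secret is presented. It uses a standard time-based one-time password (TOTP, RFC 6238) calculation. The `TwoFactor` class and its method names are hypothetical, and this is not Lemmy's implementation.

```python
import base64
import hashlib
import hmac
import secrets
import struct
import time
from typing import Optional


def totp(secret: bytes, at: float, digits: int = 6, step: int = 30) -> str:
    """Compute a TOTP code (HMAC-SHA1) for the given Unix time."""
    counter = int(at // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


class TwoFactor:
    """Hypothetical per-user 2FA state with two-step enrollment."""

    def __init__(self) -> None:
        self.secret: Optional[bytes] = None
        self.enabled = False

    def generate_secret(self) -> str:
        """Step 1: create a secret. 2FA is NOT enabled yet."""
        self.secret = secrets.token_bytes(20)
        self.enabled = False
        return base64.b32encode(self.secret).decode()

    def enable(self, token: str, now: Optional[float] = None) -> bool:
        """Step 2: enable 2FA only if the token matches the secret."""
        if self.secret is None:
            return False
        now = time.time() if now is None else now
        expected = totp(self.secret, now)
        ok = hmac.compare_digest(token.encode(), expected.encode())
        if ok:
            self.enabled = True
        return ok
```

Because the second step needs a working token, a user whose authenticator app was never set up correctly cannot end up locked out by enabling 2FA.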

New Federation Queue

Outgoing federation actions are processed through a new persistent queue. This means that actions don’t get lost if Lemmy is restarted. It is also much more performant, with separate senders for each target instance. This avoids problems when instances are unreachable. Additionally it supports horizontal scaling across different servers. The endpoint /api/v3/federated_instances contains details about the federation state of each remote instance.

Remote Follow

Another new feature is support for remote follow. When browsing another instance where you don’t have an account, you can click the subscribe button and enter the domain of your home instance in the popup dialog. It will automatically redirect you to your home instance where it fetches the community and presents a subscribe button. Here is a video showing how it works.

Moderation

Reports are now resolved automatically when the associated post/comment is marked as deleted. This reduces the amount of work for moderators. There is a new log for image uploads which stores the uploader. For now it is used to delete all user uploads when an account is purged. Later the list can be used for other purposes and made available through the API.

[–] crashdoom@pawb.social 8 points 9 months ago (2 children)

Apologies for the confusion and delay! We've gone ahead and updated to 0.18.5!

[–] crashdoom@pawb.social 1 points 10 months ago* (last edited 10 months ago)

Side note: Approved the local https://pawb.social/u/CommunityBoost account just now.

 

Currently, we’re running the Ubiquiti Dream Machine directly as the modem via PPPoE, but there appears to be an intermittent issue with the software implementation that results in periodic downtimes of a few minutes while it reconnects.

We’re looking at switching this back out with the ISP provided router in pass through mode to negate the PPPoE connectivity drop.

We don’t expect this to take longer than 1 hour to switch over and test for reliability before bringing the services back up.

We’ll be performing this maintenance around 11 AM US Mountain Time, and will provide updates via the Telegram channel at https://t.me/pawbsocial.

 

One of the data storage systems (CEPH) encountered a critical failure when Proxmox lost connection to several of its nodes, ultimately resulting in the CEPH configuration being cleared by the Proxmox cluster consensus mechanism. No data, except ElasticSearch, was stored on CEPH.

When the connection was lost to the other nodes, a split-brain occurred (when nodes disagree on which changes are authoritative and which should be dropped). As we tried to recluster all of the nodes, a resolution occurred that resulted in the ceph.conf file being wiped and the data on CEPH being unrecoverable.
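For those unfamiliar with the term, clusters usually guard against split-brain with a quorum rule: only a group of nodes holding a strict majority of votes may keep making changes. The sketch below is a simplified illustration of that rule in Python. The function names and the one-vote-per-node assumption are ours, and it is not how Proxmox itself is implemented.

```python
from typing import Dict, List


def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition has quorum only with a strict majority of all votes."""
    return reachable_votes * 2 > total_votes


def writable_partitions(partitions: List[List[str]]) -> Dict[str, bool]:
    """Map each partition (joined node names) to whether it has quorum.

    Assumes one vote per node, which is a simplification.
    """
    total = sum(len(group) for group in partitions)
    return {",".join(group): has_quorum(len(group), total) for group in partitions}


# A 4-node cluster split evenly: neither half has a majority,
# so neither side should accept changes.
print(writable_partitions([["node1", "node2"], ["node3", "node4"]]))

# A 5-node cluster split 3/2: only the larger side keeps quorum.
print(writable_partitions([["node1", "node2", "node3"], ["node4", "node5"]]))
```

An even split leaves no side with a majority, which is one reason clusters are often built with an odd number of voting nodes.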

Thankfully, we’ve suffered no significant data loss, with the exception of having to rebuild the Mastodon ElasticSearch indexes from 6 AM this morning to present.

I’d like to profusely apologize for the inconvenience, but we felt it necessary at the time to offline all services as part of our disaster recovery plan to ensure no damage occurred to the containers while we investigated.

submitted 11 months ago* (last edited 11 months ago) by crashdoom@pawb.social to c/pawbsocial_announcements@pawb.social
 

We've migrated all of the Pawb.Social (Lemmy) media from within the container to the new local NAS.

We'll be performing the same updates for pawb.fun and furry.engineer over the course of the next week, minimizing downtime where possible.

We will need to take the services offline once the initial transfer is completed to perform a final check for any new data, and transition to the local NAS, but we'll post an announcement prior to the final maintenance for each instance.

These changes are to allow us longer and cheaper retention of remote and local media across all of our services, and to allow us to begin backing up media in case of a catastrophic failure.

 

With the upgrade to Mastodon 4.2.0, the way the search system works is changing to allow broader searches of posts made across the fediverse.

Cutting to the chase: Full-text search across opted-in posts will be possible in the next few days.


In more depth, the system will allow you to search across your own posts and posts you've interacted with from others irrespective of their opt-in status for full-text search, but will additionally allow you to search ANY posts made by users that opt-in to allow their posts to be indexed.

To that effect, a new setting was added to the Public Profile > Privacy and Reach page called "Include public posts in search results" which is disabled by default. Turning this on will allow any user across the Fediverse to search and find your posts using full-text search.

Shortcuts:

To allow our users to decide if this is a feature they want to use on their posts going forward, we won't be enabling ElasticSearch until Friday Sept. 22.


Additionally, new keywords will be added to the search system:

  • from:me
  • before:2022-11-01, after:2022-11-01, during:2022-11-01
  • language:fr
  • has:poll
  • in:library for searching only in posts you have written or interacted with

These will be enabled on Friday along with the rest of the changes to the search system.
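To show how a search using these keywords breaks down, here is a small illustrative sketch in Python that separates the keyword:value operators listed above from the plain search terms. The `split_query` helper is hypothetical and only mirrors the syntax in this post; it is not Mastodon's actual query parser.

```python
import re
from typing import List, Tuple

# Operator names taken from the keyword list in this post.
OPERATOR = re.compile(r"^(from|before|after|during|language|has|in):(\S+)$")


def split_query(query: str) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split a search string into (operator, value) pairs and plain terms."""
    operators: List[Tuple[str, str]] = []
    terms: List[str] = []
    for word in query.split():
        match = OPERATOR.match(word)
        if match:
            operators.append((match.group(1), match.group(2)))
        else:
            terms.append(word)
    return operators, terms


# Example: your own posts with a poll, written before a given date.
print(split_query("from:me has:poll before:2022-11-01 lunch plans"))
```

Keywords can be combined with ordinary words in one search, so the example above narrows "lunch plans" to your own posts that contain a poll.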


If you've got any questions, please let us know!

[–] crashdoom@pawb.social 2 points 1 year ago (1 children)

I've been thinking about this for a while, especially seeing several of the admin and maintainer communities for Mastodon / Lemmy taking place on Matrix. But, at this time I don't think we're in a position to be able to spin one up due to existing work we need to finish before taking on new responsibilities.

So, not an outright no, but in the future, yes. If this is something folks do want us to provide, please do add your feedback here and updoot. :3

 

EDIT Aug 2, 1:53 PM US Mountain Time: Maintenance has now concluded and all services are back online!


We're preparing the server rack for an additional Proxmox node to be added to support HA failover and have identified the need to offline and move some equipment around to complete this work.

As such, we'll have a temporary downtime of up to 1 hour starting at 1 PM US Mountain Time today, August 2nd (1 hour from this post). That said, we expect the work to update the node, move it, and bring it back online to take no longer than 30 minutes.

We're hoping these changes will help us maintain a more stable service going forward, spread out the Sidekiq processes (the ones that receive / send data across the Fediverse), and allow us to perform maintenance as necessary without having to offline the service in the future.

We apologize for the short notice as we had hoped to perform this maintenance without any downtime.


Maintenance window: 1 PM - 2 PM US Mountain Time, August 2nd, 2023

ETA: 30 minutes to 1 hour

During the maintenance window, the following services will be unavailable:

  • pawb.fun
  • pawb.social
  • furry.engineer

We’ll have updates throughout the maintenance window available at https://t.me/pawbsocial.

 

Cleared The Shifting Oubliettes of Lyhe Ghiah with a group of four, hitting every single exclamation mark (though two were secret sweeper trolls)!

Love the fact the devs are willing to troll you. FFXIV is awesome XD

 

Around 11:30 PM, one of the core GFCI breakers tripped, resulting in the UPS array running until just after midnight before dropping entirely.

We had not set up monitoring on the APC UPS as we hadn’t anticipated that as a failure point due to the reliability of the Colorado electrical grid, and didn’t anticipate the breaker itself flipping after months of continued use.

All services should be online again and working to catch up on any dropped content overnight, and we apologize again for the inconvenience.

We’ll be trying to identify the root cause throughout the day, working on the electrical connections, and setting up alerting through the APC UPS to detect any persistent A/C loss.

 

EDIT Maintenance Complete (7/6 8:41 AM): We’ve finished troubleshooting with the ISP modem (and stumbled onto fixing the issue somehow), so we’ve brought pawb.social back online.

The Mastodon upgrade has completed and we’re now running Mastodon+Glitch 4.1.3!

…What’s Glitch? Glitch-soc is a fork of Mastodon that enables a ton of cool new features, accessibility options, and more!

You’ll see new things like:

  • An option under the three-dots when posting on the web UI to: Enable threading, or keep your post entirely local to your instance (We’ll be using this going forward with an announcement bot on both pawb.fun and furry.engineer to avoid spamming the timeline)

  • Formatted toots! (You can now use markdown language in your toots!)

  • More skins and flavors of the web-UI, including a high contrast version

Let us know if you have any issues!


Following our previous post about ISP issues, we'll be performing tests on our ISP equipment tomorrow morning (July 6th) from 7 AM US Mountain Time for several hours.

Additionally, we've been made aware of a critical security update for Mastodon that will be released around the same time which we will be applying during the maintenance window.

We're unsure of the contents of the security update at this time, but have been told there is no reason to believe it has been exploited in the wild.

During the maintenance window, the following services will be unavailable:

  • pawb.fun
  • pawb.social
  • furry.engineer

We'll have updates throughout the maintenance window available at https://t.me/pawbsocial.

We hope to complete the maintenance as soon as possible and apologize for any inconvenience this causes.

[–] crashdoom@pawb.social 1 points 1 year ago

Personally, I pronounce it “POB” because W’s are hard :p

Though, I totally didn’t realize when I chose the domain that Pawb also meant “everybody” in Welsh as @match@pawb.social mentioned, but I think it makes it all the better :3

[–] crashdoom@pawb.social 6 points 1 year ago (2 children)

We're using the Ansible playbook deployment, and I ended up giving the pictrs service a restart through docker (docker restart <id> and you can get the id by using docker ps).

It didn't seem to be out of space or even offline, it just locked up and stopped responding to both new uploads and existing image requests.
