this post was submitted on 25 Aug 2023
cybersecurity
I believe Lemmy instances disallow crawling by default, so SEO is probably not why. It would be nice to find Lemmy results in Google if they can sort out the canonical URL problem. Reddit was a great resource for random questions, and if people move here, those answers should still be easy to find.
Nope, it's allowed.
The default robots.txt disallows access to a few paths but not /post or /comment.
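If you want to check this kind of thing yourself, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content below is a made-up stand-in, not copied from any real Lemmy instance, and the `is_allowed` helper and the example paths are just for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only (assumption, not the
# actual file shipped by Lemmy): a few paths blocked, /post and /comment not.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /login
Disallow: /create_community
Disallow: /create_post
"""


def is_allowed(path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the sample robots.txt lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/post/123", "/comment/456", "/create_community"):
        print(path, is_allowed(path))
```

To test a live instance you could swap the inline string for `parser.set_url(...)` plus `parser.read()`, pointing at that instance's own robots.txt.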
There are lots of crawler bots hitting my instance (ByteSpider being the most aggressive). I just keep a list of User-Agent regexes and use it to block them via Nginx. Some, like Semrush, have IP ranges I can block completely at the firewall (in addition to the UA filters).
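For anyone curious what that filtering logic looks like, here is a minimal sketch in Python rather than actual Nginx or firewall config. The regex patterns are guesses at how those bots identify themselves, the IP range is a reserved documentation range (not a real Semrush range), and the function names are hypothetical.

```python
import ipaddress
import re

# Assumed User-Agent substrings; check your own access logs for the real ones.
BLOCKED_UA_PATTERNS = [r"bytespider", r"semrushbot"]
BLOCKED_UA_RE = re.compile("|".join(BLOCKED_UA_PATTERNS), re.IGNORECASE)

# Placeholder range from the documentation block 192.0.2.0/24 (assumption).
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def ua_blocked(user_agent: str) -> bool:
    """True if the User-Agent matches any blocked pattern (case-insensitive)."""
    return bool(BLOCKED_UA_RE.search(user_agent))


def ip_blocked(ip: str) -> bool:
    """True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


def should_block(user_agent: str, ip: str) -> bool:
    """Block when either the UA filter or the IP filter matches."""
    return ua_blocked(user_agent) or ip_blocked(ip)
```

The two checks are combined with `or` because UA strings are trivially spoofed, so the IP filter catches requests the UA filter would miss, and vice versa.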
What makes you say that? robots.txt just disallows things like /create_community, and there are no robots, googlebot, etc. meta tags in the source that I can see, and no nofollow apart from on a few things like feeds.
Also, I'm sure I've seen Lemmy appearing in search results already.
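The "no robots or googlebot meta tags in the source" check can also be scripted. A small sketch with the standard `html.parser` module follows; the class and function names are made up for this example, and the HTML you feed it would come from whatever page you are inspecting.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects <meta name="robots"> / <meta name="googlebot"> directives."""

    def __init__(self):
        super().__init__()
        self.directives = []  # list of (name, content) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes; normalise to "".
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in {"robots", "googlebot"}:
            self.directives.append((name, attr.get("content", "").lower()))


def find_robots_meta(html: str):
    """Return every robots-style meta directive found in the HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives
```

An empty list means the page has no such meta tags, which matches what was described above. This does not cover the `X-Robots-Tag` HTTP header, which would need a separate look at the response headers.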