this post was submitted on 25 Aug 2023
38 points (100.0% liked)
cybersecurity
3249 readers
4 users here now
An umbrella community for all things cybersecurity / infosec. News, research, questions, are all welcome!
Community Rules
- Be kind
- Limit promotional activities
- Non-cybersecurity posts should be redirected to other communities within infosec.pub.
Enjoy!
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Nope, it's allowed.
The default robots.txt disallows access to a few paths but not /post or /comment.
There are lots of crawler bots hitting my instance (ByteSpider being the most aggressive). I just have a list of User Agent regexes I use to block them via Nginx. Some, like Semrush, have IP ranges I can block completely at the firewall (in addition to the UA filters)