this post was submitted on 15 Oct 2023
Open Source
Proxigram, from Instances - Proxigram - Codeberg.org:
If you get an error, try F5-ing a few times, it usually works at some point.
They also have RSS Feeds for accounts.
The RSS feature is amazing. I wanted to do something like that with RSS Bridge, but it looks like both Instagram and Facebook are doing their best to block exactly this kind of thing, so it works half the time and needs to be fixed quite often; I think right now it doesn't work very well either. It's also very complicated to set up if you don't know a bit of PHP. Of course I'm willing to learn, but all the blocking that projects like this (see Barinsta or Bibliogram) run into is really discouraging. I think Meta content is probably among the worst to scrape.
Regarding Proxigram: for now it works. I'm using a public instance to grab some RSS feeds, and if it proves reliable I'll be happy to host my own instance as well, if possible :) It's sometimes slow to grab data (I guess because sessions easily get blocked/limited, returning error 500), but that's not really a problem as I just want to see new events every couple of days. One issue, though, is that the RSS feed doesn't show all the posts (only the last three), which can be annoying because you may miss something if you don't see it and save it.
EDIT: It actually does get other posts as well, just reaaally slowly, meaning that if you follow really large accounts, within a week or so you can find your feed full of older posts marked as unread.
Anyway, thanks to whoever is doing the hard job of building/owning an Instagram scraper. I really know it can be tough.
I'm really late to the party, but I wanted to clarify what you experienced with the RSS feed and why you could get old posts as new. The thing is, Proxigram grabs the information from different sources (providers), not from instagram.com directly (at least at the time of writing). So, let's say you subscribe to the @instagram feed, and the instance admin has set the maximum number of posts per feed, which is 12. Proxigram then makes a request for each of the twelve posts to get the full information, so you can see the full content of each post directly in your reader.
The twelve requests are made at the same time, and for each request a random provider is selected to scrape the necessary data. If an error occurs while fetching a post, instead of pausing the process of generating the XML file, Proxigram just ignores that post and returns what it was able to get (let's say 11 out of 12). Hours later, when you refresh the feed, the post that returned an error before succeeds this time, but since your reader didn't have the ID of that post the first time, it treats it as a new post.
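To make that sequence concrete, here is a minimal Python sketch of the behaviour described above. It is not Proxigram's actual code: the provider names, `fetch_post`, `build_feed`, and `unread_after_refresh` are all hypothetical, and the failure is forced on one post rather than being random so the result is reproducible.

```python
import asyncio
import random

# Hypothetical provider names; the real provider list is not given in this thread.
PROVIDERS = ["provider_a", "provider_b", "provider_c"]


async def fetch_post(post_id, provider, failing):
    """Pretend to scrape one post; raise if this post is marked as failing."""
    await asyncio.sleep(0)  # stand-in for network I/O
    if post_id in failing:
        raise RuntimeError(f"{provider} failed for {post_id}")
    return {"id": post_id, "provider": provider}


async def build_feed(post_ids, failing=frozenset()):
    """Fetch all posts at once and keep only the ones that succeeded."""
    tasks = [fetch_post(pid, random.choice(PROVIDERS), failing) for pid in post_ids]
    # return_exceptions=True: one failure does not abort the other requests.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return [r for r in results if not isinstance(r, Exception)]


def unread_after_refresh(seen_ids, feed):
    """Mimic a reader: any id it has not stored yet is shown as new."""
    new = [item["id"] for item in feed if item["id"] not in seen_ids]
    seen_ids.update(new)
    return new


post_ids = [f"post{n}" for n in range(1, 13)]  # the 12 posts in the feed
seen = set()

# First refresh: post7 errors out, so the feed carries 11 of 12 posts.
first = asyncio.run(build_feed(post_ids, failing={"post7"}))
first_new = unread_after_refresh(seen, first)

# Hours later: every request succeeds, and post7 appears for the first time.
second = asyncio.run(build_feed(post_ids))
second_new = unread_after_refresh(seen, second)

print(len(first), second_new)  # prints: 11 ['post7']
```

The reader only compares ids, so an old post that was skipped earlier is indistinguishable from a freshly published one.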
Instead of this, I could return an item saying that there was an error getting the post, along with a link to the post. But since errors are pretty common, I think it would harm the experience of using RSS, though I'm not sure. If people really want this, I could add it. What do you think?
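For anyone weighing that option, here is a rough Python sketch of what a placeholder item could look like. Everything in it is an assumption made for illustration: `build_items`, the `proxigram.example` base URL, and the wording of the error title are invented, and reusing the post id as the `guid` is just one possible choice, not something Proxigram is known to do.

```python
import xml.etree.ElementTree as ET


def build_items(results, base_url="https://proxigram.example/p/"):
    """Build RSS items. `results` maps post id -> caption, or None on failure."""
    channel = ET.Element("channel")
    for post_id, caption in results.items():
        item = ET.SubElement(channel, "item")
        # Assumption: the guid is the post id in both the failed and the
        # successful case, so the placeholder and the real post share an identity.
        ET.SubElement(item, "guid", isPermaLink="false").text = post_id
        ET.SubElement(item, "link").text = base_url + post_id
        if caption is None:
            # Placeholder item for a post that could not be fetched.
            ET.SubElement(item, "title").text = "Error getting this post"
            ET.SubElement(item, "description").text = (
                "The post could not be fetched; open the link to view it."
            )
        else:
            ET.SubElement(item, "title").text = caption[:60]
            ET.SubElement(item, "description").text = caption
    return channel


channel = build_items({"post1": "A sunny day", "post2": None})
titles = [item.findtext("title") for item in channel.findall("item")]
print(titles)  # prints: ['A sunny day', 'Error getting this post']
```

Whether a reader later replaces the placeholder with the real content when the same `guid` comes back with a full description depends on the reader, so this would need testing against a few clients.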