this_name

joined 1 year ago
this_name@fedia.io 1 points 1 year ago

Yep, cliget was the answer. Honestly an extension I wish I knew about sooner.

this_name@fedia.io 1 points 1 year ago

That's good for most websites, but it only takes the URL, and sometimes websites are tricksy when you're trying to use CLI tools, so I figured I needed all the appropriate headers. I got an extension called cliget, which seems to have worked well enough. In the end I think this would have worked, but Takeout gives you a limited number of attempts to download a file before it cuts you off entirely, and I wanted to be sure.

this_name@fedia.io 1 points 1 year ago

That's a neat trick, good call! I ended up using an extension called cliget which seems functional so far.

submitted 1 year ago* (last edited 1 year ago) by this_name@fedia.io to c/firefox@fedia.io
 

I've got a couple hundred GB to download with Google Takeout, so I selected the 50GB file size, but unfortunately the browser crashes at ~46GB. It actually crashes the whole machine (macOS), with Activity Monitor showing Firefox "using" 46GB of memory.

Is there some weird niche problem I'm running into here? I'd expect Firefox to just be streaming the download into its .part file, so keeping the 46GB in memory is odd.

Is there some way to mimic the Firefox download, with all its cookies, as a wget/curl command? Dev tools let you copy anything in the network console as a curl request, but since this goes straight to the download, I don't think the console sees it.

Honestly any ideas on how to move forward would be appreciated.

Edit:

I ended up using an extension called cliget that does all the "copy as wget" work for me. I added a -c to the wget command so it could pick up from the partially downloaded 40GB file, and went from there. I think just copying the download link would have worked, because it's served from some random domain and probably uses a JWT-like auth scheme, but it's unclear whether it would reject a wget request without the correct user-agent or other headers. YMMV.
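
For anyone landing here later, this is a rough sketch of the general shape of such a command, not the exact one cliget generated. It only assembles and prints a wget command line and does not download anything. The URL, cookie, user-agent, and file name are made-up placeholders, and `build_resume_wget` is a hypothetical helper name. The flags used (`-c`, `--header`, `--user-agent`, `-O`) are standard GNU wget options.

```python
import shlex


def build_resume_wget(url, output_path, cookie_header, user_agent, extra_headers=None):
    """Build an argv list for a resumable wget download that sends browser-like headers.

    All values passed in are supplied by the caller; nothing here is specific
    to Google Takeout.
    """
    argv = ["wget", "-c"]  # -c (--continue): resume from an existing partial file
    argv.append(f"--user-agent={user_agent}")
    argv.append(f"--header=Cookie: {cookie_header}")
    # Any other headers copied from the browser request go here.
    for name, value in (extra_headers or {}).items():
        argv.append(f"--header={name}: {value}")
    # -O names the local file; with -c, wget appends to it if it already exists.
    argv += ["-O", output_path, url]
    return argv


if __name__ == "__main__":
    # Placeholder values only (assumptions for illustration).
    argv = build_resume_wget(
        url="https://example.invalid/takeout-001.zip",
        output_path="takeout-001.zip",
        cookie_header="session=PLACEHOLDER",
        user_agent="Mozilla/5.0 (placeholder)",
        extra_headers={"Accept": "*/*"},
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

To resume the file Firefox already started, the partial download would need to be at the path given to `-O`, and the server has to support range requests for `-c` to do anything useful.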