Browser extension and local backend that automatically archives YouTube videos (github.com)

socalgal2 16 hours ago

Cool but ... this also sounds like hoarding behavior. I've lost count of the things I've saved over the years, only to throw them away years later and realize that saving them in the first place was a waste of time.

In the 90s my friend's mom would video tape AMC movies. She had 300+ tapes. Maybe she had a few rare ones but now all those movies are available on demand either legally or illegally and in much better quality. Another friend kept all of his 1980s computer magazines (Byte, etc...) and moved these extremely heavy boxes through 30+ years of moves. I doubt he ever opened a single magazine since the moment he saved them. Then they all appeared on The Archive and he finally got rid of them.

To be clear, I have a few youtube videos saved on my local storage. I'm just thinking that saving every video I watch reminds me of the things I've personally over-saved.

Actually that reminds me. I met up with the magazine-saving friend recently, which is when I verified that he finally got rid of his stash. It made me think about things I'm still saving that, if I reflect on it, I know I will never actually look at. For example, I have a box of about eight 3.5 inch floppy disks from my Amiga days. The odds that I'm going to get an Amiga, or download an Amiga emu and get a drive to read those, are close enough to zero that I should throw them away. Similarly, I have a book of CD-ROMs of backed up data from the 90s. There's a close to 0% chance that I'm ever going to bother looking at their contents.

toomuchtodo 13 hours ago

PSA: if you have a collection or other artifacts for ingest by IA, I’ll cover reasonable shipping costs to get them there. Above a certain size, they’ll handle logistics of packing and shipping for ingest.

https://help.archive.org/help/how-do-i-make-a-physical-donat...

Tools to make this easy exist if you already have digital versions.

https://github.com/jjjake/internetarchive

And don’t forget to send a few dollars if and when you can.

https://archive.org/donate

(no affiliation, I just like the public good)

mananaysiempre 15 hours ago

> Another friend kept all of his 1980s computer magazines (Byte, etc...) and moved these extremely heavy boxes through 30+ years of moves.

I don’t think IA has all early issues of the Microsoft Systems Journal (later MSDN Magazine), among others. So this can be useful. (Also, what kind of person do you think put the magazines up on IA in the first place?..)

asdefghyk 15 hours ago

Lots of magazines never made it to the archives and have been lost.

danieldk 13 hours ago

I am the exact opposite and sell or throw away pretty much everything that I don't use. I find that doing so not only clutters the house less, but also gives you less to worry about.

My general rule is - if I didn't use it for a year, I don't need it. There are obviously some exceptions like a fire extinguisher (which I hope to never use) and digitized photos, which only go through a careful selection.

I think the thing I kept the longest was a Libranet Linux 3.0 CD set because I worked for Libra Computer Systems for a while and this was the release that I helped build. A few years ago I threw it away, I think after I saw someone had uploaded it to archive.org. When I'm 60 and want to install it again for old times' sake I can.

tl;dr: if you don't use something for a year, you probably don't need it.

zie 9 hours ago

> fire extinguisher (which I hope to never use)

These expire, so make sure you check yours is still good!

Otherwise I agree and do basically the same thing. I also make exceptions for most tools and items with an emotional connection.

stirfish 9 hours ago

I will be buried with my box of miscellaneous cables.

pessimizer 12 hours ago

I don't get around to using plenty of things for the first time a year after I've purchased them. That policy in my life would be a nightmare of constantly rebuying stuff, or failing to rebuy stuff that is now gone forever.

Almost everything that has become indispensable in my life took years to integrate into my life to any significant degree.

"Need" is a weasel word. You don't need anything.

latexr 2 hours ago

> "Need" is a weasel word. You don't need anything.

That feels uncharitable to your parent commenter. By that logic we’d never use “need”, and your use of “significant” would also be a weasel word (I don’t believe it is, merely making a point).

While the words used are important, we should strive to understand the idea being transmitted and steelman the argument. “Need” is relational. Some things you need to survive (even if technically you don’t need to live, and here we’re getting too philosophical), other things you need to feel comfortable, or you may need a chair to reach a high place. In this case your parent commenter is clearly referring to a subjective level of need which differs for each individual and trying to make you reflect on the balance between the things you keep and how their existence in your day-to-day affects your life.

redserk 12 hours ago

I’d own a lot less stuff if there were more opportunities to rent infrequently used items.

As it stands, I have a workshop and electronics bench with many tools that will go unused for years but are critical when I need them and too expensive to buy and throw away.

imtringued 3 hours ago

There is also the problem that people who sell their used stuff on classifieds think of their article to be worth as much of the original purchase price as possible, but don't think about the fact that people would rather buy new stuff.

In reality the mindset they should be having is that the buyer and the seller are doing a group buy separated in time, which means that both paying half the price of new is fair and should be expected.

latexr 2 hours ago

> There is also the problem that people who sell their used stuff on classifieds think of their article to be worth as much of the original purchase price as possible

That is quite the frustrating state of affairs when you try to reduce your consumerism and buy second-hand. It’s not that uncommon to see people selling used items at the same price or even more expensive than new with warranty.

seb1204 9 hours ago

I would like to hear more about this. So you buy, say, a new air fryer, a new monitor, a new mobile phone or a new shirt and it takes you 12 months to first use it? Or was this more like "I buy 3 SD cards in bulk but might not need 3 right now"? Do you live in a remote area where online shopping delivery is not available? Is it just a habit? Honestly curious.

janandonly 4 hours ago

I know > 1 people who buy a new phone, leave it in the box for > 3 months and then finally have the time to move over their data and start using the new device

latexr 2 hours ago

I agree with you on a personal level. I also collected a bunch of different things but eventually threw them all away and am now extremely selective about what I keep. I even avoid hoarding digital stuff.

However, in your examples, the fact those things eventually became available in other forms is not necessarily a counterpoint to your acquaintances having kept them. The specific counterpoint being Marion Stokes.

https://en.wikipedia.org/wiki/Marion_Stokes

SilverElfin 14 hours ago

I would like to be able to search old videos I’ve seen sometimes. Like to find that one recipe I saw or to pull out that one fact I thought I heard. Or sometimes just to listen to a song that later got made private or deleted outright. When YouTube deletes a video it doesn’t even leave the title in your playlist so it can be frustrating to try and find the same thing again.

frou_dh 1 hour ago

> When YouTube deletes a video it doesn’t even leave the title in your playlist

I think this was an insidious decision of YouTube. It's like they want to pretend that videos deleted for ToS violations never existed, because by default the playlist does not even display the "holes" that it now has, either.

nemomarx 15 hours ago

Getting them on a public shared archive is probably a good outcome though. There was that lady who taped hundreds of hours of daytime TV and archiving that has some interesting historical uses?

But a personal copy I'm not sure has much point yeah.

toomuchtodo 8 hours ago

> Marion Marguerite Stokes (née Butler; November 25, 1929 – December 14, 2012) was an American access television producer, businesswoman, investor, civil rights demonstrator, activist, librarian, and archivist, especially known for hoarding and archiving hundreds of thousands of hours of television news footage spanning 35 years [70,000 VHS tapes], from 1977 until her death in 2012, at which time she had been operating nine properties and three storage units. According to the Los Angeles Review of Books review of the 2019 documentary film Recorder, Stokes's massive project of recording the 24-hour news cycle "makes a compelling case for the significance of guerrilla archiving."

https://en.wikipedia.org/wiki/Marion_Stokes

https://archive.org/details/marionstokesvideo

https://recorderfilm.com/

jh00ker 15 hours ago

> I have a book of CD-ROMs of backed up data

> There's a close to 0% chance that I'm ever going to bother looking at their contents.

More likely scenario, your children, grandchildren or other family members go through your shit after you pass away and discover stuff about you that perhaps you never wanted to share.

This is something I think about a lot because I don't have a "digital legacy plan."

socalgal2 14 hours ago

> More likely scenario, your children, grandchildren or other family members go through your shit after you pass away

I think that's not really likely. I'm pretty sure if you poll you'll find that few children care about their parents' "stuff". You can find plenty of people who've lost parents, found that they didn't have any interest in going through their parents' stuff, and from that realized their own children would feel the same way.

Most children aren't going to dig through anything more than a physical photo album, and when they do, the only pictures that are relevant to them are those with people they know. The rest only have meaning to the dead parent. They aren't going to dig through hard drives or CDs unless they are searching for financial documents so they can finish up their parent's financial affairs.

> discover stuff about you that perhaps you never wanted to share

I do worry about that. I just tell myself I'll be dead so it doesn't really matter.

Larrikin 9 hours ago

Where are you making this conclusion from?

Nobody in my family was waiting for one of my parents to die and it actually happened rather suddenly although he was retirement age. There was a very rapid effort to ensure we discovered as many passwords as possible, bought a family NAS, and backed up his entire computer starting with the Lightroom video and pictures. We later went through all of the family photos and folders he hadn't put in there.

To this day it's constantly running with an off-site backup to my NAS. There are some photos of cousins we didn't really know, but he owned the best digital cameras of every era since their invention so it's a huge documentation of life. It would have been a family tragedy to lose that.

StopDisinfo910 5 hours ago

That’s family records. People care about that. I will hazard you probably didn’t peruse their whole collection of books and magazines, however, and I think that’s normal.

People care about the things which remind them of their loved ones: prized possessions, objects with strong memories attached to them or things they used to love as kids, this kind of thing.

The rest is, well, stuff.

globalise83 13 hours ago

On your deathbed, you say: "My only regret is forgetting where I saved my Bitcoin keys".

npteljes 14 hours ago

Store your archive encrypted, and then later you can decide if you share the password or not :)

graemep 13 hours ago

Physical media had a much higher cost, both in the price of the media and in the space it takes up, so with digital storage you can hoard a lot more.

Maybe something a bit more selective than this though!

lukebechtel 13 hours ago

Yes!

Hoarding is bad when it's costly, due to space, time, or money.

Digital media hoarding is thus not bad at all!

socalgal2 13 hours ago

You have to define "cost". I have a "server" with 3 external drives connected. One is "media" and 2 are for backups. I have a drawer with 11 external HD drives which I haven't used in years that used to be my "media" and backup drives. Each of those represent money (buying the drive) and time (copying stuff from old 1TB drives to 2TB drives to 4TB drives) etc....

So there is a cost to digital media hoarding.

I wanted to save the videos I'd captured from my car's cameras but there's ~250 GB every 3-4 months or so, which means more money needed. Plus, if I wanted them actually available to access I'd need a way to plug more drives live into my server, so more $$$$, and I'd need to back them up for when the drives fail, so more $$$$.

So yea, there is a cost to digital media hoarding.
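As a rough check on the figures above, the dashcam numbers can be turned into a yearly total. This is only back-of-the-envelope arithmetic: the 250 GB and 3-4 month figures come from the comment, while the function name and the assumption of exactly one backup copy are illustrative.

```python
def yearly_storage_gb(gb_per_batch: float, months_per_batch: float, copies: int = 1) -> float:
    """Storage needed per year, counting the original plus any backup copies."""
    batches_per_year = 12 / months_per_batch
    return gb_per_batch * batches_per_year * copies


if __name__ == "__main__":
    for months in (3, 4):
        plain = yearly_storage_gb(250, months)
        backed_up = yearly_storage_gb(250, months, copies=2)  # assumes one backup copy
        print(f"every {months} months: {plain:.0f} GB/year, {backed_up:.0f} GB/year with a backup")
```

Under those assumptions the footage alone grows by somewhere between 750 GB and 1 TB per year, and twice that once a backup copy is kept.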

graemep 25 minutes ago

There is a cost, but it is a much lower cost. There are degrees of hoarding too - saving all the video from your car's cameras permanently is pretty extreme. It would also be a lot more expensive and difficult if you were saving it to videotape.

mlyle 6 hours ago

The storage is cheap. The cost is keeping it online and the opportunity cost of gathering, maintaining, and preserving collections.

bombcar 15 hours ago

Digital hoarding takes nearly no practical space.

And there’s a number of YouTube videos I wish I could still access.

fcpguru 16 hours ago

oh that's not why I want them local. I want to open them in final cut pro and edit them and use parts in other videos. I delete the data folder at the end of each day.

IncreasePosts 2 hours ago

It's only hoarding if you don't fill the collection. It also depends on what percent of your storage you devote to it. If I could store every video I ever watched locally and only pay 1% of my storage cost, why not?

pessimizer 12 hours ago

> Then they all appeared on The Archive and he finally got rid of them.

Sometimes you're the person who is uploading them to public archives. Because everybody else threw them all away, and you saved them until the technology made archiving practical enough.

I've been replacing all of my physical media for years, but the reason I can do that now is because other people scan/rip and archive/share the stuff. You also have unique stuff that you may not even know is unique. When you find something in your house that you can't find online, scan it and you're paying everybody back for all of the scanning they did for you.

With the CD-ROMs, you should just glide through them one by one and check if you can find the stuff online. If you can, throw them in the trash. If you can't, copy their contents to a folder, and throw them in the trash. Go through the folder over the next hour or next 20 years (however long it takes to get around to it) and take the things you can't find online that you think somebody might want, and get those things to that somebody (uploading to archive.org is always a good place to start.)

edit: I know for a fact that for a lot of people, uploading somewhere on the internet is their standard pre-deletion ritual.

ThrowawayTestr 15 hours ago

Hard drives are cheap and compact. The real issue is archiving with no organization or indexing.

attila-lendvai 12 hours ago

hoarding, or maybe just anti-censorship measure.

hkon 15 hours ago

With enough space available hoarding is just thinking ahead.

6510 2 hours ago

> saving them in the first place was a waste of time

I think of it like this:

Automatically save everything and spend time deleting the things I don't want to keep. vs Manually saving everything.

"Don't want to keep" depends on disk space and cluttering up the list as it grows. Disk space is not really a thing. I used to have a friend who spent many hours every week cleaning up his 512 GB drive. He was quite obviously deleting things he wanted to keep but "had to" make choices. I just have enough drives to keep 1 to 3 copies of everything. (The single drive will also fail inevitably)

The clutter still seems to happen even if I make the effort to get rid of things. Organizing it a bit, say at least by date is inevitable.

Therefore there is nothing to be gained by wasting time saving things. It is more time efficient to waste time by deleting only enormous folders that you clearly don't need to keep around.

Hoarding is only an appropriate term if you don't have the space for it. If you have an empty airplane hangar a few boxes of foo isn't hoarding.

lez 1 hour ago

It started a line of thought in me. What if the backend keeps the videos around for a longer period of time, and:

* regularly checks YouTube, and whenever an archived video gets deleted from YouTube, advertises the video ID on a specific set of Nostr relays
* a different browser extension for YouTube viewers activates when the user hits a deleted video
* the backend streams the video in exchange for continuous Bitcoin Lightning payments for as long as the stream is kept alive

So you can make some money on free disk space.

erinnh 16 hours ago

I've been using Tubearchivist with the extension for this.

https://github.com/tubearchivist/browser-extension

I really like the WebUI of Tubearchivist itself.

fcpguru 16 hours ago

the main feature I want is to just browse youtube like normal in firefox like I always do. And completely forget starchive is running. Then later in the day I'm pleasantly surprised that any video I want to clip is already downloaded and ready. I never know which one I'll want to download and I don't want to have to click any button.

nemomarx 15 hours ago

What are you clipping them for?

fcpguru 15 hours ago

Usually throughout the day I'll be watching many different videos and then one will stick with me. Someone will have made a really good point at like time code 3 mins and 17 seconds or something. If I have to pause the video right then and there and start a download, it takes me out of the moment. I like it so much better to just go back at the end of the day, find the good moments and place them in their own videos. Examples:

https://www.youtube.com/watch?v=ksHaSnEs4WM

https://www.youtube.com/watch?v=KRfsAufKrzk

https://www.youtube.com/watch?v=6EoH-Qy_xw8

mikae1 16 hours ago

> Videos are saved to the ./data/ directory and converted to MOV format using ffmpeg with hardware acceleration

Transcoded (ouch) or just remuxed to a mov container? Have to investigate.

pixelpoet 6 hours ago

Yeah I was onboard until the re-encoding part: yt-dlp maintains the exact bits, why on earth would someone want to waste encoding time just to trash the quality?

On top of that... seriously, of all the formats one could choose, MOV?! Might as well choose DivX or RealVideo.

fcpguru 1 hour ago

it's because on a mac I want to be able to open them in quicktime and final cut pro. The mp4 format youtube uses isn't something quicktime can open.

But I gave you a cli param of --format

Default is mov but you can pass in mkv

fcpguru 16 hours ago

the video has to be re-encoded because apple quicktime doesn't like the youtube video format. But the audio can just be copied. My mac's fan never spins with the hardware acceleration so it runs in the background and I just forget about it.
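The remux-versus-transcode distinction raised in this subthread can be sketched as two ffmpeg invocations. The thread does not show the command starchive actually runs, so everything here is an illustration: the function names and the input file name are hypothetical, `-c copy` is ffmpeg's stream-copy mode, and `h264_videotoolbox` is ffmpeg's H.264 encoder backed by Apple's VideoToolbox hardware.

```python
from pathlib import Path


def remux_cmd(src: str, dst: str) -> list[str]:
    """Copy every stream unchanged into a new container (no re-encoding)."""
    return ["ffmpeg", "-i", src, "-c", "copy", dst]


def transcode_cmd(src: str, dst: str) -> list[str]:
    """Re-encode the video on the macOS hardware encoder; copy the audio as-is."""
    return ["ffmpeg", "-i", src, "-c:v", "h264_videotoolbox", "-c:a", "copy", dst]


if __name__ == "__main__":
    src = "CHbawkGc_os.webm"  # hypothetical input file name
    dst = str(Path(src).with_suffix(".mov"))
    print(" ".join(remux_cmd(src, dst)))
    print(" ".join(transcode_cmd(src, dst)))
```

A remux only succeeds when the target container accepts the source codecs; when it does, it is fast and lossless, whereas a transcode always costs some quality.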

latexr 1 hour ago

> the video has to be re-encoded because apple quicktime doesn't like the youtube video format.

That’s not true at all. QuickTime is far from the best video player, but it’s also not entirely worthless. It can play “modern” popular formats like H264 MP4, which is exactly what YouTube recommends.

https://support.google.com/youtube/answer/1722171?hl=en

fcpguru 1 hour ago

Show me a screen recording of this working. Run yt-dlp and get an mp4 and open it in quicktime. I get an error every single time.

ahoef 15 hours ago

I detest QuickTime more than any other piece of software

1718627440 14 hours ago

Why does Apple take the effort to maintain and ship different encoding libraries? I would've expected both the Safari engine and QuickTime to simply depend on libappleavsmth.dylib?

fcpguru 13 hours ago

wow i went down an AI rabbit hole learning the answer to this: https://chatgpt.com/share/688e818d-67ac-8010-913d-618f3534f1...

1718627440 17 minutes ago

First of all how do you know it's true?

That being said, that makes zero sense. Just linking to a library doesn't preclude using a protocol over a socket to talk to a graphics/audio server. Access control like remote code isolation (webAPIs), CORS and DRM also don't change anything about decoding and mixing video streams.

Teever 5 hours ago

yeah but this presupposes that the optimal usage pattern here is to use quicktime instead of VLC or something like Jellyfin.

Which seems a little short sighted to me. VLC or Jellyfin are obviously superior because they're accessible across multiple platforms.

latexr 1 hour ago

Seems likely the reason is they want to easily load videos into mobile Apple devices like an iPhone or iPad. While alternative video players exist for those platforms, management may not be as convenient.

fcpguru 1 hour ago

you can pass in --format mkv

the default is mov

frou_dh 11 hours ago

For YouTube videos I feel are worth archiving, I just add them to playlists on my channel, then periodically download my entire channel using a single yt-dlp command (it can keep track of what's already been downloaded).
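The "keeps track of what's already been downloaded" behaviour described above corresponds to yt-dlp's `--download-archive` option, which records finished video IDs in a text file and skips them on later runs. A minimal sketch of building such a command follows; the channel URL, archive file name, and output template are placeholders, not the commenter's actual setup.

```python
import shlex


def channel_sync_cmd(channel_url: str, archive_file: str = "downloaded.txt") -> list[str]:
    """Build a yt-dlp command that skips videos already listed in archive_file."""
    return [
        "yt-dlp",
        "--download-archive", archive_file,        # IDs of finished downloads
        "-o", "%(uploader)s/%(title)s [%(id)s].%(ext)s",  # illustrative layout
        channel_url,
    ]


if __name__ == "__main__":
    # Hypothetical channel URL; substitute your own playlists page.
    cmd = channel_sync_cmd("https://www.youtube.com/@example/playlists")
    print(shlex.join(cmd))
```

Running the printed command periodically (for example from cron) should only fetch videos added since the previous run.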

Szpadel 15 hours ago

I created something similar in concept but with a different goal. I wanted to be able to watch videos with SponsorBlock on iPad, ideally using Plex.

I found a self-hosted solution like this but I was very dissatisfied with how it worked.

On the other hand, I wanted to check out the loco.rs framework, so I decided to implement my own solution.

Basically you are able to add channels/playlists on the many platforms that yt-dlp supports, you can select what should be cut out using SponsorBlock, and you choose how many days you want to keep videos (videos older than that are automatically deleted).

if you are interested, you can check it out: https://github.com/Szpadel/LocalTube
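The retention rule mentioned above (delete videos older than a chosen number of days) can be sketched in a few lines. This is not LocalTube's code; it is an illustrative Python version that assumes file modification time is a good enough proxy for download date, and the function name is hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_older_than(root: Path, keep_days: int, now: Optional[float] = None) -> List[Path]:
    """Delete files under root last modified more than keep_days ago.

    Returns the list of deleted paths. `now` is injectable for testing.
    """
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400  # seconds per day
    removed = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Calling `prune_older_than(Path("data"), keep_days=7)` after each download pass would keep roughly a week of videos on disk.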

jz10 11 hours ago

I gave Claude access to supadata YT transcription and obsidian MCP to convert them to "permanent note" format and it's helped tone down my YT addiction a lot

fcpguru 11 hours ago

oh wow they have free plan? https://supadata.ai/pricing nice!

ProofHouse 12 hours ago

I don’t really get the purpose of this broadly, because doesn’t YouTube keep videos online unless the creator took them down which is probably not the case 95% of the time? That said for a niche or a high likelihood of a video being removed, or if you really want to be 100% certain it makes sense, but would I be accurate in that statement or am I missing something?

fcpguru 12 hours ago

I'm not trying to save them forever. I just want them local so I can take clips from them for other videos. I use them as source input to final cut pro.

add-sub-mul-div 12 hours ago

My version of this downloads the files to my Plex filesystem so I can watch them on my TV without going through a Youtube app. Also the sponsorblock segments are cut out of my local version after download.

I go even further and schedule TV "channels" that rotate through the local videos using ErsatzTV.

imtringued 3 hours ago

If you ever created a long hundred+ video playlist on YouTube you'd realize that videos go missing very frequently and you'll never know which ones are missing.

globular-toast 3 hours ago

Would you let someone else have the power to remove books from your library? Oh wait, you probably are

What I'll say is you're either the kind of person who gets fucked once then does something about it, or you just keep getting screwed and complain pathetically when it keeps happening.

myself248 13 hours ago

Oh, this is huge and important. The number of things I watch that're just gone when I go back to look again!

Youtube is an archive like a grocery store is a food archive. [1]

If it was worth watching in the first place, it's worth saving. Reducing the friction of doing so is going to help a lot of people.

(1: I'm getting this quote wrong, what's the actual and attribution??)

john01dav 7 hours ago

There's a specific removed video that I want back (the whole channel seems to have been bought by someone who wanted pre-existing subscribers or something, then everything on it was nuked and replaced with content that doesn't interest me). I tried emailing the channel's new email address to ask for it, and got no response. Do you know of any practical way to try to get it back?

fcpguru 12 hours ago

ha, I'm not sure who said Youtube is a video archive like a grocery store is a food archive but that's excellent.

hkt 1 hour ago

I've been toying with writing something like this for ages and now I don't have to. Brilliant work, thank you!

computegabe 17 hours ago

Interesting. I was looking into creating an extension that manually manipulates and intercepts the vnd.yt-ump [1] requests, then use webcodecs to process everything in the browser.

[1]: https://github.com/gsuberland/UMP_Format/blob/main/UMP_Forma...

fcpguru 17 hours ago

oh wow, yeah https://github.com/yt-dlp/yt-dlp sounds like the easier path.

ivanjermakov 15 hours ago

I'm achieving this with a single yt-dlp script that reads the URL from the clipboard.

fcpguru 12 hours ago

oh but there's still the thought of having to press copy. My favorite thing about this is I just forget I even have it running and browse youtube like normal. Then later anything I've watched that day is already downloaded.

ivanjermakov 10 hours ago

Sounds like a waste of network and computer resources to me - copying url means the video is worth the effort.

syntaxing 14 hours ago

Whoa! I asked about something like this 2 years ago but never got to making anything [1]! Super exciting something like this exists!

[1] https://news.ycombinator.com/item?id=37885584

amelius 14 hours ago

It would be nice if the extension wrote them to some shared repository. That way, the videos could be preserved for humanity without Google having a say in it.

Added benefit: every video would have to be archived only once.

Alive-in-2025 14 hours ago

But then companies could sue to wipe out the centralized repo. So to be safe, you'd copy things to the central repo and also have a local copy. ;-)

Next, you try to centralize all the private copies so only one person has to keep theirs. The solution is to end copyright for things over x years in age. Instead, in the US we keep pushing back the date.

amelius 13 hours ago

Depends where the central server is. Nobody is wiping Anna's Archive, for example.

WithinReason 16 hours ago

Now add DHT so clients can download videos from each other as a torrent and you solved global video distribution.

rwmj 15 hours ago

That's basically PeerTube?

WithinReason 15 hours ago

PeerTube doesn't have all of youtube's videos on it

rwmj 3 hours ago

Isn't that PeerTube that's going to be DMCA'd out of existence by Google?

WithinReason 2 hours ago

Why? How is it different from yt-dlp from a DMCA standpoint? You're downloading a video in both cases, with this you're not using YouTube’s server bandwidth so it doesn't cost them money.

rwmj 2 hours ago

Because users would also be uploading, as PeerTube works somewhat like Bittorrent.

WithinReason 1 hour ago

I never mentioned PeerTube

untech 13 hours ago

See also ArchiveBox, which supports YT saving as well, but can save other content too

https://github.com/ArchiveBox/ArchiveBox

globular-toast 12 hours ago

I've had this idea myself so cool to see it implemented.

What I'd really like is a kind of universal web caching backend. So everything I access goes through a cache and I have the option of viewing from cache if something goes offline or changes. I could also mark things as "favourite" so they don't ever expire from the cache. Does such a thing exist?

fcpguru 12 hours ago

trying to just grab from the actual browser cache is very hard for video. If you look at the complexity of yt-dlp you'll see why that's so much easier than trying to grab various formats from cache.

busymom0 13 hours ago

My archiving app called HEAP can be configured using a simple AppleScript and yt-dlp to do this too. And since it's a native macOS app instead of a browser extension, it works with all browsers:

https://apps.apple.com/ca/app/heap-website-full-page-image/i...

fcpguru 18 hours ago

~/os/starchive (main)[56daf7] $ ls -lh data

total 3207312

-rw-r--r-- 1 aa staff 525M Aug 2 09:11 2PMzaym-StM.mov

-rw-r--r-- 1 aa staff 362M Aug 2 09:10 CHbawkGc_os.mov

-rw-r--r-- 1 aa staff 658M Aug 2 09:11 lqR7VV8ftys.mov

~/os/starchive (main)[56daf7] $ ./starachive

Server starting on port 3009...

JSON received: map[videoId:CHbawkGc_os]

Added video CHbawkGc_os to queue. Queue length: 1

Processing video CHbawkGc_os. Remaining in queue: 0
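The log above suggests a simple flow: the extension posts a JSON body with a `videoId`, and the backend appends it to a queue that a worker drains. A minimal sketch of that flow follows. It is not starchive's code: the class and function names are hypothetical, the 11-character ID pattern is an assumption based on the IDs shown in the listing, de-duplication is an added convenience, and the HTTP layer is left out.

```python
import json
import queue
import re

# Assumption: video IDs are 11 characters from [A-Za-z0-9_-], as in the listing.
VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def parse_video_id(body: bytes) -> str:
    """Extract and validate videoId from a JSON request body."""
    payload = json.loads(body)
    video_id = payload.get("videoId") if isinstance(payload, dict) else None
    if not isinstance(video_id, str) or not VIDEO_ID.match(video_id):
        raise ValueError("missing or malformed videoId")
    return video_id


class DownloadQueue:
    """FIFO of video IDs that ignores IDs it has already accepted."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[str]" = queue.Queue()
        self._seen: set = set()

    def add(self, video_id: str) -> bool:
        """Queue video_id; return False if it was already queued before."""
        if video_id in self._seen:
            return False
        self._seen.add(video_id)
        self._pending.put(video_id)
        return True

    def next(self) -> str:
        """Pop the oldest pending ID (raises queue.Empty when drained)."""
        return self._pending.get_nowait()

    def __len__(self) -> int:
        return self._pending.qsize()


if __name__ == "__main__":
    q = DownloadQueue()
    vid = parse_video_id(b'{"videoId": "CHbawkGc_os"}')
    if q.add(vid):
        print(f"Added video {vid} to queue. Queue length: {len(q)}")
    current = q.next()
    print(f"Processing video {current}. Remaining in queue: {len(q)}")
```

Validating the ID before queueing matters because the value would typically end up in a yt-dlp command line and in a file name.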