Show HN: I wrote a BitTorrent Client from scratch (github.com)

I picked up programming in late 2023 and have been enjoying it since. I wanted to challenge myself with a stretch goal, so I set out to build a BitTorrent client.

jorkingit 1 day ago

Great work! Just an FYI, you might want to limit the dynamic allocation size in the bencode decoder: since it's untrusted input (either from torrent or announce), a malicious input could DoS the client by requesting extremely large allocations during string parsing. A good upper bound could be the remaining length of the input, as a well formed torrent can't contain a string longer than the rest of the file.
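
To make the suggestion concrete, here is a minimal sketch of a bounded string parse. It assumes the decoder works on an in-memory byte slice; `parseString` is a hypothetical helper of mine, not a function from the posted repository, and a decoder reading from an `io.Reader` would need some other cap.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseString decodes one bencoded byte string ("<length>:<bytes>") from the
// start of data and returns the string plus the unread remainder.
// Hypothetical helper for illustration; not taken from the posted project.
func parseString(data []byte) (string, []byte, error) {
	// Find the ':' separator; everything before it must be ASCII digits.
	colon := -1
	for i, b := range data {
		if b == ':' {
			colon = i
			break
		}
		if b < '0' || b > '9' {
			return "", nil, errors.New("bencode: invalid string length")
		}
	}
	if colon <= 0 {
		return "", nil, errors.New("bencode: missing string length")
	}
	n, err := strconv.Atoi(string(data[:colon]))
	if err != nil {
		return "", nil, fmt.Errorf("bencode: bad length: %w", err)
	}
	rest := data[colon+1:]
	// Reject a declared length larger than the remaining input before
	// allocating or copying anything.
	if n > len(rest) {
		return "", nil, errors.New("bencode: string length exceeds remaining input")
	}
	return string(rest[:n]), rest[n:], nil
}

func main() {
	s, rest, err := parseString([]byte("4:spam3:egg"))
	fmt.Println(s, string(rest), err)

	// A hostile length prefix is rejected instead of driving a huge allocation.
	_, _, err = parseString([]byte("99999999999:x"))
	fmt.Println(err)
}
```

The check costs one comparison per string and never rejects a well-formed file, since a valid string cannot be longer than the bytes that follow its length prefix.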

piyushgupta53 1 day ago

Thanks for pointing this out. I've added it to my to-dos.

indrora 15 hours ago

You might look into (if you only care about reading it) writing the bencode decoder using Kaitai Struct [0] to avoid some of the common pitfalls.

[0]: kaitai.io

vkaku 1 day ago

Excellent, looks clean and simple.

Suggestion: Add a simple usage one liner in the README on how to actually download a .torrent file with it.

  ./go-torrent My-Linux-Distro-Wink-ISO.torrent

Suggestion: Bonus points if you add torrent.ParseFromUrl

Everyone should do this for their own spiritual journey.

piyushgupta53 1 day ago

thanks for the suggestion, appreciate it.

__jonas 1 day ago

Neat! There is a challenge on CodeCrafters that guides you through the process a little and provides tests and such. I played around with it a bit during a free month they had, and it was fun:

https://app.codecrafters.io/courses/bittorrent/overview

ashirviskas 1 day ago

As a non-Go developer, may I ask why you're using the older Go version 1.21? Is there a reason to stay with older releases?

EDIT: It seems like it was deprecated 10 months ago

pidgeon_lover 1 day ago

Windows 7 support is one reason to stick to older GoLang releases. A project in Go 1.21.4 or earlier will work on every Windows release and any computer made since 2009, whereas a version bump to v1.21.5 means it will only work on more recent computers and Win10 and 11 for no benefit.

https://github.com/golang/go/issues/64622

wongarsu 23 hours ago

I think this is a reasonable take. Yes, people shouldn't be running Windows 7 as their daily driver. But if you can support it at basically no effort and without sacrifices that is the right thing to do. Supporting more platforms is a good thing, even if that platform is an old Windows version instead of an Amiga

CactusRocket 5 hours ago

> and without sacrifices

There are always trade-offs/sacrifices.

The Go team isn't making new versions just for fun. Each version since 1.21 has had improvements. The change to for loop variable scoping in 1.22 is especially nice to have and helps prevent bugs.

If there's a reasonable expectation that users will use outdated platforms, it makes sense to support them. If there is no such expectation at all, why would one forgo the improvements to the language and tooling?

There's always a price to pay.
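
For anyone who hasn't run into the loop variable issue, here is a small sketch of it. The function name `closures` is mine, purely for illustration. Before Go 1.22 every iteration shared one `i`, so the explicit copy was needed; in modules whose go.mod declares 1.22 or later, each iteration gets its own variable and the copy becomes redundant.

```go
package main

import "fmt"

// closures returns one function per loop iteration; each reports the value
// of the loop variable it captured.
func closures(n int) []func() int {
	var fns []func() int
	for i := 0; i < n; i++ {
		// Per-iteration copy: required before Go 1.22, where all closures
		// would otherwise share a single i; redundant from 1.22 onward.
		i := i
		fns = append(fns, func() int { return i })
	}
	return fns
}

func main() {
	for _, f := range closures(3) {
		fmt.Println(f()) // 0, 1, 2 with the explicit copy, on any Go version
	}
}
```

Forgetting that one-line copy was an easy mistake to make, especially with goroutines started inside a loop.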

dd_xplore 5 hours ago

In this case the 'improvements' aren't required for this particular project.

koito17 11 hours ago

The README is likely AI-generated. The actual go.mod file lists 1.23.1 as the Go version[1], which implies a requirement of Go 1.23.1 or higher[2].

[1] https://github.com/piyushgupta53/go-torrent-client/blob/6130...

[2] https://go.dev/doc/modules/gomod-ref#go-notes

agiron123 1 day ago

Very cool!

This brings me back to college. We did this as our final project for our networking class at Georgia Tech.

I've long lost the code for this, but the lessons learned have lived on :)

Projects like these are great ways to learn new languages too!

vivzkestrel 9 hours ago

What did you refer to when building this? Did you read the protocol spec, or did you go through other implementations? I'm curious because it always makes me wonder how people build stuff like this from scratch.

CactusRocket 5 hours ago

Interestingly (or maybe not) I started writing a Bittorrent client last week, also in Go. I don't really like AI/LLMs for coding, and my goal is to really learn from it. Not to just deliver some piece of code that an AI wrote...

So the first step is to figure out if there is an actual spec for the protocol. Well, in the case of Bittorrent there is a spec, but it's kinda... brief. Basically, this is the spec: https://www.bittorrent.org/beps/bep_0003.html

There are some additional extensions here: https://www.bittorrent.org/beps/bep_0000.html

It's very important to break the work up in tiny steps that each give some kind of a result. E.g. I started with parsing .torrent files, for which you have to implement bencoding.

That was interesting, because I downloaded the Arch Linux .torrent and found out it was basically incorrect: it has a url-list key which isn't mentioned in the spec. Then you have to go down a rabbit hole of some research trying to find out what the heck is up with that (turned out to be https://www.bittorrent.org/beps/bep_0019.html). Finally, I was able to successfully parse the Debian Linux .torrent file.

Then came implementing the tracker Announce HTTP request and the peer protocol. The peer protocol was tricky, and that's when an experimentation mindset comes in handy. I stripped the announce URL from the Debian torrent and loaded it up in KTorrent, so that it wasn't able to find any peers. I used the "Add peer" option to add my client so I could see what messages it was sending me, and alongside I would implement sending those messages to KTorrent too. A lot of trial and error and debugging.
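
To give an idea of the very first peer message, here is a rough sketch of the handshake as BEP 3 lays it out: a length byte of 19, the string "BitTorrent protocol", 8 reserved bytes, the 20-byte info hash and the 20-byte peer ID. The function name and the placeholder peer ID are made up for illustration and are not from my client or the posted one.

```go
package main

import "fmt"

const protocolName = "BitTorrent protocol"

// buildHandshake lays out the 68-byte BEP 3 handshake:
// <len=19><"BitTorrent protocol"><8 reserved bytes><info_hash><peer_id>.
// Illustrative helper; the name is an assumption.
func buildHandshake(infoHash, peerID [20]byte) []byte {
	buf := make([]byte, 0, 1+len(protocolName)+8+20+20)
	buf = append(buf, byte(len(protocolName)))
	buf = append(buf, protocolName...)
	buf = append(buf, make([]byte, 8)...) // reserved bytes, all zero when no extensions are advertised
	buf = append(buf, infoHash[:]...)
	buf = append(buf, peerID[:]...)
	return buf
}

func main() {
	var infoHash, peerID [20]byte
	copy(peerID[:], "-XX0001-123456789012") // placeholder peer ID, exactly 20 bytes
	msg := buildHandshake(infoHash, peerID)
	fmt.Println(len(msg))
	fmt.Printf("%q\n", msg[1:20])
}
```

The other side replies with the same layout, so comparing the info hash at bytes 28 through 47 of the reply is a quick sanity check before exchanging any other messages.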

I have to confess that from time to time I had to ask ChatGPT about some small protocol details because they just didn't show up in the spec. There are also a lot of details that clients have implemented over time but which aren't really documented well. In particular, the algorithms for which pieces to download, which peers to choose, how to "choke"/"unchoke", etc., are not well-defined. But web searches help out a lot.

Also an honorable mention to https://wiki.theory.org/Main_Page which has some nice information.

I'm now at a point where I can fully download and upload a torrent with my local KTorrent, so the next steps are to actually get peers from the tracker, and some algorithm to download pieces and store them in a file.

Hope this gives some insights. Let me know if you have specific questions about the process.

NooneAtAll3 1 day ago

Do you support magnet links?

Edit: ah, planned feature

piyushgupta53 1 day ago

not yet. I'll be adding soon.

TheEdonian 1 day ago

How hard would it be to add a GUI to this? I don't think I've seen a lot of Go GUI implementations in the past.

thegeekpirate 16 hours ago

There's a bunch https://github.com/go-graphics/go-gui-projects

My personal favourite is an ImGui wrapper https://github.com/AllenDang/giu

The most featureful is probably unison, although I'm uncertain if anyone uses it outside of their own project (https://gurpscharactersheet.com), meaning documentation will be sparse https://github.com/richardwilkes/unison

Gio uses a different way of thinking about GUIs, used by Tailscale and gotraceui https://gioui.org

Wails is great if you're familiar with development on the web https://wails.io

The GTK4 bindings also look nice https://github.com/diamondburned/gotk4

Cogent Core also looks neat, but I didn't have the time to play with it before I switched over to using the Odin programming language instead of Go https://www.cogentcore.org/core

I personally had nothing but issues with Fyne (especially in regard to performance, across multiple computers and operating systems), but it's probably the most popular option https://fyne.io

CactusRocket 5 hours ago

I would like to echo performance (and other issues) with Fyne. Every time I try it again, I'm actually kind of baffled that it's so popular.

Last week I came across these Qt bindings: https://github.com/mappu/miqt

I have no personal experience with it yet, but am excited to try it out.

throwaway894345 1 day ago

This is cool! I’ve been thinking about something like this as well. How hard was it, and do you have a sense for how “complete” it is? Does it handle DHT and Magnet and all the crazy NAT traversal stuff?

I’m guessing the main obstacle for me has always been that I’m not sure what the complete list of features is to have a client that will just work for the majority of torrents in the wild. It seems like there are dozens of protocols associated with torrenting and I don’t even know what the full list is much less what each does.

CactusRocket 5 hours ago

How hard it is depends a lot on your experience, how comfortable you are with the programming language, and whether you have an experimentation mindset. To offer a perspective, I also decided to build a BitTorrent client in Go last week, and after a week I'm about 80% of the way to where the posted project is. But then again I have a lot of experience with Go, extensive knowledge about protocols and networking, and experimentation has become a kind of second nature for me.

If you're interested, the main Bittorrent spec is actually quite small, you can easily read and understand it in an hour or so: https://www.bittorrent.org/beps/bep_0003.html

There's a lot of stuff "around" Bittorrent, like protocol extensions. But in my experience so far, many clients have different levels of support. E.g. they would try communicating using Protocol Encryption (obfuscation, actually) and if it doesn't work fall back to the regular protocol (if configured in the settings).

If you just want to download a Linux distribution (e.g. Debian) using the .torrent file on their website, it's relatively straightforward. It has just one file, there's a tracker, plenty of peers using the regular protocol, and you could get away with not listening for connections when you're behind a NAT.

Naturally when you get into the grey area of content, then you'd need to parse magnet links, use DHT to discover peers, and perhaps you'd want to implement UPnP to try and automatically map a port on your modem. But that is also an iterative process, you can just build out one by one. The more features, the more peers you'll be able to discover and have successful exchanges with.
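
As a taste of the first of those steps, here is a minimal sketch of pulling the info hash and trackers out of a magnet link. It assumes a v1 link with a hex-encoded `urn:btih:` topic; base32-encoded hashes and other parameters are left out, and `parseMagnet` is a hypothetical name of mine.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// parseMagnet extracts the 20-byte v1 info hash and any tracker URLs from a
// magnet URI. Only hex-encoded hashes are handled in this sketch.
func parseMagnet(uri string) ([20]byte, []string, error) {
	var infoHash [20]byte
	u, err := url.Parse(uri)
	if err != nil {
		return infoHash, nil, err
	}
	if u.Scheme != "magnet" {
		return infoHash, nil, errors.New("not a magnet link")
	}
	q := u.Query()
	const prefix = "urn:btih:"
	for _, xt := range q["xt"] {
		if !strings.HasPrefix(xt, prefix) {
			continue
		}
		raw, err := hex.DecodeString(strings.TrimPrefix(xt, prefix))
		if err != nil || len(raw) != 20 {
			return infoHash, nil, errors.New("unsupported info hash encoding")
		}
		copy(infoHash[:], raw)
		return infoHash, q["tr"], nil // "tr" parameters are tracker URLs
	}
	return infoHash, nil, errors.New("no urn:btih topic found")
}

func main() {
	hash, trackers, err := parseMagnet("magnet:?xt=urn:btih:0123456789abcdef0123456789abcdef01234567&tr=http%3A%2F%2Ftracker.example%2Fannounce")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%x %v\n", hash[:], trackers)
}
```

The magnet link only gives you the hash; you still have to fetch the actual metadata from peers, which is where the extension protocol and DHT come in.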

piyushgupta53 1 day ago

It was challenging for sure. It took me almost a month to get acquainted with the protocol and how bencoding works, build a mental model, and then eventually write the code.

Magnet and DHT are yet to be added.

Charon77 1 day ago

In my experience, magnet is pretty straightforward. DHT is quite the rabbit hole, and it might be difficult finding clients that support DHT (not everyone does).

NoMoreNicksLeft 22 hours ago

Did you do v2 and mutable torrents? Please, for the love of all that is good and wholesome, someone do mutable torrents.

OccamsMirror 1 day ago

Could this be used as a library?

CactusRocket 5 hours ago

I am not the one who created the post, but currently it doesn't seem to be usable as a library because most of the code is in a directory called "internal", which isn't accessible when using the repository as a dependency in another project.

tomhow 1 day ago

Stub for offtopicness

blibble 1 day ago

[flagged]

spuz 1 day ago

Yes, very strange. There's no problem with using AI to build your first app and leaving the generated comments in the code is fine. But the number of comments on this thread that begin "This is so cool" is very suspicious.

throwaway894345 23 hours ago

It doesn't seem wild to me that people would post "This is cool" on a ShowHN post. I did [here](https://news.ycombinator.com/item?id=44265915), and as far as I'm aware, I'm not an AI.

Would you like more information about how to identify AI comments? (kidding)

WhyIsItAlwaysHN 1 day ago

Or like a go beginner, which is fine

OtherShrezzing 1 day ago

Scanning around their other repositories, the person's been programming for a few years now. There are ‘.cursor/rules’ directories in some recent repos.

I think it’s a reasonable hypothesis that “I wrote a BitTorrent client from scratch” may be “I produced a BitTorrent client from cursor”.

diagraphic 1 day ago

"convert length string into an integer" is a machine generated comment?

I've been writing code for 15+ years, and this made me laugh my ass off. Comments are great. I don't read comments, but I write them for others, especially for open source code. Atoi may be something you and I and a whole bunch of others know, but for people who don't, it's a fine comment. Relax! :)

imiric 1 day ago

That comment is a strong sign that this was AI-generated. LLMs have the tendency to leave superfluous comments even when the code is self-explanatory. In this case, strconv is a well-known stdlib package, and anyone reading this in their IDE would get the documentation for what it does. In fact, all of the comments in this function and in most of the file are redundant, and I would point this out in a code review.

But, of course, this was vibe coded, so it's unlikely a human actually reviewed it.

rvnx 1 day ago

In the tests it's more obvious:

You can see here for example: https://github.com/piyushgupta53/go-torrent-client/commit/61...

and some strings coming from crawled resources like: lengthi12345e4 but slightly different tokens (like 25 instead of 35 etc).

Gemini Pro 2.5 even gave me the prompt:

> If you asked me, "Generate Go unit tests for a Bencode decoder function called Decode that takes an io.Reader and returns an interface{} and an error. Cover strings, integers, lists, and dictionaries, including common error cases and nested structures" the output I would strive to produce would look very much like the code you've shown.

> It's a good example of well-written Go tests, and that's the kind of pattern I've learned to recognize and replicate.

and a lot actually matches when you ask from a fresh conversation.

So most likely Cursor + Gemini 2.5 Pro, but I can't blame them: I spend 100% of my time with Claude, and I take ownership of the code.

alexpadula 1 day ago

"TODO: We'll develop the actual functionality as we develop each component" lool

It's hard to say honestly. I don't call any project AI as it's just too hard to tell. I write lots of comments in my code too so it's hard to call anything AI without a person stating they used it.

Claude is decent for sure, but I always say with AI, learn the math before jumping to a calculator.

diagraphic 1 day ago

Clean code! Very nice :)

ivanjermakov 1 day ago

No seeding, no DHT, no magnet links, no uTP, no extensions. At this stage it is a BitTorrent downloader, not a client.

Using P2P networks in download-only mode, so-called leeching or free-riding, is discouraged.

hwmrocker 1 day ago

[dead]

Moosdijk 1 day ago

What's up with the number of new accounts praising this project?

ivanjermakov 1 day ago

Seems like someone (OP or not) is testing how well they can use HN for free advertisement.

throwaway894345 23 hours ago

I only see two green usernames. Have others been deleted already?

diagraphic 1 day ago

odd indeed

blibble 1 day ago

[flagged]

alexpadula 1 day ago

I always wondered how the heck people get away with that. HN mods are slacking, allowing those sorts of projects to reach the top along with what are clearly bot likes and comments. Craziness. It puts all the projects and posts worthy of eyes at the dead bottom.

fragmede 1 day ago

some people don't, but there's survivorship bias at play here. whenever you suspect foul play, email the mods at hn@ycombinator.com, they're quite responsive

alexpadula 1 day ago

Thank you for the info! Much appreciated fragmede :)

pvg 1 day ago

You're probably getting downvoted because there are local conventions against astroturfing/shillage/botting accusations described in https://news.ycombinator.com/newsguidelines.html

If you think there's something wrong with the post email the mods at hn@ycombinator.com

Omarbev 1 day ago

[flagged]

man_is_obsolete 1 day ago

[flagged]

man_is_obsolete 1 day ago

[flagged]

thekevan 1 day ago

And just what is the purpose of AI generated replies? Especially with the obvious user name.

einpoklum 1 day ago

So, you're saying... man is not obsolete then? 8-|

Moosdijk 1 day ago

and the account being 1 hour old (at this time)

startyz 1 day ago

sounds very cool. Good luck!

b0a04gl 1 day ago

pretty solid attempt. but no mention of crash recovery, encrypted peer handshakes, or even basic uTP support. no idea how it behaves with NAT either. no memory guardrails during parsing, feels risky in real swarm. not production safe without those. would love to see it modular too, like usable as lib not just cli. tracking roadmap would've helped too.

chrisrickard 1 day ago

[flagged]