
BitTorrent
Last updated: January 21, 2009.
You've heard of a transportation nightmare called gridlock—where a city's
streets become so packed with cars and trucks that no-one can go
anywhere and the entire place grinds to a halt. But have you heard
how the Internet can lock up the same way? The Net is based on what's
called a distributed architecture with no single, central point of
control, but lots of similar systems working in parallel; that's why
it works incredibly efficiently most of the time. Even so, failures
of important cables (like the ones that link countries and
continents under the sea), systematic attacks by criminal hackers, or
sudden surges in demand all have the power to bring the Net to its
knees. As more and more people go online, the chances of Internet
gridlock grow steadily greater. What's the solution? One answer
is for people to make better use of the Net's distributed
architecture using a superbly clever way of sharing files known as
BitTorrent (or, to give it its full name, the BitTorrent protocol).
Let's take a closer look at how it works!
Photo: "Traffic" congestion could drive the Net toward gridlock.
Photo by Warren Gretz courtesy of US Department of Energy/National Renewable Energy
Laboratory.
Client and server

Photo: KTorrent: A BitTorrent "client" for the Linux operating system.
If you've read our article about how the Internet works, you'll know
that it uses two kinds of computers linked together:
- Servers are the big, powerful machines that hold web pages, downloadable MP3 music files, videos, and all the rest.
- Clients are the small machines we have in our homes and offices that download data from servers.
Browsing a website involves a lengthy conversation between a
client and a server: a program called a web browser, running on your
computer, sends repeated requests for bits of the website
(individual web pages and the text, photos, and multimedia content
they contain) to a server, which does its best to oblige.
This system works well most of the time, but if you think about it a little, you can
immediately see there's a problem. Suppose you have a server that's
hosting some really popular file: the latest MP3 music track from
world-beating band The Uber Popular Sharks. Let's say the Sharks
release their track one Monday morning after a lengthy marketing
campaign telling the world that's exactly what they'll be doing.
The server hosting the Sharks' track is going to be besieged with traffic
from all over the world at exactly the same time. Even if it doesn't
grind to a halt, it's going to run incredibly slowly so it could take each
person ages to download the track. Worse, all that music is going to
congest parts of the Internet linked to the Sharks' server. Suppose
the server is based on a small island like Fiji. Chances are, the
whole of Fiji's Internet service will be severely degraded just
because lots of people are downloading the Sharks track from a server
nearby! The whole exercise is also going to cost the Sharks an absolute fortune in
website hosting fees: the more data people download from their
server, the more bandwidth the Sharks will use and the more they'll have
to pay to their ISP (Internet Service Provider)—which is pretty crazy if they're a small band without much
money.
You can see how ridiculous the whole thing gets if you consider what happens if a
large number of Sharks fans all live near one another on the opposite
side of the world in, say, Seattle. Vast amounts of data are
going to be streaming over the Internet between Fiji and Seattle, but
because everyone is downloading the same track, it's going to be
pretty much the same data making that same stupidly long journey over and
over again. Sounds crazy? Wouldn't it be much more sensible if one
person in Seattle downloaded the Sharks track and then shared it with
all the other Sharks fans who live nearby? Roughly speaking, that's
the idea behind BitTorrent.
Peer-to-peer
Since there's no single, central computer controlling the Internet and (in
theory) every computer that's online is connected indirectly to every
other one, it should be possible for any two computers to share
information by communicating directly—and it is! This is called
peer-to-peer (P2P) communication and it's used by some of the
more popular instant messaging (IM) chat programs (as well as
controversial file-sharing programs, which earned themselves a bad
name when people started using them to share copyrighted music tracks
illegally).

Photo: Transmission: Another BitTorrent "client" for the Linux operating system.
BitTorrent is a protocol (a set of rules that different computer systems agree
to use) based on P2P that can be used to share large files very
efficiently. Suppose the Sharks decide they want to use BitTorrent.
They take their music track and publish a small descriptor file for it,
called a torrent, which tells other computers how to find and check the
pieces of the track. The computer that holds the original file, in its
entirety, is called a seed, and it splits the file up into lots
of pieces.
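To make the idea of splitting a file into pieces more concrete, here is a minimal Python sketch. It cuts some data into fixed-size pieces and records a SHA-1 fingerprint of each one, which is broadly how a torrent lets a client check that a piece arrived intact. The function names and the tiny 4-byte piece size are assumptions chosen purely for illustration (real torrents use much larger pieces), and this is not a full BitTorrent implementation.

```python
import hashlib

# Hypothetical piece size: tiny so the example is easy to follow.
PIECE_LENGTH = 4  # bytes


def split_into_pieces(data: bytes, piece_length: int = PIECE_LENGTH) -> list:
    """Cut the data into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(pieces: list) -> list:
    """Compute a SHA-1 fingerprint for every piece."""
    return [hashlib.sha1(piece).hexdigest() for piece in pieces]


def verify_piece(piece: bytes, expected_hash: str) -> bool:
    """Check a downloaded piece against the fingerprint published for it."""
    return hashlib.sha1(piece).hexdigest() == expected_hash


# Stand-in for the Sharks' music track (23 bytes of text instead of an MP3).
track = b"The Uber Popular Sharks"
pieces = split_into_pieces(track)
hashes = piece_hashes(pieces)

print(len(pieces), "pieces")
print("first piece OK?", verify_piece(pieces[0], hashes[0]))
print("tampered piece OK?", verify_piece(b"xxxx", hashes[0]))
```

Because each piece can be checked on its own, a client can safely collect pieces from lots of different computers, in any order, and still be sure it ends up with the right file.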
Anyone who wants the file uses a program called a
BitTorrent client to request it from a seed. The client is
sent one of the pieces and gets all the remaining pieces, over a
period of time, from other people's computers through P2P
communication. At any given moment, each computer is downloading some
parts of the file from some of these peers and uploading other parts
of the file to other peers. All the computers cooperating in this way
at any time are called a swarm. The more popular a file
is, the more computers there are in the swarm and the quicker the
process is all round.
Share and share alike is the ethos behind
BitTorrent so, when people have finished downloading a file,
they are encouraged to stay online for a while so they can continue
uploading the file to others in the swarm—an activity known as
seeding. Quitting from a swarm the minute your download is
complete, without seeding, is a selfish activity that's earned itself
the nickname leeching! If everyone leeched, BitTorrent
wouldn't work at all.
Although BitTorrent is a decentralized P2P process very different from old,
client-server-type downloading, there has to be some sort of order
and control. Something has to keep track of which computers are taking
part and which bits of the file each one holds. This works in different ways
with different BitTorrent clients. Some rely on centralized computers
called trackers which, as their name suggests, keep track of
the computers in the swarm at any moment, so that clients can find one
another and compare which pieces they hold. There is also
a more decentralized version of BitTorrent where the
clients manage the tracking process among themselves (sometimes called trackerless torrents
or distributed torrents).
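Here is a rough Python sketch of the bookkeeping a centralized tracker does. It is a toy, in-memory model under stated assumptions: the class name ToyTracker, its announce and leave methods, and the peer addresses are all hypothetical, and it only remembers which peers have joined the swarm for a given torrent (it assumes peers then tell one another which pieces they hold). A real tracker talks to clients over the network and handles much more.

```python
class ToyTracker:
    """A toy, in-memory stand-in for a tracker (illustrative only)."""

    def __init__(self):
        # Maps a torrent identifier to the set of peers currently in its swarm.
        self.swarms = {}

    def announce(self, torrent_id, peer):
        """Register a peer and return the other peers already in the swarm."""
        swarm = self.swarms.setdefault(torrent_id, set())
        others = sorted(swarm - {peer})
        swarm.add(peer)
        return others

    def leave(self, torrent_id, peer):
        """Remove a peer that has gone offline."""
        self.swarms.get(torrent_id, set()).discard(peer)


tracker = ToyTracker()
first = tracker.announce("sharks-track", "seattle-1")   # nobody else yet
second = tracker.announce("sharks-track", "seattle-2")  # learns about seattle-1
third = tracker.announce("sharks-track", "fiji-seed")   # learns about both
tracker.leave("sharks-track", "seattle-1")
fourth = tracker.announce("sharks-track", "seattle-3")

print(first, second, third, fourth)
```

In a trackerless setup, the clients would share this "who is in the swarm" information among themselves instead of asking one central computer.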
Downloading versus bittorrenting
Here's a quick summary of the difference between downloading a file through a traditional client-server
approach and sharing it peer-to-peer with BitTorrent:
Downloading with client-server
In a client-server setup (for example, when people download the same file from the same website), each computer (client) downloads a complete copy of the file (shown by the four colored blocks) from the server. Since the server has to provide one copy of the file to each client, this can work out hugely expensive for the person making the file available. If everyone downloads at the same time, the server has to divide its effort between the different clients, so each client's download speed falls roughly in inverse proportion to the number of clients. That's why, even with a fast Internet connection, you will still sometimes experience very slow downloads.
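A little arithmetic shows how quickly this adds up. The Python sketch below is a back-of-the-envelope model, not a measurement: it assumes the server's upload link is the only bottleneck and is shared equally among everyone downloading at once, and the 5 MB file and 100 megabits-per-second link are made-up figures.

```python
def server_upload_total_mb(file_mb, clients):
    """Total data the server must send: one full copy per client."""
    return file_mb * clients


def download_seconds(file_mb, server_mbps, clients):
    """Time for every client to finish when the server's link is shared equally.

    file_mb is in megabytes and server_mbps in megabits per second,
    so the file size is multiplied by 8 to convert bytes to bits.
    """
    total_megabits = file_mb * 8 * clients
    return total_megabits / server_mbps


# Hypothetical figures: a 5 MB track on a server with a 100 Mbps upload link.
for clients in (1, 100, 1000):
    print(
        clients, "clients:",
        server_upload_total_mb(5, clients), "MB sent,",
        download_seconds(5, 100, clients), "seconds each",
    )
```

Under these assumptions, a thousand fans downloading at once wait a thousand times longer than a single fan would, and the server has to send a thousand copies of the file.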
Peer-to-peer BitTorrent
With P2P BitTorrent:
- The originating server (the seed) makes available one copy of the file, which is then split up into chunks. Different chunks are sent out to the various computers (BitTorrent clients) trying to get hold of a copy of the file.
- Each client uploads their part of the file to other clients while simultaneously downloading bits of the file they don't have from other clients. All the clients work together as a swarm to share the file. The file-sharing process doesn't happen in the systematic, sequential way we show here (purely for simplicity): clients upload and download simultaneously and the file actually builds up in a more random way. There are often hundreds of clients involved in each swarm.
- Eventually, every client receives a complete copy of the file. However, in this example, as in real life, one client (lower left) finishes downloading before the others. If the owner of that machine switches off as soon as they're done, the other clients will never receive complete copies of the file!
Note two crucially important differences from client-server downloading. First, the originating server potentially has to make available only one copy of the file—so the whole process is much cheaper for them. Second, the more clients there are, the more effective the whole process will be for everyone involved (unlike in client-server, where more clients slow the process down for everyone involved).
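The first of those differences can be checked with a small simulation. The Python sketch below is deliberately simplified: the function simulate_swarm is a hypothetical name, the seed hands out each piece exactly once in round-robin order, and the clients then copy missing pieces from one another in a fixed order. Real swarms are messier and less systematic, as noted above, so treat this only as a way of counting who uploads what.

```python
def simulate_swarm(num_pieces, num_clients):
    """Count uploads when the seed sends each piece once and clients share the rest."""
    clients = [set() for _ in range(num_clients)]
    seed_uploads = 0
    client_uploads = 0

    # Step 1: the seed sends each piece to just one client, round-robin.
    for piece in range(num_pieces):
        clients[piece % num_clients].add(piece)
        seed_uploads += 1

    # Step 2: every client fetches the pieces it lacks from other clients.
    all_pieces = set(range(num_pieces))
    for receiver in clients:
        for piece in sorted(all_pieces - receiver):
            # The receiver lacks this piece, so any holder is another client.
            if any(piece in other for other in clients):
                receiver.add(piece)
                client_uploads += 1

    return seed_uploads, client_uploads, clients


# Four pieces and four clients, like the four colored blocks in the example.
seed_uploads, client_uploads, clients = simulate_swarm(4, 4)
print("seed uploaded", seed_uploads, "pieces")
print("clients uploaded", client_uploads, "pieces between themselves")
print("everyone complete?", all(len(c) == 4 for c in clients))
```

With four pieces and four clients, a client-server download would make the server send all 16 pieces itself; in this sketch the seed sends only 4 and the clients supply the other 12 to one another.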
The pros and cons of BitTorrent
If you have a fast Internet connection, why should you care about BitTorrent? You
have to understand that a fast connection doesn't guarantee you a
fast download: if the server you're trying to download from is
congested, you're going to download just as slowly as everyone else.
You can only really "get" the idea of BitTorrent if you think
about the Internet as a collaborative experience with lots of people
trying to do lots of similar things all at the same time. If we all
want to be able to download movies, we can't all do it at the same
time from the same server—just as we can't all drive down the same
highway at exactly the same moment. We have to learn to cooperate and
share the limited resources we've got instead of selfishly trying to push
everyone else out of the way.

BitTorrent is of most immediate benefit to people who want to make large files
available, rather than those who want to download them, because it
greatly reduces the cost of publishing big files online. That means the content
providers can afford to make a bigger range of files available—and that's where the
community as a whole benefits.
BitTorrent also helps the Internet
as a whole by spreading traffic more evenly and helping to reduce
congestion. P2P is generally a far more efficient system than
client-server: it's like people driving down all the streets and
roads they can think of instead of all trying to use just one central
highway.
The disadvantage of BitTorrent is that it also provides a simple way for
people to share pirated movies, music tracks, and other copyrighted
content. But it's important to be clear: there's nothing illegal
about using BitTorrent clients unless you download
copyright-protected works with them without the owners'
permission. It's not necessarily the tool that's at fault if
people start to use it in antisocial ways. Much the same
argument applies to cars and guns: both are perfectly legal, valuable
tools unless you happen to put them together and use them in an armed
robbery. If you stick to legal downloads, you'll be doing nothing wrong; in fact,
you'll be doing everything right by helping the Internet as a
whole to operate more efficiently!
Photo: ClearBits™ (formerly LegalTorrents) is a website that helps to promote legal file-sharing using BitTorrent clients.