Roll back time a half-century or so and the
smallest computer in the world was a gargantuan machine that filled a
room. When transistors and
integrated circuits were developed,
computers could pack the same power into microchips as big as your
fingernail. So what if you build a room-sized computer today and fill
it full of those same chips? What you get is a supercomputer—a
computer that's millions of times faster than a desktop PC and
capable of crunching the world's most complex scientific problems.
What makes supercomputers different from the machine you're using
right now? Let's take a closer look!
Photo: This is
Frontier, a scientific supercomputer based at Oak Ridge National Laboratory. It's currently the world's
most powerful machine, with some 8,730,112 processor cores, and it set a new performance record of
1.1 exaflops (1.1 million million million flops) in 2022.
Picture courtesy of Oak Ridge National Laboratory, US Department of Energy, published on Flickr in 2022
under a Creative Commons (CC BY 2.0) Licence.
Before we make a start on that question, it helps
if we understand what a computer is:
it's a general-purpose machine that
takes in information (data) by a process called input, stores and
processes it, and then generates some kind of output (result). A
supercomputer is not simply a fast or very large computer: it works
in an entirely different way, typically using parallel processing
instead of the serial processing that an ordinary computer uses.
Instead of doing one thing at a time, it does many things at once.
Serial and parallel processing
What's the difference between serial and parallel? An ordinary computer does
one thing at a time, so it does things in a distinct series of
operations; that's called serial processing. It's a bit like a
person sitting at a grocery store checkout, picking up items from the
conveyor belt, running them through the scanner, and then passing
them on for you to pack in your bags. It doesn't matter how fast you
load things onto the belt or how fast you pack them: the speed at
which you check out your shopping is entirely determined by how fast
the operator can scan and process the items, which is always one at a
time. (Since computers first appeared, most have worked by simple, serial processing,
inspired by a basic theoretical design called a Turing machine,
originally conceived by Alan Turing.)
A typical modern supercomputer works much more
quickly by splitting problems into pieces and working on many
pieces at once, which is called parallel processing.
It's like arriving at the checkout with a giant cart full of items, but
then splitting your items up between several different friends. Each
friend can go through a separate checkout with a few of the items and
pay separately. Once you've all paid, you can get together again,
load up the cart, and leave. The more items there are and the more
friends you have, the faster it gets to do things by parallel
processing—at least, in theory. Parallel processing is more like what happens in our brains.
Artwork: Serial and parallel processing: Top: In serial processing, a problem is tackled one step at a time by a single processor. It doesn't matter how fast different parts of the computer are (such as the input/output or memory), the job still gets done at the speed of the central processor in the middle.
Bottom: In parallel processing, problems are broken up into components, each of which is handled by a separate processor. Since the processors are working in parallel, the problem is usually tackled more quickly even if the processors work at the same speed as the one in a serial system.
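If you're curious what that difference looks like in real code, here's a minimal Python sketch of my own (purely illustrative; it isn't code from any actual supercomputer) that adds up a long list of numbers first serially, then by sharing the work between several worker processes, rather like the friends at the checkout:

from multiprocessing import Pool

def add_chunk(chunk):
    # Each worker process handles its own chunk, like one friend
    # going through one checkout with a share of the shopping.
    return sum(chunk)

if __name__ == "__main__":
    numbers = list(range(10_000_000))   # the "cart" of items to process

    # Serial processing: one processor does everything, one item at a time.
    serial_total = sum(numbers)

    # Parallel processing: split the cart into chunks and hand each chunk
    # to a separate worker process, all running at the same time.
    workers = 4
    chunk_size = len(numbers) // workers
    chunks = [numbers[i:i + chunk_size]
              for i in range(0, len(numbers), chunk_size)]

    with Pool(workers) as pool:
        partial_totals = pool.map(add_chunk, chunks)

    parallel_total = sum(partial_totals)    # "meet up again" at the end
    print(serial_total == parallel_total)   # True: same answer, different route

For a job as small as this, the parallel version may actually run slower, because starting the workers and sharing out the data takes time of its own; that overhead is exactly the drawback discussed further down this page.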
Why do supercomputers use parallel processing?
Most of us do quite trivial, everyday things with
our computers that don't tax them in any way: looking at web pages,
sending emails, and writing documents use very little of the
processing power in a typical PC. But if you try to do something more
complex, like changing the colors on a very large digital photograph,
you'll know that your computer does, occasionally, have to work hard
to do things: it can take a minute or so to do really complex
operations on very large digital photos. If you play computer games, you'll be
aware that you need a computer with a fast processor chip and quite a
lot of "working memory" (RAM), or things really slow down. Add a
faster processor or double the memory and your computer will speed up
dramatically—but there's still a limit to how fast it will go: one
processor can generally only do one thing at a time.
Now suppose you're a scientist charged with
forecasting the weather, testing a new cancer drug, or modeling how
the climate might be in 2050. Problems like that push even the
world's best computers to the limit. Just like you can upgrade a
desktop PC with a better processor and more memory, so you can do the
same with a world-class computer. But there's still a limit to how
fast a processor will work and there's only so much difference more
memory will make. The best way to make a difference is to use
parallel processing: add more processors, split your problem into
chunks, and get each processor working on a separate chunk of your
problem in parallel.
Chart: Who has the most supercomputers? About three quarters of the world's 500 most powerful machines
can be found in just five countries: China (32.4%), the USA (25.4%), Germany (6.8%), Japan (6.2%), and France (4.8%).
For each country, the series of bars shows its total number of supercomputers in 2017, 2018, 2020, and 2021. The block on the right of each series, with a bold number in red above, shows the current figure for November 2022.
Although China has the most machines by far, the aggregate performance of the US machines is significantly higher (representing almost half the world's supercomputer total performance, according to TOP500's analysis). Drawn in March 2023 using the latest data from TOP500, November 2022.
Massively parallel computers
Once computer scientists had figured out the basic
idea of parallel processing, it made sense to add more and more
processors: why have a computer with two or three processors when you
can have one with hundreds or even thousands? Since the 1990s,
supercomputers have routinely used many thousands of processors in what's
known as massively parallel processing; at the time I'm
updating this, in March 2023, the supercomputer with more processors
than any other in the world, the Sunway TaihuLight, has 40,960 processing modules,
each with 260 processor cores, which means 10,649,600 processor cores in total!
(It's currently the world's seventh most powerful machine.)
Unfortunately, parallel processing comes with a
built-in drawback. Let's go back to the supermarket analogy. If you
and your friends decide to split up your shopping to go through
multiple checkouts at once, the time you save by doing this is
obviously reduced by the time it takes you to go your separate ways,
figure out who's going to buy what, and come together again at the end. We can guess, intuitively, that
the more processors there are in a supercomputer, the harder it will probably be to
break up problems and reassemble them to make the most efficient use of parallel processing. Moreover,
there will need to be some sort of centralized management system or coordinator to split the problems, allocate and control the workload between all the different processors, and reassemble the results, which will also carry an overhead.
With a simple problem like paying for a cart of shopping, that's not really an issue. But imagine
if your cart contains a billion items and you have 65,000 friends helping you with the checkout.
If you have a problem (like forecasting the world's weather for next week) that seems to split neatly into separate sub-problems
(making forecasts for each separate country), that's one thing. Computer scientists refer to complex problems like this, which can be split up easily into independent pieces, as embarrassingly parallel computations (EPC)—because
they are trivially easy to divide.
But most problems don't cleave neatly that way. The weather in one country depends to a great extent on the weather in
other places, so making a forecast for one country will need to take account of forecasts elsewhere. Often, the parallel processors
in a supercomputer will need to communicate with one another as they solve their own bits of the problems. Or one processor might have to wait for results from another before it can do a particular job. A typical problem worked on by a massively parallel computer
will thus fall somewhere between the two extremes of a completely serial problem (where every single step has to be done in an exact sequence) and an embarrassingly parallel one; while some parts can be solved in parallel, other parts will need to be solved in a serial way. A law of computing (known as Amdahl's law, for computer pioneer Gene Amdahl) explains how the part of the problem that remains serial effectively determines the maximum improvement in speed you can get from using a parallel system.
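To put rough numbers on that, here's a tiny Python sketch (the 95 percent figure is just an illustrative assumption, not a measurement of any real machine) that works out the theoretical speedup Amdahl's law predicts for a job where most, but not all, of the work can be done in parallel:

def amdahl_speedup(parallel_fraction, processors):
    # Amdahl's law: the serial part (1 - parallel_fraction) never gets any
    # faster, so it caps the overall gain however many processors you add.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# Suppose 95 percent of a job can run in parallel...
for n in (10, 100, 1000, 10000):
    print(f"{n:>6} processors -> speedup {amdahl_speedup(0.95, n):.1f}x")

# The speedup climbs from about 6.9x with 10 processors to barely 20x with
# 10,000, because the 5 percent that stays serial limits the whole job to a
# maximum speedup of 1 / 0.05 = 20x.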
You can make a supercomputer by filling a giant
box with processors and getting them to cooperate on tackling a
complex problem through massively parallel processing. Alternatively,
you could just buy a load of off-the-shelf PCs, put them in the same
room, and interconnect them using a very fast local area
network (LAN) so they work in a broadly similar way. That kind of
supercomputer is called a cluster.
Google does its web searches for users with clusters of
off-the-shelf computers dotted in data centers around the world.
Photo: Supercomputer cluster:
NASA's Pleiades ICE Supercomputer is a cluster of 241,108 cores made from 158 racks of Silicon Graphics (SGI) workstations.
That means it can easily be extended to make a more powerful machine:
it's now about 15 times more powerful than when it was first built over a decade ago.
As of March 2023, it's the world's 97th most powerful machine
(compared to 2021, when it was 70th, and 2020, when it stood at number 40).
Picture by Dominic Hart courtesy of
NASA Ames Research Center.
A grid is a supercomputer similar to a
cluster (in that it's made up of separate computers), but the
computers are in different places and connected through the Internet
(or other computer networks). This is an example of distributed
computing, which means that the power of a computer is spread across multiple locations
instead of being located in one, single place (that's sometimes called centralized computing).
Grid supercomputing comes in two main flavors. In
one kind, we might have, say, a dozen powerful mainframe computers in
universities linked together by a network to form a supercomputer
grid. Not all the computers will be actively working in the grid all
the time, but generally we know which computers make up the network.
The CERN Worldwide LHC Computing Grid, assembled to process data from the LHC (Large Hadron Collider) particle accelerator, is an example of this kind of system. It consists of two tiers of computer systems, with 11 major (tier-1) computer centers linked directly
to the CERN laboratory by private networks, which are themselves linked to 160 smaller (tier-2) computer centers around the world
(mostly in universities and other research centers), using a combination of the Internet and private networks.
The other kind of grid is much more ad-hoc and
informal and involves far more individual computers—typically
ordinary home computers. Have you ever taken part in an online,
volunteer computing project of this kind?
If so, you've allowed your computer to be used as part of an informal,
ad-hoc supercomputer grid. This kind of approach is called
opportunistic supercomputing, because it takes advantage of whatever
computers just happen to be available at the time. Grids like this,
which are linked using the Internet, are best for solving
embarrassingly parallel problems that easily break up into
completely independent chunks.
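To make "completely independent chunks" a bit more concrete, here's a toy Python sketch (my own illustration, not code from any real volunteer-computing project) that splits one big job into work units which could, in principle, be handed to whatever computers happen to be available, because no unit ever needs to talk to another:

def is_prime(n):
    # Trial division: slow but simple, and fine for a toy example.
    if n < 2:
        return False
    factor = 2
    while factor * factor <= n:
        if n % factor == 0:
            return False
        factor += 1
    return True

def work_unit(start, stop):
    # One self-contained chunk: count the primes between start and stop.
    return sum(1 for n in range(start, stop) if is_prime(n))

if __name__ == "__main__":
    # Split the range 0-100,000 into 100 independent work units.
    units = [(i, i + 1_000) for i in range(0, 100_000, 1_000)]

    # In a real grid, each pair of numbers would be sent to a different
    # volunteer machine; here we simply run the units one by one and merge.
    results = [work_unit(start, stop) for start, stop in units]
    print("Primes found:", sum(results))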
If you routinely use a laptop (and sit it on your
lap, rather than on a desk), you'll have noticed how hot it gets.
That's because almost all the electrical energy that feeds in through the power cable
is ultimately converted to heat energy.
And it's why most computers need a cooling system of some kind, from a simple fan whirring away inside the case (in a home PC) to giant
air-conditioning units (in large mainframes).
Overheating (or cooling, if you prefer) is a major
issue for supercomputers. The early Cray supercomputers had elaborate
cooling systems—and the famous Cray-2 even had its own separate cooling tower,
which pumped a kind of cooling "blood" (Fluorinert™) around the
cases to stop them overheating.
Photo: A Cray-2 supercomputer (left), photographed at NASA in 1989,
with its own personal Fluorinert cooling tower (right). State of the art in the mid-1980s, this particular machine could perform a half-billion calculations per second. Picture courtesy of NASA Langley Research Center
and Internet Archive.
Modern supercomputers tend to be either air-cooled
(with fans) or liquid cooled (with a coolant circulated in a similar
way to refrigeration). Either way, cooling systems
translate into very high energy use and very expensive
bills; they're also very bad environmentally. Some supercomputers
deliberately trade off a little performance to reduce their energy
consumption and cooling needs and achieve lower environmental impact.
What software do supercomputers run?
You might be surprised to discover that most
supercomputers run fairly ordinary operating systems much like the
ones running on your own PC, although that's less surprising when
we remember that a lot of modern supercomputers are actually clusters of off-the-shelf computers
or workstations. The most common supercomputer operating system used to
be Unix, but it's now been superseded by Linux (an open-source,
Unix-like operating system originally developed by Linus Torvalds and
thousands of volunteers). Since supercomputers generally work on
scientific problems, their application programs are sometimes written in traditional scientific programming languages
such as Fortran, as well as popular, more modern languages such as
C and C++.
What do supercomputers actually do?
Photo: Supercomputers can help us crack the most complex scientific problems, including modeling Earth's climate. Picture courtesy of NASA on the Commons.
As we saw at the start of this article, one
essential feature of a computer is that it's a general-purpose
machine you can use in all kinds of different ways: you can send
emails on a computer, play games, edit photos, or do any number of
other things simply by running a different program. If you're using
a high-end cellphone, such as an Android phone, an iPhone,
or an iPod Touch, what you have is a powerful little pocket computer that can run programs by loading different "apps"
(applications), which are simply computer programs by another name. Supercomputers are slightly different.
Typically, supercomputers have been used for
complex, mathematically intensive scientific problems, including
simulating nuclear missile tests, forecasting the weather, simulating
the climate, and testing the strength of encryption (computer
security codes). In theory, a general-purpose supercomputer can be
used for absolutely anything.
While some supercomputers are general-purpose
machines that can be used for a wide variety of different scientific
problems, some are engineered to do very specific jobs. Two of the
most famous supercomputers of recent times were engineered this way.
IBM's Deep Blue machine from 1997 was built specifically to play
chess (against Russian grandmaster Garry Kasparov), while its later
Watson machine (named for IBM's founder, Thomas Watson, and his son) was engineered to play the game Jeopardy. Specially designed machines like
this can be optimized for particular problems; so, for example, Deep
Blue would have been designed to search through huge databases of
potential chess moves and evaluate which move was best in a
particular situation, while Watson was optimized to analyze tricky
general-knowledge questions phrased in (natural) human language.
How powerful are supercomputers?
Look through the specifications of ordinary
computers and you'll find their performance is usually quoted in
MIPS (million instructions per second),
which is how many fundamental programming commands (read, write, store, and so on) the processor can manage. It's easy to
compare two PCs by comparing the number of MIPS they can handle (or even their processor speed, which is typically rated in gigahertz or megahertz).
Supercomputers are rated a different way. Since
they're employed in scientific calculations, they're measured
according to how many floating point operations per second (FLOPS)
they can do, which is a more meaningful measurement based on what they're actually trying to do
(unlike MIPS, which is a measurement of how they are trying to do it). Since supercomputers were first developed, their
performance has been measured in successively greater numbers of FLOPS, as the table below illustrates:
100 = 10² (a hundred FLOPS)
1,000 = 10³ (a kiloflop, KFLOPS): IBM 704
1,000,000 = 10⁶ (a megaflop, MFLOPS): Cray-1
1,000,000,000 = 10⁹ (a gigaflop, GFLOPS): Cray-2
1,000,000,000,000 = 10¹² (a teraflop, TFLOPS): ASCI Red
1,000,000,000,000,000 = 10¹⁵ (a petaflop, PFLOPS): Jaguar
1,000,000,000,000,000,000 = 10¹⁸ (an exaflop, EFLOPS): Frontier
The example machines listed in the table are described in more detail in the chronology, below.
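For a very rough feel for what a FLOPS rating means, here's a small Python sketch (a crude toy benchmark of my own, nothing like the carefully tuned LINPACK test used for official TOP500 rankings) that times a run of floating point multiply-and-add operations and estimates how many it managed per second:

import time

operations = 5_000_000
x = 1.000001

start = time.perf_counter()
total = 0.0
for _ in range(operations):
    total = total * x + 1.0        # one multiply and one add per pass
elapsed = time.perf_counter() - start

# Two floating point operations per pass through the loop.
flops = (2 * operations) / elapsed
print(f"Roughly {flops:,.0f} FLOPS ({flops / 1e6:.1f} megaflops) on this machine")

Run on an ordinary PC, a naive loop like this typically manages somewhere in the tens of megaflops, mostly because interpreted Python is slow; the same hardware can reach gigaflops with tuned numerical libraries, which is why official supercomputer figures come from standard benchmarks rather than casual tests like this one.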
Who invented supercomputers? A supercomputer timeline
Study the history of computers and you'll notice something straight away:
no single individual can lay claim to inventing these amazing machines. Arguably, that's much less true of supercomputers,
which are widely acknowledged to owe a huge debt to the work of a single man, Seymour Cray (1925–1996).
Here's a whistle-stop tour of supercomputing, BC and AC—before and after Cray!
Photo: A distinctive Cray X-MP supercomputer (the C-shaped base unit at the bottom, with handy seats!) and two SSD expansion towers dating from c.1989. Picture courtesy of NASA Ames Research Center and Internet Archive: NASA Image Exchange (NIX).
1946: John Mauchly and J. Presper Eckert construct ENIAC (Electronic Numerical Integrator And Computer) at the University of Pennsylvania. The first general-purpose, electronic computer, it's about 25m (80 feet) long and weighs 30 tons and, since it's deployed on military-scientific problems, is arguably the very first scientific supercomputer.
1953: IBM develops its first general-purpose mainframe computer, the IBM 701 (also known as the Defense Calculator), and sells about 20 of the machines to a variety of government and military agencies. The 701 is arguably the first off-the-shelf supercomputer. IBM engineer Gene Amdahl later redesigns the machine to make the
IBM 704, a machine capable of 5 KFLOPS (5000 FLOPS).
1956: IBM develops the Stretch supercomputer for Los Alamos National Laboratory. It remains the world's fastest computer until 1964.
1957: Seymour Cray co-founds Control Data Corporation (CDC) and pioneers fast, transistorized, high-performance computers, including the CDC 1604 (announced 1958) and 6600 (released 1964), which seriously challenge IBM's dominance of mainframe computing.
1972: Cray leaves Control Data and founds Cray Research to develop high-end computers—the first true supercomputers. One of his key ideas is to reduce the length of the connections between components inside his machines to help make them faster. This is partly why early Cray computers are C-shaped, although the unusual circular design (and bright blue or red cabinets) also helps to distinguish them from competitors.
1976: First Cray-1 supercomputer is installed at Los Alamos National Laboratory. It manages a speed of about 160 MFLOPS.
1985: Cray releases an even faster model, the four-processor, 1.9 GFLOP Cray-2. Where wire connections in the Cray-1 were a maximum of 120cm (~4 ft) long, in the Cray-2 they are a mere
41cm (16 inches).
1994: Thinking Machines files for bankruptcy protection.
1995: Cray Computer runs into financial difficulties and files for bankruptcy protection. Tragically, Seymour Cray dies on October 5, 1996, after sustaining injuries in a car accident.
1996: Cray Research (Cray's original company) is purchased by Silicon Graphics.
1997: ASCI Red, a supercomputer made from Pentium processors by Intel and Sandia National Laboratories, becomes the world's first teraflop (TFLOP) supercomputer.
1997: IBM's Deep Blue supercomputer beats Garry Kasparov at chess.
2008: The Jaguar supercomputer built by Cray Research and Oak Ridge National Laboratory becomes the world's first petaflop (PFLOP) scientific supercomputer. Briefly the world's fastest computer, it is soon superseded by machines from Japan and China.
2011–2013: Jaguar is extensively (and expensively) upgraded, renamed Titan, and briefly becomes the world's fastest supercomputer before losing the top slot to China's Tianhe-2.
(Titan is finally decommissioned in 2019.)
2014: Mont-Blanc, a European consortium, announces plans to build an exaflop (10¹⁸ FLOP) supercomputer from energy efficient smartphone and tablet processors.
2018: In June 2018, Oak Ridge's new Summit 200-petaflop supercomputer recaptures the number one spot in the
TOP500 ranking of the world's 500 fastest supercomputers for the United States.
It's eight times more powerful than its predecessor, Titan.
Photo: Summit, based at Oak Ridge National Laboratory. At the time of writing in March 2023, it's the world's fifth most powerful machine (and the second fastest in the United States) with some 2,414,592 processor cores. Picture courtesy of Oak Ridge National Laboratory, US Department of Energy, published on Flickr in 2018 under a Creative Commons Licence.
2020: Japan's Fugaku becomes the world's top supercomputer, with a blistering performance of 415.5 petaflops
(almost three times better than Summit, the previous record holder). Fugaku
is installed at RIKEN Center for Computational Science (R-CCS) in Kobe, Japan.
2022: Oak Ridge reaches a major milestone with Frontier, the first supercomputer to achieve 1.1
exaflops (1.1 quintillion—or 1.1 million million million—calculations per second).
Find out more
Supercomputers: A great online exhibition of supercomputers, from the earliest days to today, from The Computer History Museum.
Top 500 Supercomputer Sites: The famous supercomputer "beauty" contest! This site has been keeping track of the world's most powerful computers since 1993. You'll find a regularly updated list of which computers are currently the fastest (including detailed specifications) and lots of interesting background articles.
Computer Architecture: From Microprocessors to Supercomputers by Behrooz Parhami. Oxford University Press, 2010. A large, detailed textbook (~600 pages) covering architecture in detail, in theory and practice. (Behrooz Parhami is Professor of Electrical and Computer Engineering at the University of California, Santa Barbara.)
by Editors of Scientific American. Hachette, 2002. An excellent, deliberately accessible 176-page introduction. Good for older teens and adults who want a non-technical overview.
The Supercomputers by Charles A. Jortberg. Abdo & Daughters, 1997. A 48-page introduction to powerful computers for ages 9–12.
Jaguar, What Are You Working on Today? by Clay Dillow. Popular Science, 31 October 2011. How the Cray/ORNL Jaguar supercomputer is being upgraded to become the world's fastest supercomputer once again.
Seymour Cray: Supercomputer legend: A great little tour of the Cray-1, Cray-2, and Cray-3 supercomputers, and their cooling systems, by one of the (excellent) tour guides at the Computer History Museum. (3.5 minutes.)