
Flash memory
by Chris Woodford. Last updated: November 14, 2022.
Imagine if your memory worked only
while you were awake. Every
morning when you got up, your mind would be completely blank! You'd
have to relearn everything you ever knew before you could do anything.
It sounds like a nightmare, but it's exactly the problem computers
have. Ordinary computer chips "forget" everything (lose their entire
contents) when the power is switched off. Large personal computers get
around this by having powerful magnetic memories called
hard drives, which can remember things
whether the power is on or off. But smaller, more portable devices, such as digital cameras and MP3 players,
need smaller and more portable memories. They use special chips called
flash memories to store information permanently.
Flash memories are clever—but rather complex too. How exactly do they
work?
Photo: Turn a digital camera's flash memory
card over and you can see the electrical contacts that let the camera
connect to the memory chip inside the protective plastic case.
How computers store information
Computers are electronic
machines that process information in
digital format. Instead of understanding words and numbers, as people
do, they change those words and numbers into strings of zeros and ones
called binary (sometimes referred to as "binary code").
Inside a computer, a single letter "A" is stored as eight binary numbers: 01000001. In fact, all the basic characters on
your keyboard (the letters A–Z in upper and lower case, the numbers
0–9, and the symbols) can be represented with different combinations of
just eight binary numbers. A question mark (?) is stored as 00111111, a
number 7 as 00110111, and a left bracket ([) as 01011011. Virtually all
computers know how to represent information with this "code," because
it's an agreed, worldwide standard. It's called ASCII
(American Standard Code for Information Interchange).
Computers can represent information with patterns of zeros and ones,
but how exactly is the information stored inside their memory chips? It
helps to think of a slightly different example. Suppose you're standing
some distance away, I want to send a message to you, and I have only eight flags with
which to do it. I can set the flags up in a line and then send each
letter of the message to you by raising and lowering a different
pattern of flags. If we both understand the ASCII code, sending
information is easy. If I raise a flag, you can assume I mean a number
1, and if I leave a flag down, you can assume I mean a number 0. So if
I show you this pattern:

You can figure out that I am sending you the binary number 00110111,
equivalent to the decimal number 55, and so signaling the character "7" in ASCII.
What does this have to do with memory? It shows that you can store,
or represent, a character like "7" with something like a flag that can
be in two places, either up or down. A computer memory is effectively a
giant box of billions and billions of flags, each of which can be
either up or down. They're not really flags, though—they are
microscopic switches called transistors
that can be either on or off. It takes eight switches to store a character
like A, 7, or [. It takes one transistor to store each binary digit (which is
called a bit). In most computers, eight of these bits are collectively
called a byte. So when you hear people say a
computer has so many megabytes of memory, it means it can store roughly
that many million characters of information (mega means million; giga means thousand million or billion).
What is flash memory?

Photo: A typical USB memory stick—and the flash memory chip you'll find inside if you take it apart (the large black rectangle on the right).
Ordinary transistors are electronic switches turned on or off by
electricity—and that's both their
strength and their weakness. It's a
strength, because it means a computer can store information simply by
passing patterns of electricity through its memory circuits. But it's a
weakness too, because as soon as the power
is turned off, all the transistors revert to their original states—and
the computer loses all the information it has stored. It's like a giant
attack of electronic amnesia!
Memory that "forgets" when the power goes off is called Random Access Memory (RAM). There is another kind
of memory called Read-Only Memory (ROM) that
doesn't suffer from this problem. ROM chips are pre-stored with
information when they are manufactured, so they don't "forget" what
they know when the power is switched on and off. However, the
information they store is there permanently: they can never be
rewritten again. In practice, a computer uses a mixture of different
kinds of memory for different purposes. The things it needs to remember
all the time—like what to do when you first switch it on—are stored on
ROM chips. When you're working on your computer and it needs temporary
memory for processing things, it uses RAM chips; it doesn't matter that
this information is lost later. Information you want
a computer to remember indefinitely is stored on its hard drive. It takes longer
to read and write information from a hard drive than from memory chips,
so hard drives are not generally used as temporary memory. In gadgets
like digital cameras and small MP3 players,
flash memory is used instead of a hard drive. It has certain things in
common with both RAM and ROM. Like ROM, it remembers information when
the power is off; like RAM, it can be erased and rewritten over
and over again.

Photo: Apple iPods, past and present. The white one on the left is an old-style classic iPod with a 20GB hard drive memory. The newer black model on the right has a 32GB flash memory, which makes it lighter, thinner, more robust (less likely to die if you drop it), and less power hungry.
How flash memory works—the simple explanation

Photo: A typical secure digital (SD) card from a digital camera. Inside, there's a flash memory chip. How does it work? Read on!
Flash works using an entirely different kind of transistor that
stays switched on (or switched off) even when the power is turned off.
A normal transistor has three connections (wires that control it)
called the source, drain, and
gate. Think of a transistor as a pipe through which
electricity can flow as though it's water. One end of the pipe (where the water flows in) is called the
source—think of that as a tap or faucet. The other end of the pipe is
called the drain—where the water drains out and flows away. In between
the source and drain, blocking the pipe, there's a gate. When the gate
is closed, the pipe is shut off, no
electricity can flow and the transistor is off. In this state, the
transistor stores a
zero. When the gate is opened, electricity flows, the transistor is on,
and it stores a one. But when the power is turned off, the transistor
switches off too. When you switch the power back on, the transistor is
still off, and since you can't know whether it was on or off before the
power was removed, you can see why we say it "forgets" any information
it stores.
A flash transistor is different because it has a second ("floating") gate above the first one. When the
gate opens, some electricity leaks up the first gate and stays there,
in between the first gate and the second one.
Even if the power is turned off, the electricity is still there between
the two gates. Now if you try to pass a current through, the stored
electricity stops it from flowing so, in this state, the transistor stores a zero.
If you clear the stored electricity, current can flow through
once again; in this state, the transistor stores a one.
That's how a flash transistor stores its information whether
the power is on or off..
How flash memory works—a more complex explanation
That's a very glossed over, highly simplified explanation of
something that's extremely complex. If you want more detail, it helps
if you read our article about transistors
first, especially the bit at the bottom about MOSFETs—and then read on.
The transistors in flash memory are like MOSFETs only they have two
gates on top instead of one. This is what a flash transistor looks like
inside. You can see it's an n-p-n sandwich with two gates on top, one
called a control gate and one called a floating gate. The two gates are
separated by oxide layers through which current cannot normally pass:

How do we use this to store data? Both the source and the
drain regions are rich in electrons (because they're made of n-type
silicon), but electrons cannot flow from source to drain because of the
electron deficient, p-type material between them. If we apply a
positive voltage to the transistor's two contacts, called the bitline
and the wordline, electrons get pulled in a rush from source to drain. A
few also manage to wriggle through the oxide layer by a process called
tunneling and get stuck on the floating gate:

The electrons will stay on the floating gate indefinitely,
even when the positive voltages are removed and whether there is power
supplied to the circuit or not. If we disconnect the
positive voltages from the bitline and wordline and try to pass a current through the transistor,
from source to drain, none will flow: the electrons on the floating gate will stop it.
So, in this state, we say the transistor is storing a zero.
The electrons on the floating gate can be flushed out by
putting a negative voltage on the wordline. This repels the electrons
back the way they came, clearing the floating gate and allowing
a current to flow through the transistor once again.
In this state, we say the transistor is storing a one.
Not an easy process to understand, but that's how flash memory works its magic!
How long does flash memory last?
Flash memory eventually wears out because its floating gates take longer to work after
they've been used a certain number of times. It's very widely quoted that flash memory degrades after it's been written and rewritten about "10,000 times," but that's misleading. According to a 1990s flash patent by Steven Wells of Intel, "although switching begins to take longer after approximately ten thousand switching operations, approximately one hundred thousand switching operations are required before the extended switching time has any affect on system operation."
Whether it's 10,000 or 100,000, it's usually fine for a USB stick or the SD memory card in
a digital camera you use once a week, but less satisfactory for the main storage in a computer, cellphone, or other gadget that's in daily use for years on end. One practical way around the limit is for the operating system to ensure that different bits of flash memory are used each time information is erased and stored (technically, this is called wear-leveling), so no bit is erased too often. In practice, modern computers might simply ignore and "tiptoe round" the bad parts of a flash memory chip, just like they can ignore bad sectors on a hard drive, so the real practical lifetime limit of flash drives is much higher: somewhere between 10,000 and 1 million cycles. Cutting-edge flash devices
have been demonstrated that survive for 100 million cycles or more.
Who invented flash memory?
Flash was originally developed by Toshiba electrical engineer
Fujio Masuoka, who filed
US Patent 4,531,203
on the idea with colleague Hisakazu Iizuka back in 1981. Originally known as simultaneously erasable EEPROM (Electrically Erasable Programmable Read-Only Memory), it earned the nickname "flash" because it could be instantly erased and reprogrammed—as fast as a camera flash. At that time, state-of-the-art erasable memory chips (ordinary EPROMS) took 20 minutes or so to wipe for reuse with a beam of ultraviolet light, which meant they needed expensive, light-transparent packaging. Cheaper, electrically erasable EPROMS did exist, but used a bulkier and less efficient design that required two transistors to store each bit of information. Flash memory solved these problems.

Photo: Above: Erasable memory before flash: EPROM chips had little round windows in the
top through which you could erase their contents using a lengthy blast of UV light.
If you're interested, this one's a 32KB (kilobyte) AMD AM27C256 dating from 1986,
so it has about 1000 times less storage capacity than even the small 32MB (megabyte) SD card in the top photo.
Below: A closeup of the UV-transparent window and the chip wired inside the package.

Toshiba released the first flash chips in 1987, but most of us didn't come across the technology for another decade or so, after SD memory cards first appeared in 1999 (jointly supported by Toshiba, Matsushita, and SanDisk). SD cards allowed digital cameras to record hundreds of photos and made them far more useful than older film cameras, which were limited to taking about 24–36 pictures at a time. Toshiba launched the first digital music player using an SD card the following year. It took Apple a few more years to catch up and fully embrace flash technology in its own digital music player, the iPod. Early "classic" iPods all used hard drives, but the release of the tiny iPod Shuffle in 2005 marked the beginning of a gradual switchover, and all modern iPods and iPhones now use flash memory instead.
What's the future for flash memory?
Flash has rapidly overtaken magnetic storage over the last decade or so; in everything from
supercomputers and laptops to
smartphones and iPods, hard drives have increasingly given way to fast, compact SSDs (solid-state drives) based on flash chips. That trend has been driven by—and helped to drive—another one: the shift from desktop computers and landline phones to mobile devices (smartphones and tablets) and cellphones, which need ultra-compact, high-density, extremely reliable memories that can withstand the stresses and strains of being thrown around in our backpacks and briefcases. These trends are now favoring 3D flash ("stacked") technology, developed in the early 2000s and formally launched by Samsung in 2013, in which dozens of different layers of memory cells can be grown on the same silicon wafer to increase storage capacity (just like the multiple floors of a high-rise office block let us pack more offices into the same area of land). Instead of using floating gates (as described above), 3D flash uses an alternative (though sometimes less reliable) technique called charge-trap, which allows us to engineer much higher capacity memories in the same amount of space, well into the terabit (Tbit) scale (1 trillion bits = 1,000,000,000,000 bits).