The Evolution of the Microprocessor Part 2

by Dr. Jaydeep T. Vagh

The Microprocessor

The integrated circuit was not an immediate commercial success. By 1960 the computer had gone from a laboratory device to big business, with thousands in operation worldwide and more than half a billion dollars in sales in 1960 alone. 2 International Business Machines (IBM®) had become the leading computer manufacturer and had just begun shipping its first all-transistorized computer. These machines still bore little resemblance to the computers of today. Costing millions, these “mainframe” computers filled rooms and required teams of operators to man them. Integrated circuits would reduce the cost of assembling these computers, but not nearly enough to offset their high prices compared to discrete transistors. Without a large market, the volume production that would bring integrated circuit costs down couldn’t happen. Then, in 1961, President
Kennedy challenged the United States to put a man on the moon before the end of the decade. To do this would require extremely compact and light computers, and cost was not a limitation. For the next 3 years, the newly created space agency, NASA, and the U.S. Defense Department purchased every integrated circuit made and demand soared.

The key to making integrated circuits cost-effective enough for the general marketplace was incorporating more transistors into each chip. The size of early MOSFETs was limited by the problem of making the gate cross exactly between the source and drain. Adding dopants to form the source and drain regions requires very high temperatures that would melt a metal gate wire. This forced the metal gates to be formed after the source and drain, and ensuring the gates were properly aligned was a difficult problem. In 1967, Federico Faggin at Fairchild Semiconductor experimented with making the gate wires out of silicon. Because the silicon was deposited on top of an oxide layer, it was not a single crystal

but a jumble of many small crystals called polycrystalline silicon,
polysilicon, or just poly. By forming polysilicon gates before adding dopants,
the gate itself would determine where the dopants would enter the
silicon crystal. The result was a self-aligned MOSFET. The resistance
of polysilicon is much higher than a metal conductor, but with heavy
doping it is low enough to be useful. MOSFETs are still made with poly
gates today.

The computers of the 1960s stored their data and instructions in
“core” memory. These memories were constructed of grids of wires with
metal donuts threaded onto each intersection point. By applying current
to one vertical and one horizontal wire a specific donut or “core” could
be magnetized in one direction or the other to store a single bit of infor-
mation. Core memory was reliable but difficult to assemble and oper-
ated slowly compared to the transistors performing computations. A
memory made out of transistors was possible but would require thou-
sands of transistors to provide enough storage to be useful. Assembling
this by hand wasn’t practical, but the transistors and connections needed
would be a simple pattern repeated many times, making semiconductor
memory a perfect market for the early integrated circuit business.

In 1968, Bob Noyce and Gordon Moore left Fairchild Semiconductor
to start their own company focused on building products from inte-
grated circuits. They named their company Intel ® (from INTegrated
ELectronics). In 1969, Intel began shipping the first commercial inte-
grated circuit using MOSFETs, a 256-bit memory chip called the 1101.
The 1101 memory chip did not sell well, but Intel was able to rapidly
shrink the size of the new silicon gate MOSFETs and add more tran-
sistors to their designs. One year later Intel offered the 1103 with 1024
bits of memory, and this rapidly became a standard component in the
computers of the day.

Although focused on memory chips, Intel received a contract to design
a set of chips for a desktop calculator to be built by the Japanese com-
pany Busicom. At that time, calculators were either mechanical or used
hard-wired logic circuits to do the required calculations. Ted Hoff was
asked to design the chips for the calculator and came to the conclusion
that creating a general purpose processing chip that would read instruc-
tions from a memory chip could reduce the number of logic chips
required. Stan Mazor detailed how the chips would work together and
after much convincing Busicom agreed to accept Intel’s design. There
would be four chips altogether: one chip controlling input and output
functions, a memory chip to hold data, another to hold instructions,
and a central processing unit that would eventually become the world’s
first microprocessor.

The computer processors that powered the mainframe computers of the
day were assembled from thousands of discrete transistors and logic chips.

This was the first serious proposal to put all the logic of a computer
processor onto a single chip. However, Hoff had no experience with
MOSFETs and did not know how to make his design a reality. The
memory chips Intel was making at the time were logically very simple
with the same basic memory cell circuit repeated over and over. Hoff ’s
design would require much more complicated logic and circuit design
than any integrated circuit yet attempted. For months no progress was
made as Intel struggled to find someone who could implement Hoff’s idea.

In April 1970, Intel hired Faggin, the inventor of the silicon gate
MOSFET, away from Fairchild. On Faggin’s second day at Intel, Masatoshi
Shima, the engineering representative from Busicom, arrived from Japan
to review the design. Faggin had nothing to show him but the same plans
Shima had already reviewed half a year earlier. Shima was furious, and
Faggin finished his second day at a new job already 6 months behind
schedule. Faggin began working at a furious pace with Shima helping
to validate the design, and amazingly by February 1971 they had all four
chips working. The chips processed data 4 bits at a time and so were
named the 4000 series. The fourth chip of the series was the first micro-
processor, the Intel 4004.

The 4004 contained 2300 transistors and ran at a clock speed of 740 kHz,
executing on average about 60,000 instructions per second. 3 This gave it
the same processing power as early computers that had filled entire
rooms, but on a chip that was only 24 mm². It was an incredible engi-
neering achievement, but at the time it was not at all clear that it had a
commercial future. The 4004 might match the performance of the fastest
computer in the world in the late 1940s, but the mainframe computers
of 1971 were hundreds of times faster. Intel began shipping the 4000
series to Busicom in March 1971, but the calculator market had become
intensely competitive and Busicom was unenthusiastic about the high cost
of the 4000 series. To make matters worse, Intel’s contract with Busicom
specified Intel could not sell the chips to anyone else. Hoff, Faggin, and
Mazor pleaded with Intel’s management to secure the right to sell to
other customers. Bob Noyce offered Busicom a reduced price for the 4000
series if they would change the contract, and desperate to cut costs in order
to stay in business Busicom agreed. By the end of 1971, Intel was mar-
keting the 4004 as a general purpose microprocessor. Busicom ultimately
sold about 100,000 of the series 4000 calculators before going out of busi-
ness in 1974. Intel would go on to become the leading manufacturer in
what was, by 2003, a $27 billion a year market for microprocessors. The
incredible improvements in microprocessor performance and the growth of the
semiconductor industry since 1971 have been made possible by steady
year-after-year improvements in the manufacturing of transistors.

Intel C4004 (white ceramic package with grey traces) specifications:
  Produced: from late 1971 to 1981
  Common manufacturer(s): Intel
  Max. CPU clock rate: 740 kHz
  Min. feature size: 10 µm
  Instruction set: 4-bit, BCD oriented
  Transistors: 2,300
  Data width: 4 bits
  Address width: 12 bits (multiplexed)
  Socket(s): DIP16
  Package(s): 16-pin DIP
  Successor: Intel 4040
  Application: Busicom calculator, arithmetic manipulation

Moore’s Law

Since the creation of the first integrated circuit, the primary driving force
for the entire semiconductor industry has been process scaling. Process
scaling is shrinking the physical size of the transistors and the wires
interconnecting them, allowing more devices to be placed on each chip,
which allows more complex functions to be implemented. In 1975,
Gordon Moore observed that shrinking transistor dimensions were
allowing the number of transistors on a die to double roughly every 18

months. 4 This trend has come to be known as Moore’s law. For micro-
processors, the trend has been closer to a doubling every 2 years, but
amazingly this exponential increase has continued now for 30 years
and seems likely to continue through the foreseeable future (Fig. 1-7).
The 4004 used transistors with a feature size of 10 microns (μm).
This means that the distance from the source of the transistor to the
drain was approximately 10 μm. A human hair is around 100 μm across.
In 2003, transistors were being mass produced with a feature size of only
0.13 μm. Smaller transistors not only allow for more logic gates, but also
allow the individual logic gates to switch more quickly. This has provided
for even greater improvements in performance by allowing faster clock
rates. Perhaps even more importantly, shrinking the size of a computer
chip reduces its manufacturing cost. The cost is determined by the cost
to process a wafer, and the smaller the chip, the more that are made from
each wafer. The importance of transistor scaling to the semiconductor
industry is almost impossible to overstate. Making transistors smaller
allows for chips that provide more performance, and therefore sell for
more money, to be made at a lower cost. This is the fundamental driving
force of the semiconductor industry.
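
As a rough sketch of why smaller dies cost less, the short calculation below estimates how many dies fit on a wafer and what each one costs for two hypothetical die sizes. The wafer diameter, wafer cost, and the simple edge-loss correction are assumptions for illustration, not figures from the text.

import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Crude estimate: usable wafer area divided by die area,
    minus a correction for partial dies lost at the wafer edge."""
    wafer_radius = wafer_diameter_mm / 2
    wafer_area = math.pi * wafer_radius ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

wafer_cost = 5000.0  # dollars per processed wafer (assumed)
for die_area in (100.0, 50.0):  # mm^2: shrinking the chip halves its area
    n = dies_per_wafer(die_area)
    print(f"{die_area:5.0f} mm^2 die -> {n:4d} dies per wafer, ${wafer_cost / n:.2f} each")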

Transistor scaling

The reason smaller transistors switch faster is that although they draw
less current, they also have less capacitance. Less charge has to be
moved to switch their gates on and off. The delay of switching a gate
(T DELAY ) is determined by the capacitance of the gate (C GATE ), the total
voltage swing (V dd ), and the drain to source current (I DS ) drawn by the
transistor causing the gate to switch.
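
The equation itself is not reproduced above; a first-order form consistent with these definitions is the proportionality below (a sketch, not necessarily the exact expression used originally):

$$T_{\text{DELAY}} \propto \frac{C_{\text{GATE}} \, V_{dd}}{I_{\text{DS}}}$$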

Higher capacitance or higher voltage requires more charge to be
drawn out of the gate to switch the transistor, and therefore more cur-
rent to switch in the same amount of time. The capacitance of the gate
increases linearly with the width (W) and length (L) of the gate and
decreases linearly with the thickness of the gate oxide (T OX ).
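
Written as a formula, and introducing the gate oxide permittivity $\varepsilon_{\text{ox}}$ (a symbol not used elsewhere in the text), this is approximately:

$$C_{\text{GATE}} = \varepsilon_{\text{ox}} \frac{W L}{T_{\text{OX}}} \propto \frac{W L}{T_{\text{OX}}}$$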

The current drawn by a MOSFET increases with the device width (W ),
since there is a wider path for charges to flow, and decreases with the
device length (L), since the charges have farther to travel from source
to drain. Reducing the gate oxide thickness (T OX ) increases current,
since pushing the gate physically closer to the silicon channel allows its
electric field to better penetrate the semiconductor and draw more
charges into the channel (Fig. 1-8).

Fig. 1-8: MOSFET structure.

To draw any current at all, the gate voltage must be greater than a
certain minimum voltage called the threshold voltage (V T ). This volt-
age is determined by both the gate oxide thickness and the concentra-
tion of dopant atoms added to the channel. Current from the drain to
source increases quadratically after the threshold voltage is crossed. The
current of MOSFETs is discussed in more detail later.
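
Combining these dependencies, the ON current is often approximated by the standard square-law model below, assuming the gate is driven to the full supply voltage $V_{dd}$ (this is a generic textbook form rather than an equation quoted from this text):

$$I_{\text{DS}} \propto \frac{W}{L \, T_{\text{OX}}} \left( V_{dd} - V_{T} \right)^{2}$$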

Putting together these equations for delay and current, we can estimate how the delay depends on device dimensions and voltages.
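
Substituting the capacitance and current sketches above into the delay expression, the width $W$ and oxide thickness $T_{\text{OX}}$ cancel, leaving approximately:

$$T_{\text{DELAY}} \propto \frac{C_{\text{GATE}} \, V_{dd}}{I_{\text{DS}}} \propto \frac{L^{2} \, V_{dd}}{\left( V_{dd} - V_{T} \right)^{2}}$$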

Decreasing device lengths, increasing voltage, or decreasing threshold
voltage reduces the delay of a MOSFET. Of these methods, decreasing the
device length is the most effective, and this is what the semiconductor
industry has focused on the most. There are different ways to measure
channel length, and so when comparing one process to another, it is
important to be clear on which measurement is being compared. Channel
length is measured by three different values as shown in Fig. 1-9.
The drawn gate length (L DRAWN ) is the width of the gate wire as drawn
on the mask used to create the transistors. This is how wide the wire will
be when it begins processing. The etching process reduces the width of the
actual wire to less than what was drawn on the mask. The manufacturing
of MOSFETs is discussed in detail in Chap. 9. The width of the gate wire

at the end of processing is the actual gate length (L GATE ). Also, the source
and drain regions within the silicon typically reach some distance under-
neath the gate. This makes the effective separation between source and
drain in the silicon less than the final gate length. This distance is called
the effective channel length (L EFF ). It is this effective distance that is the
most important to transistor performance, but because it is under the
gate and inside the silicon, it cannot be measured directly. L EFF can
only be estimated by electrical measurements. Therefore, L GATE is the
value most commonly used to compare different processes.
Gate oxide thickness is also measured in more than one way as shown
in Fig. 1-10. The actual distance from the bottom of the gate to the top of
the silicon is the physical gate oxide thickness (T OX-P ). For older processes
this was the only relevant measurement, but as the oxide thickness has
been reduced, the thickness of the layer of charge on both sides of the oxide
has become significant. The electrical oxide thickness (T OX-E ) includes the
distance to the center of the sheets of charge above and below the gate oxide.
It is this thickness that determines how much current a transistor will pro-
duce and hence its performance. One of the limits to future scaling is that
increasingly large reductions in the physical oxide thickness are required
to get the same effective reduction in the electrical oxide thickness.
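
In symbols, with $t_{\text{gate}}$ and $t_{\text{inv}}$ standing for the effective thicknesses of the charge sheets in the gate electrode and the inversion layer (hypothetical labels, not symbols from the text), the relationship is roughly:

$$T_{\text{OX-E}} \approx T_{\text{OX-P}} + t_{\text{gate}} + t_{\text{inv}}$$

Because $t_{\text{gate}}$ and $t_{\text{inv}}$ do not shrink along with the oxide, each further reduction in $T_{\text{OX-P}}$ buys a smaller reduction in $T_{\text{OX-E}}$.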
While scaling channel length alone is the most effective way to reduce
delays, the increase in leakage current prevents it from being practical.
As the source and drain become physically closer together, they become
more difficult to electrically isolate from one another. In deep submicron
MOSFETs there may be significant current flow from the drain to the
source even when the gate voltage is below the threshold voltage. This
is called subthreshold leakage. It means that even transistors that
should be off still conduct a small amount of current like a leaky faucet.
This current may be hundreds or thousands of times smaller than the
current when the transistor is on, but for a die with millions of tran-
sistors this leakage current can rapidly become a problem. The most
common solution for this is reducing the oxide thickness.
Moving the gate terminal physically closer to the channel gives the
gate more control and limits subthreshold leakage. However, this

reduces the long-term reliability of the transistors. Any material will con-
duct electricity if a sufficient electrical field is applied. In the case of insu-
lators this is called dielectric breakdown and physically melts the
material. At extremely high electric fields the electrons, which bind
the molecules of the material together, are torn free and suddenly large
amounts of current begin to flow. The gate oxides of working MOSFETs
accumulate defects over time that gradually lower the field at which the
transistor will fail. These defects can also reduce the switching speed
of the transistors. 5 These phenomena are particularly worrisome to
semiconductor manufacturers because they can cause a new product to
begin failing after it has already been shipping for months or years.
The accumulation of defects in the gate oxide is in part due to “hot”
electron effects. Normally the electrons in the channel do not have enough
energy to enter the gate oxide. Its band gap is far too large for any sig-
nificant number of electrons to have enough energy to surmount at
normal operating temperatures. Electrons in the channel drift from
source to drain due to the lateral electric field in the channel. Their aver-
age drift velocity is determined by how strong the electric field is and how
often the electrons collide with the atoms of the semiconductor crystal.
Typically the drift velocity is only a tiny fraction of the random thermal
velocity of the electrons, but at very high lateral fields some electrons may
get accelerated to velocities much higher than they would usually have
at the operating temperature. It is as if these electrons are at a much
higher temperature than the rest, and they may have enough energy to
enter the gate oxide. They may travel through and create a current at
the gate, or they may become trapped in the oxide creating a defect. If a
series of defects happens to line up on a path from the gate to the chan-
nel, gate oxide breakdown occurs. Thus the reliability of the transistors
is a limit to how much their dimensions can be scaled. In addition, as gate
oxides are scaled below 5 nm, gate tunneling current becomes significant.
One implication of quantum mechanics is that the position of an elec-
tron is not precisely defined. This means that with a sufficiently thin
oxide layer, electrons will occasionally appear on the opposite side of the
insulator. If there is an electric field, the electron will then be pulled
away and unable to get back. The current this phenomenon creates
through the insulator is called a tunneling current. It does not damage the
layer as occurs with hot electrons because the electron does not travel
through the oxide in the classical sense, but this does cause unwanted
leakage current through the gate of any ON device. The typical solution
for both dielectric breakdown and gate tunneling current is to reduce
the supply voltage.

Scaling the supply voltage by the same amount as the channel length
and oxide thickness keeps all the electrical fields in the device constant.
This concept is called constant field scaling and was proposed by Robert
Dennard in 1974. 6 Constant field scaling is an easy way to address prob-
lems such as subthreshold leakage and dielectric breakdown, but a
higher supply voltage provides for better performance. As a result, the
industry has scaled voltages as slowly as possible, allowing fields in
the channel and the oxide to increase significantly with each device
generation. This has required many process adjustments to tolerate the
higher fields. The concentration of dopants in the source, drain, and
channel is precisely controlled to create a three-dimensional profile that
minimizes subthreshold leakage and hot electron effects. Still, even the
very gradual scaling of supply voltages increases delay and hurts per-
formance. This penalty increases dramatically when the supply voltage
becomes less than about three times the threshold voltage.
It is possible to design integrated circuits that operate with supply
voltages less than the threshold voltages of the devices. These designs
operate using only subthreshold leakage currents and as a result are
incredibly power efficient. However, because the currents being used are
orders of magnitude smaller than full ON currents, the delays involved
are orders of magnitude larger. This is a good trade-off for a chip to go
into a digital watch but not acceptable for a desktop computer. To main-
tain reasonable performance a processor must use a supply voltage sev-
eral times larger than the threshold voltage. To gain performance at
lower supply voltages the channel doping can be reduced to lower the
threshold voltage.
Lowering the threshold voltage immediately provides for more on
current but increases subthreshold current much more rapidly. The
rate at which subthreshold currents increase with reduced threshold
voltage is called the subthreshold slope and a typical value is 100
mV/decade. This means a 100-mV drop in threshold will increase sub-
threshold leakage by a factor of 10. The need to maintain several orders
of magnitude difference between the on and off current of a device there-
fore limits how much the threshold voltage can be reduced. Because the
increase in subthreshold current was the first problem encountered
when scaling the channel length, we have come full circle to the origi-
nal problem. In the end there is no easy solution and process engineers
are continuing to look for new materials and structures that will allow
them to reduce delay while controlling leakage currents and reliability
(Fig. 1-11).
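
The subthreshold-slope rule of thumb quoted above can be checked with a few lines of Python; the 100 mV/decade slope comes from the text, while the specific threshold reductions are made-up examples.

def leakage_increase(delta_vt_mv, slope_mv_per_decade=100.0):
    """Factor by which subthreshold leakage grows when the
    threshold voltage is lowered by delta_vt_mv millivolts."""
    return 10 ** (delta_vt_mv / slope_mv_per_decade)

for dv in (50, 100, 200):  # millivolts of threshold reduction
    print(f"Vt lowered by {dv:3d} mV -> leakage x{leakage_increase(dv):.0f}")

# A 100-mV reduction gives a 10x increase, matching the rule of thumb above.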

Interconnect scaling

Fitting more transistors onto a die requires not only shrinking the tran-
sistors but also shrinking the wires that interconnect them. To connect
millions of transistors modern microprocessors may use seven or more
separate layers of wires. These interconnects contribute to the delay of
the overall circuit. They add capacitive load to the transistor outputs,
and their resistance means that voltages take time to travel their length.
The capacitance of a wire is the sum of its capacitance to wires on either
side and to wires above and below (see Fig. 1-12).
Fringing fields make the wire capacitance a complex function, but for
cases where the wire width (W INT ) is equal to the wire spacing (W SP ) and

thickness (T INT ) is equal to the vertical spacing of wires (T ILD ), capaci-
tance per length (C L ) is approximated by the following equation.
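
Neglecting fringing, a parallel-plate estimate consistent with these definitions is given below; the factor of 2 simply counts the neighboring wires on both sides and on both layers, and the exact prefactor in the original equation may differ. Here K is the relative permittivity of the interlevel dielectric and $\varepsilon_{0}$ is the permittivity of free space.

$$C_{L} \approx 2 K \varepsilon_{0} \left( \frac{T_{\text{INT}}}{W_{\text{SP}}} + \frac{W_{\text{INT}}}{T_{\text{ILD}}} \right)$$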

Wire capacitance is kept to a minimum by using small wires and wide
spaces, but this reduces the total number of wires that can fit in a given
area and leads to high wire resistance. The delay for a voltage signal to
travel a length of wire (L WIRE ) is the product of the resistance of the wire
and the capacitance of the wire, the RC delay. The wire resistance per
length (R L ) is determined by the width and thickness of the wire as well
as the resistivity (ρ) of the material.
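
In equation form (a reconstruction from the definitions just given), the resistance per length and the resulting wire delay are approximately:

$$R_{L} = \frac{\rho}{W_{\text{INT}} \, T_{\text{INT}}}, \qquad T_{\text{WIRE}} \approx R_{L} \, C_{L} \, L_{\text{WIRE}}^{2}$$

A distributed RC analysis adds a constant factor of about one half to the delay term, which the simple product above ignores.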

Engineers have tried three basic methods of scaling interconnects in
order to balance the need for low capacitance and low resistance. 8 These
are ideal scaling, quasi-ideal scaling, and constant-R scaling. For a wire
whose length is being scaled by a value S less than 1, each scheme scales
the other dimensions of the wire in different ways, as shown in Table 1-1.
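
Table 1-1 itself is not reproduced here; the factors below are one consistent reading of the three schemes as they are described in the following paragraphs, and the exact exponents in the original table may differ.

Scaling scheme    Width and spacing    Thickness and ILD    R per length     C per length
Ideal             S                    S                    1/S^2            roughly constant
Quasi-ideal       S                    sqrt(S)              1/(S*sqrt(S))    slight increase
Constant-R        sqrt(S)              sqrt(S)              1/S              roughly constant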

Ideal scaling reduces all the vertical and horizontal dimensions by the
same amount. This keeps the capacitance per length constant but
greatly increases the resistance per length. In the end the reduction in
wire capacitance is offset by the increase in wire resistance, and the wire
delay remains constant. Scaling interconnects this way would mean
that as transistors grew faster, processor frequency would quickly
become limited by the interconnect delay.
To make interconnect delay scale with the transistor delay, constant-R
scaling can be used. By scaling the vertical and horizontal dimensions
of the wire less than its length, the total resistance of the wire is kept
constant. Because the capacitance is reduced at the same rate as ideal
scaling, the overall RC delay scales with the wire length. The downside
of constant-R scaling is that if S is also scaling the device dimensions,
then the area required for wires is not decreasing as quickly as the
device area. The size of a chip will be rapidly determined not by the
number of transistors but by the number of wires.
To allow for maximum scaling of die area while mitigating the increase
in wire resistance, most manufacturers use quasi-ideal scaling. In this
scheme horizontal dimensions are scaled with wire length, but vertical
dimensions are scaled more slowly. The capacitance per length increases
only slightly and the increase in resistance is not as much as ideal scal-
ing. Overall the RC delay will decrease although not as much constant-R
scaling. The biggest disadvantage of quasi-ideal scaling is that it
increases the aspect ratio of the wires, the ratio of thickness to width.
This scaling has rapidly led to wires in modern processors that are twice
as tall as they are wide, but manufacturing wires with ever-greater
aspect ratios is difficult. To help in continuing to reduce interconnect
delays, manufacturers have turned to new materials.
In 2000, some semiconductor manufacturers switched from using
aluminum wires, which had been used since the very first integrated cir-
cuits, to copper wires. The resistivity of copper is less than aluminum
providing lower resistance wires. Copper had not been used previously
because it diffuses very easily through silicon and silicon dioxide. Copper
atoms from the wires could quickly spread throughout a chip acting as
defects in the silicon and ruining the transistor behavior. To prevent this,
manufacturers coat all sides of the copper wires with materials that act
as diffusion barriers. This reduces the cross section of the wire that is
actually copper but prevents contamination.
Wire capacitances have been reduced through the use of low-K
dielectrics. Wire capacitance is determined not only by the dimensions of
the wires but also by the permittivity, or K value, of the insulator sur-
rounding the wires. The lowest capacitance would be achieved if there were
simply air or vacuum between wires giving a K equal to 1, but of course
this would provide no physical support. Silicon dioxide is traditionally

used, but this has a K value of 4. New materials are being tried to
reduce K to 3 or even 2, but these materials tend to be very soft and
porous. When heated by high electrical currents the metal wires tend
to flex and stretch and soft dielectrics do little to prevent this. Future
interlevel dielectrics must provide reduced capacitance without sacri-
ficing reliability.
One of the common sources of interconnect failures is called electro-
migration. In wires with very high current densities, atoms tend to be
pushed along the length of the wire in the direction of the flow of elec-
trons, like rocks being pushed along a fast moving stream. This phe-
nomenon happens more quickly at narrow spots in the wire where the
current density is highest. This leads these spots to become more and
more narrow, accelerating the process. Eventually a break in the wire is
created. Rigid interlevel dielectrics slow this process by preventing the
wires from growing in size elsewhere, but the circuit design must make
sure not to exceed the current carrying capacity of any one wire.
Despite using new conductor materials and new insulator materials,
improvements in the delay of interconnects have continued to trail
behind improvements in transistor delay. One of the ways in which
microprocessor designs try to compensate for this is by adding more
wiring layers. The lowest levels are produced with the smallest dimen-
sions. This allows for a very large number of interconnections. The high-
est levels are produced with large widths, spaces, and thickness. This
allows them to have much less delay at the cost of allowing fewer wires
in the same area.
The different wiring layers connect transistors on a chip the way
roads connect houses in a city. The only interconnect layer that actually
connects to a transistor is the first layer deposited, usually called the
metal 1 or M1 layer. These are the suburban streets of a city. Because
they are narrow, traveling on them is slow, but typically they are very
short. To travel longer distances, wider high speed levels must be used.
The top layer wires would be the freeways of the chip. They are used to
travel long distances quickly, but they must connect through all the
lower slower levels before reaching a specific destination.
There is no real limit to the number of wiring levels that can be added,
but each level adds to the cost of processing the wafer. In the end the
design of the microprocessor itself will have to continue to evolve to
allow for the greater importance of interconnect delays.

Additional Background

Moore’s law

Moore’s law is the observation that the number of transistors in a dense integrated circuit doubles about every two years. The observation is named after Gordon Moore, the co-founder of Fairchild Semiconductor and Intel and former CEO of the latter, whose 1965 paper described a doubling every year in the number of components per integrated circuit, and projected this rate of growth would continue for at least another decade. In 1975, looking forward to the next decade,[5] he revised the forecast to doubling every two years. The period is often quoted as 18 months because of a prediction by Intel executive David House (being a combination of the effect of more transistors and the transistors being faster).

Moore’s 2nd law

As the cost of computer power to the consumer falls, the cost for producers to fulfill Moore’s law follows an opposite trend: R&D, manufacturing, and test costs have increased steadily with each new generation of chips. Rising manufacturing costs are an important consideration for the sustaining of Moore’s law. This has led to the formulation of Moore’s second law, also called Rock’s law, which is that the capital cost of a semiconductor fab also increases exponentially over time.

Major enabling factors

Numerous innovations by scientists and engineers have sustained Moore’s law since the beginning of the integrated circuit (IC) era. Some of the key innovations are listed below, as examples of breakthroughs that have advanced integrated circuit technology by more than seven orders of magnitude in less than five decades:

  • The foremost contribution, which is the raison d’être for Moore’s law, is the invention of the integrated circuit, credited contemporaneously to Jack Kilby at Texas Instruments and Robert Noyce at Fairchild Semiconductor.
  • The invention of the complementary metal-oxide-semiconductor (CMOS) process by Frank Wanlass in 1963, and a number of advances in CMOS technology by many workers in the semiconductor field since the work of Wanlass, have enabled the extremely dense and high-performance ICs that the industry makes today.
  • The invention of dynamic random-access memory (DRAM) technology by Robert Dennard at IBM in 1967 made it possible to fabricate single-transistor memory cells, and the invention of flash memory by Fujio Masuoka at Toshiba in the 1980s led to low-cost, high-capacity memory in diverse electronic products.
  • The invention of chemically amplified photoresist by Hiroshi Ito, C. Grant Willson and J. M. J. Fréchet at IBM c. 1980, which was 5 to 10 times more sensitive to ultraviolet light. IBM introduced chemically amplified photoresist for DRAM production in the mid-1980s.
  • The invention of deep UV excimer laser photolithography by Kanti Jain at IBM c. 1980 has enabled the smallest features in ICs to shrink from 800 nanometers in 1990 to as low as 10 nanometers in 2016. Prior to this, excimer lasers had been mainly used as research devices since their development in the 1970s. From a broader scientific perspective, the invention of excimer laser lithography has been highlighted as one of the major milestones in the 50-year history of the laser.
  • The interconnect innovations of the late 1990s, including chemical-mechanical polishing or chemical mechanical planarization (CMP), trench isolation, and copper interconnects—although not directly a factor in creating smaller transistors—have enabled improved wafer yield, additional layers of metal wires, closer spacing of devices, and lower electrical resistance.

Computer industry technology road maps predicted in 2001 that Moore’s law would continue for several generations of semiconductor chips. Depending on the doubling time used in the calculations, this could mean up to a hundredfold increase in transistor count per chip within a decade. The semiconductor industry technology roadmap used a three-year doubling time for microprocessors, leading to a tenfold increase in a decade. Intel was reported in 2005 as stating that the downsizing of silicon chips with good economics could continue during the following decade, and in 2008 as predicting the trend through 2029.

Recent trends

One of the key challenges of engineering future nanoscale transistors is the design of gates. As device dimension shrinks, controlling the current flow in the thin channel becomes more difficult. Compared to FinFETs, which have gate dielectric on three sides of the channel, gate-all-around structure has even better gate control.

  • In 2010, researchers at the Tyndall National Institute in Cork, Ireland announced a junctionless transistor. A control gate wrapped around a silicon nanowire can control the passage of electrons without the use of junctions or doping. They claim these may be produced at 10-nanometer scale using existing fabrication techniques.
  • In 2011, researchers at the University of Pittsburgh announced the development of a single-electron transistor, 1.5 nanometers in diameter, made out of oxide-based materials. Three “wires” converge on a central “island” that can house one or two electrons. Electrons tunnel from one wire to another through the island. Conditions on the third wire result in distinct conductive properties, including the ability of the transistor to act as a solid-state memory. Nanowire transistors could spur the creation of microscopic computers.
  • In 2012, a research team at the University of New South Wales announced the development of the first working transistor consisting of a single atom placed precisely in a silicon crystal (not just picked from a large sample of random transistors). Moore’s law predicted this milestone would be reached for ICs in the lab by 2020.
  • In 2015, IBM demonstrated 7 nm node chips with silicon-germanium transistors produced using EUVL. The company believes this transistor density would be four times that of current 14 nm chips.

Revolutionary technology advances may help sustain Moore’s law through improved performance with or without reduced feature size.

  • In 2008, researchers at HP Labs announced a working memristor, a fourth basic passive circuit element whose existence only had been theorized previously. The memristor’s unique properties permit the creation of smaller and better-performing electronic devices.
  • In 2014, bioengineers at Stanford University developed a circuit modeled on the human brain. Sixteen “Neurocore” chips simulate one million neurons and billions of synaptic connections, claimed to be 9,000 times faster as well as more energy efficient than a typical PC.
  • In 2015, Intel and Micron announced 3D XPoint, a non-volatile memory claimed to be significantly faster with similar density compared to NAND. Production scheduled to begin in 2016 was delayed until the second half of 2017.

While physical limits to transistor scaling such as source-to-drain leakage, limited gate metals, and limited options for channel material have been reached, new avenues for continued scaling are open. The most promising of these approaches rely on using the spin state of electrons (spintronics), tunnel junctions, and advanced confinement of channel materials via nanowire geometry. A comprehensive list of available device choices shows that a wide range of device options is open for continuing Moore’s law into the next few decades. Spin-based logic and memory options are being developed actively in industrial labs, as well as academic labs.

Alternative materials research

The vast majority of current transistors on ICs are composed principally of doped silicon and its alloys. As silicon is fabricated into single nanometer transistors, short-channel effects adversely change desired material properties of silicon as a functional transistor. Below are several non-silicon substitutes in the fabrication of small nanometer transistors.

One proposed material is indium gallium arsenide, or InGaAs. Compared to their silicon and germanium counterparts, InGaAs transistors are more promising for future high-speed, low-power logic applications. Because of intrinsic characteristics of III-V compound semiconductors, quantum well and tunnel effect transistors based on InGaAs have been proposed as alternatives to more traditional MOSFET designs.

  • In 2009, Intel announced the development of 80-nanometer InGaAs quantum well transistors. Quantum well devices contain a material sandwiched between two layers of material with a wider band gap. The company reported that, despite being double the size of leading pure silicon transistors at the time, they performed equally well while consuming less power.
  • In 2011, researchers at Intel demonstrated 3-D tri-gate InGaAs transistors with improved leakage characteristics compared to traditional planar designs. The company claims that their design achieved the best electrostatics of any III-V compound semiconductor transistor. At the 2015 International Solid-State Circuits Conference, Intel mentioned the use of III-V compounds based on such an architecture for their 7 nanometer node.
  • In 2011, researchers at the University of Texas at Austin developed an InGaAs tunneling field-effect transistor capable of higher operating currents than previous designs. The first III-V TFET designs were demonstrated in 2009 by a joint team from Cornell University and Pennsylvania State University.
  • In 2012, a team in MIT’s Microsystems Technology Laboratories developed a 22 nm transistor based on InGaAs which, at the time, was the smallest non-silicon transistor ever built. The team used techniques currently used in silicon device fabrication and aims for better electrical performance and a reduction to 10-nanometer scale.

Research is also showing how biological micro-cells are capable of impressive computational power while being energy efficient.

Various forms of graphene are being studied for graphene electronics; for example, graphene nanoribbon transistors have shown great promise since their appearance in publications in 2008. (Bulk graphene has a band gap of zero and thus cannot be used in transistors because of its constant conductivity, an inability to turn off.) The zigzag edges of the nanoribbons introduce localized energy states in the conduction and valence bands and thus a bandgap that enables switching when fabricated as a transistor. As an example, a typical GNR of width 10 nm has a desirable bandgap energy of 0.4 eV. More research will need to be performed, however, on sub-50 nm graphene layers, as resistivity increases and thus electron mobility decreases.

Other formulations and similar observations

Several measures of digital technology are improving at exponential rates related to Moore’s law, including the size, cost, density, and speed of components. Moore wrote only about the density of components, “a component being a transistor, resistor, diode or capacitor”, at minimum cost.

Transistors per integrated circuit

The most popular formulation is of the doubling of the number of transistors on integrated circuits every two years. At the end of the 1970s, Moore’s law became known as the limit for the number of transistors on the most complex chips. This trend has continued to hold.

As of 2017, the commercially available processor possessing the highest number of transistors is the 48-core Qualcomm Centriq, with over 18 billion transistors.

Density at minimum cost per transistor

This is the formulation given in Moore’s 1965 paper. It is not just about the density of transistors that can be achieved, but about the density of transistors at which the cost per transistor is the lowest. As more transistors are put on a chip, the cost to make each transistor decreases, but the chance that the chip will not work due to a defect increases. In 1965, Moore examined the density of transistors at which cost is minimized, and observed that, as transistors were made smaller through advances in photolithography, this number would increase at “a rate of roughly a factor of two per year”.
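
A toy model of this minimum-cost point: putting more transistors on each die spreads the fixed per-die costs over more devices, but a larger die is more likely to contain a killer defect. Every number below (wafer cost, usable wafer area, package and test cost, defect density, transistor density) is invented purely for illustration.

import math

wafer_cost = 5000.0     # dollars per processed wafer (assumed)
wafer_area = 70_000.0   # usable mm^2 on a 300 mm wafer (approximate)
package_cost = 2.0      # dollars per packaged and tested die (assumed)
defect_density = 0.005  # killer defects per mm^2 (assumed)
density = 1e6           # transistors per mm^2 (assumed)

for die_area in (25, 50, 100, 200, 400, 800):  # mm^2
    yield_fraction = math.exp(-defect_density * die_area)  # Poisson yield model
    good_dies = (wafer_area / die_area) * yield_fraction
    cost_per_die = wafer_cost / good_dies + package_cost
    transistors = die_area * density
    print(f"{die_area:4d} mm^2: ${cost_per_die / transistors * 1e6:.3f} per million transistors")

# The cost per transistor falls at first and then rises again as yield drops,
# so there is a die size (and hence a component count) at which cost is minimized.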

Dennard scaling

This suggests that power requirements are proportional to area (both voltage and current being proportional to length) for transistors. Combined with Moore’s law, performance per watt would grow at roughly the same rate as transistor density, doubling every 1–2 years. According to Dennard scaling transistor dimensions are scaled by 30% (0.7x) every technology generation, thus reducing their area by 50%. This reduces the delay by 30% (0.7x) and therefore increases operating frequency by about 40% (1.4x). Finally, to keep electric field constant, voltage is reduced by 30%, reducing energy by 65% and power (at 1.4x frequency) by 50%. Therefore, in every technology generation transistor density doubles, circuit becomes 40% faster, while power consumption (with twice the number of transistors) stays the same.
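
The arithmetic in this paragraph can be checked directly; the sketch below assumes the classic constant-field relations with the 0.7x linear shrink quoted above.

s = 0.7                        # linear dimension scale factor per generation
area = s ** 2                  # transistor area                  -> 0.49 (halved)
delay = s                      # gate delay                       -> 0.70 (30% faster)
frequency = 1 / s              # operating frequency              -> ~1.43 (about 40% higher)
voltage = s                    # supply voltage (constant field)
capacitance = s                # gate capacitance
energy = capacitance * voltage ** 2  # switching energy C*V^2     -> ~0.34 (about 65% lower)
power = energy * frequency           # power per transistor       -> ~0.49 (about half)
print(area, frequency, energy, power)
# With twice as many transistors in the same area, total chip power stays roughly constant.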

The exponential processor transistor growth predicted by Moore does not always translate into exponentially greater practical CPU performance. Since around 2005–2007, Dennard scaling appears to have broken down, so even though Moore’s law continued for several years after that, it has not yielded dividends in improved performance. The primary reason cited for the breakdown is that at small sizes, current leakage poses greater challenges, and also causes the chip to heat up, which creates a threat of thermal runaway and therefore, further increases energy costs.

The breakdown of Dennard scaling prompted a switch among some chip manufacturers to a greater focus on multicore processors, but the gains offered by switching to more cores are lower than the gains that would be achieved had Dennard scaling continued. In another departure from Dennard scaling, Intel microprocessors adopted a non-planar tri-gate FinFET at 22 nm in 2012 that is faster and consumes less power than a conventional planar transistor.

Quality adjusted price of IT equipment

The price of information technology (IT), computers and peripheral equipment, adjusted for quality and inflation, declined 16% per year on average over the five decades from 1959 to 2009. The pace accelerated, however, to 23% per year in 1995–1999 triggered by faster IT innovation, and later, slowed to 2% per year in 2010–2013.

The rate of quality-adjusted microprocessor price improvement likewise varies, and is not linear on a log scale. Microprocessor price improvement accelerated during the late 1990s, reaching 60% per year (halving every nine months) versus the typical 30% improvement rate (halving every two years) during the years earlier and later. Laptop microprocessors in particular improved 25–35% per year in 2004–2010, and slowed to 15–25% per year in 2010–2013.

The number of transistors per chip cannot explain quality-adjusted microprocessor prices fully. Moore’s 1995 paper does not limit Moore’s law to strict linearity or to transistor count, “The definition of ‘Moore’s Law’ has come to refer to almost anything related to the semiconductor industry that when plotted on semi-log paper approximates a straight line. I hesitate to review its origins and by doing so restrict its definition.”

Hard disk drive areal density

A similar observation (sometimes called Kryder’s law) was made in 2005 for hard disk drive areal density. Several decades of rapid progress in areal density advancement slowed significantly around 2010, because of noise related to smaller grain size of the disk media, thermal stability, and writability using available magnetic fields.

Fiber-optic capacity

The number of bits per second that can be sent down an optical fiber increases exponentially, faster than Moore’s law. This observation is sometimes called Keck’s law, in honor of Donald Keck.

Network capacity

According to Gerald (Gerry) Butters, the former head of Lucent’s Optical Networking Group at Bell Labs, there is another version, called Butters’ Law of Photonics, a formulation that deliberately parallels Moore’s law. Butters’ law says that the amount of data coming out of an optical fiber is doubling every nine months. Thus, the cost of transmitting a bit over an optical network decreases by half every nine months. The availability of wavelength-division multiplexing (sometimes called WDM) increased the capacity that could be placed on a single fiber by as much as a factor of 100. Optical networking and dense wavelength-division multiplexing (DWDM) is rapidly bringing down the cost of networking, and further progress seems assured. As a result, the wholesale price of data traffic collapsed in the dot-com bubble. Nielsen’s Law says that the bandwidth available to users increases by 50% annually.

Pixels per dollar

Similarly, Barry Hendy of Kodak Australia has plotted pixels per dollar as a basic measure of value for a digital camera, demonstrating the historical linearity (on a log scale) of this market and the opportunity to predict the future trend of digital camera price, LCD and LED screens, and resolution.

The great Moore’s law compensator (TGMLC), also known as Wirth’s law, is generally referred to as software bloat and is the principle that successive generations of computer software increase in size and complexity, thereby offsetting the performance gains predicted by Moore’s law. In a 2008 article in InfoWorld, Randall C. Kennedy, formerly of Intel, introduced this term using successive versions of Microsoft Office between the year 2000 and 2007 as his premise. Despite the gains in computational performance during this time period according to Moore’s law, Office 2007 performed the same task at half the speed on a prototypical year 2007 computer as compared to Office 2000 on a year 2000 computer.

Library expansion

Library expansion was calculated in 1945 by Fremont Rider to double in capacity every 16 years, if sufficient space were made available. He advocated replacing bulky, decaying printed works with miniaturized microform analog photographs, which could be duplicated on-demand for library patrons or other institutions. He did not foresee the digital technology that would follow decades later to replace analog microform with digital imaging, storage, and transmission media. Automated, potentially lossless digital technologies allowed vast increases in the rapidity of information growth in an era that now sometimes is called the Information Age.

Carlson curve

The Carlson curve is a term coined by The Economist to describe the biotechnological equivalent of Moore’s law, and is named after author Rob Carlson. Carlson accurately predicted that the doubling time of DNA sequencing technologies (measured by cost and performance) would be at least as fast as Moore’s law. Carlson curves illustrate the rapid (in some cases hyperexponential) decreases in cost, and increases in performance, of a variety of technologies, including DNA sequencing, DNA synthesis, and a range of physical and computational tools used in protein expression and in determining protein structures.

Eroom’s law

Eroom’s law is a pharmaceutical drug development observation which was deliberately written as Moore’s law spelled backwards in order to contrast it with the exponential advancements of other forms of technology (such as transistors) over time. It states that the cost of developing a new drug roughly doubles every nine years.

The experience curve effect says that each doubling of the cumulative production of virtually any product or service is accompanied by an approximately constant percentage reduction in the unit cost. The acknowledged first documented qualitative description of this dates from 1885. A power curve was used to describe this phenomenon in a 1936 discussion of the cost of airplanes.

Operation

MOS capacitors and band diagrams

The MOS capacitor structure is the heart of the MOSFET. Consider a MOS capacitor where the silicon base is of p-type. If a positive voltage is applied at the gate, holes which are at the surface of the p-type substrate will be repelled by the electric field generated by the voltage applied. At first, the holes will simply be repelled and what will remain on the surface will be immobile (negative) atoms of the acceptor type, which creates a depletion region on the surface. Remember that a hole is created by an acceptor atom, e.g., boron, which has one less electron than silicon. One might ask how holes can be repelled if they are actually non-entities. The answer is that what really happens is not that a hole is repelled, but electrons are attracted by the positive field, and fill these holes, creating a depletion region where no charge carriers exist because the electron is now fixed onto the atom and immobile.

As the voltage at the gate increases, there will be a point at which the surface above the depletion region will be converted from p-type into n-type, as electrons from the bulk area will start to get attracted by the larger electric field. This is known as inversion. The threshold voltage at which this conversion happens is one of the most important parameters in a MOSFET.

In the case of a p-type bulk, inversion happens when the intrinsic energy level at the surface becomes smaller than the Fermi level at the surface. One can see this from a band diagram. Remember that the Fermi level defines the type of semiconductor in discussion. If the Fermi level is equal to the Intrinsic level, the semiconductor is of intrinsic, or pure type. If the Fermi level lies closer to the conduction band (valence band) then the semiconductor type will be of n-type (p-type). Therefore, when the gate voltage is increased in a positive sense (for the given example), this will “bend” the intrinsic energy level band so that it will curve downwards towards the valence band. If the Fermi level lies closer to the valence band (for p-type), there will be a point when the Intrinsic level will start to cross the Fermi level and when the voltage reaches the threshold voltage, the intrinsic level does cross the Fermi level, and that is what is known as inversion. At that point, the surface of the semiconductor is inverted from p-type into n-type. Remember that as said above, if the Fermi level lies above the Intrinsic level, the semiconductor is of n-type, therefore at Inversion, when the Intrinsic level reaches and crosses the Fermi level (which lies closer to the valence band), the semiconductor type changes at the surface as dictated by the relative positions of the Fermi and Intrinsic energy levels.

Structure and channel formation

A MOSFET is based on the modulation of charge concentration by a MOS capacitance between a body electrode and a gate electrode located above the body and insulated from all other device regions by a gate dielectric layer. If dielectrics other than an oxide are employed, the device may be referred to as a metal-insulator-semiconductor FET (MISFET). Compared to the MOS capacitor, the MOSFET includes two additional terminals (source and drain), each connected to individual highly doped regions that are separated by the body region. These regions can be either p or n type, but they must both be of the same type, and of opposite type to the body region. The source and drain (unlike the body) are highly doped as signified by a “+” sign after the type of doping.

If the MOSFET is an n-channel or nMOS FET, then the source and drain are n+ regions and the body is a p region. If the MOSFET is a p-channel or pMOS FET, then the source and drain are p+ regions and the body is an n region. The source is so named because it is the source of the charge carriers (electrons for n-channel, holes for p-channel) that flow through the channel; similarly, the drain is where the charge carriers leave the channel.

The occupancy of the energy bands in a semiconductor is set by the position of the Fermi level relative to the semiconductor energy-band edges.

With sufficient gate voltage, the valence band edge is driven far from the Fermi level, and holes from the body are driven away from the gate.

At larger gate bias still, near the semiconductor surface the conduction band edge is brought close to the Fermi level, populating the surface with electrons in an inversion layer or n-channel at the interface between the p region and the oxide. This conducting channel extends between the source and the drain, and current is conducted through it when a voltage is applied between the two electrodes. Increasing the voltage on the gate leads to a higher electron density in the inversion layer and therefore increases the current flow between the source and drain. For gate voltages below the threshold value, the channel is lightly populated, and only a very small subthreshold leakage current can flow between the source and the drain.

When a negative gate-source voltage (positive source-gate) is applied, it creates a p-channel at the surface of the n region, analogous to the n-channel case, but with opposite polarities of charges and voltages. When a voltage less negative than the threshold value (a negative voltage for the p-channel) is applied between gate and source, the channel disappears and only a very small subthreshold current can flow between the source and the drain. The device may comprise a silicon on insulator device in which a buried oxide is formed below a thin semiconductor layer. If the channel region between the gate dielectric and the buried oxide region is very thin, the channel is referred to as an ultrathin channel region with the source and drain regions formed on either side in or above the thin semiconductor layer. Other semiconductor materials may be employed. When the source and drain regions are formed above the channel in whole or in part, they are referred to as raised source/drain regions.

Parameter                      nMOSFET                       pMOSFET
Source/drain type              n-type                        p-type
Channel type (MOS capacitor)   n-type                        p-type
Gate type (polysilicon)        n+                            p+
Gate type (metal)              φm ~ Si conduction band       φm ~ Si valence band
Well type                      p-type                        n-type
Threshold voltage, Vth         Positive (enhancement),       Negative (enhancement),
                               negative (depletion)          positive (depletion)
Band-bending                   Downwards                     Upwards
Inversion layer carriers       Electrons                     Holes
Substrate type                 p-type                        n-type

Modes of operation

The operation of a MOSFET can be separated into three different modes, depending on the voltages at the terminals. In the following discussion, a simplified algebraic model is used.[14] Modern MOSFET characteristics are more complex than the algebraic model presented here.[15]

For an enhancement-mode, n-channel MOSFET, the three operational modes are cutoff (also called subthreshold or weak-inversion) mode, triode (linear) mode, and saturation (active) mode.

Cutoff, subthreshold, or weak-inversion mode

When VGS < Vth:

According to the basic threshold model, the transistor is turned off, and there is no conduction between drain and source. A more accurate model considers the effect of thermal energy on the Fermi–Dirac distribution of electron energies which allow some of the more energetic electrons at the source to enter the channel and flow to the drain. This results in a subthreshold current that is an exponential function of gate-source voltage. While the current between drain and source should ideally be zero when the transistor is being used as a turned-off switch, there is a weak-inversion current, sometimes called subthreshold leakage.

In weak inversion where the source is tied to bulk, the current varies exponentially with $V_{\text{GS}}$ as given approximately by

$$I_{\text{D}} \approx I_{\text{D0}}\, e^{\frac{V_{\text{GS}}-V_{\text{th}}}{n V_{\text{T}}}},$$

where $I_{\text{D0}}$ is the current at $V_{\text{GS}} = V_{\text{th}}$, the thermal voltage is $V_{\text{T}} = kT/q$, and the slope factor n is given by

$$n = 1 + \frac{C_{\text{dep}}}{C_{\text{ox}}},$$

where $C_{\text{dep}}$ is the capacitance of the depletion layer and $C_{\text{ox}}$ is the capacitance of the oxide layer.

This equation is generally used, but is only an adequate approximation for the source tied to the bulk. For the source not tied to the bulk, the subthreshold equation for drain current in saturation is

$$I_{\text{D}} \approx I_{\text{D0}}\, e^{\frac{\kappa\left(V_{\text{G}}-V_{\text{th}}\right)-V_{\text{S}}}{V_{\text{T}}}},$$

where $\kappa$ is the channel divider, given by

$$\kappa = \frac{C_{\text{ox}}}{C_{\text{ox}}+C_{\text{D}}},$$

with $C_{\text{D}}$ the capacitance of the depletion layer and $C_{\text{ox}}$ the capacitance of the oxide layer.

In a long-channel device, there is no drain voltage dependence of the current once $V_{\text{DS}} \gg V_{\text{T}}$, but as channel length is reduced, drain-induced barrier lowering introduces drain voltage dependence that depends in a complex way upon the device geometry (for example, the channel doping, the junction doping, and so on). Frequently, the threshold voltage $V_{\text{th}}$ for this mode is defined as the gate voltage at which a selected value of current $I_{\text{D0}}$ occurs, for example, $I_{\text{D0}} = 1\ \mu\text{A}$, which may not be the same $V_{\text{th}}$ value used in the equations for the following modes.

Some micropower analog circuits are designed to take advantage of subthreshold conduction. By working in the weak-inversion region, the MOSFETs in these circuits deliver the highest possible transconductance-to-current ratio, namely gm/ID = 1/(nVT), almost that of a bipolar transistor.

The subthreshold I–V curve depends exponentially upon threshold voltage, introducing a strong dependence on any manufacturing variation that affects threshold voltage; for example: variations in oxide thickness, junction depth, or body doping that change the degree of drain-induced barrier lowering. The resulting sensitivity to fabricational variations complicates optimization for leakage and performance.

Triode mode or linear region (also known as the ohmic mode)

When VGS > Vth and VDS < VGS − Vth:

The transistor is turned on, and a channel has been created which allows current between the drain and the source. The MOSFET operates like a resistor, controlled by the gate voltage relative to both the source and drain voltages. The current from drain to source is modeled as:

I_\text{D} = \mu_n C_\text{ox} \frac{W}{L} \left( \left(V_\text{GS} - V_\text{th}\right) V_\text{DS} - \frac{V_\text{DS}^2}{2} \right)

where μn is the charge-carrier effective mobility, W is the gate width, L is the gate length, and Cox is the gate oxide capacitance per unit area.

The transition from the exponential subthreshold region to the triode region is not as sharp as the equations suggest.
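
As a sketch of how the triode-region expression is used numerically, the short Python function below evaluates it for one operating point; the mobility, oxide capacitance, and W/L values are placeholder assumptions, not parameters of any particular device.

```python
# Sketch of the triode (linear/ohmic) region drain current.
# mu_n, c_ox, W and L below are placeholder values for illustration only.
def triode_current(v_gs, v_ds, v_th, mu_n=400e-4, c_ox=0.01, w=1e-6, l=0.1e-6):
    """I_D = mu_n * C_ox * (W/L) * ((V_GS - V_th)*V_DS - V_DS^2 / 2).

    Units: mu_n in m^2/(V*s), c_ox in F/m^2, w and l in metres.
    Only valid for V_GS > V_th and V_DS < V_GS - V_th.
    """
    if not (v_gs > v_th and v_ds < v_gs - v_th):
        raise ValueError("outside the triode region")
    return mu_n * c_ox * (w / l) * ((v_gs - v_th) * v_ds - v_ds**2 / 2)

# For small V_DS the device behaves like a gate-controlled resistor.
print(triode_current(v_gs=1.0, v_ds=0.05, v_th=0.4))
```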

Saturation or active mode

When VGS > Vth and VDS ≥ (VGS – Vth):

The switch is turned on, and a channel has been created, which allows current between the drain and source. Since the drain voltage is higher than the source voltage, the electrons spread out, and conduction is not through a narrow channel but through a broader, two- or three-dimensional current distribution extending away from the interface and deeper in the substrate. The onset of this region is also known as pinch-off to indicate the lack of channel region near the drain. Although the channel does not extend the full length of the device, the electric field between the drain and the channel is very high, and conduction continues. The drain current is now weakly dependent upon drain voltage and controlled primarily by the gate-source voltage, and modeled approximately as

I_\text{D} = \frac{\mu_n C_\text{ox}}{2} \frac{W}{L} \left[V_\text{GS} - V_\text{th}\right]^2 \left[1 + \lambda \left(V_\text{DS} - V_\text{DSsat}\right)\right].

The additional factor involving λ, the channel-length modulation parameter, models current dependence on drain voltage due to the Early effect, or channel length modulation. According to this equation, a key design parameter, the MOSFET transconductance is:

g_m = \frac{\partial I_\text{D}}{\partial V_\text{GS}} = \frac{2 I_\text{D}}{V_\text{GS} - V_\text{th}} = \frac{2 I_\text{D}}{V_\text{ov}},

where the combination Vov = VGS − Vth is called the overdrive voltage, and where VDSsat = VGS − Vth accounts for a small discontinuity in ID which would otherwise appear at the transition between the triode and saturation regions.

Another key design parameter is the MOSFET output resistance rout given by:

r_\text{out} = \frac{1}{\lambda I_\text{D}},

where ID is the drain-current expression in the saturation region. rout is the inverse of gDS, where

g_\text{DS} = \frac{\partial I_\text{DS}}{\partial V_\text{DS}}.

If λ is taken as zero, the model predicts an infinite output resistance, which leads to unrealistic circuit predictions, particularly in analog circuits.
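
The saturation-region relations above (the drain current, the transconductance gm, and the output resistance rout) can be tied together in one small sketch; the process parameters and λ below are assumed, illustrative values.

```python
# Sketch linking the saturation-region drain current to the small-signal
# parameters g_m and r_out derived from it. All parameter values are assumptions.
def saturation_current(v_gs, v_ds, v_th, mu_n=400e-4, c_ox=0.01,
                       w=1e-6, l=0.1e-6, lam=0.1):
    """I_D = (mu_n*C_ox/2)*(W/L)*(V_GS - V_th)^2 * (1 + lam*(V_DS - V_DSsat))."""
    v_ov = v_gs - v_th          # overdrive voltage V_ov
    v_dssat = v_ov              # V_DSsat = V_GS - V_th in this simple model
    return 0.5 * mu_n * c_ox * (w / l) * v_ov**2 * (1 + lam * (v_ds - v_dssat))

v_gs, v_ds, v_th, lam = 1.0, 1.2, 0.4, 0.1
i_d = saturation_current(v_gs, v_ds, v_th, lam=lam)
g_m = 2 * i_d / (v_gs - v_th)   # transconductance g_m = 2*I_D / V_ov
r_out = 1 / (lam * i_d)         # output resistance r_out = 1 / (lambda * I_D)
print(f"I_D = {i_d:.3e} A, g_m = {g_m:.3e} S, r_out = {r_out:.3e} ohm")
```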

As the channel length becomes very short, these equations become quite inaccurate. New physical effects arise. For example, carrier transport in the active mode may become limited by velocity saturation. When velocity saturation dominates, the saturation drain current is more nearly linear than quadratic in VGS. At even shorter lengths, carriers transport with near zero scattering, known as quasi-ballistic transport. In the ballistic regime, the carriers travel at an injection velocity that may exceed the saturation velocity and approaches the Fermi velocity at high inversion charge density. In addition, drain-induced barrier lowering increases off-state (cutoff) current and requires an increase in threshold voltage to compensate, which in turn reduces the saturation current.

Body effect

The occupancy of the energy bands in a semiconductor is set by the position of the Fermi level relative to the semiconductor energy-band edges. Application of a source-to-substrate reverse bias of the source-body pn-junction introduces a split between the Fermi levels for electrons and holes, moving the Fermi level for the channel further from the band edge, lowering the occupancy of the channel. The effect is to increase the gate voltage necessary to establish the channel, as seen in the figure. This change in channel strength by application of reverse bias is called the ‘body effect’.

Simply put, using an nMOS example, the gate-to-body bias VGB positions the conduction-band energy levels, while the source-to-body bias VSB positions the electron Fermi level near the interface, deciding occupancy of these levels near the interface, and hence the strength of the inversion layer or channel.

The body effect upon the channel can be described using a modification of the threshold voltage, approximated by the following equation:

V_\text{TB} = V_\text{T0} + \gamma \left( \sqrt{V_\text{SB} + 2\varphi_B} - \sqrt{2\varphi_B} \right),

where VTB is the threshold voltage with substrate bias present, VT0 is the zero-VSB value of threshold voltage, γ is the body effect parameter, and 2φB is the approximate potential drop between surface and bulk across the depletion layer when VSB = 0 and gate bias is sufficient to ensure that a channel is present.[31] As this equation shows, a reverse bias VSB > 0 causes an increase in threshold voltage VTB and therefore demands a larger gate voltage before the channel populates.
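
A brief sketch of this threshold-shift expression, with assumed, typical-looking values for VT0, γ, and 2φB, shows how a reverse source-to-body bias raises the threshold voltage:

```python
# Sketch of the body-effect threshold shift:
# V_TB = V_T0 + gamma * (sqrt(V_SB + 2*phi_B) - sqrt(2*phi_B)).
# v_t0, gamma and two_phi_b below are assumed illustrative values.
import math

def threshold_with_body_bias(v_sb, v_t0=0.45, gamma=0.4, two_phi_b=0.7):
    return v_t0 + gamma * (math.sqrt(v_sb + two_phi_b) - math.sqrt(two_phi_b))

for v_sb in (0.0, 0.5, 1.0, 1.5):
    print(f"V_SB = {v_sb:.1f} V -> V_TB = {threshold_with_body_bias(v_sb):.3f} V")
```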

The body can be operated as a second gate, and is sometimes referred to as the “back gate”; the body effect is sometimes called the “back-gate effect”.

Circuit symbols

A variety of symbols are used for the MOSFET. The basic design is generally a line for the channel with the source and drain leaving it at right angles and then bending back at right angles into the same direction as the channel. Sometimes three line segments are used for enhancement mode and a solid line for depletion mode (see depletion and enhancement modes). Another line is drawn parallel to the channel for the gate.

The bulk or body connection, if shown, is shown connected to the back of the channel with an arrow indicating pMOS or nMOS. Arrows always point from P to N, so an NMOS (N-channel in P-well or P-substrate) has the arrow pointing in (from the bulk to the channel). If the bulk is connected to the source (as is generally the case with discrete devices) it is sometimes angled to meet up with the source leaving the transistor. If the bulk is not shown (as is often the case in IC design as they are generally common bulk) an inversion symbol is sometimes used to indicate PMOS, alternatively an arrow on the source may be used in the same way as for bipolar transistors (out for nMOS, in for pMOS).

Comparison of enhancement-mode and depletion-mode MOSFET symbols, along with JFET symbols. The orientation of the symbols (most significantly the position of source relative to drain) is such that more positive voltages appear higher on the page than less positive voltages, implying current flowing “down” the page.

In schematics where G, S, D are not labeled, the detailed features of the symbol indicate which terminal is source and which is drain. For enhancement-mode and depletion-mode MOSFET symbols (in columns two and five), the source terminal is the one connected to the triangle. Additionally, in this diagram, the gate is shown as an “L” shape, whose input leg is closer to S than D, also indicating which is which. However, these symbols are often drawn with a “T” shaped gate (as elsewhere on this page), so it is the triangle which must be relied upon to indicate the source terminal.

For the symbols in which the bulk, or body, terminal is shown, it is here shown internally connected to the source (i.e., the black triangles in the diagrams in columns 2 and 5). This is a typical configuration, but by no means the only important configuration. In general, the MOSFET is a four-terminal device, and in integrated circuits many of the MOSFETs share a body connection, not necessarily connected to the source terminals of all the transistors.

Digital integrated circuits such as microprocessors and memory devices contain thousands to millions of integrated MOSFET transistors on each device, providing the basic switching functions required to implement logic gates and data storage. Discrete devices are widely used in applications such as switch mode power supplies, variable-frequency drives and other power electronics applications where each device may be switching thousands of watts. Radio-frequency amplifiers up to the UHF spectrum use MOSFET transistors as analog signal and power amplifiers. Radio systems also use MOSFETs as oscillators, or mixers to convert frequencies. MOSFET devices are also applied in audio-frequency power amplifiers for public address systems, sound reinforcement and home and automobile sound systems.

MOS integrated circuits

Following the development of clean rooms to reduce contamination to levels never before thought necessary, and of photolithography and the planar process to allow circuits to be made in very few steps, the Si–SiO2 system possessed the technical attractions of low cost of production (on a per circuit basis) and ease of integration. Largely because of these two factors, the MOSFET has become the most widely used type of transistor in integrated circuits.

General Microelectronics introduced the first commercial MOS integrated circuit in 1964.

Additionally, the method of coupling two complementary MOSFETs (P-channel and N-channel) into one high/low switch, known as CMOS, means that digital circuits dissipate very little power except when actually switched.

The earliest microprocessors starting in 1970 were all MOS microprocessors; i.e., fabricated entirely from PMOS logic or fabricated entirely from NMOS logic. In the 1970s, MOS microprocessors were often contrasted with CMOS microprocessors and bipolar bit-slice processors.

CMOS circuits

The MOSFET is used in digital complementary metal–oxide–semiconductor (CMOS) logic, which uses p- and n-channel MOSFETs as building blocks. Overheating is a major concern in integrated circuits since ever more transistors are packed into ever smaller chips. CMOS logic reduces power consumption because no current flows (ideally), and thus no power is consumed, except when the inputs to logic gates are being switched. CMOS accomplishes this current reduction by complementing every nMOSFET with a pMOSFET and connecting both gates and both drains together. A high voltage on the gates will cause the nMOSFET to conduct and the pMOSFET not to conduct, and a low voltage on the gates causes the reverse. During the switching time, as the voltage goes from one state to another, both MOSFETs will conduct briefly. This arrangement greatly reduces power consumption and heat generation.
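
The complementary behaviour described above can be captured in a toy logical model of a CMOS inverter: for either input level, exactly one of the two devices conducts, so ideally there is no static path from supply to ground. This is only a logic-level sketch, not an electrical simulation.

```python
# Toy model of a CMOS inverter: the nMOS pulls the output low for a high
# input, the pMOS pulls it high for a low input; in steady state only one
# device conducts, so (ideally) there is no static supply-to-ground current.
def cmos_inverter(input_high: bool) -> dict:
    nmos_on = input_high          # nMOS conducts when its gate is high
    pmos_on = not input_high      # pMOS conducts when its gate is low
    output_high = pmos_on         # output is pulled to VDD through the pMOS
    return {"nmos_on": nmos_on, "pmos_on": pmos_on, "output_high": output_high}

print(cmos_inverter(True))   # high input -> nMOS on, output low
print(cmos_inverter(False))  # low input  -> pMOS on, output high
```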

Digital

The growth of digital technologies like the microprocessor has provided the motivation to advance MOSFET technology faster than any other type of silicon-based transistor. A big advantage of MOSFETs for digital switching is that the oxide layer between the gate and the channel prevents DC current from flowing through the gate, further reducing power consumption and giving a very large input impedance. The insulating oxide between the gate and channel effectively isolates a MOSFET in one logic stage from earlier and later stages, which allows a single MOSFET output to drive a considerable number of MOSFET inputs. Bipolar transistor-based logic (such as TTL) does not have such a high fanout capacity. This isolation also makes it easier for designers to ignore, to some extent, loading effects between logic stages. That extent is defined by the operating frequency: as frequencies increase, the input impedance of the MOSFETs decreases.

Analog

The MOSFET’s advantages in digital circuits do not translate into supremacy in all analog circuits. The two types of circuit draw upon different features of transistor behavior. Digital circuits switch, spending most of their time either fully on or fully off. The transition from one to the other is only of concern with regards to speed and charge required. Analog circuits depend on operation in the transition region where small changes to Vgs can modulate the output (drain) current. The JFET and bipolar junction transistor (BJT) are preferred for accurate matching (of adjacent devices in integrated circuits), higher transconductance and certain temperature characteristics which simplify keeping performance predictable as circuit temperature varies.

Nevertheless, MOSFETs are widely used in many types of analog circuits because of their own advantages (zero gate current, high and adjustable output impedance, and improved robustness vs. BJTs, which can be permanently degraded by even lightly breaking down the emitter-base junction). The characteristics and performance of many analog circuits can be scaled up or down by changing the sizes (length and width) of the MOSFETs used. By comparison, in bipolar transistors the size of the device does not significantly affect its performance. MOSFETs’ ideal characteristics regarding gate current (zero) and drain-source offset voltage (zero) also make them nearly ideal switch elements, and also make switched-capacitor analog circuits practical. In their linear region, MOSFETs can be used as precision resistors, which can have a much higher controlled resistance than BJTs. In high power circuits, MOSFETs sometimes have the advantage of not suffering from thermal runaway as BJTs do. Also, MOSFETs can be configured to perform as capacitors and gyrator circuits which allow op-amps made from them to appear as inductors, thereby allowing all of the normal analog devices on a chip (except for diodes, which can be made smaller than a MOSFET anyway) to be built entirely out of MOSFETs. This means that complete analog circuits can be made on a silicon chip in a much smaller space and with simpler fabrication techniques. MOSFETs are ideally suited to switch inductive loads because of their tolerance to inductive kickback.

Some ICs combine analog and digital MOSFET circuitry on a single mixed-signal integrated circuit, making the needed board space even smaller. This creates a need to isolate the analog circuits from the digital circuits on a chip level, leading to the use of isolation rings and silicon on insulator (SOI). Since MOSFETs require more space to handle a given amount of power than a BJT, fabrication processes can incorporate BJTs and MOSFETs into a single device. Mixed-transistor devices are called bi-FETs (bipolar FETs) if they contain just one BJT-FET and BiCMOS (bipolar-CMOS) if they contain complementary BJT-FETs. Such devices have the advantages of both insulated gates and higher current density.

Analog switches

MOSFET analog switches use the MOSFET to pass analog signals when on, and as a high impedance when off. Signals flow in both directions across a MOSFET switch. In this application, the drain and source of a MOSFET exchange places depending on the relative voltages of the source/drain electrodes. The source is the more negative side for an N-MOS or the more positive side for a P-MOS. All of these switches are limited on what signals they can pass or stop by their gate-source, gate-drain and source–drain voltages; exceeding the voltage, current, or power limits will potentially damage the switch.

Single-type

This analog switch uses a four-terminal simple MOSFET of either P or N type.

In the case of an n-type switch, the body is connected to the most negative supply (usually GND) and the gate is used as the switch control. Whenever the gate voltage exceeds the source voltage by at least a threshold voltage, the MOSFET conducts. The higher the voltage, the more the MOSFET can conduct. An N-MOS switch passes all voltages less than Vgate − Vtn. When the switch is conducting, it typically operates in the linear (or ohmic) mode of operation, since the source and drain voltages will typically be nearly equal.

In the case of a P-MOS, the body is connected to the most positive voltage, and the gate is brought to a lower potential to turn the switch on. The P-MOS switch passes all voltages higher than Vgate − Vtp (threshold voltage Vtp is negative in the case of enhancement-mode P-MOS).

Dual-type (CMOS)

This “complementary” or CMOS type of switch uses one P-MOS and one N-MOS FET to counteract the limitations of the single-type switch. The FETs have their drains and sources connected in parallel, the body of the P-MOS is connected to the high potential (VDD) and the body of the N-MOS is connected to the low potential (gnd). To turn the switch on, the gate of the P-MOS is driven to the low potential and the gate of the N-MOS is driven to the high potential. For voltages between VDD − Vtn and gnd − Vtp, both FETs conduct the signal; for voltages less than gnd − Vtp, the N-MOS conducts alone; and for voltages greater than VDD − Vtn, the P-MOS conducts alone.

The voltage limits for this switch are the gate-source, gate-drain and source-drain voltage limits for both FETs. Also, the P-MOS is typically two to three times wider than the N-MOS, so the switch will be balanced for speed in the two directions.
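
As an illustration of the pass ranges just described, the sketch below reports which device of an enabled CMOS switch conducts for a given signal voltage; the supply and threshold voltages are assumed example values.

```python
# Sketch of which device in an enabled CMOS transmission gate passes a given
# signal voltage (nMOS gate at VDD, pMOS gate at ground).
# vdd, vtn and vtp below are assumed example values.
def conducting_devices(v_signal, vdd=3.3, vtn=0.7, vtp=-0.7):
    devices = []
    if v_signal < vdd - vtn:       # nMOS passes voltages below VDD - Vtn
        devices.append("nMOS")
    if v_signal > 0.0 - vtp:       # pMOS passes voltages above gnd - Vtp (= |Vtp|)
        devices.append("pMOS")
    return devices or ["neither (switch cannot pass this level)"]

for v in (0.2, 1.5, 3.1):
    print(f"{v:.1f} V -> {conducting_devices(v)}")
```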

Tri-state circuitry sometimes incorporates a CMOS MOSFET switch on its output to provide for a low-ohmic, full-range output when on, and a high-ohmic, mid-level signal when off.

Construction

Gate material

The primary criterion for the gate material is that it is a good conductor. Highly doped polycrystalline silicon is an acceptable but certainly not ideal conductor, and also suffers from some more technical deficiencies in its role as the standard gate material. Nevertheless, there are several reasons favoring use of polysilicon:

  1. The threshold voltage (and consequently the drain to source on-current) is modified by the work function difference between the gate material and channel material. Because polysilicon is a semiconductor, its work function can be modulated by adjusting the type and level of doping. Furthermore, because polysilicon has the same bandgap as the underlying silicon channel, it is quite straightforward to tune the work function to achieve low threshold voltages for both NMOS and PMOS devices. By contrast, the work functions of metals are not easily modulated, so tuning the work function to obtain low threshold voltages (LVT) becomes a significant challenge. Additionally, obtaining low-threshold devices on both PMOS and NMOS devices sometimes requires the use of different metals for each device type. While bimetallic integrated circuits (i.e., one type of metal for gate electrodes of NFETS and a second type of metal for gate electrodes of PFETS) are not common, they are known in patent literature and provide some benefit in terms of tuning electrical circuits’ overall electrical performance.
  2. The silicon-SiO2 interface has been well studied and is known to have relatively few defects. By contrast many metal-insulator interfaces contain significant levels of defects which can lead to Fermi level pinning, charging, or other phenomena that ultimately degrade device performance.
  3. In the MOSFET IC fabrication process, it is preferable to deposit the gate material prior to certain high-temperature steps in order to make better-performing transistors. Such high temperature steps would melt some metals, limiting the types of metal that can be used in a metal-gate-based process.

While polysilicon gates were the de facto standard for roughly twenty years, they do have some disadvantages, which have led to their replacement by metal gates in advanced processes. These disadvantages include:

  • Polysilicon is not a great conductor (approximately 1000 times more resistive than metals) which reduces the signal propagation speed through the material. The resistivity can be lowered by increasing the level of doping, but even highly doped polysilicon is not as conductive as most metals. To improve conductivity further, sometimes a high-temperature metal such as tungsten, titanium, cobalt, and more recently nickel is alloyed with the top layers of the polysilicon. Such a blended material is called silicide. The silicide-polysilicon combination has better electrical properties than polysilicon alone and still does not melt in subsequent processing. Also the threshold voltage is not significantly higher than with polysilicon alone, because the silicide material is not near the channel. The process in which silicide is formed on both the gate electrode and the source and drain regions is sometimes called salicide, self-aligned silicide.
  • When the transistors are extremely scaled down, it is necessary to make the gate dielectric layer very thin, around 1 nm in state-of-the-art technologies. A phenomenon observed here is so-called poly depletion, where a depletion layer is formed in the gate polysilicon layer next to the gate dielectric when the transistor is in inversion. To avoid this problem, a metal gate is desired. A variety of metal gates such as tantalum, tungsten, tantalum nitride, and titanium nitride are used, usually in conjunction with high-κ dielectrics. An alternative is to use fully silicided polysilicon gates, a process known as FUSI.

Present high performance CPUs use metal gate technology, together with high-κ dielectrics, a combination known as high-κ, metal gate (HKMG). The disadvantages of metal gates are overcome by a few techniques:

  1. The threshold voltage is tuned by including a thin “work function metal” layer between the high-κ dielectric and the main metal. This layer is thin enough that the total work function of the gate is influenced by both the main metal and thin metal work functions (either due to alloying during annealing, or simply due to the incomplete screening by the thin metal). The threshold voltage thus can be tuned by the thickness of the thin metal layer.
  2. High-κ dielectrics are now well studied, and their defects are understood.
  3. HKMG processes exist that do not require the metals to experience high temperature anneals; other processes select metals that can survive the annealing step.

Insulator

As devices are made smaller, insulating layers are made thinner, often through steps of thermal oxidation or localised oxidation of silicon (LOCOS). For nano-scaled devices, at some point tunneling of carriers through the insulator from the channel to the gate electrode takes place. To reduce the resulting leakage current, the insulator can be made thicker by choosing a material with a higher dielectric constant. To see how thickness and dielectric constant are related, note that Gauss’s law connects field to charge as

Q = \kappa \epsilon_0 E,

with Q = charge density, κ = dielectric constant, ε0 = permittivity of empty space and E = electric field. From this law it appears the same charge can be maintained in the channel at a lower field provided κ is increased. The voltage on the gate is given by:

V_\text{G} = V_\text{ch} + E\, t_\text{ins} = V_\text{ch} + \frac{Q t_\text{ins}}{\kappa \epsilon_0},

with VG = gate voltage, Vch = voltage at channel side of insulator, and tins = insulator thickness. This equation shows the gate voltage will not increase when the insulator thickness increases, provided κ increases to keep tins / κ = constant (see the article on high-κ dielectrics for more detail, and the section in this article on gate-oxide leakage).
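
The trade-off expressed by these two equations, that the same channel charge can be maintained at the same gate voltage as long as tins / κ is held constant, is what is often quoted as an “equivalent oxide thickness”. The sketch below uses an assumed κ for a hafnium-based high-κ film alongside the nominal κ of SiO2.

```python
# Sketch of the t_ins / kappa trade-off: a thicker high-kappa film can give
# the same gate capacitance (same Q at the same V_G) as a thin SiO2 film.
# The kappa values are assumed nominal figures for illustration.
KAPPA_SIO2 = 3.9    # relative permittivity of silicon dioxide
KAPPA_HFO2 = 25.0   # assumed value for a hafnium-based high-kappa dielectric

def equivalent_oxide_thickness(t_highk_nm, kappa_highk=KAPPA_HFO2):
    """Thickness of SiO2 (nm) that would give the same capacitance per unit area."""
    return t_highk_nm * KAPPA_SIO2 / kappa_highk

# A ~6.4 nm high-kappa film behaves (capacitively) like ~1 nm of SiO2,
# while presenting a much thicker physical barrier against tunneling.
print(equivalent_oxide_thickness(6.4))
```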

The insulator in a MOSFET is a dielectric, which may be silicon oxide formed by LOCOS, but many other dielectric materials are also employed. The generic term for the dielectric is gate dielectric, since the dielectric lies directly below the gate electrode and above the channel of the MOSFET.


Junction design

The source-to-body and drain-to-body junctions are the object of much attention because of three major factors: their design affects the current–voltage (I–V) characteristics of the device by lowering the output resistance; it affects the speed of the device through the loading effect of the junction capacitances; and it determines the component of stand-by power dissipation due to junction leakage.


The drain induced barrier lowering of the threshold voltage and channel length modulation effects upon I-V curves are reduced by using shallow junction extensions. In addition, halo doping can be used, that is, the addition of very thin heavily doped regions of the same doping type as the body tight against the junction walls to limit the extent of depletion regions.

The capacitive effects are limited by using raised source and drain geometries that make most of the contact area border thick dielectric instead of silicon.

These various features of junction design are shown in the figure.

Scaling

Over the past decades, the MOSFET (as used for digital logic) has continually been scaled down in size; typical MOSFET channel lengths were once several micrometres, but modern integrated circuits are incorporating MOSFETs with channel lengths of tens of nanometers. Robert Dennard’s work on scaling theory was pivotal in recognising that this ongoing reduction was possible. Intel began production of a process featuring a 32 nm feature size (with the channel being even shorter) in late 2009. The semiconductor industry maintains a “roadmap”, the ITRS, which sets the pace for MOSFET development. Historically, the difficulties with decreasing the size of the MOSFET have been associated with the semiconductor device fabrication process, the need to use very low voltages, and with poorer electrical performance necessitating circuit redesign and innovation (small MOSFETs exhibit higher leakage currents and lower output resistance).

Smaller MOSFETs are desirable for several reasons. The main reason to make transistors smaller is to pack more and more devices in a given chip area. This results in a chip with the same functionality in a smaller area, or chips with more functionality in the same area. Since fabrication costs for a semiconductor wafer are relatively fixed, the cost per integrated circuit is mainly related to the number of chips that can be produced per wafer. Hence, smaller ICs allow more chips per wafer, reducing the price per chip. In fact, over the past 30 years the number of transistors per chip has doubled every 2–3 years once a new technology node is introduced. For example, the number of MOSFETs in a microprocessor fabricated in a 45 nm technology can well be twice as many as in a 65 nm chip. This doubling of transistor density was first observed by Gordon Moore in 1965 and is commonly referred to as Moore’s law.

It is also expected that smaller transistors switch faster. For example, one approach to size reduction is a scaling of the MOSFET that requires all device dimensions to reduce proportionally. The main device dimensions are the channel length, channel width, and oxide thickness. When they are scaled down by equal factors, the transistor channel resistance does not change, while gate capacitance is cut by that factor. Hence, the RC delay of the transistor scales with a similar factor. While this has traditionally been the case for the older technologies, for state-of-the-art MOSFETs a reduction of the transistor dimensions does not necessarily translate to higher chip speed because the delay due to interconnections is more significant.
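
The claim about RC delay can be made concrete with a small constant-field scaling sketch: shrinking the channel length, width, and oxide thickness by the same factor leaves the channel resistance roughly unchanged while dividing the gate capacitance, and hence the intrinsic RC delay, by that factor. The quantities below are normalized and illustrative; interconnect delay, which the text notes now dominates, is ignored.

```python
# Sketch of constant-field (Dennard-style) scaling of the intrinsic RC delay.
# All quantities are normalized/unitless; interconnect delay is ignored.
def scaled_rc_delay(scale):
    """Scale L, W and t_ox by 1/scale; return (R, C, RC) relative to 1.0."""
    length = 1.0 / scale
    width = 1.0 / scale
    t_ox = 1.0 / scale
    resistance = length / width            # R ~ L/W: unchanged by equal scaling
    capacitance = width * length / t_ox    # C ~ W*L/t_ox: shrinks by 1/scale
    return resistance, capacitance, resistance * capacitance

for s in (1, 2, 4):
    r, c, rc = scaled_rc_delay(s)
    print(f"scale {s}: R = {r:.2f}, C = {c:.2f}, RC delay = {rc:.2f}")
```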

Producing MOSFETs with channel lengths much smaller than a micrometre is a challenge, and the difficulties of semiconductor device fabrication are always a limiting factor in advancing integrated circuit technology. Though processes such as ALD have improved fabrication for small components, the small size of the MOSFET (less than a few tens of nanometers) has created operational problems:

Higher subthreshold conduction

As MOSFET geometries shrink, the voltage that can be applied to the gate must be reduced to maintain reliability. To maintain performance, the threshold voltage of the MOSFET has to be reduced as well. As threshold voltage is reduced, the transistor cannot be switched from complete turn-off to complete turn-on with the limited voltage swing available; the circuit design is a compromise between strong current in the on case and low current in the off case, and the application determines whether to favor one over the other. Subthreshold leakage (including subthreshold conduction, gate-oxide leakage and reverse-biased junction leakage), which was ignored in the past, can now consume upwards of half of the total power consumption of modern high-performance VLSI chips.

Increased gate-oxide leakage

The gate oxide, which serves as insulator between the gate and channel, should be made as thin as possible to increase the channel conductivity and performance when the transistor is on and to reduce subthreshold leakage when the transistor is off. However, with current gate oxides of around 1.2 nm thickness (which in silicon is about 5 atoms thick), the quantum mechanical phenomenon of electron tunneling occurs between the gate and channel, leading to increased power consumption. Silicon dioxide has traditionally been used as the gate insulator, but it has a modest dielectric constant. Increasing the dielectric constant of the gate dielectric allows a thicker layer while maintaining a high capacitance (capacitance is proportional to dielectric constant and inversely proportional to dielectric thickness). All else equal, a higher dielectric thickness reduces the quantum tunneling current through the dielectric between the gate and the channel. Insulators with a larger dielectric constant than silicon dioxide (referred to as high-κ dielectrics), such as group IVb metal silicates, e.g. hafnium and zirconium silicates and oxides, have been used to reduce gate leakage from the 45 nanometer technology node onwards. On the other hand, the barrier height of the new gate insulator is an important consideration; the difference in conduction band energy between the semiconductor and the dielectric (and the corresponding difference in valence band energy) also affects the leakage current level. For the traditional gate oxide, silicon dioxide, the former barrier is approximately 8 eV. For many alternative dielectrics the value is significantly lower, tending to increase the tunneling current and somewhat negating the advantage of a higher dielectric constant. The maximum gate-source voltage is determined by the strength of the electric field the gate dielectric can sustain before significant leakage occurs. As the insulating dielectric is made thinner, the electric field strength within it goes up for a fixed voltage. This necessitates using lower voltages with the thinner dielectric.

Increased junction leakage

To make devices smaller, junction design has become more complex, leading to higher doping levels, shallower junctions, “halo” doping and so forth, all to decrease drain-induced barrier lowering (see the section on junction design). To keep these complex junctions in place, the annealing steps formerly used to remove damage and electrically active defects must be curtailed, increasing junction leakage. Heavier doping is also associated with thinner depletion layers and more recombination centers that result in increased leakage current, even without lattice damage.

Drain-induced barrier lowering (DIBL) and VT roll-off

Because of the short-channel effect, channel formation is not entirely done by the gate; the drain and source also affect the channel formation. As the channel length decreases, the depletion regions of the source and drain come closer together and make the threshold voltage (VT) a function of the length of the channel. This is called VT roll-off. VT also becomes a function of the drain-to-source voltage VDS. As VDS increases, the depletion regions increase in size, and a considerable amount of charge is depleted by the VDS. The gate voltage required to form the channel is then lowered, and thus the VT decreases with an increase in VDS. This effect is called drain-induced barrier lowering (DIBL).

Lower output resistance

For analog operation, good gain requires a high MOSFET output impedance, which is to say, the MOSFET current should vary only slightly with the applied drain-to-source voltage. As devices are made smaller, the influence of the drain competes more successfully with that of the gate due to the growing proximity of these two electrodes, increasing the sensitivity of the MOSFET current to the drain voltage. To counteract the resulting decrease in output resistance, circuits are made more complex, either by requiring more devices, for example the cascode and cascade amplifiers, or by feedback circuitry using operational amplifiers, for example a circuit such as a wide-swing MOSFET current mirror.

Lower transconductance

The transconductance of the MOSFET decides its gain and is proportional to hole or electron mobility (depending on device type), at least for low drain voltages. As MOSFET size is reduced, the fields in the channel increase and the dopant impurity levels increase. Both changes reduce the carrier mobility, and hence the transconductance. As channel lengths are reduced without proportional reduction in drain voltage, raising the electric field in the channel, the result is velocity saturation of the carriers, limiting the current and the transconductance.

Interconnect capacitance

Traditionally, switching time was roughly proportional to the gate capacitance of gates. However, with transistors becoming smaller and more transistors being placed on the chip, interconnect capacitance (the capacitance of the metal-layer connections between different parts of the chip) is becoming a large percentage of capacitance. Signals have to travel through the interconnect, which leads to increased delay and lower performance.

Heat production

The ever-increasing density of MOSFETs on an integrated circuit creates problems of substantial localized heat generation that can impair circuit operation. Circuits operate more slowly at high temperatures, and have reduced reliability and shorter lifetimes. Heat sinks and other cooling devices and methods are now required for many integrated circuits including microprocessors. Power MOSFETs are at risk of thermal runaway. As their on-state resistance rises with temperature, if the load is approximately a constant-current load then the power loss rises correspondingly, generating further heat. When the heatsink is not able to keep the temperature low enough, the junction temperature may rise quickly and uncontrollably, resulting in destruction of the device.

Process variations

With MOSFETs becoming smaller, the number of atoms in the silicon that produce many of the transistor’s properties is becoming fewer, with the result that control of dopant numbers and placement is more erratic. During chip manufacturing, random process variations affect all transistor dimensions: length, width, junction depths, oxide thickness etc., and become a greater percentage of overall transistor size as the transistor shrinks. The transistor characteristics become less certain, more statistical. The random nature of manufacture means we do not know which particular example MOSFETs actually will end up in a particular instance of the circuit. This uncertainty forces a less optimal design because the design must work for a great variety of possible component MOSFETs. See process variation, design for manufacturability, reliability engineering, and statistical process control.

Modeling challenges

Modern ICs are computer-simulated with the goal of obtaining working circuits from the very first manufactured lot. As devices are miniaturized, the complexity of the processing makes it difficult to predict exactly what the final devices look like, and modeling of physical processes becomes more challenging as well. In addition, microscopic variations in structure due simply to the probabilistic nature of atomic processes require statistical (not just deterministic) predictions. These factors combine to make adequate simulation and “right the first time” manufacture difficult.
