by Dr. Jaydeep T. Vagh

Microprocessor Scaling

Because of the importance of process scaling to processor design, all
microprocessor designs can be broken down into two basic categories:
lead designs and compactions. Lead designs are fundamentally new
designs. They typically add new features that require more transistors
and therefore a larger die size. Compactions change completed designs
to make them work on new fabrication processes. This allows for higher
frequency, lower power, and smaller dies. Figure 1-13 shows to scale die
photos of different Intel lead and compaction designs.
Each new lead design offers increased performance from added functionality but uses a bigger die size than a compaction in the same generation. It is the improvements in frequency and reductions in cost that come from compacting the design onto future process generations that make the new designs profitable. We can use Intel manufacturing processes of the last 10 years to show the typical process scaling from
one generation to the next (Table 1-2). On average the semiconductor industry has begun a new generation of fabrication process every 2 to 3 years. Each generation reduces horizontal dimensions about 30 percent compared to the previous generation. It would be possible to produce new generations more often if a smaller shrink factor was used, but a smaller improvement in performance might not justify the expense of new equipment. A larger shrink factor could provide more performance improvement but would require a longer time between generations. The company attempting the larger shrink factor would be at a disadvantage when competitors had advanced to a new process before them. The process generations have come to be referred to by their “technology node.” In older generations this name indicated the MOSFET

Functioning

The dynamic power (switching power) dissipated per unit of time by a chip is C·V²·A·f, where C is the capacitance being switched per clock cycle, V is voltage, A is the Activity Factor indicating the average number of switching events undergone by the transistors in the chip (as a unit-less quantity) and f is the switching frequency.
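As a rough illustration of the formula above, the following Python sketch evaluates C·V²·A·f at a few operating points. The function name and all numeric values are hypothetical and chosen only to show how the terms interact; they do not describe any particular chip.

```python
def dynamic_power(c_farads, v_volts, activity, f_hertz):
    """Switching power in watts, P = C * V^2 * A * f."""
    return c_farads * v_volts ** 2 * activity * f_hertz


# Hypothetical baseline operating point (illustrative values only).
base = dynamic_power(c_farads=1e-9, v_volts=1.2, activity=0.1, f_hertz=3e9)

# Reducing only the frequency by 20 percent saves power linearly.
freq_only = dynamic_power(1e-9, 1.2, 0.1, 2.4e9)

# Reducing the voltage by 20 percent as well adds the quadratic V^2 saving.
freq_and_volt = dynamic_power(1e-9, 0.96, 0.1, 2.4e9)

print(f"baseline power:        {base:.3f} W")
print(f"frequency x0.8:        {freq_only / base:.3f} of baseline")
print(f"frequency, voltage x0.8: {freq_and_volt / base:.3f} of baseline")
```

Under these assumed numbers, scaling frequency alone leaves 80 percent of the baseline power, while scaling voltage by the same factor as well leaves roughly half.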

Voltage is therefore the main determinant of power usage and heating. The voltage required for stable operation is determined by the frequency at which the circuit is clocked, and can be reduced if the frequency is also reduced. Dynamic power alone does not account for the total power of the chip, however, as there is also static power, which is primarily due to various leakage currents. Because of static power consumption and asymptotic execution time, it has been shown that the energy consumption of a piece of software shows convex energy behavior, i.e., there exists an optimal CPU frequency at which energy consumption is minimal. Leakage current has become more and more important as transistor sizes have become smaller and threshold voltage levels lower. A decade ago, dynamic power accounted for approximately two-thirds of the total chip power. The power loss due to leakage currents in contemporary CPUs and SoCs tends to dominate the total power consumption. To control leakage power, high-k metal gates and power gating have been common methods.

Dynamic voltage scaling is another related power conservation technique that is often used in conjunction with frequency scaling, as the frequency that a chip may run at is related to the operating voltage.

The efficiency of some electrical components, such as voltage regulators, decreases with increasing temperature, so the power usage may increase with temperature. Since increasing power use may increase the temperature, increases in voltage or frequency may increase system power demands even further than the CMOS formula indicates, and vice versa.

Performance Impact

Dynamic frequency scaling reduces the number of instructions a processor can issue in a given amount of time, thus reducing performance. Hence, it is generally used when the workload is not CPU-bound.

Dynamic frequency scaling by itself is rarely worthwhile as a way to conserve switching power. Saving the highest possible amount of power requires dynamic voltage scaling too, because of the V² component and the fact that modern CPUs are strongly optimized for low power idle states. In most constant-voltage cases, it is more efficient to run briefly at peak speed and stay in a deep idle state for longer time (called “race to idle” or computational sprinting), than it is to run at a reduced clock rate for a long time and only stay briefly in a light idle state. However, reducing voltage along with clock rate can change those trade-offs.
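The trade-off described above can be sketched with a simple energy model over a fixed time window: the processor finishes a fixed amount of work at some clock rate, then drops into a deep idle state for the rest of the window. Everything in this sketch is an assumption made for illustration, including the lumped capacitance `c_eff`, the static and idle power figures, and the chosen voltages and frequencies.

```python
def energy_joules(cycles, f_hertz, v_volts, window_s,
                  c_eff=1e-9, p_static_w=0.5, p_idle_w=0.05):
    """Energy used in a fixed window: run `cycles` at f_hertz, then idle.

    c_eff lumps switched capacitance and activity factor together.
    p_static_w is leakage power while active; p_idle_w is deep-idle power.
    All default values are hypothetical.
    """
    t_active = cycles / f_hertz
    if t_active > window_s:
        raise ValueError("work does not fit in the window at this frequency")
    p_dynamic = c_eff * v_volts ** 2 * f_hertz
    return (p_dynamic + p_static_w) * t_active + p_idle_w * (window_s - t_active)


CYCLES = 1e9   # amount of work to complete
WINDOW = 1.0   # seconds available

race_to_idle = energy_joules(CYCLES, 2e9, 1.2, WINDOW)  # full speed, then deep idle
slow_same_v = energy_joules(CYCLES, 1e9, 1.2, WINDOW)   # half clock, same voltage
slow_low_v = energy_joules(CYCLES, 1e9, 0.9, WINDOW)    # half clock, reduced voltage

print(f"race to idle:            {race_to_idle:.3f} J")
print(f"half clock, same V:      {slow_same_v:.3f} J")
print(f"half clock, reduced V:   {slow_low_v:.3f} J")
```

With these assumed figures, running slowly at constant voltage costs more energy than racing to idle, because the dynamic energy per cycle is unchanged while leakage is paid for longer. Lowering the voltage along with the clock reverses the ordering, which is the change in trade-offs the paragraph refers to.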

A related-but-opposite technique is overclocking, whereby processor performance is increased by ramping the processor’s (dynamic) frequency beyond the manufacturer’s design specifications.

One major difference between the two is that in modern PC systems overclocking is mostly done over the Front Side Bus (mainly because the multiplier is normally locked), but dynamic frequency scaling is done with the multiplier. Moreover, overclocking is often static, while dynamic frequency scaling is always dynamic. Software can often incorporate overclocked frequencies into the frequency scaling algorithm, if the chip degradation risks are allowable.

Implementations

Intel’s CPU throttling technology, SpeedStep, is used in its mobile and desktop CPU lines.

AMD employs two different CPU throttling technologies. AMD’s Cool’n’Quiet technology is used on its desktop and server processor lines. The aim of Cool’n’Quiet is not to save battery life, as it is not used in AMD’s mobile processor line, but instead with the purpose of producing less heat, which in turn allows the system fan to spin down to slower speeds, resulting in cooler and quieter operation, hence the name of the technology. AMD’s PowerNow! CPU throttling technology is used in its mobile processor line, though some supporting CPUs like the AMD K6-2+ can be found in desktops as well.

VIA Technologies processors use a technology named LongHaul (PowerSaver), while Transmeta’s version was called LongRun.

The 36-processor AsAP 1 chip is among the first multi-core processor chips to support completely unconstrained clock operation (requiring only that frequencies are below the maximum allowed) including arbitrary changes in frequency, starts, and stops. The 167-processor AsAP 2 chip is the first multi-core processor chip which enables individual processors to make fully unconstrained changes to their own clock frequencies.

According to the ACPI Specs, the C0 working state of a modern-day CPU can be divided into the so-called “P”-states (performance states) which allow clock rate reduction and “T”-states (throttling states) which will further throttle down a CPU (but not the actual clock rate) by inserting STPCLK (stop clock) signals and thus omitting duty cycles.

AMD PowerTune and AMD ZeroCore Power are dynamic frequency scaling technologies for GPUs.

gate length of the process (L_GATE), but more recently some manufacturers have scaled their gate lengths more aggressively than others. This means that today two different 90-nm processes may not have the same device or interconnect dimensions, and it may be that neither has any important dimension that is actually 90 nm. The technology node has become merely a name describing the order of manufacturing generations and the typical 30 percent scaling of dimensions. The important historical trends in microprocessor fabrication demonstrated by Table 1-2 and quasi-ideal interconnect scaling are shown in Table 1-3.
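To make the effect of the typical 30 percent shrink concrete, the sketch below estimates the die area of one design after successive compactions. It assumes an ideal shrink in which every horizontal dimension scales by 0.7, which real compactions only approximate; the starting area is a hypothetical figure.

```python
LINEAR_SHRINK = 0.70  # each horizontal dimension scales to about 70 percent


def die_area_after(area_mm2, generations, shrink=LINEAR_SHRINK):
    """Area of the same design after a number of ideal compactions."""
    return area_mm2 * (shrink ** 2) ** generations


lead_area_mm2 = 200.0  # hypothetical lead design
for generation in range(3):
    area = die_area_after(lead_area_mm2, generation)
    print(f"generation {generation}: {area:.1f} mm^2")
```

Because area scales with the square of the linear dimension, each ideal compaction roughly halves the die area, which is where the cost reduction of a compaction comes from.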
Although it is going from one process generation to the next that gradually moves the semiconductor industry forward, manufacturers do not stand still for the 2 years between process generations. Small incremental improvements are constantly being made to the process that allow for part of the steady improvement in processor frequency. As a result, a compaction microprocessor design may first ship at about the

TABLE 1-3
Microprocessor Fabrication Historical Trends
1) New generation every 2 years
2) 35% reduction in gate length
3) 30% reduction in gate oxide thickness
4) 15% reduction in voltage
5) 30% reduction in interconnect horizontal dimensions
6) 15% reduction in interconnect vertical dimensions
7) Add 1 metal layer every other generation

same frequency as the previous generation, which has been gradually improving since its launch. The motivation for the new compaction is not only the immediate reduction in cost due to a smaller die size, but also the potential that it will eventually scale to frequencies beyond what the previous generation could reach. As an example, the 180-nm generation Intel Pentium® 4 began at a maximum frequency of 1.5 GHz and scaled to 2.0 GHz. The 130-nm Pentium 4 started at 2.0 GHz and scaled to 3.4 GHz. The 90-nm Pentium 4 started at 3.2 GHz. Each new technology generation is planned to start when the previous generation can no longer be easily improved.

The future of Moore’s law

In recent years, the exponential increase with time of almost any aspect of the semiconductor industry has been referred to as Moore’s law. Indeed, things like microprocessor frequency, computer performance, the cost of a semiconductor fabrication plant, or the size of a microprocessor design team have all increased exponentially. No exponential trend can continue forever, and this simple fact has led to predictions of the end of Moore’s law for decades. All these predictions have turned out to be wrong. For 30 years, there have always been seemingly insurmountable problems about 10 years in the future. Perhaps one of the most important lessons of Moore’s law is that when billions of dollars in profits are on the line, incredibly difficult problems can be overcome.

Moore’s law is of course not a “law” but merely a trend that has been true in the past. If it is to remain true in the future, it will be because the industry finds it profitable to continue to solve “insurmountable” problems and force Moore’s law to come true. There have already been a number of new fabrication technologies proposed or put into use that will help continue Moore’s law through 2015.

Multiple threshold voltages. Increasing the threshold voltage dramatically reduces subthreshold leakage. Unfortunately, this also reduces the on current of the device and slows switching. By applying different amounts of dopant to the channels of different transistors, devices with different threshold voltages are made on the same die. When speed is required, low-V_T devices, which are fast but high power, are used. In circuits that do not limit the frequency of the processor, slower, more power-efficient, high-V_T devices are used to reduce overall leakage power. This technique is already in use in the Intel 90-nm fabrication generation (Ghani et al., “90nm Logic Technology”).

Silicon on insulator (SOI)

SOI transistors, as shown in Fig. 1-14, build MOSFETs out of a thin layer of silicon sitting on top of an insulator. This layer of insulation reduces the capacitance of the source and drain regions, improving speed and reducing power. However, creating defect-free crystalline silicon on top of an insulator is difficult. One way to accomplish this is called silicon implanted with oxygen (SIMOX). In this method oxygen atoms are ionized and accelerated at a silicon wafer so that they become embedded beneath the surface. Heating the wafer then causes silicon dioxide to form and damage to the crystal structure of the surface to be repaired. Another way of creating an SOI wafer is to start with two separate wafers. An oxide layer is grown on the surface of one and then this wafer is implanted with hydrogen ions to weaken the wafer just beneath the oxide layer. The wafer is then turned upside down and bonded to a second wafer. The layer of damage caused by the hydrogen acts as a perforation, allowing most of the top wafer to be cut away. Etching then reduces the thickness of the remaining silicon further, leaving just a thin layer of crystal silicon on top. These are known as bonded etched back silicon on insulator (BESOI) wafers. SOI is already in use in the Advanced Micro Devices (AMD®) 90-nm fabrication generation.

Industry need

The implementation of SOI technology is one of several manufacturing strategies employed to allow the continued miniaturization of microelectronic devices, colloquially referred to as “extending Moore’s Law” (or “More Moore”, abbreviated “MM”). Reported benefits of SOI technology relative to conventional silicon (bulk CMOS) processing include:

Lower parasitic capacitance due to isolation from the bulk silicon, which improves power consumption at matched performance
Resistance to latchup due to complete isolation of the n- and p-well structures
Higher performance at equivalent VDD; can work at low VDD
Reduced temperature dependency due to no doping
Better yield due to high density, better wafer utilization
Reduced antenna issues
No body or well taps are needed
Lower leakage currents due to isolation thus higher power efficiency
Inherently radiation hardened (resistant to soft errors), reducing the need for redundancy

From a manufacturing perspective, SOI substrates are compatible with most conventional fabrication processes. In general, an SOI-based process may be implemented without special equipment or significant retooling of an existing factory. Among challenges unique to SOI are novel metrology requirements to account for the buried oxide layer and concerns about differential stress in the topmost silicon layer. The threshold voltage of the transistor depends on the history of operation and applied voltage to it, thus making modeling harder. The primary barrier to SOI implementation is the drastic increase in substrate cost, which contributes an estimated 10–15% increase to total manufacturing costs.

SOI transistors

An SOI MOSFET is a semiconductor device (MOSFET) in which a semiconductor layer such as silicon or germanium is formed on an insulator layer, which may be a buried oxide (BOX) layer formed in a semiconductor substrate. SOI MOSFET devices are adapted for use by the computer industry. The buried oxide layer can be used in SRAM designs. There are two types of SOI devices: PDSOI (partially depleted SOI) and FDSOI (fully depleted SOI) MOSFETs.

For an n-type PDSOI MOSFET, the p-type film sandwiched between the gate oxide (GOX) and the buried oxide (BOX) is thick, so the depletion region cannot cover the whole p region. To some extent, therefore, PDSOI behaves like a bulk MOSFET, though it retains some advantages over bulk devices. The film is very thin in FDSOI devices, so the depletion region covers the whole film. In FDSOI the front gate (GOX) supports fewer depletion charges than in bulk, so an increase in inversion charges occurs, resulting in higher switching speeds. The limitation of the depletion charge by the BOX suppresses the depletion capacitance and therefore substantially reduces the subthreshold swing, allowing FDSOI MOSFETs to work at lower gate bias and so at lower power. The subthreshold swing can reach the minimum theoretical value for a MOSFET at 300 K, which is 60 mV/decade. This ideal value was first demonstrated using numerical simulation. Other drawbacks of bulk MOSFETs, such as threshold voltage roll-off, are reduced in FDSOI since the source and drain electric fields cannot interfere due to the BOX. The main problem in PDSOI is the “floating body effect (FBE)”, since the film is not connected to any of the supplies.
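The 60 mV/decade figure can be checked with the standard textbook expression for subthreshold swing, S = ln(10)·(kT/q)·(1 + C_dep/C_ox). That expression is not given in the text above and is brought in here as an assumption, as are the function name and the example capacitance ratio. The sketch shows how suppressing the depletion capacitance moves the swing toward its thermal limit.

```python
import math

K_BOLTZMANN = 1.380649e-23          # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def subthreshold_swing_mv(temp_k, c_dep_over_c_ox=0.0):
    """Subthreshold swing in mV/decade.

    Uses S = ln(10) * (kT/q) * (1 + C_dep/C_ox), a common first-order model.
    c_dep_over_c_ox = 0 corresponds to fully suppressed depletion capacitance.
    """
    thermal_voltage = K_BOLTZMANN * temp_k / ELEMENTARY_CHARGE
    return 1e3 * math.log(10) * thermal_voltage * (1.0 + c_dep_over_c_ox)


ideal = subthreshold_swing_mv(300.0)            # depletion capacitance suppressed
bulk_like = subthreshold_swing_mv(300.0, 0.3)   # hypothetical bulk-like ratio

print(f"ideal swing at 300 K:      {ideal:.1f} mV/decade")
print(f"with C_dep/C_ox = 0.3:     {bulk_like:.1f} mV/decade")
```

The ideal case evaluates to just under 60 mV/decade at 300 K, consistent with the limit quoted above; any nonzero depletion capacitance raises the swing.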

Manufacture of SOI wafers

SiO₂-based SOI wafers can be produced by several methods:

SIMOX – Separation by IMplantation of OXygen – uses an oxygen ion beam implantation process followed by high temperature annealing to create a buried SiO₂ layer.
Wafer bonding – the insulating layer is formed by directly bonding oxidized silicon with a second substrate. The majority of the second substrate is subsequently removed, the remnants forming the topmost Si layer.
- One prominent example of a wafer bonding process is the Smart Cut method developed by the French firm Soitec which uses ion implantation followed by controlled exfoliation to determine the thickness of the uppermost silicon layer.
- NanoCleave is a technology developed by Silicon Genesis Corporation that separates the silicon via stress at the interface of silicon and silicon-germanium alloy.
- ELTRAN is a technology developed by Canon which is based on porous silicon and water cut.
Seed methods – wherein the topmost Si layer is grown directly on the insulator. Seed methods require some sort of template for homoepitaxy, which may be achieved by chemical treatment of the insulator, an appropriately oriented crystalline insulator, or vias through the insulator from the underlying substrate.

An exhaustive review of these various manufacturing processes may be found in reference

Use in the microelectronics industry

IBM began to use SOI in the high-end RS64-IV “Istar” PowerPC-AS microprocessor in 2000. Other examples of microprocessors built on SOI technology include AMD’s 130 nm, 90 nm, 65 nm, 45 nm and 32 nm single, dual, quad, six and eight core processors since 2001. Freescale adopted SOI in their PowerPC 7455 CPU in late 2001; currently Freescale is shipping SOI products in 180 nm, 130 nm, 90 nm and 45 nm lines. The 90 nm PowerPC- and Power ISA-based processors used in the Xbox 360, PlayStation 3, and Wii use SOI technology as well. Competitive offerings from Intel, however, continue to use conventional bulk CMOS technology for each process node, instead focusing on other avenues such as HKMG and tri-gate transistors to improve transistor performance. In January 2005, Intel researchers reported on an experimental single-chip silicon rib waveguide Raman laser built using SOI.

As for the traditional foundries, in July 2006 TSMC claimed no customer wanted SOI, but Chartered Semiconductor devoted a whole fab to SOI.

Use in high-performance radio frequency (RF) applications

In 1990, Peregrine Semiconductor began development of an SOI process technology utilizing a standard 0.5 μm CMOS node and an enhanced sapphire substrate. Its patented silicon on sapphire (SOS) process is widely used in high-performance RF applications. The intrinsic benefits of the insulating sapphire substrate allow for high isolation, high linearity and electrostatic discharge (ESD) tolerance. Multiple other companies have also applied SOI technology to successful RF applications in smartphones and cellular radios.

Use in photonics

SOI wafers are widely used in silicon photonics. The crystalline silicon layer on insulator can be used to fabricate optical waveguides and other optical devices, either passive or active (e.g. through suitable implantations). The buried insulator enables propagation of infrared light in the silicon layer on the basis of total internal reflection. The top surface of the waveguides can be either left uncovered and exposed to air (e.g. for sensing applications), or covered with a cladding, typically made of silica.

Strained silicon

The ability of charge carriers to move through silicon is improved by placing the crystal lattice under strain. Electrons in the conduction band are not attached to any particular atom and travel more easily when the atoms of the crystal are pulled apart to create more space between them. Depositing silicon nitride on top of the source and drain regions tends to compress these areas. This pulls the atoms in the channel farther apart and improves electron mobility. Holes in the valence band are attached to a particular atom and travel more easily when the atoms of the crystal are pushed together. Depositing germanium atoms, which are larger than silicon atoms, into the source and drain tends to expand these areas. This pushes the atoms in the channel closer together and improves hole mobility. Strained silicon is already in use in the Intel 90-nm fabrication generation.

High-K Gate Dielectric.

Gate oxide layers thinner than 1 nm are only a
few molecules thick and would have very large gate leakage currents.
Replacing the silicon dioxide, which is currently used in gate oxides, with
a higher permittivity material strengthens the electric field reaching the
channel. This allows for thicker gate oxides to provide the same control
of the channel at dramatically lower gate leakage currents.

Need for high-κ materials

Silicon dioxide (SiO₂) has been used as a gate oxide material for decades. As transistors have decreased in size, the thickness of the silicon dioxide gate dielectric has steadily decreased to increase the gate capacitance and thereby the drive current, raising device performance. As the thickness scales below 2 nm, leakage currents due to tunneling increase drastically, leading to high power consumption and reduced device reliability. Replacing the silicon dioxide gate dielectric with a high-κ material allows increased gate capacitance without the associated leakage effects.

First principles

The gate oxide in a MOSFET can be modeled as a parallel plate capacitor. Ignoring quantum mechanical and depletion effects from the Si substrate and gate, the capacitance C of this parallel plate capacitor is given by

$$C = \frac{\kappa \varepsilon_0 A}{t}$$

Where

A is the capacitor area
κ is the relative dielectric constant of the material (3.9 for silicon dioxide)
ε₀ is the permittivity of free space
t is the thickness of the capacitor oxide insulator

Since leakage limitation constrains further reduction of t, an alternative method to increase gate capacitance is to alter κ by replacing silicon dioxide with a high-κ material. In such a scenario, a thicker gate oxide layer might be used, which can reduce the leakage current flowing through the structure as well as improve the gate dielectric reliability.
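The parallel plate model above can be sketched numerically to show how much thicker a high-κ layer can be while matching the capacitance of a thin silicon dioxide layer. The value κ = 3.9 for silicon dioxide comes from the text; the high-κ value of 20, the 1.2 nm oxide thickness and the gate area are hypothetical figures chosen for illustration.

```python
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
KAPPA_SIO2 = 3.9              # relative dielectric constant of silicon dioxide


def gate_capacitance(kappa, area_m2, thickness_m):
    """Parallel plate capacitance C = kappa * eps0 * A / t, in farads."""
    return kappa * EPSILON_0 * area_m2 / thickness_m


def equal_capacitance_thickness(kappa, t_sio2_m):
    """Thickness of a kappa dielectric that matches a SiO2 layer of t_sio2_m."""
    return t_sio2_m * kappa / KAPPA_SIO2


AREA = 50e-9 * 1e-6   # hypothetical gate area: 50 nm by 1 um
T_SIO2 = 1.2e-9       # hypothetical SiO2 thickness
KAPPA_HIGH = 20.0     # hypothetical high-k dielectric constant

t_high_k = equal_capacitance_thickness(KAPPA_HIGH, T_SIO2)
c_sio2 = gate_capacitance(KAPPA_SIO2, AREA, T_SIO2)
c_high_k = gate_capacitance(KAPPA_HIGH, AREA, t_high_k)

print(f"SiO2 at {T_SIO2 * 1e9:.1f} nm:    {c_sio2 * 1e15:.3f} fF")
print(f"high-k at {t_high_k * 1e9:.2f} nm: {c_high_k * 1e15:.3f} fF")
```

Under these assumptions the high-κ film can be about five times thicker for the same capacitance, which is the extra physical thickness that suppresses tunneling leakage.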

Gate capacitance impact on drive current

The drain current I_D for a MOSFET can be written (using the gradual channel approximation) as

$$I_{D,\text{sat}} = \frac{W}{L}\,\mu\,C_{\text{inv}}\,\frac{(V_G - V_{\text{th}})^2}{2}$$

Where

W is the width of the transistor channel
L is the channel length
μ is the channel carrier mobility (assumed constant here)
C_inv is the capacitance density associated with the gate dielectric when the underlying channel is in the inverted state
V_G is the voltage applied to the transistor gate
V_th is the threshold voltage

The term V_G − V_th is limited in range due to reliability and room temperature operation constraints, since a too large V_G would create an undesirable, high electric field across the oxide. Furthermore, V_th cannot easily be reduced below about 200 mV, because leakage currents due to increased oxide leakage (that is, assuming high-κ dielectrics are not available) and subthreshold conduction raise stand-by power consumption to unacceptable levels. (See the industry roadmap, which limits threshold to 200 mV, and Roy et al.) Thus, according to this simplified list of factors, an increased I_D,sat requires a reduction in the channel length or an increase in the gate dielectric capacitance.
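The conclusion of this argument can be illustrated by evaluating the saturation current expression directly. The device dimensions, mobility, capacitance density and voltages below are hypothetical round numbers, and the simplified model ignores effects such as velocity saturation, so the absolute values should not be read as realistic.

```python
def saturation_current(width_m, length_m, mobility, c_inv, v_gate, v_th):
    """I_D,sat = (W / L) * mu * C_inv * (V_G - V_th)^2 / 2, in amperes.

    mobility is in m^2/(V*s) and c_inv in F/m^2 (capacitance per unit area).
    """
    return (width_m / length_m) * mobility * c_inv * (v_gate - v_th) ** 2 / 2.0


# Hypothetical device: W = 1 um, L = 100 nm, mu = 300 cm^2/(V*s).
base = saturation_current(1e-6, 1e-7, 0.03, 0.02, 1.2, 0.3)

# With V_G and V_th fixed, the remaining levers are C_inv and L.
higher_c = saturation_current(1e-6, 1e-7, 0.03, 0.04, 1.2, 0.3)
shorter_l = saturation_current(1e-6, 0.5e-7, 0.03, 0.02, 1.2, 0.3)

print(f"baseline:          {base * 1e3:.2f} mA")
print(f"double C_inv:      {higher_c / base:.2f}x")
print(f"half channel len:  {shorter_l / base:.2f}x")
```

Doubling the gate capacitance density or halving the channel length each doubles the modeled drive current, matching the two options named in the text.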

Materials and considerations

Replacing the silicon dioxide gate dielectric with another material adds complexity to the manufacturing process. Silicon dioxide can be formed by oxidizing the underlying silicon, ensuring a uniform, conformal oxide and high interface quality. As a consequence, development efforts have focused on finding a material with a requisitely high dielectric constant that can be easily integrated into a manufacturing process. Other key considerations include band alignment to silicon (which may alter leakage current), film morphology, thermal stability, maintenance of a high mobility of charge carriers in the channel and minimization of electrical defects in the film/interface. Materials which have received considerable attention are hafnium silicate, zirconium silicate, hafnium dioxide and zirconium dioxide, typically deposited using atomic layer deposition.

It is expected that defect states in the high-k dielectric can influence its electrical properties. Defect states can be measured for example by using zero-bias thermally stimulated current, zero-temperature-gradient zero-bias thermally stimulated current spectroscopy, or inelastic electron tunneling spectroscopy (IETS).

Improved interconnects.

Improvements in interconnect capacitance are possible through further reductions in the permittivity of interlevel dielectrics. However, improvements in resistance are probably not possible. Quasi-ideal interconnect scaling will rapidly reach aspect ratios over 2, beyond which fabrication and cross talk noise with neighboring wires become serious problems. The only element with less resistivity than copper is silver, but it offers only a 10 percent improvement and is very susceptible to electromigration. So, it seems unlikely that any practical replacement for copper will be found, and yet at dimensions below about 0.2 μm the resistivity of copper wires rapidly increases. The density of free electrons and the average distance a free electron travels before colliding with an atom determine the resistivity of a bulk conductor. In wires whose dimensions approach the mean free path length, the number of collisions is increased by the boundaries of the wire itself. The poor scaling of interconnect delays may have to be compensated for by scaling the upper levels of metal more slowly and adding new metal layers more rapidly to continue to provide enough

TABLE 1-4
Microprocessor Fabrication Projection (2005–2015)
1) New generation every 2–3 years
2) 30% reduction in gate length
3) 30% increase in gate capacitance through high-K materials
4) 15% reduction in voltage
5) 30% reduction in interconnect horizontal and vertical dimensions for lower metal layers
6) 15% reduction in interconnect horizontal and vertical dimensions for upper metal layers
7) Add 1 metal layer every generation

connections. Improving the scaling of interconnects is currently the
greatest challenge to the continuation of Moore’s law.

Double and Triple gate

Another way to provide the gate more control over the channel is to wrap the gate wire around two or three sides of a raised strip of silicon. In a triple gate device the channel is like a tunnel with the gate forming both sides and the roof (Fig. 1-15). This allows strong electric fields from the gate to penetrate the silicon and increases on current while reducing leakage currents. These ideas allow at least an educated guess as to what the scaling of devices may look like over the next 10 years (Table 1-4).

Conclusion

Picturing the scaling of devices beyond 2015 becomes difficult. There is
no reason why all the ideas discussed already could not be combined,
creating a triple high-K gate strained silicon-on-insulator MOSFET. If
this does happen, a high priority will have to be finding a better name.
Although these combinations would provide further improvement, at
current scaling rates the gate length of a 2030 transistor would be only
0.5 nm (about two silicon atoms across). It’s not clear what a transistor
at these dimensions would look like or how it would operate. As always,
our predictions for semiconductor technology can only see about 10 years
into the future.
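The extrapolation behind the 2030 estimate can be sketched as compound scaling. The 30 percent reduction per generation and the 2 to 3 year cadence come from the tables above; the starting point of a 50 nm gate length in 2005 is an assumption made for this sketch, and the result is quite sensitive to it and to the cadence.

```python
def projected_gate_length_nm(start_nm, start_year, target_year,
                             years_per_generation=2.0, shrink=0.70):
    """Gate length after compound scaling by `shrink` once per generation."""
    generations = (target_year - start_year) / years_per_generation
    return start_nm * shrink ** generations


START_NM = 50.0    # assumed gate length at the start of the projection
START_YEAR = 2005  # assumed starting year

fast = projected_gate_length_nm(START_NM, START_YEAR, 2030, years_per_generation=2.0)
slow = projected_gate_length_nm(START_NM, START_YEAR, 2030, years_per_generation=3.0)

print(f"2030, generation every 2 years: {fast:.2f} nm")
print(f"2030, generation every 3 years: {slow:.2f} nm")
```

With a new generation every 2 years, the assumed starting point shrinks to roughly half a nanometer by 2030, in line with the figure quoted above. A 3-year cadence gives a result several times larger, which shows how loosely such projections are constrained.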
Nanotechnology start-ups have trumpeted the possibility of single molecule structures, but these high hopes have had no real impact on the semiconductor industry of today. While there is the chance that carbon tubules or other single molecule structures will be used in everyday semiconductor products someday, it is highly unlikely that a technological leap will suddenly make this commonplace. As exciting as it is to think about structures one-hundredth the size of today’s devices, of more immediate value is how to make devices two-thirds the size. Moore’s law will continue, but it will continue through the steady evolution that has brought us so far already.

The Evolution of the Microprocessor Part 3
