
Chapter 2  RISC Machines

2.1  Introduction

As we saw in chapter 1, the general trend from the 1960s to the mid 1980s was for machine instruction sets to become more sophisticated. In the early 1980s, a different concept emerged — the Reduced Instruction Set Computer, or RISC machine.

The basic idea was this: when constructing a microprocessor (or any machine), we have limited resources available, and we should therefore devote these resources to the most used functions. The trouble was that very little work had been done on trying to discover what these most used functions actually were (in chapter 3 we will look at the results of some analysis).

Hardware designers had been assuming that the most productive way to use the available resources was to build machines with complex instruction sets that mirrored high level languages, in an attempt to make life easier for compiler writers. However, as we have seen, some doubt had already been expressed about this. It had been pointed out that most compiler-generated code used vast numbers of the basic, simple instructions, and very few of the sophisticated ones.

2.2  Early History

The origins of RISC machines go back some way. We tend to think that techniques like pipelining (chapter 4) are new, but they are not. There were machines in the 60s that were heavily pipelined: the `supercomputers' of their day. These were machines like the top-of-the-range IBM 360s, the Control Data 6600 and 7600, and later the Cray-1 and Control Data Cyber 205. The architects of these machines understood that to effectively use pipelining they needed simple instructions with uniform lengths and execution times [1]. These particular machines were very much special-purpose, so they did not particularly influence `general' machine architectures at the time (though they have since). It is also worth noting that a couple of the ideas that later became important indicators of whether a computer is a RISC or not were seen in the 1956 Pegasus, made by Ferranti, so a lot of the fundamental ideas are pretty old.

The first widely-acknowledged RISC machine was the IBM 801, produced by a project led by John Cocke that started in 1975. The 801 predates the idea of the microprocessor as a `serious' computer; it was in fact a minicomputer project. Unfortunately, little information came out until much later: IBM is a commercial company (at the time, the largest computer company in the world) and it was a commercial project. It is also claimed that IBM were worried about undermining their own product line, since 801-based machines would allegedly outperform, and be much cheaper than, many of their existing low-end machines [2] (this was pre-PC, remember). Certainly 801-based products did not appear (though there were some that derived from the work, much later).

In the early 80s, a research group at Berkeley led by Patterson (of `Hennessy and Patterson') started to investigate the design of a machine based on the principle of `only put in what is most necessary'. This was a microprocessor-based project, and would become the RISC I and RISC II processors which we will look at in the next chapter. (There were also a RISC Blue and a RISC Gold.) These machines are the origin of the acronym `RISC': Reduced Instruction Set Computer. This name actually makes what is a sophisticated set of ideas and principles sound very simple (just take out some instructions), which is potentially misleading.

A third project, at Stanford and led by Hennessy (also of `Hennessy & Patterson'), started at about the same time. This led to the MIPS processors. The name stood for Microprocessor without Interlocked Pipeline Stages, which referred to the policy of using the compiler rather than hardware (`interlocks') to ensure that Very Bad Things did not happen during pipelined program execution. (We will look at these Very Bad Things in chapter 4.)

2.3  Subsequent Developments

Almost all subsequent RISC machines derive from these three early examples. The most obviously linked is the MIPS series, which derived directly from the Stanford original. RISC II-derived examples include the Pyramid mainframe series (the University owned one of these from 1987 to 1994), and more successfully Sun's SPARC series. The IBM 801 was followed by a series of RISC processors starting with the R6000 and leading ultimately to the POWER and PowerPC series. Since then, there has been much merging of concepts from the original designs, and many early ideas have been dropped as a result of technological developments (for example, delayed branches; see section 5.4).

In fact, there is very little conceptual difference between the current MIPS, SPARC, POWER and PowerPC architectures, or the DEC Alpha and HP-PA architectures. To some extent, it could be argued that earlier, `non-RISC' concepts have crept back in…

For example, the earliest RISC processors only contained hardware support for integer data types. Modern processors also include, for instance, floating point data types; the received wisdom, traditionally, is that few programs use floating point, and a key RISC tenet is that you should not include things that will not get used much [3]. The argument used to justify the inclusion of floating point hardware was that available resources had grown and it was now possible to justify devoting a proportion to, for example, floating point. This may not give the best overall performance, but failing to do so would mean that floating point operations would be too slow (even though there may not be that many).

This is completely reasonable, but (a) is not in line with the original, quite `hard-line' RISC philosophy, and (b) raises the question: what exactly is a RISC processor? This turns out to be quite hard to answer. Here is a list of possible criteria that have been used in the past.

  1. Instructions are conceptually simple — that is, no baroque things like `evaluate polynomial', or `edit string', both of which were found in the VAX.
  2. Instructions are uniform length, as opposed to, say, the VAX or M68000, which have a wide range of instruction lengths.
  3. Instructions use one, or very few, formats — again, unlike the VAX or M68000.
  4. The instruction set is orthogonal — that is, there are no special rules about what operations are permitted with particular addressing modes (which would complicate the life of a compiler writer).
  5. There is only one addressing mode, or very few.
  6. The architecture is load-and-store — that is, only load and store operations access memory — all operate instructions (e.g. arithmetic) only operate on registers.
  7. The architecture supports two (or perhaps a few more) datatypes — integer and floating point usually.

If a processor possesses a majority of these properties, we can claim it is a RISC processor. For example, a VAX satisfies none of these, and the RISC II architecture satisfies all of them. One of the problems with this list is that it is a bit subjective (`few', `conceptually'). In practice, there are only three in the list that do not have this problem (2, 4, and 6), and almost all RISCs satisfy all of these, with the occasional exception of 2. (The reason 2 is sometimes not met is usually to reduce code size in, say, embedded applications where memory is limited. Instead of requiring all instructions to be, say, 32 bits, some of the simpler ones may be 16 bits.)
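
The `majority' rule above can be sketched as a simple checklist. The following is only an illustration in Python: the short criterion labels and the function name looks_like_risc are made up for this sketch, and the scoring of the VAX and RISC II simply follows the claims made in the text.

```python
# The seven criteria from the list above, as short (hypothetical) labels.
CRITERIA = [
    "simple_instructions",   # 1
    "uniform_length",        # 2
    "few_formats",           # 3
    "orthogonal",            # 4
    "few_addressing_modes",  # 5
    "load_store",            # 6
    "few_datatypes",         # 7
]


def looks_like_risc(satisfied):
    """Return True if a strict majority of the seven criteria are satisfied."""
    satisfied = set(satisfied)
    unknown = satisfied - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return len(satisfied) > len(CRITERIA) / 2


vax = set()              # the text says the VAX satisfies none of them
risc_ii = set(CRITERIA)  # and that RISC II satisfies all of them

print(looks_like_risc(vax))      # False
print(looks_like_risc(risc_ii))  # True
```

Under this strict reading, a processor meeting only the three less subjective criteria (2, 4 and 6) would not pass on those alone, so the checklist is best treated as a rough guide rather than a definition.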

2.4  Current RISC (and non-RISC) Processors

If we restrict ourselves to relatively high-profile processors (i.e. those likely to be found in desktop machines or servers, rather than simple control applications, where successful processors include ARM), the current leading RISC architectures are:

2.4.1  SPARC

SPARC is Sun's architecture, found typically in SparcStations and elsewhere. Like many current architectures, it has been implemented by a number of different companies at one time or another. (The first time this happened was after the main architect of the IBM 360 – Gene Amdahl – left to form his own company making 360 code-compatible machines. Legal action was necessary to get IBM to sell software to companies that bought the machines. Now it is of course considered completely acceptable to manufacture a processor that implements someone else's architecture.)

2.4.2  Alpha

Alpha was DEC's successor to the VAX series of processors. The VAX was the archetypical CISC (Complex Instruction Set Computer), the antithesis of the RISC concept. In its time (essentially the 1980s) it was very successful: many Universities in the developed world outside the Soviet Bloc, including this one, owned VAXen (partly because, in addition to DEC's own VMS operating system, there was a very popular port of Berkeley Unix available). However, technological developments, and the emergence of RISC, made the VAX obsolete. DEC replaced it with the Alpha-series processors, which are generally the fastest you can get at any one time (they are aimed at specialised, high-performance applications).

Alpha was the first 64-bit architecture to be generally available, though there are now others of course (for example, SPARC and MIPS). However, DEC (Digital Equipment Corporation) never really recovered from the rapid decline of the VAX, and a few years ago were bought out by Compaq, who have since merged with Hewlett Packard. At one point Digital were the second largest computer company in the world. Now, Alpha is handled by Compaq/HP's Alpha Systems Division, who have announced that development is to cease; sales of existing Alphas ended in 2006.

2.4.3  PowerPC

The PowerPC is a collaboration between Apple, IBM and Motorola, with Motorola & IBM providing the actual processors. The current state of the art is represented by the 74xx (G4) series, the 750 (G3) series and the new 970 (G5) series. This architecture is related to, but is not exactly the same as, IBM's POWER architecture. PowerPC is the only RISC architecture that has any real level of penetration in the `consumer' and `non-specialist' desktop market.

Initially, Big Things were expected, with a range of companies supplying machines with a high level of compatibility (the Common Hardware Reference Platform – CHRP). Since the companies involved were large, they hoped to be able to get a significant part of the Microsoft/Intel market, but failed. For quite some time the only major user of PowerPC-based processors was Apple; they have now stopped, and gaming consoles are the main market (PS3, Xbox360, Wii).

Motorola's position in the consortium is questionable. They alone produced the G4s, which failed to keep up in performance terms with Intel and AMD. G5s (used in new Apple desktops) and G3s (used in iBooks) were made by IBM, with G4s used in PowerBooks and iMacs. We will look at some of the interesting background to this later, in chapter 13.

2.4.4  IA64/Itanium

IA64 is a 64-bit architecture developed jointly by Hewlett-Packard and Intel; Itanium (previously Merced) was the first implementation (see chapter 12). This is not a conventional RISC architecture, though it does have many RISC concepts, and it is based to an extent on HP's HP-PA RISC architecture. It is not a `major' architecture, but it may possibly become one in the future. It is, at least, technologically significant and we will look at it in more detail later.

2.4.5  MIPS

MIPS no longer manufactures stand-alone microprocessors. Instead they design and supply cores, or embedded cores. These are essentially microprocessor designs which are supplied to, and embedded in, chips manufactured by others. For example, you may be designing a specialised chip, or chip family, for some application (set-top boxes are a big market for MIPS currently) which requires a microprocessor among other things.

By buying a processor design (or more properly, the rights to use a processor core design) you save lots of work. You do not have to design a processor yourself, which is a specialised and expensive task, and you can buy in to lots of already-available software support. Such processor designs are often called IP cores, because what is actually being sold is intellectual property (IP) rather than anything tangible. This is also a major business area for other companies (e.g. ARM).

Paradoxically, the only high-profile, high-performance architecture that is not RISC is the most successful…

2.4.6  x86

The architecture in question is Intel's x86, or IA32 (32-bit Intel Architecture), as used by the Pentium series, and the 80386 and 80486 before that (the earlier processors in the series – 8086/8088, 80186, 80286 – had simpler architectures); it is also currently implemented by Advanced Micro Devices (AMD). However, even that (in current Intel implementations) uses RISC concepts internally, as we will see when we look at the P6 in chapter 11.

The primary reason that IA32 has survived (let alone dominated) is simply that IBM chose its predecessor (the 8088) as the basis for the first IBM PC, which as we all know has consequently dominated the market. The 8088 was a version of Intel's then top of the range processor, the 16-bit 8086, with an 8-bit external data bus. IBM chose this version to save money on the motherboard (narrow buses). Supposedly, IBM initially approached Motorola about using the 68000, which at the time was a more powerful processor. Because so many x86 processors have been (and are being) sold, Intel benefits from economies of scale: they can afford to spend a lot of resources on translating the actual IA32 instructions into an internal RISC instruction set and still sell processors at competitive prices. If they were only selling the same volume as, say, the PowerPC, then this might not be possible.

However, there are penalties.

First, power dissipation: Intel's Pentium-series processors use several times as much power as the PowerPC equivalent (hence the massive heatsinks with fans bolted on, though current G5s seem to be catching up a bit here).

Second, Pentium-series clock rates need to be significantly higher than their RISC equivalents for equivalent performance (something exacerbated by possibly-marketing-influenced decisions made by Intel: see the P6).

How much depends on the application, but for a computationally expensive task (e.g. using Maude to verify a microprocessor), a 670MHz Pentium III was only about 25% faster than a 375MHz G3, both running Linux and using the same compiler (gcc). (Lots of such `comparisons' are, of course, essentially artificial and rigged for marketing purposes.)
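
To see what this comparison implies about work done per clock cycle, the figures can be combined as follows. This is a rough sketch in Python; it assumes that `25% faster' means the Pentium III completed 1.25 times as much work per unit time as the G3, and the function name per_clock_ratio is hypothetical.

```python
def per_clock_ratio(speedup_a_over_b, clock_a_mhz, clock_b_mhz):
    """Work done per clock cycle by machine A, relative to machine B.

    speedup_a_over_b is how many times faster A finished the task than B.
    """
    clock_ratio = clock_a_mhz / clock_b_mhz
    return speedup_a_over_b / clock_ratio


# A = 670MHz Pentium III, B = 375MHz G3, figures taken from the text.
ratio = per_clock_ratio(1.25, 670, 375)

print(f"Pentium III work per clock relative to G3: {ratio:.2f}")      # 0.70
print(f"G3 work per clock relative to Pentium III: {1 / ratio:.2f}")  # 1.43
```

On this reading, the G3 did a little over 40% more work per clock cycle on that particular task. It is a single data point for one application, so it should not be generalised.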

The reasons for this are quite complex, but we will look at some of them later when we start to deal with pipelining and when we specifically study the P6 example. However, here is one example: because of the extra translation from one instruction set to another, the pipelines of Pentium-series processors (except for the original Pentium) are longer than those of comparable RISC processors. This means that if for any reason the pipeline empties (we will see some reasons later, in chapters 4 and 5), it takes longer to refill it, meaning the penalty is more significant.
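
A very simple model illustrates why a longer pipeline makes an emptied pipeline more expensive. The sketch below (in Python) assumes an idealised pipeline that completes one instruction per cycle once it is full, and charges depth - 1 cycles to fill it initially and again after each time it empties. The pipeline depths, instruction count and number of flushes are arbitrary illustrative values, not measurements of any real processor, and the function name is hypothetical.

```python
def total_cycles(instructions, depth, flushes):
    """Cycles to run a program on an idealised pipeline.

    Assumes one instruction completes per cycle when the pipeline is full,
    and that each fill (the initial one, plus one per flush) costs
    depth - 1 extra cycles.
    """
    fill_cost = depth - 1
    return instructions + fill_cost * (1 + flushes)


# Illustrative values only: same program, same number of flushes.
short_pipe = total_cycles(instructions=1000, depth=5, flushes=50)
long_pipe = total_cycles(instructions=1000, depth=12, flushes=50)

print(short_pipe)  # 1204
print(long_pipe)   # 1561
```

In this model the cost of each refill grows linearly with pipeline depth, so the same number of flushes costs the deeper pipeline proportionally more cycles.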

2.5  Lightbulb thought experiment

Incidentally, here is a small thought experiment. Estimate the heat dissipation per square cm of the surface of a lightbulb. To do this you need to make some simplifying assumptions to keep the arithmetic reasonably simple and you need to know the formula for the surface area of a sphere. (Assuming a light bulb is a sphere is the first assumption: the others concern (a) the dimensions of the bulb; (b) the wattage of the bulb; and (c) what proportion of energy is dissipated as heat instead of light.) Once you have the answer, consider that a typical Pentium-class processor will be dissipating some tens of Watts, and it has a surface area of a couple of square cm. It should then be obvious why the fans and heatsinks are needed.
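
One possible set of assumptions is worked through below in Python. All the numbers are guesses chosen to keep the arithmetic simple (a 60 W bulb, 6 cm in diameter, with 90% of its energy dissipated as heat, and a processor dissipating 30 W over 2 square cm); other reasonable choices give different figures but a similar conclusion. The function names are hypothetical.

```python
import math


def sphere_area_cm2(diameter_cm):
    """Surface area of a sphere, 4 * pi * r^2, in square cm."""
    radius = diameter_cm / 2
    return 4 * math.pi * radius ** 2


def heat_flux(power_w, area_cm2, heat_fraction=1.0):
    """Heat dissipated per square cm of surface, in W/cm^2."""
    return power_w * heat_fraction / area_cm2


# Assumed values for the bulb (guesses, not from the text).
bulb_flux = heat_flux(60.0, sphere_area_cm2(6.0), heat_fraction=0.9)

# Assumed values for the processor, within the text's "some tens of Watts"
# and "a couple of square cm"; all of its power is treated as heat.
cpu_flux = heat_flux(30.0, 2.0)

print(f"bulb:  {bulb_flux:.2f} W/cm^2")         # bulb:  0.48 W/cm^2
print(f"cpu:   {cpu_flux:.2f} W/cm^2")          # cpu:   15.00 W/cm^2
print(f"ratio: {cpu_flux / bulb_flux:.0f}x")    # ratio: 31x
```

With these particular guesses the processor dissipates roughly thirty times as much heat per square cm as the bulb.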

[1] Actually, the `simple' is not necessary for pipelining: you could, in principle, have only complex instructions that were all the same length and had the same execution times. However, it is much easier to do without complex instructions than to try and do away with the simple ones.
[2] This kind of concern is rife in commercial settings; see for a nice tale of a company deliberately `sabotaging' their products for commercial reasons.
[3] Actually, given the growth in graphics-based applications, this traditional view of floating point may no longer be true, but the trend to include floating point hardware pre-dates this.
