Author Archives: Eben Upton

Happy birthday to us!

via Raspberry Pi

The eagle-eyed among you may have noticed that today is 28 February, which is as close as you’re going to get to our sixth birthday, given that we launched on a leap day. For the last three years, we’ve launched products on or around our birthday: Raspberry Pi 2 in 2015; Raspberry Pi 3 in 2016; and Raspberry Pi Zero W in 2017. But today is a snow day here at Pi Towers, so rather than launching something, we’re taking a photo tour of the last six years of Raspberry Pi products before we don our party hats for the Raspberry Jam Big Birthday Weekend this Saturday and Sunday.

Prehistory

Before there was Raspberry Pi, there was the Broadcom BCM2763 ‘micro DB’, designed, as it happens, by our very own Roger Thornton. This was the first thing we demoed as a Raspberry Pi in May 2011, shown here running an ARMv6 build of Ubuntu 9.04.

BCM2763 micro DB

Ubuntu on Raspberry Pi, 2011-style

A few months later, along came the first batch of 50 “alpha boards”, designed for us by Broadcom. I used to have a spreadsheet that told me where in the world each one of these lived. These are the first “real” Raspberry Pis, built around the BCM2835 application processor and LAN9512 USB hub and Ethernet adapter; remarkably, a software image taken from the download page today will still run on them.

Raspberry Pi alpha board, top view

Raspberry Pi alpha board

We shot some great demos with this board, including this video of Quake III:

Raspberry Pi – Quake 3 demo

A little something for the weekend: here’s Eben showing the Raspberry Pi running Quake 3, and chatting a bit about the performance of the board. Thanks to Rob Bishop and Dave Emett for getting the demo running.

Pete spent the second half of 2011 turning the alpha board into a shippable product, and just before Christmas we produced the first 20 “beta boards”, 10 of which were sold at auction, raising over £10000 for the Foundation.

The beginnings of a Bramble

Beta boards on parade

Here’s Dom, demoing both the board and his excellent taste in movie trailers:

Raspberry Pi Beta Board Bring up

See http://www.raspberrypi.org/ for more details, FAQ and forum.

Launch

Rather to Pete’s surprise, I took his beta board design (with a manually-added polygon in the Gerbers taking the place of Paul Grant’s infamous red wire), and ordered 2000 units from Egoman in China. After a few hiccups, units started to arrive in Cambridge, and on 29 February 2012, Raspberry Pi went on sale for the first time via our partners element14 and RS Components.

Pallet of pis

The first 2000 Raspberry Pis

Unboxing continues

The first Raspberry Pi from the first box from the first pallet

We took over 100000 orders on the first day: something of a shock for an organisation that had imagined in its wildest dreams that it might see lifetime sales of 10000 units. Some people who ordered that day had to wait until the summer to finally receive their units.

Evolution

Even as we struggled to catch up with demand, we were working on ways to improve the design. We quickly replaced the USB polyfuses in the top right-hand corner of the board with zero-ohm links to reduce IR drop. If you have a board with polyfuses, it’s a real limited edition; even more so if it also has Hynix memory. Pete’s “rev 2” design made this change permanent, tweaked the GPIO pin-out, and added one much-requested feature: mounting holes.

Revision 1 versus revision 2

If you look carefully, you’ll notice something else about the revision 2 board: it’s made in the UK. 2012 marked the start of our relationship with the Sony UK Technology Centre in Pencoed, South Wales. In the five years since, they’ve built every product we offer, including more than 12 million “big” Raspberry Pis and more than one million Zeros.

Celebrating 500,000 Welsh units, back when that seemed like a lot

Economies of scale, and the decline in the price of SDRAM, allowed us to double the memory capacity of the Model B to 512MB in the autumn of 2012. And as supply of Model B finally caught up with demand, we were able to launch the Model A, delivering on our original promise of a $25 computer.

A UK-built Raspberry Pi Model A

In 2014, James took all the lessons we’d learned from two-and-a-bit years in the market, and designed the Model B+, and its baby brother the Model A+. The Model B+ established the form factor for all our future products, with a 40-pin extended GPIO connector, four USB ports, and four mounting holes.

The Raspberry Pi 1 Model B+ — entering the era of proper product photography with a bang.

New toys

While James was working on the Model B+, Broadcom was busy behind the scenes developing a follow-on to the BCM2835 application processor. BCM2836 samples arrived in Cambridge at 18:00 one evening in April 2014 (chips never arrive at 09:00 — it’s always early evening, usually just before a public holiday), and within a few hours Dom had Raspbian, and the usual set of VideoCore multimedia demos, up and running.

We launched Raspberry Pi 2 at the start of 2015, pairing BCM2836 with 1GB of memory. With a quad-core Arm Cortex-A7 clocked at 900MHz, we’d increased performance sixfold, and memory fourfold, in just three years.

Nobody mention the zenon death flash.

And of course, while James was working on Raspberry Pi 2, Broadcom was developing BCM2837, with a quad-core 64-bit Arm Cortex-A53 clocked at 1.2GHz. Raspberry Pi 3 launched barely a year after Raspberry Pi 2, providing a further doubling of performance and, for the first time, wireless LAN and Bluetooth.

All our recent products are just the same board shot from different angles

Zero to hero

Where the PC industry has historically used Moore’s Law to “fill up” a given price point with more performance each year, the original Raspberry Pi used Moore’s law to deliver early-2000s PC performance at a lower price. But with Raspberry Pi 2 and 3, we’d gone back to filling up our original $35 price point. After the launch of Raspberry Pi 2, we started to wonder whether we could pull the same trick again, taking the original Raspberry Pi platform to a radically lower price point.

The result was Raspberry Pi Zero. Priced at just $5, with a 1GHz BCM2835 and 512MB of RAM, it was cheap enough to bundle on the front of The MagPi, making us the first computer magazine to give away a computer as a cover gift.

Cheap thrills

MagPi issue 40 in all its glory

We followed up with the $10 Raspberry Pi Zero W, launched exactly a year ago. This adds the wireless LAN and Bluetooth functionality from Raspberry Pi 3, using a rather improbable-looking PCB antenna designed by our buddies at Proant in Sweden.

Up to our old tricks again

Other things

Of course, this isn’t all. There has been a veritable blizzard of point releases; RAM changes; Chinese red units; promotional blue units; Brazilian blue-ish units; not to mention two Camera Modules, in two flavours each; a touchscreen; the Sense HAT (now aboard the ISS); three compute modules; and cases for the Raspberry Pi 3 and the Zero (the former just won a Design Effectiveness Award from the DBA). And on top of that, we publish three magazines (The MagPi, Hello World, and HackSpace magazine) and a whole host of Project Books and Essentials Guides.

Chinese Raspberry Pi 1 Model B

RS Components limited-edition blue Raspberry Pi 1 Model B

Brazilian-market Raspberry Pi 3 Model B

Visible-light Camera Module v2

Learning about injection moulding the hard way

250 pages of content each month, every month

Essential reading

Forward the Foundation

Why does all this matter? Because we’re providing everyone, everywhere, with the chance to own a general-purpose programmable computer for the price of a cup of coffee; because we’re giving people access to tools to let them learn new skills, build businesses, and bring their ideas to life; and because when you buy a Raspberry Pi product, every penny of profit goes to support the Raspberry Pi Foundation in its mission to change the face of computing education.

We’ve had an amazing six years, and they’ve been amazing in large part because of the community that’s grown up alongside us. This weekend, more than 150 Raspberry Jams will take place around the world, comprising the Raspberry Jam Big Birthday Weekend.

Raspberry Pi Big Birthday Weekend 2018. GIF with confetti and bopping JAM balloons

If you want to know more about the Raspberry Pi community, go ahead and find your nearest Jam on our interactive map — maybe we’ll see you there.

The post Happy birthday to us! appeared first on Raspberry Pi.

Why Raspberry Pi isn’t vulnerable to Spectre or Meltdown

via Raspberry Pi

Over the last couple of days, there has been a lot of discussion about a pair of security vulnerabilities nicknamed Spectre and Meltdown. These affect all modern Intel processors, and (in the case of Spectre) many AMD processors and ARM cores. Spectre allows an attacker to bypass software checks to read data from arbitrary locations in the current address space; Meltdown allows an attacker to read arbitrary data from the operating system kernel’s address space (which should normally be inaccessible to user programs).

Both vulnerabilities exploit performance features (caching and speculative execution) common to many modern processors to leak data via a so-called side-channel attack. Happily, the Raspberry Pi isn’t susceptible to these vulnerabilities, because of the particular ARM cores that we use.

To help us understand why, here’s a little primer on some concepts in modern processor design. We’ll illustrate these concepts using simple programs in Python syntax like this one:

t = a+b
u = c+d
v = e+f
w = v+g
x = h+i
y = j+k

While the processor in your computer doesn’t execute Python directly, the statements here are simple enough that they roughly correspond to a single machine instruction. We’re going to gloss over some details (notably pipelining and register renaming) which are very important to processor designers, but which aren’t necessary to understand how Spectre and Meltdown work.

For a comprehensive description of processor design, and other aspects of modern computer architecture, you can’t do better than Hennessy and Patterson’s classic Computer Architecture: A Quantitative Approach.

What is a scalar processor?

The simplest sort of modern processor executes one instruction per cycle; we call this a scalar processor. Our example above will execute in six cycles on a scalar processor.

Examples of scalar processors include the Intel 486 and the ARM1176 core used in Raspberry Pi 1 and Raspberry Pi Zero.

What is a superscalar processor?

The obvious way to make a scalar processor (or indeed any processor) run faster is to increase its clock speed. However, we soon reach limits of how fast the logic gates inside the processor can be made to run; processor designers therefore quickly began to look for ways to do several things at once.

An in-order superscalar processor examines the incoming stream of instructions and tries execute more than one at once, in one of several “pipes”, subject to dependencies between the instructions. Dependencies are important: you might think that a two-way superscalar processor could just pair up (or dual-issue) the six instructions in our example like this:

t, u = a+b, c+d
v, w = e+f, v+g
x, y = h+i, j+k

But this doesn’t make sense: we have to compute v before we can compute w, so the third and fourth instructions can’t be executed at the same time. Our two-way superscalar processor won’t be able to find anything to pair with the third instruction, so our example will execute in four cycles:

t, u = a+b, c+d
v    = e+f                   # second pipe does nothing here
w, x = v+g, h+i
y    = j+k

Examples of superscalar processors include the Intel Pentium, and the ARM Cortex-A7 and Cortex-A53 cores used in Raspberry Pi 2 and Raspberry Pi 3 respectively. Raspberry Pi 3 has only a 33% higher clock speed than Raspberry Pi 2, but has roughly double the performance: the extra performance is partly a result of Cortex-A53’s ability to dual-issue a broader range of instructions than Cortex-A7.

What is an out-of-order processor?

Going back to our example, we can see that, although we have a dependency between v and w, we have other independent instructions later in the program that we could potentially have used to fill the empty pipe during the second cycle. An out-of-order superscalar processor has the ability to shuffle the order of incoming instructions (again subject to dependencies) in order to keep its pipelines busy.

An out-of-order processor might effectively swap the definitions of w and x in our example like this:

t = a+b
u = c+d
v = e+f
x = h+i
w = v+g
y = j+k

allowing it to execute in three cycles:

t, u = a+b, c+d
v, x = e+f, h+i
w, y = v+g, j+k

Examples of out-of-order processors include the Intel Pentium 2 (and most subsequent Intel and AMD x86 processors), and many recent ARM cores, including Cortex-A9, -A15, -A17, and -A57.

What is speculation?

Reordering sequential instructions is a powerful way to recover more instruction-level parallelism, but as processors become wider (able to triple- or quadruple-issue instructions) it becomes harder to keep all those pipes busy. Modern processors have therefore grown the ability to speculate. Speculative execution lets us issue instructions which might turn out not to be required (because they are branched over): this keeps a pipe busy, and if it turns out that the instruction isn’t executed, we can just throw the result away.

To demonstrate the benefits of speculation, let’s look at another example:

t = a+b
u = t+c
v = u+d
if v:
   w = e+f
   x = w+g
   y = x+h

Now we have dependencies from t to u to v, and from w to x to y, so a two-way out-of-order processor without speculation won’t ever be able to fill its second pipe. It spends three cycles computing t, u, and v, after which it knows whether the body of the if statement will execute, in which case it then spends three cycles computing w, x, and y. Assuming the if (a branch instruction) takes one cycle, our example takes either four cycles (if v turns out to be zero) or seven cycles (if v is non-zero).

Speculation effectively shuffles the program like this:

t = a+b
u = t+c
v = u+d
w_ = e+f
x_ = w_+g
y_ = x_+h
if v:
   w, x, y = w_, x_, y_

so we now have additional instruction level parallelism to keep our pipes busy:

t, w_ = a+b, e+f
u, x_ = t+c, w_+g
v, y_ = u+d, x_+h
if v:
   w, x, y = w_, x_, y_

Cycle counting becomes less well defined in speculative out-of-order processors, but the branch and conditional update of w, x, and y are (approximately) free, so our example executes in (approximately) three cycles.

What is a cache?

In the good old days*, the speed of processors was well matched with the speed of memory access. My BBC Micro, with its 2MHz 6502, could execute an instruction roughly every 2µs (microseconds), and had a memory cycle time of 0.25µs. Over the ensuing 35 years, processors have become very much faster, but memory only modestly so: a single Cortex-A53 in a Raspberry Pi 3 can execute an instruction roughly every 0.5ns (nanoseconds), but can take up to 100ns to access main memory.

At first glance, this sounds like a disaster: every time we access memory, we’ll end up waiting for 100ns to get the result back. In this case, this example:

a = mem[0]
b = mem[1]

would take 200ns.

In practice, programs tend to access memory in relatively predictable ways, exhibiting both temporal locality (if I access a location, I’m likely to access it again soon) and spatial locality (if I access a location, I’m likely to access a nearby location soon). Caching takes advantage of these properties to reduce the average cost of access to memory.

A cache is a small on-chip memory, close to the processor, which stores copies of the contents of recently used locations (and their neighbours), so that they are quickly available on subsequent accesses. With caching, the example above will execute in a little over 100ns:

a = mem[0]    # 100ns delay, copies mem[0:15] into cache
b = mem[1]    # mem[1] is in the cache

From the point of view of Spectre and Meltdown, the important point is that if you can time how long a memory access takes, you can determine whether the address you accessed was in the cache (short time) or not (long time).

What is a side channel?

From Wikipedia:

“… a side-channel attack is any attack based on information gained from the physical implementation of a cryptosystem, rather than brute force or theoretical weaknesses in the algorithms (compare cryptanalysis). For example, timing information, power consumption, electromagnetic leaks or even sound can provide an extra source of information, which can be exploited to break the system.”

Spectre and Meltdown are side-channel attacks which deduce the contents of a memory location which should not normally be accessible by using timing to observe whether another location is present in the cache.

Putting it all together

Now let’s look at how speculation and caching combine to permit the Meltdown attack. Consider the following example, which is a user program that sometimes reads from an illegal (kernel) address:

t = a+b
u = t+c
v = u+d
if v:
   w = kern_mem[address]   # if we get here crash
   x = w&0x100
   y = user_mem[x]

Now our out-of-order two-way superscalar processor shuffles the program like this:

t, w_ = a+b, kern_mem[address]
u, x_ = t+c, w_&0x100
v, y_ = u+d, user_mem[x_]

if v:
   # crash
   w, x, y = w_, x_, y_      # we never get here

Even though the processor always speculatively reads from the kernel address, it must defer the resulting fault until it knows that v was non-zero. On the face of it, this feels safe because either:

  • v is zero, so the result of the illegal read isn’t committed to w
  • v is non-zero, so the program crashes before the read is committed to w

However, suppose we flush our cache before executing the code, and arrange a, b, c, and d so that v is zero. Now, the speculative load in the third cycle:

v, y_ = u+d, user_mem[x_]

will read from either address 0x000 or address 0x100 depending on the eighth bit of the result of the illegal read. Because v is zero, the results of the speculative instructions will be discarded, and execution will continue. If we time a subsequent access to one of those addresses, we can determine which address is in the cache. Congratulations: you’ve just read a single bit from the kernel’s address space!

The real Meltdown exploit is more complex than this, but the principle is the same. Spectre uses a similar approach to subvert software array bounds checks.

Conclusion

Modern processors go to great lengths to preserve the abstraction that they are in-order scalar machines that access memory directly, while in fact using a host of techniques including caching, instruction reordering, and speculation to deliver much higher performance than a simple processor could hope to achieve. Meltdown and Spectre are examples of what happens when we reason about security in the context of that abstraction, and then encounter minor discrepancies between the abstraction and reality.

The lack of speculation in the ARM1176, Cortex-A7, and Cortex-A53 cores used in Raspberry Pi render us immune to attacks of the sort.

* days may not be that old, or that good

The post Why Raspberry Pi isn’t vulnerable to Spectre or Meltdown appeared first on Raspberry Pi.