IBM boosts System z processing speed

At this week’s Hot Chips Conference, Brian Curran, IBM Distinguished Engineer, discussed IBM’s recently announced, faster processing chip for System z mainframe environments, which runs at 5.2GHz.  (FYI, the first 31 minutes of the YouTube video link above are from Brian’s session, and the first 10 minutes provide a good overview of the chip.)

Brian discussed System z environments, which mainly run large mission-critical applications such as OLTP that use large instruction and data caches.  System z is also now being used for Linux consolidation, with 1000s of Linux machines running on a mainframe.

The numbers

The new z196 processing core provides up to a 40% improvement in executing mainframe applications.  Also, the new processor chip was measured at 50 billion instructions per second (Bips).

In addition, the z196 achieved a remarkable 40% per-thread improvement on unchanged code, and another 20-30% throughput improvement was attainable through recompilation.  Moreover, IBM has shown a sustained system execution throughput (multi-thread/multi-application) of 400 Bips.  All this was done without increasing energy consumption over current-generation System z processing chips.

Cache everywhere and lots of it

The z196 chip is a 45nm, 1.4B-transistor, quad-core processor with two onboard, special-purpose co-processors for cryptographic and compression acceleration. Each core has a private 64KB L1 I-cache (instruction), a private 128KB L1 D-cache (data), and a private 1.5MB L2 cache. These L1 and L2 SRAM caches are replicated for each of the four cores.  There is an onboard shared 24MB eDRAM L3 cache as well. All cores in the z196 quad-core chip run at the full 5.2GHz clock speed.

Each z196 processing core supports out-of-order instruction execution with a 40-instruction window.  Further, all data is protected with ECC, and processing steps are hardened with parity and/or duplication.

Six of these z196 processing chips combine to form a processor node on a multi-chip module (MCM).  In an industry first, there is an additional 192MB eDRAM L4 cache shared across the six processing chips on an MCM.  Each System z MCM can interface with up to 750GB of main memory.

In a System z processing frame there can be up to four MCMs, providing a total of 96 processing cores.  With the four MCMs, System z can address ~3TB of main memory.  Each MCM is fully interconnected with all other MCMs in a processing frame via a pair of redundant fabric interfaces.
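
For anyone who wants to check the arithmetic, here is a quick sketch that derives the frame-level totals from the per-chip and per-MCM figures quoted above. The constant and function names are my own and purely illustrative, and the cache sums simply add up the quoted capacities; this is not anything from IBM.

```python
# Figures as quoted in this post; all names below are illustrative only.
CORES_PER_CHIP = 4
CHIPS_PER_MCM = 6
MAX_MCMS = 4
MEMORY_PER_MCM_GB = 750

L1_I_KB = 64      # private instruction cache per core
L1_D_KB = 128     # private data cache per core
L2_KB = 1536      # 1.5MB private L2 per core
L3_MB = 24        # eDRAM L3 shared by the cores on one chip
L4_MB = 192       # eDRAM L4 shared by the chips on one MCM


def total_cores(mcms=MAX_MCMS):
    """Cores in a frame: MCMs x chips per MCM x cores per chip."""
    return mcms * CHIPS_PER_MCM * CORES_PER_CHIP


def total_memory_gb(mcms=MAX_MCMS):
    """Main memory reachable by a frame with the given number of MCMs."""
    return mcms * MEMORY_PER_MCM_GB


def private_cache_per_chip_kb():
    """Sum of the private L1 I, L1 D and L2 caches over all cores on a chip."""
    return CORES_PER_CHIP * (L1_I_KB + L1_D_KB + L2_KB)


def shared_cache_per_mcm_mb():
    """Sum of the six per-chip L3 caches plus the MCM-level L4 cache."""
    return CHIPS_PER_MCM * L3_MB + L4_MB


if __name__ == "__main__":
    print("cores per frame:", total_cores())
    print("memory per frame (GB):", total_memory_gb())
    print("private cache per chip (KB):", private_cache_per_chip_kb())
    print("shared cache per MCM (MB):", shared_cache_per_mcm_mb())
```

The 96 cores and ~3TB of memory fall straight out of 4 x 6 x 4 and 4 x 750GB.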

System z is a CISC architecture, and with the z196 it has passed the 1000 instruction count barrier (1079 instructions).  Whew, glad I am not coding in Assembler anymore.

IBM formally announced the chip a month ago, and it will be in shipping System z products later this year.

There was some mention by WSJ blogs of Power 7+ systems going up to 5.5GHz, but I couldn’t locate a more definitive source for that news.

Comments?

Image: Z10 by Roberto Berlim