At this week’s Hot Chips conference, Brian Curran, an IBM Distinguished Engineer, discussed IBM’s recently announced, faster processor chip for System z mainframe environments, which runs at 5.2GHz. (FYI, the first 31 minutes of the YouTube video linked above are from Brian’s session, and the first 10 minutes provide a good overview of the chip.)
Brian discussed System z environments, which mainly run large mission-critical applications such as OLTP that make heavy use of large instruction and data caches. System z is also now being used for Linux consolidation, with thousands of Linux machines running on a mainframe.
The new z196 processing core provides up to a 40% improvement when executing mainframe applications. The new processor chip was also measured at 50 billion instructions per second (BIPS).
In addition, the z196 achieved a 40% improvement per thread on constant (unchanged) code, and another 20-30% throughput improvement was attainable through recompilation. Moreover, IBM has shown a sustained system execution throughput (multi-thread/multi-application) of 400 BIPS. All this was done without increasing energy consumption over current-generation System z processing chips.
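To get a feel for what the 40% and 20-30% figures could mean together, here is a minimal sketch. It assumes the two gains are independent and compound multiplicatively, which the talk summary does not state; the function name is mine, purely for illustration.

```python
def combined_speedup(base_gain, recompile_gain):
    """Compound two fractional gains, ASSUMING they multiply (not confirmed by IBM)."""
    return (1 + base_gain) * (1 + recompile_gain)

# 40% on unchanged code, plus 20-30% more from recompilation.
low = combined_speedup(0.40, 0.20)
high = combined_speedup(0.40, 0.30)

print(f"Combined speedup: {low:.2f}x to {high:.2f}x")
```

Under that assumption, recompiled code would land somewhere around 1.7x to 1.8x of the previous generation, but treat that as back-of-the-envelope arithmetic rather than a measured result.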
Cache everywhere and lots of it
The z196 chip is a 45nm, 1.4-billion-transistor, quad-core processor with two onboard special-purpose co-processors for cryptographic and compression acceleration. Each core has a private 64KB L1 I-cache (instruction) and a private 128KB L1 D-cache (data), along with a 1.5MB private L2 cache; these L1 and L2 SRAM caches are replicated for each of the four cores. There is also an onboard 24MB eDRAM L3 cache shared by the cores. All four cores in the z196 run at the full 5.2GHz clock speed.
Each z196 processing core supports out-of-order instruction execution with a 40-instruction window. Further, all data is protected with ECC, and processing steps are hardened with parity and/or duplication.
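As a quick sanity check on how much cache sits on a single chip, the sketch below simply adds up the per-core and shared figures quoted above. The variable names are mine, and it assumes 1MB = 1024KB.

```python
# Cache sizes as quoted in the talk summary, in KB (assuming 1MB = 1024KB).
L1_ICACHE_KB = 64          # private, per core
L1_DCACHE_KB = 128         # private, per core
L2_KB = 1536               # 1.5MB private, per core
L3_SHARED_KB = 24 * 1024   # 24MB eDRAM, shared by the chip's cores
CORES_PER_CHIP = 4

private_per_core_kb = L1_ICACHE_KB + L1_DCACHE_KB + L2_KB
private_total_kb = private_per_core_kb * CORES_PER_CHIP
chip_total_mb = (private_total_kb + L3_SHARED_KB) / 1024

print(f"Private cache per core: {private_per_core_kb}KB")
print(f"On-chip cache total:    {chip_total_mb}MB")
```

By this tally each core owns 1728KB of private cache, and a chip carries 30.75MB in total, most of it the shared L3.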
Six of these z196 processing chips combine to form a processor node on a multi-chip module (MCM). In an industry first, there is an additional 192MB eDRAM L4 cache shared across the six processing chips on an MCM. Each System z MCM can interface with up to 750GB of main memory.
In a System z processing frame there can be up to four MCMs, which provide a total of 96 processing cores (4 MCMs x 6 chips x 4 cores). With the four MCMs, System z can address ~3TB of main memory. Each MCM is fully interconnected with all other MCMs in a processing frame via a pair of redundant fabric interfaces.
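The frame-level totals follow directly from the per-chip and per-MCM numbers. Here is a small sketch of that arithmetic; the constant names are mine, and the per-MCM memory figure is the 750GB quoted above.

```python
# Topology and capacities as quoted in the talk summary.
MCMS_PER_FRAME = 4
CHIPS_PER_MCM = 6
CORES_PER_CHIP = 4
L3_PER_CHIP_MB = 24
L4_PER_MCM_MB = 192
MEMORY_PER_MCM_GB = 750

chips = MCMS_PER_FRAME * CHIPS_PER_MCM
cores = chips * CORES_PER_CHIP
l3_total_mb = chips * L3_PER_CHIP_MB
l4_total_mb = MCMS_PER_FRAME * L4_PER_MCM_MB
memory_total_gb = MCMS_PER_FRAME * MEMORY_PER_MCM_GB

print(f"{chips} chips, {cores} cores")
print(f"L3 total: {l3_total_mb}MB, L4 total: {l4_total_mb}MB")
print(f"Main memory: {memory_total_gb}GB (~3TB)")
```

So a fully populated frame works out to 24 chips and 96 cores, with 576MB of L3 and 768MB of L4 sitting in front of roughly 3TB of main memory.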
System z is a CISC architecture which, with the z196, has passed the 1000-instruction barrier (1079 instructions). Whew, glad I am not coding in Assembler anymore.
IBM formally announced the chip a month ago, and it will ship in System z products later this year.
There was some mention on WSJ blogs of Power 7+ systems going up to 5.5GHz, but I couldn’t locate a more definitive source for that news.
Image: Z10 by Roberto Berlim