## GR740: Rad-Hard Quad-Core LEON4FT System-on-Chip

Magnus Hjorth (SPEAKER), Martin Åberg, Nils-Johan Wessman, Jan Andersson

Cobham Gaisler, Kungsgatan 12, SE-411 91, Göteborg, Sweden Tel: +46 31 775 86 50 {magnus.hjorth,martin.aberg,nisse,jan.andersson}@gaisler.com

Remy Chevallier<sup>(1)</sup>, Russell Forsyth<sup>(2)</sup> <sup>(1)</sup>STMicroelectronics, 12, rue Jules Horowitz, 38019 Grenoble Cedex, France <sup>(2)</sup>STMicroelectronics, 33 Pinkhill, EH12 7BF, Edinburgh, U.K. Tel: +33 4 76 58 75 94 {remy.chevallier,russell.forsyth}@st.com

Roland Weigand, Luca Fossati

European Space Agency, Keplerlaan 1 – PO Box 299, 2220AG Noordwijk ZH, The Netherlands, Tel: +31 71 565 65 65 {roland.weigand,luca.fossati}@esa.int

## ABSTRACT

The GR740 microprocessor device is a SPARC V8(E) based multi-core architecture that provides a significant performance increase compared to earlier generations of European space processors. The GR740 is currently in development at Cobham Gaisler, Sweden, and STMicroelectronics, France, in activities to develop the Next Generation Microprocessor (NGMP) initiated and funded by the European Space Agency (ESA).

The presentation and paper will describe the architecture and characteristics of the GR740 device.

This abstract describes an on-going development where the devices are in the final stages before manufacturing. The presentation and final paper will contain further details on the manufactured device and will describe progress of the development.

## BACKGROUND

The LEON project was started by the European Space Agency in late 1997 to study and develop a high-performance processor to be used in European space projects. The objectives for the project were to provide an open, portable and non-proprietary processor design, capable of meeting future requirements for performance, software compatibility and low system cost. Another objective was to be able to manufacture in a Single Event Upset (SEU) sensitive semiconductor process. To maintain correct operation in the presence of SEUs, extensive error detection and error handling functions were needed. The goals have been to detect and tolerate one error in any register without software intervention, and to suppress effects from Single Event Transient (SET) errors in combinational logic.

The LEON IP-core family includes the first LEON1 VHSIC Hardware Description Language (VHDL) design that was used in the LEONExpress test chip developed in 0.35 µm technology to prove the fault tolerance concept. The second LEON2 VHDL design was used in the processor device AT697 from Atmel (F) and various system-on-chip devices. These two LEON IP-core implementations were developed by ESA. Gaisler Research, now Cobham Gaisler, developed the third (LEON3) and fourth (LEON4) designs that are used in a number of avionics systems and also in the commercial sector.

Following the development of the TSC695 (ERC32) and AT697 processor components in 0.5 and 0.18 µm technology respectively, ESA has initiated the NGMP activity targeting a European Deep Sub-Micron (DSM) technology in order to meet increasing requirements on performance and to ensure the supply of European space processors. Cobham Gaisler, at the time Aeroflex Gaisler, was selected to develop the NGMP system that is centred around the new LEON4FT processor.

After a preliminary study in 2006, implementing a quad-core LEON3 in an FPGA [1], the NGMP development started in 2009. It experienced delays, however, while waiting for the availability of a suitable European DSM technology. Meanwhile, a functional prototype [2] was developed on commercial technology (eASIC Nextreme2) for early software development and user evaluation. Development of the rad-hard NGMP resumed during the second quarter of 2014 when technology became available. The technology used is the C65SPACE platform from STMicroelectronics [3]. The GR740 device is implemented using this platform and constitutes the NGMP Engineering Model.

## **ARCHITECTURAL OVERVIEW**

Figure 1 shows an overview of the GR740 architecture. The system consists of five Advanced High-performance Buses (AHB): one 128-bit Processor bus, one 128-bit Memory bus, two 32-bit I/O buses and one 32-bit Debug bus. The Processor bus connects four LEON4FT cores with private FPU and L1 caches, connected to a shared Level-2 (L2) cache. The Memory bus is located between the L2 cache and the memory controller for the main external memory interface, PC100 SDRAM, and also connects a memory scrubber. As an alternative to a large on-chip memory, part of the L2 cache can be turned into on-chip memory by cache-way locking.

Programme and Abstracts Book of the DASIA 2015 Conference, Barcelona, Spain, 19-21 May 2015

The two separate I/O buses connect all the peripheral cores. All memory-mapped interfaces have been placed on one bus (Slave I/O bus), and all interfaces with DMA capability have been placed on the other bus (Master I/O bus). The exception is that slave interfaces that connect via the AMBA APB bus are directly connected to the processor AHB bus via AHB/APB bridges.

The Master I/O bus connects to the Processor bus via an AHB bridge that provides access restriction and address translation (IOMMU) functionality. Alternatively, the IOMMU bridge can also connect directly to the memory bus, bypassing the L2 cache and reducing the load on the processor bus. The two I/O buses include all peripheral units such as timers, interrupt controller, UARTs, general purpose I/O port, PCI master/target, Ethernet MACs, MIL-STD-1553B, Serial Peripheral Interface bus and SpaceWire interfaces. All I/O master units in the system contain dedicated DMA engines and are controlled by descriptors located in main memory that are set up by the processors. Reception of, for instance, Ethernet and SpaceWire packets will not increase CPU load. The cores will buffer incoming packets and write them to main memory without processor intervention.
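The descriptor-driven reception described above can be illustrated with a small model. This is only a sketch: the `RxDescriptor` fields, the ring layout and the `DmaEngine` class are hypothetical names chosen for illustration and do not reflect the actual descriptor or register formats of the GR740 I/O cores.

```python
from dataclasses import dataclass


@dataclass
class RxDescriptor:
    """Hypothetical receive descriptor placed in main memory by software."""
    address: int          # where the packet should be written
    max_length: int       # size of the buffer behind `address`
    enabled: bool = True  # set by software, cleared by the DMA engine
    length: int = 0       # filled in by the DMA engine on completion


class DmaEngine:
    """Toy model of an I/O core that stores packets without CPU involvement."""

    def __init__(self, memory: bytearray, ring: list):
        self.memory = memory
        self.ring = ring
        self.index = 0  # next descriptor to use

    def receive(self, packet: bytes) -> bool:
        """Store one packet; return False if no usable descriptor is available."""
        desc = self.ring[self.index]
        if not desc.enabled or len(packet) > desc.max_length:
            return False
        self.memory[desc.address:desc.address + len(packet)] = packet
        desc.length = len(packet)
        desc.enabled = False  # hand the descriptor back to software
        self.index = (self.index + 1) % len(self.ring)
        return True


# Software sets up two 16-byte buffers once; the engine then runs on its own.
memory = bytearray(64)
ring = [RxDescriptor(address=0, max_length=16),
        RxDescriptor(address=16, max_length=16)]
engine = DmaEngine(memory, ring)
engine.receive(b"hello")
engine.receive(b"world!")
```

In this model the processor only touches the ring when it sets up descriptors or re-enables consumed ones, which is the property the text relies on when it states that packet reception does not increase CPU load.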

The fifth bus, a dedicated 32-bit Debug bus, connects a debug support unit (DSU), PCI and AHB trace buffers and several debug communication links. The Debug bus allows for non-intrusive debugging through the DSU and direct access to the complete system, as the Debug bus is not placed behind an AHB bridge with access restriction functionality.

The list below summarizes the specification for the GR740 system:

- 128-bit Processor AHB bus
  - 4x LEON4FT
    - 16 + 16 KiB write-through cache with LRU replacement. Can be configured by software to use random and direct-mapped replacement.
    - SPARC Reference MMU. Physical snooping.
    - 32-bit MUL/DIV.
    - GRFPU floating-point unit
  - 1x 2 MiB Shared L2 write-back cache with memory access protection (fence registers), cache-way locking and partitioning.
  - 1x 32-bit to 128-bit unidirectional AHB to AHB bridge (from Debug bus to Processor bus)
  - 1x 128-bit to 32-bit unidirectional AHB to AHB bridge (from Processor bus to Slave I/O bus)
  - 1x 32-bit to 128-bit unidirectional AHB to AHB bridge with IOMMU (from Master I/O bus to Processor bus)
- 128-bit Memory AHB bus
  - 1x 64-bit data SDRAM PC100 memory interface with Reed-Solomon ECC (with 16 or 32 check bits)
  - 1x Memory scrubber
- 32-bit Master I/O AHB bus
  - SpaceWire router with eight external ports and four AMBA ports, with RMAP @ 300 Mbit/s
  - 2x 10/100/1000 Mbit Ethernet interface with MII/GMII PHY interface
  - 1x 32-bit PCI target interface @ 33 MHz
  - MIL-STD-1553B interface
- 32-bit Slave I/O AHB bus
  - 1x 32-bit PCI master interface @ 33 MHz with DMA controller mapped to the Master I/O bus
  - 1x 8/16-bit PROM/IO controller with BCH ECC
  - 2x 32-bit AHB to APB bridge connecting:
    - 5x General purpose timer unit
    - 2x General purpose I/O port
    - 2x 8-bit UART interface
    - 1x Multiprocessor interrupt controller
    - 2x AHB status register
    - 1x Clock gating control unit
    - 1x LEON4 statistical unit (perf. counters)
    - 1x SPI master/slave controller
    - PLL and pad control units
    - 1x Temperature sensor
    - CCSDS Time Distribution Protocol controller
- 32-bit Debug AHB bus
  - 1x Debug support unit
  - 1x JTAG debug link
  - 1x SpaceWire RMAP target
  - 1x AHB trace buffer, tracing Master I/O bus
  - 1x PCI trace buffer

Figure 1: GR740 Block diagram

## DEVIATIONS FROM NGMP BASELINE SPECIFICATION

The target technology platform and the available suitable package technology do not allow implementation of all features described in the NGMP specification. The GR740 design also differs from the NGMP specification due to user feedback and lessons learned from the LEON4-N2X (NGMP functional prototype) development.

The subsections below describe aspects where the GR740 differs from what was specified as the NGMP baseline design.

## **High-Speed Serial Links**

The GR740 does not contain the four High-Speed Serial Links (HSSL) present in the NGMP baseline design.

## **Main Memory Interfaces**

The baseline NGMP design has DDR2 SDRAM and (SDR) SDRAM on shared pins. The GR740 has a PC100 SDRAM interface. The change has no functional impact visible to application software.

## **Floating-point units**

The NGMP baseline specification and the LEON4-N2X device have floating-point units shared between pairs of processors. The GR740 device has one floating-point unit dedicated to each processor and also supports separate clock gating of each floating-point unit.

## Level-2 cache hit-under-miss and SPLIT

The Level-2 cache in GR740 can serve accesses that result in cache hits while the cache is performing miss processing. This is done by issuing AMBA SPLIT responses to cache misses so that additional accesses can be accepted by the cache.

## **Pin-multiplexing**

The implemented device is pin constrained and interfaces have been put on shared pins. The main constraint imposed by pin-multiplexing is that PCI can only be used when the SDRAM interface is in half-width mode (32 data bits plus check bits). Several interfaces are also pin-multiplexed with the PROM/IO interface.
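The PCI/SDRAM constraint can be expressed as a simple configuration check. The sketch below is illustrative only: the function names and the idea of validating a board configuration in software are assumptions, and the only facts taken from the text are the 64-bit full-width interface, the 32-bit half-width mode and the rule that PCI requires half-width mode.

```python
def pci_allowed(sdram_data_bits: int) -> bool:
    """Return True if PCI can be used with the given SDRAM data width."""
    if sdram_data_bits not in (32, 64):
        raise ValueError("SDRAM data width must be 32 (half) or 64 (full) bits")
    # PCI shares pins with the upper half of the SDRAM interface,
    # so it is only available in half-width mode.
    return sdram_data_bits == 32


def check_pin_config(sdram_data_bits: int, use_pci: bool) -> None:
    """Raise ValueError for a configuration that violates the pin-multiplexing rule."""
    if use_pci and not pci_allowed(sdram_data_bits):
        raise ValueError("PCI requires the SDRAM interface in half-width mode")


check_pin_config(sdram_data_bits=32, use_pci=True)   # accepted
check_pin_config(sdram_data_bits=64, use_pci=False)  # accepted
```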

## **PCI arbiter**

The GR740 device does not include a PCI arbiter.

## **Time synchronization**

The GR740 device has functionality in all internal timer units to latch and set the current time based on interrupts, MIL-STD-1553B synchronization commands and SpaceWire time codes. The GR740 also includes a CCSDS TDP controller for keeping Spacecraft time. This functionality is in addition to what was specified in the NGMP specification.

## SpaceWire router

The SpaceWire router in the GR740 has support for SpaceWire-D and SpaceWire-PnP. Support for these protocols was not included in the NGMP specification.

## **USB debug link**

The GR740 does not include a USB debug link.

## Clocking, reset and maximum frequency

The maximum operating frequency for the GR740 AMBA system is 250 MHz. The device has separate clock inputs for system, SpaceWire and MIL-STD-1553B clocks.
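To put the clock figure in context, the following sketch computes theoretical peak transfer rates from bus width and clock frequency. It assumes one data transfer per clock cycle, ignores arbitration, protocol overhead and ECC check bits, and assumes that PC100 SDRAM is clocked at 100 MHz. The results are upper bounds, not measured throughput.

```python
def peak_bytes_per_second(bus_width_bits: int, clock_hz: int) -> int:
    """Upper bound on throughput: one full-width transfer every clock cycle."""
    return (bus_width_bits // 8) * clock_hz


SYSTEM_CLOCK_HZ = 250_000_000  # maximum AMBA system frequency from the text
SDRAM_CLOCK_HZ = 100_000_000   # assumption: PC100 SDRAM clocked at 100 MHz

processor_bus = peak_bytes_per_second(128, SYSTEM_CLOCK_HZ)
sdram_full_width = peak_bytes_per_second(64, SDRAM_CLOCK_HZ)
sdram_half_width = peak_bytes_per_second(32, SDRAM_CLOCK_HZ)

for name, rate in [("Processor bus", processor_bus),
                   ("SDRAM, 64 data bits", sdram_full_width),
                   ("SDRAM, 32 data bits", sdram_half_width)]:
    print(f"{name}: {rate / 1e6:.0f} MB/s peak")
```

Under these assumptions the on-chip Processor bus has several times the peak bandwidth of the external memory interface, which is one reason the shared L2 cache matters for multi-core performance.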

In order to avoid problems with reset sequencing, the GR740 has one single reset input that is sequenced internally to provide reset signals to the different clock domains within the device.

## **Dynamic PLL and Pad Control**

Simple interface cores have been developed in order to allow software to configure clock frequencies and pad characteristics. Similar functionality was also included in the NGMP functional prototype (LEON4-N2X device).

## ARCHITECTURAL FEATURE: IMPROVED SUPPORT FOR AMP/SMP SYSTEMS

The GR740 device has improved support for resource partitioning. The architecture has been designed to support SMP, AMP and mixtures of the two (example: three processors running Linux or VxWorks SMP and one processor running RTEMS). This is accomplished using several new hardware features:

- The Level-2 cache can be set to 1 way/processor mode. This partitioning means that one software instance cannot evict data belonging to another software instance and reduces the amount of interference between software instances.
- The Level-2 cache has fence registers that can be used to protect backup software.
- Performance counters exist to count accesses to shared resources. These counters can then be used to verify that the software instances are within their allowed budgets for shared resource accesses.
- Interrupts can be masked and routed separately for each CPU.
- The I/O peripherals' register interfaces are located at separate 4 KiB address boundaries, so that the processor MMU can be used to restrict user-level software from accessing the "wrong" peripheral.
- The IOMMU allows placing DMA peripherals into groups and offers modes with protection and address translation.
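The effect of the 1 way/processor mode can be illustrated with a toy model of a single cache set. This is a sketch under stated assumptions: it assumes four ways (one per core), LRU replacement in shared mode, and that a lookup may hit in any way while replacement is confined to the requesting processor's own way. The class and function names are hypothetical and the model is not a description of the actual L2 cache implementation.

```python
NUM_WAYS = 4  # assumption: one way per LEON4FT core


class CacheSet:
    """One cache set with NUM_WAYS ways; only tags are modelled."""

    def __init__(self, partitioned: bool):
        self.partitioned = partitioned
        self.tags = [None] * NUM_WAYS
        self.lru = list(range(NUM_WAYS))  # least recently used way first

    def _touch(self, way: int) -> None:
        self.lru.remove(way)
        self.lru.append(way)

    def access(self, cpu: int, tag: int) -> bool:
        """Return True on a hit; on a miss, replace a line and return False."""
        for way, stored in enumerate(self.tags):
            if stored == tag:
                self._touch(way)
                return True
        # Miss: in partitioned mode a CPU may only replace its own way.
        victim = cpu if self.partitioned else self.lru[0]
        self.tags[victim] = tag
        self._touch(victim)
        return False


def survives_interference(partitioned: bool) -> bool:
    """CPU 0 loads one line, CPU 1 streams data, CPU 0 re-reads its line."""
    cache_set = CacheSet(partitioned)
    cache_set.access(cpu=0, tag=0xA)
    for tag in range(0x100, 0x108):
        cache_set.access(cpu=1, tag=tag)
    return cache_set.access(cpu=0, tag=0xA)


print("shared LRU:     ", survives_interference(partitioned=False))
print("1 way/processor:", survives_interference(partitioned=True))
```

In the shared configuration the streaming software instance on CPU 1 evicts the line owned by CPU 0, whereas in the partitioned configuration the line survives, at the cost of each instance seeing a smaller effective cache.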

#### ARCHITECTURAL FEATURE: IMPROVED SUPPORT FOR PROFILING AND DEBUGGING

The GR740 provides high-speed debug interfaces via the 1000/100/10 Mbit Ethernet interfaces. The dedicated Debug bus allows non-intrusive debugging since the DSU, trace buffers and performance counters can be accessed without causing traffic on the Processor AHB bus.

The GR740 also supports filtering for both the AHB and instruction trace buffers, as well as hardware data watchpoints and data area monitoring.

The LEON4 statistics unit provides performance counters, with support for filtering, for a large number of events, including:

- I/D cache/TLB miss/hold
- Data write buffer hold
- Branch prediction miss
- Total/Integer/Floating-point instruction count
- Total execution count
- Level-2 cache accesses, misses, hits
- AHB bus statistics for Processor AHB bus and Master I/O AHB bus

The interrupt controller in the GR740 supports interrupt time stamping, with time stamps for interrupt line assertion and processor interrupt acknowledge.
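A typical use of the two time stamps is to derive interrupt latency as the difference between acknowledge and assertion. The sketch below assumes a free-running 32-bit time stamp counter that wraps around; the counter width and the function name are assumptions made for illustration, not values taken from the GR740 documentation.

```python
COUNTER_BITS = 32  # assumption: width of the free-running time stamp counter


def interrupt_latency(assert_stamp: int, ack_stamp: int,
                      bits: int = COUNTER_BITS) -> int:
    """Cycles between line assertion and processor acknowledge.

    Modular subtraction gives the right answer even if the counter
    wrapped once between the two samples.
    """
    return (ack_stamp - assert_stamp) % (1 << bits)


print(interrupt_latency(100, 130))
print(interrupt_latency(0xFFFFFFF0, 0x00000010))
```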

#### ARCHITECTURAL FEATURE: IMPROVED SUPPORT FOR PROM-LESS APPLICATIONS

The GR740 provides easy access for systems that want to avoid having a boot-PROM connected to the device and prefer to upload software remotely. The example flow for booting via SpaceWire becomes:

1. Connect via RMAP
2. Configure main memory controller
3. Use hardware memory scrubber to initialize memory
4. Enable Level-2 cache
5. Upload software
6. Assign processor start address(es) via interrupt controller
7. Start processor(s) via interrupt controller
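The ordering of the boot steps above can be sketched with a small in-memory model. Everything here is hypothetical: `Gr740Model` and its methods stand in for RMAP accesses to device registers, no real register addresses or tool APIs are used, and the ordering checks are a modelling choice that mirrors the flow in the text.

```python
class Gr740Model:
    """Hypothetical in-memory stand-in for a GR740 reached over SpaceWire RMAP."""

    def __init__(self, mem_size: int = 256):
        self.mem_size = mem_size
        self.connected = False
        self.controller_ready = False
        self.memory = None       # becomes a bytearray once initialised
        self.l2_enabled = False
        self.start_addr = {}     # cpu index -> entry point
        self.running = set()

    @staticmethod
    def _require(condition: bool, message: str) -> None:
        if not condition:
            raise RuntimeError(message)

    def connect(self) -> None:
        self.connected = True

    def configure_memory_controller(self) -> None:
        self._require(self.connected, "no RMAP connection")
        self.controller_ready = True

    def scrubber_init(self) -> None:
        self._require(self.controller_ready, "memory controller not configured")
        self.memory = bytearray(self.mem_size)

    def enable_l2(self) -> None:
        self._require(self.memory is not None, "memory not initialised")
        self.l2_enabled = True

    def upload(self, address: int, image: bytes) -> None:
        self._require(self.memory is not None, "memory not initialised")
        self._require(address + len(image) <= self.mem_size, "image does not fit")
        self.memory[address:address + len(image)] = image

    def set_start_address(self, cpu: int, address: int) -> None:
        self.start_addr[cpu] = address

    def start(self, cpu: int) -> None:
        self._require(cpu in self.start_addr, "no start address assigned")
        self.running.add(cpu)


def promless_boot(dev: Gr740Model, image: bytes, load_addr: int, cpus: list) -> None:
    """Run the seven steps of the example flow in order."""
    dev.connect()                       # 1. connect via RMAP
    dev.configure_memory_controller()   # 2. configure main memory controller
    dev.scrubber_init()                 # 3. initialise memory with the scrubber
    dev.enable_l2()                     # 4. enable Level-2 cache
    dev.upload(load_addr, image)        # 5. upload software
    for cpu in cpus:                    # 6. assign start addresses
        dev.set_start_address(cpu, load_addr)
    for cpu in cpus:                    # 7. start processors
        dev.start(cpu)


device = Gr740Model()
promless_boot(device, image=b"\x01\x02\x03", load_addr=0x10, cpus=[0, 1])
```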

The SpaceWire router, with eight external ports, can be controlled via bootstrap signals to be fully functional without processor intervention. This also allows the device to act as a software/processor-free bridge between SpaceWire and PCI/SPI/MIL-STD-1553B etc., via RMAP.

The IOMMU can be used to restrict the memory addresses accessible via SpaceWire RMAP and allow safe operation of software while RMAP is enabled.
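The kind of restriction described above can be modelled as a per-page permission check applied to every incoming access. This is a minimal sketch: the 4 KiB page granularity, the `AccessWindow` class and its methods are assumptions for illustration and do not describe the actual IOMMU programming model.

```python
PAGE_SIZE = 4096  # assumption: protection is applied per 4 KiB page


class AccessWindow:
    """Set of pages that a DMA master (e.g. an RMAP target) may access."""

    def __init__(self):
        self.allowed_pages = set()

    @staticmethod
    def _pages(address: int, length: int) -> range:
        first = address // PAGE_SIZE
        last = (address + length - 1) // PAGE_SIZE
        return range(first, last + 1)

    def allow(self, base: int, size: int) -> None:
        """Grant access to every page touched by [base, base + size)."""
        self.allowed_pages.update(self._pages(base, size))

    def permits(self, address: int, length: int) -> bool:
        """An access is permitted only if every page it touches is allowed."""
        return all(page in self.allowed_pages
                   for page in self._pages(address, length))


window = AccessWindow()
window.allow(base=0x1000, size=0x2000)   # an upload buffer of two pages
print(window.permits(0x1000, 4))         # inside the buffer
print(window.permits(0x2FFE, 4))         # straddles the end of the buffer
```

With such a window in place, a remote node can upload data into the designated buffer while accesses to any other address, including ones that only partly leave the buffer, are rejected.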

## TARGET TECHNOLOGY

The target technology is 65 nm CMOS C65SPACE from STMicroelectronics. The device is manufactured by STMicroelectronics.

## PACKAGE

The device has a hermetically sealed CLGA625 package.

## **DEVELOPMENT BOARD**

The baseline specification for the development board is to support all implemented functions and interfaces of the design. The foreseen form factor is Compact PCI (cPCI) in a 6U format occupying one or two slots.

#### **GR740 VALIDATION**

A validation plan for the GR740 devices has been established as part of the initial NGMP development. The validation plan covers verification of all IP cores and four multi-processor operating systems (eCos, Linux, RTEMS and VxWorks). The validation effort will repeat what has been done previously by means of prototyping, but at full operating speed and over temperature.

#### SOFTWARE SUPPORT

The GR740 architecture is already supported by all operating systems and toolchains provided by Cobham Gaisler. The GR740 architecture has, through NGMP FPGA prototypes and the LEON4-N2X device, already been evaluated by the European space industry and by other technology vendors.

The ESA NGMP development programme also comprises several software development activities, covering evaluation, benchmarking and predictability analysis, as well as hypervisor and operating system development (e.g. XtratuM, RTEMS-SMP).

#### **INSTRUCTION SET SIMULATOR**

An instruction set simulator for the NGMP architecture has been developed along with the NGMP design. This simulator, currently based on Cobham Gaisler's GRSIM simulator, will be extended and adapted to match the GR740 device.

## COMPARISON WITH GR712RC, UT699, UT699E and UT700 DEVICES

The LEON4 in the GR740 improves the cycles-per-instruction (CPI) performance over existing LEON3FT devices by using wider internal data paths and a wider interface to the on-chip bus. The LEON4 processor provides 1.7 Dhrystone MIPS (DMIPS)/MHz while the LEON3FT implementations provide up to 1.4 DMIPS/MHz.

The improved CPI performance is amplified by the higher maximum operating frequency of the GR740 device. Also, the quad processor system provides an additional performance improvement. The theoretical speed-up, excluding super-scaling, is up to four. In reality the speed-up is less due to the shared bus and software synchronization requirements.
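The figures above can be combined into a rough estimate. The sketch below multiplies the DMIPS/MHz rating by the 250 MHz maximum frequency and uses Amdahl's law as a simple stand-in for the losses caused by the shared bus and software synchronization. Amdahl's law is a model introduced here for illustration and is not taken from the paper, and the parallel fraction of 0.9 is an arbitrary example value.

```python
def peak_dmips(dmips_per_mhz: float, frequency_mhz: float, cores: int = 1) -> float:
    """Theoretical Dhrystone MIPS assuming perfect scaling over all cores."""
    return dmips_per_mhz * frequency_mhz * cores


def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speed-up predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


single_core = peak_dmips(1.7, 250)          # one LEON4 core at 250 MHz
quad_core_ideal = peak_dmips(1.7, 250, 4)   # upper bound with speed-up of four
example_speedup = amdahl_speedup(0.9, 4)    # 0.9 is an illustrative value only

print(f"single core:       {single_core:.0f} DMIPS")
print(f"quad core (ideal): {quad_core_ideal:.0f} DMIPS")
print(f"speed-up at 90% parallel work: {example_speedup:.2f}")
```

Even a modest serial fraction pulls the speed-up noticeably below four, which is consistent with the caveat that real applications see less than the theoretical gain.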

#### CONCLUSION

The GR740 is a SPARC V8(E) based multi-core architecture that provides a significant performance increase compared to earlier generations of European space processors, with high-speed interfaces such as SpaceWire and Gigabit Ethernet on-chip. The platform has improved support for profiling and debugging and will have a rich set of software immediately available due to backward compatibility with existing SPARC V8 software and LEON3 board support packages. The GR740 also includes specific support for AMP configurations and Time-Space Partitioning.

The GR740 constitutes the engineering model of the ESA NGMP, which is part of the ESA roadmap for standard microprocessor components. It is developed under ESA contract, and it will be commercialised under fair and equal conditions to all users in the ESA member states. The GR740 is also fully developed with manpower located in Europe, and it only relies on European IP sources. It will therefore not be affected by US export regulations.

The NGMP specification and other related documents are posted at the following link:

http://microelectronics.esa.int/ngmp/ngmp.htm

## REFERENCES

[1] Summary Report: Development of LEON3-FT-MP (GINA), E. Catovic, May 2006, http://microelectronics.esa.int/finalreport/SummaryReport-18533-COO3-2006-05-15.pdf

[2] Executive Summary: Manufacture and Validation of LEON4FT Multiprocessor Prototype Device, May 2013, http://microelectronics.esa.int/finalreport/NGFP-EXEC-0011-ilr1.pdf

[3] P. Roche, G. Gasiot, S. Uznanski, J-M. Daveau, J. Torras-Flaquer, S. Clerc, and R. Harboe-Sørensen, "A Commercial 65 nm CMOS Technology for Space Applications: Heavy Ion, Proton and Gamma Test Results and Modeling", IEEE Transactions on Nuclear Science, vol. 57, no. 4, August 2010