

### **High Level Synthesis techniques**

Laurent Hili ESA-ESTEC 19/09/2011

#### What are the challenges?



- Everything tends to become more complex (Moore's Law)
- Miniaturisation thanks to CMOS technologies (65, 45, 28 and 22 nm) makes it possible to design chips with more than a billion transistors
- Board-level systems tend to move to systems on chip (SoC)
- Possibility to integrate various technologies on the same chip (SW, HW digital, HW analog, MEMS, sensors)

### What are the challenges?



- Need for higher-abstraction languages to address the challenges raised by tighter HW/SW integration and the associated trade-offs
- Need for productivity gains in order to handle the ever-growing complexity with the same number of designers (or even fewer)
- Need for advanced CAD techniques to enable these productivity gains
  - Languages: C, C++, SystemC, SystemVerilog
  - Modeling: Transaction Level Modeling (TLM) & Transaction Based Verification (TBV)
  - CAD tools: High Level Synthesis / Virtual Platform

#### **Productivity through abstraction**





European Space Agency



- Higher productivity
  - New IP development ~ 2 to 3 times faster than RTL (VHDL / Verilog)
  - An algorithm can be described very concisely compared to HDL languages (~ 10 times fewer lines of code to maintain)
  - The designer can focus on the functionality rather than on implementation details
- HLS can quickly generate many RTL derivatives from the same C code (architecture exploration)
- HLS can easily be merged into a HW/SW co-design flow
  - Integration with virtual platform (SystemC TLM)
  - Integration with HDL flow (simulators, logic synthesis)

(Source: STMicroelectronics, 9th annual ESL Symposium, June 2011)

High Level Synthesis | Laurent Hili | ESA-ESTEC | 19/09/2011 | Microelectronics Section (TEC-EDM) | Slide 5

#### **Common features in HLS tools**



- Architecture exploration → ability to generate RTL variants
  - Loop optimisation: parallelisation and/or pipelining
  - Array optimisation: registers, RAMs, ROMs
  - Interface optimisation: wire, enable, handshake, bus, NoC (Network on Chip)
- Gantt chart analysis → ability to display concurrency and/or dependencies between resources / tasks
- Verification flow analysis → ability to run regression tests and Transaction Based Verification (TBV) in order to check the automatically generated RTL code against the original C++ code
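
As a hand-written illustration of the loop optimisation item, the sketch below shows the same accumulation loop in rolled form and partially unrolled by 2. It is only an assumption of what such a transformation looks like at source level: an HLS tool applies it from constraints rather than from edited code, and the names `dot_rolled`, `dot_unrolled_by_2` and `kTaps` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kTaps = 8;  // illustrative trip count (hypothetical name)
using Vec = std::array<int, kTaps>;

// Rolled form: one multiply-accumulate per iteration, so a single
// multiplier could be reused over 8 iterations.
int dot_rolled(const Vec& a, const Vec& b) {
    int acc = 0;
    for (std::size_t i = 0; i < kTaps; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Partially unrolled by 2: two independent multiply-accumulates per
// iteration, which could be mapped onto two multipliers in parallel.
int dot_unrolled_by_2(const Vec& a, const Vec& b) {
    static_assert(kTaps % 2 == 0, "unroll factor must divide the trip count");
    int acc_even = 0;
    int acc_odd = 0;
    for (std::size_t i = 0; i < kTaps; i += 2) {
        acc_even += a[i] * b[i];
        acc_odd += a[i + 1] * b[i + 1];
    }
    return acc_even + acc_odd;
}

// Usage sketch: both variants must agree, only the schedule differs.
void check_unrolling_preserves_result() {
    const Vec a{1, 2, 3, 4, 5, 6, 7, 8};
    const Vec b{8, 7, 6, 5, 4, 3, 2, 1};
    const int rolled = dot_rolled(a, b);
    const int unrolled = dot_unrolled_by_2(a, b);
    assert(rolled == unrolled);
    (void)rolled;
    (void)unrolled;
}
```

The functional result is identical in both forms; the trade-off is between the number of operators (area) and the number of iterations (latency), which is what architecture exploration varies.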

#### **Common features in HLS tools**



- Cross-probing analysis → ability to identify resources in the Gantt chart or RTL and map them to the original C++
- Interfacing with conventional ASIC / FPGA back-end flows (RTL simulators and logic synthesis)
- Architecture exploration → target technology aware
  - Resources aware: operators, memories, interfaces
  - Timing aware
  - Area aware
  - Power aware (sometimes available as an option)

### **HLS solutions on the market**



- CatapultC, originally developed by Mentor Graphics, was spun off to Calypto on 26/08/2011. It was one of the first HLS tools used for ASIC tape-outs. CatapultC is in full production at Thales, where 3 ASICs out of 4 are now produced using this methodology
- Synphony C compiler → Synopsys
- C to Silicon compiler → Cadence
- Cynthesizer → Forte Design Systems

### **ESA High Level Synthesis flow**





#### **CatapultC and Simulink integration**






### FIR filter example (algorithm)



FIR filter direct implementation



$$Y(n) = \sum_{i=0}^{N} H(i) \times X(n-i)$$

Assume an 8-tap FIR filter (N = 7):

$$Y(n) = \sum_{i=0}^{7} H(i) \times X(n-i)$$

$$Y(n) = H(0) \cdot X(n) + H(1) \cdot X(n-1) + H(2) \cdot X(n-2) + H(3) \cdot X(n-3) + \dots + H(7) \cdot X(n-7)$$

For n = 7:

$$Y(7) = H(0) \cdot X(7) + H(1) \cdot X(6) + H(2) \cdot X(5) + H(3) \cdot X(4) + \dots + H(7) \cdot X(0)$$
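
The 8-tap sum above can be sketched in plain C++ as a delay line followed by a multiply-accumulate loop. This is a minimal sketch under stated assumptions, not the `fir.cpp` of the presentation: the class `FirFilter`, its `step` method, the constant `kTaps` and the use of `int` samples are all hypothetical. The comments label the two loops SHIFT and MAC only to echo the loop names used in the architecture exploration slides.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kTaps = 8;  // N = 7, hence N + 1 = 8 taps
using Coeffs = std::array<int, kTaps>;

// Direct-form FIR: Y(n) = sum_{i=0}^{7} H(i) * X(n - i).
// regs_[i] holds X(n - i); samples before the first input are zero.
class FirFilter {
public:
    int step(int x, const Coeffs& h) {
        // SHIFT: age every stored sample by one position.
        for (std::size_t i = kTaps - 1; i > 0; --i) {
            regs_[i] = regs_[i - 1];
        }
        regs_[0] = x;  // newest sample X(n)

        // MAC: multiply-accumulate over the 8 taps.
        int y = 0;
        for (std::size_t i = 0; i < kTaps; ++i) {
            y += h[i] * regs_[i];
        }
        return y;
    }

private:
    std::array<int, kTaps> regs_{};  // zero-initialised delay line
};

// Usage sketch: a unit impulse on the input replays H(0)..H(7) on the output.
void check_impulse_response(const Coeffs& h) {
    FirFilter fir;
    for (std::size_t n = 0; n < kTaps; ++n) {
        const int y = fir.step(n == 0 ? 1 : 0, h);
        assert(y == h[n]);
        (void)y;
    }
}
```

Feeding a single 1 followed by zeros makes each output equal to one coefficient in turn, which is a quick sanity check that the indices `H(i)` and `X(n-i)` line up with the equation.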


ESA UNCLASSIFIED – For Official Use

### FIR filter example (source code)






### FIR filter example (test bench)



**fir-tb.cpp**

A single test bench is used for the regression tests of the RTL against the C++ (Transaction Based Verification). The instructions used by CatapultC synthesis (`CCS_MAIN`, `CCS_DESIGN`, `CCS_RETURN`) are the only ones differing from the original C++ test bench.

    #include <iostream>
    #include "mc_scverify.h"
    #include "fir.h"

    // some functions for generating random test vectors

    //
    // Start of MAIN
    CCS_MAIN(int argc, char *argv[]) {
        ac_fixed<18, 2, true, AC_TRN, AC_WRAP> input;
        data_t coeffs[8];
        // NOTE: the declaration of `output` is not legible on the slide

        // Initialize local variables to zero
        init_input(input);
        init_coeffs(coeffs);
        init_output(output);

        // Main test iterations start here
        for (int iteration = 0; iteration < 100; ++iteration) {
            // Set test values for this iteration
            throw_dice_for_input(input);
            throw_dice_for_coeffs(coeffs);

            // Call original function and capture data
            CCS_DESIGN(fir_filter)(input, coeffs, output);
        }

        // Return success
        CCS_RETURN(0);
    }
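
The regression idea behind this test bench (drive random vectors through a reference model and an implementation, then compare the results) can be sketched without any CatapultC macros. The sketch below is an assumption-laden stand-in: it compares two plain C++ models rather than RTL against C++, it keeps the coefficients fixed within a run, and the names `golden_fir`, `StreamingFir` and `run_regression` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

constexpr std::size_t kTaps = 8;
using Coeffs = std::array<int, kTaps>;

// Golden model: evaluates the FIR sum directly from the full input history.
int golden_fir(const std::vector<int>& x, std::size_t n, const Coeffs& h) {
    assert(n < x.size());
    int y = 0;
    for (std::size_t i = 0; i < kTaps && i <= n; ++i) {
        y += h[i] * x[n - i];
    }
    return y;
}

// Implementation under test: streaming version with an explicit delay line.
struct StreamingFir {
    std::array<int, kTaps> regs{};
    int step(int x, const Coeffs& h) {
        for (std::size_t i = kTaps - 1; i > 0; --i) regs[i] = regs[i - 1];
        regs[0] = x;
        int y = 0;
        for (std::size_t i = 0; i < kTaps; ++i) y += h[i] * regs[i];
        return y;
    }
};

// Runs `iterations` random transactions through both models and
// returns the number of mismatching outputs.
int run_regression(unsigned seed, int iterations) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(-1000, 1000);
    Coeffs h{};
    for (int& c : h) c = dist(rng);

    StreamingFir dut;
    std::vector<int> history;
    int mismatches = 0;
    for (int it = 0; it < iterations; ++it) {
        history.push_back(dist(rng));
        const int expected = golden_fir(history, history.size() - 1, h);
        const int actual = dut.step(history.back(), h);
        if (actual != expected) ++mismatches;
    }
    return mismatches;
}
```

Calling `run_regression(1, 100)` mirrors the 100 iterations of the slide's test loop; a return value of 0 means the two models agreed on every transaction.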


## FIR filter example (architecture constraints)



*[Screenshot: Catapult University Version 2010a.104 (Production Release) Constraint Editor for the `fir_filter` design. Visible items: the synthesis task bar (Add Input Files, Setup Design, Architecture Constraints, Schedule, Generate RTL), the interface resources `input:rsc` (1x18) and `output:rsc` (1x144), the array `regs:rsc` (8x18), and the loop constraint panel for the `main` loop (iteration count, partial unroll, pipeline with initiation interval). The SCVerify entries compare the cycle, RTL, mapped and gate-level VHDL outputs against the untimed C++. The transcript shows the Xilinx Virtex-6 component libraries being read.]*


# FIR filter example (architecture exploration)



| Solution                 | Latency (cycles) | Latency (ns) | Throughput (cycles) | Throughput (ns) | Slack | Total area |
|--------------------------|------------------|--------------|---------------------|-----------------|-------|------------|
| fir_filter.v3 (extract)  | 8                | 400.00       | 10                  | 500.00          | 43.03 | 858.44     |
| fir_filter.v5 (extract)  | 9                | 450.00       | 10                  | 500.00          | 44.89 | 678.12     |
| fir_filter.v11 (extract) | 8                | 400.00       | 8                   | 400.00          | 44.51 | 593.86     |
| fir_filter.v12 (extract) | 1                | 50.00        | 1                   | 50.00           | 42.85 | 2966.85    |

Clock period constrained to 50 ns (20 MHz)

| Solution       | MAIN loop | SHIFT loop | MAC loop          | Notes                                                                                          |
|----------------|-----------|------------|-------------------|------------------------------------------------------------------------------------------------|
| fir_filter.v3  | Rolled    | Rolled     | Rolled            | Default implementation; timing constraints allow the loops to be merged                        |
| fir_filter.v5  | Rolled    | Unrolled   | Rolled            |                                                                                                |
| fir_filter.v11 | Pipelined | Unrolled   | Pipelined, II = 1 | II = initiation interval; with II = 1, one data item is fed into the pipeline every clock cycle |
| fir_filter.v12 | Pipelined | Unrolled   | Unrolled          |                                                                                                |
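
The time columns of the report follow directly from the cycle counts and the 50 ns clock constraint. The short sketch below reproduces that arithmetic; the function names are hypothetical, and reading "throughput cycles" as the number of cycles between two successive outputs is an assumption about the report, not something the slides state.

```cpp
#include <cassert>

// Clock period taken from the slide: 50 ns, i.e. 20 MHz.
const double kClockPeriodNs = 50.0;

// Converts a cycle count from the report into nanoseconds.
double cycles_to_ns(int cycles) {
    assert(cycles > 0);
    return cycles * kClockPeriodNs;
}

// Output rate in samples per second, assuming one result is produced
// every `throughput_cycles` clock cycles.
double samples_per_second(int throughput_cycles) {
    return 1e9 / cycles_to_ns(throughput_cycles);
}
```

Under that reading, `fir_filter.v3` (10 throughput cycles, 500 ns) would deliver 2 million outputs per second, while `fir_filter.v12` (1 cycle, 50 ns) would deliver 20 million, at roughly five times the total area of `fir_filter.v11` (2966.85 against 593.86).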


## FIR filter example (architecture exploration)







## FIR filter example (architecture exploration)








# FIR filter example (Gantt chart analysis)





## FIR filter example (RTL code & target technology netlist generation)






## FIR filter example (RTL code & target technology netlist generation)





#### Fir\_filter.v12: best timing solution

- MAIN loop: pipelined
- SHIFT loop: unrolled
- MAC loop: unrolled


## FIR filter example (implementation after synthesis & place-route)








# FIR filter example (Simulink validation)






### Integration of HLS flow with virtual platform






#### **Virtual Platform deployment view**








### Thanks for your attention

Special thanks to my colleague Jelle Poupaert and to Stephane Labert (Mentor Graphics France) for their support.

Any questions?