A Compact Vector Processor for FPGA Applications

Aaron Severance and Guy Lemieux, University of British Columbia

VENICE (Vector Extensions to NIOS Implemented Compactly and Elegantly) is a soft vector processor (SVP) that accelerates computationally intensive applications. SVPs are specific to FPGAs and target the gap between writing custom hardware in VHDL/Verilog (high performance, low productivity) and writing software for a soft processor (low performance, high productivity). In contrast to previous SVPs, VENICE is a very small (compact) core that is highly efficient with short vectors and optimized for widths of 1 to 4 parallel lanes (32-bit ALUs). This makes VENICE a good building block for a multiprocessor-based system that exploits thread- and task-level parallelism on top of vectors.

VENICE achieves speedups of up to 20x using just 1 vector lane (V1) by supporting subword arithmetic, eliminating inner-loop overhead, and eliminating load-store instructions in favor of DMA transfers that run concurrently with computation. Vector arithmetic uses memory-to-memory operations on a scratchpad, with addresses specified by C pointers. For higher performance with short vectors, VENICE can optionally compute on 2D vectors. A V1-VENICE offers a 2.5x better area-delay product than V1-VEGAS (our previous work, FPGA2011) and 4.7x better than Altera's fastest soft processor, the Nios II/f. VENICE's area-delay product is best at V2 and V4, which are 5.3x and 5.2x better than Nios II/f, respectively.
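The scratchpad programming model described above can be pictured with a small software model in plain C. This is only a sketch under stated assumptions, not the VENICE API: the names `scratchpad`, `sp_dma_in`, `sp_dma_out`, `sp_vadd_u8`, and `double_bytes`, as well as the sizes `SP_BYTES` and `CHUNK`, are all hypothetical. The "DMA" calls are ordinary `memcpy` calls, so nothing actually overlaps here; the comments mark where a real DMA engine could run concurrently with the vector operation. Byte-wide elements are used to echo the idea of subword arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model only: none of these names or sizes come from VENICE. */
#define SP_BYTES 1024   /* assumed scratchpad capacity in bytes */
#define CHUNK    256    /* assumed number of byte elements per chunk */

static uint8_t scratchpad[SP_BYTES];

/* Stand-in for a DMA transfer from main memory into the scratchpad. */
static void sp_dma_in(uint8_t *sp_dst, const uint8_t *mem_src, size_t n) {
    memcpy(sp_dst, mem_src, n);
}

/* Stand-in for a DMA transfer from the scratchpad back to main memory. */
static void sp_dma_out(uint8_t *mem_dst, const uint8_t *sp_src, size_t n) {
    memcpy(mem_dst, sp_src, n);
}

/* Stand-in for one memory-to-memory vector add on byte elements.
 * Operands are plain C pointers into the scratchpad; results wrap mod 256. */
static void sp_vadd_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                       size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = (uint8_t)(a[i] + b[i]);
    }
}

/* Computes out[i] = 2 * in[i] (mod 256) chunk by chunk with two
 * scratchpad buffers, so the next chunk can be fetched while the
 * current one is being processed (double buffering). */
void double_bytes(uint8_t *out, const uint8_t *in, size_t n) {
    assert(2 * CHUNK <= SP_BYTES);
    uint8_t *buf[2] = { scratchpad, scratchpad + CHUNK };
    if (n == 0) {
        return;
    }

    /* Fetch the first chunk before entering the loop. */
    sp_dma_in(buf[0], in, n < CHUNK ? n : CHUNK);

    int cur = 0;
    for (size_t off = 0; off < n; off += CHUNK) {
        size_t len = (n - off < CHUNK) ? n - off : CHUNK;
        size_t next = off + CHUNK;

        /* On real hardware this prefetch could overlap with the add below. */
        if (next < n) {
            size_t next_len = (n - next < CHUNK) ? n - next : CHUNK;
            sp_dma_in(buf[cur ^ 1], in + next, next_len);
        }

        /* In-place vector add: buf[cur] = buf[cur] + buf[cur]. */
        sp_vadd_u8(buf[cur], buf[cur], buf[cur], len);

        sp_dma_out(out + off, buf[cur], len);
        cur ^= 1;   /* swap the roles of the two buffers */
    }
}
```

Calling `double_bytes(out, in, n)` on two ordinary arrays exercises the whole flow. The point of the sketch is the structure: the data loop contains no explicit element-by-element loads or stores in the caller, only bulk transfers and one vector operation per chunk.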