Last active December 9, 2023 08:48
Google Summer of Code 2023 Final Work Product Submission

gsoc blog cover

Google Summer of Code 2023 - CHIPS Alliance

Final Report : DSP Hard Block Integration in F4PGA

About Me

I am an undergraduate student pursuing a B.Tech degree in Electronics and Communication Engineering at the National Institute of Technology, Durgapur. I am currently in the 4th (final) year of my undergraduate studies and expect to graduate in 2024. I have been working with CHIPS Alliance throughout this summer. It has truly been a profound journey of learning and growth. Each day brought new challenges, all of which have contributed to my personal and professional development.

About CHIPS Alliance

The CHIPS Alliance is a collaborative organization dedicated to the advancement of open-source hardware solutions. It specializes in the development and hosting of high-quality intellectual property (IP) cores, interconnect IP encompassing both physical and logical protocols, and open-source software tools tailored for design and verification purposes. CHIPS Alliance's mission is to foster a collaborative and accessible environment that significantly reduces the expenses associated with IP and hardware development.

Project Overview

F4PGA is an open-source FPGA toolchain designed as a free alternative to proprietary computer-aided design tools like Xilinx’s Vivado. It is a workgroup under the CHIPS Alliance. Previously, the toolchain could neither map designs to DSP blocks nor generate DSP block bitstreams for Xilinx 7-series FPGA devices. This project aimed to integrate support for the DSP48E1 hard block in F4PGA so that designs using DSPs can be synthesized, placed, and routed correctly. This required diagnosing and implementing changes throughout the F4PGA toolchain so that bitstreams for DSP designs can be generated successfully with open-source tools.

Project Deliverables

  1. Support for DSP48E1 within the F4PGA architecture definitions.
    • Status: Completed
  2. Testing flow for designs using Xilinx 7-series DSP hard blocks which include
    • Verilog-to-Bitstream using the F4PGA toolchain.
      • Status: Completed
    • Fasm2bels to re-generate the original netlist from the bitstream output of F4PGA.
      • Status: Completed
    • Proof-test through Vivado to verify the correctness of the netlist.
      • Status: Completed

Achieved Goals

  1. DSP48E1 support in f4pga-arch-defs

  • Previously, there was no support for DSP within the F4PGA toolchain. The architecture files and routing graphs, which are used to place and route the implemented design on the FPGA fabric, did not contain information about the DSP blocks for any part. Hence, the toolchain behaved as if there were no DSP resources available on the target board. I needed to model the DSP48E1 site using its official documentation and regenerate the routing graph and the architecture file for the targeted board.
  • I referred to the VTR docs and Xilinx documentation for this purpose. To integrate a primitive into VTR, two files are required within the architecture definitions: dsp48e1.model.xml and dsp48e1.pb_type.xml. Techmaps for Yosys can then be prepared correctly using these primitives.
  • Model XML (xxx.model.xml) contains general information about the module’s input and output ports and their relations. Physical Block XML (xxx.pb_type.xml) describes the actual layout of the primitive, with information about the timings, internal connections, etc.
  • To allow VPR to elaborate the Yosys output netlist, we need to perform a technology mapping pass that transforms the Yosys-generated primitives into VPR-compatible ones. To do so, I modified cells_map.v and cells_sim.v in <f4pga-arch-defs/xilinx/xc7/techmap/>. They are necessary to translate the gate-level netlist output by Yosys onto a set of VPR-readable cells that can be packed, placed, and routed onto the FPGA. I added a DSP48E1 module to <cells_map.v> that maps to DSP48E_VPR, defined in <cells_sim.v>. “cells_map.v” defines how specific cells need to be re-mapped and “cells_sim.v” defines the VPR-specific cells that will be present in the .eblif output.
  • I included sequential and internal-path combinational delays in the DSP primitive. Some placeholder timing values had to be included because the real values were not present in the database. Each mode (or combination of features) of DSP48E1 has different internal timings, and each of them needs a separate VPR cell. Right now only one mode is enabled in the primitive. Other modes will have the same structure but different blif_models to map to. bels.json must have a different set of timing info for each VPR cell. Since the total number of feature combinations is quite large, only the few with significant differences in timings will be considered.
  • I also added DRC checks for DSP48E1 in the cells_map.v.
  • I have completed the above tasks in the following PR:
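
The DRC checks mentioned above live in the Verilog techmap, and this report does not list which checks were added. As a rough illustration of the idea only, the following Python sketch validates a few DSP48E1 parameters against the value ranges given in the Xilinx user guide. The function name and the choice of parameters are assumptions made for this example, not the contents of cells_map.v.

```python
# Illustrative only: the real checks are written in Verilog inside cells_map.v.
# Allowed values follow the DSP48E1 attribute table in the Xilinx user guide.
ALLOWED = {
    "AREG": {0, 1, 2},
    "BREG": {0, 1, 2},
    "CREG": {0, 1},
    "MREG": {0, 1},
    "PREG": {0, 1},
    "USE_MULT": {"NONE", "MULTIPLY", "DYNAMIC"},
}


def check_dsp48e1_params(params):
    """Return a list of violation messages; an empty list means the check passed.

    `params` maps parameter names to values. Parameters that are not listed in
    ALLOWED are ignored by this sketch.
    """
    errors = []
    for name, value in params.items():
        allowed = ALLOWED.get(name)
        if allowed is not None and value not in allowed:
            choices = sorted(allowed, key=str)
            errors.append(f"{name}={value!r} is not one of {choices}")
    return errors


if __name__ == "__main__":
    for message in check_dsp48e1_params({"AREG": 3, "USE_MULT": "MULTIPLY"}):
        print("DRC violation:", message)
```

Rejecting an invalid parameter during technology mapping means the error surfaces before packing and placement, where it would be much harder to trace back to the source design.
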
  2. DSP48E1 support in fasm2bels

  • Fasm2bels serves as a utility to facilitate the integration of a FASM file into Vivado. This functionality is achieved through the generation of a file that outlines the BEL connections (in the form of tech-mapped Verilog) and the creation of TCL commands for Vivado, which subsequently secure the BEL placements. Assuming the absence of any issues, the process is expected to result in Vivado producing a bitstream that matches the one generated by the F4PGA using the original FASM file.
  • Earlier, fasm2bels had no support for decoding DSP48E1 features. I needed to add a Python script capable of processing DSP48E1-related features in the FASM.
  • I added support for processing DSP tiles when fasm2bels runs on a target. Each DSP tile has 2 DSP48E1 sites. Features related to the DSP_0 and DSP_1 sites are segregated and then processed separately. For each DSP site, a bel (named DSP48E1) is instantiated with parameters that are set according to the features set in the FASM. Sinks and sources are added to the bel by referring to the user guide for DSP48E1.
  • Cascaded inputs and outputs are added as unconnected ports since these are non-configurable. USE_MULT, SEL_PATTERN, and USE_PATTERNDETECT are not found in the segbits database, so they are always set to their default values. For parameters like MASK, PATTERN, IS_ALUMODE_INVERTED, etc., the binary value is decoded from the given multibit features in the FASM.
  • A cleanup function is added to be executed after completing the routing to remove redundant DSP sites. For each DSP site, we check if any of its outputs are in use. If not, the site is removed. Some designs may use only cascaded outputs of a particular DSP site (e.g. an intermediate DSP site in a filter). To check if these outputs are in use, we need to see if the adjacent sites are using the corresponding cascaded inputs which in turn is determined by the features of that site. Depending on the site's Y-coordinate, it may be cascaded with the DSP site within the same tile or the tile adjacent to it.
  • I have completed the above tasks in the following PR:
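
The three steps described above (segregating features per site, decoding multibit parameters, and pruning unused sites) can be sketched in Python as follows. This is a minimal illustration, not the code from the PR: the function names, the feature strings, and the dictionary keys are hypothetical. It also assumes that cascade outputs of the site at index y feed the site at index y + 1 in the same column, and it ignores any inverted feature encoding.

```python
import re
from collections import defaultdict

# Matches a single-bit feature such as "MASK[3]".
_BIT = re.compile(r"^(?P<name>\w+)\[(?P<bit>\d+)\]$")


def split_by_site(features, sites=("DSP_0", "DSP_1")):
    """Group feature suffixes by the DSP site named in the feature path."""
    per_site = defaultdict(list)
    for feature in features:
        parts = feature.split(".")
        for i, part in enumerate(parts):
            if part in sites:
                per_site[part].append(".".join(parts[i + 1:]))
                break
    return dict(per_site)


def decode_multibit(site_features, name, width):
    """Build an integer from set bits like 'MASK[3]'; unset bits stay 0."""
    value = 0
    for feature in site_features:
        match = _BIT.match(feature)
        if match and match.group("name") == name:
            bit = int(match.group("bit"))
            if bit >= width:
                raise ValueError(f"{name}[{bit}] exceeds width {width}")
            value |= 1 << bit
    return value


def to_verilog_literal(value, width):
    """Format an integer as a sized binary Verilog literal."""
    return f"{width}'b{value:0{width}b}"


def prune_unused_sites(column):
    """Return the indices of the sites in one DSP column that must be kept.

    `column` is ordered by Y. Each entry is a dict with two hypothetical keys:
    'outputs_used' (a regular output drives something) and 'uses_cascade_in'
    (the site's features select a cascaded input).
    """
    keep = [False] * len(column)
    # Walk from the top down so the decision for the site above is known.
    for y in reversed(range(len(column))):
        above_consumes = (
            y + 1 < len(column)
            and keep[y + 1]
            and column[y + 1]["uses_cascade_in"]
        )
        keep[y] = column[y]["outputs_used"] or above_consumes
    return [y for y, kept in enumerate(keep) if kept]


if __name__ == "__main__":
    feats = [
        "DSP_L_X10Y5.DSP_0.MASK[0]",
        "DSP_L_X10Y5.DSP_0.MASK[3]",
        "DSP_L_X10Y5.DSP_1.AREG_0",
    ]
    by_site = split_by_site(feats)
    mask = decode_multibit(by_site["DSP_0"], "MASK", 48)
    print(to_verilog_literal(mask, 48))
```

In this sketch a site whose only consumer is a site that was itself removed is removed too, which is why the column is walked from the top down.
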
  3. Testing flow for designs using Xilinx 7-series DSP hard blocks

  • I needed to add tests in <f4pga-arch-defs/xilinx/xc7/tests/> to verify the correctness of the newly added support for the DSP block. This can be done by cross-checking the bitstream generated by F4PGA against the one generated by Vivado. The following diagram illustrates this: image
  • First, F4PGA is used to generate a bitstream from the given source design. The FASM produced in one of the intermediate steps of F4PGA is then used as input to fasm2bels, which produces a tech-mapped Verilog file along with placement constraints. These are then used to generate a bitstream using Vivado. For this test to succeed, the two bitstreams must be identical.
  • I have completed the above tasks in the following PR:
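
The final comparison step can be sketched as a simple file-identity check. The helper names below are hypothetical and this is not the test harness used in f4pga-arch-defs. It assumes both inputs are directly comparable byte for byte, for example bitstreams without a metadata header.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bitstreams_match(f4pga_bitstream, vivado_bitstream):
    """True when the two bitstream files have identical contents."""
    return file_digest(f4pga_bitstream) == file_digest(vivado_bitstream)
```

A call such as bitstreams_match("f4pga.bit", "vivado.bit"), with both file names being placeholders, would then decide whether the test passes.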

Future Work

  1. Improve Project X-Ray so that it can use new Vivado versions. (prjxray only supports Vivado 2017.2 currently)
  2. Add support for new FPGA architectures.
  3. Enable behavioural modelling of DSP elements. Currently only structural instantiation is supported for DSP elements.

Important Links

  1. Pull Requests:
  2. Blogs:


  • The successful completion of this project has enabled designs using DSPs to be synthesized, placed, and routed correctly. This should encourage wider adoption of F4PGA and further the growth of open-source hardware development toolchains.
  • Working as a Google Summer of Code contributor at CHIPS Alliance has been an incredibly transformative experience that has significantly contributed to my personal and professional development. I am indebted to my mentor Maciej Kurc for his unwavering support and guidance throughout the project duration.
  • The knowledge, experiences, and relationships I gained during this period have not only equipped me with valuable skills but have also instilled in me a deep appreciation for the open-source community and its potential to shape the future of technology. I am excited to continue my journey of learning and growth in the world of open-source hardware.