Workshop and Tutorial Schedule

All times shown in Pacific Standard Time (UTC−8)

To sign up for workshops and tutorials, please use the registration link on the conference registration page (https://wp.isfpga.org/registration/). Some events may require additional registration; see each event for details.

Lunch will not be provided, but there will be a lunch break from 12:15PM to 1:45PM.

Type | Date | Time | Name | Location
Tutorial | March 1 | 09:00AM - 12:15PM | T1: Python-Based Accelerator Design and Programming with Allo | TBD
Tutorial | March 1 | 09:00AM - 12:15PM | T2: Mastering AI Optimization with Altera's FPGA AI Suite | TBD
Tutorial | March 1 | 09:00AM - 12:15PM | T3: Democratizing High-Level Synthesis Research: From Rapid Simulation Tools to ML-Ready Datasets | TBD
Tutorial | March 1 | 01:45PM - 05:00PM | T4: Smart Embedded Vision AI on Microchip FPGAs using VectorBlox™ Accelerator SDK | TBD
Tutorial | March 1 | 01:45PM - 05:00PM | T5: Weightless Neural Networks: An Efficient Edge Inference Architecture | TBD
Tutorial | March 1 | 01:45PM - 05:00PM | T6: CEDR: A Holistic Software and Hardware Design Environment for Hardware Agnostic Application Development and Deployment on FPGA-Integrated Heterogeneous Systems | TBD
Workshop | March 1 | 09:00AM - 12:15PM | W1: Workshop on Domain-Specialized FPGAs | TBD
Workshop | March 1 | 01:45PM - 05:00PM | W2: FPGAs for -omics: The road ahead | TBD

Workshop and Tutorial Details

T1: Python-Based Accelerator Design and Programming with Allo

March 1, 2025 | 09:00AM – 12:15PM

Organizers: Hongzheng Chen (Cornell), Niansong Zhang (Cornell), Shaojie Xiang (Cornell), Zhiru Zhang (Cornell)

Special-purpose hardware accelerators are increasingly pivotal for sustaining performance improvements in emerging machine learning and scientific applications, especially as the benefits of technology scaling continue to diminish. However, designers currently lack effective tools and methodologies to construct complex, high-performance accelerator architectures productively. Existing high-level synthesis (HLS) tools often require intrusive source-level changes to attain satisfactory quality of results. And although several new accelerator design languages (ADLs) have been introduced to enhance or replace HLS, they prove less effective for realistic hierarchical designs with multiple kernels, even when the design hierarchy is flattened.

In this tutorial, we will present Allo, a Python-based composable programming model for efficient accelerator design. Allo decouples hardware customizations (compute, memory, communication, and data types) from algorithm specifications, encapsulating them into a set of customization primitives. By preserving the hierarchical structure of input programs, Allo integrates customizations across functions in a bottom-up, type-safe manner, enabling holistic optimizations that span function boundaries. A minimal code sketch follows the list below. Specifically, we will:
1. Demonstrate the concept of decoupling customizations from algorithms.
2. Utilize Allo to optimize both single-kernel and multi-kernel designs, with examples including PolyBench, convolutional neural networks (CNNs), and large language models (LLMs).
3. Discuss our latest advancements in spatial and temporal composition, and showcase the mapping flow to AMD Versal FPGAs and AMD Ryzen AIE hardware.
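
To make this decoupling concrete, here is a minimal GEMM sketch in Allo, adapted from the examples in the open-source repository linked below. The primitives shown (allo.customize, allo.grid, reorder, pipeline, build) follow the public examples, but exact signatures may vary across Allo versions, so treat this as an illustrative sketch rather than a definitive reference.

    import allo
    from allo.ir.types import int32

    # Algorithm specification: a plain 32x32 integer GEMM, written once and
    # left untouched by the hardware customizations below.
    def gemm(A: int32[32, 32], B: int32[32, 32]) -> int32[32, 32]:
        C: int32[32, 32] = 0
        for i, j, k in allo.grid(32, 32, 32):
            C[i, j] += A[i, k] * B[k, j]
        return C

    # Hardware customizations live on a separate schedule object, so the
    # algorithm itself never changes.
    s = allo.customize(gemm)
    s.reorder("k", "j")   # interchange loops so j becomes innermost
    s.pipeline("j")       # pipeline the innermost loop
    mod = s.build()       # default CPU target, used for functional testing

In the public examples, the built module is directly callable on NumPy arrays (e.g., mod(A, B)), and the same schedule can be rebuilt for an HLS backend by changing the build target.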

This tutorial is based on our publications in PLDI’24, FPGA’25, ’24, ’22, ’19, DAC’22, and FCCM’24. The Allo framework is open-source and available at https://github.com/cornell-zhang/allo.


T2: Mastering AI Optimization with Altera’s FPGA AI Suite

March 1, 2025 | 09:00AM – 12:15PM

Organizers: Cinthya Rosales (Altera), Rama Venkata (Altera)

This tutorial will focus on using Altera's FPGA AI Suite to deploy and optimize AI models on FPGAs. Participants will learn how to optimize for elements such as power, throughput, and latency, which are crucial for real-time applications requiring efficient inference at the edge. The tutorial will cover:

  • Introduction to the FPGA AI Suite
    • How to install the software using Docker
    • The different components of the toolset
  • Hands-on case study
    • Using a pre-trained model, we will analyze the different architectures
      provided in the toolset and see how they affect the different elements of
      optimization.

T3: Democratizing High-Level Synthesis Research: From Rapid Simulation Tools to ML-Ready Datasets

March 1, 2025 | 09:00AM – 12:15PM

Organizers: Cong “Callie” Hao (Georgia Tech), Rishov Sarkar (Georgia Tech), Stefan Abi-Karam (Georgia Tech)

High-Level Synthesis (HLS) research continues to gain traction with the development of new HLS paradigms, including machine learning (ML) / deep learning-driven design optimization and the advent of LLMs for HLS designs. In this tutorial, we present two open-source tools to facilitate this new wave of HLS research.

Fast, Accurate Simulation. Simulating HLS designs is a frustratingly slow process, taking hours to days for complex designs and posing a significant challenge for incorporating simulation data into HLS research. We first introduce LightningSim (FCCM 2023 Best Paper Runner-Up), which provides extremely accurate performance simulation but can be orders of magnitude faster than C/RTL co-simulation. We will demonstrate how LightningSim’s push-button command-line tool works with existing Vitis HLS projects to obtain simulation data effortlessly—no additional configuration required.

HLS Datasets for Machine Learning and Beyond. Aside from simulation data, having high-quality data from all steps of the HLS design flow is critical for quickly validating novel research ideas. However, comprehensive, usable datasets of HLS design code and data are scarce. To address this, we introduce HLSFactory (MLCAD 2024 Best Paper), a Python-based data infrastructure that empowers users to quickly build their own HLS datasets and contribute data back to the community. We will show the versatility of our framework through several case studies, including post-implementation QoR prediction, HLS tool version regression benchmarking, executing Xilinx and Intel flows with parameterized designs, and contributing new user designs.

We invite any researchers involved with HLS to join us, from individuals using HLS to develop FPGA designs to those working on HLS tools and methodologies.


T4: Smart Embedded Vision AI on Microchip FPGAs using VectorBlox™ Accelerator SDK

March 1, 2025 | 01:45PM – 05:00PM

Organizers: Navdhish Gupta (Microchip), Michie Shiroma (Microchip)

In this workshop we present Microchip's VectorBlox™ solution, which accelerates machine learning inference on an FPGA, delivering higher performance at lower power. We show how to run Microchip's VectorBlox™ SDK tutorials that compile and simulate models for tasks such as classification and object detection, and how to target the compiled objects to Microchip FPGAs. The VectorBlox Accelerator Software Development Kit (SDK) offers the most power-efficient FPGA-based Convolutional Neural Network (CNN) Artificial Intelligence/Machine Learning (AI/ML) inference on PolarFire® and PolarFire® SoC (System on Chip) FPGAs.

Specifically, after this tutorial, attendees will be able to:

  • Describe the fundamentals of the VectorBlox Accelerator SDK and the VectorBlox solution, and why it might be used.
  • Download and configure the VectorBlox™ SDK and go through the steps to use it.
  • Run tutorials that import, optimize, and compile models, and then simulate the generated runtime objects targeted to PolarFire SoC FPGAs.
  • Run networks generated in the SDK on a PolarFire SoC Video Kit.

T5: Weightless Neural Networks: An Efficient Edge Inference Architecture

March 1, 2025 | 01:45PM – 05:00PM

Organizers: Lizy K. John (UT Austin), Priscila M. V. Lima (UFRJ, Brazil), Felipe M. G. Franca (UFRJ, Brazil), and Google

Mainstream artificial neural network models, such as Deep Neural Networks (DNNs), are computation-heavy and energy-hungry. Weightless Neural Networks (WNNs) are natively built with RAM-based neurons and represent an entirely distinct type of neural network computing compared to DNNs. WNNs are extremely low-latency, low-energy, and suitable for efficient, accurate edge inference. The WNN approach draws implicit inspiration from the decoding process observed in the dendritic trees of biological neurons, making neurons based on Random Access Memories (RAMs) and/or Lookup Tables (LUTs) ready-to-deploy neuromorphic digital circuits. Since FPGAs are abundant in LUTs, LUT-based WNNs are a natural fit for implementing edge inference on FPGAs.
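
To give a flavor of the RAM-neuron idea, the toy Python sketch below implements a single WiSARD-style weightless discriminator: input bits are grouped into tuples, each tuple addresses one RAM node, training writes 1s into the addressed locations, and inference counts how many nodes recognize their address. This is a deliberately minimal illustration of the classic scheme, not the DWN model covered in the tutorial, and all names in it are our own.

    import numpy as np

    class RAMDiscriminator:
        """Toy WiSARD-style weightless discriminator (illustrative only)."""

        def __init__(self, input_bits, tuple_size, seed=0):
            rng = np.random.default_rng(seed)
            self.mapping = rng.permutation(input_bits)  # random bit-to-tuple mapping
            self.tuple_size = tuple_size
            self.rams = [set() for _ in range(input_bits // tuple_size)]

        def _addresses(self, x):
            bits = x[self.mapping]  # x is a 0/1 NumPy vector of length input_bits
            for i in range(len(self.rams)):
                chunk = bits[i * self.tuple_size:(i + 1) * self.tuple_size]
                yield i, int("".join(map(str, chunk)), 2)  # tuple bits -> RAM address

        def train(self, x):
            for i, addr in self._addresses(x):
                self.rams[i].add(addr)  # write a 1 at the addressed location

        def score(self, x):
            # Count RAM nodes that recognize their address: no weights and no
            # multiplications, only table lookups.
            return sum(addr in self.rams[i] for i, addr in self._addresses(x))

Classification trains one discriminator per class and picks the class whose discriminator fires the most RAM nodes. An FPGA implementation maps each RAM node onto a LUT or a small block RAM, which is why LUT-rich FPGAs are such a natural substrate for WNNs.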

WNNs have been demonstrated to be an energy-efficient AI model, both in software and in hardware. For instance, the most recent DWN (Differentiable Weightless Neural Network) model demonstrates up to a 135× reduction in energy costs in FPGA implementations compared to other multiplication-free approaches, such as binary neural networks (BNNs) and DiffLogicNet, up to 9% higher accuracy in deployments on constrained devices, and up to a 42.8× reduction in circuit area for ultra-low-cost chip implementations. This tutorial will help participants understand how WNNs work and why WNNs were underdogs for so long, introduce the most recent members of the WNN family, such as BTHOWeN, LogicWiSARD, COIN, ULEEN, and DWN, and contrast them with BNNs and LogicNets.


T6: CEDR: A Holistic Software and Hardware Design Environment for FPGA-Integrated Heterogeneous Systems

March 1, 2025 | 01:45PM – 05:00PM

Website: https://ua-rcl.github.io/presentations/fpga25/index.html

Organizers: Serhan Gener (University of Arizona), Sahil Hassan (University of Arizona), Ali Akoglu (University of Arizona)

Building on tutorials conducted during ESWEEK’23 and ISFPGA’24, this tutorial caters to audiences with diverse backgrounds and varying levels of expertise through a series of hands-on activities tailored to the needs of three distinct user types: naïve application developers, system designers, and resource management heuristic developers. Throughout these activities, the common thread is lifting the barriers to research and enabling productive application development and deployment on FPGA-integrated heterogeneous systems.

As computing platforms grow more heterogeneous, system designers continue to explore design methodologies that leverage increased levels of heterogeneity to maximize performance within specified constraints. In line with this objective, we have developed CEDR, an open-source unified compilation and runtime framework designed for FPGA-integrated heterogeneous systems, as part of the DARPA DSSoC program. CEDR empowers users to seamlessly develop, compile, and deploy applications on off-the-shelf heterogeneous computing platforms, while offering portability across diverse Linux-based systems to minimize migration efforts.

In this tutorial, participants will explore CEDR's capabilities through hands-on exercises, without needing specialized hardware expertise, focusing on functional verification, performance analysis, and design space exploration through FPGA-based heterogeneous SoC emulation. They will gain practical experience integrating accelerators, deploying applications, and applying profiling techniques. Additionally, the tutorial will introduce CEDR 2.0, which brings an 86x reduction in runtime overhead and new features that improve user productivity and performance. The tutorial aims to lower the knowledge barrier for participants, enabling them to optimize applications for FPGA-based heterogeneous systems and explore the latest advancements in CEDR.


W1: Workshop on Domain-Specialized FPGAs

March 1, 2025 | 09:00AM – 12:15PM

Organizers: Abhishek Jain (AMD), Aman Arora (Arizona State University), Cong “Callie” Hao (Georgia Tech), Andrew Schmidt (AMD)

FPGAs are versatile devices used in various fields like digital signal processing, wireless communications, electronic warfare, and machine learning (ML). To meet diverse application needs, FPGA architectures have evolved to include larger blocks such as DSP slices, Block RAMs, and hardened controllers like PCIe and DDR. FPGA vendors offer multiple families of FPGAs, traditionally categorized by size, features, and building block resources. However, with the end of Moore’s law, specialization has become crucial for high performance, leading to the development of FPGAs tailored for specific domains. Examples include Intel Stratix 10 NX for ML, AMD/Xilinx Versal for ML & signal processing, SmartNICs for networking, RFSoC FPGAs for signal processing, and HBM-enabled FPGAs for high-performance computing.

Designing new FPGA architectures for emerging applications is challenging and requires analyzing workloads representative of the targeted domains. Recognizing the importance of domain-specialized FPGAs and of understanding their workloads, the first International Workshop on Domain-Specialized FPGAs will be held at ISFPGA 2025. The workshop aims to share domain knowledge that can assist the architecture, design, and evaluation of future domain-specialized FPGAs. It will feature presentations from experts in various domains highlighting how FPGA architecture is used, which FPGA features become the limiting factors in performance, and a wish list of features for future domain-specialized FPGAs. This will be followed by a panel discussion. A poster session will also be held, focusing on domain-specialized workloads, architectures, and CAD tools/algorithms for FPGAs.


W2: FPGAs for -omics: The road ahead

March 1, 2025 | 01:45PM – 05:00PM

Organizers: Madhura Purnaprajna (AMD and PES University), Hari Sadasivan (AMD and University of Washington)

Omics (genomics, proteomics, metabolomics, etc.) has emerged as one of the largest sources of data on the planet, generating an unprecedented volume of information that poses significant computational and real-time processing challenges. Reconfigurable computing architectures, including FPGAs, offer potential solutions for addressing the emergent computational patterns encountered in -omics fields, owing to their adaptability and customizability. Unlike GPUs, however, FPGAs have yet to evolve into general-purpose accelerators. Are FPGAs currently positioned to offer viable alternatives to existing omics solutions? Is there a unique opportunity for FPGAs where cost, power, and performance align within the -omics space?

The objective of this workshop is to bring together experts in the fields of FPGA architecture and computational genomics to jointly explore and democratize the landscape of performance acceleration, with the purpose of improving application domains including, but not limited to, healthcare, biotechnology, agriculture, and environmental health. We will examine open questions such as: What are the critical omics workloads that will benefit from acceleration? Is there a market? How should FPGAs evolve to handle these workloads? Is the ecosystem ready?

The workshop will provide a platform for deliberation on the computational demands and software benchmarks in the omics space, to discuss gaps in existing FPGA architectures and the software stack, and potential mechanisms to bridge those gaps.