Created by Joseph Joy
Accelerator's control and status register interface. It allows a processor to manage the hardware.
0x10: zeta parameter
0x18: epsilon parameter
0x20: delta parameter
0x00 (Control Register): A specific bit in this register can act as a start signal for your FSM.
0x00 (Status Register): Read a bit to see if the core is busy or done.
The coordination between the FSM, AXI interfaces, and the algorithm cores is the key to achieving the high-throughput, pipelined performance described in the paper. Think of it as a perfectly synchronized factory assembly line.
Here is how they work together to pipeline the processing of LiDAR frames.
zeta, epsilon, delta) into registers.
start command that kicks off the entire process.
TLAST signal to tell the FSM, "This is the last piece of the current job (frame)."
IDLE, CLEARING_GRID, PROCESSING) and tells the two algorithm cores when to work based on the signals from the AXI interfaces.
Cell Grid Core (Workstation 1): Takes raw materials (points) from the input belt and does the first processing step (populating the BRAMs).
Ground Segmentation Core (Workstation 2): Takes the semi-finished product from Workstation 1 (via the BRAMs) and completes the final assembly (classification).
Let's trace the flow of two consecutive frames, Frame N and Frame N+1, to see the pipeline in action.
IDLE. The processor has already configured the parameters via AXI-Lite.
PROCESSING state, and enables the Cell Grid Core (Workstation 1).
Cell Grid Core processes each point of Frame N, writing to the 'A' ports of the Grid BRAM and Points BRAM.
Ground Segmentation Core (Workstation 2) is idle. It has nothing to do yet.
TLAST signal is asserted for one clock cycle.
TLAST signal. This is the trigger for the entire pipeline. It immediately does two things in the very next clock cycle:
start signal to the Ground Segmentation Core (Workstation 2), telling it, "The data for Frame N is ready in the BRAMs. Begin your work."
Cell Grid Core (Workstation 1) enabled. Why? Because the points for the next frame, Frame N+1, are arriving on the AXI-Stream input right behind Frame N.
This is the state where the system achieves maximum throughput. For the entire duration that Frame N+1 is being processed:
Cell Grid Core is busy processing points from Frame N+1. It reads from the AXI-Stream Slave and writes to the 'A' ports of the BRAMs.
Ground Segmentation Core is simultaneously busy processing the data from Frame N. It reads from the 'B' ports of the BRAMs and sends the results to the AXI-Stream Master.
This parallel operation is only possible because the Block RAMs are dual-port. One core can write to one side of the memory while the other core reads from the other side without conflict.
The FSM's job is simply to manage these start/stop signals at the frame boundaries. The AXI interfaces ensure the data keeps flowing smoothly, and the algorithm cores just focus on their dedicated tasks. This coordination allows the accelerator to start processing a new frame long before it has finished with the previous one, which is the very definition of a pipelined architecture.
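The hand-off at frame boundaries can be sketched as a small scheduling model. This is only an illustration under simplifying assumptions: each core needs exactly one frame period per frame, the CLEARING_GRID state is ignored, and `pipeline_schedule` is a hypothetical helper name that does not come from the design.

```python
from typing import List, Optional, Tuple

# (frame in Cell Grid Core, frame in Ground Segmentation Core); None = idle
Slot = Tuple[Optional[int], Optional[int]]


def pipeline_schedule(num_frames: int) -> List[Slot]:
    """Model which frame each core works on during each frame-long time slot.

    Assumption: a slot ends when TLAST of the frame currently in the
    Cell Grid Core arrives, and both cores finish within that slot.
    """
    schedule: List[Slot] = []
    cell_grid: Optional[int] = 0 if num_frames > 0 else None
    ground_seg: Optional[int] = None

    while cell_grid is not None or ground_seg is not None:
        schedule.append((cell_grid, ground_seg))
        # Frame boundary (TLAST): start the second core on the frame that
        # was just written, and keep the first core enabled if another
        # frame follows on the input stream.
        ground_seg = cell_grid
        if cell_grid is not None and cell_grid + 1 < num_frames:
            cell_grid += 1
        else:
            cell_grid = None
    return schedule


if __name__ == "__main__":
    for slot, (cg, gs) in enumerate(pipeline_schedule(3)):
        print(f"slot {slot}: cell_grid={cg} ground_seg={gs}")
```

For three frames the model returns `[(0, None), (1, 0), (2, 1), (None, 2)]`: after the first slot both cores are busy at the same time, each on a different frame, until the input runs out.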
start, done, reset) for all other modules.
Grid BRAM at the beginning of each new frame processing.
Size determined by the maximum number of points in a frame.
Cell Grid Core. It writes the is_valid, point_z and cell_index for every point in the frame. The memory address corresponds to the point's index in the frame.
Ground Segmentation Core.
max_points_per_frame (e.g., ~70k for VLP-16) x (Z_coord_width + cell_index_width).
TVALID, TREADY, TDATA, TLAST). TLAST is used to signal the end of a frame.
Responsible for the first step: creating and populating the grid.
Processes each point from the input LiDAR frame to determine which grid cell it belongs to.
Functionality:
cell_index_calculator also determines if the point is inside the grid boundaries. This generates the is_valid flag.
{is_valid=1, point_z, cell_index} to the Points BRAM. It also proceeds with the read-modify-write operation on the Grid BRAM.
{is_valid=0, DONT_CARE, DONT_CARE} to the Points BRAM. It does not perform any operation on the Grid BRAM. The point is effectively ignored by the grid creation logic, as intended.
(is_valid, point_z, cell_index) tuple to the Points BRAM.
Grid BRAM to update the Z_min and Z_max values for the corresponding cell_index.
Top-Level Controller detects the start of a new frame from the AXI-Stream Slave Interface and asserts a signal to clear the Grid BRAM, setting all Z_min values to +infinity and Z_max to -infinity.
Cell Grid Core one by one.
i, the Cell Grid Core performs two actions simultaneously:
cell_index. It then reads the existing (Z_min, Z_max) from the Grid BRAM, using that cell_index as the address, compares them with the current point's Z coordinate, and writes back the updated values if necessary.
is_valid flag, Z coordinate (point_z) and its calculated cell_index into the Points BRAM at address
TLAST signal. The Top-Level Controller now knows that the Grid BRAM holds the complete elevation map for Frame 'N' and the Points BRAM holds all the necessary point data.
Performs the second step: classifying points as ground or non-ground. It uses the pre-calculated grid information (Z_min, Z_max) to make its decision for each point.
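That pre-calculated grid information comes from the first step. As a minimal software sketch of how the Cell Grid Core's two actions might look (the read-modify-write on the Grid BRAM and the write to the Points BRAM), consider the model below. The grid geometry (`GRID_WIDTH`, `GRID_LENGTH`, `CELL_SIZE`, a grid centred on the sensor) and the function names are assumptions made for illustration; the text does not specify them.

```python
import math
from typing import List, Optional, Tuple

GRID_WIDTH = 4    # hypothetical: tiny grid so the example stays readable
GRID_LENGTH = 4   # hypothetical
CELL_SIZE = 1.0   # hypothetical cell edge length

GridEntry = Tuple[float, float]                           # (Z_min, Z_max)
PointEntry = Tuple[int, Optional[float], Optional[int]]   # (is_valid, point_z, cell_index)


def cell_index_calculator(x: float, y: float) -> Optional[int]:
    """Return the linear cell index, or None if the point is outside the grid.

    Assumption: the grid is centred on the sensor origin.
    """
    col = math.floor(x / CELL_SIZE) + GRID_WIDTH // 2
    row = math.floor(y / CELL_SIZE) + GRID_LENGTH // 2
    if 0 <= col < GRID_WIDTH and 0 <= row < GRID_LENGTH:
        return row * GRID_WIDTH + col
    return None


def populate(points: List[Tuple[float, float, float]]):
    """Model one frame passing through the Cell Grid Core."""
    # Grid clear at the start of a frame: Z_min = +inf, Z_max = -inf.
    grid_bram: List[GridEntry] = [(math.inf, -math.inf)] * (GRID_WIDTH * GRID_LENGTH)
    points_bram: List[PointEntry] = []

    for x, y, z in points:  # point i is stored at Points BRAM address i
        idx = cell_index_calculator(x, y)
        if idx is None:
            # Out of bounds: is_valid=0, other fields are don't-care,
            # and the Grid BRAM is left untouched.
            points_bram.append((0, None, None))
            continue
        z_min, z_max = grid_bram[idx]                     # read
        grid_bram[idx] = (min(z_min, z), max(z_max, z))   # modify + write
        points_bram.append((1, z, idx))
    return grid_bram, points_bram
```

In hardware the two actions happen in the same cycle window; the sequential loop here only mirrors the data that ends up in each memory.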
After the Cell Grid Core has processed the entire frame, the Ground Segmentation Core starts reading the point data that was stored in the Points BRAM sequentially.
is_valid, p.z (the point's Z value) and p.cell_index.
is_valid flag first.
is_valid == 1: It proceeds with the normal FESTA classification logic (fetches Z_min/Z_max from Grid BRAM using the cell index, runs the comparisons, etc.) and outputs the resulting 0 or 1 flag.
is_valid == 0: It bypasses all the classification logic. It immediately outputs a default classification flag. The most logical and safe default for a point outside the known area is ALFA_LABEL_NO_GROUND (0).
Top-Level Controller signals the Ground Segmentation Core to begin.
Ground Segmentation Core starts reading from the Points BRAM sequentially, from address 0 to the last point. For each entry it reads (is_valid, point_z, cell_index),
cell_index to perform a read from the Grid BRAM to fetch the final, correct Z_min and Z_max values for that cell.
is_ground flag is sent to the AXI-Stream Master Interface.
AXI-Stream Master Interface streams the classification results out of the FPGA.
Top-Level Controller marks the processing of
Ground Segmentation Core.
The fundamental principle is that the output stream of flags is perfectly synchronized with the original input point cloud.
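This synchronization follows from the core walking the Points BRAM from address 0 upward and emitting exactly one flag per entry, valid or not. A minimal sketch of that pass is shown here. The actual FESTA comparison is not spelled out in this text, so `placeholder_rule` is a hypothetical stand-in and not the real algorithm; `GROUND = 1` is likewise an assumption, since only the 0 label is named.

```python
from typing import Callable, List, Optional, Sequence, Tuple

ALFA_LABEL_NO_GROUND = 0   # label named in the text
GROUND = 1                 # assumed value for the "ground" flag

PointEntry = Tuple[int, Optional[float], Optional[int]]  # (is_valid, point_z, cell_index)
Rule = Callable[[float, float, float], int]


def placeholder_rule(z: float, z_min: float, z_max: float) -> int:
    """Hypothetical stand-in for the FESTA comparisons (NOT the real rule).

    Treats a point as ground if it lies within 0.3 units of the cell minimum.
    """
    return GROUND if (z - z_min) <= 0.3 else ALFA_LABEL_NO_GROUND


def segment(points_bram: Sequence[PointEntry],
            grid_bram,
            rule: Rule = placeholder_rule) -> List[int]:
    """Emit one flag per Points BRAM entry, in address order."""
    flags: List[int] = []
    for is_valid, point_z, cell_index in points_bram:
        if not is_valid:
            # Bypass: no Grid BRAM read, default label for out-of-grid points.
            flags.append(ALFA_LABEL_NO_GROUND)
            continue
        z_min, z_max = grid_bram[cell_index]  # Grid BRAM read via port 'B'
        flags.append(rule(point_z, z_min, z_max))
    return flags
```

Because invalid points still produce a flag, the output length always equals the number of input points, which is what lets the host match flag i to point i.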
The AXI stream of 0s and 1s is essentially a tag or a label for each point. The downstream system's job is to correlate these tags with the original point data, which it has kept a copy of.
Here is the step-by-step process that happens outside the FESTA accelerator, typically on the host processor (like the ARM core in a Zynq SoC).
Before the point cloud frame is sent to the FPGA, the host processor makes sure it has a complete copy of it stored in main memory (DDR RAM). Let's call this OriginalPointCloud_FrameN.
The processor configures a DMA (Direct Memory Access) controller to perform two tasks:
OriginalPointCloud_FrameN from the DDR RAM and stream it to the FESTA accelerator's AXI-Stream Slave input.
FlagBuffer_FrameN.
Once the DMA transfer is complete, the processor has two arrays in memory:
OriginalPointCloud_FrameN: Contains the full (X, Y, Z, intensity, etc.) data for every point.
FlagBuffer_FrameN: Contains the corresponding ground/non-ground flag for every point.
Now, the software can perform the actual segmentation by simply iterating through both arrays simultaneously.
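Iterating through both arrays together could look like the sketch below. It assumes that a flag value of 1 means ground (the text only names 0 as the non-ground label) and that each point is an (x, y, z, intensity) tuple; `split_by_flag` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float, float]  # (x, y, z, intensity), assumed layout


def split_by_flag(original_points: Sequence[Point],
                  flag_buffer: Sequence[int]) -> Tuple[List[Point], List[Point]]:
    """Pair point i with flag i and split the frame into ground / non-ground."""
    if len(original_points) != len(flag_buffer):
        # The accelerator emits one flag per input point, so a length
        # mismatch means the two buffers belong to different frames.
        raise ValueError("point cloud and flag buffer lengths differ")

    ground: List[Point] = []
    non_ground: List[Point] = []
    for point, flag in zip(original_points, flag_buffer):
        if flag == 1:          # assumed: 1 = ground
            ground.append(point)
        else:                  # 0 = non-ground
            non_ground.append(point)
    return ground, non_ground
```

The length check is a cheap guard worth keeping on the host side, since the whole scheme relies on the two buffers being index-aligned.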
Size determined by the grid dimensions (e.g., 512x256 cells).
Z_min, Z_max) for the entire grid.
Cell Grid Core for the read-modify-write updates.
Ground Segmentation Core to fetch the Z_min and Z_max values for classification.
grid_width x grid_length (e.g., 512x256) x (Z_min_width + Z_max_width).
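To get a feel for the numbers, the two size formulas (this one and the Points BRAM formula given earlier) can be evaluated for the example dimensions. The 16-bit widths for Z values are assumptions chosen for illustration; the text does not state the fixed-point format. The cell index width is derived from the grid size.

```python
def index_width(num_entries: int) -> int:
    """Number of bits needed to address num_entries distinct locations."""
    return max(1, (num_entries - 1).bit_length())


def grid_bram_bits(grid_width: int, grid_length: int,
                   z_min_width: int, z_max_width: int) -> int:
    """grid_width x grid_length x (Z_min_width + Z_max_width)."""
    return grid_width * grid_length * (z_min_width + z_max_width)


def points_bram_bits(max_points_per_frame: int,
                     z_coord_width: int, cell_index_width: int) -> int:
    """max_points_per_frame x (Z_coord_width + cell_index_width)."""
    return max_points_per_frame * (z_coord_width + cell_index_width)


if __name__ == "__main__":
    Z_WIDTH = 16  # assumed width for Z_min, Z_max and point_z
    cells = 512 * 256
    idx_bits = index_width(cells)

    grid_bits = grid_bram_bits(512, 256, Z_WIDTH, Z_WIDTH)
    point_bits = points_bram_bits(70_000, Z_WIDTH, idx_bits)

    print(f"cell_index width: {idx_bits} bits")
    print(f"Grid BRAM:   {grid_bits} bits ({grid_bits // 8 // 1024} KiB)")
    print(f"Points BRAM: {point_bits} bits")
```

Under these assumed widths the Grid BRAM comes to 4,194,304 bits (512 KiB) and the Points BRAM to 2,310,000 bits. The Points BRAM formula as written does not count the is_valid flag; storing it would add one bit per entry.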