Code Structure
Overview
The code structure of RAD-Sim is summarized as follows:
rad-sim/
|- sim/
| |- dram/
| | |- DRAMsim3/
| | |- mem_controller.{cpp/hpp}
| |- noc/
| | |- booksim/
| | |- aximm_interface.hpp
| | |- axis_interface.hpp
| | |- aximm_master_adapter.{cpp/hpp}
| | |- aximm_slave_adapter.{cpp/hpp}
| | |- axis_master_adapter.{cpp/hpp}
| | |- axis_slave_adapter.{cpp/hpp}
| | |- noc_utils.{cpp/hpp}
| | |- sc_flit.{cpp/hpp}
| | |- radsim_noc.{cpp/hpp}
| |- design_context.{cpp/hpp}
| |- design_system.hpp
| |- design_top.hpp
| |- radsim_cluster.{cpp/hpp}
| |- radsim_config.{cpp/hpp}
| |- radsim_defines.hpp
| |- radsim_inter_rad.{cpp/hpp}
| |- radsim_module.{cpp/hpp}
| |- radsim_telemetry.{cpp/hpp}
| |- radsim_utils.{cpp/hpp}
| |- main.cpp
|- example-designs/
| |- mydesign/
| | |- modules/
| | |- config.yml
| | |- mydesign_driver.{cpp/hpp}
| | |- mydesign_system.{cpp/hpp}
| | |- mydesign_top.{cpp/hpp}
| | |- mydesign.clks
| | |- mydesign.place
|- test/
It consists of three main directories:
sim: contains all the RAD-Sim simulation infrastructure
example-designs: contains all application designs simulated in RAD-Sim
test: contains all the testing scripts
Simulator Infrastructure (sim)
This directory includes all the RAD-Sim simulation infrastructure and utilities:
The noc directory contains everything related to the NoC modeling:
  Booksim 2.0 NoC simulator source code.
  Definitions of the AXI memory mapped (AXI-MM) and streaming (AXI-S) interfaces ({aximm/axis}_interface.hpp).
  SystemC implementation of the AXI-MM and AXI-S NoC adapters ({aximm/axis}_{master/slave}_adapter.{cpp/hpp}). These adapters present AXI-MM/AXI-S interfaces for the Booksim NoC to the SystemC application modules. They also perform functions such as width adaptation and clock-domain crossing between the application modules and the NoC.
  Bit-level definitions of the NoC flit and packet (sc_flit.{cpp/hpp}).
  NoC-related utility functions (noc_utils.{cpp/hpp}).
  SystemC wrapper for the NoC (Booksim NoC + Adapters) to be instantiated in application designs (radsim_noc.{cpp/hpp}).
The dram directory contains everything related to external memory modeling in RAD-Sim:
  DRAMsim3 memory simulator source code.
  SystemC wrapper for DRAMsim3 that presents an AXI-MM interface and implements book-keeping functionality, to be instantiated in application designs (mem_controller.{cpp/hpp}).
The RADSimDesignContext class in design_context.{cpp/hpp} stores all the details of a RAD-Sim design, such as the NoCs and modules of the design, their clocks, the module NoC placement, and the connections between modules and NoC adapters. For each device in the RAD-Sim simulation, there is a variable of this class type (radsim_design) that stores this information so that it can be used from any part of the simulator.
The RADSimCluster class in radsim_cluster.{cpp/hpp} stores the details of the cluster of RADs in the RAD-Sim simulation. This is the top level of the simulation hierarchy. Single-RAD simulation is implemented as a cluster of one RAD.
The RADSimDesignSystem class in design_system.hpp is a generalized parent class used per design. It wraps around the device-under-test (DUT) and testbench. Each design in the example-designs directory has its own system class that should inherit from this class. This class has sc_module as its virtual parent class.
The RADSimDesignTop class in design_top.hpp is a parent class for the DUT (top) class used within any design. It creates a portal module which is used to interface with the inter-RAD network. This class has sc_module as its virtual parent class.
The RADSimConfig class in radsim_config.{cpp/hpp} stores all the RAD-Sim configuration parameters.
RAD-Sim constant definitions are in radsim_defines.hpp. This header file is automatically generated by the RAD-Sim configuration script (config.py).
The RADSimInterRad class in radsim_inter_rad.{cpp/hpp} implements a latency- and bandwidth-constrained network for communication between RADs.
The RADSimModule class in radsim_module.{cpp/hpp} implements an abstract class from which all RAD-Sim application modules are derived. This class stores information about each module in the design such as its name, its clock, and pointers to its AXI-MM/AXI-S ports and their data widths. Each module in the application design must implement the pure virtual function RegisterModuleInfo(), which adds the module's AXI-MM and AXI-S master/slave ports to the RADSimDesignContext class.
Logging and trace recording functions and classes are in radsim_telemetry.{cpp/hpp}.
  NoCTransactionTrace and NoCTransactionTelemetry are used for collecting NoC statistics.
  The SimLog class is for logging simulator messages.
  The SimTraceRecording class is for recording timestamps at any time during the simulation and dumping them as simulation traces at the end of the simulation.
Utility functions and struct definitions are in radsim_utils.{cpp/hpp}.
The main.cpp file declares all the global variables, instantiates the system to be simulated, and starts the SystemC simulation.
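The abstract-module pattern described for RADSimModule can be illustrated with a minimal sketch in plain C++. It does not use SystemC or any RAD-Sim header: ToyDesignContext, ToyModule, ToyAdder and the port names are hypothetical stand-ins invented for illustration, and only the idea of a pure virtual RegisterModuleInfo() that reports a module's ports to a shared context is taken from the text above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a design context: it only records port names.
struct ToyDesignContext {
  std::vector<std::string> axis_master_ports;
  std::vector<std::string> axis_slave_ports;
};

// Hypothetical stand-in for an abstract module base class.
class ToyModule {
public:
  ToyModule(const std::string &name, ToyDesignContext *context)
      : name_(name), context_(context) {
    assert(context_ != nullptr);  // every module needs a context to register with
  }
  virtual ~ToyModule() = default;

  // Each concrete module must report its NoC-facing ports to the context.
  virtual void RegisterModuleInfo() = 0;

  const std::string &name() const { return name_; }

protected:
  std::string name_;
  ToyDesignContext *context_;
};

// Hypothetical concrete module with one AXI-S input and one AXI-S output.
class ToyAdder : public ToyModule {
public:
  using ToyModule::ToyModule;  // reuse the base-class constructor

  void RegisterModuleInfo() override {
    context_->axis_slave_ports.push_back(name_ + ".axis_in");
    context_->axis_master_ports.push_back(name_ + ".axis_out");
  }
};
```

Constructing a ToyAdder named "adder0" and calling RegisterModuleInfo() leaves one slave port ("adder0.axis_in") and one master port ("adder0.axis_out") recorded in the context. A real RAD-Sim module would derive from RADSimModule instead.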
Application Designs (example-designs)
This directory includes the user application designs that will be simulated in RAD-Sim. Each application design has its
own sub-directory (<design_name>/) which must contain the following files/directories.
Modules Directory (modules/)
This directory includes the SystemC definitions of all the modules in the design. All of these modules have to be derived
from the RADSimModule abstract class. If a module is to be attached to the NoC, it must have AXI-MM and/or AXI-S
ports, which are defined in the sim/noc/{aximm|axis}_interface.hpp files.
Design Top-level (<design_name>_top.{cpp/hpp})
These files define a class derived from RADSimDesignTop. It is a SystemC module (sc_module) that instantiates all the modules in the design and connects any
non-NoC signals between the modules in its constructor using conventional SystemC syntax. At the end of its constructor,
it must include the following lines of code to build the design context, create the system NoCs, and automatically
connect the ports of NoC-attached modules to the NoC based on the NoC placement file:
// mydesign_top Constructor
mydesign_top::mydesign_top(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimDesignTop(radsim_design) {
    this->radsim_design = radsim_design; // to use within design
    // Module Instantiations and Connections Start Here
    // ...
    // Module Instantiations and Connections End Here
    radsim_design->BuildDesignContext("mydesign.place", "mydesign.clks");
    radsim_design->CreateSystemNoCs(rst);
    radsim_design->ConnectModulesToNoC();
}
The design top-level SystemC module will typically have input/output ports (sc_in/sc_out) which will be used to
communicate with the design testbench/driver.
Design Testbench (<design_name>_driver.{cpp/hpp})
These files define a SystemC module (sc_module) that acts as the testbench/driver of the design top-level module.
It has two SystemC threads (SC_CTHREAD): a source thread that sends inputs to the design top-level input ports
and a sink thread that listens on the design top-level output ports to receive outputs. A common scenario is that
this driver module performs the following steps:
Parse test inputs and golden outputs from files.
Use the source thread to send inputs to the design top-level when ready.
Use the sink thread to listen for outputs from the design top-level when available.
Compare received outputs to golden outputs to verify functionality.
Raise the per-RAD done flag when all testbench outputs are received. When all testbenches (for all RADs in the simulation) raise their done flags, the simulation stops.
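The last two steps (checking outputs against golden values and stopping once every RAD has raised its done flag) can be sketched in plain C++ without SystemC. DoneFlags and CountMismatches are hypothetical helpers written for this illustration and are not part of RAD-Sim. A real driver would do this work inside its sink SC_CTHREAD.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracker holding one done flag per RAD in the simulation.
class DoneFlags {
public:
  explicit DoneFlags(std::size_t num_rads) : done_(num_rads, false) {}

  // Called by the testbench of RAD `rad_id` once all of its outputs arrived.
  void Raise(std::size_t rad_id) {
    assert(rad_id < done_.size());
    done_[rad_id] = true;
  }

  // The simulation may stop only when every RAD has raised its flag.
  bool AllDone() const {
    return std::all_of(done_.begin(), done_.end(), [](bool d) { return d; });
  }

private:
  std::vector<bool> done_;
};

// Counts positions where received and golden outputs differ.
// Missing or extra outputs are counted as mismatches as well.
std::size_t CountMismatches(const std::vector<int> &received,
                            const std::vector<int> &golden) {
  const std::size_t common = std::min(received.size(), golden.size());
  std::size_t mismatches = std::max(received.size(), golden.size()) - common;
  for (std::size_t i = 0; i < common; ++i) {
    if (received[i] != golden[i]) ++mismatches;
  }
  return mismatches;
}
```

With two RADs, AllDone() stays false after only RAD 0 calls Raise(0) and becomes true once RAD 1 calls Raise(1). A mismatch count of zero corresponds to a passing functional check.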
Design System (<design_name>_system.{cpp/hpp})
These files define a class that inherits from the RADSimDesignSystem class. It is a simple SystemC module (sc_module) that instantiates and connects the design top-level and simulation
driver modules. This is the single module that will be instantiated inside the sc_main() function in the
main.cpp file.
Clock Settings File (<design_name>.clks)
This file defines the operating clock frequency of the module’s NoC adapters and the module itself for each of the modules instantiated in the design. Each line of this file should have a module name followed by two integers (all space-separated) as shown in the example below.
module_a 0 1
module_b 0 0
The two integers in each line are indices into the NoC adapter and design clock period lists in the
design’s config.yml file: the first selects the clock period of the module’s NoC adapters and the second selects the clock period of the module itself. For example, if the config.yml file had the following values, the NoC
adapters of both modules would operate at a 1.25 ns clock period (800 MHz), while module_a would have a clock period of
2.5 ns (400 MHz) and module_b a clock period of 5.0 ns (200 MHz).
noc_adapters:
  clk_period: [1.25, 2.5]
design:
  name: 'mydesign'
  noc_placement: ['mydesign.place']
  clk_periods: [5.0, 2.5]
For designs containing multiple RADs, RAD-Sim adds a portal module to the design, which allows for communication between RADs. The clock configuration for the portal module should be added to the clock settings file.
Note
RAD-Sim design modules so far do not support more than one clock and all their adapters are restricted to use the same clock as well (i.e. a single module cannot connect to multiple NoC adapters running at different clock speeds).
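The index lookup described in this section can be sketched as a small standalone C++ helper. This is not RAD-Sim's own parser: ModuleClocks, ResolveClocks and PeriodNsToMHz are hypothetical names, and the two period lists are passed in directly rather than read from config.yml.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Resolved clock periods for one line of a .clks file (hypothetical type).
struct ModuleClocks {
  std::string module;
  double adapter_period_ns;  // clock period of the module's NoC adapters
  double module_period_ns;   // clock period of the module itself
};

// Parses "<module> <adapter_idx> <design_idx>" and resolves both indices
// against the clock period lists taken from config.yml.
ModuleClocks ResolveClocks(const std::string &line,
                           const std::vector<double> &adapter_periods,
                           const std::vector<double> &design_periods) {
  std::istringstream in(line);
  std::string module;
  std::size_t adapter_idx = 0;
  std::size_t design_idx = 0;
  if (!(in >> module >> adapter_idx >> design_idx)) {
    throw std::invalid_argument("malformed clock settings line: " + line);
  }
  // at() throws std::out_of_range if an index is outside its list.
  return {module, adapter_periods.at(adapter_idx),
          design_periods.at(design_idx)};
}

// Converts a clock period in nanoseconds to a frequency in MHz.
double PeriodNsToMHz(double period_ns) {
  assert(period_ns > 0.0);
  return 1000.0 / period_ns;
}
```

For the example values, resolving "module_a 0 1" against adapter periods {1.25, 2.5} and design periods {5.0, 2.5} gives 1.25 ns for the adapters and 2.5 ns for the module, which PeriodNsToMHz converts to 800 MHz and 400 MHz.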
NoC Placement File (<design_name>.place)
This file defines the placement of the design modules relative to the NoC, that is, which NoC router each design module port connects to. Each line has a port name followed by the ID of the NoC it is connected to (in case multiple NoCs exist in the system), the ID of the node it is attached to, and the type of the interface (axis for AXI-S or aximm for AXI-MM), all space-separated, as shown in the example below.
module_a 0 0 axis
module_b.port_a 0 3 aximm
module_b.port_b 0 7 aximm
For a mesh NoC, Booksim assumes a row-major ordering of the NoC router IDs, in which the top-left router has ID \(0\) and the bottom-right router has ID \(N^2-1\) for an \(N \times N\) mesh. For modules whose interfaces are all AXI-S, it is also possible to write only the module name; all of the module's ports are then connected to the same NoC router with arbitration logic between them.
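The row-major numbering can be made concrete with a short C++ sketch that converts between a router ID and its (row, column) position in an \(N \times N\) mesh. MeshCoord, RouterToCoord and CoordToRouter are hypothetical helper names, and the sketch assumes row 0 is the top row and column 0 is the leftmost column, matching the top-left router having ID 0.

```cpp
#include <cassert>

// Position of a router in the mesh (hypothetical helper type).
struct MeshCoord {
  int row;  // 0 is the top row
  int col;  // 0 is the leftmost column
};

// Maps a row-major router ID to its position in an n x n mesh.
MeshCoord RouterToCoord(int router_id, int n) {
  assert(n > 0);
  assert(router_id >= 0 && router_id < n * n);
  return {router_id / n, router_id % n};
}

// Inverse mapping: position in an n x n mesh back to the router ID.
int CoordToRouter(MeshCoord coord, int n) {
  assert(n > 0);
  assert(coord.row >= 0 && coord.row < n && coord.col >= 0 && coord.col < n);
  return coord.row * n + coord.col;
}
```

In a 4 x 4 mesh, node 3 maps to row 0, column 3 (the top-right corner), node 7 sits directly below it at row 1, column 3, and node 15 is the bottom-right router at row 3, column 3.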
For designs containing multiple RADs, RAD-Sim adds a portal module to the design, which allows for communication between RADs. The NoC placement of the portal module should be added to the placement file, with AXI-S as the interface type. Verify that the NoC size in the design's config.yml file is large enough to include the portal module. Any unused NoC node ID can be selected.
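The placement line format from this section can be sketched as a standalone C++ parser. This is an illustration only and not RAD-Sim's own parsing code: Placement, ParsePlacementLine and NodeFitsMesh are hypothetical names. The sketch enforces two rules stated above: the interface type must be axis or aximm, and a line naming only the module is accepted only for AXI-S.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// One parsed line of a .place file (hypothetical type).
struct Placement {
  std::string module;
  std::string port;       // empty when the line names only the module
  int noc_id;
  int node_id;
  std::string interface;  // "axis" or "aximm"
};

// Parses "<module>[.<port>] <noc_id> <node_id> <axis|aximm>".
Placement ParsePlacementLine(const std::string &line) {
  std::istringstream in(line);
  std::string name;
  std::string interface;
  int noc_id = 0;
  int node_id = 0;
  if (!(in >> name >> noc_id >> node_id >> interface)) {
    throw std::invalid_argument("malformed placement line: " + line);
  }
  if (interface != "axis" && interface != "aximm") {
    throw std::invalid_argument("unknown interface type: " + interface);
  }
  const std::size_t dot = name.find('.');
  Placement p;
  p.module = name.substr(0, dot);  // whole name when there is no '.'
  p.port = (dot == std::string::npos) ? "" : name.substr(dot + 1);
  p.noc_id = noc_id;
  p.node_id = node_id;
  p.interface = interface;
  // Module-only lines are allowed only when all interfaces are AXI-S.
  if (p.module.empty() || (p.port.empty() && interface != "axis")) {
    throw std::invalid_argument("invalid port name: " + name);
  }
  return p;
}

// Checks that the node ID exists in a dim_x by dim_y NoC.
bool NodeFitsMesh(const Placement &p, int dim_x, int dim_y) {
  assert(dim_x > 0 && dim_y > 0);
  return p.node_id >= 0 && p.node_id < dim_x * dim_y;
}
```

Parsing "module_b.port_b 0 7 aximm" yields module module_b, port port_b, NoC 0 and node 7. NodeFitsMesh can be used as a quick check that a chosen node ID, for example the one picked for the portal module, exists in the configured NoC.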
CMakeLists File (CMakeLists.txt)
This is a conventional CMakeLists file that lists all your modules, top, driver, and system header and source files
for CMake to compile correctly when you build RAD-Sim for the application design. For a new application design, it is
recommended that you copy the CMakeLists.txt file from one of the provided example design directories and edit the
hdrfiles and srcfiles variables to include all your design .hpp and .cpp files.
RAD-Sim Configuration File (config.yml)
This YAML file configures all the RAD-Sim parameters for the simulation of the application design under 4 main tags:
noc, noc_adapters, config <configname>, and cluster. The noc and noc_adapters parameters are shared across all RADs.
There may be multiple config <configname> sections, each describing a RAD configuration that can be applied to a single or multiple devices in the cluster.
The cluster tag describes the cluster of RADs, including the number of RADs and their configurations.
This file should be located in the same directory as the config.py script. For a new design, you should copy
the config.yml file from one of the provided example design directories and make modifications for your use case.
Note that the parameters within a config <configname> subsection can be applied to a single RAD or shared among multiple RADs.
An example configuration file is shown below, followed by an explanation for each configuration parameter.
noc:
  type: ['2d']
  num_nocs: 1
  clk_period: [1.0]
  payload_width: [166]
  topology: ['mesh']
  dim_x: [4]
  dim_y: [4]
  routing_func: ['dim_order']
  vcs: [5]
  vc_buffer_size: [8]
  output_buffer_size: [8]
  num_packet_types: [5]
  router_uarch: ['iq']
  vc_allocator: ['islip']
  sw_allocator: ['islip']
  credit_delay: [1]
  routing_delay: [1]
  vc_alloc_delay: [1]
  sw_alloc_delay: [1]
noc_adapters:
  clk_period: [1.25]
  fifo_size: [16]
  obuff_size: [2]
  in_arbiter: ['fixed_rr']
  out_arbiter: ['priority_rr']
  vc_mapping: ['direct']
config rad1:
  dram:
    num_controllers: 4
    clk_periods: [3.32, 3.32, 2.0, 2.0]
    queue_sizes: [64, 64, 64, 64]
    config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128']
  design:
    name: 'dlrm'
    noc_placement: ['dlrm.place']
    clk_periods: [5.0, 2.0, 3.32, 1.5]
config anotherconfig:
  dram:
    num_controllers: 4
    clk_periods: [3.32, 3.32, 2.0, 2.0]
    queue_sizes: [64, 64, 64, 64]
    config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128']
  design:
    name: 'dlrm'
    noc_placement: ['dlrm.place']
    clk_periods: [5.0, 2.0, 3.32, 1.5]
cluster:
  sim_driver_period: 5.0
  telemetry_log_verbosity: 2
  telemetry_traces: ['Embedding LU', 'Mem0', 'Mem1', 'Mem2', 'Mem3', 'Feature Inter.', 'MVM first', 'MVM last']
  num_rads: 2
  cluster_configs: ['rad1', 'anotherconfig'] # use config 'rad1' for the first RAD and config 'anotherconfig' for the second RAD under simulation
  cluster_topology: 'all-to-all' # this parameter is not currently used
  inter_rad_latency: 2100 # in nanoseconds
  inter_rad_bw: 102.4 # in bits per nanosecond
  inter_rad_fifo_num_slots: 1000
NoC Configuration Parameters
and
NoC Adapters Configuration Parameters
Configuration Parameters
Config subsection: DRAM Configuration Parameters
num_controllers is the number of DRAM controllers
clk_periods are the clock periods per DRAM
config_files are the names of the DRAMsim3 configuration files, one per DRAM, that specify each DRAM's memory configuration. For a complete list of configuration options, check the rad-flow/rad-sim/sim/dram/DRAMsim3/configs/ directory.
Config subsection: Design Configuration Parameters
name is the name of the design being run in this configuration
noc_placement is the NoC placement file to use
clk_periods is a list of all clock periods used in this design
Cluster Configuration Parameters
sim_driver_period is the max clock period in nanoseconds for the entire simulation. Simulation cycle counts are reported based upon this.
telemetry_log_verbosity specifies how much detail to use for the telemetry logging
telemetry_traces specifies which simulation traces to use for telemetry
num_rads is the number of RADs being simulated
cluster_configs is a list of which configuration to use per-RAD. These names must match those in the config <configname> tagged sections.
cluster_topology is not currently used but is meant to specify the connection of RADs within the cluster. Currently only all-to-all is supported wherein each RAD can send to and receive data from any other RAD over the inter-RAD network directly.
inter_rad_latency is the latency in nanoseconds for data transfer between RADs over the inter-RAD network
inter_rad_bw is the bandwidth in bits per nanosecond for data transfer between RADs over the inter-RAD network
inter_rad_fifo_num_slots is the number of FIFO slots available for the buffering within the inter-RAD network
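To give a feel for the inter-RAD latency and bandwidth values, the sketch below estimates how long a transfer takes under a simple first-order model: a fixed latency plus the payload size divided by the bandwidth. This model is an assumption made for illustration and is not necessarily how RADSimInterRad computes delays. InterRadLink, TransferTimeNs and DriverCycles are hypothetical names, and rounding up to whole driver cycles is likewise an assumption.

```cpp
#include <cassert>
#include <cmath>

// Link parameters as given in the cluster section (hypothetical type).
struct InterRadLink {
  double latency_ns;      // fixed latency in nanoseconds
  double bw_bits_per_ns;  // bandwidth in bits per nanosecond
};

// First-order estimate: fixed latency plus serialization time.
double TransferTimeNs(const InterRadLink &link, double payload_bits) {
  assert(link.bw_bits_per_ns > 0.0);
  assert(payload_bits >= 0.0);
  return link.latency_ns + payload_bits / link.bw_bits_per_ns;
}

// Expresses a duration as a whole number of driver clock cycles,
// rounding up (the rounding choice is an assumption of this sketch).
long long DriverCycles(double time_ns, double sim_driver_period_ns) {
  assert(sim_driver_period_ns > 0.0);
  return static_cast<long long>(std::ceil(time_ns / sim_driver_period_ns));
}
```

With the example values (2100 ns latency, 102.4 bits per nanosecond), a 1024-bit payload takes about 2100 + 10 = 2110 ns under this model, which is 422 cycles of a 5.0 ns driver clock. Since one bit per nanosecond equals one gigabit per second, 102.4 bits per nanosecond corresponds to 102.4 Gb/s.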