TCE provides easy ``hooks'' to attach cycle-count accurate TTA simulation models to SystemC system level simulations. The hooks allow instantiating TTA cores running a fixed program as SystemC modules. In order to simulate I/O from TTA cores, or just to model the functionality of a function unit (FU) in more detail, the fast (but not necessarily bit accurate) default pipeline simulation models of the FUs can be overridden with SystemC models that are as accurate as the system level simulation requires.
The TCE SystemC integration layer is implemented in ``tce_systemc.hh'' which should be included in your simulation code to get access to the TTA simulation hooks. In addition, to link the simulation binary successfully, the TCE libraries should be linked in via ``-ltce'' or a similar switch.
For a full example of a system level simulation with multiple TTA cores, see Appendix B.
New TTA cores are added to the simulation by instantiating objects of class TTACore. The constructor takes three parameters: the name for the TTA core SystemC module, the file name for the architecture description, and the program to be loaded to the TTA.
For example:
    ...
    #include <tce_systemc.hh>
    ...
    int sc_main(int argc, char* argv[]) {
        ...
        TTACore tta("tta_core_name", "processor.adf", "program.tpef");
        tta.clock(clk.signal());
        tta.global_lock(glock);
        ...
    }
As you can see, the TTA core model presents only two external ports that must be connected in your simulation model: the clock and the global lock. The global lock freezes the whole core while it is high; it can be used, for example, to stall the core during operations with dynamic latencies.
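The effect of the global lock on the cycle count can be illustrated with a small standalone sketch. It does not use TCE or SystemC: ToyCore, clockEdge and runWithStall are hypothetical names invented for this illustration, and the model only assumes that a locked core sees clock edges without advancing its state.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a core with a global lock input. This is NOT the TCE
// TTACore class; every name in this sketch is hypothetical.
struct ToyCore {
    std::uint64_t cycles = 0;  // clock edges seen so far
    std::uint64_t pc = 0;      // instructions issued so far

    // Called once per rising clock edge.
    void clockEdge(bool globalLock) {
        ++cycles;
        if (globalLock) {
            return;  // frozen: the core state does not advance
        }
        ++pc;
    }
};

// Runs a fresh core for `total` clock edges and holds the lock high during
// the half-open cycle range [lockFrom, lockTo).
inline ToyCore runWithStall(unsigned total, unsigned lockFrom, unsigned lockTo) {
    assert(lockFrom <= lockTo);
    ToyCore core;
    for (unsigned c = 0; c < total; ++c) {
        core.clockEdge(c >= lockFrom && c < lockTo);
    }
    return core;
}
```

Under these assumptions, runWithStall(10, 3, 6) sees ten clock edges but issues only seven instructions, because the lock is high for three of the cycles.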
The default TTA simulation model is cycle-count accurate but does not simulate details that are not visible to the programmer. It is an architecture simulator optimized for simulation speed. However, if the core is to be connected to other hardware blocks in the system, its input and output behavior must be modeled accurately.
The TCE SystemC API provides a way to override functional unit simulation models with more detailed SystemC modules. This is done by describing one or more ``operation simulation models'' with the macro TCE_SC_OPERATION_SIMULATOR and defining more accurate simulation behavior for operation pipelines.
In the following, a simulation model to replace the default load-store simulation model is described. This model simulates memory mapped I/O by redirecting accesses outside the data memory address space of the internal TTA memory to I/O registers.
    TCE_SC_OPERATION_SIMULATOR(LSUModel) {
        sc_in<int> reg_value_in;
        sc_out<int> reg_value_out;
        sc_out<bool> reg_value_update;
        TCE_SC_OPERATION_SIMULATOR_CTOR(LSUModel) {}
        TCE_SC_SIMULATE_CYCLE_START {
            reg_value_update = 0;
        }
        TCE_SC_SIMULATE_STAGE {
            unsigned address = TCE_SC_UINT(1);
            // overwrite only the stage 0 simulation behavior of loads and
            // stores to out of data memory addresses
            if (address <= LAST_DMEM_ADDR || TCE_SC_OPSTAGE > 0) {
                return false;
            }
            // do not check for the address, assume all out of data memory
            // addresses update the shared register value
            if (TCE_SC_OPERATION.writesMemory()) {
                int value = TCE_SC_INT(2);
                reg_value_out.write(value);
                reg_value_update.write(1);
            } else { // a load, the operand 2 is the data output
                int value = reg_value_in.read();
                TCE_SC_OUTPUT(2) = value;
            }
            return true;
        }
    };
In the above example, TCE_SC_SIMULATE_CYCLE_START is used to describe behavior that is executed once per simulated TTA cycle, before any of the operation stages are simulated. In this case, the update signal of the I/O register is reset to 0 to prevent garbage from being written to the register when no write operation is triggered in that cycle.
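Why the update signal has to be cleared at the start of every cycle can be shown with a minimal standalone sketch. Plain C++ values stand in for the SystemC ports, and IoSignals, SharedRegister and runCycles are hypothetical names that are not part of the TCE API. The sketch assumes the register on the other side samples the update signal once per cycle.

```cpp
#include <cassert>

// Plain C++ stand-ins for reg_value_out and reg_value_update (hypothetical).
struct IoSignals {
    int regValueOut = 0;
    bool regValueUpdate = false;
};

// Hypothetical I/O register that latches the value whenever the update
// signal is high when it samples the signals.
struct SharedRegister {
    int value = 0;
    int writes = 0;  // number of cycles in which the register was written

    void sample(const IoSignals& s) {
        if (s.regValueUpdate) {
            value = s.regValueOut;
            ++writes;
        }
    }
};

// Simulates `cycles` cycles with a single store of `data` in cycle
// `storeCycle`. With clearStrobe == false the cycle-start reset is skipped.
inline SharedRegister runCycles(int cycles, int storeCycle, int data,
                                bool clearStrobe) {
    assert(cycles >= 0);
    IoSignals sig;
    SharedRegister reg;
    for (int c = 0; c < cycles; ++c) {
        if (clearStrobe) {
            sig.regValueUpdate = false;  // the cycle-start behavior
        }
        if (c == storeCycle) {  // the only store in the program
            sig.regValueOut = data;
            sig.regValueUpdate = true;
        }
        reg.sample(sig);  // the register sees the signals once per cycle
    }
    return reg;
}
```

With the reset in place, one store produces exactly one register write. Without it, the update signal stays high and the register is rewritten in every later cycle, which is harmless here only because the data output happens to stay unchanged.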
TCE_SC_SIMULATE_STAGE is used to define the parts of the operation pipeline to override. The code overrides the default LSU operation stage 0 if the address is greater than LAST_DMEM_ADDR, which in this simulation holds the last address in the TTA's local data memory. Otherwise, it falls back to the default simulation model by returning false from the function. The default simulation behavior accesses the TTA local memory simulated by the TTACore simulation model, as in a standalone TTA simulation. Returning true signals that the simulation behavior was overridden and that TTACore should not produce the default behavior.
The actual simulation behavior code checks whether the executed operation is a memory write. If it is, the code stores the written value to the shared register and enables its update signal. Otherwise the operation must be a read, as this is a load-store unit with only memory accessing operations; in that case the code reads the shared register value and places it in the output queue of the functional unit.
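The override-or-fall-back pattern described above can be sketched in plain C++, without TCE or SystemC. ToyLsu, overrideStage, defaultStage, access, storeThenLoad and kLastDmemAddr are hypothetical names chosen for this illustration, the memory size is an arbitrary assumption, and addresses are treated as word indices for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical last address of the local data memory (assumed value).
constexpr std::uint32_t kLastDmemAddr = 0x3FF;

// Toy load-store unit; this is NOT the TCE LSU model.
struct ToyLsu {
    std::vector<int> dmem = std::vector<int>(kLastDmemAddr + 1, 0);
    int ioRegister = 0;  // the shared memory mapped I/O register

    // Override hook: returns true only if it handled the access itself.
    bool overrideStage(std::uint32_t address, unsigned stage, bool isStore,
                       int& data) {
        if (address <= kLastDmemAddr || stage > 0) {
            return false;  // let the default model handle it
        }
        if (isStore) {
            ioRegister = data;  // store: update the I/O register
        } else {
            data = ioRegister;  // load: return the I/O register value
        }
        return true;
    }

    // Default behavior: access the local data memory.
    void defaultStage(std::uint32_t address, bool isStore, int& data) {
        assert(address <= kLastDmemAddr);
        if (isStore) {
            dmem[address] = data;
        } else {
            data = dmem[address];
        }
    }

    // Simulates stage 0 of one access: try the override first and use the
    // default behavior only when the override declines.
    void access(std::uint32_t address, bool isStore, int& data) {
        if (!overrideStage(address, 0, isStore, data)) {
            defaultStage(address, isStore, data);
        }
    }
};

// Helper: stores `value` to `address` and loads it back.
inline int storeThenLoad(ToyLsu& lsu, std::uint32_t address, int value) {
    lsu.access(address, true, value);
    int loaded = 0;
    lsu.access(address, false, loaded);
    return loaded;
}
```

In this sketch a store to address 0x10 ends up in the local memory and leaves the I/O register untouched, while a store to address 0x1000 updates only the I/O register.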
In more detail: TCE_SC_OUTPUT(2) = value instructs the simulator to write the given value to the functional unit port bound to operand 2 of the executed operation (in this case, operand 2 of a load operation is the first result operand). This follows the convention of OSAL operation behavior models (see Section 4.3.5 for further details). Similarly, TCE_SC_UINT(1) and TCE_SC_INT(2) are used to read the values written to operands 1 and 2 of the operation as unsigned and signed integers, respectively. For the basic load/store operations, operand 1 is the memory address and, for stores, operand 2 is the value to write.
Finally, the simulation model is instantiated and the original LSU simulation model of TTACore is replaced with the newly defined one:
    ...
    LSUModel lsu("LSU");
    tta.setOperationSimulator("LSU", lsu);
    ...
Pekka Jääskeläinen 2018-03-12