Automatic Design Space Explorer automates the process of searching for target processor configurations with favourable cost/performance characteristics for a given set of applications by evaluating hundreds or even thousands of processor configurations.
Input: ADF (a starting point architecture), TPEF, HDB
Output: ExpResDB (Section 2.2.8)
Applications are given to the Explorer as directories that contain the application-specific files. Below is a description of all the possible files inside an application directory.
File Name | Description |
---|---|
program.bc | The program bytecode (produced using `tcecc --emit-llvm`). |
description.txt | The application description. |
simulate.ttasim | TTASIM simulation script, piped to TTASIM to produce the ttasim.out file. If no such file is given, the simulation is started with the "until 0" command. |
correct_simulation_output | Correct standard output of the simulation used in verifying. If you use this verification method, you need to add verification printouts to your code or the simulation script and produce this file from a successful run. |
max_runtime | The application's maximum runtime (as an ASCII number, in nanoseconds). |
setup.sh | Simulation setup script, if something needs to be done before simulation. |
verify.sh | Simulation verification script for additional verification of the simulation; returns 0 if OK. If missing, only correct_simulation_output is used in verification. |
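As a sketch, a minimal simulate.ttasim could run the program to completion and print the executed cycle count. TTASIM scripts are interpreted by a Tcl shell; the `info proc cycles` command used below is an assumption about the simulator's Tcl interface, so check the ttasim documentation of your TCE version:

```
# Run until the program exits (same as the default "until 0").
until 0
# Print the executed cycle count (assumed TCE ttasim Tcl command).
puts [info proc cycles]
quit
```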
Below is an example of the file structure of the HelloWorld application. As the max_runtime file is missing, the application is expected to have no maximum runtime requirement.
HelloWorld/program.bc
HelloWorld/correct_simulation_output
HelloWorld/description.txt
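To illustrate the verification hook, here is a minimal sketch of a verify.sh together with stand-in input files. It assumes verification simply means the simulator output matches the reference byte for byte; the file contents below are made up for the example:

```shell
# Work in a scratch directory so no real application files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in files for a real simulation run (contents are assumptions).
printf 'Hello World\n' > correct_simulation_output
printf 'Hello World\n' > ttasim.out

# A hypothetical verify.sh: return 0 (OK) if the simulation output
# matches the reference exactly, non-zero otherwise.
cat > verify.sh <<'EOF'
#!/bin/sh
diff -q ttasim.out correct_simulation_output > /dev/null
EOF
chmod +x verify.sh

./verify.sh && echo "verification OK"
```

Because the script's exit status is what the Explorer checks, any comparison logic (tolerances, selected output lines) can be substituted for the plain diff.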
The exploration result database <output_dsdb> is always required. The database can be queried, applications can be added to and removed from it, and the explored configurations in the database can be written out as files for further examination.
Please refer to explore -h for a full listing of the possible options.
Depending on the exploration plugin, exploring stores the resulting machine configurations into the exploration result database (DSDB). The best results from the exploration run are given at the end of the exploration:
explore -e RemoveUnconnectedComponents -a data/FFTTest --hdb=data/initial.hdb data/test.dsdb
Best result configurations: 1

Exploration plugins may also estimate the costs of configurations with the available applications. If there are estimation results for the configurations, those can be queried with the --conf_summary option by giving the ordering of the results.
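A possible invocation might look like the following; the ordering key `I` (order by configuration ID) is an assumption for illustration, so check `explore -h` for the keys your version accepts:

```
explore --conf_summary=I database.dsdb
```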
The Explorer plugins explained in the sections below can be listed with the command:
explore -g
and their parameters with the command:
explore -p <plugin name>
These commands can help if, for some reason, this documentation is not up-to-date.
Example:
explore -v -e ConnectionSweeper -u cc_worsening_threshold=10 -s 1 database.dsdb
This reduces the connections in the IC network, starting from configuration number 1, until the cycle count worsens by more than 10%. This algorithm might take quite a while to finish, thus the verbose switch is recommended to print the progress and possibly to pick up interesting configurations during the execution of the algorithm.
The explore tool has a Pareto set finder which can output the interesting configurations from the DSDB after the IC exploration. The Pareto set can be printed with:
explore --pareto_set C database.dsdb
This prints out the Pareto-efficient configurations using the number of connections in the architecture and the execution cycle count as the quality measures. For visualizing the Pareto set you can use the pareto_vis script (installed with TCE) to plot the configurations:
explore --pareto_set C database.dsdb | pareto_vis
SimpleICOptimizer is an explorer plugin that optimizes the interconnection network of the given configuration by removing the connections that are not used in the parallel program.
This functionality is so useful, especially when generating ASIPs for FPGAs, that there is a shortcut script for invoking the plugin.
Usage:
minimize-ic unoptimized.adf program.tpef target-ic-optimized.adf
However, if you want more customized execution, you should read on.
Parameters that can be passed to the SimpleICOptimizer are:
Param Name | Default Value | Description |
---|---|---|
tpef | no default value | name of the scheduled program file |
add_only | false | Boolean value. If set to true, the connections of the given configuration won't be emptied; only new ones may be added. |
evaluate | true | Boolean value. If true, the resulting configuration is evaluated. |
If you pass a scheduled tpef to the plugin, it tries to optimize the configuration for running the given program. If multiple tpefs are given, the first one will be used and the others discarded. If tpef is not given, the plugin tries to schedule the sequential program(s) from the application path(s) defined in the DSDB and uses them in the optimization.
Using the plugin requires the user to define the configuration to optimize. This is done by giving the -s <configuration_ID> option to the Explorer.
Assume there are two configurations in database.dsdb and an application directory path app/. You can optimize the first configuration with:
explore -e SimpleICOptimizer -s 1 database.dsdb
If the optimization was successful, the Explorer should output:
Best result configuration: 3

The add_only option can be used, for example, if you have an application that isn't included in the application paths defined in database.dsdb but you still want to run it with the same processor configuration. First, export the optimized configuration (which has ID 3 in this case):
explore -w 3 database.dsdb
Next schedule the program:
schedule -t 3.adf -o app_dir2/app2.scheduled.tpef app_dir2/app2.seq
And then run explorer:
explore -e SimpleICOptimizer -s 3 -u add_only=true -u tpef=app_dir2/app2.scheduled.tpef database.dsdb
The plugin now uses the optimized configuration created earlier and adds the connections needed to run the other program. If the plugin finds a new configuration, it will be added to the database; otherwise the existing configuration was already optimal. Because the plugin does not remove existing connections, the new machine configuration is able to run both programs.
You can pass a parameter to the RemoveUnconnectedComponents plugin:
Param Name | Default Value | Description |
---|---|---|
allow_remove | false | Allows the removal of unconnected ports and FUs |
When using the plugin, you must define the configuration from which you wish the plugin to remove unconnected components. This is done by passing -s <configuration_ID> to the Explorer.
If you do not allow removal, the plugin will connect the unconnected ports to some sockets. This can be done with:
explore -e RemoveUnconnectedComponents -s 3 database.dsdb
or
explore -e RemoveUnconnectedComponents -s 3 -u allow_remove=false database.dsdb
if you wish to emphasise that you do not want to remove components. This will reconnect the unconnected ports of configuration 3 in database.dsdb.
And if you want to remove the unconnected components:
explore -e RemoveUnconnectedComponents -s 3 -u allow_remove=true database.dsdb
Parameters that can be passed to the GrowMachine are:
Param Name | Default Value | Description |
---|---|---|
superiority | 2 | Percentage value of how much faster schedules are required before the cycle count optimization is stopped. |
Using the plugin requires the user to define the configuration to optimize. This is done by giving the -s <configuration_ID> option to the Explorer.
Example of usage:
explore -e GrowMachine -s 1 database.dsdb
Parameters that can be passed to the ImmediateGenerator are:
Param Name | Default Value | Description |
---|---|---|
print | false | Print information about the machine's instruction templates. |
remove_it_name | no default value | Remove instruction template with a given name |
add_it_name | no default value | Add empty instruction template with a given name. |
modify_it_name | no default value | Modify instruction template with a given name. |
width | 32 | Width supported by the instruction template. |
width_part | 8 | Minimum size of width per slot. |
split | false | Split immediate among slots. |
dst_imm_unit | no default value | Destination immediate unit. |
Example of adding a new 32-bit wide immediate template named newTemplate that is split among buses:
explore -e ImmediateGenerator -s 1 -u add_it_name="newTemplate" -u width=32 -u split=true database.dsdb
Parameters that can be passed to the ImplementationSelector are:
Param Name | Default Value | Description |
---|---|---|
ic_dec | DefaultICDecoder | Name of the ic decoder plugin. |
ic_hdb | asic_130nm_1.5V.hdb | Name of the HDB from which the implementations are selected. |
adf | no default value | An ADF for which the implementations are selected if no database is used. |
Example of creating an implementation for configuration ID 1 in the database:
explore -e ImplementationSelector -s 1 database.dsdb
Parameters that can be passed to the MinimizeMachine are:
Param Name | Default Value | Description |
---|---|---|
min_bus | true | Minimize buses. |
min_fu | true | Minimize function units. |
min_rf | true | Minimize register files. |
frequency | no default value | Running frequency (in MHz) for the applications. |
Example of minimizing configuration ID 1 in the database with a frequency of 50 MHz:
explore -e MinimizeMachine -s 1 -u frequency=50 database.dsdb
The connections are created between the register files in neighboring clusters and the extra architecture. If build_idf is set to true, the hardware databases to be used when searching for component implementations can be specified with the -b parameter.
Parameters that can be passed to the ADFCombiner are:
Param Name | Default Value | Description |
---|---|---|
node_count | 4 | Number of times the node is replicated. |
node | node.adf | The architecture of node that will be replicated. |
extra | extra.adf | The architecture which will be added just once. |
build_idf | false | If defined, ADFCombiner will try to create .idf definition file. |
vector_lsu | false | If defined, the VectorLSGenerator plugin will be called to create a wide load-store unit. |
address_spaces | data | The semicolon-separated list of address space names to be used by the generated wide load-store units (one unit per address space). |
Example of creating an architecture with eight clusters and wide load-store units (one per address space), without creating an implementation definition file:
explore -e ADFCombiner -u node=myNode.adf -u extra=myExtra.adf -u node_count=8 -u vector_lsu=true -u address_spaces="local;global" test.dsdb
If successful, the Explorer will print out the configuration number of the architecture created in test.dsdb. The created architecture can be written to an architecture file with:
explore -w 2 test.dsdb
Example of creating an architecture with four clusters, with an implementation definition file:
explore -e ADFCombiner -u node=myNode.adf -u extra=myExtra.adf -u node_count=4 -u build_idf=true -b default.hdb -b stream.hdb test.dsdb
If successful, the Explorer will print out the configuration number of the architecture created in test.dsdb. The created architecture and implementation definition files can be written out with:
explore -w 2 test.dsdb
VLIWConnectIC takes an .adf architecture as input and arranges its FUs into a VLIW-like interconnection. This is typically used as a baseline for running the BusMergeMinimizer and RFPortMergeMinimizer plugins. The output machine has some empty buses whose instruction slots are used to encode long immediates.
Param Name | Default Value | Description |
---|---|---|
wipe_register_file | yes | Replace the original register file(s) with a native VLIW RF. |
limm_bus_count | 1 | Number of empty buses used to encode long immediates. |
simm_width | 6 | Short immediate width. |
Example of invoking the plugin:
explore -e VLIWConnectIC -a test.adf -s 1 test.dsdb
explore -w 2 test.dsdb
Pekka Jääskeläinen 2018-03-12