News and updates archive

Dec 11th, 2024: Doctoral thesis on energy-efficient CGRAs utilizes OpenASIP

Barry de Bruin from Technical University of Eindhoven defended his doctoral thesis on energy-efficient coarse-grained reconfigurable arrays (CGRA). In the thesis, titled "Design of Energy‐Efficient CGRA‐based Systems", Barry leveraged OpenASIP's flexible framework and used its retargetable compiler backend to compile C programs for the designed CGRAs. This allowed more flexibility and ease of programming when compared to other CGRA implementations. As a doctoral student, Barry also visited and worked as a member of the CPC group. The group's leader Pekka Jääskeläinen was a copromotor (co-supervisor) in the thesis. Congratulations Barry!

Barry's dissertation

Nov 27th, 2024: Two publications in NorCAS

The CPC group published two papers in this year's IEEE Nordic Circuits and Systems Conference (NorCAS). Kari, the first author of both papers, participated in the conference in Lund and gave a presentation about each topic. The first paper leans on the recent interest in using AI-based methods for processor design space exploration. In this field, methods to evaluate design points quickly are key for fast exploration. The paper, done in collaboration with the Robot Learning team at Aalto University, describes a machine learning based method to estimate cycle counts in application-specific, static multi-issue architectures. The paper is titled "Cycle Count Estimation of VLIW Processors Using Machine Learning".

Cycle count estimation

The second paper, "Fully Automatic Compiler Retargeting and CV-X-IF Hardware Interface Generation for RISC-V Custom Instructions", concerns CPC's efforts in the TRISTAN project. Since developing, verifying, and possibly certifying processor IPs is time-consuming and expensive, there are ongoing efforts in the RISC-V community to specify and implement standardized coprocessor/accelerator interfaces to existing processors. Once the interface is in place, the processor IP can be instantiated with different coprocessor/accelerator IPs. In this work, we leveraged OpenASIP's hardware generation capabilities to automatically generate CV-X-IF-based coprocessors. Operations of the coprocessor can be defined with OpenASIP's processor designer (ProDE). In this work, we also describe the improvements to the RISC-V support in OpenASIP. Operations from C code can now automatically be mapped to (suitable) custom operations in the coprocessor.

CV-X-IF coprocessor

Nov 22nd, 2024: Programmable Instruction Dictionary Compression in Springer DAES

Instruction compression has been used in a variety of ways to mitigate the overheads of programmability in processors. We proposed a programmable instruction dictionary compression with the goal of improving dynamic compression ratio and energy-efficiency, and compared our approach to "traditional" instruction stream components. The article titled "Energy-efficient instruction compression with programmable dictionaries" was published in Springer Design Automation for Embedded Systems (DAES).

Parallel dictionaries
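
To illustrate the general idea behind dictionary compression, here is a minimal Python sketch. It is not the method from the article: it uses a plain static dictionary of the most frequent instruction words, and the function names, the one-bit flag per entry and the example program are all assumptions made for illustration.

```python
from collections import Counter

def build_dictionary(instructions, size):
    """Return the `size` most frequent instruction words."""
    counts = Counter(instructions)
    return [word for word, _ in counts.most_common(size)]

def compress(instructions, dictionary):
    """Replace dictionary hits with ('idx', i); keep misses as ('raw', word)."""
    index = {word: i for i, word in enumerate(dictionary)}
    return [("idx", index[w]) if w in index else ("raw", w) for w in instructions]

def decompress(stream, dictionary):
    """Expand dictionary indices back into full instruction words."""
    return [dictionary[v] if kind == "idx" else v for kind, v in stream]

def compressed_bits(stream, word_bits, index_bits):
    """Size of the stream, assuming one flag bit marks each entry as index or raw."""
    return sum(1 + (index_bits if kind == "idx" else word_bits) for kind, _ in stream)

# Hypothetical 32-bit instruction words; the values carry no real encoding.
program = [0x13, 0x13, 0x6F, 0x13, 0x33, 0x6F, 0x13, 0x93]
dictionary = build_dictionary(program, size=2)
stream = compress(program, dictionary)
```

In this toy model the compression ratio depends on how often the executed instructions hit the dictionary, which is why being able to reprogram the dictionary contents can matter for the dynamic compression ratio.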

Oct 10th, 2024: FPGA bitstream database paper in IEEE VLSI

Implementing applications efficiently on FPGAs requires knowledge not only of the algorithms used in the application, but also of RTL description and FPGA EDA tools. In order to separate the tasks of the SW designer from those of the HW designer, Topi Leppänen proposes to use pre-generated bitstream databases together with partial FPGA reconfiguration. The SW designer can implement an application by picking from kernels in the database and is not required to have expertise in RTL or FPGA design. Our proposed tool, AFOCL, handles downloading the bitstreams and reconfiguring the FPGA automatically. The article "Bitstream Database-Driven FPGA Programming Flow Based on Standard OpenCL" is published in IEEE Transactions on Very Large Scale Integration (VLSI). The code is released as open source and is available here.

FPGA bitstream database flow

July 2nd, 2024: CPC at RISC-V Summit Europe

A delegation of three CPC members (Pekka, Kari and Joonas) participated in the RISC-V Summit Europe 2024 in Munich. The hero of the pack was Kari, who delivered both a poster and an excellent talk about OpenASIP's RISC-V support. Check the slides here.

June 16th, 2024: PoCL-R enables adaptable AI offloading from nanodrone in AISA Y3 Demonstrator

The 3rd year demonstrator of the AISA project was presented last Friday live at Paidia in Tampere, Finland. The demonstrator features adaptable AI compute offloading from a nanodrone to remote servers. The Crazyflie nanodrone offloads an object detection algorithm to a remote server via PoCL-R and adapts to the network quality by adjusting the compression rate of the images sent to the server on the fly. You can watch the demonstrator videos on YouTube.
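
The adaptation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the demonstrator's code: the function name, the quality levels and the frame size estimates are hypothetical, and the real system may use a different control policy.

```python
def choose_quality(bandwidth_kbps, fps, size_kb_by_quality):
    """Pick the highest quality whose frames fit the measured bandwidth.

    size_kb_by_quality maps a quality level to an estimated frame size in
    kilobytes. All numbers used with this function are hypothetical.
    """
    budget_kb = bandwidth_kbps / 8 / fps  # kilobytes available per frame
    fitting = [q for q, size in size_kb_by_quality.items() if size <= budget_kb]
    # Fall back to the lowest quality when no level fits the budget.
    return max(fitting) if fitting else min(size_kb_by_quality)

# Hypothetical frame sizes (kB) for three compression quality levels.
SIZES = {20: 4.0, 50: 9.0, 80: 18.0}

good_link = choose_quality(2000, fps=10, size_kb_by_quality=SIZES)
weak_link = choose_quality(800, fps=10, size_kb_by_quality=SIZES)
bad_link = choose_quality(200, fps=10, size_kb_by_quality=SIZES)
```

Re-evaluating the choice for every frame, or every few frames, lets the sender trade image quality for throughput as the link degrades or recovers.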

April 29th, 2024: The Impact of Wireless Channel Impairments on Computer Vision Accuracy in WCNC 2024

When offloading computer vision (CV) computation from a small device, such as a drone, to a remote server, a stream of images needs to be sent over a wireless network channel. Traditional entropy-coded bitstreams, such as JPEG, transmitted via a digital channel are prone to a so-called “digital cliff”: a sudden drop in the reconstructed image quality due to data corruption caused by channel noise and lost packets. To circumvent the digital cliff, Linear Coding and Transmission (LCT) schemes, pioneered by SoftCast in 2010, let the reconstructed image quality degrade smoothly as channel impairments increase. So far, however, the impact of LCT and channel impairments on CV accuracy has been studied only minimally. Jakub Žádník recently presented a paper “Performance of Linear Coding and Transmission in Low-Latency Computer Vision Offloading” at the WCNC 2024 conference in Dubai (UAE) in which he studies the impact of LCT processing, wireless channel noise and packet losses on the accuracy of semantic segmentation and object detection tasks. The absence of the digital cliff in the task accuracy was confirmed via a thorough evaluation over a wide range of LCT configurations. The findings were further strengthened by a realistic 5G channel simulation and by retraining the CV tasks to account for the distortions caused by LCT and a noisy channel.

April 12th, 2024: OpenCL pipe specification improvements in IWOCL 2024

OpenCL Pipe is a memory object used for passing data between kernels. It is useful in streaming style applications, where data is forwarded from one task to another. Since the pipe can be implemented in multiple ways, and OpenCL is intended as a programming model for heterogeneous platforms, the performance of the pipe implementations can vary heavily. The PhD thesis work of Topi Leppänen has resulted in insights on how the pipe specification could be improved especially in the context of FPGAs. These findings, along with suggestions for the OpenCL specification, were presented in IWOCL 2024 by Topi. Read the publication here.
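
For readers unfamiliar with pipes, the streaming pattern can be modelled in plain Python as a bounded FIFO between a producer and a consumer. This is a behavioural sketch only, not the OpenCL API: the `Pipe` class and `run_stream` function are hypothetical, and the only detail borrowed from OpenCL is that a pipe read or write reports success with 0 and failure with a negative value.

```python
from collections import deque

class Pipe:
    """Bounded FIFO of packets, a toy stand-in for an OpenCL pipe object."""

    def __init__(self, max_packets):
        self.max_packets = max_packets
        self.packets = deque()

    def write(self, packet):
        """Return 0 on success, -1 if the pipe is full."""
        if len(self.packets) >= self.max_packets:
            return -1
        self.packets.append(packet)
        return 0

    def read(self):
        """Return (0, packet) on success, (-1, None) if the pipe is empty."""
        if not self.packets:
            return -1, None
        return 0, self.packets.popleft()

def run_stream(values, pipe):
    """Producer squares each value; consumer adds one to each packet."""
    pending = deque(values)
    results = []
    while pending or pipe.packets:
        # Producer "kernel": push packets until the pipe is full.
        while pending and pipe.write(pending[0] * pending[0]) == 0:
            pending.popleft()
        # Consumer "kernel": take one packet per step.
        status, packet = pipe.read()
        if status == 0:
            results.append(packet + 1)
    return results

results = run_stream([1, 2, 3], Pipe(max_packets=2))
```

How such a FIFO is realized (for example in global memory or as an on-chip stream between FPGA kernels) is left to the implementation, which is one reason pipe performance can differ so much between platforms.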

April 5th, 2024: Adding fault tolerance to OpenCL

The modern computing landscape includes a variety of platforms. In addition to general-purpose devices, specialized processors are used to increase efficiency in various application domains and use cases. The OpenCL standard presents a unified way to program these heterogeneous devices, and the CPC group's PoCL is a vendor-independent, open-source implementation of the standard. In his MSc thesis "Adding fault tolerance to OpenCL" (2023) Robin Bijl added a mechanism to achieve robust computation with PoCL. This allows fault tolerance and reliable computing even in the context of heterogeneous platforms. Read the thesis here.

December 11th, 2023: Improving IoT device capabilities by offloading OpenCL kernels to edge servers

The Internet of Things (IoT) consists of an enormous number of devices, varying in size from large to extremely tiny. While it may be desirable to have complex functionalities in even the tiniest devices, this is often not feasible simply due to the lack of available resources. However, offloading the computation to a (nearby) server or a larger device enables sharing of the resources and seemingly allows even small devices to perform demanding computations. In his MSc thesis "Offloading Computation with a Minimized OpenCL Runtime from a Nano Drone" (2022) Jyry Uitto created a proof-of-concept implementation of a nano drone that can offload OpenCL kernel execution onto an edge server. Read the thesis here.

November 30th, 2023: Dual-IS article in IEEE TC

Static multi-issue processors exploit instruction level parallelism efficiently thanks to the lack of dynamic hardware that schedules instructions during run time. However, their instruction stream energy consumption is significantly higher than that of their dynamic multi- or single-issue counterparts. Processor designers must choose between the benefits of static multi-issue capabilities and higher code density, but is it too much to ask for both? In our latest article, we introduce an energy-efficient dual-mode (RISC-V single-issue and an exposed datapath VLIW) architecture for leveraging instruction level parallelism statically when available in the program, without suffering from VLIW’s poor code density when there’s a lack of it. The flexibility of the architecture is utilized by a novel compilation method that can generate code for both instruction sets with fine-grained mode switching. Read more in the article.

November 16th, 2023: BrainTTA presentation in IEEE ICCD 2023

Our Dutch colleague Maarten Molendijk from TU Eindhoven presented a co-authored paper "BrainTTA: A 28.6 TOPS/W Compiler Programmable Transport-Triggered NN SoC" in IEEE ICCD 2023. The publication was a result of successful collaboration work between our CPC group and PARSE/TUE where a programmable TTA/SIMD-based accelerator was designed for ultra low power AI inference on low precision use cases. The design was done using the OpenASIP tools with the design work conducted by Molendijk et al. Read more about it in the preprint. The presentation slides are available here.

November 6th, 2023: New publications added
  • Topi Leppänen, Joonas Multanen, Leevi Leppänen, Pekka Jääskeläinen:
    AFOCL: Portable OpenCL Programming of FPGAs via Automated Built-in Kernel Management
    in IEEE Nordic Circuits and Systems Conference (NorCAS 2023) (download).
  • Niklas Rother, Leonard Mätzner, Pekka Jääskeläinen, Topi Leppänen, Jens Karsten Schleusner, Holger Christoph Blume:
    Synthetic Aperture Radar Algorithms on Transport Triggered Architecture Processors using OpenCL
    International Radar Conference 2023
  • Maarten Molendijk, Floran de Putter, Manil Dev Gomony, Pekka Jääskeläinen and Henk Corporaal:
    BrainTTA: A 28.6 TOPS/W Compiler Programmable Transport-Triggered NN SoC
    IEEE International Conference on Computer Design (ICCD 2023)
  • Panagiotis Mousouliotis, Topi Leppänen, Pekka Jääskeläinen, Nikos Petrellis, Panagiotis Christakos, Georgios Keramidas, Christos Antonopoulos, Nikolaos Voros:
    On the OpenCL Support for Streaming Fixed-Function Accelerators on Embedded SoC FPGAs
    The 19th International Symposium on Applied Reconfigurable Computing (ARC 2023)
November 1st, 2023: AFOCL presentation in NorCAS 2023 conference

Our doctoral researcher Topi Leppänen presented the paper "AFOCL: Portable OpenCL Programming of FPGAs via Automated Built-in Kernel Management" in NorCAS 2023. AFOCL allows FPGA device users to avoid vendor lock-in and separates the roles of software and FPGA engineer. Behind the curtain, the OpenCL implementation automatically selects IPs from a precompiled bitstream database and handles FPGA reconfiguration. Details in the paper.

August 24th, 2023: Final demonstrator video for the CPSoSAware EU project available

Check out the video below of the final demonstrator for the CPSoSAware EU project. The work was a collaboration with the University of Peloponnese. The demonstrator features a nanodrone, which offloads processing to edge resources wirelessly using PoCL-R.

August 8th, 2023: Added a publication from 2022 missing from the web page
  • Topi Leppänen, Atro Lotvonen, Pekka Jääskeläinen:
    "Cross-vendor programming abstraction for diverse heterogeneous platforms"
    in Frontiers in Computer Science, Vol. 4, Oct. 2022 (download).
June 15th, 2023: Two new publications added
  • Topi Leppänen, Atro Lotvonen, Panagiotis Mousouliotis, Joonas Multanen, Georgios Keramidas, Pekka Jääskeläinen:
    "Efficient OpenCL system integration of non-blocking FPGA accelerators"
    in Microprocessors and Microsystems (MICPRO), Vol. 97, Mar. 2023 (download).
  • Alex Hirvonen, Topi Leppänen, Kari Hepola, Joonas Multanen, Joost Hoozemans, Pekka Jääskeläinen:
    "AEX: Automated High-Level Synthesis of Compiler Programmable Co-processors"
    in Journal of Signal Processing Systems (JSPS), Feb. 2023 (download).
October 17th, 2022: A master's thesis and a new publication added
  • Kari Hepola:
    Generation of Customized RISC-V Implementations
    (2022) (link)
  • Kanishkan Vadivel, Barry de Bruin, Pekka Jääskeläinen, Roel Jordans and Henk Corporaal:
    "Prebypass: Software Register File Bypassing for Reduced Interconnection Architecture"
    in Euromicro Conference on Digital Systems Design (DSD 2022) (download).
September 22nd, 2022: New publications added
  • Jakub Žádník, Markku Mäkitalo, Pekka Jääskeläinen:
    "Pruned Lightweight Encoders for Computer Vision"
    in IEEE 24th International Workshop on Multimedia Signal Processing (MMSP 2022) (download, poster).
  • Kari Hepola, Joonas Multanen and Pekka Jääskeläinen:
    "Dual-IS: Instruction Set Modality for Efficient Instruction Level Parallelism"
    in 35th GI/ITG International Conference on Architecture of Computing Systems (ARCS 2022) (download).
  • Kari Hepola, Joonas Multanen and Pekka Jääskeläinen:
    "OpenASIP 2.0: Co-Design Toolset for RISC-V Application-Specific Instruction-Set Processors"
    in 33rd IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2022) (download).
February 16th, 2022: New publication added
  • Jakub Zadnik, Markku Mäkitalo, Jarno Vanne, Pekka Jääskeläinen:
    "Image and Video Coding Techniques for Ultra-Low-Latency",
    in ACM Computing Surveys (Volume 54, Issue 11s, January 2022) (download).
December 23rd, 2021: New publication, an old master's thesis and a new doctoral dissertation added
  • Topi Leppänen, Panagiotis Mousouliotis, Georgios Keramidas, Joonas Multanen, Pekka Jääskeläinen:
    "Unified OpenCL Integration Methodology for FPGA Designs",
    in NorCAS 2021: IEEE Nordic Circuits and Systems Conference (download).
  • Joonas Multanen:
    Hardware Optimizations for Low-Power Processors (December, 2014) (link)
  • Joonas Multanen:
    Energy-Efficient Instruction Streams for Embedded Processors (November, 2021) (link)
    November 8th, 2021: Two master's theses added
    • Topi Leppänen:
      Scalability optimizations for multicore soft processors (2021) (link)
    • Jan Solanti:
      Distributed Low Latency Computing With OpenCL: A Scalable Multi-Access Edge Computing Framework (2020) (link)
    November 4th, 2021: New publications added
    • Joonas Multanen, Kari Hepola, Asif Ali Khan, Jeronimo Castrillon, Pekka Jääskeläinen:
      "Energy-Efficient Instruction Delivery in Embedded Systems with Domain Wall Memory",
      in IEEE Transactions on Computers (Volume 71, Issue 9, September 2022) (download).
    • Jan Solanti, Michal Babej, Julius Ikkala, Vinod Kumar Malamal Vadakital, Pekka Jääskeläinen:
      "PoCL-R: A Scalable Low Latency Distributed OpenCL Runtime",
      in SAMOS XXI: Embedded Computer Systems: Architectures, MOdeling, and Simulation (virtual, July 2021) (download).
    • Jakub Zadnik, Markku Mäkitalo, Jussi Iho, Pekka Jääskeläinen:
      "Performance of Texture Compression Algorithms in Low-Latency Computer Vision Tasks",
      in EUVIP 2021: 9th European Workshop on Visual Information Processing (virtual, 23-25 June 2021) (download).
    • Joost Hoozemans, Kati Tervo, Pekka Jääskeläinen, Zaid Al-Ars:
      "Energy Efficient Multistandard Decompressor ASIP",
      in ICCDE 2021: 7th International Conference on Computing and Data Engineering (virtual, January 2021) (download).
    December 10th, 2020: New publications and a blog post

    New publications in the fall:

    • Joonas Multanen, Kari Hepola, Pekka Jääskeläinen:
      "Programmable Dictionary Code Compression for Instruction Stream Energy Efficiency",
      in ICCD 2020: The 38th IEEE International Conference on Computer Design (virtual, October, 2020) (download).
    • Kati Tervo, Samawat Malik, Topi Leppänen, Pekka Jääskeläinen:
      "TTA-SIMD Soft Core Processors",
      in FPL2020: 30th International Conference on Field-Programmable Logic and Applications (virtual, August-September, 2020) (download).

    The instruction stream energy efficiency paper is featured in a recent blog post as well.

    August 19th, 2020: New publication added
    • Joonas Multanen, Heikki Kultala, Kati Tervo, Pekka Jääskeläinen:
      "Energy Efficient Low Latency Multi-issue Cores for Intelligent Always-On IoT Applications",
      in Journal of Signal Processing Systems (2020) (download).
    June 16th, 2020: Three old master's theses added
    For some reason, three extremely interesting master's theses produced from the work of the group were missing from the web page. They have now been added:
    • Ville Korhonen:
      Portable OpenCL Out-of-Order Execution Framework for Heterogeneous Platforms (December, 2014) (link)
    • Henry Linjamäki:
      Instruction Memory Hierarchy Generation for Customized Processors (2015) (link)
    • Aleksi Tervo:
      Optimizing Transport-Triggered Architectures for Field-Programmable Gate Arrays (2018) (link)
    October 7th, 2019: New publications added
    • Kanishkan Vadivel, Pekka Jääskeläinen, Roel Jordans, Heikki Kultala, Sander Stuijk, Henk Corporaal:
      "Towards Efficient Code Generation for Exposed Datapath Architectures",
      in SCOPES 2019: 22nd International Workshop on Software and Compilers for Embedded Systems (Sankt Goar, Germany, May, 2019) (download).
    • Sven Gesper, Moritz Weißbrich, Stephan Nolting, Tobias Stuckenberg, Holger Blume, Guillermo Payá Vayá, Pekka Jääskeläinen:
      "Evaluation of Different Processor Architecture Organizations for On-Site Electronics in Harsh Environments",
      in SAMOS XIX: Embedded Computer Systems: Architectures, MOdeling, and Simulation (Samos, Greece, July 2019) (download).
    • Joonas Multanen, Pekka Jääskeläinen, Asif Ali Khan, Fazal Hameed, Jeronimo Castrillon:
      "SHRIMP: Efficient Instruction Delivery with Domain Wall Memory",
      in ACM/IEEE International Symposium on Low Power Electronics and Design (Lausanne, Switzerland, July 2019) (download).
    • Alex Hirvonen, Kati Tervo, Heikki Kultala, Pekka Jääskeläinen:
      "AEx: Automated Customization of Exposed Datapath Soft-Cores",
      in Euromicro Conference on Digital System Design (Kallithea, Greece, August 2019) (download).
    • Jakub Zadnik, Jarmo Takala:
      "Low-power Programmable Processor for Fast Fourier Transform Based on Transport Triggered Architecture",
      in 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (download).
    May 2nd, 2019: New publications added
    LordCore, a FP16 SIMD multicore TTA co-design case study we did in the ALMARVI project is now finally published as an article:
    • Heikki Kultala, Timo Viitanen, Heikki Berg, Pekka Jääskeläinen, Joonas Multanen, Mikko Kokkonen, Kalle Raiskila, Tommi Zetterman, Jarmo Takala:
      "LordCore: Energy-Efficient OpenCL-Programmable Software-Defined Radio Coprocessor",
      in IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Volume 27, Issue 5, May 2019) (download).
    Also added a Polar Decoder case study published by our colleagues in Bordeaux:
    • Mathieu Léonardon, Camille Leroux, Pekka Jääskeläinen, Christophe Jego, Yvon Savaria:
      "Transport Triggered Polar Decoders",
      in 2018 IEEE 10th International Symposium on Turbo Codes & Iterative Information Processing (ISTC) (download).
    Please let us know if you have a TTA/OpenASIP-related publication to add to the list!
    January 10th, 2019: New publications added
    Two conference publications we published during Fall time:
    • Joonas Multanen, Heikki Kultala, Pekka Jääskeläinen, Timo Viitanen, Aleksi Tervo, Jarmo Takala:
      "LoTTA: Energy-Efficient Processor for Always-On Applications",
      in SiPS 2018: IEEE Workshop on Signal Processing Systems (Cape Town, South Africa, October 2018) (download).
    • Joonas Multanen, Heikki Kultala, Pekka Jääskeläinen:
      "Energy-Delay Trade-Offs in Instruction Register File Design",
      in IEEE Nordic Circuits and Systems Conference (Tallinn, Estonia, October 2018) (download).
    • Mathieu Léonardon, Camille Leroux, Pekka Jääskeläinen, Christophe Jego and Yvon Savaria:
      "Transport Triggered Polar Decoders",
      in The 10th International Symposium on Turbo Codes & Iterative Information Processing (Hong Kong, China, December 2018) (download).
    • Pekka Jääskeläinen, Ville Korhonen, Matias Koskela, Jarmo Takala, Karen Egiazarian, Aram Danielyan, Cristóvão Cruz, James Price, Simon Mcintosh-Smith:
      "Exploiting Task Parallelism with OpenCL: A Case Study",
      in Journal of Signal Processing Systems, vol. 91, issue 1, October 2018 (download).
    October 19th, 2018: New publications added
    These workshop and conference publications, which we published during the spring and summer, can now be found in proceedings:
    • Jos IJzerman, Timo Viitanen, Pekka Jääskeläinen, Heikki Kultala, Lasse Lehtonen, Maurice Peemen, Henk Corporaal, Jarmo Takala:
      "AivoTTA: An Energy Efficient Programmable Accelerator for CNN-Based Object Recognition",
      in SAMOS XVIII: Embedded Computer Systems: Architectures, MOdeling, and Simulation (Samos, Greece, July 2018) (download).
    • Pekka Jääskeläinen, John Glossner, Martin Jambor, Aleksi Tervo, Matti Rintala:
      "Offloading C++17 Parallel STL on System Shared Virtual Memory Platforms",
      in 3rd Workshop on Open Source Supercomputing (OpenSuco3 within ISC 2018, June, Frankfurt, Germany) (download).
    • Jääskeläinen, P., Tervo, A., Paya-Vaya, G., Viitanen, T., Behmann, N., Takala, J., & Blume, H. (2018).:
      "Transport-Triggered Soft Cores",
      in 2018 IEEE International Parallel and Distributed Processing Symposium, Workshops (IPDPSW) (download).
    July 23rd, 2018: AivoTTA Wins the Best Paper Award in SAMOS 2018!

    We were honored to receive the Stamatis Vassiliadis best paper award this year in SAMOS for our paper AivoTTA: An Energy Efficient Programmable Accelerator for CNN-Based Object Recognition!

    In the paper we proposed an ASIP design based on a custom wide-SIMD TTA for high-performance low power CNN inference applications with excellent results.

    July 11th, 2018: New publications and a Twitter account added

    A couple of new publications were added:

    • Timo Viitanen, Janne Helkala, Heikki Kultala, Pekka Jääskeläinen, Jarmo Takala, Tommi Zetterman, Heikki Berg:
      "Variable Length Instruction Compression on Transport Triggered Architectures",
      in International Journal of Parallel Programming, 2018 (download).
    • Multanen, J., Viitanen, T., Jääskeläinen, P. & Takala, J.:
      "Instruction Fetch Energy Reduction with Biased SRAMs",
      in Journal of Signal Processing Systems, 2018 (download).
    • Heikki Kultala, Pekka Jääskeläinen, Johannes Ijzerman, Timo Viitanen, Markku Mäkitalo, Jarmo Takala:
      "Exposed Datapath Optimizations for Loop Scheduling",
      in SAMOS XVII: Embedded Computer Systems: Architectures, MOdeling, and Simulation (Samos, Greece, July 2017) (download).
    • Mona Aghababaeetafreshi, Matias Koskela, Dani Korpi, Pekka Jääskeläinen, Mikko Valkama, Jarmo Takala:
      "Software Defined Radio Implementation of Adaptive Nonlinear Digital Self-interference Cancellation for Mobile Inband Full-Duplex Audio",
      in GlobalSIP: 4th IEEE Global Conference on Signal & Information Processing (Washington D.C., USA, December 2016) (download).

    CPC also now has a Twitter account where we plan to announce new publications and other activities.

    May 17th, 2017: New publications
  • Jukka Teittinen, Markus Hiienkari, Indrė Žliobaitė, Jaakko Hollmen, Heikki Berg, Juha Heiskala, Timo Viitanen, Jesse Simonsson, Lauri Koskinen:
    "A 5.3 pJ/op approximate TTA VLIW tailored for machine learning",
    in Microelectronics Journal, Volume 61, March 2017 (download).
  • Pekka Jääskeläinen, Timo Viitanen, Jarmo Takala, Heikki Berg:
    "HW/SW Co-design Toolset for Customization of Exposed Datapath Processors",
    in Computing Platforms for Software-Defined Radio (book chapter pp 147-164), December 2016 (download).
  • Joonas Multanen, Timo Viitanen, Pekka Jääskeläinen, Jarmo Takala:
    "Xor-Masking: a Low-Overhead Method for Instruction Fetch Energy Reduction with Emerging SRAM Technologies",
    in SiPS 2016: IEEE Workshop on Signal Processing Systems (Dallas, Texas, October 2016) (download).
  • Joonas Multanen, Heikki Kultala, Matias Koskela, Timo Viitanen, Pekka Jääskeläinen, Jarmo Takala, Karen Egiazarian, Aram Danielyan, Cristóvão Cruz:
    "OpenCL Programmable Exposed Datapath High Performance Low-Power Computational Imaging Accelerator",
    in IEEE Nordic Circuits and Systems Conference (Copenhagen, Denmark, November 2016) (download).
  • Heikki Kultala, Timo Viitanen, Pekka Jääskeläinen, Jarmo Takala:
    "Aggressively Bypassing List Scheduler for Transport Triggered Architectures",
    in SAMOS XVI: Embedded Computer Systems: Architectures, MOdeling, and Simulation (Samos, Greece, July 2016) (download).
  • N. Behmann, C. Seifert, G. Paya-Vaya, H. Blume, P. Jääskeläinen, J. Multanen, H. Kultala, J. Takala, J. Thiemann, S. van de Par:
    "Customized High Performance Low Power Processor for Binaural Speaker Localization",
    in IEEE Int'l Conference on Electronics, Circuits, & Systems (Monte Carlo, Monaco, December 2016) (download).
    March 16th, 2017: TCE 1.15 released

    A new version of the toolset is now available for download.

    See the release announcement and the change summary for details. Install instructions.

    November 24th, 2016: TCE 1.14 released

    A new version of the toolset is now available for download.

    See the release announcement and the change summary for details. Install instructions.

    October 20th, 2016: New publications added

    It has been too long since the last update of the publication page. The following new publications can now be found in proceedings and published journal issues:

    • Heikki Kultala, Timo Viitanen, Pekka Jääskeläinen, Janne Helkala, Jarmo Takala:
      "Improving Code Density with Variable Length Encoding Aware Instruction Scheduling",
      in Journal of Signal Processing Systems, September 2016, vol. 84, issue 3 (download).
    • Tomi Äijö, Pekka Jääskeläinen, Tapio Elomaa, Jarmo Takala:
      "Integer Linear Programming-Based Scheduling for Transport Triggered Architectures",
      in ACM Transactions on Architecture and Code Optimization, January 2016, vol. 12, issue 4 (download).
    • Heikki Kultala, Joonas Multanen, Pekka Jääskeläinen, Timo Viitanen, and Jarmo Takala:
      "Impact of Operand Sharing to the Processor Energy Efficiency",
      in CADS: 18th CSI International Symposium on Computer Architecture & Digital Systems (Tehran, Iran, October 2015) (download).
    • Ville Korhonen, Pekka Jääskeläinen, Matias Koskela, Jarmo Takala:
      "Rapid Customization of Image Processors Using Halide",
      in GlobalSIP: 3rd IEEE Global Conference on Signal & Information Processing (Orlando, Florida, December 2015) (download).
    • Joonas Multanen, Timo Viitanen, Henry Linjamäki, Heikki Kultala, Pekka Jääskeläinen, Jarmo Takala, Lauri Koskinen, Jesse Simonsson, Heikki Berg, Kalle Raiskila and Tommi Zetterman:
      "Power Optimizations for a Transport Triggered SIMD Processor",
      in SAMOS XV: Embedded Computer Systems: Architectures, MOdeling, and Simulation (Samos, Greece, July 2015) (download).
    • Heikki Kultala, Timo Viitanen, Pekka Jääskeläinen, Jarmo Takala:
      "Aggressively Bypassing List Scheduler for Transport Triggered Architectures",
      in SAMOS XVI: Embedded Computer Systems: Architectures, MOdeling, and Simulation (Samos, Greece, July 2016) (download).
    • N. Behmann, C. Seifert, G. Paya-Vaya, H. Blume, P. Jääskeläinen, J. Multanen, H. Kultala, J. Takala, J. Thiemann, S. van de Par:
      "Customized High Performance Low Power Processor for Binaural Speaker Localization",
      in IEEE Int'l Conference on Electronics, Circuits, & Systems (Monte Carlo, Monaco, December 2016) (download).
    March 3rd, 2016: TCE 1.13 released

    A new version of the toolset is now available for download.

    See the release announcement and the change summary for details. Install instructions.

    November 25th, 2015: moved to Git and GitHub

    TCE development was moved from Bazaar and Launchpad to Git and GitHub.

    September 4th, 2015: TCE 1.12 released

    A new version of the toolset is now available for download.

    See the release announcement and the change summary for details.

    March 2nd, 2015: TCE 1.11 released

    A new version of the toolset is now available for download.

    See the release announcement for details.

    January 12th, 2015: new publications added
    • Yviquel Hervé, Sanchez Alexandre, Jääskeläinen Pekka, Takala Jarmo, Raulet Mickaël, Casseau Emmanuel:
      "Embedded Multi-Core Systems Dedicated to Dynamic Dataflow Programs",
      in Journal of Signal Processing Systems, vol. 80, issue 1, July 2015 (download).
    • Kultala Heikki, Viitanen Timo, Jääskeläinen Pekka, Helkala Janne, Takala Jarmo:
      "Compiler Optimizations for Code Density of Variable Length Instructions",
      in 2014 IEEE Workshop on Signal Processing Systems (SiPS) (download).
    • Viitanen Timo, Kultala Heikki, Jääskeläinen Pekka, Takala Jarmo:
      "Heuristics for Greedy Transport Triggered Architecture Interconnect Exploration",
      in International Conference on Compilers, Architecture and Synthesis for Embedded Systems 2014 (CASES 14) (download).
    • Nyländen Teemu, Boutellier Jani, Nikunen Karri, Hannuksela Jari, Silvén Olli:
      "Low-power Reconfigurable Miniature Sensor Nodes for Condition Monitoring",
      in International Journal of Parallel Programming (download).
    • Hautala Ilkka, Boutellier Jani, Hannuksela Jari, Silvén Olli:
      "Programmable Low-Power Multicore Coprocessor Architecture for HEVC/H.265 In-Loop Filtering",
      in IEEE Transactions on Circuits and Systems for Video Technology (download).
    • Ghazi Amanullah, Boutellier Jani, Abdelaziz Mahmoud, Lu Xiaojia, Anttila Lauri, Cavallaro Joseph R, Bhattacharyya Shuvra S, Valkama Mikko, Juntti Markku:
      "Low Power Implementation of Digital Predistortion Filter on a Heterogeneous Application Specific Multiprocessor",
      in The 39th IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Florence, Italy (download).
    September 23rd, 2014: new publications and theses added
    • Jääskeläinen Pekka, Kultala Heikki, Viitanen Timo, Takala Jarmo:
      "Code Density and Energy Efficiency of Exposed Datapath Architectures",
      in Journal of Signal Processing Systems, July 2014 (download).
    • Helkala Janne, Viitanen Timo, Kultala Heikki, Jääskeläinen Pekka, Takala Jarmo, Zetterman Tommi, Berg Heikki:
      "Variable Length Instruction Compression on Transport Triggered Architectures",
      in International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS XIV, Samos Island, Greece, July 14-17, 2014 (download).
    • Rister B., Jääskeläinen P., Silvén O., Hannuksela J.:
      "Parallel programming of a symmetric transport-triggered architecture with applications in flexible LDPC encoding",
      in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (download).
    • Yviquel H., Sanchez A., Jääskeläinen P., Takala J.:
      "Efficient software synthesis of dynamic dataflow programs",
      in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (download).
    • Master's thesis of Janne Helkala:
      Variable Length Instruction Compression on Transport Triggered Architectures (June, 2014) (link)
    • Master's thesis of Mikko Järvelä:
      Vector Operation Support for Transport Triggered Architectures (June, 2014) (link)
    September 5th, 2014: TCE 1.10 released

    A new version of the toolset is now available for download.

    See the release announcement for details.

    February 27th, 2014: New research group name

    The research group in Tampere University maintaining TCE was renamed from FlexASP to Customized Parallel Computing (CPC).

    January 27th, 2014: TCE 1.9 released

    A new version of the toolset is now available for download.

    See the release announcement for details.

    October 28th, 2013: new publications from Oulu added

    Several publications from the University of Oulu that use TCE have been added.

    October 17th, 2013: new publications added
    • Tomasz Patyk, David Guevorkian, Teemu Pitkänen, Pekka Jääskeläinen, Jarmo Takala:
      "Low-Power Application-Specific FFT Processor for LTE Applications",
      in SAMOS XIII: Embedded Computer Systems: Architectures, MOdeling, and Simulation (Samos, Greece, July 2013). (doi)
    • Heikki Kultala, Otto Esko, XianJun Jiao, Pekka Jääskeläinen, Vladimír Guzma, Jarmo Takala, Tommi Zetterman, Heikki Berg:
      "Turbo Decoding on Tailored OpenCL Processor",
      in IWCMC 2013: International Wireless Communications & Mobile Computing Conference (Cagliari, Italy, July 2013). (doi)
    • Tomasz Patyk, Perttu Salmela, Teemu Pitkänen, Pekka Jääskeläinen, Jarmo Takala:
      "Design Methodology for Offloading Software Executions to FPGA",
      in Journal of Signal Processing Systems, November 2011, vol. 65, issue 2. (doi)
    June 18th, 2013: TCE 1.8 released

    A new version of the toolset is now available for download.

    See the release announcement for details.

    January 21st, 2013: TCE 1.7 released

    A new version of the toolset is now available for download.

    See the release announcement for details.

    Jan 9th, 2013: Master's thesis added: "Floating-Point Arithmetic in Transport Triggered Architectures"

    A new Master's Thesis about floating-point support in TCE has been added to the publications section.

    Nov 13th, 2012: A doctoral dissertation added: "From Parallel Programs to Customized Parallel Processors"

    The doctoral dissertation of Pekka Jääskeläinen is now available in the publications section. The dissertation is about exploiting parallel programming languages in parallel processor customization and extending customization to the multicore level.

    June 7th, 2012: TCE 1.6 released

    A new version of the toolset is now available for download.

    This release adds support for LLVM 3.1, an experimental Verilog backend for the Processor Generator, support for explicit access to multiple address spaces from C, a simplified C++ interface for accessing the simulation engine, automated generation of clustered-style TTA machines, experimental vector input, and a bottom-up instruction scheduler. See the CHANGES file for a more thorough change listing.

    May 12th, 2012: ASILOMAR '11 publication added

    We presented a paper about our operation description format and compiler retargeting:

    • Heikki Kultala, Pekka Jääskeläinen, Jarmo Takala:
      "Operation Set Customization in Retargetable Compilers",
      in Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), 6-9 Nov 2011, Pacific Grove, California (IEEE Xplore).
    December 13th, 2011: TCE 1.5 released

    A new version of the toolset is now available for download.

    This release includes support for LLVM 3.0, experimental OpenCL C Embedded Profile support (in offline compilation/standalone mode), a lightweight (debug output) printing library, support for calling custom operations in specific function units, generalizations to the architecture description format to allow using the instruction scheduler for operation triggered architectures (with a proof of concept for the Cell SPU), several code generator improvements, and plenty of bug fixes. See the CHANGES file for a more thorough change listing.

    November 4th, 2011: SoC'11 publications added

    We presented two new papers related to TCE in the SoC 2011 conference:

    • Pekka Jääskeläinen, Erno Salminen, Otto Esko, Jarmo Takala,
      "Customizable Datapath Integrated Lock Unit,"
    • Vladimír Guzma, Teemu Pitkänen, Jarmo Takala,
      "Effects of Loop Unrolling and Use of Instruction Buffer on Processor Energy Consumption,"
      in Proc. of International Symposium on System on Chip 2011, Tampere, Finland, October 31-November 2, 2011
    October 19th, 2011: Portable OpenCL (pocl) released

    The work that started in early 2009 as an experiment to schedule OpenCL C kernels for standalone application-specific processors has now been generalized and released as a separate open source project called Portable OpenCL (pocl).

    September 7th, 2011: SAMOS XI publications added

    We presented two new papers in the SAMOS XI conference:

    • Pekka O. Jääskeläinen, Erno O. Salminen, Carlos S. de La Lama, Jarmo H. Takala, and Jose Ignacio Martinez,
      "TCEMC: A Co-Design Flow for Application-Specific Multicores".
    • Vladimír Guzma, Teemu Pitkänen, and Jarmo H. Takala,
      "Instruction Buffer with Limited Control Flow and Loop Nest Support".
    August 18th, 2011: Publication added: ASIP Integration and Verification Flow

    A new Master's Thesis about automatic ASIP integration and verification has been added to the publications.

    In addition, a new Bachelor's Thesis, "Siirtoliipaistujen prosessorien käyttäminen FPGA-pohjaisissa järjestelmäpiireissä" (Utilizing Transport-Triggered Processors on FPGA-based System-on-Chip), written in Finnish, has been added to the publications.

    July 7th, 2011: Publication added: OpenCL-based Design Methodology for Application-Specific Processors

    A new journal paper, an extended version of our SAMOS 2010 paper that studied OpenCL in the context of ASIP design, has been published.

    April 11th, 2011: TCE 1.4 released

    A new version of the toolset is now available for download. Check the release announcement and the change log. Good luck with your new TTA designs!

    --The TCE crew

    Feb 28th, 2011: Publication added: Programmable and Scalable Architecture for Graphics Processing Units (extended version)

    A new journal paper, an extended version of our SAMOS 2009 paper that studied TTA for GPU implementation, has been published.

    Jan 19th, 2011: TCE 1.3 virtual machine image uploaded & slides for teaching

    The virtual machine image for easy TCE experimentation has been updated to TCE 1.3. See the bottom of the download page for more info.

    In the documentation section there are now some slide sets that can be useful for teaching or giving TCE tutorials.

    Nov 10th, 2010: TCE 1.3 released

    A new version of the toolset is now available for download. Check the release announcement and the change log. Good luck with your new TTA designs!

    --The TCE crew
