An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors

Bibliographic Details
Published in:	32nd International Symposium on Computer Architecture (ISCA'05), pp. 272-283
Main Authors:	Clark, Nathan; Blome, Jason; Chu, Michael; Mahlke, Scott; Biles, Stuart; Flautner, Krisztian
Format:	Conference Proceeding
Language:	English
Published:	Washington, DC, USA; IEEE Computer Society; 01.05.2005; IEEE
Series:	ACM Conferences
Subjects:	Acceleration; Application software; Application specific processors; Computer architecture; Computer systems organization > Architectures; Computer systems organization > Architectures > Parallel architectures > Very long instruction word; Computer systems organization > Architectures > Serial architectures > Complex instruction set computing; Computer systems organization > Architectures > Serial architectures > Reduced instruction set computing; Computer systems organization > Embedded and cyber-physical systems; Computer systems organization > Real-time systems; Costs; Delay; General and reference > Cross-computing tools and techniques > Performance; Hardware; Hardware > Electronic design automation > Modeling and parameter extraction; Laboratories; Process design; Registers
ISBN:	076952270X, 9780769522708
ISSN:	1063-6897

Description
Summary:	Instruction set customization is an effective way to improve processor performance. Critical portions of application data-flow graphs are collapsed for accelerated execution on specialized hardware. Collapsing data-flow subgraphs compresses the latency along critical paths and reduces the number of intermediate results stored in the register file. While custom instructions can be effective, the time and cost of designing a new processor for each application are immense. To overcome this roadblock, this paper proposes a flexible architectural framework to transparently integrate custom instructions into a general-purpose processor. Hardware accelerators are added to the processor to execute the collapsed subgraphs. A simple microarchitectural interface is provided to support a plug-and-play model for integrating a wide range of accelerators into a pre-designed and verified processor core. The accelerators are exploited using an approach of static identification and dynamic realization. The compiler is responsible for identifying profitable subgraphs, while the hardware handles discovery, mapping, and execution of compatible subgraphs. This paper presents the design of a plug-and-play transparent accelerator system and evaluates the cost/performance implications of the design.
DOI:	10.1109/ISCA.2005.9
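
Illustration:	The summary's claim that collapsing a data-flow subgraph shortens the critical path and removes intermediate register-file results can be illustrated with a toy model. The sketch below is not taken from the paper: the function names, the six-operation graph, the per-operation latencies, and the 2-cycle latency of the fused node are all assumptions chosen for illustration.

```python
from functools import lru_cache


def critical_path(latency, edges):
    """Longest latency-weighted path through a DAG.

    latency: dict mapping node -> cycles (hypothetical values).
    edges:   set of (producer, consumer) pairs.
    """
    preds = {node: [] for node in latency}
    for src, dst in edges:
        preds[dst].append(src)

    @lru_cache(maxsize=None)
    def finish(node):
        # A node finishes after its slowest predecessor plus its own latency.
        return latency[node] + max((finish(p) for p in preds[node]), default=0)

    return max(finish(node) for node in latency)


def collapse(latency, edges, subgraph, name, fused_latency):
    """Replace the nodes in `subgraph` with one fused node called `name`.

    Assumes the subgraph is convex (no path leaves it and re-enters it),
    otherwise the result would contain a cycle.
    """
    new_latency = {n: c for n, c in latency.items() if n not in subgraph}
    new_latency[name] = fused_latency

    def rename(node):
        return name if node in subgraph else node

    new_edges = {
        (rename(s), rename(d))
        for s, d in edges
        if not (s in subgraph and d in subgraph)  # internal edges disappear
    }
    return new_latency, new_edges


def internal_results(edges, subgraph):
    """Nodes whose result is consumed only inside the subgraph, so it no
    longer needs to be written to the register file after collapsing."""
    consumers = {}
    for s, d in edges:
        consumers.setdefault(s, set()).add(d)
    return {n for n in subgraph if n in consumers and consumers[n] <= subgraph}


# Hypothetical example graph: a load feeding four ALU operations and a store.
latency = {"ld": 2, "and": 1, "shl": 1, "add": 1, "xor": 1, "st": 1}
edges = {("ld", "and"), ("ld", "shl"), ("and", "add"),
         ("shl", "add"), ("add", "xor"), ("xor", "st")}
subgraph = {"and", "shl", "add", "xor"}

before = critical_path(latency, edges)
fused_latency, fused_edges = collapse(latency, edges, subgraph, "accel", 2)
after = critical_path(fused_latency, fused_edges)
saved = internal_results(edges, subgraph)

print(before, after, sorted(saved))
```

In this toy graph the four ALU operations contribute three cycles to the critical path, while the assumed fused node contributes two; the results of `and`, `shl`, and `add` are consumed only inside the subgraph and so never need a register-file write.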