1 Cover
2 Title Page
SCIENCES
Electronics Engineering, Field Director – Francis Balestra
Design Methodologies and Architecture, Subject Head – Ahmed Jerraya
3 Copyright
First published 2020 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK, www.iste.co.uk
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA, www.wiley.com
© ISTE Ltd 2020
The rights of Liliana Andrade and Frédéric Rousseau to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.
Library of Congress Control Number: 2020940076
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-78945-021-7
ERC code:
PE6 Computer Science and Informatics
PE6_1 Computer architecture, pervasive computing, ubiquitous computing
PE6_10 Web and information systems, database systems, information retrieval and digital libraries, data fusion
PE7 Systems and Communication Engineering
PE7_2 Electrical engineering: power components and/or systems
4 Foreword
5 Acknowledgments
6 PART 1: Processors
1 Processors for the Internet of Things
1.1. Introduction
In recent years, computing paradigms have evolved significantly. One such paradigm is cloud computing, a centralized paradigm that aims to offer computing as a utility. Another complementary paradigm is edge computing, a decentralized paradigm that aims to offer smart compute capabilities at the edge of the network. A related term is the Internet of Things (IoT), which refers to large numbers of interconnected computing devices aimed at offering services to a great variety of applications.
In this chapter, we specifically focus on IoT edge devices: smart devices at the edge of the network that interact with the “real world”. These devices acquire data from the environment using sensors. This data is subsequently processed locally on the IoT edge device and/or on computing devices in the network. For each application, a proper trade-off must be made about which functions to perform where based on the requirements for computing, bandwidth, latency, connectivity, security, reliability, etc.
The number of IoT edge devices is predicted to grow to tens of billions over the coming years. Some example IoT edge devices are:
– smartphones and tablets;
– smart doorbells with cameras, performing face detection for triggering an alert, accompanied by an image or video, on the owner’s smartphone;
– smart speakers with voice control, employing local speech recognition for a limited vocabulary of voice commands while relaying other speech data into the cloud for more advanced analysis;
– smart sensing devices used in agriculture to monitor and control, for example, soil quality, crop yield and livestock, while sporadically communicating data over cellular connections using, for example, NB-IoT protocols for low power consumption.
Many IoT edge devices are battery-operated and demand an optimized implementation in order to enable a long battery life. Therefore, we must target low power consumption for functions that need to be performed in software locally on the IoT edge device. This, in turn, requires programmable processors that are optimized for executing these software functions efficiently, which is the topic of this chapter.
1.2. Versatile processors for low-power IoT edge devices
1.3. Machine learning inference
1.4. Conclusion
1.5. References
2 A Qualitative Approach to Many-core Architecture
2.1. Introduction
2.2. Motivations and context
2.3. The MPPA3 many-core processor
2.4. The MPPA3 software environments
2.5. Conclusion
2.6. References
3 The Plural Many-core Architecture – High Performance at Low Power
3.1. Introduction
3.2. Related works
3.3. Plural many-core architecture
3.4. Plural programming model
3.5. Plural hardware scheduler/synchronizer
3.6. Plural networks-on-chip
3.7. Hardware and software accelerators for the Plural architecture
3.8. Plural system software
3.9. Plural software development tools
3.10. Matrix multiplication algorithm on the Plural architecture
3.11. Conclusion
3.12. References
4 ASIP-Based Multi-Processor Systems for an Efficient Implementation of CNNs
4.1. Introduction
4.2. Related works
4.3. ASIP architecture
4.4. Single-core scaling
4.5. MPSoC overview
4.6. NoC parameter exploration
4.7. Summary and conclusion
4.8. References
7 PART 2: Memory
5 Tackling the MPSoC Data Locality Challenge
5.1. Motivation
5.2. MPSoC target platform
5.3. Related work
5.4. Coherence-on-demand: region-based cache coherence
5.5. Near-memory acceleration
5.6. The big picture
5.7. Conclusion
5.8. Acknowledgments
5.9. References
6 mMPU: Building a Memristor-based General-purpose In-memory Computation Architecture
6.1. Introduction
6.2. MAGIC NOR gate
6.3. In-memory algorithms for latency reduction
6.4. Synthesis and in-memory mapping methods
6.5. Designing the memory controller
6.6. Conclusion
6.7. References
7 Removing Load/Store Helpers in Dynamic Binary Translation
7.1. Introduction
7.2. Emulating memory accesses
7.3. Design of our solution
7.4. Implementation
7.5. Evaluation
7.6. Related works
7.7. Conclusion
7.8. References
8 Study and Comparison of Hardware Methods for Distributing Memory Bank Accesses in Many-core Architectures
8.1. Introduction
8.2. Basics on banked memory
8.3. Overview of software approaches
8.4. Hardware approaches
8.5. Modeling and experimenting
8.6. Conclusion
8.7. References
8 PART 3: Interconnect and Interfaces
9 Network-on-Chip (NoC): The Technology that Enabled Multi-processor Systems-on-Chip (MPSoCs)
9.1. History: transition from buses and crossbars to NoCs
9.2. NoC configurability
9.3. System-level services
9.4. Hardware cache coherence
9.5. Future NoC technology developments
9.6. Summary and conclusion
9.7. References
10 Minimum Energy Computing via Supply and Threshold Voltage Scaling
10.1. Introduction
10.2. Standard-cell-based memory for minimum energy computing
10.3. Minimum energy point tracking
10.4. Conclusion
10.5. Acknowledgments
10.6. References
11 Maintaining Communication Consistency During Task Migrations in Heterogeneous Reconfigurable Devices
11.1. Introduction
11.2. Background
11.3. Related works
11.4. Proposed communication methodology in hardware context switching
11.5. Implementation of the communication management on reconfigurable computing architectures
11.6. Experimental results
11.7. Conclusion
11.8. References
9 List of Authors
10 Author Biographies
11 Index
12 End User License Agreement
1 Chapter 1
Table 1.1. Input data rates and model complexities for example machine learning ...
Table 1.2. Supported kernels in the embARC MLI library
Table 1.3. Model parameters of the CIFAR-10 CNN graph
Table 1.4. Performance data for the CIFAR-10 CNN graph
2 Chapter 2
Table 2.1. Cyber-security requirements by application area
Table 2.2. Types of network-on-chip interconnects
Table 2.3. Types of VLIW architectures
3 Chapter 4
Table 4.1. Comparison of state-of-the-art CNN accelerators
Table 4.2. Synthesis results for different configurations of the ASIP