Albert Chun-Chen Liu - Artificial Intelligence Hardware Design
Here you can read an introductory excerpt of the e-book «Albert Chun-Chen Liu - Artificial Intelligence Hardware Design» online free of charge, and purchase the full version after reading the excerpt. In some cases an audio version can be played, or the book downloaded via torrent in fb2 format, and a brief summary is provided. Genre: unrecognised, in English. The book's description, foreword, and reader reviews are available on the LibCat library portal.
- Title: Artificial Intelligence Hardware Design
- Author:
- Genre:
- Year: unknown
- ISBN: no data
- Book rating: 4 / 5. Votes: 1
Artificial Intelligence Hardware Design: summary, description and annotation
We offer for reading the annotation, description, summary, or foreword (depending on what the author of «Artificial Intelligence Hardware Design» provided). If you have not found the information you need about the book, write in the comments and we will try to find it.
Learn foundational and advanced topics in Neural Processing Unit design, with real-world examples from leading voices in the field.
Artificial Intelligence Hardware Design: Challenges and Solutions
Artificial Intelligence Hardware Design — read an introductory excerpt online
Below is the text of the book, divided into pages. The reading-position system remembers the last page you read, so you can conveniently read «Artificial Intelligence Hardware Design» online for free without having to search each time for where you left off. Set a bookmark, and you can return at any moment to the page where you stopped reading.
4 Chapter 4
Figure 4.1 Data streaming TCS model.
Figure 4.2 Blaize depth‐first scheduling approach.
Figure 4.3 Blaize graph streaming processor architecture.
Figure 4.4 Blaize GSP thread scheduling.
Figure 4.5 Blaize GSP instruction scheduling.
Figure 4.6 Streaming vs. sequential processing comparison.
Figure 4.7 Blaize GSP convolution operation.
Figure 4.8 Intelligence processing unit architecture [8].
Figure 4.9 Intelligence processing unit mixed‐precision multiplication.
Figure 4.10 Intelligence processing unit single‐precision multiplication.
Figure 4.11 Intelligence processing unit interconnect architecture [9].
Figure 4.12 Intelligence processing unit bulk synchronous parallel model.
Figure 4.13 Intelligence processing unit bulk synchronous parallel execution...
Figure 4.14 Intelligence processing unit bulk synchronous parallel inter‐chi...
5 Chapter 5
Figure 5.1 Deep convolutional neural network hardware architecture.
Figure 5.2 Convolution computation.
Figure 5.3 Filter decomposition with zero padding.
Figure 5.4 Filter decomposition approach.
Figure 5.5 Data streaming architecture with the data flow.
Figure 5.6 DCNN accelerator COL buffer architecture.
Figure 5.7 Data streaming architecture with 1×1 convolution mode.
Figure 5.8 Max pooling architecture.
Figure 5.9 Convolution engine architecture.
Figure 5.10 Accumulation (ACCU) buffer architecture.
Figure 5.11 Neural network model compression.
Figure 5.12 Eyeriss system architecture.
Figure 5.13 2D convolution to 1D multiplication mapping.
Figure 5.14 2D convolution to 1D multiplication – step #1.
Figure 5.15 2D convolution to 1D multiplication – step #2.
Figure 5.16 2D convolution to 1D multiplication – step #3.
Figure 5.17 2D convolution to 1D multiplication – step #4.
Figure 5.18 Output stationary.
Figure 5.19 Output stationary index looping.
Figure 5.20 Weight stationary.
Figure 5.21 Weight stationary index looping.
Figure 5.22 Input stationary.
Figure 5.23 Input stationary index looping.
Figure 5.24 Eyeriss Row Stationary (RS) dataflow.
Figure 5.25 Filter reuse.
Figure 5.26 Feature map reuse.
Figure 5.27 Partial sum reuse.
Figure 5.28 Eyeriss run‐length compression.
Figure 5.29 Eyeriss processing element architecture.
Figure 5.30 Eyeriss global input network.
Figure 5.31 Eyeriss processing element mapping (AlexNet CONV1).
Figure 5.32 Eyeriss processing element mapping (AlexNet CONV2).
Figure 5.33 Eyeriss processing element mapping (AlexNet CONV3).
Figure 5.34 Eyeriss processing element mapping (AlexNet CONV4/CONV5).
Figure 5.35 Eyeriss processing element operation (AlexNet CONV1).
Figure 5.36 Eyeriss processing element operation (AlexNet CONV2).
Figure 5.37 Eyeriss processing element (AlexNet CONV3).
Figure 5.38 Eyeriss processing element operation (AlexNet CONV4/CONV5).
Figure 5.39 Eyeriss architecture comparison.
Figure 5.40 Eyeriss v2 system architecture.
Figure 5.41 Network‐on‐Chip configurations.
Figure 5.42 Mesh network configuration.
Figure 5.43 Eyeriss v2 hierarchical mesh network examples.
Figure 5.44 Eyeriss v2 input activation hierarchical mesh network.
Figure 5.45 Weights hierarchical mesh network.
Figure 5.46 Eyeriss v2 partial sum hierarchical mesh network.
Figure 5.47 Eyeriss v1 neural network model performance [6].
Figure 5.48 Eyeriss v2 neural network model performance [6].
Figure 5.49 Compressed sparse column format.
Figure 5.50 Eyeriss v2 PE architecture.
Figure 5.51 Eyeriss v2 row stationary plus dataflow.
Figure 5.52 Eyeriss architecture AlexNet throughput speedup [6].
Figure 5.53 Eyeriss architecture AlexNet energy efficiency [6].
Figure 5.54 Eyeriss architecture MobileNet throughput speedup [6].
Figure 5.55 Eyeriss architecture MobileNet energy efficiency [6].
6 Chapter 6
Figure 6.1 Neurocube architecture.
Figure 6.2 Neurocube organization.
Figure 6.3 Neurocube 2D mesh network.
Figure 6.4 Memory‐centric neural computing flow.
Figure 6.5 Programmable neurosequence generator architecture.
Figure 6.6 Neurocube programmable neurosequence generator.
Figure 6.7 Tetris system architecture.
Figure 6.8 Tetris neural network engine.
Figure 6.9 In‐memory accumulation.
Figure 6.10 Global buffer bypass.
Figure 6.11 NN partitioning scheme comparison.
Figure 6.12 Tetris performance and power comparison [7].
Figure 6.13 NeuroStream and NeuroCluster architecture.
Figure 6.14 NeuroStream coprocessor architecture.
Figure 6.15 NeuroStream 4D tiling.
Figure 6.16 NeuroStream roofline plot [8].
7 Chapter 7
Figure 7.1 DaDianNao system architecture.
Figure 7.2 DaDianNao neural functional unit architecture.
Figure 7.3 DaDianNao pipeline configuration.
Figure 7.4 DaDianNao multi‐node mapping.
Figure 7.5 DaDianNao timing performance (Training) [1].
Figure 7.6 DaDianNao timing performance (Inference) [1].
Figure 7.7 DaDianNao power reduction (Training) [1].
Figure 7.8 DaDianNao power reduction (Inference) [1].
Figure 7.9 DaDianNao basic operation.
Figure 7.10 Cnvlutin basic operation.
Figure 7.11 DaDianNao architecture.
Figure 7.12 Cnvlutin architecture.
Figure 7.13 DaDianNao processing order.
Figure 7.14 Cnvlutin processing order.
Figure 7.15 Cnvlutin zero free neuron array format.
Figure 7.16 Cnvlutin dispatch.
Figure 7.17 Cnvlutin timing comparison [4].
Figure 7.18 Cnvlutin power comparison [4].
Figure 7.19 Cnvlutin2 ineffectual activation skipping.
Figure 7.20 Cnvlutin2 ineffectual weight skipping.
8 Chapter 8
Figure 8.1 EIE leading nonzero detection network.
Figure 8.2 EIE processing element architecture.
Figure 8.3 Deep compression weight sharing and quantization.
Figure 8.4 Matrix W, vector a and b are interleaved over four processing ele...
Figure 8.5 Matrix W layout in compressed sparse column format.
Figure 8.6 EIE timing performance comparison [1].
Figure 8.7 EIE energy efficient comparison [1].
Figure 8.8 Cambricon‐X architecture.
Figure 8.9 Cambricon‐X processing element architecture.
Figure 8.10 Cambricon‐X sparse compression.
Figure 8.11 Cambricon‐X buffer controller architecture.
Figure 8.12 Cambricon‐X index module architecture.
Figure 8.13 Cambricon‐X direct indexing architecture.
Figure 8.14 Cambricon‐X step indexing architecture.
Figure 8.15 Cambricon‐X timing performance comparison [4].
Figure 8.16 Cambricon‐X energy efficiency comparison [4].
Figure 8.17 SCNN convolution.
Figure 8.18 SCNN convolution nested loop.
Figure 8.19 PT‐IS‐CP‐dense dataflow.
Figure 8.20 SCNN architecture.
Figure 8.21 SCNN dataflow.
Figure 8.22 SCNN weight compression.
Figure 8.23 SCNN timing performance comparison [5].
Figure 8.24 SCNN energy efficiency comparison [5].
Figure 8.25 SeerNet architecture.
Figure 8.26 SeerNet Q‐ReLU and Q‐max‐pooling.
Figure 8.27 SeerNet quantization.
Figure 8.28 SeerNet sparsity‐mask encoding.
9 Chapter 9
Figure 9.1 2.5D interposer architecture.
Figure 9.2 3D stacked architecture.
Figure 9.3 3D‐IC PDN configuration (pyramid shape).
Figure 9.4 PDN – Conventional PDN Manhattan geometry.
Figure 9.5 Novel PDN X topology.
Figure 9.6 3D network bridge.
Figure 9.7 Neural network layer multiple nodes connection.
Figure 9.8 3D network switch.
Figure 9.9 3D network bridge segmentation.
Figure 9.10 Multiple‐channel bidirectional high‐speed link.
Figure 9.11 Power switch configuration.
Figure 9.12 3D neural processing power gating approach.
Figure 9.13 3D neural processing clock gating approach.
Guide
1 Cover Page
2 Series Page
IEEE Press
445 Hoes Lane
Piscataway, NJ 08854
IEEE Press Editorial Board
Ekram Hossain, Editor in Chief
Jón Atli Benediktsson, Xiaoou Li, Jeffrey Reed, Anjan Bose, Lian Yong, Diomidis Spinellis, David Alan Grier, Andreas Molisch, Saeid Nahavandi, Elya B. Joffe, Sarah Spurgeon, Ahmet Murat Tekalp
3 Title Page
4 Copyright Page
Copyright © 2021 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging‐in‐Publication data applied for:
ISBN: 9781119810452
Cover design by Wiley
Cover image: © Rasi Bhadramani/iStock/Getty Images