Liliana Andrade - Multi-Processor System-on-Chip 1

A Multi-Processor System-on-Chip (MPSoC) is the key component for complex applications. These applications put huge pressure on memory, communication devices and computing units. This book, presented in two volumes (Architectures and Applications), celebrates the 20th anniversary of MPSoC, an interdisciplinary forum that focuses on multi-core and multi-processor hardware and software systems. It is this interdisciplinarity that has led MPSoC to bring together experts in these fields from around the world over the last two decades.

Multi-Processor System-on-Chip 1 covers the key components of MPSoC: processors, memory, interconnect and interfaces. It describes advanced features of these components and the technologies used to build efficient MPSoC architectures. All the main components are detailed: memory usage and technology, communication support and consistency, and specific processor architectures for general purposes or for dedicated applications.

Figure 2.3. Operation of a Volta tensor core (NVIDIA 2020)

Machine learning computations normally rely on FP32 arithmetic; however, significant savings in memory footprint and gains in performance and efficiency can be achieved by using 16-bit representations for training and 8-bit representations for inference, with acceptable precision loss. The main 16-bit formats are FP16; BF16, which is FP32 with the 16 low-order mantissa bits truncated (Intel 2018); and INT16, which covers the 16-bit integer and fixed-point representations (Figure 2.4a). These reduced bit-width formats are, in fact, used as multiplication operands in linear operations, whose results are still accumulated in FP32, INT32 or larger fixed-point representations.
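As a minimal sketch of the BF16 truncation and wide accumulation described above, the following Python fragment drops the 16 low-order bits of an FP32 bit pattern and uses the truncated values as multiplicands. The function names are hypothetical, and truncation (rather than round-to-nearest) is assumed because that is how the text characterizes BF16.

```python
import struct

def fp32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def fp32_to_bf16_bits(x: float) -> int:
    """Truncate FP32 to BF16: keep sign, 8 exponent bits and 7 mantissa bits."""
    return fp32_bits(x) >> 16

def bf16_bits_to_fp32(b: int) -> float:
    """Widen a BF16 bit pattern to FP32 by zero-filling the 16 dropped bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]

def dot_bf16(xs, ws) -> float:
    """Dot product with BF16 multiplicands and a wider accumulator.

    Assumption: Python floats are binary64, so the accumulator here is
    wider than the FP32 accumulator mentioned in the text.
    """
    acc = 0.0
    for x, w in zip(xs, ws):
        xb = bf16_bits_to_fp32(fp32_to_bf16_bits(x))
        wb = bf16_bits_to_fp32(fp32_to_bf16_bits(w))
        acc += xb * wb
    return acc
```

For instance, 3.14159 keeps only seven mantissa bits after truncation and widens back to 3.140625, which illustrates the precision given up in exchange for retaining the full FP32 exponent range.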

While mainstream uses of 8-bit formats in convolutional network inference are signed or unsigned integers (Jacob et al. 2018; Krishnamoorthi 2018), floating-point formats narrower than 16 bits are also being investigated. Their purpose is to eliminate the complexities associated with small integer quantization: fake quantization, where weights and activations are quantized and dequantized in succession during both the forward and backward passes of training; and post-training calibration, where the histogram of activations is collected on a representative dataset to adjust the saturation thresholds. Microsoft introduced the Msfp8 data format (Chung et al. 2018), which is FP16 truncated to 8 bits, with only 2 bits of mantissa left, along with its extension Msfp9. Among the reduced bit-width floating-point formats, however, the Posit8 representations generate the most interest (Carmichael et al. 2019).
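The quantize-then-dequantize step that the text calls fake quantization can be sketched as follows. This is an illustrative fragment under stated assumptions: symmetric signed quantization is assumed, the function names are hypothetical, and `calibrate_scale` uses the maximum absolute value of the samples as a deliberately simple stand-in for the histogram-based threshold selection mentioned above.

```python
def calibrate_scale(samples, num_bits: int = 8) -> float:
    """Pick a symmetric scale from sample activations.

    Assumption: max-abs calibration replaces the histogram-based
    saturation threshold search for the sake of brevity.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(s) for s in samples)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(x: float, scale: float, num_bits: int = 8) -> int:
    """Round x / scale to an integer and saturate to the symmetric range."""
    qmax = 2 ** (num_bits - 1) - 1
    q = round(x / scale)
    return max(-qmax, min(qmax, q))

def dequantize(q: int, scale: float) -> float:
    """Map a quantized integer back to the real-valued domain."""
    return q * scale

def fake_quantize(x: float, scale: float, num_bits: int = 8) -> float:
    """Quantize and immediately dequantize, exposing the rounding error."""
    return dequantize(quantize(x, scale, num_bits), scale)
```

Values beyond the calibrated range saturate at the largest representable magnitude, which is why the choice of saturation threshold matters for accuracy.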

A Posit n.es representation (Figure 2.4b) is parameterized by n, the total number of bits, and es, the number of exponent bits (Gustafson and Yonemoto 2017). The main difference from an IEEE 754 binary floating-point representation is the regime field, which has a dynamic width and encodes a power of $2^{2^{es}}$ in unary numerals. de Dinechin et al. (2019) discuss the advantages and disadvantages of Posit representations. They advise the use of Posit as a storage-only format in order to benefit from the compact encoding, while still relying on standard IEEE binary floating-point arithmetic for numerical guarantees. Experimentally, Posit8 numbers provide an effective compressed representation of FP32 network weights and activations by rounding them to Posit8.1 or Posit8.2 numbers (Resmerita et al. 2020). Another approach is to use Posit8.1 in a log domain for the multiplicands, while converting to a linear domain for the accumulations (Johnson 2018). In both cases, however, the large dynamic range that motivates the use of the Posit representations in machine learning inference requires high-precision or exact accumulations.
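To make the role of the regime field concrete, here is a hedged sketch of a Posit n.es decoder following the general scheme of Gustafson and Yonemoto (2017). The function name and the use of `None` for the not-a-real pattern are choices made for this example; exact rational arithmetic is used so that no rounding interferes with the decoded value.

```python
from fractions import Fraction

def decode_posit(bits: int, n: int = 8, es: int = 1):
    """Decode an n-bit Posit with es exponent bits into an exact Fraction.

    Returns None for the not-a-real (NaR) pattern 100...0.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None
    negative = bool(bits >> (n - 1))
    if negative:
        bits = (-bits) & mask  # negative Posits are stored in two's complement
    body = format(bits & (mask >> 1), f"0{n - 1}b")  # bits after the sign
    # Regime: a unary run of identical bits, ended by the opposite bit.
    first = body[0]
    run = len(body) - len(body.lstrip(first))
    k = run - 1 if first == "1" else -run
    rest = body[run + 1:]  # skip the terminating bit, if present
    # Exponent bits cut off by a long regime are read as zeros.
    exp_bits = rest[:es].ljust(es, "0")
    e = int(exp_bits, 2) if es > 0 else 0
    frac_bits = rest[es:]
    frac = Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(0)
    # The regime scales by (2**(2**es))**k, i.e. a binary exponent of k * 2**es.
    value = (1 + frac) * Fraction(2) ** (k * (1 << es) + e)
    return -value if negative else value
```

With this sketch, the largest Posit8.1 pattern `0x7F` decodes to 4096 while the largest Posit8.2 pattern decodes to 2^24, which illustrates the wide dynamic range that makes high-precision or exact accumulation necessary.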

Figure 2.4. Numerical formats used in deep learning inference (adapted from Gustafson (2017) and Rodriguez et al. (2018))

2.2.3. Application requirements

In the case of automated driving applications (Figure 2.5), the perception and the path planning functions require programmability, high performance and energy efficiency, which leads to the use of multi-core or GPGPU many-core processors. Multi-core processing entails significant sharing of execution resources in the memory hierarchy, which negatively impacts time predictability (Wilhelm and Reineke 2012). Even with a predictable execution model (Forsberg et al. 2017), the functional safety of perception and path planning functions may only reach ISO 26262 ASIL-B. Conversely, vehicle control algorithms, as well as sensor and actuator management, must be delegated to electronic control units that are specifically designed to host ASIL-D functions.

Similarly, unmanned aerial vehicle applications targeted by the MPPA processor are composed of two criticality domains, one being safety-critical and the other non-safety-critical (Figure 2.6). On the MPPA processor, these two domains can be segregated by physical isolation mechanisms, ensuring that no execution resources can be shared between them. The safety-critical domain hosts the trajectory control partition (DO-178C DAL-A/B). The non-critical domain hosts a secured communication partition (ED-202 SAL-3), a data management partition (DAL-E) running on Linux, and machine learning and other embedded high-performance computing partitions running on a lightweight POSIX OS. Of interest is the fact that the secured partition is located in the non-safety-critical domain, as the availability requirements of functional safety are incompatible with the integrity requirements of cyber-security.

Figure 2.5. Autoware automated driving system functions (CNX 2019)

Figure 2.6. Application domains and partitions on the MPPA3 processor

Finally, embedded applications in the areas of defense, avionics and automotive have common requirements in the area of cyber-security (Table 2.1). The foundation is the availability of a hardware root of trust (RoT), i.e. a secured component that can be inherently trusted. Such a RoT can be provided either as an external hardware security module (HSM), or integrated into the system-on-chip as a central security module (CSM). In both cases, this security module maintains the device's critical security parameters (CSP), such as public authentication keys, device identity and master encryption keys, in a non-volatile secured memory. The security module embeds a TRNG as well as hashing, symmetric and public-key cryptographic accelerators, in order to support the chain of trust through digital signature verification of firmware and software.
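The chain-of-trust idea can be illustrated with a small sketch. Note the substitution: instead of the digital signature verification described above, this example pins SHA-256 digests, with each stage carrying the expected digest of the next one and a single root digest standing in for a critical security parameter held by the security module. All names (`Stage`, `build_chain`, `first_untrusted_stage`) are hypothetical and do not correspond to any MPPA or HSM interface.

```python
import hashlib
import hmac
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    code: bytes
    next_digest: bytes = b""  # expected digest of the next stage; empty for the last

    def measure(self) -> bytes:
        """Digest covering both the code and the pinned digest of the next stage."""
        return hashlib.sha256(self.code + self.next_digest).digest()

def build_chain(codes):
    """Build stages from last to first; return (root_digest, stages)."""
    stages = []
    next_digest = b""
    for code in reversed(codes):
        stage = Stage(code, next_digest)
        stages.append(stage)
        next_digest = stage.measure()
    stages.reverse()
    return next_digest, stages

def first_untrusted_stage(root_digest: bytes, stages):
    """Return the index of the first stage failing verification, or None."""
    expected = root_digest
    for index, stage in enumerate(stages):
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(stage.measure(), expected):
            return index
        expected = stage.next_digest
    return None
```

In this simplified model, only the root digest needs to live in secured non-volatile memory: tampering with any later stage changes a digest that an earlier, already verified stage has pinned.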

Table 2.1. Cyber-security requirements by application area

	Defense	Avionics	Automotive
Hardware root of trust
Physical attack protection
Software and firmware authentication
Boot firmware confidentiality
Application code confidentiality
Event data record integrity