Machine Vision Inspection Systems, Machine Learning-Based Approaches

Machine Vision Inspection Systems (MVIS) is a multidisciplinary research field that emphasizes image processing, machine vision, and pattern recognition for industrial applications. Inspection techniques are generally used in the destructive and non-destructive evaluation industry. Research on machine inspection has recently gained popularity, because manual inspection can fail and produce false assessments when a large number of items must be examined.
Volume 2 covers machine learning-based approaches in MVIS applications, which can be applied to a wide range of problems, particularly non-destructive testing (NDT), presence/absence detection, defect/fault detection (weld, textile, tiles, wood, etc.), automated vision test and measurement, pattern matching, optical character recognition and verification (OCR/OCV), natural language processing, and medical diagnosis. This edited book addresses recent methodologies, concepts, and research plans, giving readers deeper insight for pursuing research on machine vision using machine learning-based approaches.

Character detection using one-shot learning has been addressed previously by researchers such as Lake et al. [6], using a generative character model, and Koch et al. [7], using convolutional neural networks (CNNs). In this study, we focus on using capsule networks integrated into a Siamese network [8] to learn a generalized abstract function that outputs the similarity of two images. Capsule networks are a recent advancement in the computer vision domain, and they possess several advantages over traditional convolutional layers [9].

Translation invariance, or the inability to identify the position of an object relative to another, is one main shortcoming of convolutional layers compared to capsules [10]. Furthermore, the use of global pooling in CNNs causes loss of valuable information. Hinton et al. [11] have proposed capsule networks as a solution to these problems. In this study, by using a capsule-based network architecture, we achieve performance on par with the deep convolutional Siamese networks proposed in previous literature, but with fewer parameters.
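The information loss caused by global pooling can be illustrated with a toy example. The sketch below is not taken from the chapter: the two 3x3 feature maps and the helper `global_max_pool` are hypothetical, chosen only to show that a globally pooled output cannot tell where an activation occurred.

```python
def global_max_pool(feature_map):
    """Collapse a 2-D feature map to its single largest activation."""
    return max(max(row) for row in feature_map)

# Two hypothetical feature maps: the same activation in opposite corners.
top_left = [[9, 0, 0],
            [0, 0, 0],
            [0, 0, 0]]
bottom_right = [[0, 0, 0],
                [0, 0, 0],
                [0, 0, 9]]

# The maps differ, yet their pooled outputs are identical:
# the position of the feature is discarded.
same_after_pooling = global_max_pool(top_left) == global_max_pool(bottom_right)
print(same_after_pooling)  # True
```

A capsule, by contrast, outputs a vector, so pose information such as position can in principle be kept alongside the evidence that the feature is present.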

The main contributions of the study are as follows:

Propose a novel capsule-based Siamese network architecture to perform one-shot learning,

Improve the energy function of the Siamese network to capture the complex information output by capsules,

Evaluate and analyse the performance of the model in identifying previously unseen characters,

Extend the Omniglot dataset by adding new characters from the Sinhala language.

The chapter is structured as follows. Section 2.2 explores related learning techniques. Section 2.3 describes the design and implementation aspects of the proposed solution, the capsule layers-based Siamese network. Section 2.4 evaluates the methodology using several experiments and analyzes the results. Section 2.5 discusses the contribution of the proposed solution in relation to existing studies and concludes the chapter.

2.2 Background Study

2.2.1 Convolutional Neural Networks

Convolutional neural networks have been commonly used in computer vision research and applications [12] due to their ability to process large amounts of data and extract meaningful and powerful representations from it [13–15]. Before the era of CNNs, computer vision tasks largely relied on handcrafted features and mathematical modeling. A large number of applications rely on features such as Gabor wavelets [16–18], fractal dimensions [19–21], and symmetric axis chords [22].

However, for handwritten character classification in low-resource languages, this dependence of deep neural networks on large amounts of data becomes a limitation, as little labeled training data is available.

An ideal solution for handwritten character recognition should be based on zero-shot learning, where no previous sample is used for classification, or one-shot learning, where only one or a few samples are used for training [23]. Several attempts have been made to modify different deep neural networks to match the requirements of one-shot learning [24–26].

2.2.2 Related Studies on One-Shot Learning

Initial attempts at one-shot learning in the computer vision domain were based on probabilistic approaches. In 2003, Fei-Fei et al. [4] introduced a model that learns visual concepts and then uses that knowledge to learn new categories. They used a variational Bayesian framework, in which probabilistic models represent the object categories and a probability density function denotes the prior knowledge. Their model learned four visual concepts: human faces, aeroplanes, motorcycles, and spotted cats. Initially, abstract knowledge is learned by training on many samples belonging to three of the categories. This knowledge is then used to learn the remaining category from a small number of examples (1 to 5 training examples).

More recently, neural networks have emerged as a solution to the one-shot learning problem. The two main types of networks used in one-shot learning tasks are memory-augmented neural networks [26, 27] and Siamese neural networks [7, 24, 28]. Memory-augmented neural networks are similar to recurrent neural networks (RNNs), but they have an external memory and try to separate computation from memory [29]. Siamese networks have two similar network branches, whose outputs are compared to reach a decision on the one-shot task. Most of the time, Siamese network branches are built on convolutional layers or fully connected layers.
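The two-branch structure can be sketched in a few lines. This is a minimal illustration, not the architecture proposed in this chapter: the weight matrix `SHARED_WEIGHTS`, the single tanh layer in `embed`, and the use of `exp(-L1 distance)` as the comparison are all assumptions made for the example. The point it shows is that both inputs pass through the same weights and only the resulting embeddings are compared.

```python
import math

# Hypothetical shared weights: one 2x3 linear layer used by BOTH branches.
SHARED_WEIGHTS = [[0.5, -0.2, 0.1],
                  [0.3, 0.8, -0.5]]

def embed(x):
    """One branch: a linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in SHARED_WEIGHTS]

def similarity(x1, x2):
    """Compare the two branch outputs; 1.0 means identical embeddings."""
    e1, e2 = embed(x1), embed(x2)
    l1_distance = sum(abs(a - b) for a, b in zip(e1, e2))
    return math.exp(-l1_distance)  # assumed comparison, maps distance to (0, 1]

a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
print(similarity(a, a), similarity(a, b))
```

Because the branches share weights, the comparison is symmetric: swapping the two inputs gives the same score.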

2.2.3 Character Recognition as a One-Shot Task

In 2013, Lake et al. [6] introduced the Omniglot dataset and defined a one-shot learning problem on it as a handwritten character recognition task. Omniglot is a handwritten character dataset similar to the digit dataset MNIST, which stands for Modified National Institute of Standards and Technology database [30]. However, in contrast to MNIST, Omniglot has more than 1,600 characters belonging to 50 alphabets. Each character has only 20 samples, whereas MNIST has only ten classes and thousands of samples for each class. To categorize characters in Omniglot accurately, Lake et al. proposed a one-shot learning approach named Hierarchical Bayesian Program Learning (HBPL) [6]. Their approach is based on decomposing characters into strokes and determining a structural description for the detected pixels. The strokes in different characters are identified using the knowledge gained from previous characters. However, this method cannot be applied to complex images, since it uses stroke data to determine the class. Further, inference under HBPL is difficult because it has a vast parameter space [7]. In the proposed solution with the capsule layers-based Siamese network, we borrow the problem defined by Lake et al. and propose a novel solution that works in a more human-like way.
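A single one-shot classification trial of the kind described above can be sketched as follows. The sketch assumes that a similarity function is already available. Here the toy three-element "images", the class labels, and the stand-in similarity `negative_l1` are all hypothetical. The query is assigned to whichever class's single known example it resembles most.

```python
def one_shot_classify(query, support, similarity):
    """Return the label of the support example most similar to the query.

    support: list of (label, example) pairs, exactly one example per class.
    """
    best_label, _ = max(support, key=lambda pair: similarity(query, pair[1]))
    return best_label

def negative_l1(a, b):
    """Stand-in similarity: larger (closer to zero) means more alike."""
    return -sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical 3-way task: one labelled example per character class.
support = [("ka", [1.0, 0.0, 0.0]),
           ("ga", [0.0, 1.0, 0.0]),
           ("ma", [0.0, 0.0, 1.0])]
query = [0.1, 0.9, 0.2]

predicted = one_shot_classify(query, support, negative_l1)
print(predicted)  # ga
```

In a learned system, `negative_l1` would be replaced by the trained similarity model, while the selection step stays the same.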

The above-mentioned methods need some manual feature engineering, but in human cognition, the required features are learned along with the process of learning new visual concepts. For example, when we observe a car, we spontaneously decompose it into wheels, body, and internal parts. Moreover, we use those learned features to differentiate it from a bicycle. A similar process can be replicated in machines using capsule neural networks.

In 2015, Koch et al. [7] proposed a model using Siamese neural networks as a solution to the one-shot learning problem. They used the same dataset and problem definition as Lake et al. [6], but their model uses convolutional units to build an understanding of the image. According to Hinton et al. [11], CNNs are misguided in what they are trying to achieve and are far from how human visual perception works; hence, they have proposed capsules instead of convolutions.

In this chapter, we present a Siamese neural network based on capsule networks to solve the one-shot learning problem. The idea of the capsule was first proposed by Hinton et al. in 2011 and was later used in numerous applications [31, 32]. Generally, CNNs aim for viewpoint invariance of the “neuron” activities, so that characters can be recognized irrespective of the viewing angle. This is done by using a single scalar output to summarize the activities of replicated feature detectors [9]. In contrast to CNNs, capsule networks use local “capsules” that perform computations on their inputs internally and encapsulate the results into an informative output vector [11]. Sabour et al. [9] have proposed an algorithm to train capsule networks based on the concept of routing by agreement between capsules. Dynamic routing helps to achieve equivariance, while CNNs can only achieve invariance through their pooling layers.
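Routing by agreement can be sketched in plain Python. This is a rough illustration of the dynamic routing procedure described by Sabour et al., not the implementation used in this chapter. The function names, the three routing iterations, and the toy prediction vectors in `u_hat` are assumptions made for the example. Each lower-level capsule sends a prediction vector to every upper-level capsule, and the coupling coefficients grow where predictions agree with the resulting output.

```python
import math

def squash(s):
    """Shrink vector s to a length in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def route(u_hat, iterations=3):
    """u_hat[i][j]: prediction vector from lower capsule i for upper capsule j.

    Returns (v, c): upper capsule outputs and the coupling coefficients
    used in the final iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits start at zero
    for _ in range(iterations):
        c = [softmax(row) for row in b]       # each lower capsule splits its vote
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):                 # reward predictions that agree
            for j in range(n_out):
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c

# Toy input: all three lower capsules agree about upper capsule 0
# and disagree about upper capsule 1.
u_hat = [[[1.0, 0.0], [1.0, 0.0]],
         [[1.0, 0.0], [0.0, 1.0]],
         [[1.0, 0.0], [-1.0, 0.0]]]
v, c = route(u_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in v]
print(lengths, c[0])
```

In this toy run the output vector of upper capsule 0 ends up longer than that of capsule 1, and each lower capsule routes more than half of its vote to capsule 0, which is the behaviour the agreement mechanism is meant to produce.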