Artificial intelligence is a field of study in computer science. Much like the field of medicine, it encompasses many sub-disciplines, specializations, and techniques.
Semantic networks and symbolic reasoning
Also known as good old-fashioned AI (GOFAI), semantic networks and symbolic reasoning dominated solutions during the first three decades of AI development in the form of rules engines and expert systems.
Semantic networks are a way to organize relationships between words, or more precisely, relationships between concepts as expressed with words, which are gathered to form a specification of the known entities and relationships in the system, also called an ontology.
The “is a” relationship takes the form “X is a Y” and establishes the basis of a taxonomic hierarchy. For example: A monkey is a primate. A primate is a mammal. A mammal is a vertebrate. A human is a primate. With this information, the system can link human not only with primate but also with mammal and vertebrate, because each node inherits the properties of the nodes above it.
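The inheritance described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular expert system stores its ontology: the `IS_A` dictionary and the `ancestors` and `is_kind_of` helpers are hypothetical names, and the sketch assumes each concept has exactly one parent and that the hierarchy contains no cycles.

```python
# Toy "is a" hierarchy taken from the example: each concept maps to its parent.
# Assumption: one parent per concept and no cycles.
IS_A = {
    "monkey": "primate",
    "human": "primate",
    "primate": "mammal",
    "mammal": "vertebrate",
}


def ancestors(concept, parents=IS_A):
    """Return every category reached by following "is a" links upward."""
    chain = []
    while concept in parents:
        concept = parents[concept]
        chain.append(concept)
    return chain


def is_kind_of(concept, category, parents=IS_A):
    """True if the category appears anywhere above the concept."""
    return category in ancestors(concept, parents)


print(ancestors("human"))                 # ['primate', 'mammal', 'vertebrate']
print(is_kind_of("monkey", "vertebrate")) # True
```

Only the four stated facts are stored; the link between human and vertebrate is derived by walking the chain rather than being entered by hand.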
However, the meaning of monkey as a verb, as in “don’t monkey with that,” has no relationship to primates, and neither does monkey as an adjective, as in monkey bread, monkey wrench, or monkey tree, which aren’t related to each other either. Now you start to get an inkling of the challenge facing data scientists.
Another relationship, the case relationship, maps out the elements of a sentence based on the verb and the associated subject, object, and recipient, as applicable. Table 1-1 shows a case relationship for the sentence “The boy threw a bone to the dog.”
TABLE 1-1 Case Relationship for a Sentence
Case | Threw |
Agent | Boy |
Object | Bone |
Recipient | Dog |
The case relationship for other uses of “threw” won’t necessarily follow the same structure.
The pitcher threw the game.
The car threw a rod.
The toddler threw a tantrum.
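One way to picture a case relationship in code is as a small record with one slot per role. The sketch below is illustrative only: the `CaseFrame` class and its field names are assumptions made for this example, not a structure defined in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseFrame:
    """Hypothetical record holding the roles attached to one verb."""
    verb: str
    agent: str
    obj: str
    recipient: Optional[str] = None  # not every sentence has a recipient

    def roles(self):
        """Return only the slots that are actually filled."""
        filled = {"Case": self.verb, "Agent": self.agent, "Object": self.obj}
        if self.recipient is not None:
            filled["Recipient"] = self.recipient
        return filled


boy = CaseFrame(verb="threw", agent="boy", obj="bone", recipient="dog")
pitcher = CaseFrame(verb="threw", agent="pitcher", obj="game")

print(boy.roles())      # all four roles are filled
print(pitcher.roles())  # no recipient slot
```

The second frame fills the agent and object slots just as the first does, yet nothing in the record signals that “threw the game” means losing on purpose. The structure captures the grammar of the sentence, not the sense of the verb.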
Early iterations of rules engines and expert systems were code-driven, meaning much of the system was built on manually coded algorithms. Consequently, they were cumbersome to maintain and modify and thus lacked scalability. The availability of big data set the stage for the development of data-driven models. Symbolic AI evolved using the combination of machine-learning ontologies and statistical text mining to get the extra oomph that powers the current AI renaissance.
The information age has produced a super-abundance of data, a kind of potential digital energy that AI scientists mine and refine to power modern commerce, research, government, and other endeavors.
Data mining processes structured data, such as the data found in corporate enterprise resource planning (ERP) systems or customer databases, and applies modeling functions to produce actionable information. Analytics and business intelligence (BI) platforms can quickly identify and retrieve information from large datasets of structured data and apply the data mining functions described here to create models that enable descriptive, predictive, and prescriptive analytics:
Association: This determines the probability that two contemporaneous events are related. For example, in sales transactions, the association function can uncover purchase patterns, such as when a customer who buys milk also buys cereal.
Classification: This reveals patterns that can be used to categorize an item. For example, weather prediction depends on identifying patterns in weather conditions (such as rising or dropping air pressure) to predict whether it will be sunny or cloudy.
Clustering: This organizes data by identifying similarities and grouping elements into clusters to reveal new information. One example is segmenting customers by gender, marital status, or neighborhood.
Regression: This predicts a numeric value depending on the variables in a given dataset. For example, the price of a used car can be determined by analyzing its age, mileage, condition, option packages, and other variables.
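Two of the functions above, association and regression, are small enough to sketch in plain Python. The figures below are invented purely for illustration, the helper names `association_confidence` and `fit_line` are hypothetical, and a real BI platform would rely on far more robust methods. The association measure shown is the share of baskets containing one item that also contain another, and the regression is an ordinary least-squares line with a single predictor.

```python
def association_confidence(transactions, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sales baskets: how often does milk come with cereal?
baskets = [
    {"milk", "cereal"},
    {"milk", "bread"},
    {"milk", "cereal", "eggs"},
    {"bread"},
]
confidence = association_confidence(baskets, "milk", "cereal")
print(round(confidence, 2))  # 0.67

# Made-up used-car data: age in years against price.
ages = [1, 2, 3, 4]
prices = [20000, 18000, 16000, 14000]
slope, intercept = fit_line(ages, prices)
print(slope * 5 + intercept)  # estimated price of a five-year-old car
```

A used-car model in practice would combine mileage, condition, and option packages as well; age alone is used here to keep the arithmetic visible.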
Because data mining works on the structured data within the organization, it is particularly suited to deliver a wide range of operational and business benefits. For example, data mining can crunch data from IoT systems to enable the predictive maintenance of factory equipment or combine historical sales data with customer behaviors to predict future sales and patterns of demand.
Text mining deals with unstructured data, which must be organized and structured before applying data modeling and analytics. Using natural-language processing (NLP), text-mining software can extract data elements to populate the structured metadata fields such as author, date, and content summary that enable analysis.
Text mining can go beyond data mining to synthesize vast amounts of content to identify people, places, things, events, and time frames mentioned in written text, assign emotional tone to each mention of them (negative, positive, or neutral), and even understand whether the document is factual or opinion.
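As a rough illustration of assigning an emotional tone to a piece of text, the sketch below counts words from two tiny hand-made lists. This is a deliberately naive stand-in: production text-mining software uses trained NLP models, and the word lists and the `tone` function here are assumptions made up for the example.

```python
import re

# Hypothetical mini-lexicons; real systems learn these signals from data.
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "hate", "terrible", "broken"}


def tone(sentence):
    """Label a sentence positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(tone("I love the new dashboard"))        # positive
print(tone("The export feature is broken"))    # negative
print(tone("The report was filed on Monday"))  # neutral
```

A word-counting approach like this fails on negation and sarcasm ("not bad at all"), which is one reason statistical language models took over this task.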
Text mining is important for its ability to digest unstructured textual data, which contains more context and valuable insights than structured, transactional data, because it reflects the author’s opinion, intention, emotion, and conclusions.
In 2018, Google introduced a technique for NLP pre-training called Bidirectional Encoder Representations from Transformers (BERT). This technique replaces ontologies with statistics-based mining to ratchet up the relevance of search results.
With AI and machine learning comes an assumption that the more clean data you have, the more accurate your predictions become. But this also assumes you have the horsepower to process and analyze that data quickly, at scale, without dimming the city’s lights. To be effective at customer analysis, AI solutions must process immense amounts of data efficiently and scale to meet increasing volumes of data over time as it is collected and persisted.
Table 1-2 compares and contrasts the properties and uses of data mining versus text mining.
TABLE 1-2 Data Mining Versus Text Mining
Property | Data Mining | Text Mining |
Overview | Data mining searches for patterns and relationships in structured data. | Text mining transforms unstructured textual data into structured information to enable data analysis. |
Data Type | Structured data from large datasets is found in systems such as databases, spreadsheets, ERP, and accounting applications. | Unstructured textual data is found in emails, documents, presentations, videos, file shares, social media, and the Internet. |
Data Retrieval | Structured data is homogenous and organized, making it easy to retrieve. | Unstructured textual data comes in many different formats and content types located in a more diverse range of applications and systems. |
Data Preparation | Structured data is formal and formatted, facilitating the process of ingesting data into analytical models. | Linguistic and statistical techniques (including NLP keywording and meta-tagging) must be applied to turn unstructured text into usable structured data. |
Taxonomy | There is no need to create an overriding taxonomy. | A global taxonomy must be applied to organize the data into a common framework. |