Applied Modeling Techniques and Data Analysis 2

BIG DATA, ARTIFICIAL INTELLIGENCE AND DATA ANALYSIS SET Coordinated by Jacques Janssen

Part 2 covers the area of applied stochastic and statistical models and methods and comprises eight chapters: Chapter 10, “The Double Flexible Dirichlet: A Structured Mixture Model for Compositional Data”, by Roberto Ascari, Sonia Migliorati and Andrea Ongaro; Chapter 11, “Quantization of Transformed Lévy Measures”, by Mark Anthony Caruana; Chapter 12, “A Flexible Mixture Regression Model for Bounded Multivariate Responses”, by Agnese M. Di Brisco and Sonia Migliorati; Chapter 13, “On Asymptotic Structure of the Critical Galton-Watson Branching Processes with Infinite Variance and Allowing Immigration”, by Azam A. Imomov and Erkin E. Tukhtaev; Chapter 14, “Properties of the Extreme Points of the Joint Eigenvalue Probability Density Function of the Wishart Matrix”, by Asaph Keikara Muhumuza, Karl Lundengård, Sergei Silvestrov, John Magero Mango and Godwin Kakuba; Chapter 15, “Forecast Uncertainty of the Weighted TAR Predictor”, by Francesco Giordano and Marcella Niglio; Chapter 16, “Revisiting Transitions Between Superstatistics”, by Petr Jizba and Martin Prokš; Chapter 17, “Research on Retrial Queue with Two-Way Communication in a Diffusion Environment”, by Viacheslav Vavilov.

We wish to thank all the authors for their insights and excellent contributions to this book. We would like to acknowledge the assistance of all those involved in the reviewing process of this book, without whose support this could not have been successfully completed. Finally, we wish to express our thanks to the secretariat and, of course, the publishers. It was a great pleasure to work with them in bringing to life this collective volume.

Yannis DIMOTIKALIS

Crete, Greece

Alex KARAGRIGORIOU

Samos, Greece

Christina PARPOULA

Athens, Greece

Christos H. SKIADAS

Athens, Greece

December 2020

PART 1 Financial and Demographic Modeling Techniques

1

Data Mining Application Issues in the Taxpayer Selection Process

This chapter provides a data analysis framework designed to build an effective learning scheme aimed at improving the Italian Revenue Agency’s ability to identify non-compliant taxpayers, with special regard to self-employed individuals allowed to keep simplified registers. Our procedure involves building two C4.5 decision trees, both trained and validated on a sample of 8,000 audited taxpayers, but predicting two different class values, based on two different predictive attribute sets. That is, the first model is built in order to identify the most likely non-compliant taxpayers, while the second identifies the ones that are less likely to pay the additional due tax bill. This twofold selection target is needed in order to maximize the overall audit effectiveness. Once both models are in place, the taxpayer selection process will be run in such a way that businesses will only be audited if they are judged as worthy by both models. This methodology will soon be validated on real cases: that is, a sample of taxpayers will be selected according to the classification criteria developed in this chapter and will subsequently be audited.
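For readers unfamiliar with C4.5, the following minimal sketch illustrates the gain ratio, the criterion that C4.5 uses to choose which attribute to split on. It is a generic illustration in plain Python, not the chapter's implementation; the attribute values and class labels in the usage lines are made-up toy data.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """C4.5-style gain ratio of a categorical attribute with respect to the class."""
    total = len(labels)
    # Group the class labels by the attribute value of each record.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: class entropy minus the weighted entropy after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes that produce many small branches.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): does owning an asset separate the two classes?
owns_asset = ["yes", "yes", "no", "no"]
outcome = ["paid", "paid", "enforced", "enforced"]
print(gain_ratio(owns_asset, outcome))
```

An attribute that splits the classes perfectly scores highest, while an attribute with a single value carries no split information and scores zero here.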

1.1. Introduction

Fraud detection systems are designed to automate and help reduce the manual parts of a screening/checking process (Phua et al. 2005). Data mining plays an important role in fraud detection, as it is often applied to extract fraudulent behavior profiles hidden in large quantities of data and may thus be useful in decision support systems for planning effective audit strategies. Indeed, huge amounts of resources (to put it bluntly, money) may be recovered from well-targeted audits. This explains the increasing interest and investment of both governments and fiscal agencies in intelligent systems for audit planning. The Italian Revenue Agency (hereafter, IRA) itself has been studying data mining techniques in order to detect tax evasion, focusing, for instance, on the tax credit system intended to support investments in disadvantaged areas (de Sisti and Pisani 2007); on fraud related to credit mechanisms in value-added tax, a tax levied on the price of a product or service at each stage of production, distribution or sale to the end consumer, except where a business is the end consumer, which will reclaim this input value (Basta et al. 2009); and on income indicator audits (Barone et al. 2017).

This chapter contributes to the empirical literature on the development of classification models applied to the tax evasion field, presenting a case study that focuses on a dataset of 8,000 taxpayers audited for the fiscal year 2012, each of them described by a set of features concerning, among other things, their tax returns, their properties and their tax notice.

In this context, all the taxpayers are in some way “unfaithful”, since all of them have received a tax notice that somehow rectified the tax return they had filed. Thus, the predictive analysis tool we develop is designed to find patterns in data that may help tax offices recognize only the riskiest taxpayers’ profiles.

The data at hand show that our first model, which is described in detail later, is able to distinguish the taxpayers who are worthy of closer investigation from those who are not.

However, by defining the class value as a function of the higher due taxes, we satisfy the need to focus on the taxpayers who are more likely to be “significant” tax evaders, but we do not ensure an efficient collection of their tax debt. Indeed, the data show that as the tax bill increases, the number of coercive collection procedures put in place also increases. Unfortunately, these procedures are highly inefficient, as they collect only about 5% of the overall credits claimed against the audited taxpayers (Italian Court of Auditors 2016). As a result, the tax authorities’ ability to collect the due taxes may be jeopardized.

Further analysis is thus devoted to identifying, among the “significant” evaders, the most solvent ones. We recall that the 2018–2020 Agreement between the IRA and the Ministry of Finance states that audit effectiveness is measured, among other things, by an indicator equal to the sum of the collected due taxes, which summarizes the effectiveness of the IRA’s efforts to tackle tax evasion (Ministry of Economy and Finance – IRA Agreement for 2018–2020 2018). This is a reasonable indicator: the ordinary activities undertaken in the fight against tax evasion are crucial from the State budget point of view, since public expenditure (i.e. public services) strictly depends on the amount of public revenue. Of course, fraud and other incorrect fiscal behaviors may be tackled even when no tax collection is guaranteed, in order to reach the maximum tax compliance. Such extra activities may also be conducted jointly with the Finance Guard or the Public Prosecutor if tax offenses arise.

To tackle our second problem, i.e. to guarantee a certain degree of due tax collection, we start from a simple observation: a taxpayer with no assets has little incentive to pay his dues, whereas a taxpayer with something to lose (a home or a car that could be seized) is more likely to reach an agreement with the tax authorities, provided that the IRA’s claim is right.

A second model is therefore built on only a few features, indicating whether or not the taxpayer owned some kind of asset, in order to predict each tax notice’s final status (in this case, we only distinguish between statuses ending with an enforced recovery proceeding and statuses where no such proceeding takes place). Once both models are available, the taxpayer selection process is run in such a way that businesses are only audited if they are judged as worthy by both models.
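The joint selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two trained trees are replaced by hypothetical stand-in functions, and the field names `flagged_significant` and `expects_enforced_recovery` are invented for the example.

```python
from typing import Callable, Dict, Iterable, List

Taxpayer = Dict[str, object]
Model = Callable[[Taxpayer], bool]


def select_for_audit(
    taxpayers: Iterable[Taxpayer],
    is_significant_evader: Model,
    is_likely_to_pay: Model,
) -> List[Taxpayer]:
    """Keep only the taxpayers that both models judge worth auditing."""
    return [
        t for t in taxpayers
        if is_significant_evader(t) and is_likely_to_pay(t)
    ]


# Hypothetical stand-ins for the two trained decision trees.
def first_model(t: Taxpayer) -> bool:
    # Assumed output of the first tree: likely "significant" evader.
    return bool(t["flagged_significant"])


def second_model(t: Taxpayer) -> bool:
    # Assumed output of the second tree: no enforced recovery expected.
    return not t["expects_enforced_recovery"]


# Made-up records, for illustration only.
sample = [
    {"id": "A", "flagged_significant": True, "expects_enforced_recovery": True},
    {"id": "B", "flagged_significant": True, "expects_enforced_recovery": False},
    {"id": "C", "flagged_significant": False, "expects_enforced_recovery": False},
]
selected = select_for_audit(sample, first_model, second_model)
print([t["id"] for t in selected])  # prints ['B']
```

Passing the two models in as functions keeps the selection rule independent of how each tree is trained, so either model could be replaced without touching the rule itself.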
