A Corpus Driven Computational Intelligence Framework for Deception Detection in Financial Text

Minhas, Saliha Z

Financial fraud rampages onwards, seemingly uncontained. The annual cost of fraud in the UK is estimated to be as high as £193bn [1]. From a data science perspective, and hitherto less explored, this thesis demonstrates how the use of linguistic features to drive data mining algorithms can aid...

Bibliographic Details
Main author:	Minhas, Saliha Z (Author)
Document type:	Electronic Book
Language:	English
Published:	2016
In:	Year: 2016
Online access:	Full text (free of charge)
Check availability:	HBZ Gateway

MARC


LEADER	00000cam a22000002c 4500
001	1866149008
003	DE-627
005	20250115054906.0
007	cr uuu---uuuuu
008	231018s2016 xx \|\|\|\|\|o 00\| \|\|eng c
035			\|a (DE-627)1866149008
035			\|a (DE-599)KXP1866149008
040			\|a DE-627 \|b ger \|c DE-627 \|e rda
041			\|a eng
084			\|a 2,1 \|2 ssgn
100	1		\|a Minhas, Saliha Z \|e VerfasserIn \|4 aut
245	1	2	\|a A Corpus Driven Computational Intelligence Framework for Deception Detection in Financial Text
264		1	\|c 2016
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
520			\|a Financial fraud rampages onwards, seemingly uncontained. The annual cost of fraud in the UK is estimated to be as high as £193bn [1]. From a data science perspective, and hitherto less explored, this thesis demonstrates how the use of linguistic features to drive data mining algorithms can aid in unravelling fraud. To this end, the spotlight is turned on Financial Statement Fraud (FSF), known to be the costliest type of fraud [2]. A new corpus of 6.3 million words is composed of 102 annual reports/10-K (narrative sections) from firms formally indicted for FSF, juxtaposed with 306 non-fraud firms of similar size and industrial grouping. Unlike other similar studies, this thesis takes a wide-angled view and extracts a range of features of different categories from the corpus. These linguistic correlates of deception are uncovered using a variety of techniques and tools. Corpus linguistics methodology is applied to extract keywords and to examine linguistic structure. N-grams are extracted to draw out collocations. Readability measurement in financial text is advanced through the extraction of new indices that probe the text at a deeper level. Cognitive and perceptual processes are also picked out. Tone, intention and liquidity are gauged using customised word lists. Linguistic ratios are derived from grammatical constructs and word categories. An attempt is also made to determine ‘what’ was said as opposed to ‘how’. Further, a new module is developed to condense synonyms into concepts. Lastly, frequency counts of keywords unearthed by a previous content analysis study on financial narrative are also used. These features are then used to drive machine-learning-based classification and clustering algorithms to determine whether they aid in discriminating a fraud firm from a non-fraud firm. The battery of models built typically exceeds a classification accuracy of 70%. The above process is amalgamated into a framework. 
The process outlined, driven by empirical data, demonstrates in a practical way how linguistic analysis could aid in fraud detection and also constitutes a unique contribution to deception detection studies.
856	4	0	\|u https://core.ac.uk/download/82960676.pdf \|x Verlag \|z kostenfrei \|3 Volltext
935			\|a mkri
951			\|a BO
ELC			\|a 1
LOK			\|0 000 xxxxxcx a22 zn 4500
LOK			\|0 001 4391829762
LOK			\|0 003 DE-627
LOK			\|0 004 1866149008
LOK			\|0 005 20231018043713
LOK			\|0 008 231018\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|ger\|\|\|\|\|\|\|
LOK			\|0 035 \|a (DE-2619)CORE18781064
LOK			\|0 040 \|a DE-2619 \|c DE-627 \|d DE-2619
LOK			\|0 092 \|o n
LOK			\|0 852 \|a DE-2619
LOK			\|0 852 1 \|9 00
LOK			\|0 935 \|a core
OAS			\|a 1
ORI			\|a SA-MARC-krimdoka001.raw