The technical seminar report presents a big data-driven financial auditing method utilizing convolutional neural networks (CNNs) to enhance the accuracy and efficiency of audits. It addresses the challenges faced by traditional auditing methods in the big data era and proposes a novel approach that integrates advanced data processing techniques and AI. The report includes a literature survey, objectives, and performance results from simulations, demonstrating the effectiveness of the proposed methodology in real-world financial auditing scenarios.


A Technical Seminar Report

on

BIG DATA DRIVEN FINANCIAL AUDITING METHOD


USING CONVOLUTION NEURAL NETWORK

Submitted to CVR College of Engineering


by

Pandari Shrikar
20B81A05N6

As Part of Academic Requirement for B. Tech Degree

Department of Computer Science and Engineering

CVR COLLEGE OF ENGINEERING


(An UGC Autonomous Institute, Accredited by NAAC with ‘A’ Grade)

Academic Year 2023 – 2024


CVR COLLEGE OF ENGINEERING
(An UGC Autonomous Institute, Accredited by NAAC with ‘A’ Grade)

Department of Computer Science and Engineering

CERTIFICATE

This is to certify that the technical seminar report titled A Big Data-Driven
Financial Auditing Method Using Convolutional Neural Network is submitted
by Pandari Shrikar, bearing H.T. No: 20B81A05N6, as part of academic
requirement of Graduate Engineering Program in Computer Science and
Engineering.

Class Technical Seminar Coordinator: Dr. M. Swami Das
Head of the Department: Dr. A. Vani Vathsala


ACKNOWLEDGEMENT

I sincerely thank Dr. K. Ramamohan Reddy, Principal, CVR College of Engineering, for his
cooperation and encouragement throughout the technical seminar.

I earnestly thank Dr. A. Vani Vathsala, Head of Department, Department of Computer Science
and Engineering, CVR College of Engineering, for giving timely cooperation and taking necessary
action throughout the course of my technical seminar.

I express my sincere thanks and gratitude to my Seminar Coordinator Dr. K. Venkatesh Sharma,
Department of Computer Science and Engineering, CVR College of Engineering, for his valuable
help and encouragement throughout the technical seminar.

I express my sincere thanks and gratitude to my Professor In-charge, Dr. R. K. Selva Kumar,
Department of Computer Science and Engineering, CVR College of Engineering, for her valuable
help and encouragement throughout the technical seminar.

I express my sincere thanks and gratitude to my Section In-charge and Class Seminar coordinator
Dr. M. Swami Das, Department of Computer Science and Engineering, CVR College of
Engineering, for her valuable help and encouragement throughout the technical seminar.

Finally, I thank all those whose guidance helped me in this regard. I place on record my sincere
appreciation and indebtedness to my parents and all the lecturers for their understanding and
cooperation, without whose encouragement and blessing it would not have been possible to
complete this work.

With Regards,
P. Shrikar

20B81A05N6
LIST OF FIGURES:
Figure 1 Algorithm framework of convolutional neural network-based data fusion technology
Figure 2 Internet financial audit supervision object diagram
Figure 3 Technical logic of audit
Figure 4 Flow chart of audit method
Figure 5 Algorithm performance results
Figure 6 ELM model validation curve
Figure 7 Scatter plot of loans under all loan accounts
Figure 8 Financial asset allocation, economic policy uncertainty, and audit fees with different maturities
Figure 9 An example of an unsuccessful training result for the model

ABBREVIATIONS:
1DCNN: 1-Dimensional Convolutional Neural Network
ADASYN: Adaptive Synthetic Sampling
Adaboost: Adaptive Boosting
AI: Artificial Intelligence
AUC: Area Under the Curve
CNN: Convolutional Neural Network
DT: Decision Tree
ELM: Extreme Learning Machine
F-measure: F1 Score (a measure of a test's accuracy)
G-mean: Geometric Mean (a measure of classifier performance)
KNN: k-Nearest Neighbors
LR: Logistic Regression
NCR: Neighbourhood Cleaning Rule
P2P: Peer-to-Peer
RBO: Radial-Based Oversampling
SME: Small and Medium-sized Enterprise
SMOTE: Synthetic Minority Over-sampling Technique
SMOTE-RBU: SMOTE combined with Radial-Based Undersampling
SMOTETomek: SMOTE and Tomek links
SVM: Support Vector Machine
XGBoost: eXtreme Gradient Boosting
NLP: Natural Language Processing

CONTENTS

1. ABSTRACT
2. INTRODUCTION
3. MOTIVATION AND LITERATURE SURVEY
3.1. MOTIVATION
3.2. LITERATURE SURVEY
4. OBJECTIVE
5. TOPIC DESCRIPTION
5.1. CONVOLUTION NEURAL NETWORK-BASED DATA FUSION
5.2. DESIGN OF FINANCIAL AUDITING MODEL
5.3. ALGORITHM PERFORMANCE RESULTS
5.4. ANALYSIS OF THE APPLICATION RESULTS OF THE FINANCIAL AUDIT MODEL
6. CONCLUSION
QUESTIONS ASKED BY EXPERTS
REFERENCES

1. ABSTRACT

In the big data era, traditional auditing methods face challenges such as limited audit scope, uneven
distribution of audit power, and insufficient audit analysis. In pursuit of higher efficiency, applying big
data analysis techniques to financial auditing has become a new trend in this area. Deep learning has become
popular in many areas due to its high degree of freedom. This technical report therefore employs a typical
deep learning model, the convolutional neural network (CNN), and proposes a big data-driven financial
auditing method using CNN. Specifically, the strong feature abstraction ability of CNN is leveraged to
extract multi-level features, such as visual and textual features, from auditing materials. The multi-source
features from auditing materials are then fused for final discrimination. Simulation experiments are
conducted on real-world financial auditing scenes for assessment, and the results show that the proposed
financial auditing method achieves relatively high auditing accuracy.

2. INTRODUCTION
In the age of big data, governments and enterprises are leveraging advanced technologies to enhance
governance and operational efficiency. Financial auditing, a crucial aspect of economic stability, has faced
challenges due to the exponential growth in data volume. To address this, artificial intelligence (AI)
algorithms have emerged as promising tools for intelligent auditing. Particularly, with the rise of internet
finance enterprises catering to the funding needs of small and medium-sized enterprises (SMEs), the demand
for auditing services has surged.

In this context, cloud computing and financial auditing methods have been integrated to improve audit
quality and comparability. Cloud auditing models store financial transaction data on cloud platforms,
enabling auditors to access advanced software and enhance audit efficiency. However, challenges persisted,
including low efficiency and data overload. A novel approach was introduced, utilizing geometric data
reduction techniques to optimize efficiency and achieve desired outcomes.

The financial industry's evolution towards big data finance has driven advancements in data center
infrastructure and software systems. With these advancements, financial risks have become more complex,
necessitating innovative approaches in financial auditing. Big data technology provides the necessary tools
to address emerging challenges efficiently. By leveraging big data and AI, auditors can swiftly pinpoint
critical audit areas, ensuring accurate and timely issue detection.

This technical report proposes a cutting-edge solution: a big data-driven financial auditing method
employing convolutional neural networks (CNNs). CNNs excel at multi-level feature extraction, including
visual and textual features, enhancing the discrimination power of auditing materials. The key contributions
of this work include an analysis of big data technologies, a critique of traditional auditing methods, and the
proposal of computer-aided auditing techniques. Real-world case studies from banking data substantiate the
application of these methodologies, providing valuable insights.

Moreover, this technical report envisions the future of financial auditing in the big data era, advocating the
amalgamation of traditional practices with innovative technologies. The research introduces a
comprehensive approach, bridging the gap between conventional auditing methods and the demands of
modern financial landscapes. By embracing big data and AI, financial auditing can evolve to meet the
challenges of the digital age, ensuring effective governance and economic stability.

3. MOTIVATION AND LITERATURE SURVEY

3.1. MOTIVATION

In the rapidly evolving landscape of financial auditing, traditional methods are proving
insufficient to handle the complexities of big data. The advent of big data technologies has
opened new avenues for revolutionizing financial auditing processes. Harnessing the power of
convolutional neural networks (CNNs) offers a promising approach to enhance the accuracy,
efficiency, and depth of financial audits. The motivation behind this research stems from the
pressing need to leverage big data-driven techniques, specifically CNNs, to navigate the vast and
intricate datasets encountered in financial auditing. By exploring this novel method, we aim to
significantly enhance the audit quality, expedite processes, and provide actionable insights to
auditors and financial institutions alike.

3.2. LITERATURE SURVEY

The literature survey delves into the existing body of knowledge in financial auditing and big
data analytics. Traditional financial auditing methods have been extensively studied, emphasizing
structured data and statistical techniques. With the emergence of big data, recent studies have
explored the integration of advanced machine learning algorithms and data-driven approaches in
the auditing domain. Notably, there is a growing interest in the application of convolutional
neural networks (CNNs) for pattern recognition and analysis within financial datasets. Various
research works have demonstrated the potential of CNNs in enhancing fraud detection, risk
assessment, and anomaly detection in financial transactions. This survey critically examines the
strengths and limitations of prior studies, paving the way for the proposed methodology, which
combines big data strategies and CNNs to tackle financial auditing challenges.

4. OBJECTIVE

The primary objective of this research is to develop a robust and efficient financial auditing methodology
empowered by big data analytics and convolutional neural networks (CNNs). The key objectives include:

• Incorporating Big Data Techniques: Integrate advanced big data processing and analysis methods
to handle the immense volume, velocity, and variety of financial data. Employ data cleaning,
preprocessing, and integration strategies tailored for unstructured and semi-structured financial data
sources.

• Implementing Convolutional Neural Networks: Develop and optimize convolutional neural
network models specialized for financial data analysis. Leverage CNN’s deep learning capabilities to
identify complex patterns, anomalies, and fraudulent activities within vast financial datasets.

• Enhancing Audit Accuracy and Efficiency: Improve the accuracy and efficiency of financial audits
by automating data analysis processes. Utilize CNNs to provide auditors with timely, accurate, and
actionable insights, enabling them to focus on critical areas that require human expertise.

• Ensuring Data Security and Compliance: Implement robust data security measures and ensure
compliance with industry regulations and standards. Safeguard sensitive financial information
throughout the auditing process, maintaining the integrity and confidentiality of the data.

• Validation and Performance Assessment: Rigorously validate the proposed methodology using
real-world financial datasets. Evaluate the performance of the CNN-based auditing system against
traditional auditing methods, emphasizing metrics such as accuracy, precision, recall, and
computational efficiency.

By achieving these objectives, this research aims to establish an innovative paradigm in financial auditing,
leveraging big data analytics and convolutional neural networks to elevate the accuracy, speed, and
reliability of financial audits, ultimately benefiting auditors, financial institutions, and the broader economy.

5. TOPIC DESCRIPTION

5.1. CONVOLUTION NEURAL NETWORK-BASED DATA FUSION

Figure 1 depicts the process of transforming initial materials into digital features for big data
analysis. It includes steps like data collection, preprocessing, entity extraction, and utilization of
distributed data storage systems. The diagram highlights stages from raw data to structured
knowledge, covering cleaning, screening, and neural network model application.

Figure 1 Algorithm framework of convolutional neural network-based data fusion technology.

The big data processing cycle involves collecting unstructured data from various sources like
mobile phones, computers, satellites, etc. Traditional data collection tools struggle with format
conversion and handling the large volume of unstructured and semi-structured data, which
comprises 80% of existing storage systems. Data cleaning and screening technologies are crucial
for optimizing multi-source data, ensuring quality, and preparing it for analysis. Distributed
storage systems are essential due to their scalability and reliability, overcoming the limitations of
centralized storage. Data visualization technology has evolved to meet big data needs, presenting
information in intuitive forms like images and animations. Entity extraction, also known as named
entity recognition, is a subtask of information extraction that identifies meaningful phrases in data,
which is crucial for creating accurate maps and analyses. The introduction of big data technology in
audit data centers improves industry
coverage and supports large-scale data analysis.

Various algorithms, such as ADASYN, Borderline-SMOTE, and RBO, address minority class
sampling issues by taking the data distribution into account. However, these methods can alter the
classification boundary and increase computational demands. Handling large files and
fragmented data requires optimizing database performance and leveraging high-performance
cache databases like Redis.
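
To make the idea of minority class oversampling concrete, the following is a minimal sketch of SMOTE-style interpolation, the building block that methods such as Borderline-SMOTE and ADASYN refine. It is not the implementation used in the report: the function name `smote_like_oversample`, the parameter `k`, and the sample records are all illustrative assumptions, and the code uses only the Python standard library.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class records (two scaled features per record)
minority = [(0.10, 0.90), (0.20, 0.80), (0.15, 0.95), (0.30, 0.70)]
new_points = smote_like_oversample(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class. The neighbour search over all minority samples for each new point also hints at the extra computational cost noted above.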

Principal-agent theory applies to corporate financial decisions, where management's discretion
over financial asset allocation impacts shareholder interests. Auditing acts as an external
supervision factor, facilitating information transfer between shareholders and management,
mitigating principal-agent issues. In the context of online small and micro loan companies and
P2P platforms, audit investigations typically use sample surveys due to the vast number of
entities in the sector.

5.2. DESIGN OF FINANCIAL AUDITING MODEL

The auditing process has evolved with the advent of mobile Internet-based platforms. It combines
online and offline interaction, leveraging cloud computing and big data technologies. Electronic
data auditing, specific to cloud computing, focuses on verification, not mining, following a
structured three-stage process: audit preparation, implementation, and report generation. The big
data anti-money laundering audit system employs advanced correlation analysis for minute-level
processing, surpassing foreign hourly-level standards. It assesses various services like credit
evaluations and product recommendations.
Audit implementation occurs through on-site and online systems. On-site systems cater to
government auditors, ensuring efficient, comprehensive audit coverage. Online systems facilitate
dynamic, remote audits, enabling trend analysis and proactive measures based on historical data.
However, challenges arise in data acquisition due to limited direct access to financial data stored
in private clouds. Currently, auditors rely on general-purpose software, leading to inefficiencies
and incomplete data processing.
Figure 2 addresses a data processing bottleneck by using Redis, a high-performance cache
database. Redis manages the persistence of fragmented data during file uploads, which improves
concurrent task handling. With a shard size of 10 MB, the approach optimizes data processing
and averts system slowdowns.

Figure 2 Internet financial audit supervision object diagram.
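
The sharded upload described above can be sketched as follows. This is an illustrative assumption rather than the system in the report: a plain Python dictionary stands in for the Redis cache, the key layout (`file_id:index`) and the function names are hypothetical, and only the 10 MB shard size comes from the text.

```python
SHARD_SIZE = 10 * 1024 * 1024  # 10 MB, the shard size mentioned in the text


def split_into_shards(data, shard_size=SHARD_SIZE):
    """Yield (index, chunk) pairs that cover data in order."""
    for index, start in enumerate(range(0, len(data), shard_size)):
        yield index, data[start:start + shard_size]


def upload(file_id, data, cache, shard_size=SHARD_SIZE):
    """Store each shard under its own key so progress is tracked per shard."""
    count = 0
    for index, chunk in split_into_shards(data, shard_size):
        cache[f"{file_id}:{index}"] = chunk
        count += 1
    cache[f"{file_id}:count"] = count
    return count


def reassemble(file_id, cache):
    """Rebuild the original bytes from the stored shards."""
    count = cache[f"{file_id}:count"]
    return b"".join(cache[f"{file_id}:{i}"] for i in range(count))


cache = {}  # stand-in for a Redis instance (assumption)
payload = bytes(range(256)) * 100  # 25,600 bytes of sample data
# A small shard size keeps the demonstration light; production would use SHARD_SIZE.
n_shards = upload("ledger-2023", payload, cache, shard_size=10_000)
restored = reassemble("ledger-2023", cache)
```

Storing each shard under its own key means an interrupted upload only needs to resend the missing shards, and several uploads can write to the cache at the same time without blocking one another.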

Figure 3 depicts the audit's final stages, emphasizing forming reliable conclusions and
determining the appropriate audit opinion based on evaluations, highlighting the significance
of this process.

Figure 3 Technical logic of audit.

Figure 4 Flow chart of audit method.

Figure 4 illustrates the audit opinion, indicating the outcome and findings of the auditing process.
The figure might depict different categories or types of audit opinions, such as unqualified,
qualified, adverse, or disclaimer, reflecting the conclusions drawn after the audit implementation
stage.

5.3. ALGORITHM PERFORMANCE RESULTS

The NCR sampling algorithm lacks precision in removing redundant samples. SMOTETomek
and Kmeans SMOTE-RBU provide superior cleaning results, creating balanced samples in
datasets like German and Default. After sampling and normalization, classifier performance was
evaluated using Accuracy, F-measure, G-mean, and AUC metrics. Kmeans SMOTE-RBU
consistently outperformed other algorithms across all classifiers, achieving an Accuracy index of
0.8211, a significant improvement over the baseline.
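
For reference, three of the metrics named above can be computed from a binary confusion matrix as in the sketch below. The labels are hypothetical and the function name `classification_metrics` is an assumption. AUC is omitted because it requires ranked prediction scores rather than hard labels. G-mean is taken here as the square root of sensitivity times specificity, which is its usual definition for imbalanced binary classification.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, F-measure, and G-mean for binary labels (1 = minority class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "f_measure": f_measure, "g_mean": g_mean}


# Hypothetical ground truth and predictions for ten loan records
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

On imbalanced data, accuracy alone can look high even when the minority class is poorly detected, which is why F-measure and G-mean are reported alongside it.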
The number of ELM hidden layer nodes was set to 50 to optimize validation accuracy across the
diverse datasets. Combining the ELM and 1DCNN networks improved each of the basic models.
The naive Bayes (NB) model improved the most, with accuracy rising by 6.84%, F-measure
by 6.93%, G-mean by 8.3%, and AUC by 8.3%. The SVM, Adaboost, and XGBoost models showed
enhancements ranging from 1.08% to 4.2%, while the other basic models experienced moderate
enhancements of around 1%. In detail, the LR model improved by about 1% and the KNN model
by over 3%. SVM exhibited a 1.7% increase in accuracy and a 1.5% improvement in F-measure.
The DT model improved by approximately 1.83% and the Adaboost model by 2.7%. For the XGBoost
model, accuracy rose by 0.85%, F-measure by 1.2%, G-mean by 2.07%, and AUC by 1.95%.

Figure 5 Algorithm performance results

Figure 5 shows the comparative performance of various sampling algorithms, highlighting the
superiority of Kmeans SMOTE-RBU, which achieves an accuracy index of 0.8211. Notably, it
significantly outperforms the original dataset across Accuracy, F-measure, G-mean, and AUC
metrics.

Figure 6 ELM model validation curve

Figure 6 shows the validation curve of the Extreme Learning Machine (ELM) model, displaying
the relationship between the number of nodes in the ELM hidden layer and classification
accuracy across four datasets: Australian, Japanese, German, and Default. The graph helps
determine the optimal number of nodes for the ELM model based on training and validation
accuracy.

Figure 7 Scatter plot of loans under all loan accounts.

Figure 7 displays loan data from XX Bank, categorizing it into 46 loan types. It visualizes the
concentration of loan information for each type using color-coding, with darker colors indicating
higher repetition. This helps analyze loan distribution and identify potential patterns.
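
The concentration per loan type that Figure 7 visualizes comes down to counting records by type. The report performs this analysis in R with ggplot2; the sketch below uses Python only to illustrate the counting step, and the account identifiers and loan type names are hypothetical.

```python
from collections import Counter

# Hypothetical (account_id, loan_type) records; the real data has 46 loan types
loans = [
    ("A001", "mortgage"),
    ("A002", "mortgage"),
    ("A003", "working_capital"),
    ("A004", "mortgage"),
    ("A005", "consumer"),
    ("A006", "working_capital"),
]

counts = Counter(loan_type for _, loan_type in loans)
total = sum(counts.values())
# Share of all records per loan type, most concentrated type first
shares = {loan_type: n / total for loan_type, n in counts.most_common()}
```

A loan type with an unusually high share is a natural starting point for closer audit sampling.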

Figure 8 Financial asset allocation, economic policy uncertainty, and audit fees with different maturities.

Figure 8 depicts regulatory challenges for small loan companies, emphasizing disparities with
financial institutions under China's Banking and Insurance Regulatory Commission rules. Small
loan firms lack supervision and benefits, hindering their growth in an unfavourable environment.

Figure 9 An example of an unsuccessful training result for the model.

Figure 9 displays an unsuccessful model's outcome, prompting adjustments in parameters
like training steps and session size. This failure affects both quantity and quality of
customer groups for small loan companies, indicating challenges in the modeling process.

5.4. ANALYSIS OF THE APPLICATION RESULTS OF THE FINANCIAL AUDIT MODEL

The analysis was conducted using data from XX Bank's credit loan database, focusing on three
key tables: loan sub-accounts (4,961 entries), loan sub-accounts (26,050 entries), and loan
issuance/recovery (41,366 entries). Due to limitations in Excel, the R language was employed for
in-depth analysis. Notable findings included the visualization of loan distribution using ggplot2,
revealing concentration variations among 46 loan types. The study addressed challenges faced by
small loan companies, which must navigate legal ambiguities, market competition, and risk
control. Model adjustments were made, including the implementation of efficient convolution
operations for numerical analysis and modifications to parameters, which were instrumental in
improving model outcomes and recognition performance.

Audit risk management strategies were also crucial. Short-term asset allocation reduced financial
stress and enhanced financial stability, whereas increasing long-term assets heightened cash flow
and audit risks. To counter these challenges, auditors implemented more precise procedures,
leading to higher audit premiums to mitigate inspection risk.
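
As an illustration of the convolution operations mentioned in this section, the following is a minimal sketch of a one-dimensional convolution followed by a ReLU activation, the basic operation inside a 1DCNN layer. The balance series and the kernel are hypothetical, and a trained network would learn its kernel weights rather than use fixed ones.

```python
def conv1d(signal, kernel):
    """Valid-mode sliding dot product (cross-correlation, as CNN layers compute it)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Keep positive responses and zero out the rest."""
    return [max(0.0, v) for v in values]


# Hypothetical monthly balances of one loan account
monthly_balance = [10.0, 10.0, 11.0, 30.0, 12.0, 11.0]
# A difference kernel that responds to month-over-month increases
kernel = [-1.0, 1.0]
feature_map = relu(conv1d(monthly_balance, kernel))
# feature_map == [0.0, 1.0, 19.0, 0.0, 0.0]
```

The large response at the third position marks the sudden jump from 11 to 30, the kind of local pattern that a convolutional feature extractor can surface for an auditor.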

6. CONCLUSION
In the era of big data, computer-aided auditing, especially in financial auditing, has received more and
more attention from audit departments and auditors. The application of computer statistical analysis
software has gradually deepened, and the development and application of the R language have also received
increased attention. On this basis, the application of computer-aided auditing is proposed. Based on
financial auditing practice, a case analysis of real bank data is carried out. From the perspective of a
financial auditor, the bank loan data is comprehensively analyzed in the software environment of the R
language. In the process of case application, the meaning of the R language code is fully explained, the
common functions and models in the R language are identified in combination with audit practice, and data
visualization is the main focus. At the same time, the data analysis at each step also draws on traditional
audit methods to consider the possible problems behind the case data and to put forward suggestions for the
follow-up audit work. Finally, through the code writing and visual presentation of the R language, the
application of financial big data auditing is realized, and the working ideas of financial big data
auditing are tentatively planned to provide some help to auditors.

QUESTIONS ASKED BY EXPERTS

Q: How can Convolutional Neural Networks (CNNs) enhance traditional financial auditing methods?
A: CNNs excel in pattern recognition within complex datasets, enabling them to identify intricate financial
patterns and anomalies that might go unnoticed in traditional methods. By leveraging CNNs, auditors can
analyze vast datasets efficiently, ensuring a comprehensive audit process.

Q: What challenges does the integration of CNNs pose in financial auditing, and how are these challenges
mitigated?
A: Challenges include data privacy concerns, model interpretability, and computational requirements.
Privacy protocols are adhered to, interpretability is ensured through transparent model design, and
computational challenges are mitigated through optimized algorithms and hardware resources.

Q: How does the proposed methodology handle unstructured financial data, such as textual information and
multimedia content?
A: The methodology incorporates natural language processing (NLP) techniques to handle textual data,
converting unstructured information into structured formats for analysis. Multimedia content is processed
using feature extraction methods, allowing CNNs to analyze diverse data types comprehensively.

Q: What are the implications of the research findings for the financial auditing industry?
A: The research findings offer a transformative approach to financial auditing, enhancing accuracy, speed,
and reliability. Financial auditing firms can leverage these insights to adopt advanced technologies,
improving their auditing practices, client satisfaction, and industry reputation.

