International Journal for Multidisciplinary Research (IJFMR)

E-ISSN: 2582-2160 ● Website: www.ijfmr.com ● Email: editor@ijfmr.com

A Study on Intelligent Document Processing Using AWS
Smt. K.S. Sukrutha¹, Ms. Harini.S², Ms. Kusuma. M. V.³
¹ Assistant Professor, Department of Computer Science, M.M. K & S.D.M Mahila Maha Vidyalaya, Mysuru, India
²,³ III BCA Students, M.M. K & S.D.M Mahila Maha Vidyalaya, Mysuru, India

ABSTRACT
This article approaches the topic of Intelligent Document Processing from the viewpoint of the
users of cloud computing platforms and of end-users. The market for commercial OCR, document
categorization and data extraction technologies is briefly reviewed to the extent that information
about it is publicly available. Digital repositories increase the need for effective and efficient
retrieval of stored information. In this study, we suggest that the phases of document layout
analysis, document image categorization and document understanding should make intensive use of
intelligent techniques on digital documents. Specifically, Intelligent Document Processing, which
combines Artificial Intelligence, Deep Learning and RPA, can be used to retrieve the data and
convert it to an understandable form.

KEYWORDS
Intelligent Document Processing (IDP), Amazon Web Services (AWS), Robotic Process Automation
(RPA), Straight Through Processing (STP), Natural Language Processing (NLP), Personally Identifiable
Information (PII), Deep Learning (DL).

INTRODUCTION
Intelligent document processing (IDP) is the process of converting unstructured data such as emails,
images and PDF documents into usable data. Typically, 80%-90% of the data in a business is
unstructured. Intelligent document processing is carried out with artificial intelligence
technologies, including computer vision, deep learning and machine learning, to extract data, and it
is regarded as one of the most promising forms of automation for the coming generation [2].
The volume of data worldwide, including documents such as PDFs and emails, is projected to cross
175 zettabytes by 2025.
IDP acts as a key enabler for RPA in converting unstructured and semi-structured data to usable
information. Without intelligent document processing, an RPA automation process requires a
knowledge worker to read documents and extract data. Further, if RPA alone is used to convert
structured, unstructured and semi-structured data, it leads to lower productivity, fewer economic
benefits, lower accuracy and lower customer satisfaction [5].
Together, RPA and IDP provide a very useful tool for automating a business or enterprise. Both are
user-friendly and non-invasive, and they are widely applicable across companies [2].

IJFMR23044308 Volume 5, Issue 4, July-August 2023 1


Intelligent document processing provides the following benefits:

1. Cost savings - it reduces the processing effort needed for large volumes of data.
2. Straight through processing (STP) - it minimizes the need for a knowledge worker.
3. Ease of use - any business or enterprise can easily use it to automate its processes.
4. Efficiency - it provides a document-centric process.
5. High accuracy - using AI increases the accuracy of extraction.
6. Boosted strategic goals - when a business uses intelligent document processing, goals such as
improving customer experience are advanced as well.
7. Enterprises with goals such as a superior customer experience and low cost are opting for
intelligent document processing.
8. Although intelligent document processing is somewhat similar to OCR and RPA, and the three are
often confused, each performs a different function.
9. Intelligent document processing does more than OCR. OCR converts bitonal imagery to a
machine-readable form, whereas intelligent document processing converts unstructured data to
structured data. (Bitonal imagery means single-bit images in which each pixel is either 0 or 1,
i.e., white or black, rather than a gray level from 0 to 255.) [2]
10. Unlike OCR, intelligent document processing uses Artificial Intelligence (AI) and machine learning
(ML) technologies to transform unstructured and semi-structured material into data that can be
understood. Intelligent document processing primarily uses robotic process automation (RPA) to
extract data, improve validation, and automatically enter information into existing applications [4].
a. Machine Learning - Machine learning focuses primarily on the study of computer algorithms that
improve through experience with data.
b. Artificial Intelligence - Unlike natural intelligence, which is displayed by living things,
artificial intelligence is intelligence implemented on computers.
c. Deep Learning - Deep learning is a family of machine learning techniques that primarily make use
of multiple layers in diverse neural network architectures.
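The distinction between bitonal and grayscale imagery mentioned above can be made concrete with a small sketch. It assumes an image represented as a plain list of rows of 8-bit gray levels; the threshold of 128 and the convention that 1 stands for a light pixel are illustrative assumptions, not something the cited sources specify.

```python
def to_bitonal(gray_rows, threshold=128):
    """Map 8-bit grayscale pixels (0..255) to single-bit pixels (0 or 1).

    Assumption: pixels at or above the threshold become 1 (light),
    all others become 0 (dark).
    """
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray_rows]


# A tiny 2x3 "image" with hypothetical gray levels.
gray = [[0, 100, 200], [255, 127, 128]]
bitonal = to_bitonal(gray)
```

Each pixel of the result carries a single bit, which is the kind of input classical OCR works on.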

INTELLIGENT DOCUMENT PROCESSING IN AWS

Companies like Amazon use intelligent document processing (with Amazon Augmented AI, Amazon A2I) to
process critical documents, and use Amazon Comprehend together with Amazon A2I for Natural Language
Processing.

Figure 1. Stages of IDP workflow

Figure 1 shows the stages of the intelligent document processing pipeline and the connections
between the steps, from submission of the application or document through investigation to closing
the application or document. It also shows technical details such as the data capture,
classification and extraction stages, followed by document enrichment and review and verification,
and how the solution can be extended to provide analytics and visualization for a claims fraud use
case.


Figure 2. Architecture of IDP

The architecture diagram above shows the stages of the IDP workflow:
a) DATA CAPTURE - centralizes the data and claims.
b) CLASSIFICATION - sends all unstructured and semi-structured data through the pipeline so that
the data can be extracted.
c) EXTRACTION - extracts all the information from the claim and tags the notes so that searching is
easy.
d) ENRICHMENT - identifies the unstructured and semi-structured data and presents it in a proper
view for readers.
e) REVIEW AND VALIDATION - sends the converted document to an authenticated person for checking.
f) READY TO USE - updates the main database with the converted file within 24 hours of validation
by the authenticated person.
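The ordering of the stages listed above can be sketched as a simple state progression. This is a minimal illustration, not an AWS construct: the `Document` class, the `advance` helper and the stage identifiers are hypothetical names chosen to mirror the list.

```python
from dataclasses import dataclass, field

# Stage names mirror the workflow list; the identifiers are illustrative.
STAGES = [
    "data_capture",
    "classification",
    "extraction",
    "enrichment",
    "review_and_validation",
    "ready_to_use",
]


@dataclass
class Document:
    name: str
    history: list = field(default_factory=list)

    @property
    def stage(self):
        """Return the most recently completed stage, or None before capture."""
        return self.history[-1] if self.history else None


def advance(doc):
    """Move a document to the next stage in order; fail once it is ready to use."""
    next_index = len(doc.history)
    if next_index >= len(STAGES):
        raise ValueError(f"{doc.name} has already completed every stage")
    doc.history.append(STAGES[next_index])
    return doc.stage


claim = Document("claim.pdf")
first_stage = advance(claim)
```

Keeping the history on the document makes it easy to see which stage a claim reached before it was sent for human validation.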

PHASES IN INTELLIGENT DOCUMENT PROCESSING

1. CLASSIFICATION PHASE - Collected documents of various types are categorized before further
extraction using Amazon Comprehend custom classification, which is a two-step process as
illustrated in Figure 3. Custom classification mainly helps to automate document handling and to
identify documents that are missing from the packet.

TWO CUSTOM CLASSIFICATION STEPS

• EXTRACT TEXT USING AMAZON TEXTRACT - From any document or image, Amazon Textract uses machine
learning to automatically extract text, handwriting and data. Instead of hours or days, Textract
can extract the data in minutes, so documents can be processed quickly.
• TRAIN AMAZON COMPREHEND CUSTOM CLASSIFIER - Train an Amazon Comprehend custom classifier
(document classifier) to recognize the classes of interest based on the text content [1].


Figure 3. Document classification
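The two classification steps can be sketched as follows. The functions take already-constructed clients (for example `boto3.client("textract")` and `boto3.client("comprehend")`) so the sketch stays independent of credentials. The endpoint ARN is a placeholder for a custom classifier endpoint that would have to be trained and deployed beforehand, and the helper names are hypothetical.

```python
def extract_text(textract_client, document_bytes):
    """Step 1: return the text lines Textract detects in a document image."""
    response = textract_client.detect_document_text(
        Document={"Bytes": document_bytes}
    )
    # Textract returns PAGE, LINE and WORD blocks; LINE blocks hold whole lines.
    lines = [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]
    return "\n".join(lines)


def classify_text(comprehend_client, text, endpoint_arn):
    """Step 2: return (class name, score) of the top class from a custom classifier."""
    response = comprehend_client.classify_document(
        Text=text, EndpointArn=endpoint_arn
    )
    best = max(response["Classes"], key=lambda item: item["Score"])
    return best["Name"], best["Score"]
```

A caller would read the scanned page as bytes, pass it to `extract_text`, and hand the result to `classify_text` together with the ARN of its own endpoint.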

2. EXTRACTION PHASE - In the extraction phase, we mainly extract the data from the documents using
Amazon Comprehend.
• Claims processing packets are used as the input.
• The CMS-1500 claim form is used as the example from which data is extracted.
• Non-institutional providers use the CMS-1500 form as the standard claim form.
• The CMS-1500 form should be processed accurately; otherwise the claim process can slow down or
the carrier can delay payments [1].

Figure 4. Data extraction process
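As an illustration of pulling fields from a form such as the CMS-1500, the sketch below pairs keys with values in a Textract form-analysis response. It assumes the response comes from `analyze_document` called with `FeatureTypes=["FORMS"]`. The helper names are hypothetical, and checkbox (selection element) values are ignored for brevity.

```python
def _text_of(block, blocks_by_id):
    """Join the WORD children of a block into one string."""
    words = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for child_id in relationship["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def form_fields(response):
    """Return {key text: value text} for every KEY block in the response."""
    blocks_by_id = {block["Id"]: block for block in response["Blocks"]}
    fields = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        # A KEY block points at its VALUE block(s) through a VALUE relationship.
        value_parts = []
        for relationship in block.get("Relationships", []):
            if relationship["Type"] == "VALUE":
                for value_id in relationship["Ids"]:
                    value_parts.append(_text_of(blocks_by_id[value_id], blocks_by_id))
        fields[_text_of(block, blocks_by_id)] = " ".join(value_parts)
    return fields
```

The resulting dictionary can then be checked for required fields before the claim moves on, which is one way to catch an incompletely processed form early.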

THE KEY SERVICES OF INTELLIGENT DOCUMENT PROCESSING

1) AMAZON TEXTRACT - mainly extracts handwriting and data from scanned documents.
• It goes beyond OCR (Optical Character Recognition).
• It is used mainly to extract data from forms and tables and to understand it.
• It uses machine learning to read and process documents and to extract form data without manual
effort [3].

2) AMAZON COMPREHEND - mainly identifies entities such as quantity, location, person and date, as
well as the dominant language and Personally Identifiable Information (PII).
• It classifies documents into relevant classes.
• It is a natural language processing (NLP) service that uses Machine Learning (ML) to extract
insights from text [3].

3) AMAZON AUGMENTED AI (AMAZON A2I) - mainly provides a bridge between Amazon Textract and Amazon
Comprehend by making it possible to introduce human review or validation within the intelligent
document processing flow. It is a machine learning service that makes it easy to build the
workflows required for human review [3].
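How the services above might fit together can be sketched as follows: Amazon Comprehend enriches the extracted text, and low-confidence results are routed to a human reviewer through Amazon A2I. The clients are passed in (for example `boto3.client("comprehend")` and `boto3.client("sagemaker-a2i-runtime")`). The flow definition ARN is a placeholder, and the 0.9 score threshold and helper names are illustrative assumptions.

```python
import json


def enrich(comprehend_client, text, language_code="en"):
    """Collect named entities and PII spans for a piece of extracted text."""
    entities = comprehend_client.detect_entities(
        Text=text, LanguageCode=language_code
    )["Entities"]
    pii = comprehend_client.detect_pii_entities(
        Text=text, LanguageCode=language_code
    )["Entities"]
    return {
        "entities": [(e["Type"], e["Text"], e["Score"]) for e in entities],
        # PII results carry character offsets rather than the matched text.
        "pii_spans": [(p["Type"], p["BeginOffset"], p["EndOffset"]) for p in pii],
    }


def request_review_if_unsure(a2i_client, loop_name, flow_definition_arn,
                             enrichment, min_score=0.9):
    """Start an A2I human loop when any entity score is below min_score."""
    low = [entity for entity in enrichment["entities"] if entity[2] < min_score]
    if not low:
        return None  # Confident enough: no human review requested.
    response = a2i_client.start_human_loop(
        HumanLoopName=loop_name,
        FlowDefinitionArn=flow_definition_arn,
        HumanLoopInput={
            "InputContent": json.dumps({"low_confidence_entities": low})
        },
    )
    return response["HumanLoopArn"]
```

Routing only the uncertain documents to reviewers keeps most of the flow automated while still giving a person the final say where the model is unsure.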


CONCLUSION
In this study, we saw how structured, unstructured and semi-structured data, files and claims are
processed with AWS AI services to automate the intelligent document processing pipeline. We
emphasized the idea of classifying documents into various document classes using an Amazon
Comprehend custom classifier, and of using Amazon Textract to extract unstructured,
semi-structured, structured and specialized document types.
The classification and extraction phases are supported by Amazon Textract. We also use Amazon
Comprehend pre-defined entities and custom entities to enrich the data, and we show how to extend
the intelligent document processing pipeline to integrate with analytics and visualization services
for further processing.

REFERENCES
1. Intelligent document processing with AWS AI services: Part 1, AWS Machine Learning Blog,
https://aws.amazon.com/blogs/machine-learning/part-1-intelligent-document-processing-with-aws-ai-services/
2. Content from a website, https://www.automationanywhere.com
3. Intelligent document processing, AWS Solutions for Machine Learning AI/ML,
https://aws.amazon.com
4. Graham A Cutting (Independent Researcher, F, grahamcutting@cantab.net) and Anne-Francoise
Cutting-Decelle (Universite de Geneve / CUI, CH, anne-francoise.cutting-decelle@unige.ch),
Intelligent Document Processing - Methods and Tools in the Real World (published paper),
https://www.researchgate.net
5. Deloitte, https://www2.deloitte.com
