Methodology: AI Framework for Computational Analysis of Sanskrit Texts
1 Methodology
This research introduces a novel, two-stage artificial intelligence framework
specifically designed for the computational analysis of Sanskrit texts with
the goal of reviving embedded scientific knowledge. Due to the highly
inflected and semantically dense nature of Sanskrit, traditional NLP
pipelines are insufficient. Therefore, we propose an architecture composed
of two purpose-built algorithms: SanskritDeepNet-Lexical Graph Constructor
(SDN-LGC) and VedaJnana-Conscious Concept Synthesizer (VJ-CCS). The
entire pipeline begins with curated data collection and linguistic
normalization, proceeds to semantic graph generation, and concludes with
scientific concept extraction via ontology mapping.
1.1 Data Collection and Preprocessing
Our system is built upon the Digital Corpus of Sanskrit (DCS), a freely available, linguistically annotated repository of classical Sanskrit texts. We selected approximately 20,000 sentences from well-known scientific treatises such as the Susruta Samhita, Charaka Samhita, Aryabhatiya, and Vaisesika Sutras. These texts are not only linguistically rich but also represent diverse scientific domains, including medicine, astronomy, mathematics, and metaphysics.
Preprocessing of these texts begins with Unicode normalization to ensure
consistency across character representations. This is followed by sandhi splitting, in which euphonically fused word sequences are separated into their constituent words using a hybrid rule-based and probabilistic engine.
After this, we apply lemmatization to reduce words to their root forms using
the DCS lexicon, and extract morphological features such as tense, number,
voice, and grammatical case. Compound words (samāsa) are decomposed,
and syntactic roles are tagged for each token. The preprocessed and
annotated sentences are then transformed into token vectors for
downstream semantic parsing.
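To make the pipeline concrete, the sketch below illustrates these preprocessing steps in Python; the toy lexicon and the whitespace-based sandhi splitter are hypothetical placeholders, not the hybrid engine or the DCS lexicon described above.

```python
# Minimal preprocessing sketch (illustrative only; the sandhi splitter and
# lexicon below are toy placeholders, not the actual DCS-based components).
import unicodedata

# Hypothetical toy lexicon mapping surface forms to (lemma, morphological tags).
TOY_LEXICON = {
    "agnih": ("agni", {"case": "nominative", "number": "singular"}),
    "jvalati": ("jval", {"tense": "present", "number": "singular", "voice": "active"}),
}

def normalize(text: str) -> str:
    """Unicode NFC normalization so IAST diacritics have one canonical form."""
    return unicodedata.normalize("NFC", text)

def split_sandhi(text: str) -> list[str]:
    """Placeholder splitter: the real engine combines rules and statistics."""
    return normalize(text).split()

def analyze(tokens: list[str]) -> list[dict]:
    """Lemmatize and attach morphological features via lexicon lookup."""
    analyzed = []
    for tok in tokens:
        lemma, feats = TOY_LEXICON.get(tok, (tok, {}))
        analyzed.append({"surface": tok, "lemma": lemma, "features": feats})
    return analyzed

print(analyze(split_sandhi("agnih jvalati")))
```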
1.2 SanskritDeepNet-Lexical Graph Constructor (SDN-LGC)
The first core algorithm, SDN-LGC, is responsible for transforming a
sentence into a structured semantic dependency graph that captures
grammatical and conceptual relationships between words. To begin, the
input sentence is encoded into a contextual vector T, using a weighted sum of token embeddings:
T = \sum_{i=1}^{n} w_i x_i + b \qquad (1)
Here, x_i denotes the vector representation of the i-th token, w_i is a learned attention weight, and b is a bias term. To introduce non-linearity and normalize the aggregated vector, a sigmoid function is applied:
\sigma(T) = \frac{1}{1 + e^{-T}} \qquad (2)
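As a minimal NumPy sketch of Eqs. (1)-(2), the snippet below aggregates illustrative token embeddings into a sentence vector and applies the sigmoid; the dimensions and random weights are placeholders, not trained SDN-LGC parameters.

```python
# Illustrative implementation of Eqs. (1)-(2) with random stand-in values.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8                      # 5 tokens, 8-dimensional embeddings
x = rng.normal(size=(n, d))      # token embeddings x_i
w = rng.normal(size=n)           # learned attention weights w_i
b = rng.normal(size=d)           # bias term b

# Eq. (1): weighted sum of token embeddings plus bias.
T = (w[:, None] * x).sum(axis=0) + b

# Eq. (2): element-wise sigmoid to squash and normalize the aggregate.
sigma_T = 1.0 / (1.0 + np.exp(-T))
print(sigma_T.shape)             # (8,)
```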
The semantic relationships between each token pair are computed using
scaled dot-product attention:
A_{ij} = \frac{e^{(QK^{T})_{ij}/\sqrt{d_k}}}{\sum_{k=1}^{n} e^{(QK^{T})_{ik}/\sqrt{d_k}}} \qquad (3)
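The following sketch computes the row-normalized attention matrix of Eq. (3) with NumPy; Q and K are random stand-ins for the model's learned query and key projections.

```python
# Illustrative NumPy version of Eq. (3): row-wise softmax over scaled
# query-key dot products.
import numpy as np

rng = np.random.default_rng(1)
n, d_k = 5, 8
Q = rng.normal(size=(n, d_k))
K = rng.normal(size=(n, d_k))

scores = (Q @ K.T) / np.sqrt(d_k)            # (QK^T)_{ij} / sqrt(d_k)
scores -= scores.max(axis=1, keepdims=True)  # subtract row max for stability
A = np.exp(scores)
A /= A.sum(axis=1, keepdims=True)            # each row now sums to 1
print(A.sum(axis=1))                         # all ones
```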
These attention scores are passed through a multi-layer perceptron to
classify syntactic roles such as subject, object, modifier, or compound head:
Y = \mathrm{softmax}\left(W_2 \cdot \mathrm{ReLU}\left(W_1 \cdot X + b_1\right) + b_2\right) \qquad (4)
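A small illustration of the role classifier in Eq. (4) is given below; the layer sizes and the four-role label set are assumptions made for the example, not the trained configuration.

```python
# Sketch of Eq. (4): a two-layer perceptron over per-token features X that
# scores syntactic roles.
import numpy as np

rng = np.random.default_rng(2)
ROLES = ["subject", "object", "modifier", "compound_head"]
d_in, d_hidden = 8, 16

X = rng.normal(size=d_in)                        # per-token feature vector
W1 = rng.normal(size=(d_hidden, d_in)); b1 = np.zeros(d_hidden)
W2 = rng.normal(size=(len(ROLES), d_hidden)); b2 = np.zeros(len(ROLES))

hidden = np.maximum(0.0, W1 @ X + b1)            # ReLU(W1·X + b1)
logits = W2 @ hidden + b2
probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax over the roles
print(dict(zip(ROLES, probs.round(3))))
```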
The output is then formalized as a directed graph:
G = (V, E) \qquad (5)
where V represents nodes (words) and E represents syntactic or semantic edges (relationships). The model is trained by minimizing the mean squared error between predicted and gold-standard dependencies:
\mathrm{Loss} = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2 \qquad (6)
This graph encodes valuable structural insights such as which term is the
main predicate, which are its arguments, and how scientific terms are
constructed through compound formations or hierarchical dependencies.
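The toy example below assembles predicted dependencies into the graph of Eq. (5) and scores them against gold edges with the mean squared error of Eq. (6); the sentence, edge labels, and scores are invented for illustration.

```python
# Toy illustration of Eqs. (5)-(6): a labeled directed dependency graph
# plus an MSE comparison of predicted edge scores against gold labels.
import numpy as np

V = ["agnih", "jvalati", "vane"]                  # nodes: words
E = [("jvalati", "agnih", "subject"),             # directed, labeled edges
     ("jvalati", "vane", "locative_modifier")]
G = (V, E)                                        # Eq. (5): G = (V, E)

y_pred = np.array([0.9, 0.2, 0.8])                # predicted edge scores ŷ_i
y_gold = np.array([1.0, 0.0, 1.0])                # gold-standard labels y_i
loss = np.mean((y_pred - y_gold) ** 2)            # Eq. (6)
print(round(float(loss), 4))                      # 0.03
```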
1.3 VedaJnana-Conscious Concept Synthesizer (VJ-CCS)
Once the sentence is parsed into a dependency graph, the second core
algorithm, VJ-CCS, performs domain-aware concept extraction by mapping
the graph structure onto a curated ontology of ancient Indian scientific
knowledge. This is achieved through hybrid symbolic-neural reasoning.
The algorithm begins by computing the cosine similarity between each
graph node vector and ontology term embeddings:
\mathrm{CosSim}(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert} \qquad (7)
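Eq. (7) can be implemented directly, as in the short helper below; the node and ontology vectors shown are random placeholders for the actual embeddings.

```python
# Eq. (7): cosine similarity between a graph node vector and an ontology term.
import numpy as np

def cos_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(3)
node_vec = rng.normal(size=8)       # graph node embedding (placeholder)
ontology_vec = rng.normal(size=8)   # ontology term embedding (placeholder)
print(round(cos_sim(node_vec, ontology_vec), 3))
```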
To capture dependencies across longer sequences or linked ideas in
compound sentences, a contextual recurrent neural network is used:
h_t = f\left(W x_t + U h_{t-1} + b\right) \qquad (8)
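The recurrence in Eq. (8) is illustrated below with tanh as the nonlinearity f; the weight matrices and input vectors are random placeholders for the trained parameters and token representations.

```python
# Sketch of Eq. (8): one recurrent update per token in the sequence.
import numpy as np

rng = np.random.default_rng(4)
d_x, d_h = 8, 16
W = rng.normal(size=(d_h, d_x))
U = rng.normal(size=(d_h, d_h))
b = np.zeros(d_h)

h = np.zeros(d_h)                          # initial hidden state h_0
for x_t in rng.normal(size=(5, d_x)):      # five token vectors in sequence
    h = np.tanh(W @ x_t + U @ h + b)       # h_t = f(W x_t + U h_{t-1} + b)
print(h.shape)                             # (16,)
```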
The system's performance is evaluated using standard metrics. Precision
measures the correctness of identified concepts:
\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (9)
Recall quantifies the coverage of true concepts:
\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (10)
F1 score balances both metrics:
\mathrm{F1\ Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (11)
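Eqs. (9)-(11) follow directly from the confusion counts, as the short example below shows; the counts themselves are hypothetical.

```python
# Eqs. (9)-(11) computed from illustrative evaluation counts.
TP, FP, FN = 42, 8, 14                               # hypothetical counts

precision = TP / (TP + FP)                           # Eq. (9)
recall = TP / (TP + FN)                              # Eq. (10)
f1 = 2 * precision * recall / (precision + recall)   # Eq. (11)
print(f"P={precision:.3f}  R={recall:.3f}  F1={f1:.3f}")
```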
To balance the graph-based and semantic objectives, the total loss function
is defined as:
L_{\mathrm{total}} = \alpha \cdot L_{\mathrm{graph}} + \beta \cdot L_{\mathrm{semantic}} \qquad (12)
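The weighting in Eq. (12) is illustrated below with assumed values for α, β, and the two component losses.

```python
# Eq. (12): weighted combination of the graph and semantic objectives.
alpha, beta = 0.6, 0.4                      # illustrative weights
L_graph, L_semantic = 0.030, 0.125          # e.g. the MSE and a similarity loss
L_total = alpha * L_graph + beta * L_semantic
print(round(L_total, 4))                    # 0.068
```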
To identify conceptually meaningful co-occurrences, we use Pointwise
Mutual Information (PMI):
E_{ij} = \log\left(\frac{P(w_i, w_j)}{P(w_i) \cdot P(w_j)}\right) \qquad (13)
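A small worked example of Eq. (13) over a toy co-occurrence table is given below; the windowed word lists are invented for illustration.

```python
# Eq. (13): PMI of word pairs from co-occurrence counts over toy windows.
import math
from collections import Counter
from itertools import combinations

windows = [["vāta", "pitta"], ["vāta", "kapha"], ["pitta", "kapha"],
           ["vāta", "pitta", "kapha"], ["graha", "tithi"]]

word_counts = Counter(w for win in windows for w in set(win))
pair_counts = Counter(frozenset(p) for win in windows
                      for p in combinations(set(win), 2))
N = len(windows)

def pmi(w_i: str, w_j: str) -> float:
    p_ij = pair_counts[frozenset((w_i, w_j))] / N          # P(w_i, w_j)
    return math.log(p_ij / ((word_counts[w_i] / N) * (word_counts[w_j] / N)))

print(round(pmi("vāta", "pitta"), 3))
```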
Graph nodes are updated by aggregating contextual information from their
neighbors:
C_k = \sum_{j \in N(k)} A_{kj} h_j \qquad (14)
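The neighborhood aggregation of Eq. (14) can be written as a simple loop over each node's neighbors, as sketched below with illustrative adjacency, attention scores, and hidden states.

```python
# Eq. (14): each node's context vector is the attention-weighted sum of its
# neighbors' hidden states.
import numpy as np

rng = np.random.default_rng(5)
n, d = 4, 8
H = rng.normal(size=(n, d))                          # hidden states h_j
A = rng.random(size=(n, n))                          # attention scores A_kj
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}   # neighbor sets N(k)

C = np.zeros((n, d))
for k, nbrs in neighbors.items():
    for j in nbrs:
        C[k] += A[k, j] * H[j]                       # C_k = Σ_{j∈N(k)} A_kj h_j
print(C.shape)                                       # (4, 8)
```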
Finally, to project this structured representation into the conceptual space of the scientific domain, a hyperbolic tangent activation is applied:
Z = \tanh\left(W_z H + b_z\right) \qquad (15)
The output vector Z represents the recognized scientific concept mapped
from the original Sanskrit sentence. For instance, a graph containing the
terms "vāta," "pitta," and "kapha" would be identified as related to Ayurveda,
while "graha," "nakṣatra," and "tithi" would map to astronomy. This enables
intelligent indexing, annotation, and interpretation of ancient texts in a
modern digital framework.
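To make this final step concrete, the sketch below applies the projection of Eq. (15) and then picks the closest domain by cosine similarity; the projection weights, pooled graph vector, and domain centroids are all invented stand-ins for the trained VJ-CCS model and the curated ontology.

```python
# Sketch of Eq. (15) followed by a nearest-domain lookup in concept space.
import numpy as np

rng = np.random.default_rng(6)
d_h, d_z = 16, 8
H = rng.normal(size=d_h)                         # pooled graph representation
W_z = rng.normal(size=(d_z, d_h)); b_z = np.zeros(d_z)

Z = np.tanh(W_z @ H + b_z)                       # Eq. (15)

# Hypothetical domain centroids embedded in the same concept space.
domains = {"ayurveda": rng.normal(size=d_z),
           "astronomy": rng.normal(size=d_z),
           "mathematics": rng.normal(size=d_z)}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

best = max(domains, key=lambda name: cos(Z, domains[name]))
print(best)                                      # closest domain to Z
```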