CS230-Fall 2020 Final Project Report
Conversational Image Recognition Chatbot
                                          Shengyang Su
                                      bllysu@stanford.edu
                                             Abstract
         This paper proposes a chatbot framework that combines image recognition and
         natural language processing technologies. Within this framework, a neural
         encoder-decoder model is used, with a Late Fusion encoder [1] and two different
         decoders (generative and discriminative). The chatbot is able to detect the objects
         in an image, describe and recognize the image, and then answer questions about
         the image.
1   Introduction
A conversational chatbot is an application that is able to communicate with humans using natural
language. An image recognition deep learning based chatbot is an application that recognizes the image
which the user uploaded and answers questions about the image. Ever since the birth of AI and
computer vision, modeling conversations has remained one of the field's challenges, especially when
combining natural language processing and image recognition. Chatbots are now widely used as parts of
platforms, in applications like Apple's Siri [2], Google's Google Assistant [3], or Microsoft's Cortana [4].
The main problem domain of my project is building an image recognition chatbot, which is capable
of recognizing the objects in an image and generating the best response to any user query about the
image. In order to achieve this goal, the chatbot needs to detect the objects in the image and hold a
dialog related to the image after training. It must also understand the sender's messages so that it
can predict which sort of response will be relevant, and the reply it generates must be lexically and
grammatically correct.
2   Related work
Conversational Modeling Chatbot For text-based chatbots, there are two main approaches to
generating responses. The traditional approach is to use hard-coded templates and rules to
create chatbots. The more novel approach was made possible by the rise of deep learning: neural
network [15] models are trained on large amounts of data to learn the process of generating relevant and
grammatically correct responses to input utterances. A common challenge with most conversational
chatbot models is that they don't have any memory. Although architectures have been devised in
order to take previous dialog turns into account, they are still limited; for example, they can't
consider dialog turns that happened much earlier. A user would expect from a chatbot that once he/she
talks about something, the chatbot will mostly remember the dialog, as is normal in human-human
conversations. The chatbot should be able to learn new things about the user in order to make the
experience personalized.
The Encoder-Decoder Model When applying neural networks to natural language processing (NLP)
tasks, each word (symbol) has to be transformed into a numerical representation [5]. This is done
through word embeddings, which represent each word as a fixed size vector of real numbers. Word
embeddings are useful because instead of handling words as huge vectors of the size of the vocabulary,
they can be represented in much lower dimensions. Word embeddings are trained on large amounts
of natural language data and the goal is to build vector representations that capture the semantic
similarity between words. More specifically, because similar contexts usually relate to similar
meanings, words with similar distributions should have similar vector representations. This concept is
called the Distributional Hypothesis[6]. Each vector representing a word can be regarded as a set of
parameters and these parameters can be jointly learned with the neural network’s parameters, or they
can be pre-learned.
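To make this concrete, the sketch below shows how a trainable embedding table maps word indices to 300-dim vectors in PyTorch, the framework used later in this project. It is an illustration only; the tiny vocabulary and the example sentence are made up.

import torch
import torch.nn as nn

# Hypothetical vocabulary: map each word to an integer index.
vocab = {"<pad>": 0, "<unk>": 1, "a": 2, "dog": 3, "on": 4, "the": 5, "beach": 6}

# Embedding table: one 300-dim trainable vector per word, learned jointly with the model.
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=300, padding_idx=0)

# Encode a question as a sequence of indices; unknown words map to <unk>.
tokens = ["a", "dog", "on", "the", "sand"]
ids = torch.tensor([[vocab.get(t, vocab["<unk>"]) for t in tokens]])  # shape (1, 5)

vectors = embedding(ids)  # shape (1, 5, 300)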
Mask R-CNN[7] Mask R-CNN extends Faster R-CNN to solve instance segmentation tasks. It
achieves this by adding a branch for predicting an object mask in parallel with the existing branch
for bounding box recognition. In principle, Mask R-CNN is an intuitive extension of Faster R-CNN,
but constructing the mask branch properly is critical for good results. Most importantly, Faster
R-CNN was not designed for pixel-to-pixel alignment between network inputs and outputs. This
is evident in how RoIPool, the de facto core operation for attending to instances, performs coarse
spatial quantization for feature extraction. To fix the misalignment, Mask R-CNN utilizes a simple,
quantization-free layer, called RoIAlign, that faithfully preserves exact spatial locations. Secondly,
Mask R-CNN decouples mask and class prediction: it predicts a binary mask for each class
independently, without competition among classes, and relies on the network's RoI classification
branch to predict the category. In contrast, an FCN usually performs per-pixel multi-class
categorization, which couples segmentation and classification.
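The difference between the two pooling operations can be illustrated with torchvision's built-in ops. This is a minimal sketch; the feature map, stride, and box coordinates below are made-up values.

import torch
from torchvision.ops import roi_align, roi_pool

# Hypothetical 256-channel feature map from a backbone, stride 16 w.r.t. the input image.
features = torch.randn(1, 256, 50, 50)

# One region of interest in image coordinates: (batch_index, x1, y1, x2, y2).
rois = torch.tensor([[0, 87.3, 41.9, 250.6, 312.2]])

# RoIPool quantizes box coordinates to the feature grid (coarse spatial alignment).
pooled = roi_pool(features, rois, output_size=(7, 7), spatial_scale=1 / 16)

# RoIAlign samples with bilinear interpolation, preserving exact spatial locations.
aligned = roi_align(features, rois, output_size=(7, 7), spatial_scale=1 / 16, sampling_ratio=2)

print(pooled.shape, aligned.shape)  # both (1, 256, 7, 7), but aligned avoids quantization error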
Object Detection on COCO[8] Object detection is the task of detecting instances of objects of
a certain class within an image. The state-of-the-art methods can be categorized into two main
types: one-stage methods and two-stage methods. One-stage methods prioritize inference speed,
and example models include YOLO, SSD and RetinaNet. Two-stage methods prioritize detection
accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN. The most
popular benchmark is the MSCOCO dataset. Models are typically evaluated according to a Mean
Average Precision metric.
3   Dataset
One important element of deep learning and machine learning at large is the dataset. A good dataset
will contribute to a model with good precision and recall. In the realm of object detection in images or
motion pictures, there are some household names commonly used and referenced by researchers and
practitioners. The names in the list include Pascal, ImageNet, SUN, and COCO.
A Dataset with Context COCO stands for Common Objects in Context. As hinted by the name,
images in COCO dataset are taken from everyday scenes thus attaching “context” to the objects
captured in the scenes. We can put an analogy to explain this further. Let’s say we want to detect
a person object in an image. A non-contextual, isolated image will be a close-up photograph of a
person. Looking at the photograph, we can only tell that it is an image of a person. However, it will
be challenging to describe the environment where the photograph was taken without having other
supplementary images that capture not only the person but also the studio or surrounding scene.
COCO was an initiative to collect natural images, images that reflect everyday scenes and provide
contextual information. In an everyday scene, multiple objects can be found in the same image, and each
should be labeled as a different object and segmented properly. The COCO dataset provides the labeling
and segmentation of the objects in the images. A machine learning practitioner can take advantage of
the labeled and segmented images to create a better performing object detection model.
Objects in COCO As written in the original research paper, there are 91 object categories in COCO.
However, only 80 object categories of labeled and segmented images were released in the first
publication in 2014. Currently there are two releases of COCO dataset for labeled and segmented
images. After the 2014 release, the subsequent release was in 2017. My goal is to train an ML model
on the COCO dataset and then be able to generate my own labeled training data to train on. So far, I
have been using the maskrcnn-benchmark model by Facebook [7] and training on the 2014 COCO dataset.
Getting the data The COCO dataset can be downloaded here: https://cocodataset.org/download. I am
only training on the 2014 dataset. Figures 1 and 2 show an example COCO data format JSON file which
contains just one image (seen in the top-level "images" element), 3 unique categories/classes in total
(seen in the top-level "categories" element), and 2 annotated bounding boxes for the image (seen in
the top-level "annotations" element).
               Figure 1: Sample Data Part 1             Figure 2: Sample Data Part 2
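For reference, a minimal Python dictionary with the same structure looks like the sketch below; the file name, ids, categories, and box coordinates are invented for illustration and do not correspond to a real COCO entry.

# A minimal, made-up example of the COCO detection annotation format described above:
# one image, three categories, two bounding boxes.
coco_example = {
    "images": [
        {"id": 1, "file_name": "COCO_train2014_000000000001.jpg", "width": 640, "height": 480}
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
        {"id": 18, "name": "dog", "supercategory": "animal"},
        {"id": 38, "name": "kite", "supercategory": "sports"},
    ],
    "annotations": [
        # bbox is [x, y, width, height] in pixels; iscrowd=0 means a single instance
        {"id": 101, "image_id": 1, "category_id": 1, "bbox": [34.0, 60.0, 120.0, 300.0],
         "area": 36000.0, "iscrowd": 0, "segmentation": []},
        {"id": 102, "image_id": 1, "category_id": 18, "bbox": [300.0, 250.0, 150.0, 120.0],
         "area": 18000.0, "iscrowd": 0, "segmentation": []},
    ],
}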
4   Methods
                                    Figure 3: Model Framework
In this project, the model for the image recognition chatbot is based on an encoder-decoder
framework (Figure 3), composed of two parts: 1. an encoder that converts the input (image, dialog
history, and question/answer rounds) into a vector space, and 2. a decoder that converts the embedded
vector into an output (the answer).
Encoders: A single encoder is used, based on a novel methodology called Late Fusion, which converts
the inputs into a joint representation.
• Late Fusion (LF) Encoder: we treat the history (H) as one long string, with all previous rounds
concatenated. The image (I) is encoded with a VGG-16 CNN, while the current question (Q) and the
concatenated history are encoded with two separate LSTMs. The three representations are then
concatenated and linearly transformed: a fully-connected layer followed by a tanh non-linearity maps
them to a 512-d joint representation, which is used to decode the response. While doing error
analysis, I found that the model used very little information from the image, as the magnitude of the
gradient flowing back to the image during backpropagation was very low. This was primarily because
the VGG-16 was not fine-tuned for the task; fine-tuning would have let the complete model learn
which parts of the image are important. I then integrated a Self-Attention module inspired by
SAGAN [9] (Figure 4). Training the complete encoder-decoder network with Self-Attention stabilized
training considerably, decreased the loss further, and improved performance on the chosen metric on
the validation set. The gradients now travelled all the way back to the image, confirming that the
information from the image was also utilized. A minimal sketch of the fusion step is shown below.
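The sketch below illustrates the Late Fusion idea in PyTorch. It is a simplified stand-in for the project code, without the Self-Attention module; the layer sizes and names are assumptions, and precomputed 4096-d VGG-16 relu7 features are assumed as the image input, as described in the preprocessing section.

import torch
import torch.nn as nn

class LateFusionEncoder(nn.Module):
    """Minimal sketch of the Late Fusion encoder described above (dims are assumptions)."""
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512, img_feat_dim=4096):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.q_lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        self.h_lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        # Fuse image, question, and history into a single 512-d joint representation.
        self.fusion = nn.Linear(img_feat_dim + 2 * hidden_dim, hidden_dim)

    def forward(self, img_feat, question, history):
        # img_feat: (B, 4096) precomputed VGG-16 relu7 features
        # question: (B, Tq) word ids; history: (B, Th) word ids (all rounds concatenated)
        _, (q_state, _) = self.q_lstm(self.embed(question))
        _, (h_state, _) = self.h_lstm(self.embed(history))
        fused = torch.cat([img_feat, q_state[-1], h_state[-1]], dim=1)
        return torch.tanh(self.fusion(fused))  # (B, 512) joint representation

# Usage with dummy tensors:
enc = LateFusionEncoder(vocab_size=10000)
joint = enc(torch.randn(2, 4096),
            torch.randint(1, 10000, (2, 12)),
            torch.randint(1, 10000, (2, 80)))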
Feature Extraction: In this part, let's also discuss the model used for object detection and
localization. Mask-RCNN [7] is an extension of Faster-RCNN. Apart from object detection and
classification, Mask-RCNN also produces instance segmentation masks. It introduces a new branch
for instance segmentation which outputs the masks of the detected objects. The instance segmentation
branch is a small fully convolutional network that provides pixel-to-pixel segmentation on the basis
of the Regions of Interest. As detecting and overlaying a mask over an object is much more complex
than just drawing a bounding box around it, Mask-RCNN introduces a new layer called the RoI-Align
layer in place of the RoI-pooling layer. The maskrcnn-benchmark codebase also does several things
really well: nearly everything is exposed as a hyperparameter, it logs extensively so there is lots of
feedback during training, and it produces a single tqdm output for training.
                         Figure 4: Self-Attention Module (Source: SAGAN)
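For illustration only, the snippet below runs a COCO-pretrained Mask-RCNN from torchvision rather than the maskrcnn-benchmark code actually used in this project; it is a hedged sketch of what the detector's outputs (boxes, labels, scores, and per-instance masks) look like.

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Pretrained on COCO; this is torchvision's implementation, not the maskrcnn-benchmark
# code used in this project, and serves only as an illustration of the output format.
model = maskrcnn_resnet50_fpn(pretrained=True).eval()

image = torch.rand(3, 480, 640)  # stand-in for a real RGB image scaled to [0, 1]
with torch.no_grad():
    outputs = model([image])

det = outputs[0]
print(det["boxes"].shape)   # (N, 4) bounding boxes
print(det["labels"].shape)  # (N,) COCO category ids
print(det["scores"].shape)  # (N,) confidence scores
print(det["masks"].shape)   # (N, 1, 480, 640) per-instance soft masks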
Decoders: two decoders are used in the framework; minimal sketches of both follow the descriptions below.
• Generative (LSTM) decoder: where the encoded vector is set as the initial state of the Long Short-
Term Memory (LSTM) RNN language model. During training, we maximize the log-likelihood of the
ground truth answer sequence given its corresponding encoded representation (trained end-to-end).
To evaluate, we use the model's log-likelihood scores to rank candidate answers. Note that this
decoder does not need to score options during training. As a result, such models do not exploit the
biases in option creation and typically underperform models that do, but it is debatable whether
exploiting such biases is really indicative of progress. Moreover, generative decoders are more
practical in that they can actually be deployed in realistic applications.
• Discriminative (softmax) decoder: computes dot product similarity between input encoding and an
LSTM encoding of each of the answer options. These dot products are fed into a softmax to compute
the posterior probability over options. During training, we maximize the log-likelihood of the correct
option. During evaluation, options are simply ranked based on their posterior probabilities.
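The two decoders can be sketched as follows. This is an illustration under assumed dimensions, not the exact project code: the generative decoder is an LSTM language model seeded with the 512-d joint encoding and trained with cross-entropy on the ground-truth answer, while the discriminative decoder scores each of the 100 candidate answers by a dot product with the joint encoding and applies a softmax over the options.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GenerativeDecoder(nn.Module):
    """Sketch: LSTM language model whose initial state is the encoder's joint representation."""
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512, num_layers=2):
        super().__init__()
        self.num_layers = num_layers
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, joint_encoding, answer_in):
        # joint_encoding: (B, 512); answer_in: (B, T) ground-truth answer tokens (teacher forcing)
        h0 = joint_encoding.unsqueeze(0).repeat(self.num_layers, 1, 1)
        c0 = torch.zeros_like(h0)
        out, _ = self.lstm(self.embed(answer_in), (h0, c0))
        return self.out(out)  # (B, T, vocab) logits; train with cross-entropy on the next token

class DiscriminativeDecoder(nn.Module):
    """Sketch: dot-product similarity between the joint encoding and each candidate answer."""
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.opt_lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)

    def forward(self, joint_encoding, options):
        # joint_encoding: (B, 512); options: (B, 100, T) token ids of the candidate answers
        B, N, T = options.shape
        _, (h, _) = self.opt_lstm(self.embed(options.view(B * N, T)))
        opt_enc = h[-1].view(B, N, -1)                                     # (B, 100, 512)
        return torch.bmm(opt_enc, joint_encoding.unsqueeze(2)).squeeze(2)  # (B, 100) scores

# Training: F.cross_entropy(scores, gt_option_index) maximizes the correct option's posterior;
# at test time, candidates are ranked by their softmax probabilities.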
5   Experiments/Results/Discussion
                                    Figure 5: System Architecture
Data splitting: With the data encoded, we can now split it into training and testing sets. The training
set will be used to train the model while the testing set will be used for evaluating its performance on
unseen data. This can be done using a stratified approach, whereby the patterns in the tags are well
represented in the testing set. I split the 83k training images into 80k for training and 3k for validation,
and use the separate 40k images as the test set.
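A minimal sketch of such a split over hypothetical image ids (the actual split is done over the VisDial/COCO image ids):

import random

# Hypothetical list of ~83k image ids; split into 80k train / 3k validation.
image_ids = list(range(83000))
random.seed(230)
random.shuffle(image_ids)
train_ids, val_ids = image_ids[:80000], image_ids[80000:]
# The separate set of ~40k images is held out entirely and used only as the test set.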
                    Figure 6: Result 1                           Figure 7: Result 2
Preprocessing: The data handling for this type of dataset is quite complex and cumbersome, as for
each image there are ten rounds of questions and answers. To meet the deadlines I set for myself, I
extracted VGG16-relu7 features as the input to the language processing models. The dataset was handled
in such a way that for each COCO image id we get its image features and all the questions to iterate on.
This strategy greatly helped in handling such a large and complex dataset. We also spell-correct the
VisDial data, convert digits to words, and remove contractions, before tokenizing with the Python
NLTK [10]. We then construct a dictionary of words that appear at least five times in the train set.
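A simplified sketch of this preprocessing pipeline is shown below; the contraction and digit tables are truncated for illustration, and the two example sentences stand in for the real training data.

import re
from collections import Counter
from nltk.tokenize import word_tokenize  # requires the NLTK 'punkt' tokenizer data

def preprocess(text):
    # Illustrative normalization: lowercase, expand a few contractions, convert digits to words.
    contractions = {"don't": "do not", "it's": "it is", "can't": "cannot"}
    digits = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
              "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}
    text = text.lower()
    for k, v in contractions.items():
        text = text.replace(k, v)
    text = re.sub(r"\d", lambda m: " " + digits[m.group(0)] + " ", text)
    return word_tokenize(text)

# Build a vocabulary of words appearing at least five times in the training questions/answers.
counts = Counter()
for sentence in ["is there a dog ?", "yes , 2 dogs on the beach"]:  # stand-in for the train set
    counts.update(preprocess(sentence))
vocab = {"<pad>": 0, "<unk>": 1}
for word, c in counts.items():
    if c >= 5:  # with the real train set, rare words fall below this threshold
        vocab[word] = len(vocab)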
Tuning Hyperparameters: The learning models are implemented with the Python package Torch [11].
All the model hyperparameters are chosen by early stopping based on the Mean Reciprocal Rank (MRR)
metric. The LSTMs are 2-layered with 512-dim hidden states. The model is trained to learn 300-dim
embeddings for words and images. Word embeddings are shared across the question, history, and
decoder LSTMs. To better tune the performance of the model, the Adam optimizer is used with a
slightly adjusted learning rate. At each iteration, gradients are clipped to [-5, 5] to avoid explosion.
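A minimal sketch of this optimizer and gradient-clipping setup follows; the model and data here are small stand-ins, not the actual encoder-decoder.

import torch
import torch.nn as nn

# Minimal sketch of the optimization setup described above; model and data are stand-ins.
model = nn.LSTM(300, 512, num_layers=2, batch_first=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(3):  # stand-in for iterating over the training set
    inputs = torch.randn(8, 12, 300)
    targets = torch.randn(8, 12, 512)
    optimizer.zero_grad()
    outputs, _ = model(inputs)
    loss = nn.functional.mse_loss(outputs, targets)  # placeholder loss
    loss.backward()
    # Clip each gradient element to [-5, 5] to avoid exploding gradients.
    torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=5.0)
    optimizer.step()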
Evaluation and Performance: One fundamental challenge in dialog systems is evaluation. It is an
open problem in NLP to evaluate answers: metrics like BLEU and ROUGE exist, but they are
known to correlate poorly with human judgement. To address this issue, I used a metric that evaluates
individual responses. At test time, the model is given an image (I), a history (H), and a set of 100
candidate answers. The model ranks the answers and is then evaluated on the Mean Reciprocal Rank
of the human response and on Recall@k, i.e. whether the human response appears in the top-k
ranked responses. For evaluation, the decoders are compared using mean reciprocal rank (MRR),
recall@k for k = {1, 5, 10}, and mean rank. Note that higher is better for MRR and recall@k, while
lower is better for mean rank. As the table shows, the discriminative model significantly outperforms
the generative model. I also found that a model given a richer, more detailed history produces better
results than one without a well-specified history, across different datasets.
              Model             R@1       R@5     R@10      MeanR       MRR       NDCG
    lf-disc-faster-rcnn-x101   0.4617    0.7780   0.8730    4.7545     0.6041     0.5162
    lf-gen-faster-rcnn-x101    0.3620    0.5640   0.6340    19.4458    0.4657     0.5421
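These retrieval metrics are straightforward to compute once the rank of the human response among the 100 candidates is known; a small self-contained sketch with made-up ranks is given below.

import numpy as np

def evaluate_ranks(ranks, ks=(1, 5, 10)):
    """Given the 1-based rank of the human response among the candidates for each
    question, compute Mean Reciprocal Rank, recall@k, and mean rank."""
    ranks = np.asarray(ranks, dtype=np.float64)
    metrics = {"MRR": float(np.mean(1.0 / ranks)), "MeanR": float(np.mean(ranks))}
    for k in ks:
        metrics[f"R@{k}"] = float(np.mean(ranks <= k))
    return metrics

# Example with made-up ranks for five questions:
print(evaluate_ranks([1, 3, 12, 2, 60]))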
Result Analysis: Based on testing of the NLP part, when the question asked of the model is more
complicated, the response becomes less accurate. This uncertainty may be caused by the image
context history itself, or by a lack of training data; to improve this we need to make sure each
trained model gets enough data to train on.
6      Conclusion/Future Work
The purpose of this project is to utilize natural language processing and computer vision models
to efficiently identify the contents of an image and answer follow-up questions about it; this could be
applied further in organizations, schools, hospitals, and military settings. We utilize the COCO dataset
to train the model, and image and language features are fused together to get a combined, more
informative output for detection. We use an encoder-decoder architecture for the fusion, and a
ResNet-based Mask-RCNN model for object detection and localization, which not only localizes an
object but also provides a mask for the localized object. I believe this area can have a huge impact on
visual recognition in industry and academia.
References
[1] A. Das, S. Kottur, K. Gupta, A. Singh, D. Yadav, J. M. F. Moura, D. Parikh, and D. Batra. Visual Dialog. In CVPR, 2017. arXiv:1611.08669.
[2] Apple (2017). Siri. https://www.apple.com/ios/siri/. Accessed: 2017-10-04.
[3] Google (2017). Google Assistant. https://assistant.google.com/. Accessed: 2017-10-04.
[4] Microsoft (2017). Cortana. https://www.microsoft.com/en-us/windows/cortana. Accessed: 2017-10-04.
[5] Bengio, Y., Ducharme, R., Vincent, P., and Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb):1137–1155.
[6] Harris, Z. S. (1954). Distributional structure. Word, 10(2-3):146–162.
[7] maskrcnn-benchmark. https://github.com/facebookresearch/maskrcnn-benchmark.
[8] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common Objects in Context. In ECCV, 2014.
[9] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena. Self-Attention Generative Adversarial Networks. arXiv:1805.08318.
[10] NLTK. http://www.nltk.org/.
[11] Torch. http://torch.ch/.
[12] alicebot.org. Alicebot Technology History. [Online]. Available: http://www.alicebot.org/history/technology.html.
[13] Futurism. "The History of Chatbots." [Online]. Available: https://futurism.com/images/the-historyof-chatbots-infographic/.
[14] Chatbots.org. "Smarterchild." [Online]. Available: https://www.chatbots.org/chatterbot/smarterchild/.
[15] I. N. da Silva, D. H. Spatti, R. A. Flauzino, L. H. B. Liboni, and S. F. d. R. Alves. Artificial Neural Networks: A Practical Course. Springer International Publishing, 2017.
[16] O. Davydova. "7 Types of Artificial Neural Networks for Natural Language Processing." [Online]. Available: https://www.kdnuggets.com/2017/10/7-types-artificial-neural-networks-naturallanguage-processing.html.