Report PDF
on
Warehouse Robot
Bachelor of Technology
in
Computer Engineering
by
Mr. Faisal Alam
Department of Computer Engineering
Dated…………
Declaration
The work presented in the project entitled “Warehouse Robot”, submitted to the Department of Computer
Engineering, Zakir Husain College of Engineering and Technology, Aligarh Muslim University,
Aligarh, for the award of the degree of Bachelor of Technology in Computer Engineering, during the
session 2016-17, is my original work. I have neither plagiarized nor submitted the same work for the
award of any degree.
Date: (Signature)
Place: Arpit Varshney
(Signature)
Khan Saad Bin Hasan
Certificate
This is to certify that the Project Report entitled “Warehouse Robot”, being submitted by “student
name(s)”, in partial fulfillment of the requirements for the award of the degree of Bachelor of
Technology in Computer Engineering, during the session 2016-17, in the Department of Computer
Engineering, Zakir Husain College of Engineering and Technology, Aligarh Muslim University
Aligarh, is a record of the candidate’s own work carried out by him under my (our) supervision and
guidance.
Faisal Alam
Assistant Professor
Department of Computer Engineering
ZHCET, AMU, Aligarh
Table of Contents
Abstract 6
List of Figures 7
List of Tables 8
Chapter 1 Introduction 9
1.1 Motivation 10
1.2 Objectives and Scope 11
1.3 Organization
5.3.1 YOLOV3 25
5.3.2 How It Works 26
5.3.3 YOLOV3 Architecture[33] 26
5.4 Transfer Learning 27
5.4.1 Dataset Preparation
5.5 Object Tracking 28
5.5.1 Path Planning 28
5.5.1.2 A Star Algorithm 29
References 31
ABSTRACT
In recent times, we have seen a huge number of robots being used in warehouses to automate mundane
tasks. This helps reduce operating costs and makes warehouses safer and more efficient. It also takes
the burden off human workers and lets them focus on more creative tasks. However, these robots are not
always intelligent and hence cannot be used in settings other than the ones they were built for.
Intelligent robots are available but are too costly for most warehouses. We are trying to build a robot
that can assist us in transferring goods from one place to another within a storage facility, which will
also help with keeping account of the goods. We want the robot to be autonomous so that we reduce the
amount of workforce needed. Backup systems are needed to ensure the safety of the goods, other robots,
and people. The robot must also be cheap and must be programmable to do multiple tasks if needed.
LIST OF FIGURES
LIST OF TABLES
Chapter-1
Introduction
1.1 Motivation
We have seen the rise of large warehouses in modern times; especially with the growth of online retail
and e-commerce, the size and number of warehouses has grown considerably. Not long ago, warehouses used
manual labor for sorting, managing inventories, and transporting goods, and also for dangerous jobs like
handling hazardous substances, climbing to high places, and entering dangerous areas within warehouses.
This incurred a huge cost on warehouse owners and a huge human cost as well: warehouses could be
dangerous for the workers, and labour was made to do repetitive tasks while their human intelligence
could have been applied to more creative problem solving.
In recent times, however, this has changed considerably and continues to do so. Many companies like
Alibaba and Amazon have robots transporting their goods autonomously with little human supervision.
Many companies now build robots that can move through warehouses vertically, and these new robots are
helping redesign traditional warehouses to provide more compact and efficient ways to store products.
This results in less space being used and potential savings on expensive real estate.
It has become easier in recent times to acquire large fleets of autonomous robots that can do the heavy
lifting, as well as robots that can help with inventory and product management. These robots are
autonomous and can also recharge themselves. This can potentially replace a large percentage of the
warehouse’s workforce, resulting in huge savings and sparing employees from potential harm. Also, unlike
humans, these robots can work for hours on end without tiring and make very few errors. Some studies
suggest that the initial investment in these robots can be recovered within a year or two.
However, there is a tradeoff between cheap robots and robots that are somewhat intelligent. Robots that
are cheap can usually do a limited set of tasks efficiently but will fail when given other tasks. Such
robots use very simple sensors and actuators, which keeps them cheap, but it also makes them very task
specific and in many cases warehouse specific. Hence, warehouses must either ask the robot supplier to
provide robots tailored to their warehouse or build warehouses that suit a particular robot. This is not
suitable for small or old warehouses, since small warehouses might not have enough money to buy specific
robots and old warehouses might not work with new robots.
The other case is that of intelligent robots, which can adapt to different situations and can be made to
do different tasks via programming or software. However, to make even mildly intelligent robots we
require expensive sensors and actuators like LIDARs, infrared sensors, etc. This again might not be
suitable for most warehouses, since the cost blows up when buying a large number of robots. Hence, we
have to find a balance between the two approaches: whether to buy cheap robots that cannot generalize
to many tasks, or robots that can do many tasks but are expensive.
We aim to provide a middle path between the two approaches. Our aim is to build a cheap robot that can
generalize to multiple tasks with little modification. This can help small warehouses as well as old
ones. It will also let warehouses use the same robots for different tasks; hence, they can move robots
around within their warehouses and switch old robots to smaller tasks if needed. We use cheaper sensors,
and alternate sensors wherever possible.
This does lead to another set of problems: the robot is unable to get an accurate estimate of its
position, and we cannot deal with dynamic or fast-moving objects or sudden changes in the environment.
However, this is not always needed in a warehouse-like setting, where the surroundings may not be very
dynamic and exact positions are not always required.
Our first approach was to use Visual SLAM with multiple cameras mounted on the robot providing a
wide-angle view. The images from these cameras can be stitched together and used for Visual SLAM.
However, we were unable to complete this, since we had low computational power onboard and the lag
involved in sending streams to a more powerful computer was far more than was desirable. Hence, we
had to abandon this approach.
We then decided to take the cameras off the robot and use an external camera connected to another
computer with higher computational power. This should work well within a warehouse, since we can mount
cameras on the ceiling or use the streams from security cameras. We detect the robot in the external
camera’s view, compute a path for it using search and planning algorithms, and issue movement commands
based on that path.
Currently, we are building a backup system using ultrasonic sensors and servo motors. We are also
exploring reinforcement learning approaches so that we do not have to explicitly program how the robot
should behave. We also plan to add obstacle avoidance and object detection.
1.3 Organization
Chapter 1 gives an introduction to the problem, our motivation to solve it, and the scope of the solution.
Chapter 2 discusses various approaches that have been used to solve similar problems.
Chapter 3 gives a brief description of the various tools and technologies we have used.
Chapter 4 describes how we have assembled the robot and built the circuit.
Chapter 5 discusses in detail our approach to solving the problem and the various algorithms used.
Chapter 6 outlines the problems we are currently facing and our possible future course of action.
Chapter-2
Literature Review
Gmapping[15] and Hector SLAM[16] are techniques used to create 2D maps, which may be useful but are
not sufficient to get a comprehensive map of the world. Cartographer[17], on the other hand, has a lot
of advantages: it is fast, works in real time, has good documentation available, is easy to set up, and
shows promising results. Apart from LIDAR-based SLAM, vision-based SLAM techniques are also available.
They can work even with monocular cameras, in real time, as shown in [18][19][20]. These approaches
seem promising for building a 3D map of the environment. They promise to be less resource intensive
than LIDAR-based techniques and can even run on mobile phones.
We found the work of Wayve[25], which relies heavily on machine learning techniques. They train their
car in simulation, transfer this learning to the real world, and fine-tune as needed[26][27]. Waymo
uses its custom-built “Carcraft”[28]; TrueVision.ai[29] and CARLA[30] also provide very good simulators.
For visualization, even though RViz is very good and customizable, XVIZ is also available and is
tailored to self-driving cars[31].
iRobot[34] was one of the first companies to come up with innovative commercial floor cleaners. The
Roomba 400 series, similar to the 600 series[38], was one of their first robots; it used very cheap
sensors and a naive model based on moving randomly and hoping the robot would eventually cover the
entire area. It became an instant hit. The iRobot Scooba 450[39] uses iAdapt Responsive Navigation
Technology, a software system with sensors that allows the robot to cover the entire floor section with
multiple passes. The Braava 380t[40], on the other hand, uses the NorthStar Navigation System, which
comes with a stand-alone battery-operated Cube that allows the device to determine its location in
every room and automatically build a map to ensure an efficient cleaning route. The robot can then make
a single pass over every area within its generated map, and once it is done cleaning, it is programmed
to return to its starting position.
The Dyson[36] 360 Eye[37] uses a 360-degree camera to build a map of the room and navigate
efficiently.
2.2.4 Autostore[41]
AutoStore is perhaps the most distinctive of these robot concepts. It is cheap, efficient, and saves a
lot of money. The idea is to have a vertical storage system, thereby utilizing the whole space and not
leaving gaps between shelves. The robots move within the grooves of a grid on top of the vertical
storage, so the need for expensive sensors is minimized. The robots can pick up goods from beneath them
and transport them to the proper exit station.
Chapter-3
Tools and Technologies Used
3.1 Languages and framework
● Keras is an open-source neural-network library written in Python. It is capable of running on
top of TensorFlow, Microsoft Cognitive Toolkit, R, Theano, or PlaidML. Designed to enable fast
experimentation with deep neural networks, it focuses on being user-friendly, modular, and
extensible. We have used Keras to build the model for object detection.
● TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library and is also used for machine
learning applications such as neural networks. We have used TensorFlow for transfer learning.
● Socket programming involves writing programs that enable processes to communicate with each
other across a computer network. We have used Python sockets to provide the video stream and to
offload computational tasks to the remote computer.
● Python is an interpreted, high-level, general-purpose programming language.
● OpenCV is a library of programming functions mainly aimed at real-time computer vision. We have
used it for computer vision tasks like object detection and colour tracking.
● NumPy has been used for large scientific calculations, like the convolution operation, sliding
windows, etc.
● pygame[14] is a free and open-source Python library for making multimedia applications like
games, built on top of the excellent SDL library. Like SDL, pygame is highly portable and runs on
nearly every platform and operating system.
3.4 Materials Used
● BO gear motors[1]: These are very popular motors among hobbyists. They have internal gears
which facilitate controlled application of power.
● L298N-based motor drivers[2]: This is a very versatile motor driver; it uses the popular L298
motor driver IC and has an onboard 5V regulator which can supply an external circuit. It can
control up to 4 DC motors, or 2 DC motors with direction and speed control.
● Power banks[3]: We use 10000 mAh power banks to run the motors as well as to power the Arduino
and Raspberry Pi. Current is supplied to the motors and the Raspberry Pi via 2.1 A ports, and to
the Arduino via a 1.5 A port.
● Arduino Uno[12]: For low-level computing we have used the Arduino Uno, since it has a large user
base and a supportive community and works with a wide variety of sensors and actuators, making it
ideal for our use case.
Fig 3.4 Arduino Uno Fig 3.5 Raspberry pi 3B
Fig 3.6 MG90S Servo Motors Fig 3.7 HCSR04 Ultrasonic Distance Sensor
● HC-SR04[9]: This is a cheap and surprisingly accurate sensor widely used to measure distances.
We use it in a backup system that protects the robot from collisions. It is mounted on the servos,
giving us a 360-degree view.
● Cameras: The Rpi cam[10] has been used since it is made specifically for the Raspberry Pi and
hence is optimized for performance; however, only one such camera can be attached to the rpi, so
we have also used a Logitech C270 webcam[7].
● A normal-sized exam board[5] has been used as the chassis of the robot, jumper wires[4] are
used to make connections between sensors and actuators, and RW 002 off-road wheels[6] are the
wheels we have used.
Fig 3.8 Raspberry pi camera Fig 3.9 Logitech Webcam Fig 3.10 Offroad Wheels
Chapter-4
Assembly and Circuit Diagram
● Wheels are screwed onto the motors, which are fixed to the bottom of the board. 4 such motors
along with 4 wheels have been used, and 2 motor drivers have been fitted to drive the 4 motors.
● The motors are connected to the motor drivers, and wires are taken to the upper part of the
chassis to connect with the Arduino. The Arduino is connected to the Raspberry Pi to enable I2C
communication.
● Cameras are connected to the Raspberry Pi. The rpi connects to WiFi via a mobile hotspot, and
the remote workstation connects to the same hotspot so that the remote computer and the rpi are on
the same network.
● The Arduino is also connected to 2 servo motors and 2 ultrasonic sensors.
● The motor drivers are powered by the same 2.1 A supply, connected in parallel. One power bank
is used to power the motors only, since on starting, the motors drew more current than the power
bank could handle, causing it to shut off. Another power bank is used to power the Arduino and
rpi.
● The circuit diagram, along with the Arduino pin assignments, is shown in the figures.
Fig 4.1 Fritzing Diagram
Fig 4.2 How the networking is done
Table-4.2 Arduino Pin Layout
Chapter-5
Implementation Details
5.1 Problem Solving Approaches
There are two approaches that we are considering to solve this problem. Whichever approach we take, we
must implement two things before we can proceed.
5.1.4 Video Streaming with a Wide Field of View
We need to stream the video from the robot to a computer, since the algorithms may require more
computational power than the rpi can provide; hence we are using a workstation. The field of view of
the stream must be as large as possible, since that helps in better decision making; SLAM algorithms
also require a field of view of ~120 degrees. We also need a stream with as little delay as possible.
We have tried different things to achieve the above, but with little success: we tried image stitching
to combine images from different cameras, but it did not work as expected, and the lag due to video
streaming is much more than we would like.
We are also implementing a backup system that can save the vehicle from any untoward incident if things
go wrong. The Arduino can always act as the backup system: it should be able to stop the vehicle when
it detects an obstacle and, if needed, drive the robot to safety.
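The safety check at the heart of this backup system can be sketched as follows. The threshold value and the command names are assumptions for illustration; on the real robot this logic runs on the Arduino in C++:

```python
# assumed safety margin in centimetres (not the project's tuned value)
SAFETY_DISTANCE_CM = 30


def backup_check(front_cm, rear_cm, moving_forward):
    """Return a safe command given front/rear ultrasonic readings."""
    if moving_forward and front_cm < SAFETY_DISTANCE_CM:
        # obstacle ahead: back away only if the rear is clear, else halt
        return "reverse" if rear_cm >= SAFETY_DISTANCE_CM else "stop"
    if not moving_forward and rear_cm < SAFETY_DISTANCE_CM:
        # obstacle behind while reversing: halt
        return "stop"
    return "continue"
```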
5.2.1 Arduino
● We wrote the code to move the robot forward, backward, right, and left. These are wrapped in
functions, so calling right or left moves the robot accordingly.
● We wrote the code to rotate the servo motors and to get distance data from the HC-SR04 sensors.
The HC-SR04 reading has to be corrected for temperature, because the speed of sound depends on the
temperature of the medium; we have used the table shown to calculate the effect of temperature.
● A backup algorithm has also been implemented. It uses the servo motors and ultrasonic sensors
to get the distances in front of and behind the robot. If an obstacle is detected closer than the
safety distance, the robot is stopped and may backtrack; if another obstacle is detected while
backtracking, it is stopped again. It then looks around by rotating its servo, finds a direction
with no obstacle, rotates itself in that direction, and moves along it, thus keeping the robot
safe.
● The robot can be given commands by the rpi via I2C communication. These commands are used to
decide where the robot should move, and the appropriate motor commands are issued.
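The temperature correction mentioned above follows from the speed of sound in air, approximately 331.3 + 0.606·T m/s for T in degrees Celsius. A sketch of the conversion (the function name is ours; on the robot this runs on the Arduino):

```python
def hcsr04_distance_cm(echo_us, temp_c=20.0):
    """Convert an HC-SR04 echo time (microseconds) to distance in cm.

    The speed of sound varies with air temperature, so the same echo
    time maps to different distances on a cold vs. a warm day.
    """
    # speed of sound in m/s, converted to cm per microsecond
    speed_cm_per_us = (331.3 + 0.606 * temp_c) * 100 / 1e6
    # the pulse travels to the obstacle and back, so halve the product
    return echo_us * speed_cm_per_us / 2
```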
5.2.2 Raspberry Pi
● Sends the stream to the workstation over a UDP connection.
● Receives commands from the workstation and relays them to the Arduino via I2C.
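The command-relay path on the rpi can be sketched as below. The one-byte command encoding, the port, and the I2C address are assumptions; on the real robot the `write` callable would be an `smbus.SMBus(1).write_byte` bound to the Arduino's address:

```python
import socket

# assumed one-byte encoding of movement commands
COMMANDS = {b"F": 1, b"B": 2, b"L": 3, b"R": 4, b"S": 0}


def relay_command(packet, write):
    """Translate a UDP packet into an I2C byte and hand it to `write`."""
    code = COMMANDS.get(packet)
    if code is not None:
        write(code)
    return code


def serve(port=5005, write=print):
    """Listen for workstation commands over UDP and relay each one."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        packet, _ = sock.recvfrom(16)
        relay_command(packet, write)
```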
5.2.3 Workstation
● The workstation establishes a UDP connection with the rpi and receives the stream.
● The stream can be shown in a pygame window, and the user can give commands via the keyboard:
the up, down, right, and left arrows, and space to stop.
● Object detection is implemented in a separate module, which is called to detect objects.
● Images from the different cameras can also be pasted together and shown.
The relevant algorithms and techniques from the above are detailed below.
5.3.1 YOLOV3
You only look once (YOLO) at an image to predict what objects are present and where they are, using a
single convolutional network. Compared to its previous version, YOLOv3 uses a few tricks to improve
training and increase performance, including multi-scale predictions, a better backbone classifier, etc.
5.3.2 How It Works
Prior detection systems repurpose classifiers or localizers to perform detection: they apply the model
to an image at multiple locations and scales, and high-scoring regions of the image are considered
detections. YOLO uses a totally different approach. It applies a single neural network to the full
image. This network divides the image into regions and predicts bounding boxes and probabilities for
each region. These bounding boxes are weighted by the predicted probabilities.
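The "weighted by the predicted probabilities" step can be illustrated with a toy post-processing function. This is an illustrative sketch, not the project's code: each detection row holds a box, an objectness score, and per-class probabilities, and a box is kept only when objectness times its best class probability clears a threshold:

```python
def filter_detections(rows, conf_threshold=0.5):
    """Keep YOLO-style detections whose combined score passes the threshold.

    rows: iterable of (x, y, w, h, objectness, [class probabilities...]).
    Returns a list of (x, y, w, h, best_class_index, score).
    """
    kept = []
    for x, y, w, h, obj, class_probs in rows:
        # pick the most likely class for this box
        best_class = max(range(len(class_probs)), key=lambda i: class_probs[i])
        # weight the box by objectness times the class probability
        score = obj * class_probs[best_class]
        if score >= conf_threshold:
            kept.append((x, y, w, h, best_class, score))
    return kept
```

A full pipeline would follow this with non-maximum suppression to drop overlapping boxes.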
5.4 Transfer Learning
Transfer learning is a useful way to quickly retrain YOLOv3 on new data without retraining the entire
network. We accomplish this by starting from the official YOLOv3 weights and setting the .requires_grad
field to False for each layer whose gradients we do not want to calculate and optimize. We have
identified the classes present in the dataset and are currently preparing our own dataset, which
includes shoes, wheels, boxes, etc.
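The freezing step described above can be sketched with stand-in parameter dicts. In the actual PyTorch-based YOLOv3 code one would iterate over `model.named_parameters()` and set each parameter's `.requires_grad`; here plain dicts stand in so the idea is runnable without the framework:

```python
def freeze_all_but(params, trainable_prefixes):
    """Freeze every parameter whose name lacks a trainable prefix.

    params: mapping of parameter name -> parameter-like dict.
    Only parameters under the given prefixes (e.g. the detection head)
    keep requires_grad=True and are updated by the optimizer.
    """
    for name, param in params.items():
        param["requires_grad"] = any(
            name.startswith(prefix) for prefix in trainable_prefixes
        )
    return params
```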
5.5 Object Tracking
Object tracking is the process of locating a moving object (or multiple objects) over time using a
camera. This technique allows us to track a particular object on the basis of its colour. We use it to
locate the robot in the specified space, so that the robot can move along an optimized path from source
to destination.
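The colour-tracking step can be sketched in pure NumPy: threshold the frame to a binary mask of the robot's marker colour, then take the mask centroid as the robot's position. In the project this is done with OpenCV (e.g. inRange and moments); the HSV bounds here are assumptions:

```python
import numpy as np


def track_color(hsv, lower, upper):
    """Return the (row, col) centroid of pixels in the HSV range, or None."""
    # boolean mask: True where all three channels fall inside the bounds
    mask = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None  # marker colour not visible in this frame
    # centroid of the matching pixels approximates the robot's position
    return float(ys.mean()), float(xs.mean())
```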
Fig 5.5 Optimized path Between Source and Destination
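The path between source and destination shown above comes from the planning step (the A* algorithm listed in Section 5.5.1.2). A minimal grid-based A* sketch, assuming a 4-connected occupancy grid and a Manhattan-distance heuristic:

```python
import heapq


def astar(grid, start, goal):
    """A* over a grid of 0 (free) / 1 (obstacle) cells.

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    # heap entries: (f = g + h, g, cell, path so far)
    open_set = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while open_set:
        _, g, cur, path = heapq.heappop(open_set)
        if cur == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < best_g.get(nxt, float("inf"))):
                best_g[nxt] = g + 1
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None
```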
Chapter-6
Current Challenges and Future Plan
● Implementing a backup algorithm that can help us avoid accidents is one of our top priorities.
We have implemented the algorithm and tested it with the Arduino only; testing with the workstation
and rpi, along with the planning algorithm, remains to be done.
● Lag in the stream: We have tried many different streaming techniques, but most result in
significant lag: up to 10 s with the Logitech camera and less than 1 s with the rpi cam. There
seems to be a tradeoff between video quality, the associated lag, and the smoothness of the video;
we wish to find the optimal balance between these.
● The field of view of each of our cameras is no more than 60 degrees, while we require around
120 degrees; hence we resorted to image stitching, which did not turn out well. We would like to
explore other ways to achieve this.
● We wish to improve our object detection model and to collect data to make it work better on our
own data. Our main objective is to detect obstacles, for which we are currently exploring machine
learning based techniques, such as YOLO object detection, as well as classical techniques, such as
Haar cascades.
● We plan to collect video of the robot moving, so we can use it to train a reinforcement learning
model. This would help us generalize the model without explicitly programming it for many scenarios.
● We are working on implementing the Kalman filter algorithm. It requires data from multiple
sensors, including wheel encoders and inertial measurement units. We have been able to get data
from both; however, mounting them on the robot is a challenge we would like to address.
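The fusion idea behind this plan can be shown with a toy one-dimensional Kalman filter: predict position from wheel-encoder velocity, then correct with a noisy position measurement. The noise values here are illustrative, not tuned for the robot:

```python
def kalman_step(x, p, velocity, z, dt=0.1, q=0.01, r=1.0):
    """One 1-D Kalman predict+update step.

    x, p: current position estimate and its variance.
    velocity: wheel-encoder velocity used for dead reckoning.
    z: noisy position measurement (e.g. from the overhead camera or IMU).
    q, r: process and measurement noise variances (illustrative values).
    Returns (new_estimate, new_variance).
    """
    # predict: dead-reckon from the encoder; uncertainty grows by q
    x_pred = x + velocity * dt
    p_pred = p + q
    # update: blend in the measurement according to the Kalman gain
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1 - k) * p_pred
    return x_new, p_new
```

Each update shrinks the variance, so the estimate comes to trust the fused history more than any single noisy reading.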
REFERENCES
simulation-facilities/537648/ (Aug-10-2019)
[29] https://www.truevision.ai/ (Aug-10-2019)
[30] Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., & Koltun, V. (2017). CARLA: An open urban driving
simulator. arXiv preprint arXiv:1711.03938 .
[31] https://github.com/uber/xviz (Aug-10-2019)
[32] https://github.com/AlexeyAB/darknet (Sep-10-2019)
[33] Redmon, Joseph, and Ali Farhadi. "Yolov3: An incremental improvement." arXiv preprint arXiv:1804.02767
(2018).
[34] Irobot.com, 'iRobot Corporation: We Are The Robot Company', 2015. [Online]. Available:
http://www.irobot.com/.
[35] Neato, 'Neato Robotics | Smartest, Most Powerful, Best Robot Vacuum', 2015. [Online]. Available:
http://www.neatorobotics.com/.
[36] Dyson.com, 'Latest Dyson Vacuum Cleaner Technology | Dyson.com', 2015. [Online]. Available:
http://www.dyson.com/vacuum-cleaners.aspx.
[37] Dyson 360 Eye™ robot, 'Dyson 360 Eye™ robot', 2015. [Online]. Available: https://www.dyson360eye.com/.
[38] Irobot.com, 'iRobot Corporation: We Are The Robot Company', 2015. [Online]. Available:
https://www.irobot.in/600-series.aspx
[39] Irobot.com, 'iRobot Corporation: We Are The Robot Company', 2015. [Online]. Available:
https://store.irobot.com/default/scooba-floor-scrubbing/irobot-scooba-450/S450020.html?cgid=us
[40] Irobot.com, 'iRobot Corporation: We Are The Robot Company', 2015. [Online]. Available:
https://store.irobot.com/default/braava-floor-mopping-irobot-braava-floor-mopping/B380020.html
[41] bastiansolutions, (accessed March 17, 2020) https://www.bastiansolutions.com/blog/what-is-autostore/
[42] ssi-schaefer, (accessed March 17, 2020)
https://www.ssi-schaefer.com/en-us/products/conveying-transport/automated-guided-vehicles/fahrerloses-transportsystem-weasel-53020
[43] exotec, (accessed March 17, 2020) https://www.exotec.com/en/
[44] amazon-robotics, (accessed March 17, 2020) https://www.amazonrobotics.com/#/