Using SqoopTool To Transfer Data
Between Hadoop And MySQL
❖ Sqoop
Sqoop is a tool designed to transfer data between Hadoop and relational
database servers. It is used to import data from relational databases such as
MySQL, Oracle to Hadoop HDFS, and export from the Hadoop file system to
relational databases.
Implementation
● MySQL
i.) Open the Cloudera Terminal and execute the following command in order
to start MySQL server. (Note: The default password is cloudera for root user)
mysql -u root -p
ii.) Creating a database
create database myTutorial
iii.) Creating a Table
(Note: The database must be in use before you create a table.)
iv.) Insert values
● Cloudera
After you exit MySQL, create a folder in Cloudera file system to import the
above MySQL table which was created.
( In the following steps,’myfirstdata’ folder is created in /home/cloudera)
i.) Importing the table using Sqoop
sqoop import connect
jdbc:mysql://youripaddress:3306/<database_name>
--username root --password cloudera
-–table <table_name>
--targetdir=<target_directory> -m 1
Here,
-m specifies the number of mappers
3306 is the default port for MySQL
In our case:
[cloudera@quickstart ~]$ sqoop import --connect
jdbc:mysql://localhost:3306/myTutorial --username root --password cloudera
--table myTable --target-dir=/home/cloudera/myfirstdata -m 1
ii.) Displaying the contents in HDFS
hadoop fs -ls /home/cloudera/myfirstdata
hadoop fs -cat /home/cloudera/myfirstdata/part-m-00000