SQL and Xquery Tutorial For Ibm Db2, Part 1
SQL and Xquery Tutorial For Ibm Db2, Part 1
      Jessica Cao
      Training Tools Developer
      IBM
03 Aug 2006
      This tutorial introduces readers to the Aroma database to apply Structured Query
      Language (SQL) and XML Query (XQuery) knowledge to solve typical business
      questions. Also included are detailed instructions for installing DB2® and the Aroma
      database. This tutorial is Part 1 of the SQL & XQuery tutorial for IBM DB2 series.
      All the examples in this document are based on Aroma, a sample database that
      contains sales data for coffee and tea products sold in stores across the United
      States. Each example consists of three parts:
      For students using this as part of an academic class, obtain from your instructor the
      instructions to connect to the Aroma database and learn about any differences
      between the guide and your local set up.
      This tutorial was written for DB2 Express-C 9 for UNIX®, Linux® and Windows®
      (formerly known as Viper).
      This tutorial also introduces the basics of SQL and suggests some typical business
      questions that IBM DB2 9 can be used to answer.
      System requirements
      To use this tutorial to the fullest, you should have IBM DB2 9 installed. Download
      DB2 Express-C, a no-charge version of DB2 Express Edition for the community that
      offers the same core data features as DB2 Express Edtion and provides a solid base
      to build and deploy applications. You will also need to install the Aroma database,
      which can be downloaded from here.
             3.     Find setup.exe in the unzipped folder and double-click on it. The DB2
                    Setup Launchpad will appear.
             4.     Within the DB2 Setup Launchpad, click on Install a Product from the
                    panel on the left.
             6.     In the installer, click Next to view the License Agreement. You will need to
                    Accept the agreement to continue and click Next.
             8.     Confirm that you plan to "Install DB2 Universal Database Express Edition
                    on this computer and save my settings in a response file" and click Next.
             11. In the "User Information" screen, set the Domain to None -- use local
                 user account using the pull-down menu, use db2admin as the User
                 name, and set a password.
                 (Optionally, you can select the LocalSystem account, but there are
                 limitations introduced if this option is used. Click the Help button to learn
                 more.)
             12. Make sure the checkbox Use the same user name and password for
                 the remaining DB2 services is checked and click Next.
13. Click Next to skip the "Prepare the DB2 tools catalog" screen.
             15. Verify that the Enable operating system security checkbox is checked.
                 Leave the DB2 administrators group as the default value DB2ADMNS and
                 leave DB2 users group as the default value DB2USERS. Click Next.
             16. Click Finish to start the DB2 product installation when you reach the
                 "Start copying files and create response file" screen.
             17. When the installation is complete, a window with the message "Setup is
                 complete" will appear. Click Finish to complete the installation procedure.
             18. A window entitled "DB2 First Steps" will now appear. Click Create profile.
                 Your Web browser will open a page called DB2 First Steps. This tutorial
                 contains many useful links to information on DB2, but it will not be used
                 during this tutorial. Close your Web browser and continue with this
                 tutorial.
             2.     Start the Command Window from the Start menu ==> all Programs ==>
                    IBM DB2 ==> Command Line Tools ==> Command Window.
             4.     Two files, create.txt and report.txt, will be created. Edit report.txt and
                    compare to the expected results below:
                       TABLE                                  RECORDS
                       --------------------- -----------
                       aroma.class                         9
                       aroma.deal                           9
                       aroma.line_items             182
                       aroma.market                     17
                       aroma.orders                      27
                       aroma.period                      821
                       aroma.product                   59
                       aroma.promotion              194
                          aroma.sales                         69941
                          aroma.store                        18
                          aroma.supplier                 9
      XML support
      XML and basic XQueries will be introduced in Part 7 of this series. To support XML,
      the aromadb database was created with the statement:
      Note the specification of UTF-8 as the codeset for the database. The use of XML
      features is restricted to a database that is defined with codeset UTF-8 and has only
      one database partition. When creating your own databases in the future, don't forget
      to include this parameter or your database won't support XML features!
      Another set of tables contain data to understand the purchases of Aroma Coffee and
      Tea Company. This set of tables is referred to as the purchasing group of tables and
      can be used to understand the purchases from suppliers. This group of tables is not
      used in any examples in this tutorial. It only exists in the Appendix to provide a more
      complete picture of the Aroma Coffee and Tea Company.
      The crow's feet in this diagram indicate a one-to-many relationship between two
      tables. For example, each distinct value in the Perkey column of the Period table
      can occur only once in that table but many times in the Sales table. Column names
      in bold are primary-key columns. Column names in italics are foreign-key columns.
      Column names in bold italics are primary and foreign-key columns.
      Sample data from each table is presented in the following sections so you can see
      how these primary-key to foreign-key relationships work.
      A primary key is a unique key that is a part of the definition of the table. A table
      cannot have more than one primary key, and the columns of a primary key cannot
      contain null values. (Null values will be discussed later in detail. If a value is null, that
      means there is a place to store the data, but nothing is there - this is different from a
      blank.) Primary keys are optional and can be defined in CREATE TABLE or ALTER
      TABLE statements. Primary keys are important because there can only be one
      occurrence of each value; the same row will always be selected. There may be a
      couple of cities with the same name; however, selecting the record based on the
      primary key will ensure the same row is always selected.
      A foreign key is specified in the definition of a referential constraint. A table can have
      zero or more foreign keys. The value of the composite foreign key is null if any
      component of the value is null. Foreign keys are optional and can be defined in
      CREATE TABLE statements or ALTER TABLE statements.
      The following table displays the first few rows of the Period table. The primary-key
      column is the Perkey column:
      The following table displays the first few rows of the Product table. The primary key
      is a combination of the Classkey and Prodkey values:
      Note: recall that a primary key is a unique key that is a part of the definition of the
      table. Normally, a primary key is a single column in a table. In this case, the primary
      key is a combination of two columns, which means that there can only be one
      combination like this in a table.
The following table displays the first few rows of the Class table.
      The following table displays the first few rows of the Store table (some columns
      were truncated to fit on the page). The primary-key column is Storekey; Mktkey is a
The following table displays the first few rows of the Market table.
Promotion table
      The following table displays the first few rows of the Promotion table. The
      primary-key column is Promokey.
      The Value column identifies the cash value of a given promotion. The user can
      evaluate the cost of the promotions by summing these values.
      Sales table
      The following table displays the first 20 rows of the Sales table.
      The last column, Comments, stores information about customer feedback using the
      XML data type. XML is introduced in Part 7 of this series.
      The Sales table contains a multipart primary key. Each of the key's five columns is a
      foreign-key reference to the primary key of another table:
      This primary key links the Sales data to the Period, Product, Store, and
      Promotion tables.
      Gold table
      The Gold table contains information for "Gold Card" customers that allow them to
      enjoy various discounts, free gifts, and e-mail reminders of promotional activities.
      The Gold table will be created by the participant in Part 6 of this series, and will have
      the following columns: unique card ID, store key, first name, last name, e-mail, and
      status.
                • SQL was devised specifically as part of the relational theory. You need a
                  language to talk to your database, to retrieve data from it, or to make
                  changes to the data.
                • SQL is a non-navigational language. In other words, with SQL, you don't
                  have to tell DB2 to find a record, read its pointer, and follow that pointer to
                  a related row in the table. Simply tell DB2 what to do and DB2 figures out
                  how to handle the command. It's that simple!
                • You can use SQL to modify aspects of a table without having to take the
                  table offline, add new columns to an existing table, or add business rules
                  to the definition of the table.
Easy
                • What were the weekly sales of Lotta Latte brand coffee in San Jose
                  during last year?
                • What were the average monthly sales of all coffee products in the West
                  during each month of last year?
      Moderately difficult
                • How do the sales of Lotta Latte in San Jose compare with its sales in Los
                  Angeles and New York?
                • How has the monthly market share of Lotta Latte changed during the last
                  two years in all markets?
                • Which suppliers charge the most for bulk tea products?
                • What was the most successful promotion last December in California?
      Very challenging business questions, which DB2 9 can be used to provide
      answers.
                • What were the running totals for Lotta Latte sales during each month of
                  last year?
                • What were the ratios of monthly sales to total sales (expressed as
                  percentages) for Lotta Latte during the same period?
                • Which ten cities had the worst coffee sales in 2004 with regard to dollar
                  sales and quantities sold?
                • Which Aroma stores fall into the top 25 percent in terms of sales revenue
                  for the first quarter of 2005? Which stores fall into the middle 50 percent,
                  and which fall into the bottom 25 percent?
      Numbers
      All numbers have a sign and a precision. The precision is the number of bits or
      digits, excluding the sign.
      Character string
      A character string is a sequence of bytes. The length of the string is the number of
      bytes in the sequence. If the length is zero, the value is called the empty string.
      Graphic string
      A graphic string is a sequence of double-byte character data.
      Binary string
      A binary string is a sequence of bytes. It is used to hold nontraditional data, such as
      pictures. Binary Large OBject (BLOB) is a binary string. See Large Objects (LOBs)
      for more information.
      Datetime values
      Datetime values are representations of dates, times, and timestamps (a character
      string of 14 digits that represents a valid date and time in the form
      yyyyxxddhhmmss). Datetime values can be used in certain arithmetic and string
      operations and are compatible with certain strings; however, they are neither strings
      nor numbers.
      Null value
      The null value is a special value that is distinct from all non-null values. It means the
      absence of any other value for that column in the row. The null value exists for all
      data types.
      XML
      DB2 9 is the industry's first hybrid data server for managing data from both relational
      and pureXML™ structures. DB2 9 has introduced an optimized data storage engine
      for XML data alongside the existing relational engine.
      XML is self-describing in that the meaning of the data is included in the record. The
      Comments column uses the XML data type to store customer comments. See Part 7
      of this series for more details.
      You can run these interactively or you can save your statements in a script. To
      execute these statements, you would click the Play icon in the upper-left corner of
      the graphic (circled in Figure 2, below).
      Statements can alternatively be run from the Command Line Processor (CLP). Open
      the CLP from the Start menu ==> all Programs ==> IBM DB2 ==> Command Line
      Tools ==> Command Line Processor.
      Connecting to a database
      You need to connect to a database before you can use SQL statements to query or
      manipulate data. The CONNECT statement associates a database connection with
      a user name.
      To connect to the aromadb database, type the following command in the DB2
      command line processor:
      If you are using this guide as part of a classroom, ask your instructor for the userid
      and password to connect to the database.
      If no userid and password are required, or if you are using this in a self-study mode
and using your own computer, simply type the following command:
CONNECT TO aromadb
The following message tells you that you have made a successful connection:
Once you are connected, you can start using the database.
      Contents of the DB2 Information Center are updated with each product or fixpak
      release. Be certain you know which version of DB2 you are running to ensure that
      you obtain the correct information. The DB2 Information Center becomes more
      helpful the more that you take advantage of it. As you become familiar with the
      options available, you can narrow your search more effectively and get the answers
      to your questions more efficiently.
                • You can easily access the DB2 Information Center from the Start menu
                  ==> all Programs ==> IBM DB2 ==> Information ==> Information
                  Center.
                • Alternatively, you could use this url:
                  http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp
      Example
      In Part 3 of this series you will encounter an error when you attempt to execute a
      query that incorporated mismatched datatypes. You will receive the following error
message:
      To better understand the meaning of this message, you may want to research the
      details of the error and learn the corrective action you should take.
             2.     Enter the error code from the error message (SQL0401N) in the Search
                    box at the top left corner.
3. Click GO.
             4.     The result set displays two results. The first result is "100% SQL0401",
                    indicating that it is 100% relevant, while the second result is only 8%
                    relevant. Select the first one.
      Section 8. Metadata
      Metadata can be defined as data about the structure of the database itself. This data
      includes information such as lists of the tables, indexes, columns, and more included
      in the database.
      You will often hear metadata called by other names, including catalog data or
      system tables. Metadata is automatically collected by DB2 9 whenever a user takes
      an action to create, delete, or change some aspect of the database.
      The standards of storing data established for conventional SQL do not apply to the
      storage of metadata. Each database provider sets up unique methods to store and
      process the metadata. Knowing how to retrieve this information can be of
      tremendous assistance in working with a database.
      System tables
      DB2 9 stores metadata in nearly 100 system tables. They are all contained in the
      SYSIBM schema. This schema is created automatically each time a database is
      created. System tables also follow a standard naming convention. Each table name
      begins with SYS, as in SYSTABLES, SYSINDEXES, and SYSUSERAUTH. (For this
      reason, user-defined tables cannot begin with the letters SYS.)
      A user retrieves metadata from system tables in much the same way as data from
      regular tables. SQL statements behave the same in both cases. Knowledge of the
      structure of the tables is all that is needed. Given the appropriate authorizations, a
      user can obtain a list of all of the system tables holding metadata by querying the
      database.
Example query
                                                         SELECT name
                                                         FROM sysibm.systables
                                                         WHERE name LIKE 'SYS%';
      Result
      NAME
      SYSATTRIBUTES
      SYSBUFFERPOOLNODES
      SYSBUFFERPOOLS
      ...
      SYSTABLES
      ...
      SYSXSROBJECTS
      You may or may not be able to execute the above query, depending on permissions
      you have been granted by the System Manager and/or Database Administrator.
      If you execute the above query without the WHERE clause, you will obtain a list of
      all tables in the database. The result set will be ordered by Creator (schema) and
      alphabetically by table name within each Creator. The column name "Creator"
      assumes that the schema identified for each table has the same name as its creator.
Example query
      Result
      NAME                                               CREATOR
      CLASS                                              AROMA
      ...                                                AROMA
      SUPPLIER                                           AROMA
      ...                                                ...
      SYSATTRIBUTES                                      SYSIBM
      ...                                                SYSIBM
      SYSXMLSTATS                                        SYSIBM
      ...                                                ...
      By constraining on the name of the schema, you can limit the result set to those
      tables that are part of a given schema.
Example query
      Result
      NAME                                               CREATOR
      CLASS                                              AROMA
      ...                                                AROMA
      SUPPLIER                                           AROMA
      Similarly, you can list information about the columns in a specific table by accessing
      the sysibm.syscolumns system table.
      Example query
      SELECT * FROM sysibm.syscolumns WHERE tbname='CLASS' AND
      tbcreator='AROMA';
      Result
      NAME               TBNAME             TBCREATOR    REMARKS   COLTYPE    ...
      CLASS_DESC CLASS                      AROMA                  CHAR       ...
      CLASS_TYPE         CLASS              AROMA                  CHAR       ...
      CLASSKEY           CLASS              AROMA                  INTEGER    ...
      Examine the contents of this table by executing the command. There are many more
      columns in the result set than shown here.
      Section 9. Summary
      This tutorial introduced the retail group of tables and suggested some typical
      business question that IBM DB2 9 can be used to answer. These tables were
      created to show an environment where:
The most commonly used data types were described in this tutorial:
      Numbers
      Character String
      Graphic String
      Binary String
      Datetime Values
      Null Value
      XML
      Downloads
       Description                                 Name             Size     Download
                                                                             method
      Aroma Database                               Aroma_Data.zip   1MB      HTTP
      Resources
      Learn
         • Read the "DB2 XML evaluation guide" (developerWorks, June 2006), a
           step-by-step tutorial introducing the reader to the DB2 Viper data server on
           Windows platforms using the XML storage and searching (SQL/XML, XQuery)
           capabilities available to support next-generation applications.
         • Check out this article and "Get off to a fast start with DB2 Viper"
           (developerWorks, March 2006).
         • View this article series' "Appendix A" (developerWorks, August 2006).
         • Learn how to "Query DB2 XML data with XQuery" (developerWorks, April
           2006).
         • Learn how to "Query DB2 XML data with SQL" (developerWorks, March 2006).
         • Read the IBM Systems Journal and celebrate 10 years of XML.
         • Refer to the SQL Reference, Vol 1 for additionanl information.
         • Refer to the SQL Reference, Vol 2 for additionanl information.
         • Refer to the DB2 information Center for troubleshooting.
      Get products and technologies
         • Now you can use DB2 for free. Download DB2 Express-C, a no-charge version
           of DB2 Express Edition for the community that offers the same core data
           features as DB2 Express Edtion and provides a solid base to build and deploy
           applications.
      Discuss
         • Participate in the discussion forum for this content.
         • Visit the DB2 9 On-line Support Forum.
      Bruce Creighton
                  Bruce Creighton is a Skills Segment Planner in the Information
                  Management Education Planning and Development department. In this
                  role, he plans investment in educational content and balances the
                  investment between areas where IBM can attain revenue and those
                  where the requirement for skills development are important enough to
                  provide free education.
      Jessica Cao
                       Jessica Cao is an Arts and Science and Computer Science student at
                       McMaster University. She expects to complete her combined honours
                       degree in April 2009. Jessica is working in IBM Toronto lab's DB2
                       Information Management Skills Channel Planning and Enablement
                       Program to take advantage of her interest in programming, editing, and
                       writing.
      Trademarks
      IBM, DB2, Universal Database, OS/2, and pureXML are registered trademarks of
      IBM Corporation in the United States and other countries.
      Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft
      Corporation in the United States, other countries, or both.
      UNIX is a registered trademark of The Open Group in the United States and other
      countries.
      Linux is a registered trademark of Linus Torvalds in the United States, other
      countries, or both.