IPT Chapter 3
Chapter 3
Data Mapping and Exchange: Metadata; Data representation and encoding; XML, DTD,
XML Schemas
3.1 Data Mapping and Exchange
Data Mapping:
         In computing and data management, data mapping is the process of creating
data element mappings between two distinct data models.
         In metadata, the term data element is an atomic unit of data that has precise
meaning or precise semantics. A data element has:
     1. An identification such as a data element name
     2. A clear data element definition
     3. One or more representation terms
     4. Optional enumerated values (codes)
     5. A list of synonyms for the data element
         A data model organizes data elements and standardizes how the data elements
relate to one another. Since data elements document real life people, places and things
and the events between them, the data model represents reality.
Why Do We Need Data Mapping?
Data mapping is used as a first step for a wide variety of data integration tasks
including:
     •    Data transformation or data mediation between data source and destination
     •    Identification of data relationships
     •    Discovery of hidden sensitive data
     •    Consolidation of multiple databases into a single database and identifying
          redundant columns of data for consolidation or elimination
          Data integration involves combining data residing in different sources and
providing users with a unified view of these data. This process becomes significant in a
variety of situations, which include both commercial (when two similar companies need
to merge their databases) and scientific (combining research results from
different bioinformatics repositories) domains.
Data Exchange:
          Data exchange is the process of taking data structured under a source schema
and transforming it into data structured under a target schema, so that the target
data is an accurate representation of the source data.
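The two ideas above can be sketched together in a few lines of Python. This is only an illustration: the field names (cust_name, cust_city, ctry) and the helper exchange() are hypothetical and do not come from any particular system. The dictionary FIELD_MAP plays the role of the data mapping, and exchange() performs the data exchange from the source schema to the target schema.

```python
# Hypothetical source record; the field names are invented for illustration.
source_record = {"cust_name": "Ola Nordmann", "cust_city": "Stavanger", "ctry": "Norway"}

# Data mapping: source data element name -> target data element name.
FIELD_MAP = {"cust_name": "name", "cust_city": "city", "ctry": "country"}


def exchange(record, field_map):
    """Transform a record under the source schema into one under the target schema."""
    missing = [src for src in field_map if src not in record]
    if missing:
        raise KeyError(f"source record lacks mapped fields: {missing}")
    return {target: record[src] for src, target in field_map.items()}


target_record = exchange(source_record, FIELD_MAP)
print(target_record)
```

Keeping the mapping as data (a dictionary) rather than as hard-coded assignments makes it easy to review and to reuse for other records.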
                                                1 ||: By Beya
                          Integrative programming and technologies
3.2 Metadata
        Metadata (metacontent) is defined as the data providing information about one
or more aspects of the data, such as:
•   Means of creation of the data
•   Purpose of the data
•   Time and date of creation
•   Creator or author of the data
•   Location on a computer network where the data were created
Example
A digital image may include metadata that describe the picture size, the color depth, the
image resolution, and the time and date of image creation.
A text document's metadata may contain information about how long the document is,
who the author is, when the document was written, and a short summary of the
document.
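As a small illustration of metadata for a text document, the following Python sketch reads a few facts about a file without looking at its content. The helper describe_file() and the sample text are assumptions made for this example; os.stat() is the standard library call that returns the file's size and modification time.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return a small metadata record: data about the file, not its content."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
    }


# Create a throwaway text document so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("Don't forget me this weekend!")
    sample_path = handle.name

metadata = describe_file(sample_path)
os.remove(sample_path)
print(metadata)
```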
3.3 Introduction to XML
   •   XML stands for Extensible Markup Language
   •   XML is a markup language much like HTML
   •   XML was designed to describe data, not to display data
   •   XML tags are not predefined. You must define your own tags
   •   XML is designed to be self-descriptive
   •   XML is a W3C Recommendation
   •   XML does not DO anything
Difference between XML and HTML
   •   XML is not a replacement for HTML; XML is a complement to HTML.
   •   XML is a software- and hardware-independent tool for carrying information.
   •   XML was designed to describe data, with focus on what data is
   •   HTML was designed to display data, with focus on how data looks
XML Does Not DO Anything:
 The following example is a note to Tove, from Jani, stored as XML:
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
The note above is quite self-descriptive. It has sender and receiver information, and it
also has a heading and a message body.
But still, this XML document does not DO anything. It is just information wrapped in
tags. Someone must write a piece of software to send, receive or display it.
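To make this concrete, here is a minimal sketch of such a piece of software, written in Python with the standard xml.etree.ElementTree module. The function name summarize() and the output format are choices made for this illustration only.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def summarize(note_xml):
    """Read the note and build a one-line summary from its elements."""
    root = ET.fromstring(note_xml)
    heading = root.findtext("heading")
    sender = root.findtext("from")
    receiver = root.findtext("to")
    return f"{heading}: from {sender} to {receiver}"


summary = summarize(NOTE_XML)
print(summary)
```

The XML carries the data; the decision about what to do with it (here, printing a summary) lives entirely in the program.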
How Can XML be used?
XML is used in many aspects of web development, often to simplify data storage and
sharing.
   1. XML Separates Data from HTML
   2. XML Simplifies Data Sharing
   3. XML Simplifies Data Transport
   4. XML Simplifies Platform Changes
XML element
   •   An XML document contains XML Elements.
   •   An XML element is everything from (including) the element's start tag to
       (including) the element's end tag.
   •   An element can contain:
          o other elements
          o text
          o attributes
          o or a mix of all of the above...
Empty XML Elements
An alternative syntax can be used for XML elements with no content: Instead of writing
a book element (with no content) like this:
<book></book>
It can be written like this:
<book />
This sort of element syntax is called self-closing.
      Note: Only the characters "<" and "&" are strictly illegal in XML character data;
      write them as the entity references &lt; and &amp;. The greater-than character
      (">") is legal, but it is a good habit to replace it with &gt;.
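The note above can be checked with a short Python sketch. It uses escape() from the standard xml.sax.saxutils module, which replaces "&", "<" and ">" with entity references; the sample text and the <message> element are made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "if salary < 1000 & bonus > 0 then"

# Unescaped, the "<" and "&" make the document ill-formed.
try:
    ET.fromstring(f"<message>{raw_text}</message>")
    unescaped_parses = True
except ET.ParseError:
    unescaped_parses = False

# Escaped, the same text is legal and the parser returns the original characters.
escaped = escape(raw_text)
parsed_text = ET.fromstring(f"<message>{escaped}</message>").text

print(unescaped_parses)
print(escaped)
print(parsed_text)
```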
7. Comments in XML
      •    The syntax for writing comments in XML is similar to that of HTML.
      •    <!-- This is a comment -->
8. White-space is preserved in XML
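      •    HTML truncates multiple white-space characters to one single white-space,
           so "Hello      Tove" in an HTML source is displayed as "Hello Tove".
      •    In XML, the white-space in a document is not truncated.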
         Like most XML documents, this one starts with an XML declaration, <?xml
version="1.0" encoding="UTF-8"?>. This XML declaration indicates that we're using
XML version 1.0 and the UTF-8 character encoding.
         The XML declaration, <?xml?>, uses two attributes, version and encoding, to
set the version of XML and the character set we're using. Next we create a new XML
element named <document>. XML tags themselves always start with < and end with
>. Then we store other elements, or text data, in our <document> element as we
wish.
Character Encodings: ASCII, Unicode, and UCS
         The characters in an XML document are stored using numeric codes. That can
be an issue, because different character sets use different codes, which means an XML
processor might have problems trying to read an XML document that uses a character
set (called a character encoding) other than the one it expects.
Which character sets are supported in XML? ASCII? Unicode? UCS?
There are many character encodings that an XML processor can support, such as the
following:
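   •    US-ASCII: U.S. ASCII
   •    UTF-8: Unicode in a variable-width encoding of 1 to 4 bytes per character
   •    UTF-16: Unicode in a variable-width encoding of 2 or 4 bytes per character
   •    ISO-10646-UCS-2: UCS in a fixed 2 bytes per character
   •    ISO-10646-UCS-4: UCS in a fixed 4 bytes per character
   •    ISO-2022-JP: Japanese
   •    ISO-2022-CN: Chinese
   •    ISO-8859-5: ASCII and Cyrillic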
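The point that different character sets use different numeric codes can be seen directly in Python. The sketch below encodes one Cyrillic letter under three of the encodings listed above; the choice of letter is arbitrary, and the codec names are the ones Python's standard library uses.

```python
text = "Ж"  # Cyrillic capital letter ZHE, code point U+0416

# The same character becomes different bytes under different encodings.
encoded = {name: text.encode(name) for name in ("utf-8", "utf-16-be", "iso-8859-5")}
for name, data in encoded.items():
    print(f"{name:12} {data.hex()}")

# US-ASCII has no code for this character at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Reading UTF-8 bytes with the wrong encoding yields different characters,
# which is the problem an XML processor faces without an encoding declaration.
misread = encoded["utf-8"].decode("iso-8859-1")
print(ascii_ok, misread != text)
```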
        1. Cascading Style Sheets (CSS),        which    you   can   also   use   with   HTML
           documents
        2. Extensible Stylesheet Language (XSL) style sheets, designed to be used
           only with XML documents
Example 3: (example3.xml)
     <heading>
          Hello From XML
     </heading>
     <message>
          This is an XML document!
     </message>
</document>
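           <SCRIPT LANGUAGE="JavaScript">
                  function getData()
                  {
                         // Legacy Internet Explorer access to an XML data island
                         xmldoc= document.all("firstXML").XMLDocument;
                         nodeDoc = xmldoc.documentElement;
                         nodeHeading = nodeDoc.firstChild;
                         // Assumed completion: show the heading text in the "message" DIV
                         document.all("message").innerHTML = nodeHeading.firstChild.nodeValue;
                  }
           </SCRIPT>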
     <BODY>
          <CENTER>
                 <H1>
                        Retrieving data from an XML document
</H1>
                 <DIV ID="message"></DIV>
                 <P>
                 <INPUT TYPE="BUTTON" VALUE="Read the heading" ONCLICK="getData()">
           </CENTER>
       </BODY>
</HTML>
          As an example, you can see how to add a DTD to our XML document.
DTDs can be separate documents, or they can be built into an XML document, as
we've done here, using a special declaration named <!DOCTYPE>.
An XML Document with a DTD (example4.xml)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="css1.css"?>
<!DOCTYPE document
[
     <!ELEMENT document (heading, message)>
     <!ELEMENT heading (#PCDATA)>
     <!ELEMENT message (#PCDATA)>
]>
<document>
       <heading>
             Hello From XML
       </heading>
       <message>
             This is an XML document!
       </message>
</document>
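The content model declared in this DTD, document (heading, message), can be mirrored by a small hand-written check in Python. Note the assumption: the standard xml.etree.ElementTree parser reads the document but does not validate it against the DTD, so the function follows_content_model() below is only an illustrative stand-in for what a validating parser would enforce.

```python
import xml.etree.ElementTree as ET

EXAMPLE4 = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document
[
    <!ELEMENT document (heading, message)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT message (#PCDATA)>
]>
<document>
    <heading>Hello From XML</heading>
    <message>This is an XML document!</message>
</document>"""

# Same elements, but in the wrong order for the declared content model.
SWAPPED = b"""<document>
    <message>This is an XML document!</message>
    <heading>Hello From XML</heading>
</document>"""


def follows_content_model(xml_bytes):
    """Check: root is <document>, children are heading then message, text only."""
    root = ET.fromstring(xml_bytes)
    child_names = [child.tag for child in root]
    text_only = all(len(child) == 0 for child in root)  # (#PCDATA): no sub-elements
    return root.tag == "document" and child_names == ["heading", "message"] and text_only


print(follows_content_model(EXAMPLE4))
print(follows_content_model(SWAPPED))
```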
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
•   The XML Schema language, also referred to as XML Schema Definition (XSD),
    describes the structure of an XML document.
•   Like a DTD, it defines the legal building blocks (elements and attributes) of an
    XML document.
•   Defines which elements are child elements
•   Defines the number and order of child elements
•   Defines whether an element is empty or can include text
•   Defines data types for elements and attributes
•   Defines default and fixed values for elements and attributes
Example
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
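One practical effect of these type declarations is that a program can convert element text into typed values. The Python sketch below imitates that by hand; it is not a schema processor. The <person> wrapper element and the two lookup tables are assumptions added for the example.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hand-written stand-ins for three built-in XSD types.
CONVERTERS = {"xs:string": str, "xs:integer": int, "xs:date": date.fromisoformat}

# Types copied from the simple element definitions above.
DECLARED_TYPES = {"lastname": "xs:string", "age": "xs:integer", "dateborn": "xs:date"}

# The <person> wrapper is hypothetical; it only gives the fragment a single root.
FRAGMENT = """<person>
  <lastname>Refsnes</lastname>
  <age>36</age>
  <dateborn>1970-03-27</dateborn>
</person>"""

root = ET.fromstring(FRAGMENT)
typed = {child.tag: CONVERTERS[DECLARED_TYPES[child.tag]](child.text) for child in root}
print(typed)
```

If the age element contained text such as "thirty-six", the int conversion would raise ValueError, which is the kind of error a validating parser reports against xs:integer.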
A complex XML element, "description", which contains both elements and text:
<description>
It happened on <date lang="norwegian">03.03.99</date>
</description>
XSD Elements Only
How to Define a Complex Element Using XML Schema
Look at this complex XML element, "employee", which contains only other elements:
<employee>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>
The "employee" element can be declared directly by naming the element, like this:
<xs:element name="employee">
  <xs:complexType>
     <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
     </xs:sequence>
  </xs:complexType>
</xs:element>
      If you use the method described above, only the "employee" element can use
the specified complex type. Note that the child elements, "firstname" and "lastname",
are surrounded by the <sequence> indicator. This means that the child elements must
appear in the same order as they are declared. Alternatively, the "employee" element
can have a type attribute that refers to the name of a separately declared complex
type, which other elements can then reuse.
XSD Empty Elements
An empty complex element cannot have contents, only attributes.
An empty XML element:
<product prodid="1345" />
The "product" element can be declared compactly, like this:
<xs:element name="product">
  <xs:complexType>
     <xs:attribute name="prodid" type="xs:positiveInteger"/>
  </xs:complexType>
</xs:element>
XSD Indicators
We can control HOW elements are to be used in documents with indicators.
Order Indicators
Order indicators are used to define the order of the elements.
Order indicators are:
•        All
•        Choice
•        Sequence
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and that
each child element must occur only once:
<xs:element name="person">
    <xs:complexType>
      <xs:all>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
      </xs:all>
    </xs:complexType>
</xs:element>
Choice Indicator
The <choice> indicator specifies that either one child element or another can occur:
<xs:element name="person">
    <xs:complexType>
      <xs:choice>
          <xs:element name="employee" type="employee"/>
          <xs:element name="member" type="member"/>
      </xs:choice>
    </xs:complexType>
</xs:element>
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a specific
order:
<xs:element name="person">
    <xs:complexType>
      <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
</xs:element>
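The difference between the three order indicators can be sketched in Python as three small checks on the list of child element names. This is a simplification under stated assumptions: every child uses the default occurrence of exactly once, the same two child names (firstname, lastname) are reused for all three indicators, and the function names are invented for this example.

```python
import xml.etree.ElementTree as ET

DECLARED = ["firstname", "lastname"]


def child_names(xml_text):
    """Return the names of the root element's children in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


def satisfies_all(names, declared=DECLARED):
    # <xs:all>: any order, each declared child exactly once.
    return sorted(names) == sorted(declared)


def satisfies_choice(names, declared=DECLARED):
    # <xs:choice>: exactly one of the declared children.
    return len(names) == 1 and names[0] in declared


def satisfies_sequence(names, declared=DECLARED):
    # <xs:sequence>: all declared children, in the declared order.
    return names == declared


in_order = child_names("<person><firstname>John</firstname><lastname>Smith</lastname></person>")
swapped = child_names("<person><lastname>Smith</lastname><firstname>John</firstname></person>")
single = child_names("<person><firstname>John</firstname></person>")

for label, names in [("in order", in_order), ("swapped", swapped), ("single", single)]:
    print(label, satisfies_all(names), satisfies_choice(names), satisfies_sequence(names))
```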
An XML Document
Let's have a look at this XML document called "shiporder.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923">
      <orderperson>John Smith</orderperson>
      <shipto>
        <name>Ola Nordmann</name>
        <address>Langgt 23</address>
        <city>4000 Stavanger</city>
        <country>Norway</country>
      </shipto>
 </shiporder>
         The XML document above consists of a root element, "shiporder", that contains
a required attribute called "orderid". The "shiporder" element contains two child
elements: "orderperson" and "shipto".
Create an XML Schema
         Now we want to create a schema for the XML document above. We start by
opening a new file that we will call "shiporder.xsd". To create the schema we could
simply follow the structure in the XML document and define each element as we find
it. We will start with the standard XML declaration followed by the xs:schema element
that defines a schema:
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
...
</xs:schema>
      In the schema above we use the standard namespace (xs), and the URI
associated with this namespace is the Schema language definition, which has the
standard value of http://www.w3.org/2001/XMLSchema.
      Next, we have to define the "shiporder" element. This element has an attribute
and it contains other elements, therefore we consider it as a complex type. The child
elements of the "shiporder" element are surrounded by an xs:sequence element that
defines an ordered sequence of sub-elements:
<xs:element name="shiporder">
  <xs:complexType>
     <xs:sequence>
        ...
     </xs:sequence>
  </xs:complexType>
</xs:element>
      Then we have to define the "orderperson" element as a simple type (because it
does not contain any attributes or other elements). The type (xs:string) is prefixed
with the namespace prefix associated with XML Schema, which indicates a predefined
schema data type:
<xs:element name="orderperson" type="xs:string"/>
Next, we have to define the element that is of a complex type, "shipto":
<xs:element name="shipto">
  <xs:complexType>
     <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="address" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="country" type="xs:string"/>
     </xs:sequence>
  </xs:complexType>
</xs:element>
We can now declare the attribute of the "shiporder" element. Since this is a required
attribute we specify use="required".
Note: The attribute declarations must always come last:
<xs:attribute name="orderid" type="xs:string" use="required"/>
The complete schema file, "shiporder.xsd", combines the declarations above:
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
  <xs:complexType>
     <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
             <xs:sequence>
                <xs:element name="name" type="xs:string"/>
                <xs:element name="address" type="xs:string"/>
                <xs:element name="city" type="xs:string"/>
                <xs:element name="country" type="xs:string"/>
             </xs:sequence>
          </xs:complexType>
        </xs:element>
     </xs:sequence>
     <xs:attribute name="orderid" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>
</xs:schema>
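Because a schema is itself an XML document, a program can read it like any other XML file. The Python sketch below lists the elements and required attributes declared for "shiporder". It assumes the schema text shown in the string SHIPORDER_XSD (assembled from the declarations in this section) and uses the standard xml.etree.ElementTree module, where the xs prefix must be bound to the XML Schema namespace URI in a dictionary.

```python
import xml.etree.ElementTree as ET

# The prefix is arbitrary; what identifies XML Schema is the namespace URI.
XS = {"xs": "http://www.w3.org/2001/XMLSchema"}

SHIPORDER_XSD = b"""<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="orderperson" type="xs:string"/>
      <xs:element name="shipto">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="name" type="xs:string"/>
            <xs:element name="address" type="xs:string"/>
            <xs:element name="city" type="xs:string"/>
            <xs:element name="country" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="orderid" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>
</xs:schema>"""

schema = ET.fromstring(SHIPORDER_XSD)

# Every xs:element declaration, in document order, with its type (if any).
declared = [(el.get("name"), el.get("type")) for el in schema.iterfind(".//xs:element", XS)]

# Attributes that the schema marks as required.
required_attributes = [
    attr.get("name")
    for attr in schema.iterfind(".//xs:attribute", XS)
    if attr.get("use") == "required"
]

print(schema.tag)
print(declared)
print(required_attributes)
```

Elements without a type attribute ("shiporder" and "shipto") are the ones whose complex type is declared inline.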
An XSD Example
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<xs:element name="note">
<xs:complexType>
    <xs:sequence>
         <xs:element name="to" type="xs:string"/>
         <xs:element name="from" type="xs:string"/>
         <xs:element name="heading" type="xs:string"/>
         <xs:element name="body" type="xs:string"/>
    </xs:sequence>
</xs:complexType>
</xs:element>
The Schema above is interpreted like this:
•       <xs:element name="note"> defines the element called "note"
•       <xs:complexType> the "note" element is a complex type
•       <xs:sequence> the complex type is a sequence of elements
•       <xs:element name="to" type="xs:string"> the element "to" is of type string (text)
•       <xs:element name="from" type="xs:string"> the element "from" is of type string
•       <xs:element name="heading" type="xs:string"> the element "heading" is of type
        string
•       <xs:element name="body" type="xs:string"> the element "body" is of type string
Example
<html>
<body>
 <span id="to"></span>
<span id="from"></span>
<span id="message"></span>
<script>
if (window.XMLHttpRequest)
  {// code for IE7+, Firefox, Chrome, Opera, Safari
  xmlhttp=new XMLHttpRequest();
  }
else
  {// code for IE6, IE5
     xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
     }
xmlhttp.open("GET","note.xml",false);
xmlhttp.send();
xmlDoc=xmlhttp.responseXML;
document.getElementById("to").innerHTML=
           xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue;
document.getElementById("from").innerHTML=
           xmlDoc.getElementsByTagName("from")[0].childNodes[0].nodeValue;
// note.xml stores the message text in its body element, not in a "message" element
document.getElementById("message").innerHTML=
           xmlDoc.getElementsByTagName("body")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
Important Note!
To extract the text "Tove" from the <to> element in the XML file above
("note.xml"), the syntax is: getElementsByTagName("to")[0].childNodes[0].nodeValue
Notice that even if the XML file contains only ONE <to> element you still have to
specify the index [0]. This is because the getElementsByTagName() method
returns an array-like list of all matching nodes.
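The same DOM calls are available outside the browser. As a sketch, Python's standard xml.dom.minidom module offers getElementsByTagName(), childNodes and nodeValue with the same meaning, so the indexing rule can be tried without a web page. The note is embedded as a string here instead of being loaded from "note.xml".

```python
from xml.dom import minidom

NOTE_XML = (
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body></note>"
)

doc = minidom.parseString(NOTE_XML)

# getElementsByTagName() returns a list of nodes, even for a single match.
to_nodes = doc.getElementsByTagName("to")
match_count = len(to_nodes)

# [0] selects the <to> element; childNodes[0] is its text node.
to_text = to_nodes[0].childNodes[0].nodeValue
body_text = doc.getElementsByTagName("body")[0].childNodes[0].nodeValue

print(match_count, to_text, body_text)
```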