Web Designing – 2 (405)
Unit 1: Introduction to XML
1.1 Characteristic and Use of XML
1.2 XML syntax (Declaration, Tags, elements)
1.3 root element, case sensitivity
1.4 XML document:
1.4.1 Document Prolog Section
1.4.2 Document element section
1.5 XML declaration and rules of declaration.
What is XML?
▪      XML stands for Extensible Markup Language. It is a textual data format that
       applications use to communicate (send and receive data) with other applications.
▪      It is a text-based markup language derived from Standard Generalized Markup Language
       (SGML).
▪      Being a markup language, it uses tags.
▪      It is used to define and store data in a shareable manner. XML supports information
       exchange between computer systems such as websites, databases, and third-party
       applications.
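As a small illustration of XML as tagged, shareable data, the sketch below reads a short document with Python's standard xml.etree.ElementTree module. The <friendlist> and <friend> tags and the names stored in them are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the tag names are our own invention.
DOCUMENT = """
<friendlist>
    <friend>Raj</friend>
    <friend>Ram</friend>
</friendlist>
"""

root = ET.fromstring(DOCUMENT)                       # parse the text into a tree
friends = [f.text for f in root.findall("friend")]   # text inside each <friend>

print(root.tag)    # name of the root element
print(friends)     # the stored values
```

The program never needs to know who wrote the file; the tags alone tell it where each piece of data starts and ends.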
HTML vs XML
 HTML                                           XML
 Only pre-defined tags can be used              Tags are defined by the author
                                                (user-defined tags)
 Ex. <p> <b> <br>                               Ex. <friendlist> <channel>
 Not case sensitive                             Case sensitive
Why XML?
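In the real world, XML is used to build GUIs and Java-based web applications, and to make a
website easier to search.
Consider two people: Raj speaks English, Hindi and Gujarati; Ram speaks Chinese, Korean and
English. They can only talk in a language both understand, so here the data format is English.
In the same way, Application1 and Application2 need a common format to exchange data.
Here, the data format may be JSON, XML, etc.
Note: JSON (JavaScript Object Notation) is a data interchange format that uses human-readable
text to store and transmit data.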
▪     Data exchange (Platform & language independent)
By: Dr. Ami Desai                                                                 1|Page
    XML allows data to be exchanged between databases, user desktops, and third-party
    applications on any platform and in any language.
▪   Data structure
    XML uses tags to define the structure and meaning of data, such as the beginning and
    end of a paragraph or the location of an image.
▪   Data sharing
    XML is a flexible way to create information formats and share structured data
    electronically.
▪   Data readability
    XML is designed to be readable by both humans and machines.
▪   Data customization
    XML is highly customizable and can be used to create different content types, such as
    web, print, and mobile content.
▪   Web searching
    Search tools can use XML tags to make searches more accurate. For example, a search
    can be limited to documents in which a specific tag, such as the author's name,
    contains the search term.
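The data exchange point can be sketched as two small functions that stand in for two applications. This is only an illustration: the function names, the <student> layout and the sample values are assumptions made for the example, and both "applications" run in one Python script.

```python
import xml.etree.ElementTree as ET

def export_student(name: str, age: int) -> bytes:
    """Application 1: turn its own data into XML bytes."""
    student = ET.Element("student")
    ET.SubElement(student, "name").text = name
    ET.SubElement(student, "age").text = str(age)
    return ET.tostring(student, encoding="utf-8")

def import_student(payload: bytes) -> dict:
    """Application 2: read the XML without knowing how it was produced."""
    student = ET.fromstring(payload)
    return {
        "name": student.findtext("name"),
        "age": int(student.findtext("age")),
    }

payload = export_student("Ravi", 21)   # what travels between the applications
received = import_student(payload)
print(received)
```

The receiving side depends only on the agreed tag names, not on the platform or language of the sender.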
Uses or Advantages of XML
XML is widely used in web development. It also simplifies data storage and data sharing.
The main features or advantages of XML are given below.
▪   XML separates data from HTML
    o If we need to display dynamic data in our HTML document, it will take a lot of work
       to edit the HTML each time the data changes.
    o With XML, data can be stored in separate XML files. This way we can focus on using
       HTML/CSS for display and layout, and be sure that changes in the underlying data
       will not require any changes to the HTML.
    o With a few lines of JavaScript code, we can read an external XML file and update the
       data content of our web page.
▪   XML simplifies data sharing
    o In the real world, computer systems and databases contain data in incompatible
       formats.
    o XML data is stored in plain text format (not in a database). This provides a
       software- and hardware-independent way of storing data.
    o    This makes it much easier to create data that can be shared by different applications.
▪   XML simplifies data transport (connecting different applications)
    o One of the most time-consuming challenges for developers is to exchange data
       between incompatible systems over the Internet.
    o Exchanging data as XML greatly reduces this complexity, since the data can be read
       by different incompatible applications.
▪   XML simplifies platform changes
    o XML data is stored in text format. This makes it easier to expand or upgrade to new
       operating systems, new applications, or new browsers, without losing data.
▪   XML increases data availability
    o Different applications can access our data, not only in HTML pages, but also from
       XML data sources.
    o With XML, your data can be available to all kinds of "reading machines" (handheld
       computers, voice machines, news feeds, etc.), which also makes it more accessible
       to blind people or people with other disabilities.
▪   XML can be used to create new internet languages
    o A lot of new Internet languages are created with XML.
    o Here are some examples:
                ✓ XHTML
                ✓ WSDL (Web Services Description Language) for describing available web
                    services
                ✓ WML (Wireless Markup Language), used with WAP (Wireless Application
                    Protocol), as a markup language for handheld devices
                ✓ RSS (Really Simple Syndication) for news feeds
                ✓ RDF (Resource Description Framework) and OWL (Web Ontology
                    Language) for describing resources and ontology
                ✓ SMIL (Synchronized Multimedia Integration Language) for describing
                    multimedia for the web
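The first advantage in the list, keeping data separate from HTML, can be sketched as follows. The notes mention JavaScript for this job; the same idea is shown here in Python. The <channels> data, the render_list function and the HTML layout are all assumptions made for this example.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical data; in practice this would live in its own .xml file.
CHANNELS_XML = """
<channels>
    <channel>Sony</channel>
    <channel>News &amp; Weather</channel>
</channels>
"""

def render_list(xml_text: str) -> str:
    """Build an HTML list from the XML data; the layout lives only here."""
    root = ET.fromstring(xml_text)
    items = "".join(
        f"<li>{html.escape(channel.text or '')}</li>"
        for channel in root.findall("channel")
    )
    return f"<ul>{items}</ul>"

page_fragment = render_list(CHANNELS_XML)
print(page_fragment)
```

When the data changes, only the XML is edited; the code that produces the HTML stays the same.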
Characteristics of XML
▪   There are three important characteristics of XML that make it useful in a variety of
    systems and solutions:
      o XML is extensible: XML allows you to create your own self-descriptive tags, or
         language, that suits your application.
      o XML carries the data, does not present it: XML allows you to store the data
         irrespective of how it will be presented. An XML document only carries the data;
         how the data is displayed is decided elsewhere.
      o XML is a public standard: XML was developed by an organization called the
         World Wide Web Consortium (W3C) and is available as an open standard.
XML also has some other characteristics:
   1. XML is a structured format, which means that we can define exactly how the data is
      to be arranged, organized and expressed within the file. When we are given a file, we
      can validate that it conforms to a specific structure, prior to importing the data.
   2. XML is a described format, which means that within the text file, every item of data
      has a name that is both human- and machine-readable as well as being uniquely
      identifiable. Ex: <channel>sony</channel>
   3. XML can easily describe hierarchical data and the relationships between data.
             <youtube>
               <channel>sony</channel>
               <subscriber>1k</subscriber>
             </youtube>
   4. XML can be validated, which means we can provide a second XML file – an XML
      Schema Definition file – that describes exactly how the XML data file should be
      structured.
   5. XML is a strongly-typed format, which means the schema definition file specifies
      the data type of each element. When importing the data, the application can check the
      schema definition to identify the data type to import it as.
   6. XML is a global format. With XML Schema data types there is only one way to express
      a number in an XML file (US number format, with a period as the decimal separator)
      and only one way to express a date (YYYY-MM-DD). The most common types are
      xs:string, xs:decimal and xs:integer.
   7. XML is a standard format. It allows different applications to read, write,
      understand and validate the same XML files, so that data can be shared between
      applications reliably.
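The hierarchical nature of XML (item 3 above) can be sketched by walking a parsed document and recording how deep each element sits. The document below is a slightly extended variant of the <youtube> example; the extra <name> element and the outline function are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Variant of the <youtube> example with one more level of nesting.
DOCUMENT = """
<youtube>
    <channel>
        <name>sony</name>
        <subscriber>1k</subscriber>
    </channel>
</youtube>
"""

def outline(element, depth=0):
    """Return (depth, tag, text) rows for an element and all its descendants."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, text)]
    for child in element:               # iterating an element yields its children
        rows.extend(outline(child, depth + 1))
    return rows

rows = outline(ET.fromstring(DOCUMENT))
for depth, tag, text in rows:
    print("  " * depth + tag, text)
```

Each row shows the parent-child relationship through its depth: <channel> belongs to <youtube>, and <name> and <subscriber> belong to <channel>.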
Structure of XML
   ▪   An XML document is a well-organized collection of components and associated markup.
   ▪   An XML document can hold a wide range of information, for instance a database
       of numbers or a mathematical equation.
An XML document has 2 sections:
   1) Document Prolog            2) Document Elements
   1) Document Prolog: It contains the XML declaration and the document type declaration.
   These components must appear before the root element of the document, and the XML
   declaration, if present, must be on the very first line. The prolog is optional.
   1. Declaration: a line at the very start of the document that provides basic
      information about the format of an XML document, e.g.
      <?xml version="1.0" encoding="utf-8"?>
   ▪ version is the version of XML used in the document. The latest version number can be
      found at http://w3.org.
      Syntax for XML Declaration
<?xml
version="version_number"
encoding="encoding_declaration"
standalone="standalone_status"
?>
E.g. <?xml version = "1.0" encoding = "UTF-8" standalone = "no" ?>
Set standalone="no" when the document relies on external declarations, such as an external
DTD. The default is "no".
An XML declaration must follow these rules −
   ▪   If the XML declaration is present, it must be the first line of the XML document;
       nothing, not even whitespace, may come before it.
   ▪   If the XML declaration is included, it must contain the version number attribute.
   ▪   The parameter names and values are case-sensitive.
   ▪   The names are always in lower case.
   ▪    The order of placing the parameters is important. The correct order is: version,
        encoding and standalone.
   ▪    Either single or double quotes may be used.
   ▪    The XML declaration has no closing tag, i.e. there is no </?xml>.
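Several of these rules can be tried out with a parser. The sketch below feeds a few declarations to Python's standard xml.etree.ElementTree and records whether each document is accepted. The is_well_formed helper, the <note/> element and the test cases are assumptions made for this example, and other parsers may word their errors differently.

```python
import xml.etree.ElementTree as ET

def is_well_formed(data: bytes) -> bool:
    """Return True if the parser accepts the document, False on a syntax error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# Each case pairs a declaration with a minimal root element <note/>.
CASES = {
    "valid declaration": b'<?xml version="1.0" encoding="UTF-8" standalone="no"?><note/>',
    "single quotes": b"<?xml version='1.0'?><note/>",
    "not on first line": b'\n<?xml version="1.0"?><note/>',
    "version missing": b'<?xml encoding="UTF-8"?><note/>',
    "wrong order": b'<?xml encoding="UTF-8" version="1.0"?><note/>',
}

results = {label: is_well_formed(doc) for label, doc in CASES.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

The first two cases follow the rules; the last three break the first-line, version and ordering rules respectively.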
   2. Document element
      1. Root: An XML document must have a root element. A root element can have child
         elements and sub-child elements. For example, in an XML document where
         <message> contains <to>, <from>, <subject> and <text>, <message> is the root
         element and the other four are its child elements.
      2. Comments: written as <!-- My Comment -->. A comment may not appear before the
         XML declaration.
      3. DocType: Document Type Declaration node can take 2 forms, a reference to an
         external file which contains the DTD Schema, or an inline DTD Schema
         description.
       A DTD is the set of rules an XML file must follow to be valid.
            •   The rules can be standard or user-defined. Standard DTDs exist for HTML,
                MathML (mathematics), SVG (graphics), etc.
            •   A DTD can be declared in the XML file inside the <!DOCTYPE> declaration, or
                defined in an external file and referenced inside <!DOCTYPE>.
            •   The purpose of a DTD is to define the structure of an XML document. It defines
                the structure with a list of legal elements.
   Note: A "well formed" XML document is not the same as a "valid" XML document. A
           "valid" XML document must be well formed and, in addition, must conform to a
           document type definition.
   Example :
   <!DOCTYPE book[
   <!ELEMENT book (title,author,price)>
   <!ELEMENT title (#PCDATA)>
   <!ELEMENT author (#PCDATA)>
   <!ELEMENT price (#PCDATA)> ]>
   The DTD above is interpreted like this:
        •   !DOCTYPE book defines that the root element of the document is book.
        •   !ELEMENT book defines that the book element must contain the elements
            "title, author, price", in that order.
        •   !ELEMENT title defines the title element to be of type "#PCDATA"
       • !ELEMENT author defines the author element to be of type "#PCDATA"
       • !ELEMENT price defines the price element to be of type "#PCDATA"
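Note: PCDATA means Parsed Character Data. #PCDATA means that the element contains text
that the parser examines for entities and markup. CDATA means Character Data, text that
the parser does not examine; in a DTD it is the usual type for attribute values, e.g. the
id of a tag.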
There are two types of DTDs:
       1) Internal / Embedded DTD
       2) External DTD
1) Internal / Embedded DTD
       <?xml version="1.0" encoding="UTF-8"?>
       <!DOCTYPE student [
       <!ELEMENT student (id,name,age,addr,email,ph,gender)>
       <!ELEMENT id (#PCDATA)>
       <!ELEMENT name (#PCDATA)>
       <!ELEMENT age (#PCDATA)>
       <!ELEMENT addr (#PCDATA)>
       <!ELEMENT email (#PCDATA)>
       <!ELEMENT ph (#PCDATA)>
       <!ELEMENT gender (#PCDATA)> ]>
       <student>
              <id>543</id>
              <name>Ravi</name>
              <age>21</age>
              <addr>Guntur</addr>
              <email>nsr@gmail.com</email>
              <ph>9855555</ph>
              <gender>male</gender>
       </student>
       2) External DTD
       <!ELEMENT student (id,name,age,addr,email)>
       <!ELEMENT id (#PCDATA)>
       <!ELEMENT name (#PCDATA)>
       <!ELEMENT age (#PCDATA)>
       <!ELEMENT addr (#PCDATA)>
       <!ELEMENT email (#PCDATA)>
        Save the above declarations as "student.dtd" and prepare "student.xml" as follows.
        The DOCTYPE declaration names the type of document; here student is the name of
        the root tag.
        SYSTEM: This indicates that the Document Type Definition (DTD) file is specified
        externally. The SYSTEM keyword is used to provide the location (a URI or a path) of
        the external DTD file.
        <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE student SYSTEM "student.dtd">
        <student>
              <id>543</id>
              <name>Ravi</name>
              <age>21</age>
              <addr>Guntur</addr>
              <email>nsr@gmail.com</email>
        </student>
        In the above example, <!DOCTYPE student SYSTEM "student.dtd"> links "student.xml"
        to the rules stored in "student.dtd".
        If the XML follows exactly the rules defined in the DTD, the document is a valid
        document. Otherwise it is an invalid document.
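The difference between "well formed" and "valid" can be seen with a small experiment. Python's standard xml.etree.ElementTree checks only well-formedness and does not validate against a DTD, so the sketch below uses a hand-written check as a stand-in for real DTD validation. EXPECTED_CHILDREN mirrors the content model (id,name,age,addr,email) of student.dtd; the function name and the <gender> test case are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Child order copied from the content model (id,name,age,addr,email).
EXPECTED_CHILDREN = ["id", "name", "age", "addr", "email"]

def matches_student_model(xml_text: str) -> bool:
    """Hand-written stand-in for DTD validation of one <student> element."""
    root = ET.fromstring(xml_text)   # raises ParseError if not well formed
    child_tags = [child.tag for child in root]
    return root.tag == "student" and child_tags == EXPECTED_CHILDREN

GOOD = (
    "<student><id>543</id><name>Ravi</name><age>21</age>"
    "<addr>Guntur</addr><email>nsr@gmail.com</email></student>"
)
# Still well formed, but <gender> is not allowed by the content model.
EXTRA = GOOD.replace("</student>", "<gender>male</gender></student>")

good_ok = matches_student_model(GOOD)
extra_ok = matches_student_model(EXTRA)
print(good_ok, extra_ok)
```

Both documents parse without error, but only the first matches the rules; the second is well formed yet invalid. A real project would use a validating parser rather than this simplified check.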
        4. Elements: an element comprises an element name and a value, and may also involve
           namespaces, comments, CDATA sections and entity references.
XML Declaration and Rules for XML Elements
  1. Root: Every XML file must contain a root element, which is the parent of all other
     elements, e.g. <root>.
     Root Element − An XML document can have only one root element. For example, the
     following is not a correct XML document, because both the x and y elements occur at
     the top level without a root element −
<x>...</x>
<y>...</y>
   2.   The following example shows a correctly formed XML document. Start and end tags
        must come in pairs.
<root>
<x>...</x>
<y>...</y>
</root>
   3.   XML prolog: It is the first line of any XML file. It is optional.
        Ex. <?xml version="1.0" encoding="UTF-8"?>
   4.   Tags: Every element must have a closing tag. XML tags are case sensitive.
        Case Sensitivity − The names of XML elements are case-sensitive. That means the
        names in the start tag and the end tag must be in exactly the same case.
   5.   Attribute: The value must be quoted, e.g. <name id="a">; here id is an attribute of
        the <name> element.
   6.   Comments: <!-- comment text -->
   7.   An element name can contain any alphanumeric characters. The only punctuation
        marks allowed in names are the hyphen (-), underscore (_) and period (.). A name
        cannot start with a digit, hyphen or period.
   8.   An element, which is a container, can contain text or other elements, as seen in the
        above example.
        Syntax Rules for Tags and Elements
        Element Syntax − Each XML element must be closed, either with a matching end tag
        as shown below −
<element>....</element>
or, for an element with no content, just this way −
<element/>
   9.   Nesting of Elements − An XML element can contain multiple XML elements as its
        children, but the child elements must not overlap, i.e. the end tag of an element
        must have the same name as the most recent unmatched start tag.
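The element rules above can be checked with any XML parser. The sketch below runs a handful of tiny documents through Python's standard xml.etree.ElementTree and records which ones are accepted. The accepts helper and the sample documents are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def accepts(xml_text: str) -> bool:
    """Return True if the parser considers the text well formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

CASES = {
    "single root": "<root><x>1</x><y>2</y></root>",
    "empty element": "<root><x/></root>",
    "two top-level elements": "<x>1</x><y>2</y>",
    "case mismatch": "<Name>Ravi</name>",
    "missing end tag": "<root><x>1</root>",
    "unquoted attribute": "<name id=a>Ravi</name>",
    "overlapping elements": "<a><b>text</a></b>",
}

results = {label: accepts(doc) for label, doc in CASES.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

Only the first two documents obey every rule; each of the others breaks exactly one rule from the list and is rejected.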