1.8.2 Database Administrator
One of the main reasons for using DBMSs is to have central
 control of both the data and the programs that access those
 data. A person who has such central control over the system
 is called a database administrator (DBA). The functions
 of a DBA include:
Schema definition. The DBA creates the original database schema by executing a set of data definition statements in the DDL (a brief SQL sketch of this and the related tasks appears after this list).
 Storage structure and access-method definition.
 The DBA may specify some parameters pertaining to
 the physical organization of the data and the indices to
 be created.
 Schema and physical-organization
 modification. The DBA carries out changes to
 the schema and physical organization to reflect the
 changing needs of the organization, or to alter the
 physical organization to improve performance.
 Granting of authorization for data access. By
 granting different types of authorization, the database
 administrator can regulate which parts of the database
 various users can access. The authorization information
 is kept in a special system structure that the database
 system consults whenever a user tries to access the
 data in the system.
 Routine maintenance. Examples of the database
 administrator's routine maintenance activities are:
 Periodically backing up the database onto remote
 servers, to prevent loss of data in case of disasters
 such as flooding.
 Ensuring that enough free disk space is available for
 normal operations, and upgrading disk space as
 required.
Monitoring jobs running on the database and
 ensuring that performance is not degraded by very
 expensive tasks submitted by some users.
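To make the schema-definition, access-method-definition, and authorization tasks above concrete, here is a minimal SQL sketch; the table, column, index, and user names (instructor, dept_name, instructor_dept_index, maria) are hypothetical examples chosen for illustration, not part of any prescribed schema.

    -- Schema definition: the DBA issues DDL statements such as
    create table instructor (
        ID        varchar(5),
        name      varchar(20) not null,
        dept_name varchar(20),
        salary    numeric(8,2),
        primary key (ID));

    -- Access-method definition: an index to speed up lookups by department
    create index instructor_dept_index on instructor (dept_name);

    -- Authorization: grant a user read-only access to the table
    grant select on instructor to maria;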
 1.9 History of Database Systems
 Information processing drives the growth of computers, as it
 has from the earliest days of commercial computers. In fact,
 automation of data processing tasks predates computers.
 Punched cards, invented by Herman Hollerith, were used at
 the very beginning of the twentieth century to record U.S.
 census data, and mechanical systems were used to process
 the cards and tabulate results. Punched cards were later
 widely used as a means of entering data into computers.
 Techniques for data storage and processing have evolved
 over the years:
 1950s and early 1960s: Magnetic tapes were
 developed for data storage. Data-processing tasks such
 as payroll were automated, with data stored on tapes.
 Processing of data consisted of reading data from one or
 more tapes and writing data to a new tape. Data could
 also be input from punched card decks and output to
 printers. For example, salary raises were processed by
 entering the raises on punched cards and reading the
 punched card deck in synchronization with a tape
 containing the master salary details. The records had to
 be in the same sorted order. The salary raises would be
 added to the salary read from the master tape and
 written to a new tape; the new tape would become the
 new master tape.
 Tapes (and card decks) could be read only
 sequentially, and data sizes were much larger than main
 memory; thus, data-processing programs were forced to
 process data in a particular order by reading and
 merging data from tapes and card decks.
Late 1960s and early 1970s: Widespread use of hard
 disks in the late 1960s changed the scenario for data
 processing greatly, since hard disks allowed direct
 access to data. The position of data on disk was
 immaterial, since any location on disk could be accessed
 in just tens of milliseconds. Data were thus freed from
 the tyranny of sequentiality. With the advent of disks,
 the network and hierarchical data models were
 developed, which allowed data structures such as lists
 and trees to be stored on disk. Programmers could
 construct and manipulate these data structures.
 A landmark paper by Edgar Codd in 1970 defined the
 relational model and nonprocedural ways of querying
 data in the relational model, and relational databases
 were born. The simplicity of the relational model and the
 possibility of hiding implementation details completely
 from the programmer were enticing indeed. Codd later
won the prestigious Association for Computing Machinery
 Turing Award for his work.
 Late 1970s and 1980s: Although academically
 interesting, the relational model was not used in
 practice initially because of its perceived performance
 disadvantages; relational databases could not match
 the performance of existing network and hierarchical
 databases. That changed with System R, a
 groundbreaking project at IBM Research that developed
 techniques for the construction of an efficient relational
 database system. The fully functional System R
 prototype led to IBM's first relational database product,
 SQL/DS. At the same time, the Ingres system was being
 developed at the University of California at Berkeley. It
 led to a commercial product of the same name. Also
 around this time, the first version of Oracle was
 released. Initial commercial relational database
 systems, such as IBM DB2, Oracle, Ingres, and DEC Rdb,
played a major role in advancing techniques for efficient
 processing of declarative queries.
 By the early 1980s, relational databases had become
 competitive with network and hierarchical database
 systems even in the area of performance. Relational
 databases were so easy to use that they eventually
 replaced network and hierarchical databases.
 Programmers using those older models were forced to
 deal with many low-level implementation details, and
 they had to code their queries in a procedural fashion.
 Most importantly, they had to keep efficiency in mind
 when designing their programs, which involved a lot of
 effort. In contrast, in a relational database, almost all
 these low-level tasks are carried out automatically by
 the database system, leaving the programmer free to
 work at a logical level. Since attaining dominance in the
 1980s, the relational model has reigned supreme among
 data models.
 The 1980s also saw much research on parallel and
distributed databases, as well as initial work on object-oriented databases.
 1990s: The SQL language was designed primarily for
decision support applications, which are query-intensive, yet the mainstay of databases in the 1980s
 was transaction-processing applications, which are
 update-intensive.
In the early 1990s, decision support and querying re-emerged as a major application area for databases.
 Tools for analyzing large amounts of data saw a large
 growth in usage. Many database vendors introduced
 parallel database products in this period. Database
 vendors also began to add object-relational support to
 their databases.
The major event of the 1990s was the explosive
 growth of the World Wide Web. Databases were
 deployed much more extensively than ever before.
 Database systems now had to support very high
 transaction-processing rates, as well as very high
 reliability and 24 × 7 availability (availability 24 hours a
 day, 7 days a week, meaning no downtime for
 scheduled maintenance activities). Database systems
 also had to support web interfaces to data.
 2000s: The types of data stored in database systems
 evolved rapidly during this period. Semi-structured data
became increasingly important. XML emerged as a data-exchange standard. JSON, a more compact data-exchange format well suited for storing objects from JavaScript or other programming languages, subsequently grew increasingly important. Increasingly,
 such data were stored in relational database systems as
 support for the XML and JSON formats was added to the
 major commercial systems. Spatial data (that is, data
 that include geographic information) saw widespread
 use in navigation systems and advanced applications.
 Database systems added support for such data.
 Open-source database systems, notably PostgreSQL
and MySQL, saw increased use. "Auto-admin" features
 were added to database systems in order to allow
 automatic reconfiguration to adapt to changing
 workloads. This helped reduce the human workload in
 administering a database.
 Social network platforms grew at a rapid pace,
 creating a need to manage data about connections
between people and their posted data, which did not fit
 well into a tabular row-and-column format. This led to
 the development of graph databases.
 In the latter part of the decade, the use of data
 analytics and data mining in enterprises became
ubiquitous. Database systems were developed
 specifically to serve this market. These systems
 featured physical data organizations suitable for analytic
 processing, such as "column-stores," in which tables are
stored by column rather than the traditional row-oriented storage of the major commercial database
 systems.
 The huge volumes of data, as well as the fact that
much of the data used for analytics was textual or semi-structured, led to the development of programming
 frameworks, such as map-reduce, to facilitate
 application programmers' use of parallelism in analyzing
 data. In time, support for these features migrated into
 traditional database systems. Even in the late 2010s,
 debate continued in the database research
 community over the relative merits of a single
database system serving both traditional transaction-processing applications and the newer data-analysis
 applications versus maintaining separate systems for
 these roles.
 The variety of new data-intensive applications and the
need for rapid development, particularly by startup firms, led to "NoSQL" systems that provide a lightweight
 form of data management. The name was derived from
 those systems' lack of support for the ubiquitous
 database query language SQL, though the name is now
 often viewed as meaning "not only SQL." The lack of a
 high-level query language based on the relational model
 gave programmers greater flexibility to work with new
 types of data. The lack of traditional database systems'
support for strict data consistency provided more flexibility in an application's use of distributed data
 stores. The NoSQL model of "eventual consistency"
 allowed for distributed copies of data to be inconsistent
as long as they would eventually converge in the absence of further updates.