A Comparative Analysis of Open Source Digital Library Software: ‘DSpace’ and
‘Greenstone’
Lakshmi Verma
DESIDOC
Email: laxmi@desidoc.drdo.in
ABSTRACT
The exponential growth in data generation and its subsequent transformation into knowledge has
created huge repositories of knowledge in libraries. This has revolutionized the methods and
techniques used to retrieve relevant and useful information for users, and the growth of
information and communication technology (ICT) has facilitated this. This paper presents a
study of two open-source digital library management software packages that collect and
disseminate information for library users. The analysis is based on a study of the related
software documentation and technical manuals.
KEY WORDS
Open Source, Digital Library, Digital Library Management System, software, DSpace, GSDL,
Greenstone
1. Introduction
A place where a collection of information resources is stored, in print and other forms, in an
organized and accessible manner for reading or study is referred to as a library. As defined by
the International Organization for Standardization, a library is “irrespective of the title, any
organized collection of printed books and periodicals or of any other graphic or audio-visual
materials, and the services of a staff to provide and facilitate the use of such materials as are
required to meet the informational, research, educational or recreational needs of its users” [1].
A digital library is a type of information retrieval system in which information is stored in
digital format and can be accessed over a network of computer users [6]. It uses online
repositories that store textual information systematically and can be accessed by users 24x7.
Various such digital repositories are available, and they may be open source or proprietary.
Open source describes a method of software development that uses the power of distributed peer
review and transparency of process. The source code is available in the open domain and can be
customized by its users. This approach helps deliver high-quality software through high
reliability, low cost, flexibility and an end to traditional vendor lock-in. Since open source
software is released under an open source license, developers and users are allowed to change,
improve and redistribute the software many times.
DESIDOC is the central information centre of DRDO and maintains various types of information
repositories to disseminate S&T information digitally to DRDO users. To create and manage these
repositories, a suitable Digital Library Management System (DLMS) was required, so it was
necessary to identify the DLMS best suited to DESIDOC’s use. This led to a comparison of the two
most popular DLMS available today, i.e. DSpace and GSDL.
2. Methodology
To compare the two Digital Library Management Systems (DLMS), various review papers on the
subject were analysed. The technical details and complete specifications were also considered,
using the technical manuals available on the official websites of the two DLMS. To compare the
practical aspects, the frontend and backend of DSpace and GSDL were also analysed to assess
their suitability for the specific requirements of an information repository.
3. Digital Library Management Systems
Open source digital library management systems are software applications that help in creating
and presenting information repositories. The repositories built with these systems can be
searched and browsed on the basis of metadata, as these features are built into the
applications. They can also be easily maintained, enhanced and re-created. Many open source
software (OSS) applications are presently available for library and information management, for
example DSpace, GSDL, Fedora and EPrints. Organizations can therefore choose the one most
suitable for their requirements and implement it to create digital repositories. This paper
focuses on two of the most popular open source digital library management systems: DSpace and
GSDL.
4. DSpace
DSpace is an open source library management software that allows digital data such as text,
video and audio to be captured and stored in repositories. It also provides facilities to
index, preserve and disseminate the digital material. Digital libraries thus use DSpace to
manage digital materials and publications in professionally maintained repositories.
Worldwide, more than 1000 digital repositories have been developed using DSpace for storing,
distributing and preserving digital data. DSpace is most commonly used as a platform for
building an institutional repository, which is a digital collection of research documentation,
intellectual publications, library collections etc.
DSpace performs three major tasks to build a repository:
• It captures and ingests the digital content along with metadata
• It lists the content systematically and helps in searching based on keywords and metadata
• It supports preservation of the digital data for a long period of time
Therefore, DSpace can easily be customized to manage and preserve digital content and to make
this data accessible to users. Since it is open source software, an active community of
developers, researchers and users across the world collaborates to enhance the application.
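As a rough illustration of the first two tasks listed above (ingesting content together with
its metadata, then searching on that metadata), the following Python sketch models a
repository item as a file name plus Dublin Core style elements. The names Item and matches and
the two sample records are hypothetical and chosen only for illustration; this is not the
DSpace data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A hypothetical, simplified repository item: a file name plus metadata."""
    filename: str
    # Maps a Dublin Core element name (e.g. "title") to a list of values.
    metadata: dict = field(default_factory=dict)


def matches(item: Item, element: str, keyword: str) -> bool:
    """Return True if any value of the given element contains the keyword."""
    values = item.metadata.get(element, [])
    return any(keyword.lower() in value.lower() for value in values)


# Illustrative records only; these are not real repository entries.
items = [
    Item("report.pdf", {"title": ["Annual Technical Report"],
                        "creator": ["A. Author"],
                        "subject": ["digital libraries"]}),
    Item("lecture.mp4", {"title": ["Introduction to Metadata"],
                         "creator": ["B. Speaker"],
                         "subject": ["metadata", "cataloguing"]}),
]

# Metadata-based search: find items whose subject mentions "metadata".
hits = [item.filename for item in items if matches(item, "subject", "metadata")]
print(hits)  # prints ['lecture.mp4']
```

Keeping the metadata separate from the stored file is what allows the same record to drive both
keyword search and browsing by author, title or subject.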
DSpace is capable of storing a wide range of digital data, including documents such as
articles, technical reports, conference papers, books, theses, multimedia publications,
administrative records, images, audio-video files and web pages. It also provides multiple
features like visualization, simulation of the stored data etc.
4.1 Latest Features of DSpace
As DSpace is a continuously evolving platform, upgraded versions are released from time to
time. At the time of writing, 6.x is the latest release of the DSpace platform [8]. It includes
an upgraded configuration system, upgraded file storage plugins, and better quality control and
health-check reporting features (through the REST API and through email). Furthermore, DSpace 6
has a Java API refactor that adds support for both UUIDs and Hibernate in the database layer,
which prepares the platform for future requirements.
As reported on the official DSpace website, the new features and improvements in version 6.x
include:
• “Java API refactor, featuring Hibernate and UUIDs
• Enhanced (reloadable) configuration system, featuring a new local.cfg configuration file
• Enhanced file storage plugins, featuring support for Amazon S3
• Configurable site healthchecks via email
• XMLUI framework for metadata import from external sources, featuring support for
PubMed imports
• XMLUI export of search results to CSV (for batch editing)
• XMLUI extensible administrative control panel
• REST API Quality Control Reports, along with sample HTML clients and CSV export
(for batch editing)
• REST API support for additional authentication methods (e.g. LDAP, etc)
• All searches default to boolean AND.
• Enhanced indexing for searches (Excel is now searchable, as well as right-to-left text in
PDFs)
• OAI-PMH adds compliance for OpenAIRE 3.0 guidelines for literature repositories” [2]
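OAI-PMH, mentioned in the last item above, is the protocol through which other systems harvest
metadata from a repository. The Python sketch below builds a ListRecords request URL and
extracts Dublin Core titles from a response. The verb, the oai_dc metadata prefix and the XML
namespaces come from the OAI-PMH 2.0 specification; the base URL, the helper names and the
sample response are assumptions made for illustration, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request URL for a given OAI-PMH base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_titles(response_xml: str) -> list:
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [element.text for element in root.iterfind(".//dc:title", NS)]


# Hand-written sample response (illustrative, not harvested from a real server).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:123/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Technical Report</dc:title>
          <dc:creator>A. Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# The base URL below is hypothetical.
print(list_records_url("http://repository.example.org/oai/request"))
print(extract_titles(SAMPLE_RESPONSE))
```

Because both DSpace and GSDL support OAI-PMH, a harvester written along these lines could in
principle collect metadata from repositories built on either platform.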
4.2 Limitations of DSpace
During implementation, some limitations were observed: a flat file and metadata structure, a
poor user interface, a lack of scalability and extensibility, a limited API, limited metadata
features, limited reporting capabilities and a lack of support for linked data.
5. GSDL
Greenstone Digital Library (GSDL) is an open source, multilingual software package released
under the terms of the GNU General Public License and widely used for creating repositories and
making them accessible online [11]. The development and distribution of GSDL is the outcome of
a joint effort by the New Zealand Digital Library Project at the University of Waikato, UNESCO
and the Human Info NGO (http://humaninfo.org/). The aim of the Greenstone software is to enable
users to build their own digital libraries. It provides a way to organize information and
publish it on the web or on other digital storage media such as DVDs and USB flash drives; in
the latter case, it runs in a non-networked environment. The digital libraries built with GSDL
are fully searchable, metadata-driven digital resources [9].
In fact, this software encourages the effective deployment of digital libraries to share
information and put it in the public domain. It is therefore not itself a digital library;
rather, it provides a platform on which to build one.
In 2004, the developers of GSDL received the IFIP Namur Award for “contributions to the
awareness of social implications of information technology, and the need for a holistic
approach in the use of information technology that takes account of social implications” [9].
5.1 GSDL Versions:
There are two main versions of GSDL, namely GSDL2 and GSDL3. GSDL2 is the earlier version and
is still widely used, whereas GSDL3 is the latest version and is under active development.
GSDL3 is backward compatible and contains almost all the features of GSDL2. A developer already
working with GSDL2 can either continue with the latest release of GSDL2 or consider upgrading
to GSDL3. The Greenstone Librarian Interface (GLI) provides a feature to import a 'Greenstone2
collection', which helps existing GSDL2 users migrate to the new software.
Greenstone3 is developed in Java and uses a range of modern web technologies, such as XSL
Transformations (XSLT) and the Java Authentication and Authorization Service (JAAS).
Greenstone2, by contrast, was written in C++ and relied on many techniques developed by its own
developers, as many of the current web technologies were not available at the time. This made
users entirely dependent on the documentation provided by the development team. These
limitations have been overcome in the latest GSDL version.
5.2 Limitations of GSDL
Some limitations of GSDL were also observed: interactive content updating and management are
not possible, there is no provision for identifying duplicates, metadata handling is somewhat
difficult, and collection building hangs while processing some documents. Also, the Linux
version appears more robust than the Windows version.
6. Comparison Table of DSpace and GSDL:
Based on the above discussion, a feature comparison of DSpace and GSDL is given in Table 1 below:
Table 1
| S. No | Features of Open Source Software | GSDL | DSpace |
|---|---|---|---|
| 1 | Product Type | Software | Software |
| 2 | Year of creation | 1997 | 2002 |
| 3 | License cost/Update cost | Free | Free |
| 4 | License | GNU GPL | BSD |
| 5 | Services | Training | Service via third-party service provider |
| 6 | Plug-in extends | Yes | Yes |
| 7 | Resource Identifier | No | CNRI Handles |
| 8 | OAI-PMH | Yes | Yes |
| 9 | Z39.50 Support | Yes | No |
| 10 | Supported File formats [7] | doc, pdf, html, ppt, postscript, jpeg, gif, video, mp3, etc. | doc, pdf, html, ppt, jpeg, gif, audio, video, etc. |
| 11 | Supported Item Types (Storage and rendition) | Can store and manage all types of content | Can store and manage all types of content |
| 12 | Thumbnail Preview | Images, Audio, Video | Images |
| 13 | Multilingual Support [5] | Greenstone provides ready-to-use multilingual interfaces that are already translated into many languages | Unicode character encoding, so different languages can be supported |
| 14 | Machine-to-Machine Interoperability | Z39.50, OAI-PMH | OAI-PMH, OAI-ORE, SWORD, SWAP |
| 15 | Syndication | --- | RSS, ATOM |
| 16 | User Authentication | User Groups | LDAP Authentication, Shibboleth Authentication |
| 17 | Searching Capabilities [4] | Field Specific, Boolean Logic | Field Specific, Boolean Logic, Sorting options |
| 18 | Browsing Options | Browsing can be done | By Author, Title, Subject and Collection |
| 19 | Metadata formats [3] | Dublin Core, Qualified DC, METS, RFC1807, NZGLS (New Zealand Government Locator Service), AGLS (Australian Government Locator Service) | Dublin Core, Qualified DC, METS |
| 20 | Associated Software [3] | Apache Web server, Java 1.4.0 or above, ImageMagick, Ghostscript and Web Browser | Java JDK5 or later, Apache Ant 1.6.2 or later, Apache Maven 2.0.8 or later, Java 1.4 or later, PostgreSQL 7.3 or later, Apache Tomcat 4.x/5.x and Web Browser |
| 21 | Software Platforms [3] | Windows 95/98/Me/NT/2000/XP/10, Unix/Linux and Mac OS X | Windows (NT/2000/XP/10) and all POSIX (Linux/BSD/UNIX-like OSs), OS X |
| 22 | Statistical reporting | Yes | Yes |
| 23 | Databases | Its own | PostgreSQL |
| 24 | Programming Language | C++, Perl, Java | Java and JSP |
| 25 | Web Server | Apache/IIS | Apache and Tomcat |
| 26 | URL for free downloads [10] | http://www.greenstone.org/download | http://www.dspace.org/ |
7. Practical Implementation of DSpace at DESIDOC
DESIDOC is the Defence Scientific Information and Documentation Centre of DRDO, which provides
information to various DRDO laboratories through its information and knowledge-based
e-services. More than 30 services are provided to all DRDO units countrywide through the DRDO
Intranet. Fig 7.1 depicts the webpage on the DESIDOC library portal that lists all 33 of these
services.
Fig 7.1
7.1 Requirement of DLMS at DESIDOC:
The web services used to provide information to users were developed on different platforms
over the course of time. This led to varied user experiences and to difficulty in maintaining
and hosting the 33 services. Therefore, for a uniform user experience, enhanced search features
and effective maintenance of these services, it was decided to build the information
repositories on a uniform DLMS platform.
7.2 DSpace at DESIDOC
Based on the analysis carried out to identify the most suitable DLMS platform for building
DESIDOC repositories, it was decided to opt for DSpace. One reason was that it uses JSP for the
frontend and PostgreSQL for the backend of its applications; both are available in the open
domain and are easy to work with. PostgreSQL is also capable of storing and handling large
amounts of data, which was a requirement of DESIDOC. Furthermore, DSpace provides various
features for users, such as full-text search, metadata-based search and federated search. Many
more features, such as usage pattern analysis and business-intelligence tools, can be
incorporated into these DSpace repositories to make the services more effective and
user-friendly. The features currently provided by DSpace can be summarized as follows:
1. Uniform user interface (UI) for all the services
2. Specific search for individual services, available on the home page of each service
3. Search on metadata and full text within a service
4. Federated search facility: one common text box that searches the full text and metadata of
all the 18 services migrated into DSpace
Fig 7.2 shows these features in one of the repositories created using DSpace platform:
Fig 7.2
Fig 7.3 and 7.4 depict the search features of repositories built on DSpace platform.
Fig 7.3
Fig 7.4
In this way, DESIDOC has successfully implemented the DSpace platform for 18 of its services,
such as DRDO E-Journals, DRDO Knowledge Repository, DRDO Science Spectrum, DRDO Technology
Spectrum, Newspaper Clipping Service, Institutional Repositories of DRDO, Archiving of
Newspaper Clipping, Union Catalogue of Periodicals and Archiving of E-journals.
It is worth mentioning that all these repositories are actually different communities of the
same DSpace DLMS, whereas to users they appear as different services provided by the digital
library of DESIDOC. This integration of various information web services into a common DSpace
platform has helped the IT administrators at DESIDOC considerably because, instead of
maintaining and managing the multiple backends and frontends of various repositories, they now
have to deal with only one DLMS, i.e. DSpace.
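The idea of a federated search over several communities of one repository can be pictured with
a small Python sketch. Everything in it is an assumption made for illustration: the community
names are loosely modelled on the services listed above, the records are invented, and the
simple loop does not reflect how DSpace implements search internally.

```python
# Hypothetical in-memory stand-in for several communities of one repository.
# Each community maps to a list of (title, full_text) pairs.
communities = {
    "E-Journals": [
        ("Radar Signal Processing", "A survey of radar methods."),
    ],
    "Newspaper Clippings": [
        ("Radar Trials Reported", "Press coverage of trials."),
        ("Budget Announced", "Annual budget details."),
    ],
    "Knowledge Repository": [
        ("Materials Handbook", "Notes on composites and radar-absorbing coatings."),
    ],
}


def search_community(name, query):
    """Service-specific search: match the query against one community only."""
    q = query.lower()
    return [title for title, text in communities[name]
            if q in title.lower() or q in text.lower()]


def federated_search(query):
    """Run the same query over every community and merge the results."""
    results = []
    for name in communities:
        for title in search_community(name, query):
            results.append((name, title))
    return results


for community, title in federated_search("radar"):
    print(community, "->", title)
```

Because all the communities live in one system, the federated search is just the
service-specific search repeated over every community, which is the maintenance advantage
described above.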
7.3 Proposed Enhancement in the DLMS at DESIDOC:
As the current repositories built using DSpace have performed well in meeting users’
requirements, DESIDOC is in the process of migrating the remaining repositories to the DSpace
platform as well. This will allow DESIDOC, as an information centre, to have all its
repositories in a common DLMS, i.e. DSpace. Once this is done, all the repositories can be
searched through a single federated search. Ease of maintenance and a uniform user experience
will be further benefits.
It is also proposed to analyse the usage patterns of these repositories using the features
provided by the DSpace DLMS. This will help the librarians at DESIDOC understand users’ needs
and further enhance the type and quality of content delivered through the DESIDOC repositories.
8. Conclusion
Digital Library Management Systems (DLMS) thus provide a platform on which digital library
service providers can create an easy-to-use, customizable architecture for their users. With
their help, the institutional repositories, research documents, manuscripts and audio-video
data of organizations can be stored, preserved and disseminated to the targeted users. The two
software packages discussed above, though they differ in architecture and presentation, both
meet the broad requirements of digital libraries. As a result, it is difficult to prefer one
DLMS over the other in general. Instead of generalizing about suitability, we should emphasise
the specific needs of a particular digital library. As explained above, DESIDOC, based on its
specific requirements, has opted for DSpace as its DLMS, but DSpace has its own set of
advantages and disadvantages, so another information centre may prefer GSDL for a similar
purpose. The selection of the software is normally based on the format of the data to be
uploaded, the way it is to be disseminated, the choice of backend and frontend for the
application, the time available for establishing the digital library etc.
9. References:
1. http://portal.unesco.org/en/ev.phpurl_id=13086&url_do=do_topic&url_section=201.html
(accessed on 6 Dec 2017)
2. DSpace 6.x System Documentation. http://www.dspace.org/ (accessed on 6 Dec 2017)
3. https://arizona.openrepository.com (accessed on 6 Dec 2017)
4. Shahkar Tramboo, Humma (2012), Journal of Computer Applications (0975 – 8887), 59(16),
retrieved from https://arxiv.org/ftp/arxiv/papers/1212/1212.4935.pdf
5. https://researchtrend.net (accessed on 6 Dec 2017)
6. Presentation on digital library,
https://www.slideshare.net/sandeepsinghsainimba/digital-library-15882846 (accessed on 6 Dec 2017)
7. Ravikumar and Ramanan (2014), Journal of the University Librarians Association of Sri
Lanka, 18(2), https://jula.sljol.info/articles/10.4038/jula.v18i2.7867/galley/5202/download/
(accessed on 6 Dec 2017)
8. DSpace 6.x System Documentation, retrieved from
https://wiki.duraspace.org/display/DSDOC6x/DSpace+6.x+Documentation (accessed on 6 Dec 2017)
9. http://www.greenstone.org (accessed on 6 Dec 2017)
10. Shalini R. Lihitkar and Ramdas S. Lihitkar (2012), DESIDOC Journal of Library &
Information Technology, 32(5), 393-400, retrieved from
http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/2660/1318 (accessed on 6 Dec
2017)
11. Nabajyoti Das, PLANNER, 2007,
http://ir.inflibnet.ac.in:8080/ir/bitstream/1944/1373/1/48.pdf (accessed on 6 Dec 2017)