DOI: 10.1145/1410140.1410151
research-article

Merging changes in XML documents using reliable context fingerprints

Published: 16 September 2008

Abstract

Different dialects of XML have emerged as ubiquitous document exchange formats. For effective collaboration based on such documents, the capability to propagate edit operations performed on a document is indispensable. In order to avoid the transmission of whole documents, deltas are used to describe these edit operations, allowing the construction of a new version of a document. However, patching a document with a delta it was not generated for is error-prone, and any insert or delete operations performed on the document are likely to affect all subsequent paths within that document.
In this paper, we present a delta format for XML documents that uses context-aware fingerprints to identify edit operations. This allows our XML patch procedure to find the correct position of an edit operation, even if the document was updated in the meantime. Possible conflicts are detected. Experimental results show the reliability of the presented fingerprinting technique and prove the high quality of the resulting patched documents.
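
As an illustration only, the idea of locating an edit operation by its surrounding context, rather than by a fixed path, could be sketched in Python as follows. The flat document-order node list, the fixed radius, the match threshold, the choice of SHA-256, and all function names are assumptions made for this sketch; the abstract does not describe the paper's actual delta format or matching procedure.

```python
import hashlib
import xml.etree.ElementTree as ET


def node_hash(label: str) -> str:
    """Hash one node label (hypothetical normalization: tag plus stripped text)."""
    return hashlib.sha256(label.encode("utf-8")).hexdigest()


def flatten(root: ET.Element) -> list[str]:
    """Return node labels in document order."""
    return [f"{el.tag}:{(el.text or '').strip()}" for el in root.iter()]


def fingerprint(labels: list[str], pos: int, radius: int = 1) -> list:
    """Hashes of the nodes around pos; None pads positions outside the document."""
    return [
        node_hash(labels[i]) if 0 <= i < len(labels) else None
        for i in range(pos - radius, pos + radius + 1)
    ]


def match_quality(expected: list, actual: list) -> float:
    """Fraction of context slots whose hashes agree."""
    hits = sum(1 for e, a in zip(expected, actual) if e == a)
    return hits / len(expected)


def locate(labels, expected, hint, radius=1, threshold=0.6):
    """Return the position whose context best matches, or None (a conflict)."""
    best_pos, best_score = None, 0.0
    # Visit candidates by distance from the hinted position so that,
    # among equally good matches, the nearest one is kept.
    for pos in sorted(range(len(labels)), key=lambda p: abs(p - hint)):
        score = match_quality(expected, fingerprint(labels, pos, radius))
        if score > best_score:
            best_pos, best_score = pos, score
    return best_pos if best_score >= threshold else None


# The delta was generated against `old`; the target has since gained two nodes.
old = flatten(ET.fromstring("<doc><p>a</p><p>b</p><p>c</p><p>d</p></doc>"))
new = flatten(ET.fromstring(
    "<doc><p>x</p><p>y</p><p>a</p><p>b</p><p>c</p><p>d</p></doc>"))
context = fingerprint(old, 3)          # context of <p>c</p> in the old version
found = locate(new, context, hint=3)   # position of <p>c</p> in the new version
```

In this sketch the edit recorded at position 3 is found again at position 5 of the updated document, because the neighbouring hashes still agree there. If no candidate reaches the threshold, `locate` returns `None`, which a patch tool could report as a conflict.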

    Published In

    DocEng '08: Proceedings of the eighth ACM symposium on Document engineering
    September 2008
    312 pages
    ISBN:9781605580814
    DOI:10.1145/1410140

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CSCW
    2. XML diff
    3. XML patch
    4. fingerprint
    5. office applications
    6. version control

    Qualifiers

    • Research-article

    Conference

    DocEng '08
    DocEng '08: ACM Symposium on Document Engineering
    September 16 - 19, 2008
    Sao Paulo, Brazil

    Acceptance Rates

    DocEng '08 Paper Acceptance Rate 21 of 62 submissions, 34%;
    Overall Acceptance Rate 194 of 564 submissions, 34%

    Cited By

    • (2023) Spork: Structured Merge for Java With Formatting Preservation. IEEE Transactions on Software Engineering 49:1 (64-83). DOI: 10.1109/TSE.2022.3143766. Online publication date: 1-Jan-2023
    • (2021) Déjà Vu? Client-Side Fingerprinting and Version Detection of Web Application Software. 2021 IEEE 46th Conference on Local Computer Networks (LCN) (81-89). DOI: 10.1109/LCN52139.2021.9524885. Online publication date: 4-Oct-2021
    • (2016) UI Tags: Confidentiality in Office Open XML. Cyber Security (19-33). DOI: 10.1007/978-3-319-28313-5_2. Online publication date: 8-Jan-2016
    • (2016) Bridging the gap between tracking and detecting changes in XML. Software—Practice & Experience 46:2 (227-250). DOI: 10.1002/spe.2305. Online publication date: 1-Feb-2016
    • (2014) A New Approach for Meaningful XML Schema Merging. Proceedings of the 16th International Conference on Information Integration and Web-based Applications & Services (430-439). DOI: 10.1145/2684200.2684302. Online publication date: 4-Dec-2014
    • (2014) Using versioned trees, change detection and node identity for three-way XML merging. SICS Software-Intensive Cyber-Physical Systems 34:1 (3-16). DOI: 10.1007/s00450-013-0253-5. Online publication date: 29-Nov-2014
    • (2013) Document changes. Proceedings of the 2013 ACM symposium on Document engineering (281-282). DOI: 10.1145/2494266.2494322. Online publication date: 10-Sep-2013
    • (2013) Introduction to the universal delta model. Proceedings of the 2013 ACM symposium on Document engineering (47-56). DOI: 10.1145/2494266.2494284. Online publication date: 10-Sep-2013
    • (2012) XCC. Computer Science - Research and Development 27:2 (95-111). DOI: 10.1007/s00450-010-0140-2. Online publication date: 1-May-2012
    • (2011) A generic calculus of XML editing deltas. Proceedings of the 11th ACM symposium on Document engineering (113-120). DOI: 10.1145/2034691.2034718. Online publication date: 19-Sep-2011
