Research article, DOI: 10.1145/2807442.2807478

DataTone: Managing Ambiguity in Natural Language Interfaces for Data Visualization

Published: 05 November 2015

Abstract

Answering questions with data is a difficult and time-consuming process. Visual dashboards and templates make it easy to get started, but asking more sophisticated questions often requires learning a tool designed for expert analysts. Natural language interaction allows users to ask questions directly in complex programs without having to learn how to use an interface. However, natural language is often ambiguous. In this work we propose a mixed-initiative approach to managing ambiguity in natural language interfaces for data visualization. We model ambiguity throughout the process of turning a natural language query into a visualization and use algorithmic disambiguation coupled with interactive ambiguity widgets. These widgets allow the user to resolve ambiguities by surfacing system decisions at the point where the ambiguity matters. Corrections are stored as constraints and influence subsequent queries. We have implemented these ideas in a system, DataTone. In a comparative study, we find that DataTone is easy to learn and lets users ask questions without worrying about syntax and proper question form.
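
The idea of keeping several interpretations alive, picking one algorithmically, and letting a user correction persist as a constraint can be sketched as follows. This is a minimal illustration under stated assumptions, not DataTone's implementation: the class names `Ambiguity` and `Session`, the scores, and the column names `total_medals` and `gold_medals` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ambiguity:
    """One ambiguous phrase and its candidate interpretations (hypothetical model)."""
    phrase: str
    candidates: dict[str, float]  # interpretation -> assumed relevance score


@dataclass
class Session:
    """Stores user corrections as constraints that later queries can reuse."""
    constraints: dict[str, str] = field(default_factory=dict)

    def resolve(self, amb: Ambiguity) -> str:
        # A stored correction wins whenever it is still a valid candidate.
        pinned = self.constraints.get(amb.phrase)
        if pinned in amb.candidates:
            return pinned
        # Otherwise fall back to the highest-scoring candidate.
        return max(amb.candidates, key=amb.candidates.get)

    def widget_options(self, amb: Ambiguity) -> list[str]:
        # Options a widget could list: current choice first, then the rest by score.
        chosen = self.resolve(amb)
        rest = sorted(
            (c for c in amb.candidates if c != chosen),
            key=amb.candidates.get,
            reverse=True,
        )
        return [chosen] + rest

    def correct(self, amb: Ambiguity, choice: str) -> None:
        # Record the user's pick so it influences subsequent queries.
        if choice not in amb.candidates:
            raise ValueError(f"{choice!r} is not a candidate for {amb.phrase!r}")
        self.constraints[amb.phrase] = choice


session = Session()
amb = Ambiguity("medals", {"total_medals": 0.6, "gold_medals": 0.4})
first = session.resolve(amb)         # algorithmic default: best score
session.correct(amb, "gold_medals")  # user overrides through the widget
later = session.resolve(amb)         # the stored constraint now applies
```

In this sketch the system always commits to one interpretation so that a visualization can be drawn immediately, while the alternatives remain available for the user to select.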

Supplementary Material

PDF File (uist2990-paper.pdf)
suppl.mov (uist2990-file4.mp4)
Supplemental video
MP4 File (p489.mp4)

    Published In

    UIST '15: Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology
    November 2015
    686 pages
    ISBN:9781450337793
    DOI:10.1145/2807442
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. mixed-initiative interfaces
    2. natural language interaction
    3. visualization

    Qualifiers

    • Research-article

    Funding Sources

    • NSF

    Conference

    UIST '15

    Acceptance Rates

    UIST '15 Paper Acceptance Rate 70 of 297 submissions, 24%;
    Overall Acceptance Rate 561 of 2,567 submissions, 22%
