A trichotomy for regular simple path queries on graphs

G Bagan, A Bonifati, B Groz - Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI symposium on Principles of …, 2013 - dl.acm.org
Regular path queries (RPQs) select vertices connected by some path in a graph. The edge labels of such a path have to form a word that matches a given regular expression. We investigate the evaluation of RPQs with an additional constraint that prevents multiple traversals of the same vertices. Those regular simple path queries (RSPQs) quickly become intractable, even for basic languages such as (aa)* or a*ba*.
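To make the difference between RPQs and RSPQs concrete, here is a minimal Python sketch. It is an illustration only, not the algorithm from the paper. It assumes the regular language is given as a DFA transition table and the graph as an adjacency list; the names `rpq`, `rspq`, `triangle` and `even_a` are hypothetical. The RPQ check is a breadth-first search over (vertex, state) pairs. The RSPQ check is a brute-force backtracking search that never revisits a vertex and can take exponential time.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[Tuple[str, str]]]  # vertex -> [(edge label, target vertex)]
Delta = Dict[Tuple[int, str], int]        # (DFA state, label) -> next DFA state


def rpq(graph: Graph, source: str, target: str,
        delta: Delta, start: int, accepting: Set[int]) -> bool:
    """RPQ: is there any walk (vertices may repeat) whose labels are accepted?"""
    seen = {(source, start)}
    queue = deque(seen)
    while queue:
        v, q = queue.popleft()
        if v == target and q in accepting:
            return True
        for label, w in graph.get(v, []):
            nxt = delta.get((q, label))
            if nxt is not None and (w, nxt) not in seen:
                seen.add((w, nxt))
                queue.append((w, nxt))
    return False


def rspq(graph: Graph, source: str, target: str,
         delta: Delta, start: int, accepting: Set[int]) -> bool:
    """RSPQ: same question, but the path may not visit a vertex twice.

    Brute-force backtracking; worst-case exponential in the graph size.
    """
    def dfs(v: str, q: int, visited: Set[str]) -> bool:
        if v == target and q in accepting:
            return True
        for label, w in graph.get(v, []):
            nxt = delta.get((q, label))
            if nxt is None or w in visited:
                continue  # no DFA transition, or vertex already on the path
            visited.add(w)
            if dfs(w, nxt, visited):
                return True
            visited.remove(w)  # backtrack
        return False

    return dfs(source, start, {source})


# Hypothetical example: a directed triangle x -> y -> z -> x, all edges labeled "a".
triangle: Graph = {"x": [("a", "y")], "y": [("a", "z")], "z": [("a", "x")]}
# DFA for (aa)*: state 0 = even number of a's (accepting), state 1 = odd.
even_a: Delta = {(0, "a"): 1, (1, "a"): 0}

print(rpq(triangle, "x", "y", even_a, 0, {0}))   # walk x,y,z,x,y has 4 edges
print(rspq(triangle, "x", "y", even_a, 0, {0}))  # only simple path x,y has 1 edge
```

On this triangle, the RPQ from x to y under (aa)* succeeds because the walk may go once around the cycle, while the RSPQ fails because the only simple path from x to y has odd length.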
In this paper, we establish a comprehensive classification of regular languages with respect to the complexity of the corresponding regular simple path query problem. More precisely, we identify for which languages RSPQs can be evaluated in polynomial time, and show that evaluation is NP-complete for languages outside this fragment. We thus fully characterize the frontier between tractability and intractability for RSPQs, and we refine our results to show the following trichotomy: evaluation of RSPQs is either in AC0, NL-complete, or NP-complete in data complexity, depending on the language L. The fragment identified also admits a simple characterization in terms of regular expressions.
Finally, we also discuss the complexity of deciding whether a language L belongs to the fragment above. We consider several alternative representations of L: DFAs, NFAs, or regular expressions, and prove that this problem is NL-complete for the first representation and PSPACE-complete for the other two. We conclude by extending our results from edge-labeled graphs to vertex-labeled graphs.