Analyze your XlsForms as directed graphs. Survey elements, such as
select_one ..., calculate, or note, become nodes in such a graph. In other
words, the nodes of the graph are the individual XlsForm questions (rows in an
XlsForm). The edges are dependencies on other questions. If question B depends
on question A being answered a specific way (e.g. through ${...} in the
relevant column), then an edge points from A to B. A dependency also arises
when a label in question D displays the value of survey element C. In that
case, an edge points from C to D.
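The edge rule described above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not OdkGraph's actual implementation: the rows, the `build_edges` helper, and the choice to scan only the relevant and label columns are all hypothetical.

```python
import re

# Hypothetical survey rows as (name, relevant, label); not from a real form.
ROWS = [
    ("age", "", "How old are you?"),
    ("adult_note", "${age} >= 18", "You are ${age} years old."),
    ("name", "", "What is your name?"),
    ("greeting", "", "Hello, ${name}!"),
]

# Matches ${...} references and captures the referenced element name.
REF = re.compile(r"\$\{([^}]+)\}")


def build_edges(rows):
    """Return sorted (source, target) pairs: target references source."""
    edges = set()
    for name, relevant, label in rows:
        for text in (relevant, label):
            for ref in REF.findall(text):
                # An edge points from the referenced element to the
                # element that depends on it.
                edges.add((ref, name))
    return sorted(edges)


edges = build_edges(ROWS)
```

Here `adult_note` references `age` in both columns, but the set keeps a single edge from `age` to `adult_note`.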
All package dependencies, networkx and xlrd, are on PyPI. To install, a
single pip call on the command line suffices:
python3 -m pip install https://github.com/jkpr/OdkGraph/zipball/master
✅ First, make sure the ODK Xlsform converts cleanly to XML.
Import the OdkGraph class with
from odkgraph import OdkGraph
Next, create an OdkGraph object. The __init__ method accepts a path to the
file:
odk_graph = OdkGraph('/path/to/odk/xlsform.xlsx')
Access nodes in a variety of ways:
odk_graph['age'] # Get the ODK survey element (node) named 'age'
odk_graph[0] # Zero-indexed node access. This example returns the first node
odk_graph.excel_row(2) # Return the ODK survey element from row 2 in the Excel file
Slicing is also supported.
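To make the three access styles concrete, here is a minimal sketch of a container that accepts a name, an integer index, or a slice. The `ElementList` class is hypothetical and is not how OdkGraph is necessarily written. In particular, the assumption that Excel row 1 holds the header, so that row 2 is the first survey element, is a guess made for illustration.

```python
class ElementList:
    """Hypothetical container; not the OdkGraph class itself."""

    def __init__(self, names):
        self._names = list(names)

    def __getitem__(self, key):
        if isinstance(key, str):
            # Look up by element name.
            if key not in self._names:
                raise KeyError(key)
            return key
        # Integers and slices are delegated to the underlying list.
        return self._names[key]

    def excel_row(self, row):
        # Assumption: row 1 is the header, so row 2 is the first element.
        if row < 2:
            raise IndexError("row 1 is assumed to be the header row")
        return self._names[row - 2]


elements = ElementList(["age", "name", "greeting"])
first_two = elements[0:2]  # slicing returns a list of elements
```

Dispatching on the key type inside `__getitem__` keeps a single bracket syntax for all three kinds of lookup.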
With an OdkGraph object in hand, several useful queries are available:
odk_graph.number_edges() # The number of edges (dependencies)
odk_graph.number_nodes() # The number of nodes (survey elements)
odk_graph.forward_dependencies() # The ODK elements that depend on things that are defined after them in the Xlsform
odk_graph.terminal_nodes() # The ODK elements that depend on other elements, but nothing depends on them
odk_graph.isolates() # The ODK elements that depend on nothing else, and nothing depends on them
odk_graph.simple_cycles() # A list of cyclical dependencies
With node(s) in hand, we can do the following:
age = odk_graph['age']
odk_graph.predecessors(age) # All nodes that 'age' directly depends on
odk_graph.successors(age) # All nodes that directly depend on 'age'
odk_graph.all_dependencies_of([age]) # All nodes that 'age' directly or indirectly depends on
odk_graph.all_nodes_dependent_on([age]) # All nodes that directly or indirectly depend on 'age'
The underlying networkx network (see the networkx documentation) can be accessed with
odk_graph.network
See all methods and attributes on OdkGraph and their docstrings with
help(OdkGraph)
or by reading the source code.
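The graph queries listed earlier (forward dependencies, terminal nodes, isolates, and indirect dependents) can be illustrated without the package. The sketch below uses only the standard library. The element names, the edge list, and all four helper functions are hypothetical stand-ins, so treat it as one reading of those definitions rather than OdkGraph's code.

```python
from collections import deque

# Hypothetical form: element names in row order, plus (source, target)
# edges where the target depends on the source.
ORDER = ["intro", "age", "adult_note", "consent", "total", "price"]
EDGES = [("age", "adult_note"), ("adult_note", "consent"), ("price", "total")]


def forward_deps(order, edges):
    """Elements that depend on something defined later in the form."""
    position = {name: i for i, name in enumerate(order)}
    return sorted({t for s, t in edges if position[s] > position[t]})


def terminal(order, edges):
    """Elements that depend on others but have no dependents."""
    sources = {s for s, _ in edges}
    targets = {t for _, t in edges}
    return [n for n in order if n in targets and n not in sources]


def isolated(order, edges):
    """Elements with no incoming and no outgoing edges."""
    touched = {n for edge in edges for n in edge}
    return [n for n in order if n not in touched]


def all_dependents(start, edges):
    """Everything that directly or indirectly depends on start (BFS)."""
    children = {}
    for s, t in edges:
        children.setdefault(s, []).append(t)
    seen, queue = set(), deque([start])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

In this toy form, `total` is a forward dependency because it references `price`, which appears after it, and `intro` is an isolate because no edge touches it.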
Submit bug reports to James K. Pringle at jpringleBEAR@jhu.edu minus the bear.