Replication package for ASE 2020 submission
Directory blizzard contains:
- Replication of the results reported by the BLIZZARD authors, in blizzard-replication.csv.
- We had to reverse-engineer the Lucene index that BLIZZARD uses. To verify the results, we evaluated the original systems using our implementation of the index. The results are in blizzard-re.csv.
The reformulated-queries directory contains the non-preprocessed reformulated queries corresponding to the 1,405 low-quality queries used in our study. There is one CSV file for each strategy evaluated in Sec. 4.5 (970), plus ALL_TEXT.csv (containing the full text of each bug report) and BLIZZARD--ALL_TEXT.csv (the result of running BLIZZARD on the full text of the bug reports).
Each file contains only the queries that include every component of the corresponding strategy. For example, a query that does not contain OB will not appear in OB.csv, BLIZZARD-OB.csv, EB--OB.csv, etc.
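The inclusion rule above can be illustrated with a minimal Python sketch. It is not part of the package. It assumes that components in a file name are separated by one or more hyphens and that BLIZZARD names a technique applied to the query rather than a bug report component. The helper names are hypothetical.

```python
import re

# Assumption: tokens that name a technique, not a bug report component.
NON_COMPONENT_TOKENS = {"BLIZZARD"}


def required_components(file_name):
    """Return the set of components a strategy file name requires."""
    stem = file_name[:-len(".csv")] if file_name.endswith(".csv") else file_name
    # Assumption: one or more hyphens separate the parts of a strategy name.
    tokens = [t for t in re.split(r"-+", stem) if t]
    return {t for t in tokens if t not in NON_COMPONENT_TOKENS}


def appears_in(query_components, file_name):
    """True if the query has every component the strategy file requires."""
    return required_components(file_name) <= set(query_components)


# Hypothetical query that contains EB but not OB.
query = {"EB"}
for name in ["OB.csv", "BLIZZARD-OB.csv", "EB--OB.csv", "EB.csv"]:
    print(name, appears_in(query, name))
```

Under these assumptions, the example query is reported as present only in EB.csv, which matches the rule that a query without OB is absent from every file whose strategy involves OB.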
The results directory contains:
- trbl-results.csv contains the results for each combination of technique, threshold (N), granularity, and strategy. (WARNING: large file)
- result-summary-*.csv contains a summary of all strategies (Table 7), for both combinatorial and conjunctive reformulations.
- stat-tests contains the full results of the statistical tests presented in Section 4.
- query-stats.csv contains the full Table 10.
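Because trbl-results.csv is flagged as a large file, it can help to inspect its header and first few rows before loading it fully. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the path in the comment assumes the file sits directly under the results directory. No column names are assumed.

```python
import csv
from itertools import islice


def preview_csv(path, n_rows=5):
    """Return (header, first n_rows rows), reading only the start of the file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # first line, or [] for an empty file
        rows = list(islice(reader, n_rows))  # stop after n_rows data rows
    return header, rows


# Example call (assumed path):
# header, rows = preview_csv("results/trbl-results.csv")
```

The reader is consumed lazily, so memory use stays small regardless of the file size.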
The replication package of Chaparro et al.'s work, which first defined the query reduction strategies that we adapt.