Description
Four out of five nltk_data packages are configured to be unzipped by default, although most package readers would probably need only minimal changes to work without unzipping.
For example, running all of NLTK's doctests and unit tests with no data packages unzipped raises many errors. Some of these can be avoided simply by using the join() method inherited from nltk.data's PathPointer class to concatenate path components, which works whether the components refer to the file system or to a zipped package.
In particular, the doctests for all the switch_* functions in nltk.data (i.e., punkt, maxent and perceptron) are affected by this problem: they raise NotADirectoryError or LookupError(resource_not_found) when run without unzipping the corresponding models.
This can be fixed by a simple modification of the way path components are assembled in the corresponding package readers.
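To illustrate the idea, here is a minimal standard-library sketch, not NLTK's actual code. It assumes a hypothetical helper read_resource() and a made-up package layout (pkg/models/english.txt). pathlib.Path and zipfile.Path stand in for the file-system and zip pointers, and their shared joinpath() method plays the role that PathPointer.join() plays in nltk.data.

```python
import tempfile
import zipfile
from pathlib import Path


def read_resource(root, *parts):
    """Read a text resource below `root`.

    `root` may be a pathlib.Path (unzipped package) or a zipfile.Path
    (zipped package). Hypothetical helper, for illustration only.
    """
    target = root
    for part in parts:
        # joinpath() exists on both classes, so no string concatenation
        # with os.path.join is needed.
        target = target.joinpath(part)
    return target.read_text(encoding="utf-8")


def demo():
    """Build the same made-up package unzipped and zipped, then read both."""
    with tempfile.TemporaryDirectory() as tmp:
        # Unzipped layout: <tmp>/pkg/models/english.txt
        unzipped = Path(tmp) / "pkg"
        (unzipped / "models").mkdir(parents=True)
        (unzipped / "models" / "english.txt").write_text("hello", encoding="utf-8")

        # Zipped layout: <tmp>/pkg.zip containing pkg/models/english.txt
        zip_path = Path(tmp) / "pkg.zip"
        with zipfile.ZipFile(zip_path, "w") as zf:
            zf.writestr("pkg/models/english.txt", "hello")

        with zipfile.ZipFile(zip_path) as zf:
            zipped = zipfile.Path(zf, at="pkg/")
            return (
                read_resource(unzipped, "models", "english.txt"),
                read_resource(zipped, "models", "english.txt"),
            )


if __name__ == "__main__":
    print(demo())
```

The reader code never checks which kind of root it received. By contrast, joining plain strings with os.path.join would yield a path that points inside the .zip file, which the file system cannot open as a directory.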