-
Notifications
You must be signed in to change notification settings - Fork 365
Description
Hey There,
The issue that I will talk about next is discussed here previously: https://groups.google.com/forum/#!topic/moa-development/ho-_Z22k1-E
The task WriteStreamToARFFFile does not work properly. Although some initial statistics on the distribution of the label sets are outputted to the terminal, the process terminates with a NullPointerException.
The error is replicated by some other user in MOA Development Google Group as well.
The error is similar to this:
Failure reason: Failed writing to file /home/****/Synth.arff *** STACK TRACE ***java.lang.RuntimeException: Failed writing to file /home/****/Synth.arff at moa.tasks.WriteStreamToARFFFile.doMainTask(WriteStreamToARFFFile.java:86) at moa.tasks.MainTask.doTaskImpl(MainTask.java:50) at moa.tasks.AbstractTask.doTask(AbstractTask.java:57) at moa.tasks.TaskThread.run(TaskThread.java:76) Caused by: java.lang.NullPointerException at com.yahoo.labs.samoa.instances.SparseInstanceData.locateIndex(SparseInstanceData.java:237) at com.yahoo.labs.samoa.instances.SparseInstanceData.setValue(SparseInstanceData.java:220) at com.yahoo.labs.samoa.instances.InstanceImpl.setValue(InstanceImpl.java:269) at moa.streams.generators.multilabel.MetaMultilabelGenerator.generateMLInstance(MetaMultilabelGenerator.java:274) at moa.streams.generators.multilabel.MetaMultilabelGenerator.nextInstance(MetaMultilabelGenerator.java:228) at moa.streams.generators.multilabel.MetaMultilabelGenerator.nextInstance(MetaMultilabelGenerator.java:46) at moa.tasks.WriteStreamToARFFFile.doMainTask(WriteStreamToARFFFile.java:80) ... 3 more
The setting which results in the error is as follows:
- Pick 'WriteStreamToARFFFile' task. As its options:
- stream: generators.multilabel.MetaMultilabelGenerator (with default values. I also tried to change some of the options there, such as NumLabels and LabelCardinality)
- arffFile: An empty file that I specified with proper read write permissions.
- maxInstances: 100,000. Or any other value
- taskResultFile: This is left blank, as it is for the results on the generated data (for most common labelset etc.)