Generating Multi-Label Synthetic Data Stream gives a NullPointerException #137

@abuyukcakir

Hey There,

The issue described below was discussed previously here: https://groups.google.com/forum/#!topic/moa-development/ho-_Z22k1-E

The WriteStreamToARFFFile task does not work properly with the multi-label generator. Although some initial statistics on the distribution of the label sets are printed to the terminal, the process then terminates with a NullPointerException.

Another user in the MOA Development Google Group has reproduced the error as well.

The error output is similar to the following:

Failure reason: Failed writing to file /home/****/Synth.arff
*** STACK TRACE ***
java.lang.RuntimeException: Failed writing to file /home/****/Synth.arff
    at moa.tasks.WriteStreamToARFFFile.doMainTask(WriteStreamToARFFFile.java:86)
    at moa.tasks.MainTask.doTaskImpl(MainTask.java:50)
    at moa.tasks.AbstractTask.doTask(AbstractTask.java:57)
    at moa.tasks.TaskThread.run(TaskThread.java:76)
Caused by: java.lang.NullPointerException
    at com.yahoo.labs.samoa.instances.SparseInstanceData.locateIndex(SparseInstanceData.java:237)
    at com.yahoo.labs.samoa.instances.SparseInstanceData.setValue(SparseInstanceData.java:220)
    at com.yahoo.labs.samoa.instances.InstanceImpl.setValue(InstanceImpl.java:269)
    at moa.streams.generators.multilabel.MetaMultilabelGenerator.generateMLInstance(MetaMultilabelGenerator.java:274)
    at moa.streams.generators.multilabel.MetaMultilabelGenerator.nextInstance(MetaMultilabelGenerator.java:228)
    at moa.streams.generators.multilabel.MetaMultilabelGenerator.nextInstance(MetaMultilabelGenerator.java:46)
    at moa.tasks.WriteStreamToARFFFile.doMainTask(WriteStreamToARFFFile.java:80)
    ... 3 more

The settings that produce the error are as follows:

  1. Select the 'WriteStreamToARFFFile' task and set its options:
  • stream: generators.multilabel.MetaMultilabelGenerator (with default values; I also tried changing some of its options, such as NumLabels and LabelCardinality, with the same result)
  • arffFile: an empty file that I specified, with proper read and write permissions
  • maxInstances: 100,000, or any other value
  • taskResultFile: left blank, since it is only for the results computed on the generated data (most common labelset, etc.)
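
For anyone who wants to reproduce this outside the GUI, the settings above can be collapsed into a single MOA task string. The sketch below only assembles that string and does not call MOA itself. The short flags -s, -f and -m are my assumption of how the stream, arffFile and maxInstances options are abbreviated, so please check them against your MOA version. The class name ReproTaskString and the relative path Synth.arff are placeholders chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReproTaskString {

    // Joins a task name and its options into "Task -flag value -flag value ...".
    // Insertion order of the map is preserved so the output is deterministic.
    public static String buildTask(String taskName, Map<String, String> options) {
        StringBuilder sb = new StringBuilder(taskName);
        for (Map.Entry<String, String> e : options.entrySet()) {
            sb.append(" -").append(e.getKey()).append(' ').append(e.getValue());
        }
        return sb.toString();
    }

    // The options from the list above. Flag letters are assumed, not confirmed.
    public static Map<String, String> issueOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("s", "generators.multilabel.MetaMultilabelGenerator"); // stream, default values
        options.put("f", "Synth.arff");  // arffFile, placeholder path
        options.put("m", "100000");      // maxInstances
        // taskResultFile is left unset, as in the report.
        return options;
    }

    public static void main(String[] args) {
        System.out.println(buildTask("WriteStreamToARFFFile", issueOptions()));
    }
}
```

If the flags are right, the printed string should be usable as the single argument to MOA's command-line entry point (moa.DoTask, with the MOA jar on the classpath), which should hit the same NullPointerException as the GUI run.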
