Moved the RT fmrisim generator and updated it to use resource stream by CameronTEllis · Pull Request #460 · brainiak/brainiak

CameronTEllis · 2020-03-27T22:25:47Z

No description provided.

CameronTEllis · 2020-03-28T19:33:43Z

@mihaic Hi Mihai, I am running into issues with using tmp_path. I have tried to follow the protocol they describe in the pytest link you sent but can't find it. Can you advise how tests/utils/test_fmrisim_real_time.py should be written to use tmp_path?

CameronTEllis · 2020-04-07T23:58:00Z

After many pep8 errors, I now have new errors I am not sure about. The first one throws an error because I do not have enough arguments, but that is precisely the test I am running that should be caught by the pytest command. Is that because it is not in a function? The other errors are peculiar. Is it something about how the dictionary data type is being altered by being supplied a resource stream?

mihaic · 2020-04-08T16:37:39Z

Yes, testing for raised exceptions should be part of Pytest functions.

However, the current errors are raised by run-checks, not run-tests, so they are unrelated to Pytest. They are raised by the Mypy type checker.

Mypy assumes all the values in data_dict are the same type as the first value you set, which is the output of resource_stream:

data_dict['ROI_A_file'] = resource_stream(gen.__name__,
                                          "sim_parameters/ROI_A.nii.gz")

You can avoid it by explicitly specifying a type for data_dict that accepts any value type:

from typing import Dict

data_dict: Dict = {}

You can disable type checking for any line by adding a # type: ignore comment. For example, for testing that the function call raises an exception:

    gen.generate_data()  # type: ignore
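A minimal sketch of the pattern described above: the call that is expected to fail lives inside a Pytest function, wrapped in pytest.raises, and carries a type: ignore comment so that Mypy does not reject the deliberately wrong call. The generate_data stub here is a hypothetical stand-in, not the BrainIAK function, and TypeError is assumed to be the relevant exception because that is what Python raises for missing positional arguments.

```python
import pytest


def generate_data(output_dir, data_dict):
    """Hypothetical stand-in for the real generator; it does nothing."""
    return None


def test_generate_data_requires_arguments():
    # Calling with no arguments raises TypeError at call time.
    # The "type: ignore" comment keeps Mypy from flagging the bad call.
    with pytest.raises(TypeError):
        generate_data()  # type: ignore
```

Because the failing call sits inside a test function, the exception is only raised when Pytest runs the test, not when the module is imported.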

mihaic · 2020-04-15T17:45:54Z

@CameronTEllis, in case you are baffled by the current error, note that nibabel.load can only deal with file paths. To work with the io.BufferedReader object returned by resource_stream, you can do something like this:

import gzip

roi_a_file = resource_stream(__name__, "ROI_A.nii.gz")
image = nibabel.nifti1.Nifti1Image.from_bytes(gzip.decompress(roi_a_file.read()))

CameronTEllis · 2020-04-15T23:50:23Z

@mihaic Hmm, still strange errors. This one seems to be a version issue, perhaps: from_bytes was added recently. This solution is also quite cumbersome. Are there alternative ways to find the path to something in the resource stream so that it can be treated as a path?

mihaic · 2020-04-15T23:58:25Z

It is cumbersome, but you have implemented it already. :)

You can try the following for the import error:

from nibabel.nifti1 import Nifti1Image
_from_bytes = Nifti1Image.from_bytes

Note that I wrote _from_bytes because it is not public. Nevertheless, I recommend you use the more customary form Nifti1Image.from_bytes.
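On the question of treating a stream as a path: one generic option, which is not what this PR adopted, is to copy the stream into a temporary file and hand that file's path to any loader that only accepts paths. The sketch below uses only the standard library; stream_to_temp_path is a hypothetical helper name, and io.BytesIO stands in for the object returned by resource_stream.

```python
import io
import os
import shutil
import tempfile
from typing import BinaryIO


def stream_to_temp_path(stream: BinaryIO, suffix: str = "") -> str:
    """Copy a binary stream to a temporary file and return its path.

    The caller is responsible for deleting the file when done.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(stream, tmp)
    return tmp.name


# io.BytesIO stands in for the object returned by resource_stream.
fake_resource = io.BytesIO(b"not a real NIfTI file")
path = stream_to_temp_path(fake_resource, suffix=".nii.gz")
with open(path, "rb") as f:
    contents = f.read()
os.remove(path)
```

The cost is an extra copy on disk and explicit cleanup, which is part of why reading the bytes directly, as suggested in the thread, can be preferable.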

mihaic · 2020-04-16T17:03:48Z

I recommend changing the signature of generate_data to explicitly include all the file path keys currently in data_dict, including noise_dict_file. That means incorporating _get_input_names into generate_data. This will simplify the docstring as well (note that it is currently out of sync with the code and not formatted for Sphinx).

from typing import Any, Dict, Optional

def generate_data(
    output_dir: str,
    data_dict: Dict[str, Any],
    roi_a_file: Optional[str] = None,
    roi_b_file: Optional[str] = None,
    template_path: Optional[str] = None,
    noise_dict_file: Optional[str] = None,
) -> None:
    ...  # body omitted in this comment

CameronTEllis · 2020-04-17T04:13:05Z

Hmm, I am not familiar with this style of signature. But separate from that, how does this solve the problem? If noise_dict_file is sometimes a resource stream and sometimes just a path to a file, how will it load this file as text? Moreover, if I set the default values to None, then the code will crash unless these values are set, whereas the current setup runs the default files unless specified otherwise, which is the correct behavior.

mihaic · 2020-04-17T16:32:42Z

I should not have changed the style, sorry about that. (The style follows PEP 484, as mentioned in our contributing documentation.)

My suggestion is to move the code from _get_input_names into generate_data. You are already dealing in _get_input_names with the case when noise_dict_file is None. Also, callers of generate_data will only pass paths, not resource streams. Resource streams are only an internal mechanism to use the default files shipped with BrainIAK when the callers do not pass paths. Does that make sense?
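The division of labor described here can be sketched as follows: callers pass either a path or None, and only the None case falls back to a packaged default. This is an illustration under stated assumptions, not the BrainIAK implementation; open_input, generate_data_sketch, and the _PACKAGED_DEFAULTS dictionary are hypothetical names, and the dictionary stands in for the files that resource_stream would serve.

```python
import io
from typing import BinaryIO, Optional

# Hypothetical stand-in for the files shipped inside the package; the real
# code would call resource_stream(__name__, "sim_parameters/...") instead.
_PACKAGED_DEFAULTS = {
    "ROI_A.nii.gz": b"default ROI A bytes",
}


def open_input(path: Optional[str], default_name: str) -> BinaryIO:
    """Open the caller's file if a path is given, else the packaged default."""
    if path is not None:
        return open(path, "rb")
    return io.BytesIO(_PACKAGED_DEFAULTS[default_name])


def generate_data_sketch(roi_a_file: Optional[str] = None) -> bytes:
    """Read ROI A from the given path, or from the default when it is None."""
    with open_input(roi_a_file, "ROI_A.nii.gz") as stream:
        return stream.read()


# No path supplied, so the packaged default is used.
default_bytes = generate_data_sketch()
```

With this shape, a default of None does not crash: it is the signal to use the shipped file, and the stream never appears in the public signature.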

CameronTEllis · 2020-04-17T16:57:52Z

Sorry I am still confused. The resource stream call I do to get the text file from _get_input_names fails because you can't use open() on a resource stream object.

The whole reason we got into this resource stream business is because I need a way to read the files that are in the sim_parameters folder to this function. It would obviously be easier if there was a way for this code to automatically find the path to the sim_parameters folder and just use that but we were told that wasn't possible with a conda install. Is that not the case?

Using these files should be the default behavior but users can specify other paths if they want to. Incorporating _get_input_names into generate_data is fine but the reason I separated them was because I was exceeding the complexity limits. Moreover, incorporating this won't solve the problem of reading in the text file as a resource stream.

mihaic · 2020-04-17T17:27:39Z

Right, you cannot use open with resource_stream. In our earlier discussion we selected a way to load the file data into Nifti1Image. I was assuming you were going to apply a similar approach to reading the noise_dict_file file data. Sorry for not providing an example. Here is how you can do it:

text_in_noise_dict_file = resource_stream(__name__, "sub_noise_dict.txt").read().decode()

You can keep _get_input_names separate and put this code there to help with complexity.

Note that you should change the test code to pass None for all the files.

There are other ways to read data files shipped with Python packages, but I recommend we continue using resource_stream.
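A small sketch of the read-and-decode step, runnable without any packaged files. It assumes the text file holds a Python dict literal, which this thread does not confirm, and the dictionary contents are invented for illustration. io.BytesIO stands in for the stream returned by resource_stream.

```python
import ast
import io

# io.BytesIO stands in for resource_stream(__name__, "sub_noise_dict.txt").
# The dict literal below is invented for illustration.
stream = io.BytesIO(b"{'snr': 30.0, 'sfnr': 70.0}")

# read() returns bytes; decode() turns them into str (UTF-8 by default).
text_in_noise_dict_file = stream.read().decode()

# Assumption: the file holds a Python dict literal, parsed safely here.
noise_dict = ast.literal_eval(text_in_noise_dict_file)
```

ast.literal_eval only accepts literals, so it is a safer choice than eval if the file format really is a dict literal.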

CameronTEllis · 2020-04-30T18:07:20Z

@mihaic Hi, I assumed that the temp path that was created would be a string, but it seems to be a PosixPath. Is there a way to convert that to a string or to append a suffix to the path to specify a certain file? From the documentation it seems like tmp_path / $str could work.

mihaic · 2020-04-30T18:09:52Z

Indeed, tmp_path / "rt_007.npy" will work.
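As a sketch of both options raised in the question: the / operator builds a new path under tmp_path, and str() converts it for code that expects a string. The test name and the placeholder file contents are hypothetical; under Pytest the tmp_path argument is supplied automatically by the fixture of that name.

```python
import os
from pathlib import Path


def test_output_file_is_written(tmp_path: Path) -> None:
    # The "/" operator joins path components and returns a new Path.
    out_file = tmp_path / "rt_007.npy"
    out_file.write_bytes(b"placeholder")

    # str() (or os.fspath) converts the Path for APIs that expect a string.
    out_name = str(out_file)
    assert out_name == os.path.join(str(tmp_path), "rt_007.npy")
    assert os.path.exists(out_name)
```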

CameronTEllis · 2020-05-01T02:35:13Z

@mihaic is this saying there is an indent on line 11 of the docstring? If so, I don't see it: there are no characters on that line

mihaic · 2020-05-01T17:08:38Z

First, congrats on getting Pytest to succeed!

The line numbers in docstring errors are misleading, because they refer to the docstring, not the file. We documented this issue some time ago, but unfortunately we still do not have a solution:
#42

In this case, the file line number is 11 + 3 (the start of the docstring) = 14:

numTRs - Specify the number of time points

Remember that docstrings must follow the Sphinx format. In particular:

indentation is significant in reST, so all lines of the same paragraph must be left-aligned to the same level of indentation
https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#paragraphs

Note that the text documenting the generate_data function is not a docstring, so it is not visible in Python, nor in the HTML documentation. Please make it a NumPy-like docstring as our contributing guideline mentions:
https://www.sphinx-doc.org/en/stable/usage/extensions/example_numpy.html
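For orientation, a NumPy-style docstring with consistent indentation might look like the sketch below. The function name, the parameter names other than the idea of a time-point count, the default value, and the descriptions are hypothetical; only the section layout is the point.

```python
def generate_data_sketch(output_dir, num_trs=200):
    """Generate simulated real-time fMRI data (illustrative stub).

    Every line of a paragraph starts at the same indentation level,
    as reST requires.

    Parameters
    ----------
    output_dir : str
        Directory where the generated files are written.
    num_trs : int
        Specify the number of time points.

    Returns
    -------
    None
    """
    return None
```

Because the string is the first statement in the function body, Python exposes it as generate_data_sketch.__doc__, which is what Sphinx reads; a comment block above the function would not be picked up.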

CameronTEllis · 2020-05-02T20:03:17Z

@mihaic Suggestions on how to resolve the Python 3.5 vs 3.6 issue? Seems like a different solution is needed for this dictionary mem map issue

mihaic · 2020-05-04T17:25:40Z

I forgot that type annotations for variables were only introduced in Python 3.6. Let's maintain compatibility with Python 3.5 for the moment. Please change to:

data_dict = {}  # type: Dict
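A short sketch of the comment-style annotation in use. It narrows the suggested Dict to Dict[str, Any] purely for illustration; the keys and values are placeholders, and io.BytesIO stands in for the stream returned by resource_stream.

```python
import io
from typing import Any, Dict

# Comment-style annotation: valid syntax on Python 3.5, read by Mypy.
data_dict = {}  # type: Dict[str, Any]

# With the value type declared as Any, mixed value types pass type checking.
# io.BytesIO stands in for the stream returned by resource_stream.
data_dict["ROI_A_file"] = io.BytesIO(b"placeholder bytes")
data_dict["numTRs"] = 200  # hypothetical key and value
```

Without the annotation, Mypy would infer the value type from the first assignment and reject the second one.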

gdoubleyew

Cameron, looks good! Thanks for making all these default settings; that will make it very easy to get started. Also, thanks for making all the test functions.

mihaic · 2020-05-05T23:30:38Z

Thanks, @CameronTEllis & @gdoubleyew.

Note that I changed from package_data to include_package_data in setup.py. In the future, we only need to add files to Git to have them show up in the installation.

The Conda error we were seeing was caused by sim_parameters not being a Python package, which was causing the inclusion of data files to fail silently. We would have needed to prefix each file with the sim_parameters directory. See:
https://setuptools.readthedocs.io/en/latest/setuptools.html#including-data-files

CameronTEllis added 4 commits March 27, 2020 18:24

Moved the RT fmrisim generator and updated it to use resource stream

f0c9b68

PEP8 issues

b9f11ae

PEP8 issues

34afff0

PEP8 issues

58c07e4

mihaic requested changes Mar 30, 2020

Comment thread tests/utils/test_fmrisim_real_time.py Outdated

Comment thread tests/utils/test_fmrisim_real_time.py Outdated

Comment thread brainiak/utils/fmrisim_real_time_generator.py

Comment thread brainiak/utils/fmrisim_real_time_generator.py

CameronTEllis added 9 commits April 7, 2020 17:38

Updates based on comments

61a5b8e

Does not delete contents of tmp_path when function is called

eb5fcee

fixed the resource stream path

156798a

fixed the resource stream path

dc15451

pep8 error

0edc66b

pep8 error

330f4de

pep8 error

6e502a1

Wrong function call

33e73e3

Wrong function call

6ba226c

CameronTEllis added 5 commits April 8, 2020 15:19

Fix the test for run-checks

1b30e9e

Add pydicom to the list of software

a8f655b

Remove the deep copy that protects the dictionary across function calls

5719137

Remove the deep copy that protects the dictionary across function calls

7da52c8

Remove the deep copy that protects the dictionary across function calls

3edd15f

CameronTEllis added 3 commits April 15, 2020 19:15

Resource stream updates

e925a4c

Resource stream updates

bb3e67d

Resource stream updates

e93e9f6

CameronTEllis added 2 commits April 15, 2020 20:01

Resource stream updates

b1d67a9

Resource stream updates

87d358c

Add resource support for text files

6176a55

CameronTEllis added 5 commits April 30, 2020 14:42

Update for file name appending

8c3498d

Make posix path a string

6629e57

Reorder function calls

a8bf77c

Reorder function calls

1caefe9

PEP8 error

8ff7cb2

CameronTEllis added 2 commits May 1, 2020 21:37

Make docstring consistent with the Sphinx styles

107e9e7

Make docstring consistent with the Sphinx styles

a7673c1

gdoubleyew approved these changes May 4, 2020

CameronTEllis added 4 commits May 4, 2020 14:26

Update for python 3.5, improved arg parser and fixed a big with default inputs

14a63e0

Fixed underscore

4e3a07c

Make python 3.5 compatible for the path

a129015

Make python 3.5 compatible for the path

2a039c7

gdoubleyew approved these changes May 4, 2020

mihaic added 2 commits May 5, 2020 12:06

dev: Use setuptools include_package_data

5334129

Merge branch 'master' into fix-data

c1cbd81

mihaic approved these changes May 5, 2020

mihaic merged commit 00da44e into brainiak:master May 5, 2020

3 participants