WIP: Revamped ISC analyses with additional statistical tests #384
Conversation
Can one of the admins verify this patch?
Jenkins, add to whitelist.
@snastase, please ignore the Travis status for the moment. The failure is unrelated. I'm working on a fix.
@mihaic @snastase I probably should discuss this point somewhere else, but this pull request raised a great point: standardized data geometry. I think we should consider standardizing the input data geometry across modules. Moreover, I think we should encourage all future packages to be consistent with some default data format.
@snastase thanks for working on this. Meir and David Turner are doing some tests on memory utilization of ISC with Searchlight. I'll ask them to use this fix as part of their tests, which will provide extra validation for many of these updates.
@qihongl @mihaic I definitely agree about having a standard input geometry, with any transposes, slicing, etc. happening internally. I would advocate for n_samples (typically time points, conditions) by n_features (typically voxels, electrodes, surface vertices), as this seems to be the most widespread convention and makes the most sense for, e.g., brain decoding analyses. I think standardizing this throughout will be critical for overall usability of BrainIAK software. I'm not very familiar with xarrays, but they look very cool. Seems like the ultimate solution would be to generally assume a standardized 2D geometry for neural data (with checks) but also allow for the increased flexibility / explicitness of xarrays.
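A minimal sketch of the kind of input check being discussed (the helper name and behavior here are hypothetical, not BrainIAK's actual API):

```python
import numpy as np

def check_input_geometry(data):
    """Hypothetical check enforcing the proposed convention:
    a 2D array of shape n_samples (e.g., TRs) x n_features
    (e.g., voxels, electrodes, surface vertices)."""
    data = np.asarray(data)
    if data.ndim != 2:
        raise ValueError(f"Expected a 2D n_samples x n_features array, "
                         f"got {data.ndim} dimension(s)")
    return data

# e.g., 300 time points by 1000 voxels
checked = check_input_geometry(np.random.randn(300, 1000))
print(checked.shape)  # (300, 1000)
```

Internal functions could then transpose or reshape from this canonical layout as needed, so users only ever deal with one orientation.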
@manojneuro I haven't put a lot of effort into optimizing for speed or computing resources yet (I was going for flexibility and readability), so I suspect there are several points where we could speed things up! The current version may run slower than the previous implementation until then, but I haven't thoroughly tested this.
I think we should remove (not deprecate) the old module. In any case, the current approach of duplicating the code is not what we want.
Okay, will remove the brainiak.isfc duplicate (and tests) entirely. I'm a bit confused about how to handle the current test error regarding linked references in the docstring. I'm getting the following message: Is this because I have multiple pointers to the same reference? I.e., the Chen2016 reference is used for both bootstrap_isc and permutation_isc. What's the solution to this?
@mihaic also pointed out in an offline conversation that we'll need to adjust the ISC example (https://github.com/brainiak/brainiak/blob/master/examples/isfc/isfc.py) to match the new functionality.
The error is because of multiple definitions of the same reference.
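For context, numpydoc-style citation labels like `[Chen2016]_` become document-wide Sphinx targets, so defining the same label in two docstrings collides. One common workaround (sketched here with hypothetical docstrings, not the PR's actual text) is to make each label unique:

```python
def bootstrap_isc(iscs):
    """Bootstrap hypothesis test for ISCs.

    References
    ----------
    .. [Chen2016-1] Chen et al. (2016). Untangling the relatedness among
       correlations, part I. NeuroImage.
    """

def permutation_isc(iscs):
    """Permutation test for ISCs.

    References
    ----------
    .. [Chen2016-2] Chen et al. (2016). Untangling the relatedness among
       correlations, part I. NeuroImage.
    """
```

The suffixes keep Sphinx's citation targets distinct even though both docstrings cite the same paper.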
Hey @mihaic, some tests are failing for Linux during the conda build, apparently due to problems with pymanopt(?). Any idea what that's about?
It's not related to your PR (I can see the same error on my machine on the master branch). I'm looking into it, but if you're done before I figure it out, we can merge your PR anyway.
The master branch is supposed to be functional, so I don't think we can postpone updating the examples.
Also, why do you think this test failed on Linux on Jenkins?
@mihaic Okay, the ISC/ISFC example is now functional and updated to use the new code (including the new MaskMultiSubjectData shape, etc.). I'm not sure why that test would fail. The point was to ensure that the correlation between identical time series was exactly 1, and I haven't been able to reproduce the failure locally. It's possible that the np.corrcoef computation introduces some tiny floating-point deviation and breaks the assertion, so I switched the assertion from testing for strictly 1 to np.allclose to allow a tiny bit of tolerance.
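For illustration, the change amounts to something like this (a sketch of the assertion, not the test's literal code):

```python
import numpy as np

# Correlating a time series with itself should give r = 1, but
# floating-point arithmetic can introduce tiny deviations.
ts = np.random.randn(100)
r = np.corrcoef(ts, ts)[0, 1]

# Strict equality (r == 1.0) can fail spuriously across platforms;
# np.allclose compares within a small relative/absolute tolerance.
assert np.allclose(r, 1.0)
```

`np.allclose` uses default tolerances (rtol=1e-05, atol=1e-08), which is far looser than any plausible rounding error here while still catching a genuinely wrong correlation.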
* Tolerance for NaNs for ISC/ISFC, and tests
* Messing with NaN options in isfc
* Option to return NaNs in compute_correlation
* Moved NaN handling into _normalize_for_correlation
* NaN handling for ISFC and tests
* Reshaped ISFC output to n_subjects x n_voxel_pairs
* Fixed whitespace errors etc
* tolerate_nans in phaseshift_isc and timeshift_isc tests
* Added new function to squareform ISFCs and keep diagonal
* Replaced old p_from_null with new version
* Removed ISC test NIfTI data because we're simulating
* Moved phase randomization in phaseshift_isc to utils
* Removed vestigial ecdf from utils and tests
* Moved _check_timeseries_input to utils to fix dependencies
* Fixing duplicate references
* isfc can now take 'targets' as input (asymmetric)
* Fixed up 'targets' for ISFCs, added tests
* Increasing some bits of code coverage
* Modified exception tests to use shmancy pytest.raises
* Added news items for issues and PR
* Updated NEWS directly for outdated PR #384
This will add it to the next release, not v0.8, as it should be. In PR brainiak#403, I mistakenly suggested editing NEWS directly.
I'm re-working and expanding the ISC (and ISFC) code to accommodate several new features. We can now compute either pairwise or leave-one-out ISCs (rather than just leave-one-out). All analyses now also accept 'mean' or 'median' as summary statistics. I've increased the flexibility of inputs to the core ISC functions and tried to standardize output geometries (time points by voxels).

I've moved statistical tests out of the core ISC and ISFC functions; these are now separate functions. In addition to phase randomization, I've added a circular time-shifting test (Kauppi et al., 2014), and group-level permutation and bootstrap tests (Chen et al., 2016). Both the phase randomization and circular time-shift randomization tests operate on the data themselves and call the core ISC function internally to recompute ISCs on the randomized data.

Two changes to the phase randomization code: no more randomization across voxels (as this disrupts the spatial autocorrelation of fMRI data), and in the leave-one-out approach, only the left-out subject is phase randomized per iteration. If all subjects are randomized and N-1 are then averaged, the Nth subject will always be correlated with what is effectively a flat time series, yielding an overly tight null distribution and inflated false positive rates (FPRs).

Group-level bootstrap and permutation tests operate directly on the ISC values and should better control FPRs (Chen et al., 2016). For pairwise ISCs, bootstrap resampling of subjects and group relabeling operate at the row/column level, not at the level of individual pairs (thereby respecting the correlation structure among pairs). Updated and expanded tests. No statistical tests for ISFC yet, but I'm working on it.
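As a rough illustration of the leave-one-out approach and the 'mean'/'median' summary statistics described above (a simplified sketch with hypothetical names, not the PR's implementation):

```python
import numpy as np

def leave_one_out_isc(data, summary_statistic=None):
    """Simplified leave-one-out ISC sketch (not the BrainIAK API).

    data: array of shape (n_TRs, n_voxels, n_subjects).
    Correlates each subject's time series with the mean of the
    remaining N-1 subjects, per voxel.
    """
    n_TRs, n_voxels, n_subjects = data.shape
    iscs = np.zeros((n_subjects, n_voxels))
    for s in range(n_subjects):
        # Average the other N-1 subjects, then correlate per voxel
        others = np.mean(np.delete(data, s, axis=2), axis=2)
        for v in range(n_voxels):
            iscs[s, v] = np.corrcoef(data[:, v, s], others[:, v])[0, 1]
    if summary_statistic == 'mean':
        # Fisher z-transform before averaging correlations
        return np.tanh(np.mean(np.arctanh(iscs), axis=0))
    elif summary_statistic == 'median':
        return np.median(iscs, axis=0)
    return iscs

data = np.random.randn(100, 10, 5)  # 100 TRs, 10 voxels, 5 subjects
print(leave_one_out_isc(data).shape)          # (5, 10)
print(leave_one_out_isc(data, 'mean').shape)  # (10,)
```

Fisher z-transforming before averaging is the conventional way to take a mean of correlations; the median is applied to the raw ISC values.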
I would also recommend changing the module/filename from "isfc.py" to "isc.py". Although the ISC values are a subset of the ISFC values, I think ISC is conceptually superordinate, and we may want to add things like spatial ISC in the future (which does not fit under the ISFC heading). NB: This is my first major PR, so apologies if I'm missing anything obvious.