WIP: Revamped ISC analyses with additional statistical tests #384
Conversation
Can one of the admins verify this patch?
Jenkins, add to whitelist.
@snastase, please ignore the Travis status for the moment. The failure is unrelated. I'm working on a fix.
@mihaic @snastase I probably should discuss this point somewhere else, but this pull request raised a great point: standardized data geometry. I think we should consider standardizing the input data geometry across modules. Moreover, I think we should encourage all future packages to be consistent with some default data format.
@snastase thanks for working on this. Meir and David Turner are doing some tests on memory utilization of ISC with Searchlight. I'll ask them to use this fix as part of their tests, which will provide extra validation for many of these updates.
@qihongl @mihaic I definitely agree about having a standard input geometry, with any transposes, slicing, etc. happening internally. I would advocate for n_samples (typically time points, conditions) by n_features (typically voxels, electrodes, surface vertices), as this seems to be the most widespread convention and makes the most sense for, e.g., brain decoding analyses. I think standardizing this throughout will be critical for overall usability of BrainIAK software. I'm not very familiar with xarrays, but they look very cool. Seems like the ultimate solution would be to generally assume a standardized 2D geometry for neural data (with checks) but also allow for the increased flexibility / explicitness of xarrays.
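A minimal sketch of the kind of input check being discussed (the helper name and behavior here are hypothetical, not BrainIAK's actual API):

```python
import numpy as np

def check_input_geometry(data):
    """Hypothetical check enforcing the proposed convention:
    a 2D array of shape n_samples (e.g., TRs) x n_features
    (e.g., voxels, electrodes, surface vertices)."""
    data = np.asarray(data)
    if data.ndim != 2:
        raise ValueError(f"Expected a 2D n_samples x n_features array, "
                         f"got {data.ndim} dimension(s)")
    return data

# e.g., 300 time points by 1000 voxels
checked = check_input_geometry(np.random.randn(300, 1000))
print(checked.shape)  # (300, 1000)
```

Internal functions could then transpose or reshape from this canonical layout as needed, so users only ever deal with one orientation.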
@manojneuro I haven't put a lot of effort into optimizing for speed or computing resources yet (I was going for flexibility and readability), so I suspect there are several points where we could speed things up! The current version may run slower than the previous implementation until then, but I haven't thoroughly tested this.
I think we should remove (not deprecate) the old module. In any case, the current approach of duplicating the code is not what we want.
Okay, will remove the brainiak.isfc duplicate (and tests) entirely. I'm a bit confused about how to handle the current test error regarding linked references in the docstring. I'm getting the following message: Is this because I have multiple pointers to the same reference? I.e., the Chen2016 reference is used for both bootstrap_isc and permutation_isc. What's the solution to this?
@mihaic also pointed out in an offline conversation that we'll need to adjust the ISC example (https://github.com/brainiak/brainiak/blob/master/examples/isfc/isfc.py) to match the new functionality.
The error is because of multiple definitions of the same reference.
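For context, numpydoc-style citation labels like `[Chen2016]_` become document-wide Sphinx targets, so defining the same label in two docstrings collides. One common workaround (sketched here with hypothetical docstrings, not the PR's actual text) is to make each label unique:

```python
def bootstrap_isc(iscs):
    """Bootstrap hypothesis test for ISCs.

    References
    ----------
    .. [Chen2016-1] Chen et al. (2016). Untangling the relatedness among
       correlations, part I. NeuroImage.
    """

def permutation_isc(iscs):
    """Permutation test for ISCs.

    References
    ----------
    .. [Chen2016-2] Chen et al. (2016). Untangling the relatedness among
       correlations, part I. NeuroImage.
    """
```

The suffixes keep Sphinx's citation targets distinct even though both docstrings cite the same paper.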
Hey @mihaic, some tests are failing for Linux during the conda build, apparently due to problems with pymanopt(?). Any idea what that's about?
It's not related to your PR (I can see the same error on my machine on the master branch). I'm looking into it, but if you're done before I figure it out, we can merge your PR anyway.
The master branch is supposed to be functional, so I don't think we can postpone updating the examples.
Also, why do you think this test failed on Linux on Jenkins?
@mihaic Okay, the ISC/ISFC example is now functional and updated to use the new code (including the new MaskMultiSubjectData shape, etc.). I'm not sure why that test would fail. The point was to ensure that the correlation between identical time series was exactly 1, and I haven't been able to reproduce the failure locally. It's possible that the np.corrcoef computation introduces some tiny floating-point deviation and breaks the assertion, so I switched the assertion from testing for strictly 1 to np.allclose to allow a tiny bit of tolerance.
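For illustration, the change amounts to something like this (a sketch of the assertion, not the test's literal code):

```python
import numpy as np

# Correlating a time series with itself should give r = 1, but
# floating-point arithmetic can introduce tiny deviations.
ts = np.random.randn(100)
r = np.corrcoef(ts, ts)[0, 1]

# Strict equality (r == 1.0) can fail spuriously across platforms;
# np.allclose compares within a small relative/absolute tolerance.
assert np.allclose(r, 1.0)
```

`np.allclose` uses default tolerances (rtol=1e-05, atol=1e-08), which is far looser than any plausible rounding error here while still catching a genuinely wrong correlation.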
* Tolerance for NaNs for ISC/ISFC, and tests
* Messing with NaN options in isfc
* Option to return NaNs in compute_correlation
* Moved NaN handling into _normalize_for_correlation
* NaN handling for ISFC and tests
* Reshaped ISFC output to n_subjects x n_voxel_pairs
* Fixed whitespace errors etc
* tolerate_nans in phaseshift_isc and timeshift_isc tests
* Added new function to squareform ISFCs and keep diagonal
* Replaced old p_from_null with new version
* Removed ISC test NIfTI data because we're simulating
* Moved phase randomization in phaseshift_isc to utils
* Removed vestigial ecdf from utils and tests
* Moved _check_timeseries_input to utils to fix dependencies
* Fixing duplicate references
* isfc can now take 'targets' as input (asymmetric)
* Fixed up 'targets' for ISFCs, added tests
* Increasing some bits of code coverage
* Modified exception tests to use shmancy pytest.raises
* Added news items for issues and PR
* Updated NEWS directly for outdated PR #384
This will add it to the next release, not v0.8, as it should be. In PR brainiak#403, I mistakenly suggested editing NEWS directly.
I'm re-working and expanding the ISC (and ISFC) code to accommodate several new features. We can now compute either pairwise or leave-one-out ISCs (rather than just leave-one-out). All analyses now also accept 'mean' or 'median' as summary statistics. I've increased the flexibility of inputs to the core ISC functions and tried to standardize output geometries (time points by voxels).

I've moved statistical tests out of the core ISC and ISFC functions; these are now separate functions. In addition to phase randomization, I've added a circular time-shifting test (Kauppi et al., 2014), and group-level permutation and bootstrap tests (Chen et al., 2016). Both the phase randomization and circular time-shift randomization tests operate on the data themselves and call the core ISC function internally to recompute ISCs on the randomized data.

Two changes to the phase randomization code: no more randomization across voxels (as this disrupts the spatial autocorrelation of fMRI data), and in the leave-one-out approach, only the left-out subject is phase randomized per iteration. If all subjects are randomized and N-1 are then averaged, the Nth subject will always be correlated with what is effectively a flat time series, yielding an overly tight null distribution and inflated false positive rates (FPRs).

Group-level bootstrap and permutation tests operate directly on the ISC values and should better control FPRs (Chen et al., 2016). For pairwise ISCs, bootstrap resampling of subjects and group relabeling operate at the row/column level, not at the level of individual pairs (thereby respecting the correlation structure among pairs). Updated and expanded tests. No statistical tests for ISFC yet, but I'm working on it.
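As a rough illustration of the leave-one-out approach and the 'mean'/'median' summary statistics described above (a simplified sketch with hypothetical names, not the PR's implementation):

```python
import numpy as np

def leave_one_out_isc(data, summary_statistic=None):
    """Simplified leave-one-out ISC sketch (not the BrainIAK API).

    data: array of shape (n_TRs, n_voxels, n_subjects).
    Correlates each subject's time series with the mean of the
    remaining N-1 subjects, per voxel.
    """
    n_TRs, n_voxels, n_subjects = data.shape
    iscs = np.zeros((n_subjects, n_voxels))
    for s in range(n_subjects):
        # Average the other N-1 subjects, then correlate per voxel
        others = np.mean(np.delete(data, s, axis=2), axis=2)
        for v in range(n_voxels):
            iscs[s, v] = np.corrcoef(data[:, v, s], others[:, v])[0, 1]
    if summary_statistic == 'mean':
        # Fisher z-transform before averaging correlations
        return np.tanh(np.mean(np.arctanh(iscs), axis=0))
    elif summary_statistic == 'median':
        return np.median(iscs, axis=0)
    return iscs

data = np.random.randn(100, 10, 5)  # 100 TRs, 10 voxels, 5 subjects
print(leave_one_out_isc(data).shape)          # (5, 10)
print(leave_one_out_isc(data, 'mean').shape)  # (10,)
```

Fisher z-transforming before averaging is the conventional way to take a mean of correlations; the median is applied to the raw ISC values.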
I would also recommend changing the module/filename from "isfc.py" to "isc.py". Although the ISC values are a subset of the ISFC values, I think ISC is conceptually superordinate, and we may want to add things like spatial ISC in the future (which does not fit under the ISFC heading). NB: This is my first major PR, so apologies if I'm missing anything obvious.