Random values in DataLoader.load() output if a stream/tier does not provide values for a certain column
#518
Comments
Are you sure it's not just a problem with the config files not being generated correctly? We have other parameters in the hit tier that are there for ged and not spm and did not have this problem in testing, I think, right @gracesong312?
Yes, I am sure. A rectangular data structure needs some placeholder whenever the hit corresponding to a certain index does not define a column value:
In case of floats one could use |
I think I overlooked this in the original testing, here's a basic test I just ran on the
Is it possible to just return |
In my test I randomly get true or false in

@gracesong312 do you pre-allocate empty columns for the output table? This could explain why the values are unpredictable (because no actual value is ever written to the pre-allocated memory). Pandas uses NumPy internally, so a boolean column cannot contain non-booleans. I would not force that column to be float in order to be able to use |
Yes, I allocate memory with pygama/src/pygama/flow/utils.py Lines 137 to 144 in 663d352
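
The referenced lines are not reproduced in this thread. As a rough illustration of the effect being discussed, here is a minimal sketch that assumes the column is allocated with `np.empty` and that only some rows are ever filled. The array names and sizes are hypothetical and are not taken from pygama.

```python
import numpy as np

n_rows = 6
# Hypothetical stand-in: only the first three rows (say, geds hits) define the flag.
has_value = np.array([True, True, True, False, False, False])
geds_flags = np.array([True, False, True])

# np.empty reserves memory but does not write to it.
column = np.empty(n_rows, dtype=bool)
column[has_value] = geds_flags
# Rows where has_value is False were never written, so their content is
# whatever happened to be in that memory and may differ from run to run.

# Allocating with an explicit fill value makes the unfilled rows deterministic,
# although the fill value is still indistinguishable from a real False.
safe_column = np.zeros(n_rows, dtype=bool)
safe_column[has_value] = geds_flags
```

Note that a deterministic fill only removes the randomness. It does not tell the user which rows were actually provided by the tier, so a validity mask such as `has_value` would still have to be carried along.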
Example: `is_valid_0vbb` is available in tier `hit` for the `geds` subsystem only. If loading data from `spms` and `geds` at the same time, `is_valid_0vbb` will randomly provide `True` or `False` (I guess because of uninitialized memory).

This is obviously very dangerous and must be fixed ASAP.
Unfortunately, we cannot simply fix it by using default values (`NaN`, for example, works only with floats), so we need to return a different data structure.