Skip to content

Correct compute::context::get_devices method#847

Open
JablonskiMateusz wants to merge 1 commit into
boostorg:masterfrom
JablonskiMateusz:master
Open

Correct compute::context::get_devices method#847
JablonskiMateusz wants to merge 1 commit into
boostorg:masterfrom
JablonskiMateusz:master

Conversation

@JablonskiMateusz

Copy link
Copy Markdown

get a vector of cl_device_id and then use the device ids
to populate a vector of compute::device

Signed-off-by: Mateusz Jablonski mateusz.jablonski@intel.com

@coveralls

coveralls commented Jun 24, 2020

Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.008%) to 84.027% when pulling 881ca42 on JablonskiMateusz:master into 36c8913 on boostorg:master.

get a vector of cl_device_id and then use the device ids
to populate a vector of compute::device

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
@JablonskiMateusz

Copy link
Copy Markdown
Author

@jszuppe could you look at this PR?

@kylelutz

kylelutz commented Jul 8, 2020

Copy link
Copy Markdown
Collaborator

Hey @JablonskiMateusz! Could you provide some detail on the motivation for this change? Was this causing issues with your OpenCL library?

For context, device is a very thin wrapper on cl_device_id, and they have identical memory layouts. There should be no need to first populate vector<cl_device_id> and then copy into vector<device>.

@JablonskiMateusz

Copy link
Copy Markdown
Author

Hi @kylelutz,
the problem is scenario with sub devices. The current implementation is that boost queries number of context devices, then creates a vector of compute::device (default constructor of compute::device is called, without calling clRetainDevice), then it passes the pointer to data of the vector to clGetContextInfo and it fills the devices (also without calling clRetainDevice) and the vector is passed to the user's app. When the app destroys the vector, then the destructor of compute::device checks if it is a sub device and it calls clReleaseDevice. This causes that boost calls clReleaseDevice without calling clRetainDevice before. According to OpenCL spec, it caues UB:

After the device reference count becomes zero and all the objects attached to device (such as
command-queues) are released, the device object is deleted. Using this function to release a
reference that was not obtained by creating the object or by calling clRetainDevice causes
undefined behavior.

IMHO the cl_device_id should be passed to compute::device to its constructor or to assignment operator, currently it is performed in a quite hacky way like memcpy to the memory of compute::device .

My change was inspired by compute::program::get_devices where the vector of compute::device is populated from vector of cl_device_id https://github.com/boostorg/compute/blob/master/include/boost/compute/program.hpp#L191

@JablonskiMateusz

Copy link
Copy Markdown
Author

@kylelutz @jszuppe any comments?

@keryell

keryell commented Jul 14, 2020

Copy link
Copy Markdown
Contributor

I think that in a real application the compute::context::get_devices should not be the critical path, so having a more correct version does not seem like a problem.

@JablonskiMateusz

Copy link
Copy Markdown
Author

@kylelutz ping

@JablonskiMateusz

Copy link
Copy Markdown
Author

@kylelutz @jszuppe ping

@JablonskiMateusz

Copy link
Copy Markdown
Author

kindly reminder @kylelutz

@JablonskiMateusz

Copy link
Copy Markdown
Author

@kylelutz @jszuppe could you merge this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants