Sharing OpenCL Kernel Data

Updated: 2024-10-24 17:21:18

I have two OpenCL kernels, run_kernel and apply_kernel, that I want to run sequentially, one after the other, several times. The output of run_kernel contains some of the input for apply_kernel, but I'm not sure how to implement this.

Currently, I have a single cl_mem buffer named d_vertexBuffer, filled with the data I want to give run_kernel, and run_kernel processes it correctly. I set the kernel argument like this:

error = clSetKernelArg(run_kernel, 0, sizeof(cl_mem), (void*) &d_vertexBuffer);

I tried setting apply_kernel to use the same d_vertexBuffer, but I'm guessing this interferes with run_kernel's access to it, since the OpenCL code gets NaN whenever it tries to read the buffer. I set the apply_kernel argument like this:

error = clSetKernelArg(apply_kernel, 0, sizeof(cl_mem), (void*) &d_vertexBuffer);

I create the d_vertexBuffer like this:

d_vertexBuffer = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, vertexBufferSize, h_vertexBuffer, &error);

To run these kernels multiple times, I have a for loop that enqueues the kernels in my command queue. Evidently this is not the correct way to do it. How can I make the two kernels share data?

Best answer

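By the sound of it, you want to append the important part of run_kernel's output to the end of the data that apply_kernel reads. You could make d_vertexBuffer large enough to store the normal input values (vertexBufferSize) as well as the extra vertices from run_kernel's output. run_kernel would then write the important part of its output into the region of d_vertexBuffer above vertexBufferSize.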
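The layout described in the answer can be sketched in plain C, without any OpenCL calls, as a helper that computes the enlarged buffer size and the position where run_kernel would start writing. This is only an illustration under stated assumptions: the names SharedLayout, make_layout, and extraOutputSize are hypothetical, and the buffer is assumed to hold float values, which the original post does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one buffer shared by both kernels. */
typedef struct {
    size_t input_bytes;   /* the original vertexBufferSize */
    size_t extra_bytes;   /* space reserved for run_kernel's appended output */
    size_t total_bytes;   /* size that would be passed to clCreateBuffer */
    size_t extra_offset;  /* element index where the appended region begins */
} SharedLayout;

/* Assumption: elements are floats, so vertexBufferSize must be a
 * whole number of floats for the element offset to be meaningful. */
static SharedLayout make_layout(size_t vertexBufferSize, size_t extraOutputSize)
{
    SharedLayout layout;

    assert(vertexBufferSize % sizeof(float) == 0);

    layout.input_bytes  = vertexBufferSize;
    layout.extra_bytes  = extraOutputSize;
    layout.total_bytes  = vertexBufferSize + extraOutputSize;
    layout.extra_offset = vertexBufferSize / sizeof(float);
    return layout;
}
```

In this sketch, total_bytes would take the place of vertexBufferSize in the clCreateBuffer call, and extra_offset could be passed to both kernels as an additional argument so that run_kernel writes to, and apply_kernel reads from, indices starting there. Because the buffer is created with CL_MEM_COPY_HOST_PTR, the host array h_vertexBuffer would also need to be at least total_bytes long.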

The problem ended up being unrelated: I was accidentally using a two-dimensional global work size for apply_kernel when I only wanted one dimension, so it was producing NaN.
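The post does not show the enqueue call, so the exact failure mode is unknown. As a rough model of one way a two-dimensional launch can affect a kernel written for one dimension, the plain-C sketch below emulates how work-items are enumerated and counts how many of them share each value of get_global_id(0). The function name count_visits is hypothetical, and no real OpenCL runtime is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many work-items see each value of get_global_id(0).
 * global_size holds work_dim entries, mirroring the global_work_size
 * array given to clEnqueueNDRangeKernel.
 * hits must have room for global_size[0] counters. */
static void count_visits(unsigned work_dim, const size_t *global_size,
                         size_t *hits)
{
    assert(work_dim == 1 || work_dim == 2);

    size_t n0 = global_size[0];
    size_t n1 = (work_dim == 2) ? global_size[1] : 1;

    for (size_t i = 0; i < n0; ++i)
        hits[i] = 0;

    /* Each (id0, id1) pair is one work-item. A kernel that only uses
     * get_global_id(0) maps every one of them onto element id0. */
    for (size_t id1 = 0; id1 < n1; ++id1)
        for (size_t id0 = 0; id0 < n0; ++id0)
            hits[id0] += 1;
}
```

With a global size of {4} and work_dim 1, each element is visited once. With {4, 3} and work_dim 2, each element is visited three times, so a kernel that reads and then overwrites its element would be applied repeatedly and concurrently to the same data. Whether that is what produced the NaN values here is an assumption, not something the post confirms.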

Published: 2023-08-02 23:55:00
Article link: https://www.elefans.com/category/jswz/34/1382474.html