The goal is to add as much OpenMP as possible to the following Cholesky factor function to increase parallelization. So far, I have only one #pragma omp parallel for implemented correctly. vector<vector<double>> represents a 2-D matrix. I've already tried adding #pragma omp parallel for to for (int i = 0; i < n; ++i), for (int k = 0; k < i; ++k), and for (int j = 0; j < k; ++j), but the parallelization goes wrong. makeMatrix(n, n) initializes a vector<vector<double>> of all zeroes of size n x n.

    vector<vector<double>> cholesky_factor(vector<vector<double>> input)
    {
        int n = input.size();
        vector<vector<double>> result = makeMatrix(n, n);
        for (int i = 0; i < n; ++i)
        {
            for (int k = 0; k < i; ++k)
            {
                double value = input[i][k];
                for (int j = 0; j < k; ++j)
                {
                    value -= result[i][j] * result[k][j];
                }
                result[i][k] = value / result[k][k];
            }
            double value = input[i][i];
            #pragma omp parallel for
            for (int j = 0; j < i; ++j)
            {
                value -= result[i][j] * result[i][j];
            }
            result[i][i] = std::sqrt(value);
        }
        return result;
    }

Recommended answer

I don't think you can parallelize much more than the innermost j loops with this algorithm: the ith iteration of the outer loop depends on the results of the (i-1)th iteration, and the kth iteration of the k loop depends on the results of the (k-1)th iteration.
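
The dependencies can be read off the recurrence that the code implements. As a sketch, writing A for input and L for result (notation introduced here, not in the original post), with zero-based indices as in the code:

```latex
\begin{aligned}
L_{ik} &= \frac{1}{L_{kk}} \Bigl( A_{ik} - \sum_{j=0}^{k-1} L_{ij} L_{kj} \Bigr), \qquad 0 \le k < i, \\
L_{ii} &= \sqrt{ A_{ii} - \sum_{j=0}^{i-1} L_{ij}^{2} }.
\end{aligned}
```

Computing L_{ik} needs L_{i0}, ..., L_{i,k-1} from earlier k iterations of the same row, plus row k from an earlier i iteration. Only the sums over j have no such ordering, which is why they can be written as reductions.
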

    vector<vector<double>> cholesky_factor(vector<vector<double>> input)
    {
        int n = input.size();
        vector<vector<double>> result = makeMatrix(n, n);
        for (int i = 0; i < n; ++i)
        {
            for (int k = 0; k < i; ++k)
            {
                double value = input[i][k];
                // reduction(-: value) does the same
                // (private instances of value are initialized to zero and
                // added to the initial instance of value when the threads join)
                #pragma omp parallel for reduction(+: value)
                for (int j = 0; j < k; ++j)
                {
                    value -= result[i][j] * result[k][j];
                }
                result[i][k] = value / result[k][k];
            }
            double value = input[i][i];
            #pragma omp parallel for reduction(+: value)
            for (int j = 0; j < i; ++j)
            {
                value -= result[i][j] * result[i][j];
            }
            result[i][i] = std::sqrt(value);
        }
        return result;
    }

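
The dependency argument above applies to the row-by-row loop order. As a sketch that is not part of the original answer: if the loops are reordered so that the outer loop walks over columns (often called the Cholesky-Crout order), the entries below the diagonal in one column depend only on earlier columns and on that column's diagonal entry, so the loop over rows can be shared among threads without a reduction. The name cholesky_by_columns is hypothetical, and makeMatrix is reimplemented here under the assumption that it returns a zero-filled matrix as the question describes.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Assumed stand-in for the question's makeMatrix: a rows x cols matrix of zeros.
Matrix makeMatrix(int rows, int cols) {
    return Matrix(rows, std::vector<double>(cols, 0.0));
}

// Hypothetical column-by-column variant (not the original answer's code).
// Expects a symmetric positive-definite input and returns the lower factor.
Matrix cholesky_by_columns(const Matrix& input) {
    const int n = static_cast<int>(input.size());
    Matrix result = makeMatrix(n, n);
    for (int k = 0; k < n; ++k) {
        assert(static_cast<int>(input[k].size()) == n);  // input must be square

        // Diagonal entry of column k: uses only columns 0..k-1 of row k.
        double diag = input[k][k];
        for (int j = 0; j < k; ++j) {
            diag -= result[k][j] * result[k][j];
        }
        result[k][k] = std::sqrt(diag);

        // Each row i below the diagonal writes only result[i][k] and reads
        // values finished in earlier columns, so the rows are independent.
        #pragma omp parallel for
        for (int i = k + 1; i < n; ++i) {
            double value = input[i][k];
            for (int j = 0; j < k; ++j) {
                value -= result[i][j] * result[k][j];
            }
            result[i][k] = value / result[k][k];
        }
    }
    return result;
}
```

Built with OpenMP enabled (for example -fopenmp with GCC or Clang), the row loop is divided among threads; without that flag the pragma is ignored and the function runs serially. For the input {{4, 12, -16}, {12, 37, -43}, {-16, -43, 98}} the factor is {{2, 0, 0}, {6, 1, 0}, {-8, 5, 3}}. Whether this ordering is actually faster depends on the matrix size and memory layout, so it is worth measuring before adopting it.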