TensorFlow Serving compilation with CUDA 9 for AWS's new p3 instances

I was able to recompile TensorFlow from Amazon's modified sources (provided in their new Deep Learning AMI).

I am now trying to compile TensorFlow Serving against that TensorFlow "fork", but I get this error:

ERROR: /root/.cache/bazel/_bazel_root/98acb40d8921d865487eab808ed364b2/external/org_tensorflow/tensorflow/contrib/nccl/BUILD:68:1: undeclared inclusion(s) in rule '@org_tensorflow//tensorflow/contrib/nccl:nccl_kernels':
this rule is missing dependency declarations for the following files included by 'external/org_tensorflow/tensorflow/contrib/nccl/kernels/nccl_rewrite.cc':
  '/root/.cache/bazel/_bazel_root/98acb40d8921d865487eab808ed364b2/external/org_tensorflow/tensorflow/core/common_runtime/optimization_registry.h'
  '/root/.cache/bazel/_bazel_root/98acb40d8921d865487eab808ed364b2/external/org_tensorflow/tensorflow/core/common_runtime/device_set.h'
  '/root/.cache/bazel/_bazel_root/98acb40d8921d865487eab808ed364b2/external/org_tensorflow/tensorflow/core/common_runtime/device.h'
  '/root/.cache/bazel/_bazel_root/98acb40d8921d865487eab808ed364b2/external/org_tensorflow/tensorflow/core/graph/types.h'
  '/root/.cache/bazel/_bazel_root/98acb40d8921d865487eab808ed364b2/external/org_tensorflow/tensorflow/core/graph/costmodel.h'
  '/root/.cache/bazel/_bazel_root/98acb40d8921d865487eab808ed364b2/external/org_tensorflow/tensorflow/core/graph/node_builder.h'
INFO: Elapsed time: 20.377s, Critical Path: 19.47s
FAILED: Build did NOT complete successfully

Some more info: I'm using the master branch of TensorFlow Serving (commit 7a349752c2cbbe741edb91c6c6be1c571e91a5fb) and Bazel release 0.7.0.

I also made a small change to tools/bazel.rc to resolve another compilation error:

# git diff tools/bazel.rc
diff --git a/tools/bazel.rc b/tools/bazel.rc
index 9397f97..28476f3 100644
--- a/tools/bazel.rc
+++ b/tools/bazel.rc
@@ -1,4 +1,4 @@
-build:cuda --crosstool_top=@org_tensorflow//third_party/gpus/crosstool
+build:cuda --crosstool_top=@local_config_cuda//crosstool:toolchain
 build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true
 build --force_python=py2
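
For anyone who wants to reproduce that one-line change without editing by hand, here is a rough sketch using sed. It runs against a scratch copy in a temporary directory (the `workdir` variable and the sample file are illustrative assumptions built from the diff above, not the real tree), so nothing in an actual checkout is touched.

```shell
# Scratch directory standing in for the serving checkout (assumption for illustration).
workdir="$(mktemp -d)"

# Sample bazel.rc containing the lines visible in the diff above.
cat > "$workdir/bazel.rc" <<'EOF'
build:cuda --crosstool_top=@org_tensorflow//third_party/gpus/crosstool
build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true
build --force_python=py2
EOF

# Swap the crosstool label; -i.bak keeps the original as bazel.rc.bak.
sed -i.bak \
  's|--crosstool_top=@org_tensorflow//third_party/gpus/crosstool|--crosstool_top=@local_config_cuda//crosstool:toolchain|' \
  "$workdir/bazel.rc"

# Show the resulting file.
cat "$workdir/bazel.rc"
```

On a real checkout the same sed expression would be pointed at tools/bazel.rc instead of the scratch file.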

Any idea what is missing?

Best answer

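I usually disable NCCL, since it never seems to build properly. The Dockerfile step below (from the linked file) deletes every line mentioning nccl from the TensorFlow contrib BUILD file before building: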

https://github.com/PipelineAI/pipeline/blob/6261c4f31105e40ab8b24ccc7834f9181f4e5aaf/package/tensorflow/16d39e9-d690fdd/Dockerfile.full-gpu#L160

RUN \
  cd $TENSORFLOW_SERVING_HOME \
# Remove NCCL since it isn't building properly
  && sed -i.bak '/nccl/d' tensorflow/tensorflow/contrib/BUILD \
  && bazel build -c opt --config=cuda \
    --verbose_failures \
    --spawn_strategy=standalone --genrule_strategy=standalone \
    --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2 \
    --crosstool_top=@local_config_cuda//crosstool:toolchain \
    tensorflow_serving/... \
  && chmod a+x bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server \
  && cp bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server /usr/local/bin/ \
  && bazel clean --expunge
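
If you want to check what the `sed -i.bak '/nccl/d'` step removes before running it on the real tree, a minimal sketch on a throwaway file may help. The BUILD contents and the `workdir` variable here are made up for illustration and are not copied from the TensorFlow repository; the point is only that sed drops every line containing "nccl" and leaves a `.bak` copy of the original.

```shell
# Scratch directory so the real source tree is untouched (assumption for illustration).
workdir="$(mktemp -d)"

# Hypothetical stand-in for tensorflow/contrib/BUILD with one nccl dependency.
cat > "$workdir/BUILD" <<'EOF'
py_library(
    name = "contrib_py",
    deps = [
        "//tensorflow/contrib/layers:layers_py",
        "//tensorflow/contrib/nccl:nccl_py",
        "//tensorflow/contrib/rnn:rnn_py",
    ],
)
EOF

# Delete every line that mentions nccl; the original is kept as BUILD.bak.
sed -i.bak '/nccl/d' "$workdir/BUILD"

# Show which lines were dropped (diff exits non-zero when files differ).
diff "$workdir/BUILD.bak" "$workdir/BUILD" || true
```

Because the pattern matches any line containing "nccl", it is worth reading the diff on the real file to make sure nothing unrelated is removed; the `.bak` copy makes it easy to restore.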