Installation Failure #3947
laytonjbgmail
started this conversation in
General
Hi, I don't think |
Good afternoon,
I'm doing a clean Python installation using Anaconda, followed by using conda to install cudatoolkit and then pip to install TensorFlow and PyTorch. When I then try to install Horovod with pip, the build fails. Here's the output from pip (apologies for the length).
(base) laytonjb@laytonjb-APEXX-T3-04:~$ HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_WITH_TENSORFLOW=1 pip install --use-pep517 --no-cache-dir horovod[tensorflow,pytorch]
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting horovod[pytorch,tensorflow]
Downloading horovod-0.28.1.tar.gz (3.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.5/3.5 MB 22.9 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting cloudpickle
Downloading cloudpickle-2.2.1-py3-none-any.whl (25 kB)
Requirement already satisfied: psutil in ./anaconda3/lib/python3.10/site-packages (from horovod[pytorch,tensorflow]) (5.9.0)
Requirement already satisfied: packaging in ./anaconda3/lib/python3.10/site-packages (from horovod[pytorch,tensorflow]) (23.0)
Requirement already satisfied: pyyaml in ./anaconda3/lib/python3.10/site-packages (from horovod[pytorch,tensorflow]) (6.0)
Requirement already satisfied: cffi>=1.4.0 in ./anaconda3/lib/python3.10/site-packages (from horovod[pytorch,tensorflow]) (1.15.1)
Requirement already satisfied: torch in ./anaconda3/lib/python3.10/site-packages (from horovod[pytorch,tensorflow]) (2.0.1+cu118)
Requirement already satisfied: tensorflow in ./anaconda3/lib/python3.10/site-packages (from horovod[pytorch,tensorflow]) (2.12.0)
Requirement already satisfied: pycparser in ./anaconda3/lib/python3.10/site-packages (from cffi>=1.4.0->horovod[pytorch,tensorflow]) (2.21)
Requirement already satisfied: protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (4.23.3)
Requirement already satisfied: gast<=0.4.0,>=0.2.1 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (0.4.0)
Requirement already satisfied: typing-extensions>=3.6.6 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (4.6.3)
Requirement already satisfied: setuptools in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (65.6.3)
Requirement already satisfied: tensorflow-estimator<2.13,>=2.12.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (2.12.0)
Requirement already satisfied: numpy<1.24,>=1.22 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (1.23.5)
Requirement already satisfied: libclang>=13.0.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (16.0.0)
Requirement already satisfied: termcolor>=1.1.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (2.3.0)
Requirement already satisfied: h5py>=2.9.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (3.9.0)
Requirement already satisfied: wrapt<1.15,>=1.11.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (1.14.1)
Requirement already satisfied: opt-einsum>=2.3.2 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (3.3.0)
Requirement already satisfied: google-pasta>=0.1.1 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (0.2.0)
Requirement already satisfied: grpcio<2.0,>=1.24.3 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (1.54.2)
Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (0.32.0)
Requirement already satisfied: flatbuffers>=2.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (23.5.26)
Requirement already satisfied: absl-py>=1.0.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (1.4.0)
Requirement already satisfied: keras<2.13,>=2.12.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (2.12.0)
Requirement already satisfied: jax>=0.3.15 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (0.4.12)
Requirement already satisfied: astunparse>=1.6.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (1.6.3)
Requirement already satisfied: six>=1.12.0 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (1.16.0)
Requirement already satisfied: tensorboard<2.13,>=2.12 in ./anaconda3/lib/python3.10/site-packages (from tensorflow->horovod[pytorch,tensorflow]) (2.12.3)
Requirement already satisfied: networkx in ./anaconda3/lib/python3.10/site-packages (from torch->horovod[pytorch,tensorflow]) (3.0)
Requirement already satisfied: sympy in ./anaconda3/lib/python3.10/site-packages (from torch->horovod[pytorch,tensorflow]) (1.11.1)
Requirement already satisfied: triton==2.0.0 in ./anaconda3/lib/python3.10/site-packages (from torch->horovod[pytorch,tensorflow]) (2.0.0)
Requirement already satisfied: filelock in ./anaconda3/lib/python3.10/site-packages (from torch->horovod[pytorch,tensorflow]) (3.9.0)
Requirement already satisfied: jinja2 in ./anaconda3/lib/python3.10/site-packages (from torch->horovod[pytorch,tensorflow]) (3.1.2)
Requirement already satisfied: lit in ./anaconda3/lib/python3.10/site-packages (from triton==2.0.0->torch->horovod[pytorch,tensorflow]) (15.0.7)
Requirement already satisfied: cmake in ./anaconda3/lib/python3.10/site-packages (from triton==2.0.0->torch->horovod[pytorch,tensorflow]) (3.25.0)
Requirement already satisfied: wheel<1.0,>=0.23.0 in ./anaconda3/lib/python3.10/site-packages (from astunparse>=1.6.0->tensorflow->horovod[pytorch,tensorflow]) (0.38.4)
Requirement already satisfied: ml-dtypes>=0.1.0 in ./anaconda3/lib/python3.10/site-packages (from jax>=0.3.15->tensorflow->horovod[pytorch,tensorflow]) (0.2.0)
Requirement already satisfied: scipy>=1.7 in ./anaconda3/lib/python3.10/site-packages (from jax>=0.3.15->tensorflow->horovod[pytorch,tensorflow]) (1.10.1)
Requirement already satisfied: werkzeug>=1.0.1 in ./anaconda3/lib/python3.10/site-packages (from tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (2.3.6)
Requirement already satisfied: requests<3,>=2.21.0 in ./anaconda3/lib/python3.10/site-packages (from tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (2.29.0)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in ./anaconda3/lib/python3.10/site-packages (from tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (0.7.1)
Requirement already satisfied: markdown>=2.6.8 in ./anaconda3/lib/python3.10/site-packages (from tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (3.4.3)
Requirement already satisfied: google-auth-oauthlib<1.1,>=0.5 in ./anaconda3/lib/python3.10/site-packages (from tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (1.0.0)
Requirement already satisfied: google-auth<3,>=1.6.3 in ./anaconda3/lib/python3.10/site-packages (from tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (2.20.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./anaconda3/lib/python3.10/site-packages (from jinja2->torch->horovod[pytorch,tensorflow]) (2.1.1)
Requirement already satisfied: mpmath>=0.19 in ./anaconda3/lib/python3.10/site-packages (from sympy->torch->horovod[pytorch,tensorflow]) (1.2.1)
Requirement already satisfied: urllib3<2.0 in ./anaconda3/lib/python3.10/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (1.26.16)
Requirement already satisfied: rsa<5,>=3.1.4 in ./anaconda3/lib/python3.10/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (4.9)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in ./anaconda3/lib/python3.10/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (5.3.1)
Requirement already satisfied: pyasn1-modules>=0.2.1 in ./anaconda3/lib/python3.10/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (0.3.0)
Requirement already satisfied: requests-oauthlib>=0.7.0 in ./anaconda3/lib/python3.10/site-packages (from google-auth-oauthlib<1.1,>=0.5->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (1.3.1)
Requirement already satisfied: idna<4,>=2.5 in ./anaconda3/lib/python3.10/site-packages (from requests<3,>=2.21.0->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (3.4)
Requirement already satisfied: charset-normalizer<4,>=2 in ./anaconda3/lib/python3.10/site-packages (from requests<3,>=2.21.0->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in ./anaconda3/lib/python3.10/site-packages (from requests<3,>=2.21.0->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (2023.5.7)
Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in ./anaconda3/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (0.5.0)
Requirement already satisfied: oauthlib>=3.0.0 in ./anaconda3/lib/python3.10/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<1.1,>=0.5->tensorboard<2.13,>=2.12->tensorflow->horovod[pytorch,tensorflow]) (3.2.2)
Building wheels for collected packages: horovod
Building wheel for horovod (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for horovod (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [237 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-310
creating build/lib.linux-x86_64-cpython-310/horovod
copying horovod/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod
creating build/lib.linux-x86_64-cpython-310/horovod/keras
copying horovod/keras/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/keras
copying horovod/keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/keras
copying horovod/keras/callbacks.py -> build/lib.linux-x86_64-cpython-310/horovod/keras
creating build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/gradient_aggregation.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/gradient_aggregation_eager.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/compression.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/functions.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/util.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/sync_batch_norm.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
copying horovod/tensorflow/mpi_ops.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
creating build/lib.linux-x86_64-cpython-310/horovod/torch
copying horovod/torch/compression.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
copying horovod/torch/functions.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
copying horovod/torch/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
copying horovod/torch/optimizer.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
copying horovod/torch/sync_batch_norm.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
copying horovod/torch/mpi_ops.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
creating build/lib.linux-x86_64-cpython-310/horovod/_keras
copying horovod/_keras/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/_keras
copying horovod/_keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/_keras
copying horovod/_keras/callbacks.py -> build/lib.linux-x86_64-cpython-310/horovod/_keras
creating build/lib.linux-x86_64-cpython-310/horovod/common
copying horovod/common/exceptions.py -> build/lib.linux-x86_64-cpython-310/horovod/common
copying horovod/common/basics.py -> build/lib.linux-x86_64-cpython-310/horovod/common
copying horovod/common/process_sets.py -> build/lib.linux-x86_64-cpython-310/horovod/common
copying horovod/common/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/common
copying horovod/common/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/common
copying horovod/common/util.py -> build/lib.linux-x86_64-cpython-310/horovod/common
creating build/lib.linux-x86_64-cpython-310/horovod/mxnet
copying horovod/mxnet/compression.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
copying horovod/mxnet/functions.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
copying horovod/mxnet/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
copying horovod/mxnet/mpi_ops.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
creating build/lib.linux-x86_64-cpython-310/horovod/spark
copying horovod/spark/mpi_run.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
copying horovod/spark/gloo_run.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
copying horovod/spark/conf.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
copying horovod/spark/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
copying horovod/spark/runner.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
creating build/lib.linux-x86_64-cpython-310/horovod/runner
copying horovod/runner/task_fn.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
copying horovod/runner/run_task.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
copying horovod/runner/mpi_run.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
copying horovod/runner/gloo_run.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
copying horovod/runner/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
copying horovod/runner/launch.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
copying horovod/runner/js_run.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
creating build/lib.linux-x86_64-cpython-310/horovod/data
copying horovod/data/data_loader_base.py -> build/lib.linux-x86_64-cpython-310/horovod/data
copying horovod/data/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/data
creating build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/elastic_v2.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/ray_logger.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/worker.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/utils.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/adapter.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/runner.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
copying horovod/ray/strategy.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
creating build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
copying horovod/tensorflow/keras/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
copying horovod/tensorflow/keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
copying horovod/tensorflow/keras/callbacks.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
creating build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
copying horovod/tensorflow/data/compute_worker.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
copying horovod/tensorflow/data/compute_service.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
copying horovod/tensorflow/data/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
creating build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
copying horovod/torch/elastic/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
copying horovod/torch/elastic/state.py -> build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
copying horovod/torch/elastic/sampler.py -> build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
creating build/lib.linux-x86_64-cpython-310/horovod/spark/task
copying horovod/spark/task/gloo_exec_fn.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
copying horovod/spark/task/task_service.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
copying horovod/spark/task/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
copying horovod/spark/task/mpirun_exec_fn.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
copying horovod/spark/task/task_info.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
creating build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/tensorflow.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/optimizer.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/bare.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/datamodule.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
copying horovod/spark/keras/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
creating build/lib.linux-x86_64-cpython-310/horovod/spark/tensorflow
copying horovod/spark/tensorflow/compute_worker.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/tensorflow
copying horovod/spark/tensorflow/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/tensorflow
creating build/lib.linux-x86_64-cpython-310/horovod/spark/torch
copying horovod/spark/torch/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
copying horovod/spark/torch/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
copying horovod/spark/torch/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
copying horovod/spark/torch/datamodule.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
copying horovod/spark/torch/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
creating build/lib.linux-x86_64-cpython-310/horovod/spark/data_loaders
copying horovod/spark/data_loaders/pytorch_data_loaders.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/data_loaders
copying horovod/spark/data_loaders/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/data_loaders
creating build/lib.linux-x86_64-cpython-310/horovod/spark/driver
copying horovod/spark/driver/host_discovery.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
copying horovod/spark/driver/job_id.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
copying horovod/spark/driver/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
copying horovod/spark/driver/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
copying horovod/spark/driver/rendezvous.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
copying horovod/spark/driver/mpirun_rsh.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
copying horovod/spark/driver/rsh.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
creating build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
copying horovod/spark/lightning/legacy.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
copying horovod/spark/lightning/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
copying horovod/spark/lightning/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
copying horovod/spark/lightning/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
copying horovod/spark/lightning/datamodule.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
copying horovod/spark/lightning/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
creating build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/_namedtuple_fix.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/params.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/serialization.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/constants.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/store.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/cache.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/backend.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/datamodule.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
copying horovod/spark/common/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
creating build/lib.linux-x86_64-cpython-310/horovod/runner/task
copying horovod/runner/task/task_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/task
copying horovod/runner/task/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/task
creating build/lib.linux-x86_64-cpython-310/horovod/runner/util
copying horovod/runner/util/streams.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
copying horovod/runner/util/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
copying horovod/runner/util/lsf.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
copying horovod/runner/util/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
copying horovod/runner/util/network.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
copying horovod/runner/util/cache.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
copying horovod/runner/util/threads.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
creating build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/driver.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/worker.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/settings.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/discovery.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/constants.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/rendezvous.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
copying horovod/runner/elastic/registration.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
creating build/lib.linux-x86_64-cpython-310/horovod/runner/driver
copying horovod/runner/driver/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/driver
copying horovod/runner/driver/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/driver
creating build/lib.linux-x86_64-cpython-310/horovod/runner/common
copying horovod/runner/common/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common
creating build/lib.linux-x86_64-cpython-310/horovod/runner/http
copying horovod/runner/http/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/http
copying horovod/runner/http/http_client.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/http
copying horovod/runner/http/http_server.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/http
creating build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/config_parser.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/hosts.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/safe_shell_exec.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/host_hash.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/settings.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/secret.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/env.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/network.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/timeout.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/codec.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
copying horovod/runner/common/util/tiny_shell_exec.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
creating build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
copying horovod/runner/common/service/task_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
copying horovod/runner/common/service/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
copying horovod/runner/common/service/compute_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
copying horovod/runner/common/service/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
running build_ext
Traceback (most recent call last):
File "/home/laytonjb/anaconda3/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 351, in <module>
main()
File "/home/laytonjb/anaconda3/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 333, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/laytonjb/anaconda3/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 249, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 416, in build_wheel
return self._build_with_temp_dir(['bdist_wheel'], '.whl',
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 401, in _build_with_temp_dir
self.run_setup()
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 487, in run_setup
super(_BuildMetaLegacyBackend,
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 338, in run_setup
exec(code, locals())
File "<string>", line 213, in <module>
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 343, in run
self.run_command("build")
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
_build_ext.run(self)
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "<string>", line 106, in build_extensions
File "<string>", line 67, in get_cmake_bin
ModuleNotFoundError: No module named 'packaging'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for horovod
Failed to build horovod
ERROR: Could not build wheels for horovod, which is required to install pyproject.toml-based projects
The problem seems to be a missing module, "packaging". It was originally installed by conda, but I've also removed it and reinstalled it using pip. In both cases I get the same error message.
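One detail in the traceback may explain why reinstalling "packaging" makes no difference: most frames run from /tmp/pip-build-env-zc5jamd7/overlay rather than from ./anaconda3, which suggests the build happens in pip's temporary, isolated build environment. The following is only a rough sketch of that check; the helper name `classify_frames` and the path pattern are assumptions made for illustration, based on the paths shown in the log above.

```python
import re

# Temporary overlay directory that pip creates for an isolated build, e.g.
# /tmp/pip-build-env-zc5jamd7/overlay/... (pattern assumed from the log above).
BUILD_ENV = re.compile(r"pip-build-env-[^/]+/overlay")
# Extracts the file path from a traceback line such as: File "...", line 12, in f
FRAME = re.compile(r'File "([^"]*)", line \d+')


def classify_frames(traceback_text):
    """Count traceback frames inside vs. outside pip's temporary build env."""
    counts = {"build_env": 0, "other": 0}
    for path in FRAME.findall(traceback_text):
        key = "build_env" if BUILD_ENV.search(path) else "other"
        counts[key] += 1
    return counts


# Three frames copied from the traceback in this post.
sample = """
File "/home/laytonjb/anaconda3/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 333, in main
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 416, in build_wheel
File "/tmp/pip-build-env-zc5jamd7/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
"""

print(classify_frames(sample))
```

If most frames fall under "build_env", the failing import was probably resolved against the temporary environment, not against the Anaconda site-packages where "packaging" is installed.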
I also tried the suggestion in #3483, which didn't provide a solution, although it looks like the problem should be corrected by using PEP 517 (see the script below).
After installing the latest Anaconda and running "conda update conda" and "conda update --all", I ran the following script:
conda install -c conda-forge -y cudatoolkit=11.8.0
python3 -m pip install nvidia-cudnn-cu11==8.6.0.163 tensorflow==2.12.*
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
python3 -c "import torch; print(torch.cuda.is_available())"
HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_WITH_TENSORFLOW=1 pip install --use-pep517 --no-cache-dir horovod[tensorflow,pytorch]
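Before the final Horovod step, a quick preflight check can confirm what the base interpreter actually sees. This is a minimal sketch, not part of the original script: the helper names `importable` and `preflight` are hypothetical, and the module list is an assumption drawn from the packages named in the pip log. It only inspects the interpreter it runs in, so it says nothing about a temporary build environment that pip may create.

```python
import importlib.util


def importable(name):
    """Return True if `name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises for a missing parent package or a broken __spec__.
        return False


def preflight(modules=("packaging", "torch", "tensorflow", "cmake")):
    """Map each module name to whether it is visible to this interpreter."""
    return {name: importable(name) for name in modules}


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name:12s} {'found' if ok else 'MISSING'}")
```

If every module is reported as found here and the Horovod build still raises ModuleNotFoundError, that points away from the base environment and toward the environment the build itself runs in.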
Any advice? Thanks!