Support ARM64 platform in TensorFlow examples #2119

Merged: 3 commits merged into kubeflow:master on May 21, 2024

Conversation

akhilsaivenkata
Contributor

What this PR does / why we need it: Support ARM64 platform in TensorFlow examples

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #2112

Checklist:

  • Docs included if any changes are user facing

Signed-off-by: akhilsaivenkata <akhilammu1@gmail.com>
@coveralls

coveralls commented May 18, 2024

Pull Request Test Coverage Report for Build 9163985221

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.008%) to 35.375%

Files with Coverage Reduction | New Missed Lines | %
pkg/controller.v1/mpi/mpijob.go | 1 | 91.06%

Totals
Change from base Build 9130601320: -0.008%
Covered Lines: 4373
Relevant Lines: 12362

💛 - Coveralls

@akhilsaivenkata
Contributor Author

akhilsaivenkata commented May 18, 2024

@tenzen-y, the check is failing because the 'libhdf5.so' library is missing from the build environment. Do we need to change the Dockerfile, or is there a workaround?

@tenzen-y
Member

Yes, feel free to address that issue. I suspect that bumping the TensorFlow version would resolve it.

@akhilsaivenkata
Contributor Author

This example uses python:3.9 as the base image, and the TensorFlow installation is failing there: https://github.com/kubeflow/training-operator/blob/master/examples/tensorflow/distribution_strategy/keras-API/Dockerfile

The remaining TensorFlow examples use a tensorflow base image, which comes with all of its dependencies. Is there a reason for using a python base image in this case?

Signed-off-by: akhilsaivenkata <akhilammu1@gmail.com>
@akhilsaivenkata
Contributor Author

Hi @tenzen-y, all checks now pass for this PR. Could you please review it and merge it if everything is in order? Thank you for your time and assistance.

@@ -1,5 +1,7 @@
FROM python:3.9

RUN apt-get update && apt-get install -y libhdf5-dev
Member

Suggested change
- RUN apt-get update && apt-get install -y libhdf5-dev
+ RUN apt-get update \
+     && apt-get install -y libhdf5-dev \
+     && apt-get clean \
+     && rm -rf /var/lib/apt/lists/*

Should we make the container image lightweight?
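
For context, here is a minimal sketch of how the top of the Dockerfile might look with the suggestion applied. It assumes the `python:3.9` base image and the `libhdf5-dev` package shown in the diff. The `--no-install-recommends` flag is an extra assumption that is not part of the suggestion, and the rest of the Dockerfile (dependency installation, copying the example code) is omitted.

```dockerfile
# Sketch only: mirrors the base image and package named in this PR.
FROM python:3.9

# Install the HDF5 development files and clear the apt caches in the same
# RUN instruction, so the downloaded package lists never persist in a layer.
# --no-install-recommends is an added assumption, not part of the suggestion.
RUN apt-get update \
    && apt-get install -y --no-install-recommends libhdf5-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
```

The cleanup has to be in the same `RUN` instruction as the install. Each `RUN` produces its own image layer, so deleting `/var/lib/apt/lists/*` in a later instruction would hide the files but not reduce the image size.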

Contributor Author

Absolutely, @tenzen-y! Thanks for the suggestion. I will make these changes to keep the image lightweight.

Signed-off-by: akhilsaivenkata <akhilammu1@gmail.com>
@google-oss-prow google-oss-prow bot added size/S and removed size/XS labels May 20, 2024
Member

@tenzen-y tenzen-y left a comment

Thank you!
/lgtm
/approve

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tenzen-y


@google-oss-prow google-oss-prow bot merged commit e13d336 into kubeflow:master May 21, 2024
39 checks passed