Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add workspace environment support #20900

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

tdyas
Copy link
Contributor

@tdyas tdyas commented May 10, 2024

Add support for a "workspace environment" which is similar to a local environment except that execution happens in the repository itself (i.e., the "workspace") and not in an execution sandbox. This implements the user-visible experience of Design B from the "In-Workspace Process Execution" design document.

This PR builds on top of the Rust support for in-workspace execution contained in #20772.

Users will use the new workspace_environment target type to configure workspace execution support.

The pants-workspace-exec-testing repository contains an example of how to use this support to allow Pants to invoke Bazel in the workspace.

Comment on lines 590 to 593
# TODO: This is necessary for tests of `adhoc_tool` and `shell_command` with
# workspace execution to pass repeatedly in local Pants development.
# We need a viable solution instead of this hack.
cache_scope=ProcessCacheScope.PER_SESSION,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjyw: Any ideas on how to fix the test caching issue in the Pants repository and avoid this hack?

@tdyas
Copy link
Contributor Author

tdyas commented May 10, 2024

Only the most recent commit is new. The other commits are from #20772.

@tdyas tdyas force-pushed the shell_workspace_support_2 branch 2 times, most recently from 40dd9d4 to 2808018 Compare May 10, 2024 15:38
@tdyas tdyas force-pushed the shell_workspace_support_2 branch 3 times, most recently from 66e9416 to 3387cae Compare May 22, 2024 02:59
@@ -253,6 +254,33 @@ class AdhocToolOutputRootDirField(StringField):
)


class AdhocToolCacheScopeField(StringField):
Copy link
Contributor Author

@tdyas tdyas May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduction of cache scope to adhoc_tool and shell_command target types should be moved to a separate PR if we decide to keep it. It was necessary to provide a way for tests to set cache_scope to ProcessCacheScope.PER_SESSION in order to avoid cached results from prior tests runs from being used for a current test run.

This points to a more fundamental problem: Will workspace executions which modify the workspace be inherently a problem if the user is looking for them to always execute?

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe dumb question, but why isn't the caching strategy determined by the environment? I.e., local is cacheable and workspace is not? Do we need this extra degree of freedom?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user does have a reproducible process invoked in the workspace environment (e.g., Bazel), shouldn't Pants cache the output then?

Copy link
Contributor Author

@tdyas tdyas May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is two sort of issues here:

  1. Should processes invoked in workspace environments be cacheable? My thought is yes they should (for example, Bazel invocations). (Invalidations may occur as a result of the metadata-based invalidation to come from PoC for metadata-based invalidation #20914.)
  2. The tests of the workspace environment fail when they invoke a non-idempotent process which writes a file into the workspace and which the test checks for to ensure execution actually happened in the workspace). Running that non-idempotent test multipe times fails unless the cache scope is "per session." Maybe this is a code smell in the test? Is there another way that the tests can set cache scope without exposing that ability in public API?

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What key would we cache against though?

I can see not rerunning the process if the invalidation globs haven't changed from the last time we ran the process, but that is not the same as caching. In other words, of we run the process in state A, then as long as we remain in state A we don't rerun it. Then we switch to state B, so obviously we must rerun the process (and we do because the file mtimes have changed). Then we switch back to state A. Normally we'd retrieve the result from cache, but for workspace environments I think we have to rerun the process, no? All we know is that mtimes have changed, we don't fingerprint anything, so there's nothing to compute a cache key out of.

Or am I fundamentally misunderstanding the proposition here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR does not introduce the metadata-based cache key. Thus, you may want to review #20914 which does introduce the cache key: namely a hash of the "full stats" of the files on which the targets are depending via a invalidation_globs field. This is injected into the applicable Process as an environment variable.

@tdyas tdyas force-pushed the shell_workspace_support_2 branch 2 times, most recently from d5286e8 to 350d7cf Compare May 24, 2024 03:31
@tdyas tdyas requested a review from a team May 24, 2024 03:33
@tdyas tdyas marked this pull request as ready for review May 24, 2024 03:35
@tdyas tdyas requested a review from benjyw May 24, 2024 03:37
docs/notes/2.22.x.md Show resolved Hide resolved
@@ -253,6 +254,33 @@ class AdhocToolOutputRootDirField(StringField):
)


class AdhocToolCacheScopeField(StringField):
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe dumb question, but why isn't the caching strategy determined by the environment? I.e., local is cacheable and workspace is not? Do we need this extra degree of freedom?

src/python/pants/core/util_rules/environments.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/system_binaries.py Outdated Show resolved Hide resolved
@tdyas tdyas force-pushed the shell_workspace_support_2 branch 4 times, most recently from 1dfd950 to ef67eea Compare May 28, 2024 03:49
Comment on lines +567 to +571
def default_cache_scope(self) -> ProcessCacheScope:
if self.val and self.val.has_field(LocalWorkspaceCompatiblePlatformsField):
return ProcessCacheScope.PER_SESSION
else:
return ProcessCacheScope.SUCCESSFUL
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this method as a way for an environment to specify what the default cache scope is.

@benjyw: Is this what you were thinking with your suggestion?

If so, this maybe should be pulled out and all applicable use sites updated.

@tdyas tdyas requested review from benjyw, stuhood and huonw May 28, 2024 03:51
@tdyas tdyas force-pushed the shell_workspace_support_2 branch from ef67eea to 5e66cd1 Compare June 1, 2024 17:03
Copy link
Contributor

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

I haven't approved because I haven't reviewed in detail and it seems @benjyw is on top of that. Just have some questions prompted by the docs.


The primary motivation for this feature is to better support integration with third-party build orchestration tools (for example, Bazel) which may not operate properly when not invoked in the repository (including in some cases incurring signifcant performance penalties).

There is a significant trade-off though which makes this feature inherently **UNSAFE**: Pants cannot reasonbly guarantee that build processes are reproducible if they run in the workspace environment. Thus, Pants puts that burden on you, the Pants user, to guarantee that any process executed in the workspace environment is reproducible based solely on inputs in the repository. If a process is not reproducible, then unknown side effects may occur.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions about side-effects/process executions and "UNSAFE" call out:

  1. Before this PR, one can run non-reproducible code within a sandbox (like shell_command(command="date", ...) that depends on the time, or something that reads mutable data from the internet) and this might not behave like someone expects. From a Pants perspective, are the side-effects from this approach similar or worse than that? (Obviously the process itself could do things arbitrarily bad too, but that's not the focus.)
  2. Is there risk of problems due to processes being killed or cancelled in unexpected ways? If so, is it worth calling that out?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants