Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More Complete ACL Implementation #386

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

kevin-montrose
Copy link
Contributor

@kevin-montrose kevin-montrose commented May 15, 2024

This sketches out a more proper ACL implementation - one with categories beyond @admin, and where individual commands can be ACL'd.

Basic idea is to use RespCommandsInfo as a mechanism to discover command ACLs, and attach a bitfield to each User to track individual commands rather than categories.

Outstanding TODOs:

  • Test rest of commands w.r.t. ACL categories
  • Implement individual command ACL'ing (ie. +get)
  • Implement sub-command ACL'ing (ie. -config|get)
  • Figure out what to do with unlisted/internal/cluster commands
  • Figure out what to do with custom extensions
    • Introduced a @garnet ACL category that all commands not in Redis are now tagged with
  • (Probably) Remove CommandCategory - it is redundant
  • Clean up command parsing, now that subcommand bytes are a "later in the pipeline'-thing
    • As part of this, probably DRY up ACL checking - it's a bit error prone to add a new command with this code so "wet"
    • SET (with EX, NX, etc. but not SETEX) are still special cased as they weren't realized as subCmds
  • Cluster subcommands are weird, but need ACL testing too
  • CustomCmd, CustomTnx, CustomObjCmd are weird and need to be ACL'able in some way too
    • Introduced a @Custom ACL category that blocks all of these, and the individual commands can also be ACL'd (that is +customcmd works)
    • Longer term we may want to allow ACL'ing the commands by their actual registered name, but IMO that should be a separate enhancement

This is explicitly not going to implement keyspace patterns. I feel like that should be a separate enhancement, as it's pretty substantial on it's own.


The biggest incidental change here is a substantial refactoring of RespCommand and the various switches over it. This is, IMO, the cleanest way to get a canonical value to test against ACLs with. As a consequence, RespCommand has a bunch more entries as most things that were previously "sub commands" (but not in the RESP sense) are now actual enum values.

There's one additional implementation detail to call out, which is how ACL descriptions are maintained. So that ACL descriptions behave more or less like you'd expect (that is +@foo +bar -bar becomes +@foo) a greedy approach is taken where, upon update, a trial removal of each token is made to see if it impacts the effective ACL set. This is pretty allocate-y, and while that could be improved I think ACL manipulation is rare and locked down enough that clarity was a bit more important. It's worth a stringent review, however. Code is in User.RationalizeACLDescription.

An attempt has been made to cover all commands, but more remain.

A number of TODOs remain, but this gives a rough shape of how it'd work
and how individual Commands or SubCommands would end up ACL'd too.
@lmaas lmaas self-assigned this May 15, 2024
As part of this, needed to actually enumerate all commands in RespCommands which
implies a lot of future cleanup in parsing and command processesing.

This also uncovered a number of different gaps in implemented RESP commands, which
were not addressed unless doing so was necessary for ACL testing.
Garnet-specific commands are still pending.
Introduces @garnet category.
Reworks WATCH so MS and OS are proper subcommands.
Reworks parsing to remove the tuple with subcommand, as it'd always be 0 now.
…ands, while the common case is consolidated into an upfront check.

Note that SET (and pseudo-variants SETEXNX, etc.) is a special case, as it lacks subcommands but has multiple command types.
That could be cleaned up later, but has performance implications - so leaving it for future work.
…CL categories, replace slow temp code with better code, add tests to catch future additions; this involves reordering RespCommand yet again
@kevin-montrose kevin-montrose marked this pull request as ready for review May 21, 2024 16:42
@kevin-montrose
Copy link
Contributor Author

kevin-montrose commented May 21, 2024

There's still some more to do here (more testing, subcommands, some cleanup) , but I'm going on vacation for a week and this is pretty dang close to how it's going to actually look when done. So taking it off draft, and marking ready for review (if not merging).

/cc @mgravell @NickCraver - this is relevant to your interests.

@kevin-montrose kevin-montrose changed the title [SKETCH] More Complete ACL Implementation More Complete ACL Implementation May 21, 2024
Copy link
Contributor

@lmaas lmaas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only made it half-way through the PR, but this implementation looks very promising! Your efforts in addressing the open to-do items in the ACL implementation are greatly appreciated. I have included some comments below.

libs/cluster/Session/ClusterSession.cs Outdated Show resolved Hide resolved
libs/cluster/Session/ClusterSession.cs Outdated Show resolved Hide resolved
libs/server/ACL/ACLException.cs Outdated Show resolved Hide resolved
libs/server/ACL/ACLParser.cs Outdated Show resolved Hide resolved
libs/server/ACL/User.cs Outdated Show resolved Hide resolved
libs/server/ACL/User.cs Outdated Show resolved Hide resolved
libs/server/ACL/User.cs Outdated Show resolved Hide resolved
test/Garnet.test/CredentialManager.cs Outdated Show resolved Hide resolved
Copy link
Contributor

@vazois vazois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR has too many changes to vet, maybe we need to break it into smaller PRs. I think the first is good but very hard to review because the surface area is huge. I think the idea of having a bitmap to match command permissions is good. There might I couple alternatives to tradeoff between CPU cycles or memory. We need to separate general fixes from clear ACL enchantments and avoid pushing parsing inside the execution methods.

libs/cluster/Session/MigrateCommand.cs Show resolved Hide resolved
libs/cluster/Session/ClusterSession.cs Show resolved Hide resolved
libs/cluster/Session/ClusterSession.cs Show resolved Hide resolved
libs/server/Resp/RespServerSession.cs Outdated Show resolved Hide resolved
libs/cluster/Session/RespClusterBasicCommands.cs Outdated Show resolved Hide resolved
libs/server/Resp/Bitmap/BitmapCommands.cs Outdated Show resolved Hide resolved
libs/server/Resp/KeyAdminCommands.cs Outdated Show resolved Hide resolved
libs/server/Resp/Objects/HashCommands.cs Outdated Show resolved Hide resolved
libs/server/Resp/Objects/HashCommands.cs Show resolved Hide resolved
@kevin-montrose
Copy link
Contributor Author

I think this PR has too many changes to vet, maybe we need to break it into smaller PRs. I think the first is good but very hard to review because the surface area is huge. I think the idea of having a bitmap to match command permissions is good. There might I couple alternatives to tradeoff between CPU cycles or memory. We need to separate general fixes from clear ACL enchantments and avoid pushing parsing inside the execution methods.

Yeah, it's gigantic - ACLs touch literally everything. Excluding tests, it is trending smaller as things DRY up.

A bunch of random-ish fixes are somewhat unfortunate, as a big part of testing this is making sure that failed ACLs don't bork the command stream - and I encountered a number of cases of fairly "brittle" command implementations. I'm open to breaking those out into separate PRs once ACLs are in a good place, but I needed some confidence that the ACL design would work for all commands upfront.

kevin-montrose and others added 4 commits May 28, 2024 13:40
Co-authored-by: Lukas Maas <contact@lukasmaas.com>
…ssion ACL validation

address feedback; break CommandPermissionSet out into separate file
address feedback; increment correct statistic, fixing bug
address feedback; adding count is unnecessary in many places, remove that post-DRY-ing
@badrishc
Copy link
Contributor

Great work overall. Shaping up really nicely, thanks!

This means we now document all of these in RespCommandsInfo.json, as they're just normal commands from the perspective of ACL'ing.
…mands cannot be ACL'd, but ACL'ing all custom (with @Custom) or as part of Garnet extensions (with @garnet) works as does ACL'ing all custom non-object (with customcmd), object (with customobjcmd), and transaction (with customtxn) separately

Fixup a number of test failures and nits.
…rt of this, correct some places where commands weren't tagged @fast or @slow (everything should be tagged one or the other)
…have no impact.

This implementation isn't amazing, but ACL changing is hopefully rare and privileged enough that it is acceptable.
@kevin-montrose
Copy link
Contributor Author

@lmaas @vazois @badrishc This is now in a complete state from my perspective, and ready for full review.

@badrishc
Copy link
Contributor

badrishc commented Jun 5, 2024

Benchmark (Release mode): Embedded.perftest --op-workload PING --op-percent 100 -t 1

Framework: net8.0

Branch: main (on Kevin's fork)

Running with 1 threads...
Avg. throughput: 135719.46 Kops/sec; 135719.46 Kops/sec/thread (Max: 135719.46, Min: 135719.46)

Branch: aclImprovements (on Kevin's fork)

Running with 1 threads...
Avg. throughput: 62371.18 Kops/sec; 62371.18 Kops/sec/thread (Max: 62371.18, Min: 62371.18)

Summary

Looks like we are at around 54% performance regression at the moment. Need to drill deeper into why.

Edit: looks like there is too much variance across runs so take above with a grain of salt.

@TalZaccai TalZaccai self-requested a review June 5, 2024 00:48
Copy link
Contributor

@vazois vazois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some issues related to code style. Please use compact switch statements where possible.
There are some protocol compatibility issues with the responses generated for invalid input in cluster and bitmap commands.
Thanks for refactoring the cluster commands to make them compatible with the improved ACL implementation.

libs/server/Resp/Objects/HashCommands.cs Show resolved Hide resolved
libs/server/Resp/Objects/HashCommands.cs Outdated Show resolved Hide resolved
libs/server/Resp/Objects/HashCommands.cs Outdated Show resolved Hide resolved
libs/server/Resp/RespCommand.cs Outdated Show resolved Hide resolved
libs/server/Resp/RespCommand.cs Outdated Show resolved Hide resolved
libs/cluster/Session/ClusterCommands.cs Outdated Show resolved Hide resolved
libs/cluster/Session/ClusterCommands.cs Show resolved Hide resolved
test/Garnet.test.cluster/ClusterNegativeTests.cs Outdated Show resolved Hide resolved
test/Garnet.test/RespAdminCommandsTests.cs Outdated Show resolved Hide resolved
test/Garnet.test/RespAdminCommandsTests.cs Outdated Show resolved Hide resolved
@kevin-montrose
Copy link
Contributor Author

Benchmark (Release mode): Embedded.perftest --op-workload PING --op-percent 100 -t 1

Framework: net8.0

Branch: main (on Kevin's fork)

Running with 1 threads... Avg. throughput: 135719.46 Kops/sec; 135719.46 Kops/sec/thread (Max: 135719.46, Min: 135719.46)

Branch: aclImprovements (on Kevin's fork)

I reproduce this regression on my latest too, I'll spend some time digging (now that all feedback is addressed) and update when I have an explanation/fix.

return false;

while (!RespWriteUtils.WriteError($"ERR Unknown subcommand or wrong number of arguments for COMMAND.", ref dcurr, dend))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error should be in the format of CmdStrings.GenericErrUnknownSubCommand


// Handle COMMAND COUNT
if (count > 0 && (subCommand.SequenceEqual(CmdStrings.COUNT) || subCommand.SequenceEqual(CmdStrings.count)))
while (!RespWriteUtils.WriteError($"ERR Unknown subcommand or wrong number of arguments for COMMAND COUNT.", ref dcurr, dend))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error should be in the format of CmdStrings.GenericErrWrongNumArgs

/// have a matching RespCommandInfo objects defined in RespCommandsInfo
/// </summary>
[Test]
public void CommandsInfoCoverageTest()
Copy link
Contributor

@TalZaccai TalZaccai Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was removed from main, we decided not to have it enforced. Alternatively, we can add an Ignore attribute

@badrishc
Copy link
Contributor

badrishc commented Jun 5, 2024

I reproduce this regression on my latest too, I'll spend some time digging (now that all feedback is addressed) and update when I have an explanation/fix.

  • Is it possible something is inadvertently getting boxed?
  • I made this into a BDN testcase here: main...badrishc/resp-parse-stress - here are the results comparing main to this PR:

main:

Method Job EnvironmentVariables Runtime Mean Error StdDev
InlinePing .NET 6 Empty .NET 6.0 4.292 us 0.0446 us 0.0395 us
InlinePing .NET 8 DOTNET_TieredPGO=0 .NET 8.0 2.449 us 0.0052 us 0.0040 us

PR:

Method Job EnvironmentVariables Runtime Mean Error StdDev
InlinePing .NET 6 Empty .NET 6.0 4.937 us 0.0462 us 0.0409 us
InlinePing .NET 8 DOTNET_TieredPGO=0 .NET 8.0 3.958 us 0.0443 us 0.0393 us

@kevin-montrose
Copy link
Contributor Author

Is it possible something is inadvertently getting boxed?
I made this into a BDN testcase here: main...badrishc/resp-parse-stress - here are the results comparing main to this PR:

Doesn't appear to be allocations based on local profiling. Might be useful to add [MemoryDiagnoser] to that benchmark, which I haven't had a chance to poke at yet.

Doing some quick and dirty benchmarking with the Embedded.perftest example, it looks like it's mostly the cost of doing the CheckACLPermissions on every request now.

On my box, .NET 8 with PGO disabled I'm getting about...

  • main ~62K PINGs / s
  • this branch ~43K PINGs / s
  • this branch, but remove the call to if (...) with the CheckACLPermissions call ~58K PINGs / s

Total loss ~30%, and a solid ~25% (~15K PINGs / s) is lost in checking ACLs.

I'll noodle on speeding this up in the morning.

@badrishc
Copy link
Contributor

badrishc commented Jun 6, 2024

Doesn't appear to be allocations based on local profiling. Might be useful to add [MemoryDiagnoser] to that benchmark, which I haven't had a chance to poke at yet.

I have updated the branch with [MemoryDiagnoser] and added Get/Set in addition to Ping, for completeness. PR for this is here: #446

@kevin-montrose
Copy link
Contributor Author

Mkay, because I can't leave well enough alone despite it being after 11 PM here...

One relatively simple approach would be to slap a int[] (trading space for time) on User that's used to cache the last ACL check indexed by each command (skipping any ACL checks on a 1, only transitioning from 0 to 1 on a successful ACL check). We just allocate a new int[] and swap it out whenever the user's ACL is modified. A really terrible hacked version of this gets in the ~59K range again locally, but that's with no testing so huge grain of salt.

I need to think on if there's any point in keeping the bit-packed version around; if not, we could just initialize the int[] with the "truth" and then there's not even an update.

Another, much less simple, approach would be to lean in to RespCommand being expanded and change the various ProcessXXXCommands methods into a single lookup into an array of function pointers / delegates. The array starts initialized with some function that does the ACL check and replaces the appropriate entry with a pointer to the corresponding NetworkXXX(...) method for the command on successful ACL. Subsequent invocations would jump straight to the command's actual implementation. On any user permission change, we reset the array.

That might ultimately be a bit cleaner, but it'd require a bunch of additional changes on top of this already gigantic change. There's also an indirection in there which makes me not so confident the perf would be any better.

So I'll keep pursuing the first approach. Of course, if there are any other clever ideas, I'd be happy to hear them.

@badrishc
Copy link
Contributor

badrishc commented Jun 6, 2024

Mkay, because I can't leave well enough alone despite it being after 11 PM here...

One relatively simple approach would be to slap a int[] (trading space for time) on User that's used to cache the last ACL check indexed by each command (skipping any ACL checks on a 1, only transitioning from 0 to 1 on a successful ACL check). We just allocate a new int[] and swap it out whenever the user's ACL is modified. A really terrible hacked version of this gets in the ~59K range again locally, but that's with no testing so huge grain of salt.

I need to think on if there's any point in keeping the bit-packed version around; if not, we could just initialize the int[] with the "truth" and then there's not even an update.

Another, much less simple, approach would be to lean in to RespCommand being expanded and change the various ProcessXXXCommands methods into a single lookup into an array of function pointers / delegates. The array starts initialized with some function that does the ACL check and replaces the appropriate entry with a pointer to the corresponding NetworkXXX(...) method for the command on successful ACL. Subsequent invocations would jump straight to the command's actual implementation. On any user permission change, we reset the array.

That might ultimately be a bit cleaner, but it'd require a bunch of additional changes on top of this already gigantic change. There's also an indirection in there which makes me not so confident the perf would be any better.

So I'll keep pursuing the first approach. Of course, if there are any other clever ideas, I'd be happy to hear them.

I'm not sure we need to do this. The bitmap is nice and tight. There might be a simpler way to get back the performance. Can look into it tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants