Deficiency: uninstantiable union fields still require a tag value (proposal: exclude them) #19860

Open
rohlem opened this issue May 4, 2024 · 0 comments

Tagged unions are allowed to have states/fields of uninstantiable (noreturn-like) types; see #3257, #15909, and other issues for explanation.

However, status-quo requires the union's states/fields to match the tag enum's states/fields exactly,
including states/fields of uninstantiable types.
This forces the tag of such tagged unions to be larger than necessary, which can only be worked around by
manual reification via @Type. That workaround worsens ergonomics, making code harder to both read and write.

Some example code to reference:

/// returned union has 0-3 instantiable states/fields
pub fn U(comptime has_a: bool, comptime has_b: bool) type {
    return union(enum) {
        const FieldA = if (has_a) u8 else noreturn;
        const FieldB = if (has_b) u8 else noreturn;

        a: FieldA,
        b1: FieldB,
        b2: FieldB,

        /// `switch` allows specifying all states/fields,
        /// which improves ergonomics in generic code.
        fn assertNotB2(x: @This()) void {
            switch (x) {
                .a, .b1 => {},
                .b2 => unreachable,
            }
        }
    };
}

comptime {
    const UA = U(true, false);
    const UB = U(false, true);
    const UAB = U(true, true);
    const TagA = @typeInfo(UA).Union.tag_type.?;
    const TagB = @typeInfo(UB).Union.tag_type.?;
    const TagAB = @typeInfo(UAB).Union.tag_type.?;

    const assert = @import("std").debug.assert;
    assert(@bitSizeOf(TagA) == 2); // status-quo: 2 bits; 1 instantiable state should require 0 bits
    assert(@bitSizeOf(TagB) == 2); // status-quo: 2 bits; 2 instantiable states should only require 1 bit
    // Note: the compiler-generated Tag enum already isn't shared in status-quo
    assert(TagA != TagB);
    assert(TagA != TagAB);
    // For completeness, this is also disallowed in status-quo:
    const E = enum { a };
    const T = union(E) {
        a: void,
        b: noreturn, //error: no field named 'b' in enum 'main.comptime_0__enum_365'
    };
    _ = T{ .a = {} };
}

The main issue with status-quo is that the tag type is forced to grow to more bits than necessary.
There are two main cases to consider:

  • For user-provided union(T) tag types, I propose simply allowing a more compact enum type than status-quo permits,
    one that is not required (but is still allowed) to reserve states/values for uninstantiable union states/fields.
    This boils down to selectively loosening the current state/field equivalence check between the union and the tag type.

  • For compiler-provided union(enum) tag types, it might make sense
    after #18816 / #19190 (namespace type equivalence based on AST node + captures)
    to expect the tag type to be deduplicated for types created from the same AST node.
    However, this currently isn't the case, and I personally don't see the value in doing this.
    If that particular behavior is desired, an explicit enum type can be created and used instead.
    Therefore, I propose that the compiler-provided tag type also shouldn't include states/fields for union states/fields with uninstantiable types.


Technically optional: Salvaging (exhaustive) switch

The one additional demand I want to pose here is that the ergonomics of switch should not degrade due to this optimization.
I find it highly valuable to be able to write a single switch, include all fields, and re-use that code regardless of which fields are instantiable and which aren't.
I believe that today this only works because the tag enum contains all of these fields, which is used as result type of the enum literals in the switch prongs.

In order not to degrade this use case, tagged union types basically need a list of all uninstantiable field names,
and those particular names have to be whitelisted to appear in switch prongs.
(Further allowing them in ==/!= comparisons, etc., would also be nice, though.)

The cleanest implementation I can think of for this would be to include a second full_tag_type in builtin.Type.Union.
This full tag type would be used as the "first result location" for type checking,
while the actual field enum type would be used afterwards; at that point uninstantiable field names are dropped because they are unreachable.

I realize this last part is a semi-big language feature to propose, but I really think the ergonomic boon would warrant it.
That said, there'd be ways for me to work around it in userland, so it's not as critical of a requirement as the first half
(which would require reifying all applicable tagged union types with @Type).
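
For reference, the @Type workaround mentioned above could look roughly like the sketch below. It assumes the Zig 0.12/0.13 layout of std.builtin.Type (`.Union`, `.Enum`, `.layout = .auto`), and the name `Compact` and its candidate list are hypothetical, not something from the compiler or standard library. It rebuilds the `U` example so that only instantiable fields receive a tag value.

```zig
const std = @import("std");

/// Hypothetical helper: like `U` from the example, but the tag enum
/// only contains the instantiable fields.
/// Assumes the Zig 0.12/0.13 `std.builtin.Type` field names.
pub fn Compact(comptime has_a: bool, comptime has_b: bool) type {
    const Candidate = struct { name: [:0]const u8, present: bool };
    const candidates = [_]Candidate{
        .{ .name = "a", .present = has_a },
        .{ .name = "b1", .present = has_b },
        .{ .name = "b2", .present = has_b },
    };

    var enum_fields: []const std.builtin.Type.EnumField = &.{};
    var union_fields: []const std.builtin.Type.UnionField = &.{};
    for (candidates) |c| {
        if (!c.present) continue; // uninstantiable: no tag value reserved
        enum_fields = enum_fields ++ [_]std.builtin.Type.EnumField{.{
            .name = c.name,
            .value = enum_fields.len,
        }};
        union_fields = union_fields ++ [_]std.builtin.Type.UnionField{.{
            .name = c.name,
            .type = u8,
            .alignment = @alignOf(u8),
        }};
    }
    // This sketch does not handle the case of zero instantiable fields.
    if (enum_fields.len == 0) @compileError("Compact: no instantiable fields");

    // Smallest integer type that can hold every remaining tag value.
    const Tag = @Type(.{ .Enum = .{
        .tag_type = std.math.IntFittingRange(0, enum_fields.len - 1),
        .fields = enum_fields,
        .decls = &.{},
        .is_exhaustive = true,
    } });

    return @Type(.{ .Union = .{
        .layout = .auto,
        .tag_type = Tag,
        .fields = union_fields,
        .decls = &.{}, // reified types cannot carry declarations
    } });
}

comptime {
    const TagA = @typeInfo(Compact(true, false)).Union.tag_type.?;
    const TagB = @typeInfo(Compact(false, true)).Union.tag_type.?;
    std.debug.assert(@bitSizeOf(TagA) == 0); // 1 state
    std.debug.assert(@bitSizeOf(TagB) == 1); // 2 states
}
```

The sketch also shows the ergonomic cost: the reified union cannot declare methods such as assertNotB2, and a switch on `Compact(true, false)` can no longer name `.b1` or `.b2`, which is the use case the second half of this proposal tries to preserve.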
