-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove goto #630
Comments
Unless there are local functions with access to parent stack
|
Your example code can be implemented with defer like this: fn foo() -> %void {
%defer {
// nontrivial error handling code
};
%return errorCondition();
%return someOtherErrorCondition();
%return yetAnotherErrorCondition();
} |
Here's another way to write it: fn foo() {
e: {
if (error_condition) return :e;
if (some_other_error_condition) return :e;
if (yet_another_error_condition) return :e;
return;
}
// nontrivial error handling code
} |
@andrewrk: these two examples are syntactically very tricky. They move the simple and explicit error checking from its proper place to some unnatural location, or they require creating additional functions (and inventing more names). They are replacing the most simple and readable flow
with something strange. The examples also assume that everyone wants to return Zig (I am also unsure whether the examples would work, because of the need to access local variables.) Here's another real code advantage of
To implement such feature (shared error handling code) is impossible w/o |
The second one has the same source locations as your pseudocode.
Where is the Zig error in the second example? Please respect the OP's request:
|
I should had comment the examples separately. The second example is still tricky and creates named block and adds new level of indentation. These are two local level problems I try to minimize: the need to invent names and the excessive indentation. (That's why I proposed Zig having
There's not.
The 3pp code which uses A non-simple use is I do not write open source. I use goto occasionally, only for error handling, and only going down. It never caused problems for me. |
Btw, the second example could be rewritten as:
No need for blocks and one artificial name less, but I would not like it either. Unlike occasional goto such trick is also not present in C codebases I know. |
|
I wrote this code today. Made me regret that there is no goto in javascript.
In the worst case it calls I honestly don't understand why you want to remove goto so much. |
re2c looked like a spam, but I guess it's not. It uses goto heavily, is it? |
pluto439, I am not convinced this JavaScript is a good use case. Why don't you just do a loop and use "break"? |
Why t f would you use a loop here? There is nothing to loop. |
can be done with a loop:
But I am obviously missing something and looking like a fool. Sorry. |
Oh, this. Yes, it could be better with loop, bad example then. The flat code feels simplier though... The ofter example then, I often need to do something like
I guess I have to learn how to use exceptions. My code is pretty unstable right now, it breaks from time to time. I probably should learn D just for its exception handling. |
@pluto439, you can write code like that in js with nested if statements or if you want to make it flat then you can wrap the code in a
In Zig that kind of behaviour can be achieved with while loops if you really want to but you should use %defer (or whatever the syntax for it is in the future), so there is no need for goto. To be honest I believe only reasons for having goto are state machines/interpreters where you would most likely want to have computed goto which we don't have in Zig or error handling for which Zig already has better replacement already. So unless we can get computed goto, goto should probably be removed as obsolete language construct. Computed goto rationales and examples: |
Ilariel, you introduced that I thought jido was going to -- fake loop. Goto is so much simplier than all of this. Why you want to make my life harder? |
find: {
const view_tag = document.getElementById("view") ?? return :find;
const a_tags = view_tag.getElementsByTagName("a");
if (a_tags.len < 3) return :find;
const third_elem = a_tags[2];
// ...
} This looks better to me than any of the JS code up there. Also it found a superfluous check for |
With goto I don't need to create code blocks needlessly. I had a tendency to create too many code blocks, and code went all the way to the right as a result. Now that I'm not afraid to use return anymore, it's a bit better.
How do I read it? Like this?
It wasn't superfluous. There are different actions on "found" and "not_found".
|
Thank you @PavelVozenilek, @Jurily, and @Ilariel for the real actual examples of goto. Here's my analysis of them:
(And configurable macro functions can be implemented with comptime-dynamic function references.)
and:
"computed goto"-style finite state machines can be implemented with As i said here, while-switch can be converted to computed goto by an optimization pass in the compiler. Even if llvm doesn't have this, we can do it on the zig side. Even if it proves to be difficult to do, we can add a language feature later that does this better than general-purpose gotos with respect to communicating intent precisely. And we're not discussing removing computed goto, because zig doesn't have that. This proposal is to remove goto with hardcoded labels. I don't know if re2c uses hardcoded goto or computed goto, because I couldn't find any examples of the output C/C++ code. |
Why the dislike for So there's simple concept proven in practice versus novelty feature. This novelty feature is IMHO harder to think about and less expressive than the old one. I personally do not like novelties and would recommend to keep as close as possible with C syntax/features. |
There are good reasons to have But I'll still give my opinion on
fn foo() {
if (is_error()) goto handle_error;
var value = bar();
return value;
handle_error:
return error.Foo;
} If you can't tell why that's a compile error, then that's evidence that To make unstructured So that's my opinion on |
@thejoshwolfe: I do not know about implementation issues, and cannot argue about these. But it seems to be sold here as backward, dangerous feature to be replaced by shiny new thing. (I remember goto horribleness was discussed long ago in D, but then Walter Bright said that he converts all control flow constructs in gotos internally. Also some proposed syntax could be written using existing constructs, as mentioned here. But I would not like this kind of minimalism. |
So I now have to write a more ugly code to make compiler a bit simplier. I thought your goal was to make close to metal programs, which is why you avoided garbage collection. Removing goto means going in the opposite direction. Eh, whatever, I need to make my own language anyway. |
Nope. Making the compiler simpler is just my opinion. The real argument for removing
That's over simplified. We want to make optimal, readable, maintainable, etc. programs. The Zen of Zig does not include imitating assembly. If you can make an argument that Pseudo code doesn't impress me much, because it's possible to invent pseudo code to justify any imaginable language feature. Pseudo code does not demonstrate a need for |
defer already breaks it. You can make the same code even without defer, and it will be even more oblivious. Defer was made for exception handling (in D, by a different name), it doesn't make much sense by itself, and zig doesn't have exceptions. (Well, it has one, but it's one way and is always fatal.)
I think I talked about this here #594 (comment) . Gotoing after a variable creation will mean that a variable exists, but is simply uninitialized. Also look at this, gotoing over a declaration is ok for variables and fixed-length arrays, but not ok for variable-length arrays https://stackoverflow.com/questions/29880836/skip-variable-declaration-using-goto When time comes to livecoding, you will be jumping all over the place in the debugger. You know how in visual studio you can just drag the yellow arrow all over the place to go back or forward? Internally, this is the "goto". Why forbid something in the language, if you will need that in the debugger later? Defer will somewhat complicate debugging too, because code wouldn't flow from top to bottom anymore, it will return in the defer sometimes. Imagine gotoing into the defer block, what will happen at exit? (It will run all defers in the code block in the reverse order, starting from the one it is in right now.) Goto is the way the hardware works, it's very natural simply for that reason. Removing goto will mean that I will have to create many code blocks that I overwise wouldn't need. With goto I'm 100% sure that everything I can imagine I will be able to implement. I'm imagening something like <> diamond, where code blocks will have intersect. Or some very complex expressions in the "if" statements, when the only alternative to goto will be creating extra variables, and compiler will not be able to optimize them out back to goto properly. Or a situation when I will need to do uninitialization of something in a very specific order, that the standard defer will not be able to handle. I can't invest any time in zig unless you figure out the situation with exeptions and goto. Especially exceptions, because it affects the rest of a program significally. Exceptions can be faster than error codes if used carefully. #578 (comment) |
@paulie-g Do you have any specific examples of code that would be more readable / less reliant on compiler optimizations if goto was available? Real world use cases are far easier to understand and find solutions to or make proposals for. |
Zig is a general purpose language, intended to target a wide variety of architectures. Zig's control flow structures are specifically designed to be portable. For two examples, have a look at the SPIR-V specs under "Structured Control Flow" and WebAssembly Specs. These target architectures do not have jump - in fact they have basically the same thing as Zig's labeled blocks and loops. It's possible for a backend compiler to rework a goto-based control flow into targeting these platforms but (1) that is more work for the compiler, making it compile slower and (2) more difficult to optimize since the goto is actually a higher level construct! Zig's status quo control flow is both convenient for the programmer as well as extremely portable for backend architectures. I second @pfgithub in inviting you to provide specific examples of code that zig fails to adequately codegen for that we can evaluate as motivating use cases. Finally,
If you want to participate here you're going to need to tone this kind of thing down. You're new, we're all ready to welcome you and be friendly and have fun and stuff. You're comin in hot with guns blazing, put em away cowboy, this here's a peaceful town. Have a drink on me |
@andrewrk Point fully taken. I'm used to a comm style where directness bordering on frankness is valued, but it's your community and your rules - fair enough, especially since I've contributed a grand total of zero to it. I'd point out that the epithet was used in reference to a decision, not a person, but re-litigating the benefits of LKML vs nice people behaving nicely cultures isn't on-topic ;) FWIW, I wouldn't have commented at all had I not cared enough - zig looks uniquely attractive (except, at first glance, for this issue). There is no need to provide goto for those platforms. There's not even a need to implement it where it creates a problem, such as in the presence of your HL error handling mechanisms. It's perfectly reasonable to say "it's available for silicon targets only" and "must not appear in a block that uses any of X, Y.. Z". Most of the classical use cases for goto/computed goto have been mentioned here and the other thread already - interpreters, FSMs (which are isometric really), kernels and so on. I have a high-performance non-cryptographic hash function in front of me (in C, obvs) where it's vital. Some Rust people demonstrated the interpreter use case quite conclusively (incidentally, around the same issue - gotos), complete with analysis of what LLVM produces. It's not even just a codegen issue - in most of those cases, you're trying to avoid giving the CPU a single-site branch prediction problem it can't possibly solve, being hit in a tight loop. It's a case of "I have the needed information, but have no way to convey it down the stack and those layers can't possibly induce it because the information isn't there". Frankly, I was just leaving my opinion here without an expectation of a response. The removal has been made, computed gotos were never even in to begin with, and I doubt that any case I put forward is going to move the needle - I don't imagine this decision was taken on a whim nor do I imagine the people who made it were lacking general clue, just experience with doing the sort of specific things where this micromanagement is useful. Broadly, this is about control for the programmer. The explicit and ubiquitous support for custom allocators is super useful for precisely that reason - it has me completely sold - and something that's not present in other competing langs in a usable way. Skimming through the intro, zig seems to emphasize a featureset that gives me fine control when I want it - except, that is, in this particular case. For the life of me I can't understand the departure. |
I've read the inline asm discussion btw. It's an option, but a bad one. The real-world example is the same hash function mentioned above - insn scheduling is heavily microarch-dependent because xMULx latency, reciprocal throughput and uOps on ports changes. Recent gcc, and I imagine clang, gives me correct scheduling up to Skylake-X at least automagically. With inline asm, I'd have to manually solve that optimisation problem not just once, but every time I make a change, once per microarch targeted. |
Can you open a new issue with some zig code and compare it to gcc/clang/msvc code to demonstrate the flaw? I won't close an issue that demonstrates where Zig falls short of beating C in this regard. |
That's going to be time-consuming, in that I'd have to a) create an artificial example, and b) create a benchmark harness in zig that works well enough to capture the difference (non-trivial, wasn't trivial in C either) - the difference is on the order of ~1 cycle for keys with len <= 32 iirc and a poisson distribution of key lengths iirc. Plus an extra 1 cycle that clang loses over gcc, which I'm assuming is in codegen. Realistically, I'm not sure when I would get to that. Out of curiosity, what was gained by the removal? I read the original thread, and this one, and didn't get a sense of what drove it. |
Hi, Is this issue still active? If so I want to try my attempt to post my code that uses goto here... |
@Trung0246 You can do that |
I don't have that much ziglang experience but I did read through the documentation Code: https://pastebin.com/vxS5WSmk (removed some unnecessary part, example usage is probably at the botom) Here is own personal code that uses Basically the logic of The I think I could refactor the code by using 2 lambdas in C++ and same for ziglang but then it's not really elegant compare to using a single If you have any question just ask (I know my code is ridiculously over engineered 😂) |
Does translate-c support goto yet? Once that's done we can tell people to write their goto code in C and see how it translates into zig 😄 |
Here is a use-case for Here's a godbolt comparing a C goto implementation and a Zig Gotos aren't the only way to represent DFAs -- there's the (IMO nicer) way of using a set of mutually tail-recursive functions. There are two problems with this, though: there's no TCO in debug mode, and LLVM still doesn't optimize it as well as it does the Note that this is a different use-case to computed goto -- LLVM actually optimizes |
@mb64 you can use the guaranteed tail call optimization of zig: fn lex_a(data: *Data) callconv(.C) void
{
while(data.cursor < data.input.len and data.input[data.cursor] == 'a') {
data.a_count += 1;
data.cursor += 1;
}
return @call(.{ .modifier = .always_tail }, lex, .{data});
} Full example is here: https://zig.godbolt.org/z/K1c817 |
How is this fn foo() !u8 {
if (is_error()) goto handle_error;
var value = bar();
return value;
handle_error:
return error.Foo;
} different from this? fn foo() !u8 {
attempt: {
if (is_error()) break :attempt;
var value = bar();
return value;
}
return error.Foo;
} |
It's not really. Zig essentially has goto via labeled breaks |
In one particular way it's very different and much better. The version with |
Labeled break is essential goto with a builtin off-by-one error, since the break goes to the statement after the label instead of the labelled statement. It also causes way too many levels of nesting to appear (something I want to avoid). |
Is there a good way to jump to a specific location within a while loop from outside said while loop? E.g. while (true) {
// logic A
if (condition) goto :LABEL;
// logic B
if (condition2) break;
}
while (true) {
// logic C
LABEL:
// logic D
} In a language like C# I would implement the 2nd while loop as a goto to make Of course, one could always use a boolean switch like so: const b = LABEL: while (true) {
// logic A
if (condition) break :LABEL false;
// logic B
if (condition2) break :LABEL true;
};
while (true) {
if (b) {
// logic C
}
b = true;
// logic D
} To me, doing this is more complicated than it needs to be. I would prefer a goto statement. However, I would settle for someone telling me how I could get LLVM to optimize this to be equivalent in the emitted assembly. In my code, the mov al, 1
test al, 1
jne .LBB1_443 ; why do we need to check if ((1 & 1) != 0)? 😔
jmp .LBB1_460 ; this is impossible. We all know 1 and 0 are not equal. (I am using ReleaseFast on zig trunk on godbolt) |
Your solution would require to guard the logic between the two loops if there is any too: while (true) {
// logic A
if (condition) goto :LABEL;
// logic B
if (condition2) break;
}
// logic I
while (true) {
// logic C
LABEL:
// logic D
} Actually it looks like a state machine, so I would express it with explicit state: var state = 0;
while (true) {
if (state == 0) {
// logic A
if (condition) state = 1 else state = 2;
}
else if (state == 1) {
// logic B
if (condition2) state = 3 else state = 0;
}
else if (state == 2) {
// logic D
state = 4;
}
else if (state == 3) {
// logic I
state = 4;
}
else if (state == 4) {
// logic C
state = 2;
}
} |
Just to add, that solution specifically would be further improved by #8220: const initial_state = ;
sw: switch (@as(u3, 0)) {
0 => {
// logic A
continue :sw if (condition) 1 else 2;
},
1 => {
// logic B
continue :sw if (condition2.*) 3 else 0;
},
2 => {
// logic D
continue :sw 4;
},
3 => {
// logic I
continue :sw 4;
},
4 => {
// logic C
continue :sw 2;
},
else => unreachable,
} |
For reference, here the code I am working on: https://zig.godbolt.org/z/sjzYj7nGG. It is not cleaned up very much yet but what I am doing is trying to find the most optimal way to implement my DynSDT data structure: https://validark.github.io/DynSDT You can see I have multiple implementations of the The function mov bpl, 1
test bpl, 1
jne .LBB16_44
jmp .LBB16_50 The if (depq_len == 10 - k and next_i != NULL) {
do_cur_insertion = false;
break; // this should skip the if (do_cur_insertion) statement in the following loop
} It should jump to the second half of this loop: while (true) {
if (do_cur_insertion) {
// ...
}
// please jump here!
do_cur_insertion = true;
if (next_i != NULL) {
// ...
}
depq_len -%= 1;
if (depq_len == std.math.maxInt(@TypeOf(depq_len))) return @intCast(u4, k);
cur_i = depq[depq_len];
} Instead, it jumps to code that checks More complaints about LLVMOddly enough, in that same function, not related to the issue of goto, this code: k += 1;
if (k == 10) return 10; translates to this assembly: inc al
cmp al, 10
je .LBB16_45
; ... stuff
.LBB16_45:
mov al, 10
jmp .LBB16_2 Randomly scrolling through the assembly, I found another example: cmp r11b, 4
jne .LBB16_38 ; if (r11b != 4)
jmp .LBB16_41 ; if (r11b == 4)
; ...
; this is only referenced once in the assembly, so it was generated specifically to go with the previous jmp!
.LBB16_41:
vpextrd esi, xmm0, 3
mov r11b, 4 ; r11b = 4 😔 Maybe this is just a problem in general for LLVM? The emit is also pretty temperamental. Changing the comments in the code can lead to instructions changing order, which is odd (when the instruction order does not matter). My first assembly in this comment sometimes gets one of the other if (depq_len == 0) return k;
depq_len -= 1; This gives me this emit: .LBB16_38:
test r11b, r11b
je .LBB16_2
.LBB16_39:
dec r11b However, this same code in the second while loop gives me this emit: sub r11b, 1
jb .LBB16_2 I can ask more forcefully for this optimization by using this code instead: depq_len -%= 1;
if (depq_len == std.math.maxInt(@TypeOf(depq_len))) return k; Yes, this is something of a micro-optimization, but it is still weird to me that if you use the exact same code to check |
Update: I tried changing the code to look more like a state machine by using an enum rather than a boolean, and it translated the end of each block to this instead: xor r9b, 1
jmp .LBB16_43
; ...
.LBB16_43:
test r9b, 1
je .LBB16_51 At the moment, it might be impossible to achieve what I am trying to do. If anyone wants to take a crack at it, I would be happy to be shown there is a way to get the optimized output I am looking for. Otherwise, maybe #8220 will guarantee the jumps I am trying to do? I also tried implementing this via tail calls: https://zig.godbolt.org/z/Gj9qeahMK |
This is a good example of idiomatic goto in a VM dispatch loop: https://github.com/sqmedeiros/lpeglabel/blob/master/lplvm.c It only stands as an argument for goto if there isn't a better way to write it in Zig. Is there? |
The I really don't see any logic in that file which has a compelling use of (Regardless, I should note that this part of Zig is essentially as decided as it gets; |
I don't have opinions about whether Zig should, or should not, have I saw this in the first comment:
And then read a lot of abstract discussion, with relatively few real-life examples of nontrivial Here is another example, also from the Lua team, where eight gotos are used to structure a VM dispatch loop. Lua is renowned for its relative speed and very small binary. They're using I'm evaluating Zig for writing a VM, so I'm keenly interested in whether it's possible to get the same performance, and a clearer instruction flow, using what Zig has now. If it is, why would anyone want You might look at the question this way: if someone replies with a detailed answer, and demonstration that the machine code generated by a Zig version is broadly equivalent, that's conclusive: Zig doesn't need But I did read all the comments, and code like this wasn't addressed (a related question about computed gotos was). So, given the specific call to action in the first post, I thought it was appropriate to resurrect the discussion. |
@mlugg's answer is still applicable. #8220 would allow you to do arbitrary jumps without the normal footguns associated with goto. Right now, you can only do it with tail call optimization, which is a massive pain and doesn't work on some platforms (iirc WASM). Right now, you can write your VM as a regular switch statement, and once #8220 comes out you can switch over. Otherwise, you might want to use DynAsm or something that allows you to have precise control over register allocation if you really want to write a performant VM. A high level language + LLVM, by itself, is not going to be the ideal choice for a VM if you want to beat LuaJIT. If you're interested in more ideas, I'd look at https://github.com/luajit-remake/luajit-remake |
Thanks for the pointer to that LuaJIT project, definitely the sort of thing I'm interested in generally! Custom ASM-enhanced VMs have their place, as does tracing. There is also a place for portable platform-independent VMs which perform well: so Lua, not LuaJIT. Reading #8220 more closely, I'm beginning to see how it would more than likely cover the parts of a VM's state machine which can't be expressed in C without |
I'd like to note that the use case you bring up is actually very important to the compiler itself. The main loop of semantic analysis in the Zig compiler - our primary bottleneck excluding LLVM - is essentially a VM instruction dispatch loop, which in fact largely motivated #8220 in the first place. Similar loops also exist in all of our code generation backends. So emitting optimal assembly here is definitely something we want to make sure we can do! (Incidentally, I am actively working on #8220, and expect to have a PR up within days.) |
Late to the party, but I think I have an interesting use case for goto in C to contribute: at the end of a switch-case branch to jump to a handful of different 'epilog blocks' (this is code-generated): The compiler output is exactly as you'd expect (in native code at least): the code branches via a jump-table, and the end of each case-branch has a direct jump to one of the three epilog blocks. Empty case branches are "short-cutted" and directly have the address of the goto target in the jump table. I didn't inspect WASM output, but performance is close to native code, so I guess it's fine. Here's a blogpost about how that emulator works in general (only the first couple of sections are interesting): https://floooh.github.io/2021/12/17/cycle-stepped-z80.html PS: as a special case, the https://github.com/floooh/chips/blob/bd1ecff58337574bb46eba5e7e16937899360e56/chips/z80.h#L1805 ...think of it as a giant state machine, not unlike the result of an async/await code transformation. PPS: I'm planning to do some emulator experiments in Zig soon-ish, and I hope I can somehow emulate this construct without a C-like goto (maybe with nested named blocks and labelled break?) - in any case I kinda like how straightforward the C version is for this sort of low-level "outside-the-box" contraption :) |
Please everyone post links to your zig code that uses goto, so we won't remove the feature. Alternatively, post links to C code that uses goto, so we can assess if there's a more zig-like way to do it, or if goto is really the best solution for those cases.
I want real actual code here, not pseudocode examples. Please link to open source projects.
This discussion started in #346.
For reference, here's how you can do without
goto
:defer
for the try-finally pattern.%defer
for the cleanup-on-error pattern.while (true) { switch (state) { ...
for cases where you want computed goto. Here's some existing discussion on that: labeled loops, labeled break, labeled continue #346 (comment)break
to jump forward and labeledcontinue
to jump backward, placing loops where necessary. This is the plan for the C-to-Zig translator. This is the ugliest solution, but should always work when all else fails.The text was updated successfully, but these errors were encountered: