Zig language server and cancellation

198 points
10 months ago
by goranmoomin

Comments


ninepoints

As someone actually working on some symbol indexing stuff at the moment, I can mention that this model is ignoring some pretty important details, and I think there's some classic overengineering happening here (easy trap to fall into when you're still in the design space).

Generally, when you need a code completion action, it's expected that you have an AST (post semantic analysis). This AST is still very useful even after a user continues to edit the code! Outside of where the cursor is, most of the symbols, definitions, and declarations are relevant, and source locations can be easily translated from a previous version of the document, provided that deltas are tracked. Cancelling an existing translation unit parse on edit is wasteful, because the majority of the time, that parse will produce meaningful results. The better approach (IMO) is to let the parse finish, but immediately enqueue a subsequent parse (with some debouncing timer to avoid overly consuming user resources). If you wanted to level this up further, you could perform incremental parsing/analysis, provided your language supports it. In the presence of a preprocessor, this can be very difficult, but it's the next "upgrade" from the previous approach mentioned in my opinion.
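A minimal sketch of that finish-then-debounce scheme in C++ (ParseWorker, the 200ms window, and reparse() are all hypothetical choices, not from any real server):

    #include <chrono>
    #include <condition_variable>
    #include <mutex>
    #include <thread>

    struct ParseWorker {
        std::mutex m;
        std::condition_variable cv;
        bool dirty = false;
        std::chrono::steady_clock::time_point last_edit;

        void reparse() { /* full parse of the current buffer goes here */ }

        void on_edit() {                      // called on every keystroke
            std::lock_guard<std::mutex> lock(m);
            dirty = true;
            last_edit = std::chrono::steady_clock::now();
            cv.notify_one();
        }

        void run() {                          // single background thread
            using namespace std::chrono_literals;
            std::unique_lock<std::mutex> lock(m);
            for (;;) {
                cv.wait(lock, [&] { return dirty; });
                // debounce window: wait until edits stop arriving
                while (std::chrono::steady_clock::now() - last_edit < 200ms) {
                    lock.unlock();
                    std::this_thread::sleep_for(50ms);
                    lock.lock();
                }
                dirty = false;
                lock.unlock();
                reparse();                    // never cancelled mid-flight
                lock.lock();
            }
        }
    };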

10 months ago

matklad

This depends on the compilation model in use. If that’s a traditional pipeline of phases, where the result is an AST data structure for the whole CU which gets annotated with types, then, yes, “enqueue new analysis” makes sense.

If the compilation model is lazy & query-based, then there’s just no “enqueue a subsequent parse”, but rather “give me the type of this thing doing as little analysis as possible”. You get to re-use symbols and declarations not because you heuristically adjust offsets and assume they are otherwise valid, but because the underlying analysis reasons, precisely, that they are valid and re-usable.

The two models are quite different at the core, and aren’t really an evolution of one into another.

Which of the two approaches to use, and, consequently, whether to think about cancellation at all, depends heavily on the language in question. If the language allows for lazy analysis, it’s probably a better bet, as that gives you correct results faster.
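A toy illustration of that lazy, query-based shape (hypothetical C++; real engines such as salsa verify whether a query's inputs changed rather than invalidating everything per revision, which is what makes the reuse precise rather than heuristic):

    #include <functional>
    #include <string>
    #include <unordered_map>

    // Toy query engine: nothing is "enqueued"; a single request pulls in
    // only the analysis it actually needs, reusing memoized results.
    struct QueryEngine {
        int revision = 0;                     // bumped on every edit
        struct Memo { int revision; std::string value; };
        std::unordered_map<std::string, Memo> cache;

        std::string memoize(const std::string& key,
                            const std::function<std::string()>& compute) {
            auto it = cache.find(key);
            if (it != cache.end() && it->second.revision == revision)
                return it->second.value;      // known-valid: reuse
            std::string v = compute();        // computed lazily, on demand
            cache[key] = Memo{revision, v};
            return v;
        }

        std::string type_of(const std::string& expr) {
            return memoize("type_of:" + expr, [&] {
                // ...would itself call memoized name-resolution queries...
                return std::string("i32");    // placeholder result
            });
        }
    };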

10 months ago

ninepoints

I only know a little bit about Zig, but I would have assumed that the latter mechanism you are describing isn't really possible. Most languages with "advanced" features and type systems really need a full semantic tree to do anything meaningful, due to the complexities of compile-time code and type instantiation.

Even if we are using this latter model, the user is only editing one file at a time. What if the completion source is in a different file altogether? Generally, you're going to be indexing the entire codebase anyways, so we're splitting hairs over a potential optimization in just one facet of the indexer.

10 months ago

59nadir

> Even if we are using this latter model, the user is only editing one file at a time.

I'm not necessarily arguing against your bigger point but this is very frequently a wrong assumption and strikes me as designing around an idealized view of the problem that you would like to be the case, not actual reality. Code generation scripts, formatters, auto-fixers and the like can modify many files and many parts of those files "at a time", unless you have a pedantic view of what "at a time" means. Almost all LSPs fail in these modes of operation and disallow external tools from participating in the code base, which is very annoying and makes them less useful.

Having to actually (re-)open a file so the LSP can "see" changes made to it by an external program is not something that should ever be needed but happens a lot with some of the most worked-on language servers (TypeScript comes to mind).

10 months ago

lozenge

According to the spec, it's up to your IDE to notify the LSP when files change (even due to external software); then the LSP reads the new versions.

TypeScript also doesn't use an LSP, as it predates the introduction of LSPs.

10 months ago

saghm

What does it mean for a language "not to use an LSP"? The most straightforward way I can think to interpret that statement is that it's saying that a TypeScript language server doesn't exist at all, which certainly doesn't seem to be the case based on finding https://github.com/typescript-language-server as the top result when googling "typescript language server". I guess you might be saying that it's not a first-party language server, but I'm not really sure why that would be relevant when discussing properties of language server implementations; the implementation clearly exists, and even if very few people are using it, that doesn't mean GP's point about a flaw in its design wouldn't still apply.

10 months ago

lozenge

I meant both VS Code and VS use tsserver which doesn't follow the LSP protocol. Yes there is an LSP for other editors.

I don't see the flaw in the design, it's more likely the editor isn't behaving according to spec.

10 months ago

ninepoints

Either way, all those files need to be marked dirty and reindexed, so I suppose I'm just not sure how the topic at hand is relevant. Is the proposal that we attempt to treat each dirtied file as though incremental edits were imminent? Because that is precisely the wrong assumption you're raising. Ultimately, if you need to reopen a file in order to see changes on disk reflected, that's a bug in your LSP server, nothing more.

10 months ago

59nadir

> Either way, all those files need to be marked dirty and reindexed, so I suppose I'm just not sure how the topic at hand is relevant.

You stated an assumption as part of your argument (only one file is being edited at a time) and that assumption is very often false. If that assumption doesn't matter I don't know why you brought it up in the first place. I stated pretty clearly that I wasn't arguing against your larger point but the assumption you explicitly stated was false.

> Ultimately, if you need to reopen a file in order to see changes on disk reflected, that's a bug with your lsp server, nothing more.

You don't say? It's the kind of misbehavior you get when people make assumptions that are false and bake them into the design of things.

10 months ago

ninepoints

And that assumption is relevant in the context of OP's concern but not yours. You created a different, hypothetical and orthogonal situation where the assumption need not apply, and to be honest, I lost the plot a bit. It's possible for there to be a branching set of concerns, each of which has its own set of perfectly valid but otherwise disjoint assumptions.

10 months ago

matklad

> Most languages with "advanced" features and type systems really need a full semantic tree to do anything meaningful due to the complexities of compile time code and type instantiation.

I wouldn’t say this depends on abstract complexity of “type system”. For example, Rust is pretty advanced, but lazy query architecture works well for it. Usually, it’s the name resolution/macro expansion that puts the wrench in the works, not a Turing-complete type system.

That being said, yes, in Zig you probably can’t do lazy query-based IDE. It really is a compile-time smalltalk, and wants to have an image.

10 months ago

ninepoints

I guess I meant complexity in the sense of symbol resolution in this context. So things like scoped type aliases, expressions evaluated as template or trait arguments, that sort of thing.

10 months ago

trashburger

This isn't the case for Zig, because the compiler works off a graph of declarations. You only need to compute the declarations referenced in a code block (which have well-defined sources, since you cannot create new decls at comptime right now) in order to compute the function itself.

>What if the completion source is in a different file altogether?

Well, yes, this is a very common case. I don't believe matklad is arguing that the computation should only be restricted to one file. From what I can gather from the article, these "working" and "ready" copies would be per-file as well and the "ready" states of each file would be invalidated as a referenced file gets edited.

10 months ago

vlovich123

I doubt you need a full semantic tree. Just the parts that are accessible to the local edit scope. Like if I break the code within a function, all other code in the unit should be ok. It gets trickier with broken syntax but there’s a lot of clever recovery techniques I’ve seen.

10 months ago

ninepoints

How do you propose code completions are done without it? To even produce a set of suggestions, you need to understand the parse context, the set of available symbols with appropriate types and scope resolution. AST parsers already have recovery mechanisms, and I am assuming those are all working as intended.

10 months ago

vlovich123

When I’m in function foo I may have a local closure X. When I’m editing bar I don’t need to consider X, right?

I think what you’re trying to say is “how do you pick a subset of a tree to load at any given point” / “it’s simpler to have the entire semantic tree and traverse it than trying to keep a live tree”. If that’s a correct reading, I agree, it’s a difficult problem. I don’t have any specific recommendations other than to note it’s a graph and graph databases don’t need to keep the entire graph in memory to do queries so there must be something similar you could do with code. The other part is that not all parts of the graph are equally relevant so intelligently pruning it should result in better completions. I explored this mildly as an undergrad but the topic never sufficiently interested me to continue pursuing it.

10 months ago

ninepoints

If foo is a function in this example, then yes I agree for most PLs. If foo is a class, namespace, or some other declaration context however, its contents are definitely relevant when editing a different context due to possible usage of scope operators.

10 months ago

levodelellis

I pretty much entirely agree with you. Below is a copy/paste of what I said when I saw this article elsewhere. What language is your LSP for?

> I've written an LSP for my prototype compiler. I don't like any of the options you listed. My LSP didn't do any typechecking, it didn't build an AST, it didn't need immutable data structures, etc.

> Typically when a person is typing into the editor the code is in a broken state (incomplete variable name, missing semicolon, maybe an open but no close parenthesis, etc). What I did was look around what part is being edited and, using the previous 'build' (when a user saves or asks the compiler to build), I would look up vars and type names. There's no need to rebuild everything on every keystroke. Maybe you could do it on a newline if you really wanted to, but mid-sentence sounds like a bad place to try, and you're not really gaining anything from compiling/parsing a single-line change
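That scheme might look roughly like this (a C++ sketch with invented names — Symbol, last_build_symbols and complete_at are illustrative, not the actual prototype):

    #include <string>
    #include <vector>

    // Completion against the previous successful build: nothing is
    // re-parsed on a keystroke; we just do a scope-aware lookup in
    // slightly stale data.
    struct Symbol { std::string name; int scope_begin, scope_end; };

    // Captured at the last successful build (on save / explicit build).
    std::vector<Symbol> last_build_symbols;

    std::vector<std::string> complete_at(int cursor, const std::string& prefix) {
        std::vector<std::string> out;
        for (const Symbol& s : last_build_symbols) {
            bool in_scope = s.scope_begin <= cursor && cursor <= s.scope_end;
            if (in_scope && s.name.compare(0, prefix.size(), prefix) == 0)
                out.push_back(s.name);    // stale but usually still valid
        }
        return out;
    }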

10 months ago

samsquire

I am deeply interested in the multithreading, parallelism, async and coroutine design space and I journal about it every day in my ideas journal. I am also interested in cancellation.

I wrote a toy, very simple M:N lightweight-thread runtime (one scheduler thread, M kernel threads, N lightweight threads) in terrible Rust, C and Java.

Hot loops use a structure for their limit and looping variable. Then, to cancel the loop, you set the looping variable to the limit from a scheduling thread. This is used for process switching and scheduling, but it can also be used for cancellation.

You can create very responsive code this way; it's even possible to cancel while (true) loops by replacing them with while (!preempted) {}.

https://github.com/samsquire/preemptible-thread

There is potential for a race, but that can be detected and worked around.

10 months ago

girvo

Silly question, because I’ve been dealing with exactly this complexity in FreeRTOS and haven’t been able to solve it, but if you’re doing while (!preempted) doesn’t that mean that the preemption can only stop the loop after the current iteration of it finishes? I’ll dig into your code, but I’ve not been able to think of a (user space) way of cancelling a blocked thread (such that the thread is properly cleaned up/scope closes and destructors are called) myself

10 months ago

brabel

As far as I know, it's basically impossible to interrupt a blocking IO call unless you're using something designed for preemptive scheduling, like epoll[1].

A Java Thread can be interrupted, but only if the runnable implementation "cooperates"... if it's blocking on a synchronous socket "read", it's going to stay blocked even after the Thread has been interrupted.

In a language like Dart, where async is mostly the default, it's possible to cancel every job running on a "Zone"[2] because the mechanism used for creating a new Future is interceptable by application code, which is pretty cool.

Kotlin coroutines have a mechanism for cancellation built-in[3] and it's also "cooperative", i.e. all suspend functions will cancel within the coroutine context, but a rogue coroutine that blocks on a blocking IO call will remain blocked (as I understand it - not sure if Kotlin provides ways to completely avoid blocking IO).

So I guess whatever language you use, there have to be points of "yielding" control where cancellation can occur... the `while (!preempted)` check is just one of them.

[1] https://man7.org/linux/man-pages/man7/epoll.7.html

[2] https://dart.dev/articles/archive/zones

[3] https://kotlinlang.org/docs/cancellation-and-timeouts.html#c...
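C++20's std::stop_token makes those cooperative yield points explicit — a minimal sketch (assumes C++20; a thread blocked in synchronous IO would still not notice the request):

    #include <chrono>
    #include <stop_token>
    #include <thread>

    int main() {
        // Cooperative cancellation: the worker polls its stop_token at
        // each "yield point".
        std::jthread worker([](std::stop_token st) {
            while (!st.stop_requested()) {   // the yield point
                // do one bounded unit of work here
                std::this_thread::sleep_for(std::chrono::milliseconds(10));
            }
        });
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        worker.request_stop();               // request, not force
    }                                        // jthread joins in its destructor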

10 months ago

Matthias247

You can interrupt blocking IO calls using signals, so there is an alternative to using epoll/select for this. It can also be used to implement timeouts if another userspace thread tracks the IO operation duration and sends a signal to the IO blocked thread if the max duration has expired.

I think Java thread interruption for blocking IO might be based on signals and is not using nonblocking IO under the hood, but I would need to read the JDK source again to tell for sure.

10 months ago

girvo

Yeah signals is what I’ve seen too, though I’m unsure if newlib + FreeRTOS has enough to make ‘em work! I’ll have a look though.

10 months ago

samsquire

Yes that's right.

I think, as the other commenter said, you would need to use signals to interrupt between iterations of a loop.

I try not to interrupt a hot for loop or while (true) loop, to avoid the performance impact of an if statement in the loop.

  register_loop(loops, 0, 1000000);
  for (; loops[0].loop_variable < loops[0].limit; loops[0].loop_variable++)
  {
    // Hot loop here
  }

It's difficult for another thread to know where in the loop the looping thread currently is.

Then from another thread

  loops[0].loop_variable = loops[0].limit;

10 months ago

progbits

If the loop uses loop_variable within its body this can cause a data race. Imagine:

    for (...)
    {
      int x = array[loops[0].loop_variable];
      array[loops[0].loop_variable] = calculation(x);
    }
If the loop_variable = limit from another thread happens after the array read but before the write, you get a corrupted answer, not just an incomplete one.

How do you work around that? Storing a copy and only using that? Or does it not matter in your use cases?

10 months ago

samsquire

I never used the loop variable inside a loop in my testing but thank you for detecting this!

I would change it to this; it shouldn't slow down the loop either.

  for (int i = 0;
       loops[0].loop_variable < loops[0].limit;
       i = ++loops[0].loop_variable)   // pre-increment so i stays in step
  {
    // Hot loop here, indexing through i only
  }
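An alternative that addresses the data race itself (rather than just the indexing) is to make the shared counter atomic and have the body use only a local snapshot; a C++ sketch with invented names (Loop, hot_loop, cancel):

    #include <atomic>

    struct Loop { std::atomic<long> loop_variable; long limit; };

    // The body only indexes through the local snapshot i, so a concurrent
    // cancel() can end the loop early (one extra iteration at worst) but
    // can never tear a read-modify-write in the body.
    void hot_loop(Loop& l) {
        for (long i = l.loop_variable.load();
             i < l.limit;
             i = l.loop_variable.fetch_add(1) + 1) {
            // Hot loop here, using i only
        }
    }

    void cancel(Loop& l) { l.loop_variable.store(l.limit); }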
10 months ago

tlarkworthy

I would probably go for immutability, which unlocks memoization, which unlocks reuse of parsing computation. Then do everything strictly (no cancellation) so the whole thing is super fast for the common case. I do wonder, though, if garbage collection is too hard to solve for memory-managed languages. Reference counting?

10 months ago

riwsky

> We divide the available memory in two equal parts, use one half as a working copy which accumulates useful objects and garbage, and then at some point switch the halves, copying the live objects (but not the garbage) over

So, generational garbage collection?

10 months ago

spullara

Nah, just a copying gc:

http://www.cs.cornell.edu/courses/cs312/2003fa/lectures/sec2...

Generational GC has multiple heaps, with objects promoted between them as they survive longer.

10 months ago

dontlaugh

More specifically, semi-space.

10 months ago

esjeon

There's no generation here. Just live or dead objects. It also sounds like they don't scan the pool, and instead just deep-copy the top-level state object.

10 months ago

tankenmate

Think of it more like a blue / green object store; but each switch only copies over valid objects (for some definition of valid; in use, type checked, etc).

10 months ago

Szpadel

> We can think about applying something like that for cancellation — without going for full immutability, we can let cancelled analysis to work with the old half-state, while we switch to the new one

If we want to cancel analysis based on the old state, why do we need garbage collection and two states?

And how is the garbage-collection way different from full immutability? In my understanding it's the same thing with extra steps: you have to copy an element instead of updating it in place, because the existing copy might be used by an already-running analysis. Reference counting should give you similar benefits with less work; usually everything has only a single reference (no analysis is using anything), and we could optimize to modify in place in that case; otherwise you just need to copy once, without requiring garbage collection later.

Am I missing something?
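The mutate-in-place-when-uniquely-owned optimization described above can be sketched with shared_ptr (a C++ sketch; it assumes all mutation happens on one thread, since use_count is only trustworthy under that assumption):

    #include <memory>

    struct State { /* symbols, types, ... */ };

    // Copy-on-write via refcounting: clone only if some still-running
    // analysis holds a reference; otherwise update in place.
    void mutate(std::shared_ptr<State>& s, void (*edit)(State&)) {
        if (s.use_count() > 1)
            s = std::make_shared<State>(*s);  // old readers keep their copy
        edit(*s);                             // now uniquely owned: in place
    }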

10 months ago

avgcorrection

Should a language be designed with these things in mind? Or is it completely orthogonal?

10 months ago

matklad

Descriptively, yes, the shape of the language determines how you can structure an interactive compiler. There are at least three wildly different ways to do that:

https://rust-analyzer.github.io/blog/2020/07/20/three-archit...

Which one can work depends on the language in question. Java, Rust, C++ lead to different answers, and that’s not because of the “interesting” differences like borrow checker vs GC, but rather due to the “boring” differences in name resolution and module system.

Prescriptively, whether we _should_ do this (or, rather, how much), is unclear.

My biased answer is that, while language designers talk a lot about making languages tooling friendly, there’s little of that actually happening (at least with “current” languages, “next” ones seem to fare better). Like, it was said that “rust macros are designed with tooling friendliness in mind”, but overall the language is pretty tooling-hostile, mostly for accidental reasons. It seems to me that if we _actually_ co-design a tooling-first language, without trying to innovate too much, but by just ensuring that existing techniques work robustly, we might arrive at a close, but meaningfully different point in the design space.

Specifically, here’s my version of IDE-friendliness diff for language design:

Push conditional compilation far further in the pipeline, such that it is done after all semantic analysis. This is a pre-requisite for making automated refactors which are _guaranteed_ to work.

Similarly, push meta programming further down, such that code analysis doesn’t invoke user-defined code (which might be arbitrarily slow), but, at the same time, meta parts can fully reflect on existing code, including resolved types. C#-style source generators are an interesting design here.

Have strong signatures on micro and macro level. Macro, have well-defined compilation units with explicitly specified dag of dependencies and signature files. Micro, annotate types of the functions. Use tooling to reduce double-annotation burden.

Ensure that each source file can be somewhat deeply analyzed in complete isolation. The last two points should unlock both embarrassingly parallel (distributed) compilation, and snappy completion.

10 months ago

the_mitsuhiko

> there’s little of that actually happening (at least with “current” languages, “next” ones seem to fare better).

I'm not too involved but I think the languages that actually develop the tooling as part of the language design have traditionally done quite well at it in language design. Delphi/C#/TypeScript all had tooling built alongside the language and it shows.

Rust sadly made a ton of choices in the language that just make a lot of things hard, but enabled the type of ecosystem that exists today. So it has different tradeoffs. I'm quite amazed how well RLS is doing considering how hostile the language is to fast iteration.

10 months ago

matklad

I am not entirely sure about C#, if we speak about “static analysis based IDE” tooling. I think this only became a focus with Roslyn, and Roslyn came later, after JetBrains did their own thing with ReSharper.

But yeah, directionally we are getting there: Microsoft seems to be the main popularizer of the idea (TypeScript and LSP), and Google seems to be most acutely aware about the “tooling gap” (Go, Dart, and, most recently, Carbon are cognizant of tooling needs. Carbon has signatures for tooling, and Dart shipped a better version of LSP before LSP became a thing).

10 months ago

the_mitsuhiko

I just remember years ago reading interviews with Hejlsberg about how he designed C# (particularly LINQ) with IDEs in mind. Given that he also worked on Delphi I think it’s a core pillar.

10 months ago

hawk_

I think that was meant in a very narrow sense, comparing to SQL. In SQL the columns being selected go before the table name, so the IDE can't suggest a set of valid columns. In LINQ the order is flipped. But other than that there's no grand language design there to support tooling.

10 months ago

the_mitsuhiko

You don’t have to have a grand design, you just need to get the important pieces right. C# has an excellent IDE experience not because LINQ is in the right order but because it has a fundamental compilation model and object system that supports tooling well. They could have gone nuts with compile time logic but they did not to support a better IDE experience.

10 months ago

valenterry

Look at Kotlin to see the opposite. It was designed to be IDE friendly - in fact, it was developed by the IntelliJ folks.

However, looking at it now, it seems that Kotlin is pretty much falling behind even Java in terms of features and functionality. In particular, it also seems behind in terms of well-thought-out design.

When looking at Rust, I think that while there are still a lot of features I would desire, it at least introduced major improvements like typeclasses (or "traits" in Rust lingo) which really help the ecosystem and libraries to flourish. If this makes it harder for the compiler and the IDE, then I think it's a fair tradeoff. Kotlin had the chance to do the same; they deliberately didn't, and it is already starting to hinder the language from thriving.

10 months ago

4RealFreedom

Curious as to what parts of Kotlin you think are falling behind Java. Java has been playing catch-up forever. Java just implemented lightweight threads - coroutines in Kotlin, which I've been using for years. I've used data classes forever and, again, Java just recently implemented their equivalent - records. Sealed classes are the only feature I can think of, but not a must-have for me. What am I missing? For some background, I've been a Java developer for over 10 years. I work with Kotlin and Java daily - my company has a backend written in Kotlin and many ancillary services in Java. I prefer Kotlin - it's a joy to write in. JetBrains hit a sweet spot by being able to mix functional and imperative coding.

10 months ago

valenterry

A good example is the latest pattern-matching additions to Java, which are lacking in Kotlin in comparison. Sealed classes (aka coproducts/sum types) are another, and they are an essential part of any programming language, on the same level as enums. I think you are experiencing the blub paradox in this case.

Yes, Java has been (and is still) playing catch-up, but there has been some serious progress recently and when features come, they tend to be better than the Kotlin counterparts from what I can tell.

That being said, I'm not claiming that Java is better than Kotlin right now. But when projecting into the future, I doubt that Kotlin has one, sorry to say that. A couple of years ago I would have said the exact opposite, but Java has really stepped up since then.

10 months ago

4RealFreedom

Wait, if sealed classes are essential, how did java survive so long without them? Possibly a nice-to-have but essential? To say Kotlin doesn't have a future when it is one of the widest used languages is a stretch. I think it's great that Java has been stepping up their feature development. I will continue to use both.

10 months ago

valenterry

Well, Java "survived" without it because, at the time, most other languages didn't have sum types either. But almost every new/modern language does. Exceptions are rare (golang is one of them, and even for golang there are discussions: https://github.com/golang/go/issues/57644)

> To say Kotlin doesn't have a future when it is one of the widest used languages is a stretch

Well, let me say it like that: if Java and Kotlin continue to progress with the same speed like they do now, then Kotlin will not be a "better Java" anymore and the only selling-point will be IDE support (if that's even possible).

10 months ago

4RealFreedom

Kotlin is a 'better Java' because of the ability to mix functional and imperative code. Java will never do that. I don't know what you mean about 'IDE support'. Neither of them are going anywhere.

10 months ago

valenterry

Kotlin is made by Jetbrains which make IntelliJ - so IDE support is a plus-point for Kotlin. That is what I meant.

I think Java has become way more functional over time. It is not and probably never will properly support a pure functional style (e.g. like Scala) but I don't see it much behind compared to Kotlin. Mind to give some examples where Java is behind Kotlin in this context?

10 months ago

rhdunn

Kotlin is in a tricky place w.r.t. Java because the JVM is one of Kotlin's target platforms. If Kotlin deviated from the JVM too far, it would be harder to migrate to the newer JVM versions.

That specifically applies to things like pattern matching and value classes. Value classes are constrained to single values to allow the Kotlin compiler to do the heavy lifting without involving specific JVM capabilities.

It will be interesting to see how Kotlin evolves with version 2.0 and beyond, with the new compiler and stable multiplatform support.

10 months ago

valenterry

I don't think so. Look at Scala, which is in the same position as Kotlin. Still, it has a far superior featureset both compared to Java and Kotlin and has had pattern matching since almost forever (in fact, Java's pattern matching, sealed classes etc. are heavily inspired by Scala's). And it manages quite fine to do all these things, even before the JVM improved to support those features better in a native way.

10 months ago

avgcorrection

Thanks for that.

It seems that some newer languages (like in the last ten years) either care about (1) batch compilation speed or (2) language design mostly unconstrained from having to have fast batch compilation. It would be interesting to see a third way: language design that takes into account all kinds of future interactive tooling by laying a good groundwork. So maybe from-scratch batch compilation ends up slow, but then it turns out to be fine in practice since you can distribute builds and cache things so that you only pay at most a one-time large cost. And then what you mostly have to deal with is incremental stuff.

10 months ago

firstlink

Interactive proof languages already work that way, to a greater or lesser extent.

But, for general-purpose languages, this revictimizes the docker addicts.

10 months ago

lerno

It’s hard to push conditional compilation down too far without compromising its value, but one can still stop at a much later point than Zig does. (In fact, Zig’s fundamental choices in regard to compile time necessarily make it hard to create good IDEs for it.)

In my C-like (C3), I retained an #if-like construct, which while semantic (unlike C’s) still meant dealing with sections of conditional code that had to be resolved before semantic checking could commence. To simplify I removed the ability to conditionally include struct members etc.

Doing more research I found D’s `version` interesting. In particular the non-block variant where the version is attached to a declaration.

I ended up creating a more flexible (and therefore actually less good) version of it as an attribute (`@if(cond)`) which then may be attached to any declaration. The advantage here is that at most this leads to some symbols having two (or more) different declarations (resolvable using the `@if`), which allows a lot of analysis even if the attribute hasn’t had its argument resolved.

I think this might be the way forward at the top level.

Inside of functions / macros, $if and friends are fine as they often do not matter for the overall analysis.

10 months ago

ComputerGuru

Alex, I've been reading all your posts on Zig and Rust LSPs with great fervor and appreciate everything you've done for the community with rust-analyzer. Can I ask you about clangd, the llvm/C++ lsp? It seems to work tremendously well, far better than I naively ever thought it would even on massive codebases. I was using it from its first release and the quality ramped up rather quickly. Can I bug you to write up about it from your perspective and with your insights sometime and where we could benefit from their design choices and where we can't? And of course, the limitations of their approach?

It's ostensibly driven by the `compile_commands.json` produced by the compiler during a run (mapping each source file to the output .o and the command (w/ all options) that was used to generate it) but in reality that's just because C++ doesn't have its own build system so the tooling has no other way of knowing how a project is wired up. But aside from that, it's extremely fast and has fairly low latency for a language that's got a lot of the same warts that rust has (generics, monomorphization, macros, strong types, (limited) type inference, etc) especially when compared to all the other LSP offerings (although I think C++ compilation units are smaller than rust's, which has to help).

10 months ago

matklad

I covered this a bit here:

https://rust-analyzer.github.io/blog/2020/07/20/three-archit...

I don't actually _know_ how exactly clangd works, especially post modules, but pre-modules C++ has a compilation model which is actually quite friendly for an IDE, because of the header files.

Headers essentially explicitly encode body/interface separation that more advanced incremental compilation systems like Salsa try to recover implicitly.

Looking at a slightly simplified typical C++

    #include <iostream>

    int main() {
        std::cout << "Hello, World!" << std::
    }
what a language server could do is run a bog-standard phased compiler, and just _freeze_ its state after all includes are parsed & expanded. After that, any typing inside the file needs to re-analyze just the file itself. Also, the code in included `.h` files is typically far smaller than the code in the corresponding `.cpp` files. So a language server gets the ability to analyse only the relevant code, and skip the rest, for free; it is encoded in the compilation model.
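Schematically, the freeze might look like this (a toy sketch of the idea, not a claim about clangd's actual data structures; a real implementation would key on a hash rather than the raw preamble text):

    #include <map>
    #include <string>

    struct FrozenState { /* parsed & expanded headers, symbol tables, ... */ };

    // Cache of compiler state frozen after the #include preamble.
    std::map<std::string, FrozenState> preamble_cache;

    FrozenState& analyze_preamble(const std::string& preamble_text) {
        auto it = preamble_cache.find(preamble_text);
        if (it == preamble_cache.end()) {
            FrozenState state;  // run the phased compiler up to the end of
                                // the includes, then freeze the result
            it = preamble_cache.emplace(preamble_text, state).first;
        }
        return it->second;
    }

    void analyze_file(const std::string& preamble, const std::string& body) {
        FrozenState& frozen = analyze_preamble(preamble);  // usually a hit
        // ...re-parse and re-check only `body` against `frozen`...
        (void)frozen; (void)body;
    }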

Compare this with Rust, where, in the general case, changing a file can affect any other file in the CU in arbitrary ways (_usually_ it doesn't, but it _could_, and the complexity lies precisely in figuring out which edits are local and which are not). And CUs themselves are meatier, as macro expansion can run arbitrary code, so you need to somehow cache that, taking into account that macro expansion _anywhere_ in the CU can potentially affect _anything_ in the CU. So you need full Build Systems à la Carte-style incremental compilation, which is a) hard, b) different from how the command-line compiler works.

The second point explains the success of clangd I think: _usually_, when you have a command-line compiler and want a language server, you want to rewrite, because the architectures are too different. But because of the way C++ (and OCaml) work, for those two languages you actually can soundly re-use existing code with relatively minor modifications.

10 months ago

littlestymaar

> Push conditional compilation far further in the pipeline, such that it is done after all semantic analysis

So much this. I've no idea how this would work in practice (like how you're supposed to accommodate antagonistic semantics that depend on the conditional compilation), but I hate #[feature(X)] with a passion because of how brittle it is.

10 months ago

eptcyka

Designing syntax and semantics that are easier to parse can also make the language easier for humans to understand, as long as optimizing for these goals doesn't impinge on useful language features.

10 months ago

levodelellis

I wrote my (prototype) LSP in a completely different way from the article and got very good results (autocomplete took less than 10ms; IIRC it was <5). I didn't intentionally design my language for it, but I did have to write my compiler in a way that keeps the data around so I can support it.

10 months ago

KRAKRISMOTT

Can this be done on the runtime level? Zig, if I recall, supports multi-colored functions. Would it be possible to terminate an awaitable early?

10 months ago

valenterry

Multi-colored? Google doesn't spit out anything meaningful and ChatGPT claims that Zig didn't exist in Sept 2021...

10 months ago

59nadir

I think in popular parlance the terminology comes from the "What color is your function?" article[0]. The idea is that "normal" and async functions are different colors and they don't mix. When you call an async function from a normal one you make the normal one async, i.e. if it had no color it now takes on the color of the async one, etc.

Zig has a system where calling an async function is actually doable while still leaving the calling function non-async.

0 - https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...

10 months ago

valenterry

Oh yeah, I know about that. But that is commonly called "colored functions" not "multi colored functions" no? So I thought there was a difference.

10 months ago

wtetzner

Right, colored functions are e.g. async or not async. In Zig, you can write a single function that works as both, which I assume is what multicolored means here.

10 months ago

valenterry

That doesn't seem to make any sense to me - it should then be called uncolored functions, no? Because they all look the same.

But maybe that's what they meant to say.

10 months ago

kps

They're generic over ‘color’ at compile time; the compiler generates code for normal and/or async contexts as necessary.

10 months ago

valenterry

Right - so to the developer they all look the same, which means they are uncolored. If they are colored it means they look different (e.g. async vs sync).

10 months ago

littlestymaar

Zig aims to solve this “problem” by having functions that can be either async or not.

10 months ago

a1369209993

They're (presumably) referring to "colorblind" functions. (Note that "blind" is apt here: the color isn't visible, but functions still have color and will break in subtle ways if you try to use them as if they don't.[0])

Zig's approach is probably an improvement over bare async/await, but pre-JavaScript languages already had more of an improvement in the form of not having async/await. Implementing async/await at all makes the language worse unless there's a (preferably provably) universally-applicable way for functions to be (explicitly) generic about whether they're async or not.

0: https://gavinhoward.com/2022/04/i-believe-zig-has-function-c...

10 months ago

hansvm

ChatGPT isn't great for spitting out accurate facts, especially if you just ask once, especially if you don't add additional details and otherwise engineer the prompt, especially if you're asking questions involving digits and numerals, and especially if small details matter. Zig was definitely far enough along to have multi-colored functions over a year prior to that.

10 months ago