Lf-lean: The frontier of verified software engineering
Comments
synthesis
nextos
It's interesting that Lean is taking off in software engineering. Before the advent of LLMs and agents, Lean had almost zero use in software verification and was mostly focused on mathematics; Isabelle and Rocq led the way on the software side. In fact, when I asked Kevin Buzzard and others in the Lean community about it, they simply shrugged.
The exception was [1], a Lean-based text heavily inspired by Concrete Semantics [2], a cornerstone of Isabelle literature. The latter is, in essence, Winskel's classic semantics book [3], a standard textbook in programming language theory, with all proofs mechanically checked.
More broadly, I'm wondering whether dependent types are the right abstraction, or whether they are too powerful and heavyweight for humans to review when checking that specifications match intent. I've been working on automation for this for more than a year, and I've found refinement types sufficient and much easier to review.
[1] https://github.com/lean-forward/logical_verification_2025
[2] http://concrete-semantics.org
[3] https://direct.mit.edu/books/monograph/4338/The-Formal-Seman...
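To illustrate the contrast the comment above draws (my own sketch, not from any of the projects discussed, using a Lean subtype as a stand-in for a refinement type):

```lean
-- Dependently typed spec: the precondition is a separate proof
-- argument whose type mentions the input value.
def head' : (xs : List Nat) → xs ≠ [] → Nat
  | x :: _, _ => x
  | [],     h => absurd rfl h

-- Refinement-style spec: the proof obligation is packaged into the
-- argument's type, which is arguably easier to read as "a non-empty
-- list of naturals" when reviewing intent.
def headR (xs : {l : List Nat // l ≠ []}) : Nat :=
  match xs with
  | ⟨x :: _, _⟩ => x
  | ⟨[],     h⟩ => absurd rfl h
```

Both encode the same contract; the question in the comment is which form reviewers can more reliably check against intent at scale.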
ngruhn
Is this impressive? They just ported a bunch of theorems/proofs already written in Rocq into Lean. Also, Logical Foundations is just a Rocq tutorial covering the basics. It's an absolutely amazing tutorial, probably the best resource out there, but I'm not surprised AI can port it.
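For context on "the basics": the exercises in question are largely foundational results about naturals, lists, and simple logic. A representative example, stated in Lean 4 (my illustration, not taken from the ported repository):

```lean
-- Commutativity of addition, the classic early exercise in
-- Logical Foundations, proved by induction on the first argument.
theorem add_comm' (n m : Nat) : n + m = m + n := by
  induction n with
  | zero => simp
  | succ k ih => simp [Nat.succ_add, Nat.add_succ, ih]
```

Porting such a proof from Rocq mostly means mapping tactics and library lemma names, which is why translation success here says less about open-ended verification.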
benlivengood
Impressive if for no other reason than that there are several disparate formally verified projects (seL4, CompCert, CertiKOS) that could potentially be unified under a single proof system. Additionally, it may be possible to quickly extend existing proofs (e.g. seL4's) to other architectures.
ngruhn
Not saying this is useless. But the article reads like they made some kind of breakthrough in automatic software verification, when it sounds more like they ported a tutorial test suite from Go to Rust with AI and the tests still pass.
corysama
I believe what they are bragging about is not the translated proofs, but the process of doing the translation.
> produced by frontier AI with ~2 person-days of human effort versus an estimated ~2.75 person-years manually (a 350x speed-up). We achieve this through task-level specification generators...
akkad33
This website is asking me for permissions on my phone. Why?
Cyphase
I'm not seeing that. Which permissions?
banajama
Not OP, but it asked me to "use other apps and services in this device" on Android. Crime.
Garlef
Fullstack lean when?
This is cool! I've had a similar experience: translations are rather successful. Six months ago I used AI tools to translate my hand proofs, and they managed to generate a partial Coq proof. I've spent the last two weeks on proofs, and now Claude 4.6 can generate TLA+ proofs (in TLAPS) with a high success rate.
I wonder what Theorem.dev's business is about, as they write "AI that is as capable at program verification as it is at writing Python". To me, AI models already seem capable of writing proofs. In my experience, what's missing is a domain-specific dataset of related proofs (with the right licensing).