Automatic Textbook Formalization

44 points
2 days ago
by tzury

Comments


smj-edison

It's interesting to see a lot of formalized proofs gathering around Lean (specifically Mathlib). I'm in a formal math class right now, so it's a bit weird learning ZFC while messing around with Lean, which is based on dependent types.

There are a lot of other proof assistants out there, like Mizar, Isabelle, Metamath, Metamath0, and Coq, each based on different foundations. Metamath0 in particular looks really intriguing, since it's the brainchild of Mario Carneiro, who also started Mathlib (and I believe is still an active contributor).

Anyways, Lean definitely has a lot going for it, but I love to tinker, and it's interesting to see other proof assistants' takes on things.

2 days ago

statusfailed

I am a huge fan of mm0. The thesis[0] is so brilliantly written, and MMC is such a great step towards the kind of verified computing I really want to be doing.

[0]: https://digama0.github.io/mm0/thesis.pdf

2 days ago

smj-edison

Huh, I hadn't paid that much attention to MMC, but this looks fascinating! I don't like how fragile SMT solvers are, so being able to use tactics and proof techniques would be really powerful for formal verification. Also, being able to prove properties about pointers would let me check some of my work with non-tree data structures.

2 hours ago

rdevilla

Metamath is fascinating to me in that it is the most "math-like," in terms of being both readable and executable on pen and paper through simple substitution. I've spent a month or two formalizing basic results in it and found it quite fun; unfortunately, the proof assistant and surrounding tooling are archaic, to put it generously. However, the fact that the system still works, and that the proof tree is grounded in results from 1994 that stand to this day without modification, is a testament to its design.

Most people seem to be rallying around Lean these days, which is powerful and quite featureful, but with tactics and metaprogramming it feels more like writing C++ templates than the "assembly language of proof" I liken Metamath to, for its down-to-the-metal atomization of proofs into very explicit steps. Different (levels of) abstractions for different folks.
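To make the contrast concrete, here is a hypothetical Lean 4 sketch (my own illustration, not from the thread; `Nat.add_comm` and `omega` are from Lean's core library, but the theorem names are made up): the same statement closed once by an opaque tactic and once by an explicit term, which is closer in spirit to Metamath's step-by-step substitution.

```lean
-- Tactic style: `omega` (a linear-arithmetic decision procedure
-- shipped with Lean 4) closes the goal in one opaque step.
theorem comm_tactic (a b : Nat) : a + b = b + a := by
  omega

-- Term style: a single explicit appeal to a library lemma,
-- every step visible in the term itself.
theorem comm_term (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```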

Once I return to a proper desktop I will probably woodshed myself into Lean for a week or two to get a better handle on it, but for now tactics feel like utter magic when not just chaining `calc` everywhere.
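For context, a minimal `calc` chain in Lean 4 looks like this (a sketch I wrote for illustration using core `Nat` lemmas; it is not taken from any source in the thread):

```lean
-- Each `calc` step states an intermediate equation and justifies it
-- with an explicit rewrite, so the whole derivation stays visible.
example (n : Nat) : (n + 1) * 2 = 2 * n + 2 := by
  calc (n + 1) * 2 = n * 2 + 1 * 2 := by rw [Nat.add_mul]
    _ = 2 * n + 2 := by rw [Nat.mul_comm, Nat.one_mul]
```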

2 days ago

smj-edison

I feel the same. When I first heard about metamath I was blown away at how I could drill down to the base axioms (I had only tried Lean before). Lean also feels too magical for my taste, and I dislike that I don't have a good mental model of its execution under the hood. I care a lot about execution speed as well, and Lean... isn't always fast. It's another reason metamath's design really speaks to me.

You might find metamath0 interesting, its kernel design has a similar focus on simplicity while cleaning up a lot of metamath's cruft: https://github.com/digama0/mm0

EDIT: and feel free to ask any questions about mm0, I don't know a ton about it yet but I have researched it a good deal. I'm hoping to use it more this fall when I take a class on first order logic and set theory!

2 days ago

mkl

That just shows the first post for most people. Here is all of it: https://xcancel.com/FabianGloeckle/status/204008278585190440...

2 days ago

auggierose

Which LLMs do these agents use?

2 days ago

auggierose

Claude Opus 4.5

2 days ago

measurablefunc

This is a lot of useful data for the next iteration of Claude because not only does Anthropic have the final artifacts but they also saw the entire workflow from start to finish & Facebook paid them for the privilege of giving them all of that training data.

2 days ago

0x3f

Only if you assume they don't honor their enterprise agreements.

2 days ago

measurablefunc

I assume all chat logs are used for training in one way or another because it would be foolish to not do that.

2 days ago

0x3f

More training data at this point yields marginal improvements; the curve is flattening. So the advantage is low, especially when Anthropic definitely has the budget and talent to carry out the same study.

On the other hand, having it leak that you train on your customers' data, ignoring the opt-out, is probably existential when close alternatives exist in the market.

2 days ago

measurablefunc

You probably also thought Anthropic did not use pirated PDFs. You don't know how these companies actually operate & you don't know what weasel language they use in their contracts to get away w/ exactly what I assume to be the case.

There is no AI; all these companies have is the chat logs. So unless you have further evidence on what they do or don't do behind the scenes, I recommend you take a more conservative approach in your assumptions about what they use or don't use for training.

a day ago

0x3f

No, why would they care about using pirated PDFs? Did you actually read/understand what I wrote? Violating their customers comes with risk for them. Violating the copyright of unrelated textbook authors does not. If that's even what they did.

a day ago

measurablefunc

They are currently paying book authors over a billion dollars in damages. You're out of your depth in this discussion so further engagement is not going to be fruitful for anyone involved. Good luck.

a day ago

0x3f

Oh no, not 0.2% of their valuation! The end is near for Anthropic. Humanity is saved. By the copyright lobby, of all people.

a day ago

alex_be

Big step toward AI-assisted mathematical research

2 days ago