"What I'd like to know is how often they find success with this model" - r/math commenter Redrot
You know that feeling when an AI gives you a proof that sounds right, but you still need an expert to find the hidden mistake? That is the bottleneck AlphaProof Nexus targets. Instead of trusting prose, it forces every proof attempt through Lean, where invalid logic fails to compile. The catch is that you now need formalized problems, repeated search, and a Lean library that already covers the needed math.
Think of it like a strict spell-checker for math, except the checker rejects bad logic instead of typos. You start with a Lean theorem file where the proof is blanked out with `sorry`. Gemini 3.1 Pro-based agents edit only allowed parts of the file, Lean checks each attempt, and failed attempts feed the next try. In the full setup, rater agents compare partial proof sketches, assign Elo-style scores, and the next round starts from better failed sketches instead of an empty page.
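The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the AlphaProof Nexus implementation (the agent code is not released). The names `LEAN_TEMPLATE`, `Sketch`, `fill_proof`, `elo_update`, and `search` are hypothetical, and `propose`, `check`, and `prefer` are placeholders for the model call, the Lean compile step, and the rater agent. The Elo formula is the standard chess-style update, chosen here only because the text says "Elo-style".

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Optional

# Hypothetical Lean file with the proof blanked out by `sorry`.
LEAN_TEMPLATE = "theorem demo (n : Nat) : n + 0 = n := by\n  sorry\n"


@dataclass
class Sketch:
    """A proof attempt plus its Elo-style rating (assumed start: 1000)."""
    proof: str
    rating: float = 1000.0


def fill_proof(template: str, proof: str) -> str:
    """Edit only the allowed part: swap the first `sorry` for the attempt."""
    return template.replace("sorry", proof, 1)


def elo_update(winner: Sketch, loser: Sketch, k: float = 32.0) -> None:
    """Standard Elo update after the rater prefers `winner` over `loser`."""
    expected_win = 1.0 / (1.0 + 10 ** ((loser.rating - winner.rating) / 400.0))
    delta = k * (1.0 - expected_win)
    winner.rating += delta
    loser.rating -= delta


def search(
    template: str,
    propose: Callable[[str], str],       # stand-in for the model: seed -> new attempt
    check: Callable[[str], bool],        # stand-in for Lean: does the file compile?
    prefer: Callable[[str, str], bool],  # stand-in for a rater: is the first better?
    rounds: int = 5,
    width: int = 3,
) -> Optional[str]:
    """Generate, check, and retry, seeding each round from the top-rated failure."""
    pool: list[Sketch] = []
    seed = "sorry"
    for _ in range(rounds):
        for _ in range(width):
            candidate = Sketch(propose(seed))
            if check(fill_proof(template, candidate.proof)):
                return candidate.proof  # the checker accepted: stop searching
            pool.append(candidate)
        # Pairwise comparisons among failed sketches adjust their ratings.
        for a, b in combinations(pool, 2):
            if prefer(a.proof, b.proof):
                elo_update(a, b)
            else:
                elo_update(b, a)
        # The next round starts from the best failed sketch, not an empty page.
        seed = max(pool, key=lambda s: s.rating).proof
    return None
```

The design point is that `check` is the only step allowed to declare success. The ratings only decide which failed sketch seeds the next round, so a persuasive but wrong attempt can rank highly without ever being accepted.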
If you work on theorem proving, formal methods, AI math, or high-assurance code, this is worth reading because it shows a concrete generate-check-retry loop with numbers. If you only need a chatbot that helps with everyday algebra, this is not your tool. You should also skip it if you need the AlphaProof Nexus search agent itself, because the repo releases proof outputs, not the full agent.
Explore it as an experimental research artifact and a design pattern for verifier-backed AI loops, not as production-ready theorem-proving infrastructure. The notes list no released agent code, no releases, and 1 open issue, and the paper says most Erdős problems remain out of reach. The repo is still useful because you can verify the released Lean proofs yourself with Lake.
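Checking the released proofs with Lake could look like the sketch below. It assumes a local clone of the repo and a Lean 4 toolchain with `lake` on the PATH. The function names and the `repo_dir` argument are hypothetical, and the repo's own instructions take precedence if they differ. The extra scan for Lean's "declaration uses 'sorry'" warning is a heuristic added here, since a build can succeed while a proof still contains `sorry`.

```python
import shutil
import subprocess

# Warning text Lean 4 emits when a declaration still depends on `sorry`.
SORRY_WARNING = "declaration uses 'sorry'"


def looks_verified(returncode: int, output: str) -> bool:
    """Treat a build as verified only if it succeeded with no `sorry` warning."""
    return returncode == 0 and SORRY_WARNING not in output


def verify_with_lake(repo_dir: str) -> bool:
    """Run `lake build` inside a local clone (path is an assumption)."""
    if shutil.which("lake") is None:
        raise RuntimeError("lake not found on PATH; install a Lean 4 toolchain first")
    result = subprocess.run(
        ["lake", "build"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    return looks_verified(result.returncode, result.stdout + result.stderr)
```

Calling `verify_with_lake("path/to/clone")` returns `True` only when the build exits cleanly and no `sorry` warning appears in the captured output.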