Skip to content

tsgates/1proof

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 

Repository files navigation

1proof Attempts

Three approaches to solving the 1st Proof benchmark (10 research-level math problems, see 1proof/First_Proof.tex).

Approaches

  • iter (1proof-iter/): Direct ask-and-revise loop with Claude Code. Each problem is solved in a single conversation — Claude drafts a proof, receives feedback, and iterates until a proofread LaTeX PDF is produced. All 10 problems produced complete writeups.

  • meta (1proof-meta/): Multi-agent debate framework. Multiple AI agents (researcher, debater, critic, verifier) discuss each problem through structured rounds, then collaboratively write and proofread the proof. Most runs were cut short by API rate limits; only Problem 10 completed the full pipeline (5 debate rounds + 5 proofreading rounds).

  • lean (1proof-lean/): Lean 4 formalization with Mathlib. Each problem is first researched, then a formal Lean proof is attempted alongside a LaTeX writeup. Most problems lack sufficient Mathlib coverage (representation theory, symplectic topology, etc.); only Problems 1 and 10 produced correct, compiling Lean proofs.

Results

# Topic iter meta lean
1 Phi-4-3 measure shift No (mutually singular) proof No (mutually singular) lean pdf
2 Rankin-Selberg Whittaker Yes (W exists) proof
3 ASEP Markov chain Yes (heat-bath chain) proof
4 Finite free Stam inequality Yes (inequality holds) proof
5 O-slice filtration Yes (characterization) proof
6 Epsilon-light subsets Yes (c exists) proof
7 Lattice 2-torsion Q-acyclic No (impossible) proof
8 Lagrangian smoothing Yes (admits smoothing) proof
9 Rank-1 tensor detector Yes (F exists, deg 3) proof
10 PCG for RKHS CP decomp Yes (PCG works) proof Yes (PCG works) proof Yes (PCG works) lean pdf

Proof Sketches (iter)

  1. No. The divergent renormalization mass delta-m causes the Hellinger affinity between the Phi-4-3 measure and its smooth shift to decay exponentially, establishing mutual singularity.
  2. Yes. A Whittaker function W with Kirillov support concentrated at a single torus point is constructed; the u_Q-twist collapses the Rankin-Selberg integral to a nonvanishing constant independent of s.
  3. Yes. A heat-bath Markov chain driven by adjacent transpositions with gap-dependent acceptance probabilities satisfies detailed balance against the interpolation ASEP stationary distribution at q=1.
  4. Yes. Concavity of 1/Phi_n along the semicircular heat flow is established via curvature analysis with doubly stochastic Jacobians, yielding the Stam inequality by variance decomposition.
  5. Yes. Isotropy separation sequences combined with induction on |H| and the maximal Mackey property characterize O-slice connectivity in terms of geometric fixed point connectivity.
  6. Yes. Edge classification by effective resistance, Foster's theorem, and random vertex sampling with spectral concentration bounds produce an epsilon-light subset of size at least cepsilon|V|.
  7. No. A torsion-free finite-index subgroup is a PD_d(Z)-group whose classifying space is contractible; Smith theory then forbids any nontrivial finite group from acting freely on it.
  8. Yes. Local vertex models decompose into transverse Lagrangian planes; mollification of piecewise-quadratic generating functions produces a Hamiltonian isotopy extending topologically to the polyhedral surface.
  9. Yes. F consists of 3x3 minors of flattening matrices. Rank-1 scaling preserves rank-2 flattenings via the bilinear structure; non-rank-1 scaling is detected by a Khatri-Rao cascade argument.
  10. Yes. The Kronecker-vec identity avoids O(N) cost; matvecs route through q observed entries via the selection matrix S. Preconditioner M = (Z^T Z) tensor K^2 + lambda(I_r tensor K) is inverted via simultaneous eigendecomposition in O(n^2 r) time.

Human Note

  • meta's solve.py doesn't run fully becuase of claude code's rate limits.
  • I don't understand any of the problem definition nor solutions. all are just gibberish to non exeprts like me.
  • a non domain-specific expert attempts on a random area.
  • don't use these proofs
  • #1stProof

About

non-expert's attempt on 1proof

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors