ABOUT ME

I am an incoming PhD student at UT Austin, working in the Trustworthy and Intelligent Systems Lab (Trishul) under Swarat Chaudhuri. I am interested in developing intelligent systems for mathematics and other scientific domains. Previously I have worked on number theory, quadratic forms, and single-cell computational genomics.

NEWS

PAPERS

Find more info on these papers as well as other publications here.

  1. On three-term linear relations of theta series of positive-definite binary quadratic forms, Rahul Saha, Jonathan Hanke
  2. LLMSTEP: LLM Proofstep Suggestions in Lean (NeurIPS 2023 Math AI ), Sean Welleck, Rahul Saha
  3. A New Approach towards Autoformalization, Nilay Patel, Rahul Saha, Jeffrey Flanigan
  4. Engineering mtDNA Deletions by Reconstructing End-Joining in Human Mitochondria, Yi Fu, Max Land, Ruobing Cui, Tamar Kavlashvili, Minsoo Kim, Toby Lieber, Keun Woo Ryu, Emily DeBitetto, Ignas Masilionis, Rahul Saha, Meril Takizawa, Daphne Baker, Marco Tigano, Ed Reznik, Roshan Sharma, Ronan Chaligne, Craig B Thompson, Dana Pe’er, Agnel Sfeir

RESEARCH INTERESTS

How did complex mathematical ability arise from millions of years of evolution? How are humans capable of discovering relativity, or proving the 290 theorem? From both cognitive and mathematical perspectives, this is a very compelling question to me. For my doctoral research, I want to study this problem from the angle of automatic theorem proving, discovery, and program synthesis.

  • Automatic Theorem Proving. How do we train agents to rigorously prove mathematical statements? Most popular methods currently involve guided tree search with signals from an LLM, but there are many unanswered questions on sample complexity, synthetic data generation, and scaling laws.
  • Automatic Discovery. How do humans come up with newer concepts in mathematics and science? This is a generative AI problem but the twist is that the space is highly discrete.
  • Program Synthesis. How can one build reliable, safe, and formally verifiable programs at scale? LLMs are expensive and not reliable when codebases scale up to tens of thousands of lines of code. Tackling this will require coming up with newer algorithms.

APPLICATIONS

I believe (and hope) that the fruits of my research will empower us to discover newer frontiers of science, empower mathematicians with the ability to ask and answer deeper questions, boost productivity of scientists, and help humanity towards solving some of the hardest problems.

PORTFOLIO

My projects are listed here. Some of these projects are no longer maintained, but I am happy to entertain any questions or comments.