This repository contains the evaluation infrastructure for FormalBench, including evaluation metrics and wrappers for calling LLMs. If you find this repository useful, please cite our research paper ...