This tool builds problem-specific judges that evaluate code, diagnose issues, and suggest evidence-based improvements.
Discovered via GitHub:2389-research