How do LLM benchmarks in physics differ from testing human physicist reasoning? · Antigravity