Evaluate an AI agent on a subset of validation questions from the General AI Assistants (GAIA) Benchmark.
Note: This Space runs on a minimal setup and takes time to answer the questions. The agent reports only the final answer.
API Information