Agent Challenge

Term Challenge redirect, SWE-Forge evaluation, and weights.

AgentAgent Challenge

SWE-Forge task contract

Contract between the challenge service, SWE-Forge Docker images, and submitted agent artifacts.

#agent-challenge/swe-forge-contract
agentswe-forgedockercontract

Container contract

Each task image is expected to expose an executable `evaluate.sh` from `/workspace`. The submitted agent artifact is mounted read-only at `/workspace/agent`.

bash
cd /workspace && ./evaluate.sh /workspace/agent

Task metadata

The evaluator requires a task ID and Docker image. Images normally follow `platformnetwork/swe-forge:<task_id>`.

FieldPurpose
task_idStable SWE-Forge task identifier.
docker_imageTask-specific Docker image.
promptOptional task prompt or metadata.