Evaluation Platform

API Endpoints: 9 (unified routing)

Architecture

Project Hierarchy

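Projects are organized first by kind (non_ui, ui). Under non_ui, they are grouped by domain type (react, common, cot) and then by project name. Under ui, assets are grouped directly by project name, alongside the unified dashboard.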

(repo root)/
  non_ui/                          -- backend / evaluation logic
    react/                         -- domain type: react
      shared/deterministic_checks/ -- shared checks for all react projects
      ms_evals/                    -- project: ms_evals
    common/                        -- domain type: common
      shared/deterministic_checks/ -- shared checks for all common projects
      hr_mvp/                      -- project: hr_mvp
    cot/                           -- domain type: cot (placeholder)
  ui/                              -- frontend / dashboards
    dashboard/                     -- unified dashboard
    ms_evals/                      -- ms_evals frontend assets
    hr_mvp/                        -- hr_mvp frontend assets
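
The layout above lends itself to discovering projects by walking the directory tree. The following is a minimal sketch under that assumption; the function name discover_projects is hypothetical, and the platform may register projects in some other way.

```python
from pathlib import Path


def discover_projects(repo_root):
    """Return sorted (domain_type, project) pairs found under non_ui/.

    Assumption: every directory under non_ui/<domain_type>/ is a project,
    except "shared" and private/hidden directories.
    """
    found = []
    non_ui = Path(repo_root) / "non_ui"
    if not non_ui.is_dir():
        return found
    for domain_dir in sorted(p for p in non_ui.iterdir() if p.is_dir()):
        for project_dir in sorted(p for p in domain_dir.iterdir() if p.is_dir()):
            name = project_dir.name
            # "shared" holds domain-wide checks; skip it and names such as
            # __pycache__ or .git, which are not projects.
            if name == "shared" or name.startswith(("_", ".")):
                continue
            found.append((domain_dir.name, name))
    return found
```

A placeholder domain type such as cot, which contains no project directories, contributes nothing to the result.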

Unified API Routing

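All requests are routed through a single server. Each endpoint accepts project_type and proj_name parameters, which the server uses to resolve the target project module at runtime. In the request below, project_type carries the domain type (react) and proj_name carries the project name (ms_evals).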

POST /api/execute
{
  "project_type": "react",
  "proj_name": "ms_evals",
  "task_data": { ... }
}
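
One way the server might turn those two parameters into a module name is sketched below. It assumes the dotted module path mirrors the directory layout (non_ui.<project_type>.<proj_name>); that mapping and the function name resolve_module_name are illustrative assumptions, not the platform's confirmed behavior.

```python
import re

# Accept only plain identifiers so a request cannot smuggle in dots or
# path separators and reach modules outside non_ui/.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_module_name(payload):
    """Build a dotted module name from a request payload (hypothetical layout)."""
    names = []
    for field in ("project_type", "proj_name"):
        value = payload.get(field)
        if not isinstance(value, str) or not _NAME.fullmatch(value):
            raise ValueError(f"missing or invalid {field!r}")
        names.append(value)
    return "non_ui.{}.{}".format(*names)
```

The resulting name could then be passed to importlib.import_module to load the project at request time. Validating both fields first keeps the dynamic import confined to the expected package.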

Standard Response Format

All API responses follow a consistent envelope.

{
  "success": true,
  "data": { ... }
}

Deterministic Checks Override

Shared validation logic is defined once per domain type, in that domain's shared/deterministic_checks/ directory. An individual project can override the shared checks by placing a deterministic_checks.py file in its project directory.

non_ui/{domain_type}/shared/deterministic_checks/  -- default
non_ui/{domain_type}/{project}/deterministic_checks.py  -- override
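
The override rule can be sketched as a simple lookup: prefer the project's own file, otherwise fall back to the shared directory. The function name resolve_checks_path is hypothetical, and this sketch assumes an override replaces the shared checks entirely rather than merging with them.

```python
from pathlib import Path


def resolve_checks_path(repo_root, domain_type, project):
    """Return the path of the deterministic checks that apply to a project."""
    base = Path(repo_root) / "non_ui" / domain_type
    override = base / project / "deterministic_checks.py"
    if override.is_file():
        # Project-level override takes precedence.
        return override
    # Default: the checks shared by every project of this domain type.
    return base / "shared" / "deterministic_checks"
```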

Response Envelope

Every endpoint returns this structure. Check the success field to determine the outcome; on failure, error details appear in data.error.

{ "success": boolean, "data": { ... } }
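
A client can wrap this check in a small helper. The sketch below assumes the envelope has already been parsed from JSON into a dict; ApiError and unwrap are hypothetical names introduced for illustration.

```python
class ApiError(Exception):
    """Raised when an envelope reports success == false."""


def unwrap(envelope):
    """Return envelope["data"] on success, otherwise raise ApiError."""
    if not isinstance(envelope, dict) or "success" not in envelope:
        raise ValueError("not a response envelope")
    data = envelope.get("data")
    if envelope["success"] is True:
        return data
    # On failure the error details are expected in data.error.
    error = data.get("error") if isinstance(data, dict) else None
    raise ApiError(error if error is not None else "unknown error")
```

Raising on failure keeps callers from accidentally treating an error payload as a result.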