Computer Use Test — Run Results

Copy this file to RESULTS-<SCOPE>-<RUNID>.md (e.g. RESULTS-full-20260601a.md or RESULTS-CU-01-20260601a.md) and fill it in. This is the artifact Ravi reviews after a run: it must make clear what was tested, what passed, what failed, and what is left to test by hand.
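A minimal sketch of the naming convention above, for runners that generate the file name in a script. The function name `results_filename` and the validation patterns are assumptions inferred from the two examples (a scope of `full` or a suite ID such as `CU-01`, and a RUNID of eight digits plus a lowercase letter); adjust them if the real convention is looser.

```python
import re


def results_filename(scope: str, run_id: str) -> str:
    """Build RESULTS-<SCOPE>-<RUNID>.md (hypothetical helper, not part of the harness)."""
    # Assumed scope shape: "full" or a suite ID like "CU-01".
    if not re.fullmatch(r"full|CU-\d{2}", scope):
        raise ValueError(f"unexpected scope: {scope!r}")
    # Assumed RUNID shape: YYYYMMDD plus one lowercase letter, e.g. 20260601a.
    if not re.fullmatch(r"\d{8}[a-z]", run_id):
        raise ValueError(f"unexpected run id: {run_id!r}")
    return f"RESULTS-{scope}-{run_id}.md"


print(results_filename("full", "20260601a"))
print(results_filename("CU-01", "20260601a"))
```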

Run header

| Field | Value |
| --- | --- |
| RUNID | <RUNID> |
| Date / time (UTC) | <YYYY-MM-DD HH:MM> |
| Environment | https://calendo.dev (production) |
| Agent / harness | <which computer-use agent / model> |
| Scope | <single suite ID OR "full graph (Wave 1 + Wave 2)"> |
| Preconditions (§B checklist) | ☐ All passed ☐ Failures (list below) |
| Overall result | <PASS / PASS-with-residue / FAIL / BLOCKED> |

Precondition failures (if any): <none / list each failed item from 00-setup §B and what was skipped because of it>


Per-suite results

Status legend: PASS (all pass/fail criteria met) · PARTIAL (some items pass, some blocked/failed) · FAIL (a pass/fail criterion failed) · BLOCKED (precondition/session missing) · SKIPPED (not attempted this run) · N/R (not run).

Fill L1 / L2 / L3 with ✅ / ❌ / — (n/a) / 🚫 (blocked). Link evidence to screenshot names.

| ID | Title | Pri | Status | L1 | L2 | L3 | Evidence | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CU-01 | Core booking lifecycle (book → reschedule → cancel) | P0 | N/R | | | | | |
| CU-02 | Auth lifecycle (register, verify, login, reset, delete) | P0 | N/R | | | | | |
| CU-03 | Google Calendar (conflict, buffers, two-way) | P0 | N/R | | | | | |
| CU-05 | Event-type config → booking-page enforcement | P0 | N/R | | | | | |
| CU-06 | Availability engine (weekly, overrides, holidays, slot-debug) | P0 | N/R | | | | | |
| CU-07 | Host-side booking management | P1 | N/R | | | | | |
| CU-08 | AI booking chatbot (public page) | P1 | N/R | | | | | |
| CU-09 | AI dashboard assistant (feature parity) | P1 | N/R | | | | | |
| CU-10 | Landing + marketing + static pages + mobile | P1 | N/R | | | | | |
| CU-11 | Public booking page UX (timezones, nav, QR, mobile) | P1 | N/R | | | | | |
| CU-04 | Microsoft / Outlook calendar integration | P2 | N/R | | | | | |
| CU-12 | Routing forms (build → submit → route → analytics) | P2 | N/R | | | | | |
| CU-13 | Meeting polls (create → vote → tally → finalize) | P2 | N/R | | | | | |
| CU-14 | Team / org scheduling (roles, round-robin, collective) | P2 | N/R | | | | | |
| CU-15 | Contacts, analytics dashboard, CSV export | P2 | N/R | | | | | |
| CU-16 | Settings & customization (branding, blocklist, BYOK, pixels) | P2 | N/R | | | | | |
| CU-17 | Slack notifications & outbound webhooks | P2 | N/R | | | | | |
| CU-18 | New-user onboarding wizard (4-step) | P2 | N/R | | | | | |
| CU-19 | Embeddable booking widget (inline/popup/badge) | P3 | N/R | | | | | |
| CU-20 | Email sequences, reminders, reconfirmation (time-gated) | P3 | N/R | | | | | |
| CU-22 | Chrome extension for Gmail (manual-led) | P3 | N/R | | | | | |

Tally: PASS __ · PARTIAL __ · FAIL __ · BLOCKED __ · SKIPPED __ (of 21)
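If the Status column is collected programmatically, the tally line can be derived rather than counted by hand. The following is a rough sketch only: `tally_line` and the `STATUSES` tuple are hypothetical names, and the choice to leave rows still marked N/R out of the printed tally (while counting them toward the 21 suites) is an assumption, not a rule stated in this template.

```python
from collections import Counter

# Status values from the legend; N/R is the template's default placeholder.
STATUSES = ("PASS", "PARTIAL", "FAIL", "BLOCKED", "SKIPPED", "N/R")


def tally_line(statuses, total=21):
    """Format the tally line from one status string per suite (hypothetical helper)."""
    counts = Counter(statuses)
    unknown = set(counts) - set(STATUSES)
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    if len(statuses) != total:
        raise ValueError(f"expected {total} suites, got {len(statuses)}")
    # Assumption: N/R rows are validated but not shown, matching the tally line's five fields.
    parts = " · ".join(f"{s} {counts[s]}" for s in STATUSES[:-1])
    return f"Tally: {parts} (of {total})"


example = ["PASS"] * 3 + ["FAIL"] + ["N/R"] * 17
print(tally_line(example))
```

With this input, three suites passed, one failed, and the remaining seventeen are still marked N/R, so the five printed counts sum to less than 21.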


Per-suite detail

Duplicate this block for each suite attempted. Keep failures specific and reproducible.

CU-__ — <title>


Manual residue — REMAINING FOR RAVI TO TEST

Things the agent could not fully verify in-browser this run. (Pre-filled from the suites' manual-residue sections; the runner ticks/annotates what actually fell through.)

Out of scope / TBD this run

See ../COVERAGE.md for the full per-suite manual-residue and TBD ledger.


Cleanup confirmation

Cleanup notes: <anything left intentionally, or that needs manual/D1 cleanup>