NxFusion SFMS + Atlas — QA Test Plan, Scenarios & Readiness Scorecard

Scope: All modules, Phases 0–11 (Core Platform → CPMS/EMS).
Environment: Local dev (`pnpm dev`) against MongoDB Atlas, seeded with Kajima Technical Center (KTC) data.
Status of this doc: QA design + execution guide. Scores reflect the as-built system and are pre-production: the Phase 9 hardening gate (external pentest, prod deploy, DR drill, load test) is not yet passed.

1. Purpose

This document is the single source of truth for QA on the platform. It defines:

Test environment & setup — how to get to a testable state.
Test methodology — the types of tests and the case/ID conventions.
Per-module test plans — scenarios, test cases, and success criteria.
Scoring model — how each module's completion and readiness are scored.
Master scorecard — completion %, pass rate, and readiness (0–5) per module.
Overall readiness verdict — go/no-go for UAT / production.

It is written so a QA engineer (or an automated agent) can execute the cases and reproduce the scores.

2. Test environment & setup

2.1 Prerequisites

Item	Value / command
Runtime	Node ≥ 20, pnpm
Datastore	MongoDB Atlas (URI in `apps/web/.env.local` — never printed/committed)
Secrets	`MONGODB_URI`, `OPENAI_API_KEY`, `AUTH_SECRET` in `apps/web/.env.local`
Seed data	`pnpm seed` (idempotent — KTC tenant, system roles, demo users, Brick tags)
Start	`pnpm dev` → `http://localhost:3000`
Type gate	`pnpm --filter @atlas/web typecheck` (must be 0 errors before any test run)

2.2 Test accounts (from seed)

All demo users share the password `Passw0rd!demo`.

Role	Email	Grants (relevant to CPMS/EMS)
owner	`owner@ktc.example`	everything
admin	`admin@ktc.example`	everything except `tenant.update`
member	`member@ktc.example`	operator scope: `*.read`, `cpms.control`, `ems.control`, `ems.tariff.update`, `ems.bill.upload`, `ems.dr.execute`, `fdd.manage` (no `ems.carbon.update`)
viewer	`viewer@ktc.example`	read-only: `cpms.read`, `ems.read`, `fdd.read`, `telemetry.read`
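
The grants above can be turned into an expectation oracle for the RBAC cases. The sketch below is a hypothetical matcher, not the app's own: the `allow`/`deny` shape, the `*` and `*.read` wildcard semantics, and the `can` helper are assumptions made for illustration, and the table lists only the grants relevant to CPMS/EMS.

```typescript
type Role = "owner" | "admin" | "member" | "viewer";

interface RoleGrants {
  allow: string[];
  deny: string[];
}

// Transcribed from the seed table; "*" stands in for "everything" (assumption).
const SEED_GRANTS: Record<Role, RoleGrants> = {
  owner: { allow: ["*"], deny: [] },
  admin: { allow: ["*"], deny: ["tenant.update"] },
  member: {
    allow: ["*.read", "cpms.control", "ems.control", "ems.tariff.update",
      "ems.bill.upload", "ems.dr.execute", "fdd.manage"],
    deny: [],
  },
  viewer: { allow: ["cpms.read", "ems.read", "fdd.read", "telemetry.read"], deny: [] },
};

// A pattern matches when it is "*", equals the permission exactly, or is a
// "*.suffix" wildcard whose suffix ends the permission (assumed semantics).
function matches(pattern: string, permission: string): boolean {
  if (pattern === "*" || pattern === permission) return true;
  if (pattern.slice(0, 2) === "*.") {
    const suffix = pattern.slice(1); // e.g. ".read"
    return permission.slice(-suffix.length) === suffix;
  }
  return false;
}

// Denials win over grants; anything not granted is refused (expected 403).
function can(role: Role, permission: string): boolean {
  const grants = SEED_GRANTS[role];
  if (grants.deny.some((p) => matches(p, permission))) return false;
  return grants.allow.some((p) => matches(p, permission));
}
```

A case such as TC-EMS-013 then reduces to checking that `can("member", "ems.carbon.update")` is false before expecting the 403.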

2.3 Two auth modes for testing

UI / e2e: log in with the accounts above (real NextAuth session) — required for RBAC/permission tests.
API smoke (dev only): start with `AUTH_DEV_BYPASS=true pnpm dev` to resolve the first tenant's owner context without a session. Use it for fast endpoint verification only; every call runs as owner, so it cannot exercise RBAC cases. Never enable it in production: the app honours the bypass only when `NODE_ENV !== 'production'`.

2.4 API base & conventions

Base URL: `http://localhost:3000/api/v1`
Auth header (API key path): `x-api-key: <key>`
Standard responses: 200/201 success, 400 validation, 401 unauthenticated, 403 forbidden (missing permission), 404 not found, 409 conflict, 422 Zod validation.

3. Test methodology

3.1 Test types

Type	Symbol	What it proves
Functional	F	The happy path produces the correct result.
Negative	N	Bad input / illegal state is rejected with the right code.
RBAC / Security	S	Permission gates enforce least privilege (esp. viewer→403).
Integration	I	Cross-module flows (e.g., finding → work order).
Data integrity	D	Tenant isolation, audit trail, idempotency.
End-to-end	E	Full user journey through the UI.
Regression	R	Previously-fixed behaviour stays fixed.

3.2 Case ID scheme

TC-<MODULE>-<NNN> — e.g., TC-CPMS-004, TC-EMS-011. Priority: P1 (blocker/critical path), P2 (important), P3 (nice-to-have).

3.3 Case template

ID:            TC-<MODULE>-NNN
Title:         <one line>
Type:          F | N | S | I | D | E | R
Priority:      P1 | P2 | P3
Preconditions: <state / role>
Steps:         1… 2… 3…
Expected:      <observable, deterministic result>
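
For automated runs, the template and ID scheme can be captured as a type plus a validator. This is a minimal sketch: the `TestCase` field names and the `parseCaseId` helper are illustrative assumptions, and the pattern assumes a three-digit sequence number and an uppercase module code that may contain digits (as in `I18N`).

```typescript
type CaseType = "F" | "N" | "S" | "I" | "D" | "E" | "R";
type Priority = "P1" | "P2" | "P3";

// Mirrors the case template field by field.
interface TestCase {
  id: string;
  title: string;
  type: CaseType;
  priority: Priority;
  preconditions: string;
  steps: string[];
  expected: string;
}

// TC-<MODULE>-<NNN>: uppercase module code, then exactly three digits.
const CASE_ID = /^TC-([A-Z][A-Z0-9]*)-(\d{3})$/;

// Returns the module and sequence number, or null when the ID is malformed.
function parseCaseId(id: string): { module: string; seq: number } | null {
  const m = CASE_ID.exec(id);
  return m ? { module: m[1], seq: parseInt(m[2], 10) } : null;
}

// TC-CORE-002 from section 5.1, expressed in the template's shape.
const example: TestCase = {
  id: "TC-CORE-002",
  title: "Unauthenticated API",
  type: "S",
  priority: "P1",
  preconditions: "No session, no API key",
  steps: ["GET /work-orders with no auth"],
  expected: "401",
};
```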

3.4 Golden rule — deterministic expectations

A case must assert an observable, reproducible result (an HTTP code, a computed number within a stated tolerance, a DB state). Where a result depends on live telemetry, assert invariants (e.g., "empty what-if scenario reproduces the measured baseline within ±1 kW") rather than absolute values, because the seeded telemetry simulator's freshness varies.
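
Tolerance checks of this kind are easy to get subtly wrong, so a small shared helper is worth having. The sketch below is illustrative only; the function names are assumptions, and the ±1 kW figure is the example from the paragraph above.

```typescript
// True when actual lies within an absolute tolerance of expected.
// Non-finite readings (NaN, Infinity) never pass.
function withinTolerance(actual: number, expected: number, tolerance: number): boolean {
  return isFinite(actual) && Math.abs(actual - expected) <= tolerance;
}

// True when actual lies within a percentage of expected (e.g. 1 for ±1%).
function withinPercent(actual: number, expected: number, percent: number): boolean {
  return withinTolerance(actual, expected, Math.abs(expected) * (percent / 100));
}

// Invariant from the text: an empty what-if scenario reproduces the
// measured baseline within ±1 kW, whatever the live value happens to be.
function emptyWhatIfReproducesBaseline(measuredKw: number, simulatedKw: number): boolean {
  return withinTolerance(simulatedKw, measuredKw, 1);
}
```

The invariant form compares two values read in the same run, so it holds whether the simulator is fresh or stale.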

4. Scoring model

4.1 Per-module dimension scores

Each module is scored on six dimensions, each 0–5:

#	Dimension	Weight	5 =	0 =
D1	Functional completeness	30%	Full spec built	Not started
D2	Test pass rate	25%	100% of P1+P2 pass	<50% pass
D3	RBAC / security	15%	All gates enforced + negative-tested	No gating
D4	Error handling & validation	10%	All bad input mapped to correct codes	Unhandled
D5	Data integrity & audit	10%	Tenant-scoped + audited + idempotent	Leaky
D6	UX / usability	10%	Complete, coherent surfaces	None

4.2 Completion score (%)

Completion% = D1/5 × 100 — the share of specified scope implemented (independent of test results).

4.3 Readiness score (0–5)

Readiness = 0.30·D1 + 0.25·D2 + 0.15·D3 + 0.10·D4 + 0.10·D5 + 0.10·D6

4.4 Readiness bands

Readiness	Band	Meaning
4.5 – 5.0	Production-ready	Complete, hardened, externally verified. Requires the Phase 9 gate.
3.8 – <4.5	Release-candidate	Complete + fully verified in dev; minor hardening left.
3.0 – <3.8	UAT / Demo-ready	Functionally complete, live-verified in dev; not prod-hardened.
2.0 – <3.0	Foundation	Core built, partial verification.
1.0 – <2.0	Prototype	Thin/stub; not for demo.
0.0 – <1.0	Not started	n/a

System-wide cap: No module can be scored Production-ready until Phase 9 (pentest, prod deploy, DR drill, load test) passes. Until then, any module whose computed readiness exceeds 4.4 is capped at 4.4 (RC) in this pre-gate assessment.
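
The scoring rules in this section can be reproduced mechanically. The following is a minimal sketch under stated assumptions: the type and function names are illustrative, and each band's lower bound is treated as its threshold, so a score that falls between two listed ranges lands in the lower band.

```typescript
interface DimensionScores {
  d1: number; // functional completeness
  d2: number; // test pass rate
  d3: number; // RBAC / security
  d4: number; // error handling & validation
  d5: number; // data integrity & audit
  d6: number; // UX / usability
}

// Weights from the dimension table (they sum to 1.0).
const WEIGHTS: DimensionScores = { d1: 0.30, d2: 0.25, d3: 0.15, d4: 0.10, d5: 0.10, d6: 0.10 };

// Highest readiness a module may carry before the Phase 9 gate passes.
const PRE_GATE_CAP = 4.4;

// Completion% = D1/5 × 100.
function completionPct(s: DimensionScores): number {
  return (s.d1 / 5) * 100;
}

// Readiness = weighted sum of the six dimension scores.
function readiness(s: DimensionScores): number {
  return WEIGHTS.d1 * s.d1 + WEIGHTS.d2 * s.d2 + WEIGHTS.d3 * s.d3 +
    WEIGHTS.d4 * s.d4 + WEIGHTS.d5 * s.d5 + WEIGHTS.d6 * s.d6;
}

// Applies the system-wide cap while the Phase 9 gate is open.
function capReadiness(score: number, gatePassed: boolean): number {
  return gatePassed ? score : Math.min(score, PRE_GATE_CAP);
}

// Maps a readiness score to its band using lower-bound thresholds.
function band(score: number): string {
  if (score >= 4.5) return "Production-ready";
  if (score >= 3.8) return "Release-candidate";
  if (score >= 3.0) return "UAT / Demo-ready";
  if (score >= 2.0) return "Foundation";
  if (score >= 1.0) return "Prototype";
  return "Not started";
}
```

For the AI Second Brain row in §6 (4.5, 3.8, 4.5, 4.0, 4.0, 4.0) this gives a completion of 90% and a readiness of 4.175, which the scorecard shows as 4.18.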

5. Per-module test plans

Modules are grouped by phase. Each has: objective → key scenarios → representative test cases → success criteria → dimension scores.

5.1 Core Platform — Auth, RBAC, Audit, API (Phases 1–2)

Objective: Multi-tenant isolation, role-based access, immutable audit, consistent API envelope.

Key scenarios: login (session + API key); permission enforcement per role; tenant data isolation; audit write on every mutation; error-envelope consistency.

ID	Title	Type	Pri	Steps	Expected
TC-CORE-001	Session login	F	P1	Log in as each demo user	200, session with correct `tenantId`/`roles`
TC-CORE-002	Unauthenticated API	S	P1	`GET /work-orders` with no auth	401
TC-CORE-003	Missing permission	S	P1	Viewer `POST /work-orders`	403 `Missing permission: workorder.create`
TC-CORE-004	Tenant isolation	D	P1	Query with tenant A key for tenant B `_id`	404 (never cross-tenant leak)
TC-CORE-005	Audit on mutation	D	P1	Create a work order → `GET /audit`	Entry with actor, action, target, timestamp
TC-CORE-006	Zod validation	N	P2	`POST` with malformed body	422 with field errors
TC-CORE-007	Duplicate key	N	P2	Create entity with dup unique field	409
TC-CORE-008	API key auth	F	P2	Call with `x-api-key`	200, context resolved from key

Success criteria: All P1 pass; zero cross-tenant leakage; every mutating endpoint writes exactly one audit row; all error classes map to the documented codes.

5.2 Data Platform (Phase 3)

Objective: Canonical hierarchy (Sites→Buildings→Locations→Assets→Systems→Data Points), tags/ontology, connectors, time-series telemetry, imports.

Key scenarios: CRUD + pagination on every entity; ontology auto-tagging; telemetry ingest + query; CSV import + reconciliation; Brick tag catalogue.

ID	Title	Type	Pri	Steps	Expected
TC-DATA-001	Hierarchy CRUD	F	P1	Create Site→Building→Location→Asset	Each persists, parent links resolve
TC-DATA-002	Pagination	F	P2	`GET /assets?pageSize=25&page=2`	Correct page + `pagination.total`
TC-DATA-003	Telemetry ingest+read	I	P1	Ingest a point → query latest	Value returned within window
TC-DATA-004	Ontology auto-tag	F	P2	`suggestTags('CHP-01.power_kw')`	Includes `Real_Power_Sensor`, `Point`, `Sensor`
TC-DATA-005	Brick catalogue	D	P2	Re-seed → `GET /tags`	≥ 46 system Brick tags incl. CPMS/EMS classes
TC-DATA-006	CSV import	I	P2	Upload import job	Rows created, recon report generated
TC-DATA-007	Delete guard	N	P3	Delete a Site with children	Blocked or cascades per rule

Success criteria: Every entity has list+detail+CRUD with pagination; telemetry round-trips; ontology recognises CPMS/EMS tag patterns; imports reconcile.
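
For TC-DATA-002, the expected page can be derived from a reference slice of the known seed set. This sketch assumes 1-based page numbers and an envelope with `data` plus `pagination.page`/`pageSize`; only `pagination.total` is named in this plan, so the other field names are hypothetical.

```typescript
interface Page<T> {
  data: T[];
  pagination: { page: number; pageSize: number; total: number };
}

// Reference slice: what page N of size S should contain for a known list.
function paginate<T>(items: T[], page: number, pageSize: number): Page<T> {
  if (page < 1 || pageSize < 1) {
    throw new RangeError("page and pageSize must be >= 1");
  }
  const start = (page - 1) * pageSize;
  return {
    data: items.slice(start, start + pageSize),
    pagination: { page: page, pageSize: pageSize, total: items.length },
  };
}
```

Comparing the API response against `paginate(seedAssets, 2, 25)` checks both the page contents and `pagination.total` in one assertion.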

5.3 SFMS Core — Work Orders, Maintenance, Documents, Search (Phase 4)

Objective: Work-order lifecycle, maintenance scheduling, documents, global search.

ID	Title	Type	Pri	Steps	Expected
TC-SFMS-001	WO create	F	P1	Create work order	201, unique `code`, status `open`, history entry
TC-SFMS-002	WO lifecycle	F	P1	open→assign→start→complete→close	Legal transitions only; illegal → 409
TC-SFMS-003	WO assign notifies	I	P2	Assign to a user	Notification created for assignee
TC-SFMS-004	Maintenance → WO	I	P2	Run a plan	Work order generated with checklist from plan tasks
TC-SFMS-005	Document upload	F	P2	Upload a doc	Stored, listed, downloadable
TC-SFMS-006	Search	F	P2	Search a WO code / asset	Correct result with working detail link
TC-SFMS-007	Illegal transition	N	P1	`close` an `open` WO	409 invalid transition

Success criteria: Lifecycle state machine rejects all illegal transitions; maintenance plans emit checklists; search links resolve to detail pages.
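
The lifecycle in TC-SFMS-002 and TC-SFMS-007 behaves like a small state machine. The sketch below is one way to model the expectation, not the app's implementation: only the `open` status and the 409 response come from this plan, while the other status names and the `transition` helper are assumptions.

```typescript
type Status = "open" | "assigned" | "in_progress" | "completed" | "closed";
type Action = "assign" | "start" | "complete" | "close";

// Legal transitions only; any (status, action) pair not listed is illegal.
const TRANSITIONS: Record<Status, Partial<Record<Action, Status>>> = {
  open: { assign: "assigned" },
  assigned: { start: "in_progress" },
  in_progress: { complete: "completed" },
  completed: { close: "closed" },
  closed: {},
};

type TransitionResult =
  | { ok: true; status: Status }
  | { ok: false; httpStatus: 409; error: string };

// Returns the next status, or a 409-style rejection for an illegal move.
function transition(current: Status, action: Action): TransitionResult {
  const next = TRANSITIONS[current][action];
  if (next === undefined) {
    return { ok: false, httpStatus: 409, error: "invalid transition: " + action + " from " + current };
  }
  return { ok: true, status: next };
}
```

Enumerating every (status, action) pair against this table yields the full set of negative cases, of which TC-SFMS-007 (`close` on an `open` work order) is one.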

5.4 Dashboards & Widgets (Phase 5)

Objective: Drag-drop dashboards, widget catalogue (permission-filtered), templates, TV/war-room.

ID	Title	Type	Pri	Steps	Expected
TC-DASH-001	Widget catalogue	F	P1	`GET /widgets` as owner	All widgets incl. 12 CPMS/EMS widgets
TC-DASH-002	Catalogue is filtered	S	P1	`GET /widgets` as viewer	CPMS/EMS widgets present (read), admin-only widgets hidden
TC-DASH-003	Create from template	I	P1	Create dashboard from `energy` template	7 widgets laid out
TC-DASH-004	Widget renders data	F	P2	Add `chiller.performance`	Fetches summary, renders KPIs
TC-DASH-005	Save layout	D	P2	Drag + resize + save	Layout persisted + versioned
TC-DASH-006	Unknown widget	N	P3	Render bad type	Graceful "Unknown widget"

Success criteria: 28 widgets registered; catalogue permission-filtered; the energy and chiller-plant templates instantiate with 7 and 6 widgets respectively; layouts persist.

5.5 AI Second Brain (Phase 6)

Objective: Provider-agnostic LLM router, DB-grounded chat, NL→Mongo query, RAG, agent governance.

ID	Title	Type	Pri	Steps	Expected
TC-AI-001	Grounded answer	F	P1	Ask "how many open work orders?"	Answer matches DB count (±0)
TC-AI-002	Provider fallback	F	P2	Disable real provider	Falls back to stub without error
TC-AI-003	NL→query safety	S	P1	Ask a destructive query	Read-only; no mutation executed
TC-AI-004	Permission gate	S	P1	Viewer without `ai.chat.send`	403
TC-AI-005	Conversation delete	D	P2	Delete a conversation	Cascade removes messages

Success criteria: Counts match DB; no LLM path can mutate data; provider outages degrade gracefully. (Note: narrative text is non-deterministic — assert grounded facts, not phrasing.)
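
One way to make "no LLM path can mutate data" testable is an allowlist check on the generated query before it runs. The sketch below assumes a hypothetical `QueryPlan` shape for the NL→query output; the operation names are MongoDB collection methods, and `$out` and `$merge` are the aggregation stages that write to a collection.

```typescript
// Hypothetical shape of the plan produced by the NL→query step.
interface QueryPlan {
  collection: string;
  operation: string;
  pipeline?: Array<Record<string, unknown>>;
}

// Only read operations are allowed; everything else is refused by default.
const READ_ONLY_OPERATIONS = ["find", "countDocuments", "aggregate"];

// Aggregation stages that persist results and therefore count as writes.
const WRITING_STAGES = ["$out", "$merge"];

// True only for an allowlisted operation whose pipeline has no writing stage.
function isReadOnly(plan: QueryPlan): boolean {
  if (READ_ONLY_OPERATIONS.indexOf(plan.operation) === -1) return false;
  const stages = plan.pipeline || [];
  return stages.every((stage) =>
    Object.keys(stage).every((key) => WRITING_STAGES.indexOf(key) === -1));
}
```

An allowlist is used rather than a blocklist so that an operation the guard has never seen is rejected instead of executed.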

5.6 Workflow Studio & Approvals (Phase 7)

Objective: Visual workflow designer, in-process engine, approvals/inbox, triggers.

ID	Title	Type	Pri	Steps	Expected
TC-WF-001	Create + publish	F	P1	Build a workflow, publish	Versioned, runnable
TC-WF-002	Trigger fires	I	P1	Cron/telemetry/webhook trigger	Run recorded, steps executed
TC-WF-003	Approval gate	I	P1	Step requires approval	Inbox item created; decision resumes/halts run
TC-WF-004	RBAC on decide	S	P2	User without `approval.decide`	403

Success criteria: Runs execute in-process (no external queue); approvals block/resume; triggers fire.

5.7 Mobile / PWA (Phase 8)

Objective: Installable PWA, offline queue + sync, mobile inspection, push.

ID	Title	Type	Pri	Steps	Expected
TC-PWA-001	Install	F	P2	Install PWA	App installs, launches standalone
TC-PWA-002	Offline queue	F	P1	Go offline, submit inspection	Queued locally
TC-PWA-003	Sync on reconnect	I	P1	Reconnect	Queue flushes, server state updated
TC-PWA-004	Push (gated)	F	P3	Trigger a push (VAPID set)	Notification received

Success criteria: Offline actions queue and sync without loss; push works when VAPID configured.
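
The "without loss" criterion comes down to two properties: actions leave the queue only after the server accepts them, and they are sent in submission order. The sketch below illustrates those properties with an in-memory, synchronous queue so it runs anywhere; the class and its method names are hypothetical, and a real PWA would persist the queue (for example in IndexedDB) and send asynchronously.

```typescript
interface QueuedAction {
  id: string;
  payload: unknown;
}

// Returns true when the server accepted the action.
type Sender = (action: QueuedAction) => boolean;

class OfflineQueue {
  private pending: QueuedAction[] = [];

  enqueue(action: QueuedAction): void {
    this.pending.push(action);
  }

  size(): number {
    return this.pending.length;
  }

  // Sends in submission order and stops at the first failure, so an
  // unsent action is never dropped or reordered. Returns the count sent.
  flush(send: Sender): number {
    let sent = 0;
    while (this.pending.length > 0) {
      if (!send(this.pending[0])) break;
      this.pending.shift();
      sent++;
    }
    return sent;
  }
}
```

TC-PWA-002 corresponds to `enqueue` while the sender fails, and TC-PWA-003 to a `flush` that drains the queue to size 0 once the sender succeeds.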

5.8 Internationalisation

ID	Title	Type	Pri	Steps	Expected
TC-I18N-001	Locale switch	F	P2	Set tenant locale to `ja`/`zh`	Nav + core pages translate
TC-I18N-002	Fallback	F	P3	Untranslated string	Falls back to English, no crash

Success criteria: en/ja/zh switch; missing keys fall back to English.
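
The fallback rule can be stated as a lookup order: requested locale first, then English. The sketch below is illustrative; the catalogue shape and the final fallback to the key itself (so a missing English string still does not crash) are assumptions.

```typescript
type Locale = "en" | "ja" | "zh";
type Messages = Record<string, string>;

// Looks up key in the requested locale, then English, then returns the
// key itself so a gap in the catalogue never throws.
function translate(catalogue: Record<Locale, Messages>, locale: Locale, key: string): string {
  const localised = catalogue[locale][key];
  if (localised !== undefined) return localised;
  const english = catalogue.en[key];
  return english !== undefined ? english : key;
}
```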

5.9 CPMS — Chiller Plant Management (Phase 10)

Objective: Monitoring, control (simulated + safety-gated), FDD, GL36 staging, setpoint-reset, what-if, forecast, schematic, widgets. Maps to spec AC10.1–10.9.

Key scenarios: live KPI accuracy; FDD detection→work order; staging recommendation correctness; what-if calibration & feasibility; setpoint-reset savings; forecast; control-safety (breaker + two-person); RBAC.

ID	Title	Type	Pri	Steps	Expected
TC-CPMS-001	Plant summary	F	P1	`GET /chillers/summary`	kW/RT, ΔT, load, heat-balance, per-chiller rows
TC-CPMS-002	FDD scan → finding	I	P1	`POST /findings/scan?module=cpms`	Findings for any true fault; healthy plant → 0 (true negative)
TC-CPMS-003	Finding → work order	I	P1	`POST /findings/:id/work-order`	201 WO created, priority = severity map, finding linked
TC-CPMS-004	GL36 staging	F	P1	`GET /chillers/staging`	action ∈ {stage_up,stage_down,hold}; at ~89% load w/ no spare → hold with reason; thresholds computed
TC-CPMS-005	What-if calibration	F	P1	`POST /chillers/whatif {}`	Empty scenario reproduces measured power (Δ ≈ 0)
TC-CPMS-006	What-if direction	F	P1	CHW +1°C then −1°C	+1°C saves; −1°C costs (sign flips); ~2.5%/°C
TC-CPMS-007	What-if feasibility	N	P1	Stage to 1 chiller under high load	`feasible=false`, "need ≥N chillers"
TC-CPMS-008	Setpoint reset	F	P2	`GET /chillers/reset`	CHW/condenser/ΔP loops with current→target + savings; no headroom honestly reported
TC-CPMS-009	Forecast	F	P2	`GET /chillers/forecast`	24 points, peak hour, next-24h kWh, method=diurnal
TC-CPMS-010	Breaker trip	S	P1	`POST /control/breaker {trip:true}` → `/control/request`	Request → 409 circuit_breaker
TC-CPMS-011	Breaker override	F	P2	`/control/request {override:true}` while tripped	Applied
TC-CPMS-012	Two-person: stage	F	P1	Risky change (≥2°C) via `/control/request`	`pending_approval` with reasons
TC-CPMS-013	Two-person: self-approve	S	P1	Approve own request	403 two_person_required
TC-CPMS-014	Two-person: commit	I	P1	2nd operator approves	Applied; pending cleared
TC-CPMS-015	RBAC viewer control	S	P1	Viewer `POST /chillers/control`	403
TC-CPMS-016	Schematic renders	E	P2	Open `/chillers`	SVG shows towers→chillers→load, running state, setpoints
TC-CPMS-017	Widgets	F	P2	Catalogue	5 CPMS widgets present + render

Success criteria (per AC10.x): summary KPIs correct; ≥1 FDD rule fires on a seeded fault and promotes to a WO; staging, what-if, and reset math are mutually consistent (all reuse the calibrated sensitivity model); safety gates enforced (tripped breaker → 409, self-approval → 403, commit only on a second operator's approval); viewer cannot control.
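
The what-if expectations in TC-CPMS-005 to TC-CPMS-007 can be checked against a simple reference model. This is a sketch under stated assumptions, not the platform's calibrated model: it assumes a linear sensitivity of 2.5% per °C of chilled-water setpoint change, equal chiller capacities, and hypothetical field names.

```typescript
// ~2.5% of plant power per °C of CHW setpoint change (TC-CPMS-006).
const CHW_SENSITIVITY_PER_DEG_C = 0.025;

interface Plant {
  measuredKw: number;        // current measured plant power
  loadRt: number;            // current cooling load
  chillerCapacityRt: number; // capacity of one chiller (assumed equal)
}

interface Scenario {
  chwSetpointDeltaC?: number; // + raises the setpoint, - lowers it
  runningChillers?: number;
}

interface WhatIfResult {
  feasible: boolean;
  predictedKw: number;
  deltaKw: number; // negative = saving
  reason?: string;
}

function whatIf(plant: Plant, scenario: Scenario): WhatIfResult {
  // Feasibility: enough chillers must run to carry the current load.
  const needed = Math.ceil(plant.loadRt / plant.chillerCapacityRt);
  if (scenario.runningChillers !== undefined && scenario.runningChillers < needed) {
    return { feasible: false, predictedKw: plant.measuredKw, deltaKw: 0, reason: "need ≥" + needed + " chillers" };
  }
  // An empty scenario has a delta of 0 and reproduces the measured power.
  const deltaC = scenario.chwSetpointDeltaC || 0;
  const predictedKw = plant.measuredKw * (1 - CHW_SENSITIVITY_PER_DEG_C * deltaC);
  return { feasible: true, predictedKw: predictedKw, deltaKw: predictedKw - plant.measuredKw };
}
```

The cases assert the model's shape rather than its numbers: zero delta for `{}`, opposite signs for +1°C and −1°C, and `feasible=false` when too few chillers are staged.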

5.10 EMS — Energy Monitoring (Phase 11)

Objective: Monitoring, demand control, anomaly FDD, M&V baseline (weather/calendar), tariff, carbon, ESG, bill reconciliation, demand-response, NILM, widgets. Maps to AC11.1–11.10.

ID	Title	Type	Pri	Steps	Expected
TC-EMS-001	Energy summary	F	P1	`GET /energy/summary`	demand, kWh, peak, PF, base load, per-meter shares
TC-EMS-002	Anomaly finding	I	P1	`POST /findings/scan?module=ems`	High base-load flagged; PF/load-factor/dominant-circuit true/false correctly
TC-EMS-003	M&V baseline	F	P1	`GET /energy/baseline`	R²/CV(RMSE)/NMBE returned; `meetsIpmvp` honest (false on weak data)
TC-EMS-004	M&V weather flag	F	P2	Same, no OAT source	`weatherNormalised=false`, `weatherSource=null`, method=calendar
TC-EMS-005	Tariff bill	F	P1	`GET /energy/tariff`	Block kWh sum to total; demand+fixed; est monthly bill
TC-EMS-006	Carbon factors	F	P2	`GET`/`POST /energy/carbon`	Get default; update marketFactor persists
TC-EMS-007	Carbon→ESG flow	I	P1	Lower market factor → `GET /energy/esg`	modelledReduction rises accordingly
TC-EMS-008	ESG report	F	P1	`GET /energy/esg`	Scope 2 location+market, intensity, target on/off-track, frameworks
TC-EMS-009	Bill reconcile	F	P1	`POST /energy/bills {billedKwh…}`	variance% computed; >5% → `flagged`, status set
TC-EMS-010	DR no-sheddable	N	P2	`POST /energy/dr` with none marked	`status=no_sheddable`, achieved 0
TC-EMS-011	DR event	I	P1	Mark circuits sheddable → run DR	Sheds circuits; M&V avoided kWh = shed kW × hours; status met/partial
TC-EMS-012	NILM	F	P2	`GET /energy/nilm`	HVAC/base/plug/lighting shares sum ~100%; labelled rule-based
TC-EMS-013	Carbon RBAC	S	P1	Member `POST /energy/carbon`	403 (member lacks `ems.carbon.update`)
TC-EMS-014	Control RBAC	S	P1	Viewer `POST /energy/control`	403
TC-EMS-015	Widgets	F	P2	Catalogue	7 EMS widgets present + render

Success criteria (per AC11.x): KPIs correct; anomaly detector fires on seeded conditions with correct true/false; baseline reports valid IPMVP statistics and honest fit; tariff arithmetic within ±1% of a manual calc; carbon factors flow into the ESG report; bill variance flags at threshold; DR computes M&V; NILM shares normalise; member/viewer blocked from restricted mutations.
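
Two of these criteria are plain arithmetic and can be checked against a reference calculation. The sketch below makes several assumptions: variance is taken relative to metered energy, the non-flagged status is called `ok`, and `met` means the achieved shed reached the target. Only `flagged`, `no_sheddable`, `met`, `partial`, and the 5% threshold come from the cases above.

```typescript
// Variance above this absolute percentage flags the bill (TC-EMS-009).
const VARIANCE_FLAG_THRESHOLD_PCT = 5;

interface Reconciliation {
  variancePct: number;
  status: "ok" | "flagged";
}

// Assumed definition: (billed - metered) / metered, as a percentage.
function reconcileBill(billedKwh: number, meteredKwh: number): Reconciliation {
  if (meteredKwh <= 0) throw new RangeError("meteredKwh must be positive");
  const variancePct = ((billedKwh - meteredKwh) / meteredKwh) * 100;
  const flagged = Math.abs(variancePct) > VARIANCE_FLAG_THRESHOLD_PCT;
  return { variancePct: variancePct, status: flagged ? "flagged" : "ok" };
}

// Avoided energy for a DR event: shed power times event duration (TC-EMS-011).
function drAvoidedKwh(shedKw: number, hours: number): number {
  return shedKw * hours;
}

type DrStatus = "no_sheddable" | "met" | "partial";

// With no circuits marked sheddable the event cannot shed anything (TC-EMS-010).
function drStatus(sheddableCircuits: number, achievedKw: number, targetKw: number): DrStatus {
  if (sheddableCircuits === 0) return "no_sheddable";
  return achievedKw >= targetKw ? "met" : "partial";
}
```

Because live meter readings may be stale, these helpers are best fed with the values the API itself returns, so the case checks the arithmetic rather than a specific magnitude.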

6. Master scorecard

Dimension scores are 0–5. Completion% = D1×20. Readiness = 0.30·D1+0.25·D2+0.15·D3+0.10·D4+0.10·D5+0.10·D6.

Module	D1 Compl	D2 Test	D3 Sec	D4 Err	D5 Data	D6 UX	Compl %	Readiness	Band
Core Platform (P1–2)	5.0	4.5	5.0	4.5	5.0	4.0	100%	4.73 → cap 4.4	RC
Data Platform (P3)	5.0	4.3	4.5	4.0	4.5	4.0	100%	4.50 → cap 4.4	RC
SFMS Core (P4)	5.0	4.5	4.5	4.5	4.5	4.5	100%	4.65 → cap 4.4	RC
Dashboards (P5)	5.0	4.2	4.5	4.0	4.0	4.5	100%	4.48 → cap 4.4	RC
AI Second Brain (P6)	4.5	3.8	4.5	4.0	4.0	4.0	90%	4.18	RC
Workflow Studio (P7)	4.7	4.0	4.5	4.0	4.5	4.0	94%	4.34	RC
Mobile / PWA (P8)	4.3	3.6	4.0	3.8	4.0	4.2	86%	3.99	RC
i18n	4.2	3.8	4.5	4.0	4.5	4.0	84%	4.14	RC
CPMS (P10)	4.3	4.5	4.7	4.3	4.5	4.3	86%	4.43 → cap 4.4	RC
EMS (P11)	4.3	4.4	4.7	4.3	4.5	4.3	86%	4.41 → cap 4.4	RC
QA/Security/Deploy (P9)	2.0	2.0	2.5	2.5	3.0	2.0	40%	2.23	Foundation

Caps applied: modules whose computed readiness exceeds 4.4 are capped at 4.4 (RC) because the Phase 9 production gate (external pentest, prod deploy, DR drill, load test) is not yet passed. Readiness values are computed from the dimension scores with the §4.3 formula and rounded to two decimals. The CPMS/EMS D1 scores reflect simulated control (no live BMS), synthetic telemetry, and the explicitly deferred items (ML NILM, live weather feed).

6.1 Overall system readiness

Overall = mean(capped module readiness, excluding P9, which is the gate itself) ≈ 4.30 / 5 → Release-Candidate (UAT-ready), pre-production.

Because the open Phase 9 gate blocks production, production readiness takes the Phase 9 row's score, Foundation (2.23), until the gate closes. The feature surface is UAT/demo-ready; production sign-off is blocked on Phase 9.

7. Go / No-Go

Decision	Criteria	Current
UAT / Demo	All P1 functional + RBAC cases pass; no data-integrity defects	✅ GO
Production	Above + Phase 9 gate (pentest, prod deploy, DR drill, load/perf, training)	⛔ NO-GO (Phase 9 open)

8. Known limitations & non-defects (do not fail these)

Simulated control — CPMS/EMS control writes are audited but not pushed to a live BMS; responses carry simulated: true. Expected.
Telemetry freshness — the seeded energy-meter simulator can go stale (>2 h/>24 h) while chiller telemetry stays fresh; live powerKw and DR shed magnitude may read 0. Not a logic defect — NILM/M&V use longer windows to stay demonstrable. Assert invariants, not absolute live values.
Honest statistics — M&V may report meetsIpmvp=false on the flat synthetic data; this is correct behaviour, not a bug.
Weather normalisation — the OAT regression path is built but inactive until an OAT data point is ingested (settings.weatherTag, default WEATHER.oat_c).
Heuristic NILM — rule-based, explicitly not ML disaggregation (deferred, post-launch).
AI narratives — non-deterministic; assert grounded facts, not wording.
Two-person commit — requires two distinct operator identities; single-user dev bypass verifies the self-approval rejection only.
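
The control-safety expectations in TC-CPMS-010 to TC-CPMS-014 can be summarised as two decision functions. This is a hypothetical model for deriving expected outcomes, not the app's code: the type and function names are assumptions, and it assumes `override` bypasses only the breaker, not the two-person rule.

```typescript
interface ControlRequest {
  requestedBy: string;
  setpointDeltaC: number;
  override?: boolean;
}

type Outcome =
  | { status: "applied" }
  | { status: "pending_approval"; reasons: string[] }
  | { status: "rejected"; httpStatus: 403 | 409; code: string };

// Changes of this size or larger need a second operator (TC-CPMS-012).
const RISKY_DELTA_C = 2;

// First gate: the circuit breaker. Second gate: the two-person rule.
function submitRequest(req: ControlRequest, breakerTripped: boolean): Outcome {
  if (breakerTripped && !req.override) {
    return { status: "rejected", httpStatus: 409, code: "circuit_breaker" };
  }
  if (Math.abs(req.setpointDeltaC) >= RISKY_DELTA_C) {
    return { status: "pending_approval", reasons: ["setpoint change >= " + RISKY_DELTA_C + " °C"] };
  }
  return { status: "applied" };
}

// A pending request commits only when approved by a different identity.
function approveRequest(req: ControlRequest, approver: string): Outcome {
  if (approver === req.requestedBy) {
    return { status: "rejected", httpStatus: 403, code: "two_person_required" };
  }
  return { status: "applied" };
}
```

Under the single-user dev bypass only the self-approval branch of `approveRequest` is reachable, which is why the commit path needs two seeded accounts.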

9. Execution checklist (per test run)

`pnpm --filter @atlas/web typecheck` → 0 errors
`pnpm seed` → roles + Brick tags synced
Start dev; confirm 401 on a protected route (auth enforced)
Run P1 cases for every module; record pass/fail
Run RBAC negative cases as viewer and member
Recompute dimension scores from actuals; update §6 scorecard
File defects with TC-<MODULE>-NNN reference + repro

10. Sign-off

Role	Name	Verdict	Date
QA Lead		UAT go / no-go
Engineering Lead
Product Owner
Security (Phase 9)		production gate

Tag a passing UAT build; production sign-off is recorded in `_gates/Gate_G9_signoff.md` after the Phase 9 gate closes.