Platform capabilities

Everything in one product

Each capability maps directly to a paper — no marketing claim on this page is unsupported by published research.

Paper A · C

Defect-Driven Test Prioritisation

Run the production model gb-paper1-v4-fixed_age against any pull request. Top 10% of files capture 43.82% of defects.
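
A top-10% capture figure like the one above is a ranking metric: sort files by predicted risk, take the top slice, and count the share of known defects that fall inside it. The sketch below illustrates that calculation only; the function name, its arguments, and the sample data are hypothetical and are not the TestForge AI API.

```python
from typing import Sequence


def topk_defect_capture(
    risk_scores: Sequence[float],
    defect_counts: Sequence[int],
    fraction: float = 0.10,
) -> float:
    """Share of all defects located in the top `fraction` of files by risk.

    Hypothetical helper for illustration; not part of any published API.
    """
    if len(risk_scores) != len(defect_counts):
        raise ValueError("risk_scores and defect_counts must have equal length")
    total = sum(defect_counts)
    if total == 0:
        return 0.0
    # Rank file indices from highest to lowest predicted risk.
    ranked = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i], reverse=True)
    # Always inspect at least one file, even for very small change sets.
    k = max(1, int(len(ranked) * fraction))
    captured = sum(defect_counts[i] for i in ranked[:k])
    return captured / total


# Made-up data: ten files, five defects in total.
scores = [0.9, 0.1, 0.8, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.05]
defects = [3, 0, 1, 0, 0, 0, 0, 0, 1, 0]
capture_at_10 = topk_defect_capture(scores, defects, fraction=0.10)
```

With the made-up data, the single highest-risk file holds three of the five defects, so the top 10% captures 60% of them.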

Paper E

LLM-Powered Test Generation

Requirements in, executable tests out. Deterministic rule verifier in the loop; 96.3% first-pass verification.

Paper D

Self-Healing Automation

Tree-ensemble locator ranking recovers broken selectors at runtime with confidence-gated heal-vs-flag decisions.
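
A confidence-gated heal-vs-flag decision can be pictured as a threshold check on the ranker's output. The sketch below assumes the ranker returns a confidence in [0, 1] per candidate selector; the `Candidate` type, the `heal_threshold` and `margin` values, and the margin rule itself are illustrative assumptions, not the production policy.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    selector: str
    confidence: float  # assumed ranker score in [0, 1]


def heal_or_flag(
    candidates: List[Candidate],
    heal_threshold: float = 0.85,  # hypothetical value
    margin: float = 0.10,          # hypothetical value
) -> Tuple[str, Optional[str]]:
    """Return ("heal", selector) or ("flag", best_guess_or_None)."""
    if not candidates:
        return ("flag", None)
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    best = ranked[0]
    runner_up = ranked[1].confidence if len(ranked) > 1 else 0.0
    # Heal only when the best candidate is both confident and clearly
    # ahead of the alternatives; otherwise hand the case to a human.
    if best.confidence >= heal_threshold and best.confidence - runner_up >= margin:
        return ("heal", best.selector)
    return ("flag", best.selector)


decision = heal_or_flag([Candidate("#submit", 0.95), Candidate("button.primary", 0.40)])
```

Flagging still returns the best guess, so a reviewer sees a suggested replacement rather than only a failure.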

Paper B

Cross-Repository Risk Profiles

Leave-one-repository-out validated transfer for teams whose codebases don't look like the benchmark set.
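
Leave-one-repository-out validation holds out every sample from one repository per fold, so each score reflects transfer to a codebase the model never saw. The sketch below shows the splitting step only, in plain Python; the function name and the sample repository labels are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_repository_out(
    repo_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_repo, train_indices, test_indices), one fold per repository."""
    for held_out in sorted(set(repo_ids)):
        train = [i for i, repo in enumerate(repo_ids) if repo != held_out]
        test = [i for i, repo in enumerate(repo_ids) if repo == held_out]
        yield held_out, train, test


# One repository label per sample (made-up labels).
repos = ["a", "a", "b", "c", "c"]
folds = list(leave_one_repository_out(repos))
```

Each fold trains on all other repositories and evaluates on the held-out one, which keeps files from the same repository from leaking across the split.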

Paper C

CI/CD & Webhook Integration

GitHub Actions, advisory and gating rollouts, pre-merge risk advice, and a stable HTTP contract for IDE plugins and CLI tools.

Governance

AI Governance Layer

Payload controls, content redaction, author privacy hashing, prompt versioning, and verifiable audit trails on every call.
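
Author privacy hashing is commonly done with a keyed hash, so the same author always maps to the same pseudonym while the original identity cannot be recovered without the key. The sketch below uses HMAC-SHA256 from the Python standard library; the choice of HMAC, the normalisation step, and the function name are assumptions made for illustration, not a description of the platform's actual scheme.

```python
import hashlib
import hmac


def pseudonymise_author(email: str, secret_key: bytes) -> str:
    """Map an author email to a stable, keyed pseudonym (hex digest).

    Hypothetical helper: the real platform may use a different scheme.
    """
    # Normalise so that "Dev@Example.com " and "dev@example.com" match.
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


key = b"example-key-rotate-me"  # placeholder; keep real keys in a secret store
token = pseudonymise_author("Dev@Example.com ", key)
```

A keyed hash is used rather than a bare SHA-256 because email addresses are easy to guess, and an unkeyed digest could be reversed by hashing a list of candidates.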

Paper F · Vision

Post-Execution Defect Attribution

Fuse failure signals with SHAP-explained priors to produce triage advice, not just ranked file lists.

Platform

Quality Analytics Dashboard

Real-time view of risk distribution, top-k coverage, and self-heal vs. flag rates across services.

How TestForge AI works

Four steps from sign-up to a pre-merge gate that catches defects before they ship.

  1. Connect your repository

    OAuth into GitHub; TestForge AI mirrors the commit history through the Repository Analytics Engine and produces a process-metric snapshot per file.

  2. Calibrate the risk model

    The cross-repository transfer pipeline from Paper B picks the best starting point; an in-team calibration pass uses Platt scaling to tune for your defect rate.

  3. Wire into CI/CD

    Install the GitHub Actions workflow. Start in advisory mode (PR comments only). Promote to gating mode when the team is comfortable.

  4. Generate & verify tests

    Drop a requirement into the RAITG service. Receive Playwright + BDD specs, the rule-verifier report, and a symbolic-mutation-indicator adequacy score.
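
The Platt scaling mentioned in step 2 fits a one-dimensional logistic model that maps a raw model score to a calibrated probability. The sketch below is a minimal pure-Python version fitted by gradient descent; it assumes the raw scores are real-valued decision scores, omits the smoothed targets from Platt's original formulation, and uses hypothetical function names and hyperparameters.

```python
import math
from typing import Sequence, Tuple


def _sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(
    scores: Sequence[float],
    labels: Sequence[int],
    lr: float = 0.1,      # hypothetical learning rate
    epochs: int = 5000,   # hypothetical iteration budget
) -> Tuple[float, float]:
    """Fit p = sigmoid(a * score + b) by minimising log-loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = _sigmoid(a * s + b) - y  # derivative of log-loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score: float, a: float, b: float) -> float:
    """Convert a raw score to a calibrated defect probability."""
    return _sigmoid(a * score + b)


# Made-up held-out scores with known outcomes (1 = file had a defect).
raw = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
had_defect = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = fit_platt(raw, had_defect)
```

Only two parameters are learned, so a small sample of a team's own labelled history is enough to shift probabilities toward its defect rate without retraining the underlying model.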

What sets TestForge AI apart

Research-backed, not vibe-coded

Every claim on the product page maps to a published or in-flight paper with reproducible artefacts.

Calibration before accuracy

Production selection criteria privilege calibration, top-k stability, and threshold sensitivity over raw AUC.

Multi-language, multi-framework

Java, JavaScript, Python, and Go on the roadmap. Frameworks: Selenium, Playwright, Cypress, PyTest, JUnit.

Verified LLM generation

Deterministic rule verifier in every loop. The LLM never gets the last word.

Governance is architecture

Payload controls and prompt receipts are not opt-in; they're how the platform talks to models.

From research to production

The exact code that produced the papers' numbers is the code running in production.

Live in public beta

Ready to try TestForge AI?

Beta access is open. Bring a repository and a couple of requirements — you'll have a working risk profile and a generated test suite inside an hour.