TestForge AI — Public Beta | Vijay Prasad Javvadi

Platform capabilities

Everything in one product

Each capability maps directly to a paper — no marketing claim on this page is unsupported by the underlying research (papers under peer review; preprints where noted).

Paper A · C

Defect-Driven Test Prioritisation

Run the production model gb-paper1-v4-fixed_age against any pull request. Top 10% of files capture 43.82% of defects.
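
The "top 10% of files" figure is a top-k capture metric: rank files by predicted risk, take the top slice, and measure what share of known defects falls inside it. A minimal sketch of that calculation on toy data follows. The function name, the tuple layout, and the sample values are illustrative assumptions, not the TestForge AI API or the paper's evaluation code.

```python
from math import ceil


def top_k_defect_capture(files, k_fraction=0.10):
    """Share of all defects found in the top `k_fraction` of files by risk.

    `files` is a list of (risk_score, defect_count) tuples (hypothetical layout).
    """
    total_defects = sum(defects for _, defects in files)
    if not files or total_defects == 0:
        return 0.0
    # Highest predicted risk first.
    ranked = sorted(files, key=lambda f: f[0], reverse=True)
    # Small epsilon guards against float noise such as 30 * 0.1 > 3.
    k = max(1, ceil(len(ranked) * k_fraction - 1e-9))
    captured = sum(defects for _, defects in ranked[:k])
    return captured / total_defects


# Toy data: ten files, five defects in total.
toy_files = [(0.90, 3), (0.80, 1), (0.40, 0), (0.30, 1), (0.20, 0),
             (0.20, 0), (0.10, 0), (0.10, 0), (0.05, 0), (0.01, 0)]
share = top_k_defect_capture(toy_files, 0.10)
print(share)  # 0.6 on this toy data
```

On real repositories the same calculation would run over the model's per-file scores for a pull request or release window.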

Paper E

LLM-Powered Test Generation

Requirements in, executable tests out. Deterministic rule verifier in the loop; 96.3% first-pass verification.

Paper D

Self-Healing Automation

Tree-ensemble locator ranking recovers broken selectors at runtime with confidence-gated heal-vs-flag decisions.
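
A confidence-gated heal-vs-flag decision can be sketched as follows: the ranker scores candidate replacement selectors, and the best one is applied automatically only if its confidence clears a threshold. Otherwise the step is flagged for a human. The `Candidate` type, the 0.85 threshold, and the return shape are hypothetical; the source does not publish the actual gate.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    selector: str      # candidate replacement locator
    confidence: float  # ranker score in [0, 1] (assumed scale)


def heal_or_flag(candidates, threshold=0.85):
    """Return ("heal", selector) or ("flag", best_guess_or_None)."""
    if not candidates:
        return ("flag", None)
    best = max(candidates, key=lambda c: c.confidence)
    if best.confidence >= threshold:
        return ("heal", best.selector)
    # Below the gate: surface the best guess but do not apply it.
    return ("flag", best.selector)


decision = heal_or_flag([
    Candidate("#submit-btn", 0.92),
    Candidate("button.primary", 0.40),
])
```

Raising the threshold trades fewer silent mis-heals for more flagged steps.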

Paper B

Cross-Repository Risk Profiles

Leave-one-repository-out validated transfer for teams whose codebases don't look like the benchmark set.
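
Leave-one-repository-out validation holds out every sample from one repository per fold, so the score reflects transfer to a codebase the model has never seen. A minimal sketch of the split logic, assuming samples are (repo_name, features, label) tuples; that layout is an illustration, not the platform's data model.

```python
def leave_one_repo_out(samples):
    """Yield (held_out_repo, train, test) with one repository held out per fold."""
    repos = sorted({repo for repo, _, _ in samples})
    for held_out in repos:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


toy_samples = [
    ("repo_a", [1.0], 0),
    ("repo_b", [2.0], 1),
    ("repo_a", [3.0], 1),
    ("repo_c", [4.0], 0),
]
folds = list(leave_one_repo_out(toy_samples))
```

Each fold trains on the remaining repositories and evaluates only on the held-out one, so no repository contributes to both sides of a split.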

Paper C

CI/CD & Webhook Integration

GitHub Actions, advisory and gating rollouts, pre-merge risk advice, and a stable HTTP contract for IDE plugins and CLI tools.

Governance

AI Governance Layer

Payload controls, content redaction, author privacy hashing, prompt versioning, and verifiable audit trails on every call.
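
One common way to implement author privacy hashing is a keyed hash (HMAC) over a normalised author identifier, so the same author maps to the same pseudonym without the raw email leaving the organisation. The sketch below assumes HMAC-SHA256 and a per-tenant secret key; the source does not state which scheme TestForge AI uses, and the function name is hypothetical.

```python
import hashlib
import hmac


def pseudonymise_author(email: str, secret_key: bytes) -> str:
    """Return a stable, keyed pseudonym for an author email (assumed scheme)."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would come from a secrets manager.
tenant_key = b"example-tenant-key"
token = pseudonymise_author("Dev@Example.com", tenant_key)
```

Using a keyed hash rather than a bare SHA-256 makes it harder to reverse pseudonyms by hashing a list of known email addresses.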

Paper F

Post-Execution Defect Attribution

Fuse failure signals with repository priors to produce triage advice, not just ranked file lists. Under review at Springer EMSE.

Platform

Quality Analytics Dashboard

Real-time view of risk distribution, top-k coverage, and self-heal vs. flag rates across services.

How TestForge AI works

Four steps from sign-up to a pre-merge gate that catches defects before they ship.

01 Connect your repository

OAuth into GitHub; TestForge AI mirrors the commit history through the Repository Analytics Engine and produces a process-metric snapshot per file.

02 Calibrate the risk model

The cross-repository transfer pipeline from Paper B picks the best starting point; an in-team calibration pass uses Platt scaling to tune for your defect rate.
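
Platt scaling fits a one-dimensional logistic curve that maps raw model scores to calibrated probabilities. The sketch below fits the two parameters with plain gradient descent on log-loss over toy data. The learning rate, epoch count, and sample values are assumptions chosen for illustration, and this is not the platform's calibration code.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, epochs=5000):
    """Fit p = sigmoid(a * score + b) by minimising mean log-loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = _sigmoid(a * s + b) - y  # derivative of log-loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    return _sigmoid(a * score + b)


# Toy in-team data: raw risk scores and whether the file was later defective.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
was_defective = [0, 0, 0, 1, 0, 1, 1, 1]
a, b = fit_platt(raw_scores, was_defective)
calibrated = [calibrate(s, a, b) for s in raw_scores]
```

After fitting, the mean calibrated probability tracks the observed defect rate in the calibration set, which is the property that tunes the model to a team's own base rate.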

03 Wire into CI/CD

Install the GitHub Actions workflow. Start in advisory mode (PR comments only). Promote to gating mode when the team is comfortable.
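
The difference between the two rollout modes can be sketched as a single decision function: both modes produce a PR comment, but only gating mode is allowed to block the merge. The function name, the 0.8 threshold, and the result fields are hypothetical and do not describe the actual GitHub Actions workflow.

```python
def evaluate_pull_request(file_risks, mode="advisory", gate_threshold=0.8):
    """Return a PR comment and a merge decision for per-file risk scores.

    `file_risks` maps file path -> calibrated risk in [0, 1] (assumed input).
    """
    if mode not in ("advisory", "gating"):
        raise ValueError(f"unknown mode: {mode!r}")
    risky = sorted(path for path, risk in file_risks.items()
                   if risk >= gate_threshold)
    if risky:
        comment = "High-risk files: " + ", ".join(risky)
    else:
        comment = "No files above the risk threshold."
    # Advisory mode only comments; gating mode may fail the check.
    block_merge = mode == "gating" and bool(risky)
    return {"comment": comment, "block_merge": block_merge, "risky_files": risky}


risks = {"src/billing.py": 0.91, "src/utils.py": 0.12}
advisory = evaluate_pull_request(risks, mode="advisory")
gating = evaluate_pull_request(risks, mode="gating")
```

Running both modes on the same scores shows why advisory mode is a low-risk starting point: the advice is identical, and only the enforcement changes.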

04 Generate & verify tests

Drop a requirement into the RAITG service. Receive Playwright + BDD specs, the rule-verifier report, and a symbolic-mutation-indicator adequacy score.

What sets TestForge AI apart

Research-backed, not vibe-coded

Every claim on the product page maps to a paper under peer review (or a clearly-marked preprint) with reproducible artefacts.

Calibration before accuracy

Production selection criteria privilege calibration, top-k stability, and threshold sensitivity over raw AUC.

Multi-language, multi-framework

Java, JavaScript, Python, and Go on the roadmap. Frameworks: Selenium, Playwright, Cypress, PyTest, JUnit.

Verified LLM generation

Deterministic rule verifier in every loop. The LLM never gets the last word.

Governance is architecture

Payload controls and prompt receipts are not opt-in; they're how the platform talks to models.

From research to production

The exact code that produced the papers' numbers is the code running in production.

Sister tool — TestForge PaperQC. A pre-submission quality check for software-engineering manuscripts: venue fit, benchmarking against accepted peers, structure, honesty & integrity checks, and a submission-package review, with an acceptance-likelihood estimate and a prioritised action list before you submit. Free during beta. Read the overview or visit paperqc.testforge-ai.com.

Ready to try TestForge AI?