Paper A

Process-Metric-Based Defect Prediction at Scale

Six ML classifiers benchmarked head-to-head on 296,457 file instances across five mature OSS systems. Random Forest performed best: AUC 0.8998, binary F1 0.6355, macro F1 0.7595.
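The gap between the binary and macro F1 figures is easier to read with the two definitions side by side. The sketch below assumes the standard definitions (binary F1 scored on the defective class only, macro F1 as the unweighted mean of the per-class F1 scores) and uses a tiny made-up label set, not the paper's data. The function names are illustrative.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 for one class, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binary_and_macro_f1(y_true, y_pred):
    """Return (binary F1 on class 1, macro F1 over classes 0 and 1)."""
    f1_defective = f1_for_class(y_true, y_pred, 1)
    f1_clean = f1_for_class(y_true, y_pred, 0)
    return f1_defective, (f1_defective + f1_clean) / 2


# Toy labels (1 = defective, 0 = clean); illustrative only.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]
binary_f1, macro_f1 = binary_and_macro_f1(y_true, y_pred)
```

On imbalanced data the clean class is usually easier to predict, so macro F1 tends to sit above binary F1, which matches the ordering of the two figures reported above.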

Paper B

Cross-Repository Transfer Learning

Leave-one-repository-out evaluation across five heterogeneous OSS systems. Cross-repository AUC 0.867 and F1 0.631; analyses the AUC–F1 asymmetry and identifies defect-rate mismatch as the primary driver of degradation.
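As a rough illustration of the leave-one-repository-out protocol, the sketch below builds one fold per repository, holding that repository out for testing and training on the rest. It also reports the difference in defect rate between the two sides of each fold, since the card names defect-rate mismatch as the main degradation driver. The helper names and the toy data are assumptions for illustration, not the paper's pipeline.

```python
from collections import defaultdict


def leave_one_repo_out(repo_labels):
    """Yield (held_out_repo, train_indices, test_indices) for each repository."""
    by_repo = defaultdict(list)
    for idx, repo in enumerate(repo_labels):
        by_repo[repo].append(idx)
    for held_out in sorted(by_repo):
        train = sorted(
            i for repo, idxs in by_repo.items() if repo != held_out for i in idxs
        )
        yield held_out, train, list(by_repo[held_out])


def defect_rate_gap(defect_flags, train_idx, test_idx):
    """Test-side defect rate minus train-side defect rate for one fold."""
    train_rate = sum(defect_flags[i] for i in train_idx) / len(train_idx)
    test_rate = sum(defect_flags[i] for i in test_idx) / len(test_idx)
    return test_rate - train_rate


# Toy data: one entry per file instance (hypothetical repository names).
repos = ["a", "a", "b", "c", "b"]
flags = [1, 0, 0, 1, 0]
folds = list(leave_one_repo_out(repos))
gaps = {repo: defect_rate_gap(flags, tr, te) for repo, tr, te in folds}
```

A large absolute gap for a fold flags a repository whose base rate differs from the training pool, which is the situation in which a threshold-dependent metric such as F1 can fall while a ranking metric such as AUC holds up.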

Paper C

Risk-Based CI/CD Test Prioritisation

Top 10% of files capture 43.82% of defects (4.37× risk-vs-random lift; 81.4% of the oracle ceiling). Production deployment as a FastAPI microservice triggered by GitHub Actions.
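The three headline quantities (defect capture in the top slice, lift over random selection, and share of the oracle ceiling) can be computed as in the sketch below. It assumes that lift is capture divided by the fraction of files inspected, and that the oracle ranks files by their true defect counts. Both are common conventions, but the paper's exact definitions may differ. The function names and toy numbers are illustrative.

```python
def capture_at_top(scores, defects, fraction=0.10):
    """Return (share of defects in the top-ranked slice, actual slice fraction)."""
    n = len(scores)
    n_top = max(1, round(n * fraction))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    total = sum(defects)
    captured = sum(defects[i] for i in ranked[:n_top])
    return (captured / total if total else 0.0), n_top / n


def prioritisation_summary(scores, defects, fraction=0.10):
    """Return (capture, lift over random, share of the oracle ceiling)."""
    capture, actual_fraction = capture_at_top(scores, defects, fraction)
    # Oracle ceiling: rank files by their true defect counts.
    oracle, _ = capture_at_top(defects, defects, fraction)
    lift = capture / actual_fraction  # random selection captures ~actual_fraction
    return capture, lift, (capture / oracle if oracle else 0.0)


# Toy example: ten files, inspect the top 20% by predicted risk.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
defects = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
capture, lift, oracle_share = prioritisation_summary(scores, defects, fraction=0.2)
```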

Paper D

Self-Healing Web Test Automation

Tree-ensemble locator ranking over eight DOM feature families. Mutation-based evaluation across 2,400 events and seven refactor classes; heuristic + ML hybrid with calibrated confidence gating.
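One plausible reading of "confidence gating" in a heuristic + ML hybrid is sketched below: the ML ranker's top candidate is accepted only when its calibrated probability clears a threshold, and otherwise the heuristic locator is used. The gating rule, the 0.8 threshold, and the function name are all assumptions made for illustration. The card does not specify how the two components are combined.

```python
def choose_locator(ml_candidates, heuristic_locator, threshold=0.8):
    """Pick a replacement locator for a broken element.

    ml_candidates: list of (locator, calibrated_probability) pairs from the
    ranker; heuristic_locator: fallback produced by rule-based matching.
    Returns (locator, source) where source is "ml" or "heuristic".
    """
    if ml_candidates:
        best_locator, best_prob = max(ml_candidates, key=lambda c: c[1])
        if best_prob >= threshold:
            return best_locator, "ml"
    return heuristic_locator, "heuristic"


# Hypothetical candidates after a DOM refactor renamed a button id.
confident = choose_locator(
    [("#submit-btn", 0.93), ("button.primary", 0.41)], "//button[text()='Submit']"
)
uncertain = choose_locator(
    [("#submit-btn", 0.55), ("button.primary", 0.41)], "//button[text()='Submit']"
)
```

Gating on a calibrated probability only makes sense if the score behaves like a probability, which is presumably why the card stresses calibration.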

Paper E

LLM-Based Test Case Generation

RAITG framework converting 312 natural-language requirements across three domains into executable tests. 94.1% requirement coverage, 96.3% first-pass verification, 68% effort reduction.

Paper F · Vision

Post-Execution Defect Attribution

Fusing dynamic test-failure signals (stack traces, suspect-set ranking) with SHAP-explained repository priors to produce practitioner-actionable triage, not just ranked file lists.

Cross-cutting

AI Governance for Quality Engineering

Unified governance rail across all services: payload controls, content redaction, author privacy hashing, prompt versioning, and verifiable audit trails for every model call.
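Author privacy hashing could take the form of a keyed hash, so that the same author always maps to the same pseudonym while the raw identity never leaves the service. The sketch below uses HMAC-SHA256 from the Python standard library. The choice of HMAC, the normalisation step, and the function name are assumptions, since the card does not describe the actual scheme.

```python
import hashlib
import hmac


def pseudonymise_author(author, secret_key):
    """Map an author identifier to a stable pseudonym with a keyed hash.

    A keyed hash (rather than a bare SHA-256) makes it harder to recover
    identities by hashing a list of known e-mail addresses.
    """
    normalised = author.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


# Hypothetical key and author; in practice the key would come from a secret store.
key = b"example-secret-key"
token_a = pseudonymise_author("Dev@Example.org", key)
token_b = pseudonymise_author("  dev@example.org ", key)
token_other_key = pseudonymise_author("dev@example.org", b"another-key")
```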

Methodology

Reproducible Empirical Software Engineering

Every result on this site is reproducible from on-disk artefacts: trained models, CSVs, JSONs, and figures. Threats to validity are disclosed honestly, including the resolved file_age_days sign bug.