Process-Metric-Based Defect Prediction at Scale
Six ML classifiers benchmarked head-to-head on 296,457 file instances across five mature OSS systems. Random Forest performs best: AUC 0.8998, binary F1 0.6355, macro F1 0.7595.
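The gap between the binary and macro F1 figures follows from how each is averaged. The sketch below is a minimal illustration, assuming labels of 1 for defective and 0 for clean files; the function names are hypothetical and this is not the benchmark's own evaluation code.

```python
def f1_from_counts(tp, fp, fn):
    """F1 for a single class from its confusion counts; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binary_and_macro_f1(y_true, y_pred):
    """Return (binary F1 on the defective class, macro F1 over both classes).

    Assumption: 1 marks a defective file and 0 a clean file.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    f1_defective = f1_from_counts(tp, fp, fn)
    # From the clean class's point of view, the roles of the counts swap:
    # its true positives are tn, its false positives are fn, and its
    # false negatives are fp.
    f1_clean = f1_from_counts(tn, fn, fp)
    return f1_defective, (f1_defective + f1_clean) / 2
```

Because clean files are usually the majority and easier to classify, the clean-class F1 tends to pull the macro average above the binary score, which matches the direction of the two figures reported above.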
Cross-Repository Transfer Learning
Leave-one-repository-out evaluation across five heterogeneous OSS systems. Cross-repository AUC 0.867 and F1 0.631, with an AUC–F1 asymmetry analysis and defect-rate mismatch identified as the primary degradation driver.
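A leave-one-repository-out split can be sketched in a few lines. The version below is an illustrative, standard-library-only sketch, not the site's pipeline: `leave_one_repo_out` and `defect_rate` are hypothetical helper names, and comparing the train and held-out defect rates is one simple way to surface the mismatch mentioned above.

```python
from collections import defaultdict


def leave_one_repo_out(repo_labels):
    """Yield (held_out_repo, train_indices, test_indices), one per repository.

    repo_labels[i] names the repository that file instance i belongs to.
    """
    by_repo = defaultdict(list)
    for idx, repo in enumerate(repo_labels):
        by_repo[repo].append(idx)

    for held_out in sorted(by_repo):
        test_idx = by_repo[held_out]
        # Train on every instance that does not come from the held-out repo.
        train_idx = sorted(
            i for repo, idxs in by_repo.items() if repo != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def defect_rate(labels, indices):
    """Fraction of the selected instances labelled defective (1)."""
    return sum(labels[i] for i in indices) / len(indices)
```

For each fold, a model would be fitted on `train_idx` and scored on `test_idx`; a large difference between `defect_rate(labels, train_idx)` and `defect_rate(labels, test_idx)` flags a fold where a fixed decision threshold is likely to hurt F1 more than AUC.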
Risk-Based CI/CD Test Prioritisation
The top 10% of risk-ranked files capture 43.82% of defects (4.37× lift over random selection; 81.4% of the oracle ceiling). Deployed in production as a FastAPI microservice triggered by GitHub Actions.
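The capture, lift, and oracle figures can be computed from a risk score and a defect count per file. The sketch below assumes one common set of definitions (random selection captures, in expectation, the same fraction of defects as the fraction of files inspected; the oracle ranks files by their true defect counts); the exact definitions behind the figures above are not restated here, and `capture_at_fraction` is a hypothetical name.

```python
def capture_at_fraction(scores, defects, fraction=0.10):
    """Summarise how well a risk ranking concentrates defects in the top files.

    scores[i]  : predicted risk for file i (higher means riskier)
    defects[i] : number of defects actually found in file i (non-negative)
    """
    total = sum(defects)
    if total == 0:
        raise ValueError("no defects to capture")

    # Number of files inspected; rounded to avoid float artefacts, at least 1.
    n_top = max(1, round(fraction * len(scores)))
    inspected = n_top / len(scores)

    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    capture = sum(defects[i] for i in ranked[:n_top]) / total

    # Oracle ceiling: the best capture any ranking could reach with n_top files.
    oracle = sum(sorted(defects, reverse=True)[:n_top]) / total

    return {
        "capture": capture,
        "lift": capture / inspected,       # versus expected random capture
        "oracle_ratio": capture / oracle,  # share of the ceiling achieved
    }
```

In a CI setting, the same ranking would decide which files' tests run first; the summary above is only the offline check that the ranking is worth trusting.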
Self-Healing Web Test Automation
Tree-ensemble locator ranking over eight DOM feature families. Mutation-based evaluation across 2,400 events and seven refactor classes; heuristic + ML hybrid with calibrated confidence gating.
LLM-Based Test Case Generation
RAITG framework converting 312 natural-language requirements across three domains into executable tests. 94.1% requirement coverage, 96.3% first-pass verification, 68% effort reduction.
Post-Execution Defect Attribution
Fusing dynamic test-failure signals (stack traces, suspect-set ranking) with SHAP-explained repository priors to produce practitioner-actionable triage, not just ranked file lists.
AI Governance for Quality Engineering
Unified governance rail across all services: payload controls, content redaction, author privacy hashing, prompt versioning, and verifiable audit trails for every model call.
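Two of these controls, author privacy hashing and a verifiable audit trail, can be illustrated with the standard library. This is a sketch under stated assumptions, not the site's implementation: it assumes authors are pseudonymised with a keyed HMAC-SHA256 and that audit records are linked in a hash chain; `hash_author`, `append_audit_record`, and `verify_trail` are hypothetical names.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def hash_author(email, secret_key):
    """Pseudonymise an author identifier with a keyed hash (assumed scheme)."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()


def _entry_hash(prev_hash, record):
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_audit_record(trail, record):
    """Append a record whose hash also covers the previous entry's hash."""
    prev_hash = trail[-1]["entry_hash"] if trail else GENESIS
    trail.append({
        "prev_hash": prev_hash,
        "record": record,
        "entry_hash": _entry_hash(prev_hash, record),
    })
    return trail


def verify_trail(trail):
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in trail:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

A record for a model call might carry fields such as a prompt version and a hashed author; changing any stored field afterwards makes `verify_trail` return `False`, which is the property a verifiable trail needs.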
Reproducible Empirical Software Engineering
Every result on this site is reproducible from on-disk artefacts: trained models, CSVs, JSONs, and figures. Threats to validity are disclosed candidly, including the since-resolved file_age_days sign bug.