Status: Completed; Length: 9 pages; Target venue: IEEE Software / IEEE Access / EMSE

Abstract. A uniform empirical comparison of six classifiers (LR, DT, RF, GB, XGB, MLP) on a 296,457-instance dataset assembled from five mature open-source systems (Elasticsearch, Spring Boot, Hadoop, Kafka, Express). Evaluation uses stratified 5-fold cross-validation with paired t-test and Wilcoxon significance testing. Random Forest performs best (AUC 0.8998, binary F1 0.6355, macro F1 0.7595); XGBoost is the closest competitor (AUC 0.8955).
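The paired significance testing mentioned above can be illustrated with a minimal standard-library sketch. The per-fold AUC values are hypothetical placeholders, not results from the paper, and the helper names `paired_t_statistic` and `wilcoxon_exact_p` are invented for this example. A real pipeline would more likely rely on scikit-learn for the stratified folds and SciPy for the tests.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev

# Hypothetical per-fold AUCs for two models (placeholders, not reported results).
rf_auc = [0.901, 0.898, 0.900, 0.899, 0.902]
xgb_auc = [0.896, 0.895, 0.896, 0.893, 0.894]


def paired_t_statistic(a, b):
    """Paired t statistic; compare against a t distribution with len(a) - 1 df."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / sqrt(len(d)))


def wilcoxon_exact_p(a, b):
    """Exact two-sided Wilcoxon signed-rank p-value.

    Enumerates all 2**n sign patterns, so it only suits small n.
    Assumes no zero differences and no tied absolute differences.
    """
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(d[i]))
    rank = {i: r for r, i in enumerate(order, start=1)}
    w_plus = sum(rank[i] for i in range(n) if d[i] > 0)
    total = n * (n + 1) // 2
    w_obs = min(w_plus, total - w_plus)
    # Count sign patterns whose statistic is at least as extreme as observed.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if min(wp, total - wp) <= w_obs:
            extreme += 1
    return extreme / 2 ** n


t_stat = paired_t_statistic(rf_auc, xgb_auc)
p_wilcoxon = wilcoxon_exact_p(rf_auc, xgb_auc)
print(f"paired t = {t_stat:.2f} (df = {len(rf_auc) - 1})")
print(f"Wilcoxon exact two-sided p = {p_wilcoxon:.4f}")
```

One caveat the sketch makes visible: with only five paired fold scores, the smallest two-sided p-value the exact Wilcoxon test can produce is 2/32 = 0.0625, even when one model wins every fold. The abstract does not say which unit the tests were run over (folds, repeats, or repositories), so this is a point to check rather than a criticism.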

Contributions. Multi-model Gini feature importance with an explicit caveat on the high-cardinality bias described by Strobl et al. (2007); documentation of the deployed model gb-paper1-v4-fixed_age; and an honest threats-to-validity disclosure that covers the file_age_days sign bug, which contaminated 40% of rows in the prior revision and has since been eliminated.

Status: Completed; Length: 5 pages; Target venue: MSR / IEEE Access / AST @ ICSE

Abstract. Systematic leave-one-repository-out validation reveals an AUC–F1 asymmetry under cross-repository conditions and identifies defect-rate mismatch as the primary qualitative predictor of transfer degradation. Cross-repository AUC ranges from 0.817 to 0.913 (mean 0.867); cross-repository F1 ranges from 0.525 to 0.707 (mean 0.631). The gap between within-repository and cross-repository F1 shrinks to 0.027 after the file_age_days correction.
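A leave-one-repository-out split, together with the defect-rate mismatch it exposes, might be sketched as follows. The repository labels, the toy defect labels, and the helper names `leave_one_repo_out` and `defect_rate` are assumptions made for illustration only. The paper's actual splitting code is not shown in this listing.

```python
from collections import defaultdict


def leave_one_repo_out(repos):
    """Yield (held_out, train_idx, test_idx), holding out one repository per fold."""
    by_repo = defaultdict(list)
    for i, repo in enumerate(repos):
        by_repo[repo].append(i)
    for held_out, test_idx in by_repo.items():
        # Train on every row that does not belong to the held-out repository.
        train_idx = [i for repo, idx in by_repo.items() if repo != held_out for i in idx]
        yield held_out, train_idx, test_idx


def defect_rate(labels, idx):
    """Fraction of defective rows (label 1) among the given indices."""
    return sum(labels[i] for i in idx) / len(idx)


# Toy data: one repository label and one binary defect label per row.
repos = ["kafka"] * 4 + ["hadoop"] * 4 + ["express"] * 2
labels = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]

folds = list(leave_one_repo_out(repos))
for held_out, train_idx, test_idx in folds:
    train_rate = defect_rate(labels, train_idx)
    test_rate = defect_rate(labels, test_idx)
    print(f"{held_out}: train rate {train_rate:.3f}, "
          f"test rate {test_rate:.3f}, mismatch {abs(train_rate - test_rate):.3f}")
```

Reporting the train/test defect-rate mismatch alongside each fold's AUC and F1 is one simple way to examine the relationship the abstract describes. A threshold-dependent metric such as F1 would be expected to react more strongly to a base-rate shift than a ranking metric such as AUC.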

What's explicitly out of scope. No 20-pair pairwise matrix, no MMD distance analysis, and no domain-adaptation methods (TCA / CORAL / DANN). These are signposted as future work rather than claimed.

Status: Manuscript ID: a37743ed-7589-48bf-8cf8-84bb599a9e2f; ORCID: 0009-0004-1192-6906; Length: 6 pages; Target venue: IEEE Software (industry) / QRS / ICST industry

Abstract. Top-k coverage analysis at k ∈ {10, 20, 30, 40, 50}% with risk-vs-random lift factors. Top 10% of files capture 43.82% of defects (4.37× lift, 81.4% of the oracle ceiling at the 18.61% base rate). Top 20% reaches 69.62%, top 30% reaches 83.64%. The production model gb-paper1-v4-fixed_age is integrated as a FastAPI microservice triggered by GitHub Actions with explicit advisory-vs-gating rollout criteria.
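The top-k coverage, lift, and oracle-ceiling quantities named above can be made concrete with a short sketch. The function names and the toy scores are assumptions for illustration. Only the 18.61% base rate is taken from the abstract.

```python
def top_k_coverage(scores, labels, k_frac):
    """Share of all defects found in the top k_frac of files ranked by risk score."""
    n_top = max(1, round(len(scores) * k_frac))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:n_top])
    return hits / sum(labels)


def oracle_ceiling(base_rate, k_frac):
    """Best achievable coverage: a perfect ranker puts every defect first."""
    return min(1.0, k_frac / base_rate)


# Toy data: ten files, two of them defective (base rate 0.2).
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

for k in (0.1, 0.2):
    coverage = top_k_coverage(scores, labels, k)
    lift = coverage / k  # random inspection of k_frac of files finds k_frac of defects
    print(f"top {k:.0%}: coverage {coverage:.2f}, lift {lift:.1f}x, "
          f"ceiling {oracle_ceiling(0.2, k):.2f}")

# At the 18.61% base rate, a perfect ranker's top 10% holds about 53.7% of defects.
print(f"oracle ceiling at k = 10%: {oracle_ceiling(0.1861, 0.10):.4f}")
```

Because the ceiling reaches 1.0 once k exceeds the base rate, comparison against the oracle is only informative for the 10% cut-off here. Small differences between ratios computed from the rounded percentages and the reported figures may come from exact file counts, which the abstract does not give.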

Implementation status. The system described in this paper is deployed in production as a FastAPI microservice integrated with GitHub Actions CI/CD pipelines.

Honest disclosure. No production telemetry is yet instrumented; the paper describes the design pattern and the offline coverage substrate. A telemetry-grounded follow-up is in progress.

Status: In progress; Target venue: ICST / ASE industry / ISSTA tool

Abstract. A framework that automatically recovers broken Selenium locators at runtime using DOM similarity analysis and tree-ensemble ranking over eight feature families (id-stability, role/aria, structural neighbourhood, text proximity, depth, attribute Jaccard, sibling-index, and XPath edit distance). A mutation-based evaluation pipeline simulates realistic UI evolution: 2,400 mutation events spanning seven refactoring classes.

Hybrid heuristic + ML ranking produces a deterministic top-1 candidate plus calibrated confidence used to gate auto-heal vs. flag-for-review.
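To illustrate the gating idea, the sketch below ranks candidate elements by a single feature, attribute Jaccard similarity, and uses that score as a stand-in for the calibrated confidence. This is a deliberate simplification: the framework described above combines eight feature families in a tree-ensemble ranker. The function names, the attribute dictionaries, and the 0.5 threshold are all hypothetical.

```python
def attribute_jaccard(a, b):
    """Jaccard similarity over (attribute, value) pairs of two elements."""
    sa, sb = set(a.items()), set(b.items())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def heal_locator(old_attrs, candidates, auto_heal_threshold=0.5):
    """Return (best_index, score, action) for the most similar candidate.

    Ties are broken by the lower index so the top-1 choice is deterministic.
    """
    scored = sorted(
        ((attribute_jaccard(old_attrs, c), i) for i, c in enumerate(candidates)),
        key=lambda t: (-t[0], t[1]),
    )
    best_score, best_idx = scored[0]
    action = "auto-heal" if best_score >= auto_heal_threshold else "flag-for-review"
    return best_idx, best_score, action


# Attributes of the element the broken locator used to match.
old = {"id": "submit-btn", "class": "btn primary", "type": "submit", "name": "submit"}

# Candidates found in the mutated DOM: the first only had its id renamed.
candidates = [
    {"id": "submit-button", "class": "btn primary", "type": "submit", "name": "submit"},
    {"id": "cancel", "class": "btn", "type": "button", "name": "cancel"},
]

print(heal_locator(old, candidates))       # best candidate clears the threshold
print(heal_locator(old, candidates[1:]))   # no similar element: route to review
```

A raw similarity score is not a calibrated probability, so in practice the threshold would be set on held-out mutation events rather than chosen by hand as it is here.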

Status: Completed / production service; Target venue: ASE / ICSE-NIER / IEEE Access

Abstract. The RAITG framework converts 312 requirements across three domains (banking, healthcare, e-commerce) into executable tests using an LLM plus deterministic rule-based verification. End-to-end results: 68% effort reduction (184h → 58.9h), requirement coverage 71.2% → 94.1%, and first-pass verification rate 96.3%. Symbolic mutation indicators are used as a domain-agnostic adequacy proxy.

Status: Vision paper; Target venue: ICSE-NIER / FSE Ideas / ASE-NIER

Abstract. A practitioner-oriented vision for fusing dynamic test-failure signals (stack traces, failure clustering, suspect-set ranking) with SHAP-explained repository priors from the deployed defect-prediction model, producing actionable triage advice rather than ranked file lists. Outlines a four-stage attribution pipeline, validation strategy, and integration points with the existing FastAPI microservice.

Consolidation note. The previous Paper 1–Paper 8 listing has been replaced by this six-paper portfolio (May 2026). Paper 2 (dataset construction) and Paper 3 (classifier comparison) were folded into Paper A to avoid salami-slicing; Paper 4 (feature importance / SHAP) is also part of Paper A with SHAP claims removed and Gini-only analysis kept honest.