System Architecture | Vijay Prasad Javvadi

Layer-by-layer walkthrough

Each layer publishes a stable contract to the one above it; each box names the paper(s) where its design is justified.

1. Data sources

Five upstream sources feed the platform: Git commit logs (history across Elasticsearch, Spring Boot, Hadoop, Kafka, Express), issue / defect history (keyword labelling on bug tags), live DOM snapshots from Selenium and Playwright sessions, natural-language requirements (312 across three domains), and test-run telemetry from real CI executions.

Sources are intentionally heterogeneous: the platform's job is to fuse them into a single risk model and a single triage signal.
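The keyword labelling applied to issue and defect history can be illustrated with a small sketch. The keyword list and the function name `is_defect_related` are assumptions for illustration; the source does not enumerate the keywords actually used.

```python
import re

# Hypothetical keyword set; the real labelling rules are not given in the source.
# Word boundaries keep "fix" from matching unrelated words such as "fixture".
_DEFECT_PATTERN = re.compile(
    r"\b(?:fix(?:es|ed)?|bugs?|defects?|faults?)\b",
    re.IGNORECASE,
)


def is_defect_related(text: str) -> bool:
    """Return True if a commit message or issue title contains a defect keyword."""
    return _DEFECT_PATTERN.search(text) is not None
```

A call such as `is_defect_related("Fixes NPE in parser")` would label the commit as defect-related, while a message about adding a test fixture would not.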

2. Feature extraction & dataset

The Repository Analytics Engine produces the seven process metrics powering Paper A: commit_count, unique_developers, code_churn, lines_added, lines_deleted, file_age_days, and commit_frequency. The May 2026 rebuild corrected the file_age_days sign bug that previously contaminated 40% of rows, lifting commit_frequency from a near-zero contributor to a meaningful secondary signal.
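As a rough illustration, the seven process metrics could be derived from per-file commit records as follows. The `FileCommit` record shape, the churn definition (added plus deleted lines), and the commit_frequency definition (commits per day of file age) are assumptions; the source names the metrics but does not define them. The explicit non-negativity check shows one way to guard against a sign error in file_age_days; it is not a description of the actual bug.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FileCommit:
    """One commit touching one file (hypothetical record shape)."""
    author: str
    timestamp: datetime
    lines_added: int
    lines_deleted: int


def process_metrics(commits: list[FileCommit], as_of: datetime) -> dict[str, float]:
    """Compute the seven process metrics for a single file."""
    if not commits:
        raise ValueError("need at least one commit")
    first_seen = min(c.timestamp for c in commits)
    added = sum(c.lines_added for c in commits)
    deleted = sum(c.lines_deleted for c in commits)
    # Age is snapshot time minus first commit time, so it must be non-negative.
    age_days = (as_of - first_seen).total_seconds() / 86400
    if age_days < 0:
        raise ValueError("as_of precedes the first commit")
    return {
        "commit_count": len(commits),
        "unique_developers": len({c.author for c in commits}),
        "code_churn": added + deleted,  # assumed definition
        "lines_added": added,
        "lines_deleted": deleted,
        "file_age_days": age_days,
        # Assumed definition: commits per day, with a one-day floor on age.
        "commit_frequency": len(commits) / max(age_days, 1.0),
    }
```

Because commit_frequency depends on file_age_days under this assumed definition, a wrong age would distort both features at once.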

In parallel, the DOM feature extractor (Paper D) emits eight feature families, and the requirement pre-processor (Paper E) normalises requirements via the RAITG prompt schema and domain glossary.

3. ML / LLM core

Four model families share a uniform training pipeline (stratified 5-fold cross-validation, SMOTE oversampling, and paired t-test / Wilcoxon significance testing):

Defect-prediction models: six classifiers benchmarked head-to-head, with Random Forest best at AUC 0.8998 (Paper A).

Cross-repository transfer: leave-one-repository-out evaluation with cross AUC 0.867 and cross F1 0.631 (Paper B).

Locator-ranking models: tree ensembles over 2,400 mutation events spanning seven refactor classes (Paper D).

LLM + rule verifier: prompt-engineered generation with deterministic rule checks and symbolic mutation indicators (Paper E).
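The two evaluation splits mentioned above, stratified 5-fold cross-validation and leave-one-repository-out, can be sketched in plain Python. This is a dependency-free illustration rather than the platform's pipeline; the function names are hypothetical. A common precaution, assumed here and not confirmed by the source, is to apply SMOTE to the training indices only so that synthetic samples never leak into the held-out fold.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign row indices to k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)  # round-robin keeps class ratios balanced
    return folds


def cv_splits(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; oversample train_idx only."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        yield train_idx, test_idx


def leave_one_repository_out(repo_of_row):
    """Yield (held_out_repo, train_idx, test_idx), holding out one repository at a time."""
    for held_out in sorted(set(repo_of_row)):
        train_idx = [i for i, r in enumerate(repo_of_row) if r != held_out]
        test_idx = [i for i, r in enumerate(repo_of_row) if r == held_out]
        yield held_out, train_idx, test_idx
```

The first split measures within-project performance; the second measures how well a model trained on the other repositories transfers to one it has never seen.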

4. Runtime services (FastAPI)

Five microservices expose the trained models behind versioned HTTP endpoints: risk-prediction API, test-prioritisation service (top-10% capture 43.82% of defects, 4.37× lift; Paper C), self-healing runtime, test-generation service (94.1% requirement coverage, 96.3% first-pass verification; Paper E), and the defect-attribution service (Paper F).
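The capture and lift figures reported for the test-prioritisation service follow a standard pattern: rank items by predicted risk, take the top fraction, and compare the share of defects found there with the share of items inspected. The sketch below shows that calculation on generic inputs; the function name and the rounding of the cut-off are assumptions, and it does not reproduce the Paper C numbers.

```python
def top_fraction_capture(risk_scores, is_defective, fraction=0.10):
    """Return (capture, lift) for the top `fraction` of items ranked by risk.

    capture = defects in the top slice / all defects
    lift    = capture / share of items actually inspected
    """
    n = len(risk_scores)
    total_defects = sum(is_defective)
    if n == 0 or total_defects == 0:
        raise ValueError("need at least one item and one defect")
    k = max(1, round(fraction * n))  # assumed rounding of the cut-off
    ranked = sorted(range(n), key=lambda i: risk_scores[i], reverse=True)
    captured = sum(is_defective[i] for i in ranked[:k])
    capture = captured / total_defects
    lift = capture / (k / n)
    return capture, lift
```

A lift of 1.0 corresponds to random ordering; values above 1.0 mean the ranking concentrates defects in the inspected slice.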

The production model in service is gb-paper1-v4-fixed_age — Gradient Boosting with SMOTE, trained on 386,076 instances post-correction, AUC 0.8917.

5. CI/CD surface

The platform meets developers where they already work: a GitHub Actions integration with advisory and gating rollouts, a risk dashboard, a CLI / IDE plugin for pre-commit risk hints, and versioned reports and audit logs, including the immutable prompt receipts that the AI governance side rail requires.
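The difference between advisory and gating rollouts can be shown with a minimal sketch. It assumes the usual reading of the two terms (advisory reports the risk without failing the build, gating fails the build above a threshold); the names `RolloutMode` and `ci_exit_code` and the threshold value are hypothetical.

```python
from enum import Enum


class RolloutMode(Enum):
    ADVISORY = "advisory"  # report only, never block
    GATING = "gating"      # block the pipeline on high risk


def ci_exit_code(risk: float, threshold: float, mode: RolloutMode) -> int:
    """Return the process exit code a CI step would use for a given risk score."""
    exceeds = risk >= threshold
    if exceeds and mode is RolloutMode.GATING:
        return 1  # non-zero exit fails the CI job
    return 0      # advisory mode, or risk below threshold
```

Starting a repository in advisory mode and switching to gating later lets teams observe the risk signal before it can block a merge.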

G. AI Governance side rail

The governance rail is a cross-cutting concern, not a layer. It implements payload controls, content redaction, author-privacy hashing, prompt versioning, and audit trails. Every model call, whether to the defect classifier, the LLM, or the attribution service, passes through this rail and produces a verifiable receipt.
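Two of the controls listed above, author-privacy hashing and verifiable receipts, can be sketched with the Python standard library. This is an illustration under stated assumptions, not the platform's implementation: the keyed-hash scheme, the receipt fields, and the chaining through a `prev` hash are all hypothetical choices.

```python
import hashlib
import hmac
import json


def hash_author(email: str, secret: bytes) -> str:
    """Pseudonymise an author identity with a keyed hash (HMAC-SHA256)."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(secret, normalised, hashlib.sha256).hexdigest()


def _digest(obj) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def make_receipt(prev_hash: str, prompt_version: str, payload: dict) -> dict:
    """Build a receipt that commits to the payload and to the previous receipt."""
    body = {
        "prev": prev_hash,                  # links receipts into a chain
        "prompt_version": prompt_version,
        "payload_sha256": _digest(payload),  # payload itself is not stored
    }
    return {**body, "receipt_sha256": _digest(body)}


def verify_receipt(receipt: dict) -> bool:
    """Recompute the receipt hash and compare in constant time."""
    body = {k: v for k, v in receipt.items() if k != "receipt_sha256"}
    return hmac.compare_digest(_digest(body), receipt["receipt_sha256"])
```

Storing only hashes keeps the payload out of the audit log, and chaining each receipt to its predecessor makes later edits to the log detectable.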

Why this shape? The architecture is deliberately ML-agnostic at the service boundary — the same FastAPI contract serves the current Gradient Boosting model and would serve a fine-tuned transformer if and when one outperforms it on calibration, not just on raw AUC.
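The idea of a model-agnostic service boundary can be sketched with a structural interface. The example below uses `typing.Protocol` and a plain function rather than FastAPI to stay dependency-free; the `RiskModel` interface, the response fields, and both stand-in models are hypothetical.

```python
from typing import Protocol, Sequence


class RiskModel(Protocol):
    """Anything with a model_id and a predict_risk method satisfies the contract."""
    model_id: str

    def predict_risk(self, features: Sequence[float]) -> float: ...


class MeanScoreModel:
    """Stand-in for a tree-ensemble model (illustrative only)."""
    model_id = "stub-mean-v1"

    def predict_risk(self, features: Sequence[float]) -> float:
        return sum(features) / len(features)


class MaxScoreModel:
    """Stand-in for a different model family behind the same contract."""
    model_id = "stub-max-v1"

    def predict_risk(self, features: Sequence[float]) -> float:
        return max(features)


def handle_prediction(model: RiskModel, features: Sequence[float]) -> dict:
    """Service-boundary logic: identical response shape for any conforming model."""
    score = model.predict_risk(features)
    if not 0.0 <= score <= 1.0:
        raise ValueError("risk score must be a probability in [0, 1]")
    return {"model_id": model.model_id, "risk": score}
```

Swapping `MeanScoreModel` for `MaxScoreModel` changes the score but not the response schema, which is the property that would let a different model family replace the current one without touching clients.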