Building scalable ML systems for high-volume, real-world decisioning problems
Ph.D. Computer Engineering · Cybersecurity & Distributed Systems · Large-Scale Data Processing
School Districts
Power BI Dashboards
End-to-end production systems — ML design, pipeline architecture, API deployment, and operational monitoring — all serving 30+ California school districts.
Expatiate Communications · Pasadena, CA
University of Delaware · Newark, DE
7+ years across research and industry positions
Three non-negotiable engineering constraints applied across all 32 production systems — not policies, but structural guarantees.
Every system isolates failures at the tenant boundary. One school or district failing never cascades to others — enforced at the goroutine or node level, not by try/catch wrapping.
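The tenant-boundary isolation described above can be pictured with a small Python sketch. The source mentions goroutines, so this is only an analogue built on a thread pool, not the production design; `process_tenant`, the district names, and the simulated failure are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_tenant(tenant_id):
    # Hypothetical per-district job; "district-b" simulates a failing tenant.
    if tenant_id == "district-b":
        raise RuntimeError("upstream export missing")
    return f"{tenant_id}: ok"


def run_all(tenant_ids):
    """Run each tenant in its own worker and record failures per tenant."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {tid: pool.submit(process_tenant, tid) for tid in tenant_ids}
        for tid, fut in futures.items():
            err = fut.exception()  # blocks until this tenant's job finishes
            if err is None:
                results[tid] = fut.result()
            else:
                failures[tid] = str(err)  # recorded, never re-raised
    return results, failures


results, failures = run_all(["district-a", "district-b", "district-c"])
```

Each tenant's outcome is held by its own future, so one district's error is captured at the worker boundary and the remaining districts still complete.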
All ETL pipelines use truncate-reload semantics — never append-only. Any pipeline is safe to rerun after partial failure without data corruption, duplication, or manual cleanup.
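A minimal sketch of truncate-reload semantics, assuming an in-memory SQLite database as a stand-in for the real warehouse. The `scores` table, its columns, and the `truncate_reload` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def truncate_reload(conn, table, rows):
    """Replace the full contents of `table` with `rows` in one transaction."""
    # `table` is interpolated into SQL, so it must come from trusted code.
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(f"DELETE FROM {table}")  # SQLite has no TRUNCATE
        conn.executemany(
            f"INSERT INTO {table} (student_id, score) VALUES (?, ?)", rows
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student_id INTEGER, score REAL)")

rows = [(1, 0.91), (2, 0.78)]
truncate_reload(conn, "scores", rows)
truncate_reload(conn, "scores", rows)  # rerun: same final state, no duplicates
row_count = conn.execute("SELECT COUNT(*) FROM scores").fetchone()[0]
```

Because the delete and the inserts share one transaction, a failure partway through leaves the previous load intact, and a rerun converges to the same table contents.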
AI inference on student data runs locally via Ollama: a structural constraint, not a configuration option. Because no student data is sent to an external inference API, PII exposure to third-party vendors is ruled out regardless of downstream code changes or vendor policy.
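One way such a constraint could be expressed in code is a guard that refuses any inference endpoint that is not on the loopback interface. This is a sketch under assumptions, not the deployed implementation: `require_local`, `generate`, and the model name `llama3` are hypothetical, and the request assumes Ollama's default local port and its `/api/generate` endpoint.

```python
import json
import urllib.request
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def require_local(endpoint):
    """Return `endpoint` unchanged, or raise if it is not a loopback address."""
    host = urlparse(endpoint).hostname
    if host not in LOOPBACK_HOSTS:
        raise ValueError(f"refusing non-local inference endpoint: {endpoint}")
    return endpoint


def generate(prompt, model="llama3",
             endpoint="http://localhost:11434/api/generate"):
    """Send a prompt to a locally running Ollama server and return its text."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    req = urllib.request.Request(
        require_local(endpoint),  # guard runs before any bytes leave the process
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Placing the check inside the only function that performs inference means a changed endpoint fails loudly instead of silently routing student data elsewhere.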
School districts lacked predictive visibility into IEP compliance, academic progress, and operational risk, and relied on slow, fragmented manual data aggregation across disparate assessment platforms.
Platform deployed across 30+ California school districts. LangGraph agentic automation achieved a 75–90% reduction in pipeline processing time. Replaced dozens of hours of weekly manual work — data collection, compliance tracking, report generation, and stakeholder communication — with single-command automated pipelines.
Transparent proxies silently intercept and modify web traffic without user awareness, but their true prevalence, behavior, and network-wide impact were poorly understood at scale.
Published at IEEE INFOCOM 2024, one of the top-ranked venues in computer networking. The study revealed the significant hidden influence of transparent proxies on internet traffic integrity.
The open proxy landscape — used for anonymization, censorship circumvention, and malicious activity — had never been comprehensively characterized in terms of scale, geography, and behavior.
Published in Computer Networks (Elsevier), 2022 — delivering the first comprehensive analysis of the open proxy ecosystem and its security implications at internet scale.
Remote peering in BGP networks was known to distort anycast routing decisions, but the extent of this unintended impact on global traffic distribution — including for major cloud providers — had not been passively quantified.
Published in ACM SIGCOMM Computer Communication Review, 2019 — a flagship networking venue — establishing foundational methodology for passive anycast analysis used in subsequent internet measurement research.
As AI crawlers become ubiquitous, websites are moving beyond binary blocking (robots.txt) to a more sophisticated, unmeasured tactic: returning HTTP 200 responses to both humans and AI bots, but serving degraded, watermarked, or "poisoned" content specifically to crawlers like GPTBot.
Unlike prior work measuring blocking, this measures deception — filling a critical gap in understanding how the web's content landscape diverges between human and AI readers.
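To make the idea of measuring divergence concrete, the sketch below fetches one URL under two User-Agent strings and scores how much the bodies differ. It illustrates the general approach only and is not the study's methodology; the `fetch` and `divergence` helpers are hypothetical, and a caller would supply its own User-Agent strings.

```python
import difflib
import urllib.request


def fetch(url, user_agent, timeout=10):
    """Return (status, body) for `url` requested with the given User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def divergence(human_body, bot_body):
    """Score in [0, 1]: 0.0 means identical bodies, 1.0 means nothing shared."""
    matcher = difflib.SequenceMatcher(None, human_body, bot_body)
    return 1.0 - matcher.ratio()
```

Two responses that both return HTTP 200 but yield a high divergence score would be candidates for closer inspection; dynamic content such as ads or timestamps also changes bodies, so a real measurement would need a baseline from repeated fetches under the same User-Agent.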
LLMs are widely used to generate Infrastructure-as-Code (Terraform, Kubernetes YAML, Nginx configs). If a model hallucinates a plausible but unregistered domain endpoint, an attacker could register that domain to intercept live API traffic or credentials from deployed systems.
Distinct from package hallucination studies — this targets DNS-level infrastructure interception, a critical supply chain risk not previously measured in the LLM security literature.
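A rough sketch of how generated configuration could be screened for such endpoints: extract hostnames from URLs in the text, then flag those that do not resolve. All names here are hypothetical, and failed DNS resolution is only a proxy signal, since an unregistered domain is one of several reasons a name may not resolve.

```python
import re
import socket

# Matches the host part of http(s) URLs; a simplification, not a full URL parser.
HOST_RE = re.compile(r"https?://([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def extract_hosts(iac_text):
    """Return the sorted, de-duplicated hostnames found in URLs in the text."""
    return sorted({h.lower() for h in HOST_RE.findall(iac_text)})


def resolves(host):
    """True if the system resolver returns at least one address for `host`."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def unresolved_hosts(iac_text, resolver=resolves):
    """Hostnames in the text for which `resolver` reports no resolution."""
    return [h for h in extract_hosts(iac_text) if not resolver(h)]


sample = """
proxy_pass https://api.internal.example/v1;
endpoint = "http://metrics.vendor.example:9090/push"
"""
hosts = extract_hosts(sample)
```

The `resolver` parameter is injectable so the check can be exercised offline; a fuller version would also query registration data instead of relying on DNS alone.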
4 Active Certifications · Issued 2026 · Valid through 2028
Coursera · Issued Aug 2023
Extensive peer review contributions ensuring the integrity and quality of high-tier network science and security venues.