Bridging the gap between advanced predictive analytics, scalable ML architecture, and cross-functional business execution.
Drawing on over a decade of experience across banking, insurance, and tech manufacturing, I specialize in applying cross-industry predictive frameworks to engineer modern credit and growth strategies. With expertise in Python, Databricks, and SQL, I design end-to-end analytical pipelines. Beyond the code, I thrive as a strategic bridge—translating complex data models to secure executive buy-in, mentoring technical teams, and partnering directly with marketing and business units to execute highly targeted campaigns.
Case studies demonstrating my ability to navigate complex enterprise data systems, engineer predictive features, and collaborate across teams to optimize business outcomes.
The Challenge: A "spray and pray" digital marketing strategy led to high Customer Acquisition Costs (CAC) and low conversion rates across diverse demographic funnels.
The Solution: Built a data generation engine to simulate 100k+ realistic inbound leads. Designed a PySpark pipeline using VectorAssembler and StandardScaler (Z-score normalization) to feed a K-Means clustering algorithm.
The Impact: Achieved a strong Silhouette Score of 0.77, surfacing three distinct "hidden personas." Enabled dynamic routing to specific business workflows, driving a simulated 3x lift in Click-Through Rates.
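A minimal sketch of the segmentation approach described above, using scikit-learn in place of the PySpark VectorAssembler/StandardScaler/KMeans stages. The feature names, data, and cluster centers are illustrative inventions, not the actual lead data or results.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(42)

# Simulated lead features (hypothetical): pages_viewed, session_minutes, form_fields
# Three latent personas with different behavioral centers, plus noise.
centers = np.array([[3, 2, 1], [12, 15, 4], [25, 40, 9]], dtype=float)
X = np.vstack([c + rng.normal(0, 1.0, size=(300, 3)) for c in centers])

# Z-score normalization (the StandardScaler stage of the PySpark pipeline)
X_scaled = StandardScaler().fit_transform(X)

# K-Means with k=3, mirroring the three discovered personas
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X_scaled)

# Silhouette score measures cohesion vs. separation (range -1 to 1);
# higher values indicate well-separated clusters.
print("silhouette:", round(silhouette_score(X_scaled, labels), 2))
```

In the production pipeline the same three stages would be chained in a `pyspark.ml.Pipeline`, with the cluster label driving downstream workflow routing.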
The Challenge: Reactive call-center retention efforts led to high policyholder churn rates and bloated "Cost-to-Serve" operational metrics for enterprise insurance clients.
The Solution: Architected a Databricks pipeline leveraging PySpark and XGBoost to predict customer churn probability based on historical interaction frequency, billing friction, and policy metadata.
The Impact: Enabled proactive intervention strategies by identifying at-risk accounts before they canceled. Translated operational efficiency into a measurable reduction in Cost-to-Serve and improved Customer Lifetime Value (LTV).
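A compact sketch of the churn-scoring idea, with scikit-learn's `GradientBoostingClassifier` standing in for XGBoost so the example stays self-contained. The features follow the signals named above (interaction frequency, billing friction, policy metadata), but the data and coefficients are synthetic assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 5000

# Synthetic policyholder features named after the case-study signals
interaction_freq = rng.poisson(3, n)       # support contacts in last 12 months
billing_friction = rng.integers(0, 5, n)   # late or failed payments
policy_tenure = rng.uniform(0, 20, n)      # years on policy

# Toy churn process: friction and contacts raise risk, tenure lowers it
logit = 0.4 * billing_friction + 0.25 * interaction_freq - 0.15 * policy_tenure - 1.0
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([interaction_freq, billing_friction, policy_tenure])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Gradient-boosted trees (stand-in for the XGBoost model in the Databricks pipeline)
model = GradientBoostingClassifier(n_estimators=200, max_depth=3, random_state=0)
model.fit(X_tr, y_tr)

# Rank accounts by churn probability so retention teams can intervene before cancellation
churn_prob = model.predict_proba(X_te)[:, 1]
print("AUC:", round(roc_auc_score(y_te, churn_prob), 3))
```

The ranked probabilities are what make intervention proactive: outreach budget goes to the top of the list instead of to whoever calls the contact center first.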
The Challenge: Traditional actuarial methods rely on broad averages, leading to over-reserving (trapped capital) or under-reserving (P&L shocks) for liability claims.
The Solution: Architected an end-to-end PySpark ML pipeline on Databricks. Engineered a Gradient Boosted Tree model that captures non-linear risk interactions, outperforming standard GLMs, and tracked the deployment via MLflow with the model registered in Unity Catalog.
The Impact: Successfully segmented risk, demonstrating via decile analysis that the model accurately isolates routine claims ($3k) from severe shock losses ($35k+), enabling precise capital allocation.
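A short pandas sketch of the decile analysis used to validate risk segmentation. The severities and model scores here are synthetic (a lognormal severity with a noisy stand-in score), so the dollar figures above are not reproduced; the point is the mechanics of the check.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000

# Synthetic claim severities: mostly routine claims with a heavy tail of shock losses
severity = rng.lognormal(mean=8.0, sigma=1.0, size=n)

# Stand-in model score: correlated with true severity plus noise,
# mimicking the GBT model's predictions (illustrative only)
score = np.log(severity) + rng.normal(0, 0.5, size=n)

df = pd.DataFrame({"score": score, "severity": severity})

# Decile analysis: bucket claims by model score, then compare mean severity per bucket
df["decile"] = pd.qcut(df["score"], 10, labels=range(1, 11))
lift = df.groupby("decile", observed=True)["severity"].mean()

# A well-ranked model shows severity rising steadily from decile 1 to decile 10,
# separating routine claims from shock losses for reserving purposes
print(lift.round(0))
```

If the top decile's mean severity towers over the bottom decile's, reserves can be set per segment rather than from a single broad average, which is exactly the trapped-capital / P&L-shock tradeoff the model addresses.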