
Hi, welcome to my project portfolio!
My name is Wouter and I am a Machine Learning Engineer and LLM enthusiast. At work I build production ML and LLM systems for large-scale business problems. Outside of that, I pick up projects out of genuine curiosity or practical need. This site documents both, with a focus on what was actually hard to build and why the architecture ended up the way it did. Feel free to reach me at wouterwijffels@gmail.com or via LinkedIn.
Background
My educational background is in industrial engineering and logistics optimization, where I learned to think about systems and how to improve them mathematically. My working career started at the intersection of workflow automation, analytics, and data science. After a few years, my interest shifted toward machine learning and language modelling, which is where most of my focus sits today. Outside of work I go to open source events, industry conferences, and meetups in the fields I follow.
Projects
Seven years of work, from my first educational data science projects to current ML systems serving millions of customers. Dates mark when a project started; most are still running or have evolved since. Impact figures for production projects are measured at scale; for MVP and hobby projects, numbers are early estimates or personal assessments. Click any card for the full business context, solution design, and technical challenges.
Live in internal beta with CS Analytics and Customer Research
AI agent that answers analytical questions about customer service calls and survey data. A FastMCP server exposes per-topic tools, and AWS AgentCore provides memory for campaign and brand context. Outputs are aggregated-only by design for GDPR compliance.
AWS AgentCore, FastMCP, AWS Bedrock, Snowflake, MLflow, CloudWatch
Read more →
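A minimal sketch of what an aggregated-only output could look like. The minimum group size of 10 and the field names `topic` and `handle_seconds` are assumptions for illustration and do not come from the project.

```python
from collections import defaultdict
from statistics import mean

MIN_GROUP_SIZE = 10  # assumed suppression threshold, not taken from the project


def aggregate_by_topic(records, min_group_size=MIN_GROUP_SIZE):
    """Return per-topic counts and mean handle time, dropping small groups."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["topic"]].append(rec["handle_seconds"])

    result = {}
    for topic, values in groups.items():
        if len(values) < min_group_size:
            continue  # too few calls: the aggregate could point at an individual
        result[topic] = {"n": len(values), "mean_handle_seconds": mean(values)}
    return result


# Hypothetical records: 12 billing calls and only 3 roaming calls.
records = (
    [{"topic": "billing", "handle_seconds": 300}] * 12
    + [{"topic": "roaming", "handle_seconds": 420}] * 3
)
summary = aggregate_by_topic(records)
```

Here `summary` contains only the billing group, because the roaming group falls below the assumed threshold. No row-level data is ever returned to the caller.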
80% time reduction in content writing · 1 A/B test, still being evaluated
Snowflake Cortex reads marketing automation segments and generates a persona profile per group. AWS Bedrock then produces personalized campaign copy for each segment. All output is human-reviewed before go-live.
Snowflake Cortex, AWS Bedrock, AWS Lambda, Marketing Automation
Read more →
10% reduction in call handling time · 1.3M conversations/year
Bedrock extracts 5 summary texts and 10 classification features per conversation, with PII masked before any LLM call. Validated output auto-files the CRM wrap-up and feeds downstream analytics. The pipeline processes 1.3M conversations per year and cuts agent handling time by 10%.
AWS Bedrock, Lambda, Snowflake, MLflow, Python
Read more →
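The mask-then-validate flow can be sketched as follows. The two regular expressions, the `sentiment` feature, and its allowed values are hypothetical; real PII detection needs much broader coverage than this, and the project's actual feature set is not shown here.

```python
import re

# Illustrative patterns only; they catch simple emails and phone-like digit runs.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")

ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}  # hypothetical feature


def mask_pii(text):
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def validate_output(output):
    """Accept model output only if the required fields are well-formed."""
    return (
        isinstance(output.get("summary"), str)
        and output["summary"].strip() != ""
        and output.get("sentiment") in ALLOWED_SENTIMENTS
    )


# Masking runs before the text would be sent to any model.
masked = mask_pii("Call me on +31 6 1234 5678 or mail jan@example.com today.")
```

Only output that passes `validate_output` would be filed downstream; anything else would be rejected or retried.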
>20 models · >400 pipelines · 12 data scientists
Central MLflow tracking server on SageMaker with S3 artifact store, provisioned via Terraform. SageMaker Pipeline evaluation steps compare each run against a hard threshold and a benchmark run; Alertmanager deduplicates and throttles alerts before publishing to SNS. Teams channels subscribe directly for production alerts.
Terraform, MLflow, S3, AMP, SNS, GitLab, SageMaker Pipelines
Read more →
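The two-part evaluation check can be sketched in plain Python. The function name, the `tolerance` parameter, and the assumption that a higher metric is better are illustrative; the real check runs inside SageMaker Pipeline steps, which are not shown here.

```python
def evaluate_run(metric, hard_threshold, benchmark_metric, tolerance=0.0):
    """Return (passed, reasons) for a candidate run. Higher metric is better."""
    reasons = []
    if metric < hard_threshold:
        reasons.append(
            f"metric {metric:.3f} below hard threshold {hard_threshold:.3f}"
        )
    # tolerance is a hypothetical allowance for noise between runs
    if metric < benchmark_metric - tolerance:
        reasons.append(
            f"metric {metric:.3f} below benchmark {benchmark_metric:.3f}"
        )
    return (not reasons, reasons)


passed, reasons = evaluate_run(metric=0.82, hard_threshold=0.75, benchmark_metric=0.80)
```

A failing run returns one reason per violated check, which is the kind of payload an alert could carry.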
80% time saving · testing with 1 buyer, not yet evaluated
A three-level hierarchical taxonomy clusters industrial PO line items by semantic similarity. Buyers search by part description or upload a supplier invoice for automatic line-by-line price comparison. Hybrid BM25 and dense vector search with cross-encoder reranking surfaces cluster benchmarks.
Sentence Transformers, BERTopic, OpenAI API, ChromaDB, Streamlit
Read more →
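One common way to combine a BM25 ranking with a dense vector ranking is reciprocal rank fusion. The sketch below uses that technique as an assumption; the card does not say how the two result lists are merged. The document ids and the constant `k=60` are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id to keep the order deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["po-17", "po-03", "po-42"]   # hypothetical lexical results
dense_ranking = ["po-03", "po-99", "po-17"]  # hypothetical embedding results
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

A cross-encoder would then rescore only the top of `fused`, which keeps the expensive model off the long tail of candidates.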
~35% uplift improvement · monthly in 5 countries
Historical A/B experiment data trains an uplift model that predicts incremental campaign effect per customer. Results feed a Tableau dashboard used by marketers to target the segments most likely to respond. Retention rates have increased by roughly 35% since the system went live.
SageMaker Pipelines, Snowflake, Tableau
Read more →
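The quantity an uplift model estimates is the difference in response between treated and control customers. As a simplified illustration, the sketch below computes that difference per segment from A/B rows instead of per customer; the row layout and segment names are hypothetical.

```python
from collections import defaultdict


def segment_uplift(rows):
    """Return treated-minus-control response rate per segment."""
    counts = defaultdict(lambda: {"treated": [0, 0], "control": [0, 0]})
    for segment, group, responded in rows:
        tally = counts[segment][group]
        tally[0] += int(responded)  # responders
        tally[1] += 1               # customers seen

    uplift = {}
    for segment, groups in counts.items():
        (t_resp, t_n), (c_resp, c_n) = groups["treated"], groups["control"]
        if t_n == 0 or c_n == 0:
            continue  # uplift is undefined without both experiment arms
        uplift[segment] = t_resp / t_n - c_resp / c_n
    return uplift


# Hypothetical experiment: the campaign helps "young" but not "senior".
rows = (
    [("young", "treated", True)] * 30 + [("young", "treated", False)] * 70
    + [("young", "control", True)] * 10 + [("young", "control", False)] * 90
    + [("senior", "treated", True)] * 20 + [("senior", "treated", False)] * 80
    + [("senior", "control", True)] * 20 + [("senior", "control", False)] * 80
)
result = segment_uplift(rows)
```

Segments with high uplift are the ones worth targeting; a segment with a high response rate in both arms gains nothing from the campaign.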
Used to find my own apartment
This started as a Hugging Face agents course project and grew into a real tool I used to find my apartment. smolagents orchestrates scrapers across rental sites, checks for new listings every morning, and lets you query it over WhatsApp.
smolagents, Selenium, Python, WhatsApp API
Read more →
40x better than random selection · monthly in 5 countries
Single XGBoost churn model deployed across 5 countries via SageMaker and GitLab CI/CD, each with its own data warehouse and marketing system. Its churn predictions are 40x better than random selection and 8x better than the best benchmark. Established the ML platform integration pattern now used by 20+ models.
AWS SageMaker, GitLab CI/CD, Snowflake, XGBoost, MLflow
Read more →
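A figure like "Nx better than random selection" is typically a lift: the churn rate among the top-scored customers divided by the overall churn rate. The sketch below shows that calculation on made-up data. It assumes lift at a fixed cutoff `k`, which may differ from how the figure on this card was measured.

```python
def lift_at_k(scores, labels, k):
    """Churn rate among the k highest-scored customers divided by the base rate."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of customers")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when nobody churned")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / base_rate


# Hypothetical data: 100 customers, 5 churners, 4 of them in the top 5 scores.
labels = [1, 1, 1, 1, 0, 1] + [0] * 94
scores = [1 - i / 100 for i in range(100)]
lift = lift_at_k(scores, labels, k=5)
```

In this toy example the top 5 have an 80% churn rate against a 5% base rate, so the lift is about 16.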
NLP pipeline maps public commodity and gas price indices to the raw material components in supplier contracts. The lag and magnitude between index movements and supplier price requests are measured and surfaced. Used as a verification tool during active supplier negotiations.
Python, NLP, Snowflake, Public Gas & Commodities APIs
Read more →
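The lag between index movements and supplier price changes can be estimated by shifting one series against the other and keeping the shift with the highest correlation. This is a minimal sketch assuming two equal-length series of period-over-period changes; the function names and sample numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(index_changes, price_changes, max_lag):
    """Return (lag, correlation) where price changes best follow index changes."""
    best = (0, -1.0)
    for lag in range(max_lag + 1):
        xs = index_changes[: len(index_changes) - lag]
        ys = price_changes[lag:]
        if len(xs) < 3:
            break  # too few overlapping points to say anything
        r = pearson(xs, ys)
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical series: prices follow the index two periods later at half the size.
index_changes = [1, -2, 3, 0, 2, -1, 4, -3, 1, 2]
price_changes = [0, 0, 0.5, -1, 1.5, 0, 1, -0.5, 2, -1.5]
lag, corr = best_lag(index_changes, price_changes, max_lag=4)
```

Once the lag is known, the magnitude can be read from the ratio between the aligned series, which in this toy example is 0.5.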