Senior AI/ML Engineer with 12+ years of software and platform engineering experience, now fully focused on designing, building, and operating production AI/ML systems at scale. I bridge the gap between research-grade models and enterprise-grade products, from data pipelines through training and fine-tuning to reliable, observable inference serving.
My core expertise includes end-to-end ML platform design (feature stores, training orchestration, model registries, A/B serving); large language model integration and fine-tuning (LoRA/QLoRA, RLHF, DPO); retrieval-augmented generation (RAG) with advanced chunking, embedding, and reranking strategies; and agentic AI system architecture with tool-use, planning, and memory patterns. I design AI systems that are safe, explainable, and compliant with the EU AI Act, NIST AI RMF, and ISO/IEC 42001.
I architect scalable inference infrastructure on Kubernetes (GKE, EKS) with GPU scheduling, autoscaling, and cost-optimized serving using vLLM, TGI, and Triton. I have hands-on production experience with Vertex AI, Amazon SageMaker, Azure ML, and open-source stacks (MLflow, Kubeflow, Ray).
I present complex AI system architectures clearly to executives, product teams, and regulators, supported by high-quality diagrams, risk assessments, and business-impact analysis. I lead architecture reviews, AI risk assessments, and implementation planning for responsible, scalable AI delivery.
Google Cloud (5x Professional Certified): Professional Cloud Architect, Professional Cloud Network Engineer, Professional Security Operations Engineer, Professional Cloud DevOps Engineer, Professional Data Engineer. Technical Expert: Build with Vertex, Intelligent Search, Customer Engagement Suite with Google AI.
As a published technical author, I share hands-on AI engineering insights on Medium and in professional publications. I have also won multiple AI hackathons at Deutsche Telekom.
LLM Fine-Tuning (LoRA, QLoRA, RLHF, DPO)
Retrieval-Augmented Generation (RAG)
Agentic AI (Tool-Use, Planning, Memory)
Prompt Engineering & Evaluation (RAGAS, DeepEval)
Transformer Architecture & Attention Mechanisms
Vector Databases (Pinecone, Weaviate, pgvector, ChromaDB)
Embedding Models & Semantic Search
ML Platform Architecture (Feature Store, Model Registry, A/B Serving)
Training Orchestration (Kubeflow, Ray, Vertex AI Pipelines)
Inference Serving (vLLM, TGI, Triton, TorchServe)
GPU Infrastructure & Scheduling (NVIDIA A100/H100, MIG)
Model Optimization (Quantization, Distillation, Pruning)
Vertex AI (AutoML, Custom Training, Endpoints, Gemini)
Amazon SageMaker & Bedrock
Azure OpenAI Service & Azure ML
LangChain / LlamaIndex / CrewAI
Hugging Face Ecosystem (Transformers, PEFT, TRL, Datasets)
MLOps & Experiment Tracking (MLflow, W&B, Neptune)
Data Engineering for ML (Spark, BigQuery, Dataflow, dbt)
Computer Vision (YOLO, ViT, Detectron2)
NLP (NER, Sentiment, Summarization, Classification)
Responsible AI & AI Safety (Guardrails, Red-Teaming, Bias Audit)
AI Governance (EU AI Act, NIST AI RMF, ISO/IEC 42001)
AI Risk Management (ISO/IEC 23894)
Python (PyTorch, TensorFlow, JAX)
Go (Golang)
Kubernetes (GKE, EKS)
Docker
Terraform / Pulumi
CI/CD (GitLab CI, GitHub Actions, Cloud Build)
Argo CD / Argo Workflows (GitOps, ML Pipelines)
Observability (Prometheus, Grafana, OpenTelemetry)
API Design (REST, gRPC, GraphQL, OpenAPI)
Event-Driven Architecture (Kafka, Pub/Sub, NATS)
Security for AI Systems (Model Security, Data Poisoning Defense)
Architecture Diagramming & Technical Storytelling