Senior AI / ML Engineer profile
Professional Profile

12+ years experience · 6 roles · 9 certifications

About Me

Senior AI / ML Engineer with 12+ years of software and platform engineering experience, now fully focused on designing, building, and operating production AI/ML systems at scale. I bridge the gap between research-grade models and enterprise-grade products, from data pipelines through training and fine-tuning to reliable, observable inference serving.

My core expertise includes end-to-end ML platform design (feature stores, training orchestration, model registries, A/B serving); large language model integration and fine-tuning (LoRA/QLoRA, RLHF, DPO); retrieval-augmented generation (RAG) with advanced chunking, embedding, and reranking strategies; and agentic AI system architecture with tool-use, planning, and memory patterns. I design AI systems that are safe, explainable, and compliant with the EU AI Act, NIST AI RMF, and ISO/IEC 42001.

I architect scalable inference infrastructure on Kubernetes (GKE, EKS) with GPU scheduling, autoscaling, and cost-optimized serving using vLLM, TGI, and Triton. I have hands-on production experience with Vertex AI, Amazon SageMaker, Azure ML, and open-source stacks (MLflow, Kubeflow, Ray).

I can present complex AI system architecture clearly to executives, product teams, and regulators, with high-quality diagrams, risk assessments, and business-impact analysis. I lead architecture reviews, AI risk assessments, and implementation planning for responsible, scalable AI delivery.

Google Cloud (5x Professional Certified): Professional Cloud Architect, Professional Cloud Network Engineer, Professional Security Operations Engineer, Professional Cloud DevOps Engineer, Professional Data Engineer. Technical Expert: Build with Vertex, Intelligent Search, Customer Engagement Suite with Google AI.

As a published Technical Author, I actively share hands-on AI engineering insights on Medium and in professional publications, and have won multiple AI hackathons at Deutsche Telekom.

Contact

LinkedIn · GitHub

Skills

LLM Fine-Tuning (LoRA, QLoRA, RLHF, DPO)
Retrieval-Augmented Generation (RAG)
Agentic AI (Tool-Use, Planning, Memory)
Prompt Engineering & Evaluation (RAGAS, DeepEval)
Transformer Architecture & Attention Mechanisms
Vector Databases (Pinecone, Weaviate, pgvector, ChromaDB)
Embedding Models & Semantic Search
ML Platform Architecture (Feature Store, Model Registry, A/B Serving)
Training Orchestration (Kubeflow, Ray, Vertex AI Pipelines)
Inference Serving (vLLM, TGI, Triton, TorchServe)
GPU Infrastructure & Scheduling (NVIDIA A100/H100, MIG)
Model Optimization (Quantization, Distillation, Pruning)
Vertex AI (AutoML, Custom Training, Endpoints, Gemini)
Amazon SageMaker & Bedrock
Azure OpenAI Service & Azure ML
LangChain / LlamaIndex / CrewAI
Hugging Face Ecosystem (Transformers, PEFT, TRL, Datasets)
MLOps & Experiment Tracking (MLflow, W&B, Neptune)
Data Engineering for ML (Spark, BigQuery, Dataflow, dbt)
Computer Vision (YOLO, ViT, Detectron2)
NLP (NER, Sentiment, Summarization, Classification)
Responsible AI & AI Safety (Guardrails, Red-Teaming, Bias Audit)
AI Governance (EU AI Act, NIST AI RMF, ISO/IEC 42001)
AI Risk Management (ISO/IEC 23894)
Python (PyTorch, TensorFlow, JAX)
Go (Golang)
Kubernetes (GKE, EKS)
Docker
Terraform / Pulumi
CI/CD (GitLab CI, GitHub Actions, Cloud Build)
Argo CD / Argo Workflows (GitOps, ML Pipelines)
Observability (Prometheus, Grafana, OpenTelemetry)
API Design (REST, gRPC, GraphQL, OpenAPI)
Event-Driven Architecture (Kafka, Pub/Sub, NATS)
Security for AI Systems (Model Security, Data Poisoning Defense)
Architecture Diagramming & Technical Storytelling

Experience

Senior Google Cloud Platform Engineer (GKE / GCP Focus)

T-Digital by Deutsche Telekom · Full-time

Sep 2024 - Present · 1 yr 9 mos

Thessaloniki, Central Macedonia, Greece · Hybrid

- Owned and evolved a production multi-region, multi-cluster GKE platform serving multiple product teams: provisioned with Terraform (google/google-beta provider), managed via Argo CD GitOps across Standard and Autopilot cluster modes.
- Designed and operated GCP networking: Shared VPC topology, Private Service Connect, VPC peering, private GKE clusters with authorised control-plane access, Cloud DNS private/public zone management.
- Implemented HTTPS L7 ingress with Google-managed certificates, multi-backend routing via Cloud Load Balancing, and Cloud Armor WAF rules (rate limiting, geo-restriction, custom OWASP rule sets).
- Hardened GCP IAM: Workload Identity Federation for keyless CI/CD authentication, least-privilege service account design, Org Policy constraints, Binary Authorization policy enforcement.
- Operated GCP secrets and key management: Cloud KMS (CMEK), Secret Manager with rotation triggers, External Secrets Operator integration for seamless pod secret injection.
- Built full-stack observability on GCP: Cloud Monitoring dashboards and SLO-based alerting, Cloud Logging with log-based metrics and Log Analytics, Cloud Trace integrated via OpenTelemetry collector.
- Managed vulnerability pipeline: Artifact Registry container scanning, Trivy in CI, CVE triage process, Binary Authorization attestation gates to block unsigned images in production.
- Improved platform resilience: cluster etcd backup to Cloud Storage, PVC snapshots, cross-region failover validation, RPO/RTO documentation aligned with SLO targets.

Expert in Platform Engineering & Cloud Automation (GCP)

T-Digital by Deutsche Telekom · Full-time

Jun 2022 - Sep 2024 · 2 yrs 4 mos

Thessaloniki, Central Macedonia, Greece · Hybrid

- Migrated multi-workload platform from OpenStack/AWS to GCP: designed VPC topology (Shared VPC, subnets, firewall rules, Cloud NAT), provisioned GKE clusters with Terraform, and established GitOps delivery via Argo CD.
- Built and standardised CI/CD foundations on GCP (Cloud Build + GitLab CI) for 50+ microservices, enforcing immutable image builds via Artifact Registry and deployment promotion workflows (dev -> staging -> prod).
- Implemented Terraform IaC modules for GCP resources: GKE, VPC, Cloud Armor, KMS, Secret Manager, Cloud Monitoring alert policies — peer-reviewed, versioned, and reused across teams (~40% faster environment provisioning).
- Hardened GCP IAM: OS Login, service account impersonation limits, Org Policy deny-lists for public IPs and legacy APIs, automated IAM audit reporting.
- Deployed L4/L7 traffic management: internal/external TCP/UDP load balancers, HTTPS frontend with SSL policy enforcement, backend health-check tuning for GKE workloads.
- Established Cloud Monitoring alert policies tied to SLI/SLO targets, integrated with PagerDuty; built Cloud Logging dashboards and log sinks to BigQuery for compliance export and cost analysis.
- Automated GCP resource lifecycle workflows (VNF scaling, node pool autoscaling policies, Cloud Scheduler + Cloud Functions), achieving ~20% infra cost reduction.

Projects

Stock Predict Architecture

Implemented an ML platform on GCP for stock prediction and sentiment analysis. Data flows from Yahoo Finance and public APIs into Cloud SQL (Postgres), which serves as historical storage. Vertex AI handles sentiment analysis, while ARIMA models run in GKE pods sized at 8Gi RAM and 4 CPU. CI/CD pipelines built on Cloud Build and Artifact Registry ensure fast delivery from GitHub sources, with signed images (Cosign), SBOMs (Syft), and secrets stored in Vault. Infrastructure is automated via Terraform, with monitoring and logging integrated for observability.

GCP · Kubernetes · Cloud SQL · Vertex AI · Python · Cloud Build · Artifact Registry · Cosign · Syft · Vault · Terraform · Cloud Monitoring
Date: 2024

Multi-Regional GKE Cluster with GitOps

Multi-regional Kubernetes deployment across the West 3 and West 4 regions with GitLab Config Sync and Google Fleet. Workloads (App A, App B, App C) are spread across zones (a, b, c) for high availability. The setup provides unified GitOps delivery pipelines, consistent security policies, and cross-cluster management with Fleet.

Google Kubernetes Engine (GKE) · Google Fleet · GitLab CI · Config Sync (GitOps) · Multi-Regional HA · Kubernetes
Date: 2025

Licenses & Certifications

Google Cloud Certified Professional Cloud Architect

Issued by Google · Issued May 2025

Credential ID: 8a9ddfba001e4a55bf42667a6b62da9b

Skills: Cloud Solution Architecture, Security and Compliance, Cloud Networking, +6 more

Google Cloud Certified Professional Data Engineer

Issued by Google · Issued Jan 2024

Credential ID: 4f3e7970bbdb4a61805a8209cfe215ed

Skills: Data Engineering, Big Data & ML Pipelines, Data Governance & Security, +6 more

Courses

ISO/IEC 23894:2023 AI Risk Management Awareness

British Standards Institution (BSI)

2025

Awareness eLearning Certificate of Completion covering ISO/IEC 23894:2023 Guidance on AI Risk Management: principles, context, identification, assessment, and mitigation of risks in AI systems for trustworthy and responsible AI.

ISO/IEC 42001:2023 Artificial Intelligence Management System Awareness

AIQI / UKAS

2025

Awareness eLearning Certificate of Completion covering ISO/IEC 42001:2023 Artificial Intelligence Management Systems: key concepts, structure, controls, governance, and jurisdiction-specific considerations.

LFS120: Conversational AI: Ensuring Compliance and Mitigating Risks

The Linux Foundation

2025

Compliance- and risk-focused training for conversational AI. Covers the EU AI Act, NIST AI RMF, and ISO/IEC 42001:2023 requirements; methods to identify, analyze, and mitigate ethical, technical, and regulatory risks; and practices for trustworthy, responsible AI implementations.

Education

Humanitarian, Economic and Information Institute of Technology

2012 - 2016

Bachelor's Degree, Law

Data Privacy and Security · Compliance and Regulations · Communication and Collaboration · Technical Analysis

Military Academy of the Strategic Missile Forces named after Peter the Great

2007 - 2012

Engineer's Degree, Automation Management Systems

Time Management · Self-discipline · Communication and Collaboration

Military Academy of the Strategic Missile Forces named after Peter the Great

2007 - 2012

Engineer's Degree, Translation

Time Management · Self-discipline · Communication and Collaboration

Publications

Medium

Linux top explained from scratch — clear & practical (2025)

DataDrivenInvestor · Aug 18, 2025

Step-by-step guide to reading Linux top: load averages, memory usage, CPU breakdown, and process states. Includes real-world scenarios for diagnosing compute, I/O, and VM bottlenecks — with clear habits to turn raw numbers into actionable insights.

Medium

Scalable Micro-Kernel with Go, 2025 Edition

Level Up Coding · Jul 28, 2025

Introduces a micro-kernel architecture in Go where the core handles only lifecycle, routing and synchronization, while all business logic runs as pluggable modules. Demonstrates hot-swappable plugins for metrics, caching, and email — enabling granular scaling, non-blocking pub/sub, and clean code evolution.

Honors and Awards

Google Cloud Gen AI Technical Expert Badge Challenge - Early Adopter Edition

Issued by Google Cloud - Aug 2025

Certification / Award

Recognized as one of the first 1,100 professionals worldwide to complete the Gen AI Technical Expert Badge Challenge (Early Adopter Edition). This advanced challenge required earning multiple high-level Google Cloud Technical Expert credentials, including 'Build with Vertex', 'Intelligent Search', and 'Customer Engagement Suite with Google AI'. The achievement demonstrates proficiency in applying Generative AI for enterprise use cases, from building with Vertex AI, through creating intelligent retrieval and search solutions, to designing conversational AI for customer engagement.

Google Cloud Arcade - Champions Milestone

Issued by Google Cloud - Jun 2024

Award

Recognized as one of the select professionals worldwide to achieve the Champions Milestone in Google Cloud Arcade, earning a total of 78 points. This rare accomplishment reflects over six months of consistent learning and hands-on practice across diverse Google Cloud technologies.

3rd RIL AI Hackathon

Issued by Research Innovation Lab - Jun 2024

Hackathon

Won 1st place at the 3rd RIL AI Hackathon, hosted by the Research Innovation Lab and associated with T-Digital (Deutsche Telekom). Our team built a production-ready data uploader tailored for RAG-based chatbots. It supports PDF document ingestion, offers multiple adaptive chunking strategies, works autonomously with uploaded corpora, and automatically selects the best strategy using LLM-driven evaluation. We also instrumented RAGAS for response-quality measurement.

2nd RIL AI Hackathon

Issued by Research Innovation Lab - Dec 2023

Hackathon

Won 1st place at the 2nd RIL AI Hackathon, hosted by Research Innovation Lab and associated with T-Digital (Deutsche Telekom). Developed an AI solution that analyzes user stories and test cases, providing insights to improve quality and efficiency in software development.
