Ioannis Koumarelas

PhD graduate in Data Cleaning

Hasso Plattner Institute

Mission statement

Great applications can change the world, and high-quality data matters more than we previously thought. I am passionate about understanding data and turning it into a powerful tool through sophisticated Machine Learning and Data Engineering solutions.

Interests
  • Data Matching
  • Machine & Deep Learning
  • Natural Language Processing
Education
  • PhD in Information Systems, 2020

    Hasso Plattner Institute

  • MSc in Information Systems, 2014

    Aristotle University of Thessaloniki

  • BSc in Information Systems, 2012

    Aristotle University of Thessaloniki

Skills

Entity Resolution
Data Cleaning
Machine Learning
Data Engineering
Research & Development
Data Science

Experience

Veeva Link
Data Scientist
Dec 2021 – Present Remote (Berlin, DE)

Responsibilities include:

  • Duplicate Detection
  • Clustering
  • Creation & Enrichment of medical profiles
 
HPI Schul-Cloud
Data Engineer / Machine Learning Engineer
Apr 2020 – Nov 2021 Potsdam, DE

Responsibilities include:

  • Analysis & Presentations
  • Research & Publications
  • Data Cleaning methods
 
SAP Concur
Research Consultant
Nov 2015 – Oct 2018 Potsdam, DE

Responsibilities include:

  • Analysis & Presentations
  • Research & Publications
  • Data Cleaning methods

Accomplishments

Udemy
Backend and Frontend technologies
Coursera
Deep Learning Specialization (5 courses)
  1. Neural Networks and Deep Learning.
  2. Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization.
  3. Structuring Machine Learning Projects.
  4. Convolutional Neural Networks.
  5. Sequence Models.
Udemy
Containers and orchestration

Recent Publications

(2021). Knowledge Transfer for Entity Resolution with Siamese Neural Networks. In ACM JDIQ 2021.

(2020). Data Preparation for Duplicate Detection. In ACM JDIQ 2020.

(2020). Efficient Discovery of Matching Dependencies. In ACM TODS 2020.

(2018). Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection. In ACM JDIQ 2018.

(2018). Flexible Partitioning for Selective Binary Theta-Joins in a Massively Parallel Setting. In Distrib. Parallel Databases 2018.

Contact