Article-Journal

Knowledge Transfer for Entity Resolution with Siamese Neural Networks

We propose a deep Siamese neural network that learns a similarity measure tailored to a dataset, eliminating manual feature engineering. We also show that knowledge transfer …

Michael Loster

Data Preparation for Duplicate Detection

We propose the first workflow that systematically integrates data preparation operations before duplicate detection, improving AUC-PR by up to 19%.

Ioannis Koumarelas

Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection

In this paper, we study the problem of matching records that contain address information, including attributes such as Street-address and City. To facilitate this matching process …

Ioannis Koumarelas

Flexible Partitioning for Selective Binary Theta-Joins in a Massively Parallel Setting

We propose an ensemble-based partitioning method to improve theta-join execution in massively parallel systems such as MapReduce and Spark.

Ioannis Koumarelas

Integrating similarity and dissimilarity notions in recommenders

Collaborative recommenders rely on the assumption that similar users may exhibit similar tastes, while content-based ones favour items that are found to be similar to the items a user …

Christos Zigkolis