Ioannis Koumarelas, PhD
Matching Dependencies
MDedup: Duplicate Detection with Matching Dependencies
Our system uses automatically discovered matching dependencies (MDs), various dataset features, and known gold standards to train a model that selects MDs as duplicate detection rules. Once trained, the model can select useful MDs for duplicate detection on any new dataset.
Ioannis Koumarelas, Thorsten Papenbrock, Felix Naumann
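To make the idea of using an MD as a duplicate detection rule concrete, here is a minimal sketch. It assumes an MD can be read as a set of per-attribute similarity thresholds: a record pair that meets every threshold is reported as a candidate duplicate. The rule, the attribute names, the thresholds, and the helper functions are all hypothetical, and the similarity measure (Python's difflib ratio) is only a stand-in. The sketch does not cover the trained model that selects which MDs to use.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical rule: name and city must both be similar enough.
RULE = {"name": 0.8, "city": 0.9}


def matches(rule, r1, r2):
    """True if the pair meets every similarity threshold of the rule."""
    return all(similarity(r1[attr], r2[attr]) >= t for attr, t in rule.items())


def find_duplicates(records, rule):
    """Return index pairs (i, j), i < j, that satisfy the rule."""
    return [
        (i, j)
        for (i, r1), (j, r2) in combinations(enumerate(records), 2)
        if matches(rule, r1, r2)
    ]


# Made-up records for illustration only.
records = [
    {"name": "Jon Smith", "city": "Berlin"},
    {"name": "John Smith", "city": "Berlin"},
    {"name": "Maria Lopez", "city": "Madrid"},
]

duplicate_pairs = find_duplicates(records, RULE)
print(duplicate_pairs)
```

The pairwise comparison is quadratic in the number of records, which is acceptable for a sketch but not for large datasets.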
Efficient Discovery of Matching Dependencies
We focus on the efficient discovery of all interesting MDs in real-world datasets. For this purpose, we propose HyMD, a novel MD discovery algorithm that finds all minimal, non-trivial MDs within given similarity boundaries.
Philipp Schirmer, Thorsten Papenbrock, Ioannis Koumarelas, Felix Naumann
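As an illustration of what it means for an MD to hold within similarity boundaries, the sketch below checks a single candidate MD by brute force. It assumes the common reading of an MD: whenever two records are similar enough on the left-hand-side attributes, they must also be similar enough on the right-hand-side attribute. This is not HyMD, which is designed to avoid such exhaustive checks. The function names, records, and thresholds are hypothetical, and difflib's ratio stands in for an arbitrary similarity measure.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def md_holds(records, lhs, rhs):
    """Naive check of one candidate MD over all record pairs.

    lhs: dict mapping attribute -> minimum similarity (left-hand side)
    rhs: (attribute, minimum similarity) tuple (right-hand side)
    """
    rhs_attr, rhs_threshold = rhs
    for r1, r2 in combinations(records, 2):
        lhs_ok = all(similarity(r1[a], r2[a]) >= t for a, t in lhs.items())
        if lhs_ok and similarity(r1[rhs_attr], r2[rhs_attr]) < rhs_threshold:
            return False  # this pair violates the dependency
    return True


# Made-up records for illustration only.
records = [
    {"name": "Jon Smith", "zip": "10115", "city": "Berlin"},
    {"name": "John Smith", "zip": "10115", "city": "Berlin"},
    {"name": "Maria Lopez", "zip": "28001", "city": "Madrid"},
    {"name": "Mario Lopez", "zip": "28001", "city": "Madrit"},
]

# Equal zip codes do not imply identical city strings here (typo "Madrit"),
# but they do imply similar ones under a lower threshold.
holds_strict = md_holds(records, {"zip": 1.0}, ("city", 1.0))
holds_relaxed = md_holds(records, {"zip": 1.0}, ("city", 0.8))
print(holds_strict, holds_relaxed)
```

Lowering the right-hand-side threshold turns a violated dependency into a valid one, which is why the choice of similarity boundaries determines which MDs a discovery algorithm can report.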