Hector Villasante hecvd

hecvd / reusable-data-model.md
Last active April 18, 2024 09:26
Building a reusable Data Model for Approximate String Matching with Spark


Context

In the Paid Search team at GetYourGuide, we manage over 30 million active digital ads, displayed on every major ad provider in each of our 19 supported languages, to advertise thousands of activities across the globe. We take our motto, "Find the best things to do wherever you’re going", very seriously. Whenever someone enters a query in their preferred search engine, we make a huge effort to accomplish two goals: