Entity embeddings map categorical variables into Euclidean space. They work particularly well for features with high cardinality, i.e. categorical variables that take a large number of discrete values.
INPUT: TABULAR DATA -> ||| ALGORITHM ||| -> OUTPUT: ENTITY EMBEDDINGS
Algorithm -
- Convert categorical variables into contiguous integers or one-hot encodings, and normalize continuous features to standard normal (after your feature engineering procedure).
- Some categorical variables have many more levels than others, so we use the cardinality of each variable to decide the size of its embedding. Something like
emb_sizes = [min(max_emb_size, (c + 1) // 2) for c in cardinalities]
Think of it like this: each user in the IMDB database is represented by a vector, not a one-hot vector but an entity vector. If you are trying to map users to movies, then each weight vector is a list of latent concepts that carry a certain weight in deciding how a user rates a movie, just like in SVD. It is across these latent concepts that we visualise our entity embeddings.
- Then follow the standard training procedure for a feed-forward neural network (i.e. uniformly initialize your embeddings with the sizes computed above and train them on your data). A minimal sketch of the full pipeline follows this list.
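Here is one way the algorithm above could look in PyTorch. This is a minimal sketch, not the canonical implementation; names like cardinalities, max_emb_size, TabularNet, and n_cont are illustrative assumptions, not from any specific library.

import torch
import torch.nn as nn

cardinalities = [1000, 12, 7]   # e.g. store id, month, day of week (illustrative)
max_emb_size = 50
# Rule of thumb from above: embedding size grows with cardinality, capped.
emb_sizes = [min(max_emb_size, (c + 1) // 2) for c in cardinalities]

class TabularNet(nn.Module):
    def __init__(self, cardinalities, emb_sizes, n_cont, n_out=1):
        super().__init__()
        # One embedding table per categorical variable; weights are
        # randomly initialized and learned during training.
        self.embs = nn.ModuleList(
            nn.Embedding(c, s) for c, s in zip(cardinalities, emb_sizes)
        )
        n_emb = sum(emb_sizes)
        self.net = nn.Sequential(
            nn.Linear(n_emb + n_cont, 64),
            nn.ReLU(),
            nn.Linear(64, n_out),
        )

    def forward(self, x_cat, x_cont):
        # x_cat: (batch, n_categorical) integer codes
        # x_cont: (batch, n_cont) normalized continuous features
        embedded = [emb(x_cat[:, i]) for i, emb in enumerate(self.embs)]
        return self.net(torch.cat(embedded + [x_cont], dim=1))

model = TabularNet(cardinalities, emb_sizes, n_cont=4)
# After training, model.embs[i].weight holds the learned entity embeddings.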
1. Collaborative filtering[1]
Aim - Predict what rating a user would give a movie even though they have not watched it.
You will get a person represented by a vector like -
Person_i = [scifi-0.1, romantic-0.3, horror-0.3] //only the numbers, text is for illustration
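A hedged sketch of how such vectors are learned with embeddings, in the spirit of [1]: each user and each movie gets an embedding, and the predicted rating is their dot product across latent concepts, as in SVD. All names and sizes here are illustrative.

import torch
import torch.nn as nn

class DotProductCF(nn.Module):
    def __init__(self, n_users, n_movies, n_factors=3):
        super().__init__()
        # Each row is one user's (or movie's) weights over latent concepts
        # such as "sci-fi", "romantic", "horror".
        self.user_emb = nn.Embedding(n_users, n_factors)
        self.movie_emb = nn.Embedding(n_movies, n_factors)

    def forward(self, user_ids, movie_ids):
        # Predicted rating = dot product over the latent concepts.
        return (self.user_emb(user_ids) * self.movie_emb(movie_ids)).sum(dim=1)

model = DotProductCF(n_users=1000, n_movies=500)
rating = model(torch.tensor([3]), torch.tensor([42]))  # rating for user 3, movie 42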
2. Taxi cab[2]
Predict the destination of taxi trips based on initial partial trajectories
3. Rossmann Drug Sales[3]
Forecast sales using store, promotion, and competitor data.
References -
[1] https://medium.com/@apiltamang/learning-entity-embeddings-in-one-breath-b35da807b596
[2] https://www.kaggle.com/c/pkdd-15-predict-taxi-service-trajectory-i
[3] https://www.kaggle.com/c/rossmann-store-sales