@idshklein
Created March 18, 2024 18:37
Calculation of geographical distances vs. Jaro-Winkler distance
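# Load the required packages (pacman installs any that are missing).
pacman::p_load(tidyverse,jsonlite,stringdist)
# Fetch up to 10,000 records from the data.gov.il datastore API.
res <- read_json("https://data.gov.il/api/3/action/datastore_search?resource_id=70ba1705-3b25-416f-939c-985999f87f35&limit=10000")
# Bind the records into one data frame, drop rows whose settlement type
# (`סוג ישוב`) is "employment center" ("מוקד תעסוקה"), and add a row id.
df <- res$result$records %>% map_df(~.x) %>%
  filter(`סוג ישוב`!="מוקד תעסוקה") %>%
  mutate(cntr = row_number())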
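# Pairwise Euclidean distances between the east (`אורדינטה מזרחית`) and
# north (`אורדינטה צפונית`) coordinates, as a full square matrix.
dist_geo <- df %>%
  select(`אורדינטה מזרחית`,`אורדינטה צפונית`) %>%
  dist() %>% as.matrix()
# Pairwise string distances between settlement names (`שם ישוב`).
# Note: method='jw' with stringdist's default p = 0 is the plain Jaro
# distance; pass p = 0.1 to apply the Winkler prefix adjustment.
res1 <- map(df$`שם ישוב`,~stringdist(.x,df$`שם ישוב`, method='jw'))
matrix_from_list <- do.call(cbind, res1)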
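# Reshape both square matrices to long format: one row per (from, to) pair.
df_geo_dist <- dist_geo %>%
  as.data.frame() %>%
  rownames_to_column("from") %>%
  gather(to,geo_dist,-from)
# Name the columns by row index so they match the geographic table's keys.
df_word_dist <- matrix_from_list %>%
  as.data.frame() %>%
  setNames(1:nrow(df)) %>%
  rownames_to_column("from") %>%
  gather(to,word_dist,-from)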
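# Join the two distance tables, min-max scale each distance to [0, 1], and
# combine them as the squared Euclidean norm of the scaled pair.
df_geo_dist %>%
  left_join(df_word_dist,join_by(from,to)) %>%
  mutate(word_dist_x = (word_dist- min(word_dist))/(max(word_dist) - min(word_dist)),
         geo_dist_x = (geo_dist- min(geo_dist))/(max(geo_dist) - min(geo_dist)),
         dist = word_dist_x^2 + (geo_dist_x)^2) %>%
  mutate(across(c(from,to), as.integer)) %>%
  # Drop self-pairs and keep pairs closer than 10,000 coordinate units
  # (10 km if the coordinates are in metres, which is assumed here).
  filter(from != to,geo_dist < 10000) %>%
  # Attach both settlement names; the joins yield `שם ישוב.x` and `שם ישוב.y`.
  left_join(df %>% select(cntr,`שם ישוב`),join_by(from == cntr)) %>%
  left_join(df %>% select(cntr,`שם ישוב`),join_by(to == cntr)) %>%
  select(-from,-to,-word_dist_x,-geo_dist_x) %>%
  # Smallest combined distance first: nearby settlements with similar names.
  arrange(dist) %>% View()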
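
# --- Illustrative sketch, not part of the original analysis ---
# A minimal offline walk-through of the same scoring idea on three made-up
# points, so the combined distance can be checked by hand. The names `toy`,
# `rescale01`, `geo`, `word`, and `toy_pairs` are hypothetical and the
# coordinates are invented; only base R and stringdist are assumed.
library(stringdist)

toy <- data.frame(
  name = c("abcd", "abce", "wxyz"),
  x = c(0, 3, 300),
  y = c(0, 4, 400)
)

# Min-max rescaling to [0, 1], the same normalisation used above.
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# Full pairwise matrices: Euclidean distance and Jaro distance (p = 0).
geo <- as.matrix(dist(toy[, c("x", "y")]))
word <- stringdistmatrix(toy$name, toy$name, method = "jw")

# One row per ordered (from, to) pair, looked up by matrix position.
toy_pairs <- expand.grid(from = seq_len(nrow(toy)), to = seq_len(nrow(toy)))
toy_pairs$geo_dist <- geo[cbind(toy_pairs$from, toy_pairs$to)]
toy_pairs$word_dist <- word[cbind(toy_pairs$from, toy_pairs$to)]

# Combined score: sum of squared rescaled distances, self-pairs removed.
toy_pairs$dist <- rescale01(toy_pairs$word_dist)^2 + rescale01(toy_pairs$geo_dist)^2
toy_pairs <- toy_pairs[toy_pairs$from != toy_pairs$to, ]
toy_pairs <- toy_pairs[order(toy_pairs$dist), ]
print(toy_pairs)
# The pair "abcd"/"abce" should rank first: it is both the closest pair
# geographically and the most similar pair by name.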