Skip to content

Instantly share code, notes, and snippets.

@kez
Forked from ianks/slugify.sql
Created May 13, 2019 14:50
Show Gist options
  • Star 32 You must be signed in to star a gist
  • Fork 8 You must be signed in to fork a gist
  • Save kez/17638bade0382f820280dafa46277435 to your computer and use it in GitHub Desktop.
Save kez/17638bade0382f820280dafa46277435 to your computer and use it in GitHub Desktop.
Generating Slugs in Postgres
CREATE EXTENSION IF NOT EXISTS "unaccent"
CREATE OR REPLACE FUNCTION slugify("value" TEXT)
RETURNS TEXT AS $$
-- removes accents (diacritic signs) from a given string --
WITH "unaccented" AS (
SELECT unaccent("value") AS "value"
),
-- lowercases the string
"lowercase" AS (
SELECT lower("value") AS "value"
FROM "unaccented"
),
-- remove single and double quotes
"removed_quotes" AS (
SELECT regexp_replace("value", '[''"]+', '', 'gi') AS "value"
FROM "lowercase"
),
-- replaces anything that's not a letter, number, hyphen('-'), or underscore('_') with a hyphen('-')
"hyphenated" AS (
SELECT regexp_replace("value", '[^a-z0-9\\-_]+', '-', 'gi') AS "value"
FROM "removed_quotes"
),
-- trims hyphens('-') if they exist on the head or tail of the string
"trimmed" AS (
SELECT regexp_replace(regexp_replace("value", '\-+$', ''), '^\-', '') AS "value"
FROM "hyphenated"
)
SELECT "value" FROM "trimmed";
$$ LANGUAGE SQL STRICT IMMUTABLE;
CREATE FUNCTION public.set_slug_from_name() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
NEW.slug := slugify(NEW.name);
RETURN NEW;
END
$$;
CREATE TRIGGER "trg_slug_insert"
BEFORE INSERT ON "my_table_name"
FOR EACH ROW
WHEN (NEW.name IS NOT NULL AND NEW.slug IS NULL)
EXECUTE PROCEDURE set_slug_from_name();
@kez
Copy link
Author

kez commented May 13, 2019

Forked to remove single and double quotes so they don't get switched to hyphens. Also updated the trailing/leading hyphen regex as the \\- was not working for me ('\- does).

@nickreese
Copy link

How would you rework this to make sure the slug is unique and if not add a value at the end?

@kez
Copy link
Author

kez commented Feb 14, 2020

At the moment the trigger and function is not aware of what table it's running against, so there's no immediately obvious way to check the uniqueness. You could use TG_RELID and pass through the table name to loop and check, but it's maybe not the best place for this logic.

A quick win would be to enforce uniqueness on the name column. Alternatively create the slug with the name and the primary key column:

NEW.slug := slugify(concat(NEW.name, '-', (NEW.id)::varchar);

Assuming you had a name of "Foo bar" and an ID of "1" this would give you "foo-bar-1" and the use of the primary key should always keep it unique, but you might not want to leak your primary key info like this.

@nickreese
Copy link

This is great. Thank you. I was looking to have another function that checked the table and looped through it to see if the value was unique but this is a good fit for now. You rock.

@augnustin
Copy link

Man, this is incredibly useful!! Thanks a lot, been searching for this for quite a while before I found your post: https://www.kdobson.net/2019/ultimate-postgresql-slug-function/

@kez
Copy link
Author

kez commented Apr 29, 2020

Man, this is incredibly useful!! Thanks a lot, been searching for this for quite a while before I found your post: https://www.kdobson.net/2019/ultimate-postgresql-slug-function/

Thanks augnustin. There's a function below which checks for duplicates then adds an incrementing number until it's unique. Not that well tested and probably not best practice:

`
CREATE or replace FUNCTION public.set_slug_from_name() RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
sql_string varchar;
tmp_slug varchar;
increment integer;
tmp_row record;
tmp_row_count integer;

BEGIN

tmp_row_count = 1;
increment = 0;
while tmp_row_count > 0 LOOP
  
  if increment > 0 then
    tmp_slug = slugify(NEW.name || ' ' || increment::varchar);
  ELSE
    tmp_slug = slugify(NEW.name);
  end if;

  sql_string = format(' select count(1) cnt from ' || TG_TABLE_NAME || ' where slug = ''' || tmp_slug || '''; ');
  for tmp_row in  execute(sql_string)
  loop
    raise notice '%', tmp_row;  
    tmp_row_count = tmp_row.cnt;  
  end loop;

increment = increment + 1;
END LOOP;

 NEW.slug := tmp_slug;
RETURN NEW;

END
$function$;
`

@augnustin
Copy link

Cheers @kez

I don't need that though, because in my case, I simply need to write things like select * from users u where slufigy(users.city) = 'paris';

@dantio
Copy link

dantio commented Dec 11, 2020

What do you think about this oneliner:
SELECT TRIM(BOTH '-_' FROM regexp_replace(unaccent(lower('VALUE')), '[^a-z0-9\\-_]+', '-', 'gi'));

  1. I think "remove single and double quotes" is not required, since the next regex will remove it anyway.
  2. the trims hyphens in your example does not remove underscore and TRIM is build in function in postgres.

@nickkang1
Copy link

If you don't care that the slug changes when the name changes, you can also make it a generated column since slugify(value text) is immutable.

CREATE TABLE person (
  name  text not null,
  slug  text generated always as (slugify(name)) stored
);

@fr-pagesurfer
Copy link

I've rewritten this in PL/PGSQL, making it more readable to me. Also added some german specific umlaut replacement rules, and declared it as PARALLEL SAFE which allows for a substantially neater and faster query plan in my case.

CREATE EXTENSION IF NOT EXISTS "unaccent";
CREATE OR REPLACE FUNCTION slugify(t text) RETURNS text
AS $$
BEGIN
	t := regexp_replace(t, '[Ää]', 'ae', 'g');
	t := regexp_replace(t, '[Öö]', 'oe', 'g');
	t := regexp_replace(t, '[Üü]', 'ue', 'g');
	t := unaccent(t);
	t := lower(t);
	t := regexp_replace(t, '[''"]+', '', 'gi');
	t := regexp_replace(t, '[^a-z0-9\\-_]+', '-', 'gi');
	t := regexp_replace(t, '\-+$', '');
	t := regexp_replace(t, '^\-', '');
	RETURN t;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE PARALLEL SAFE;

@daniel-finnan
Copy link

Thanks for sharing this modified version of @ianks function, the removal of the leading and trailing - didn't work for me. So I replaced it with the trim string function.

  "trimmed" AS (
    SELECT trim(both '-' from "value") AS "value"
    FROM "hyphenated"
  )

@NielsRenard
Copy link

the following chars are all not being filtered out:

\ backslash
] right square bracket
^ caret
_ underscore

The issue seems to be a slash too many, \\-_ will match characters from range "\" to "_" (char code 92 to 95).

Removing one of the slashes made it work as intended, i.e. :

  -- replaces anything that's not a letter, number, hyphen('-'), or underscore('_') with a hyphen('-')
  "hyphenated" AS (
    SELECT regexp_replace("value", '[^a-z0-9\-_]+', '-', 'gi') AS "value"
    FROM "removed_quotes"
  ),

@paulovieira
Copy link

the following chars are all not being filtered out:

\ backslash ] right square bracket ^ caret _ underscore [...]

Thank you @NielsRenard! With your fix the slugify function is now more robust for corner cases like "the title [something]".

@kez: can you please update the code in the gist please? This is a useful utility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment