Skip to content

Instantly share code, notes, and snippets.

@jdcrensh
Last active January 28, 2024 12:25
Show Gist options
  • Star 23 You must be signed in to star a gist
  • Fork 8 You must be signed in to fork a gist
  • Save jdcrensh/4670128 to your computer and use it in GitHub Desktop.
Save jdcrensh/4670128 to your computer and use it in GitHub Desktop.
Class to encode a string into base 62 (character set [a-zA-Z0-9]) with Java. A common use case is URL shortening.
public class Base62 {
public static final String ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
public static final int BASE = ALPHABET.length();
private Base62() {}
public static String fromBase10(int i) {
StringBuilder sb = new StringBuilder("");
if (i == 0) {
return "a";
}
while (i > 0) {
i = fromBase10(i, sb);
}
return sb.reverse().toString();
}
private static int fromBase10(int i, final StringBuilder sb) {
int rem = i % BASE;
sb.append(ALPHABET.charAt(rem));
return i / BASE;
}
public static int toBase10(String str) {
return toBase10(new StringBuilder(str).reverse().toString().toCharArray());
}
private static int toBase10(char[] chars) {
int n = 0;
for (int i = chars.length - 1; i >= 0; i--) {
n += toBase10(ALPHABET.indexOf(chars[i]), i);
}
return n;
}
private static int toBase10(int n, int pow) {
return n * (int) Math.pow(BASE, pow);
}
}
import org.junit.Test;
import static junit.framework.Assert.assertEquals;
public class Base62Test {
@Test
public void testCharList() throws Exception {
StringBuilder sb = new StringBuilder();
for (char c = 'a'; c <= 'z'; c++) {
sb.append(c);
}
for (char c = 'A'; c <= 'Z'; c++) {
sb.append(c);
}
for (int i = 0; i <= 9; i++) {
sb.append(i);
}
assertEquals(sb.toString(), Base62.ALPHABET);
}
@Test
public void testStringFromInt() throws Exception {
int n = 0;
String str = "6JaY2";
char[] chars = str.toCharArray();
n += Base62.ALPHABET.indexOf(chars[0]) * (int) Math.pow(62, 4);
n += Base62.ALPHABET.indexOf(chars[1]) * (int) Math.pow(62, 3);
n += Base62.ALPHABET.indexOf(chars[2]) * (int) Math.pow(62, 2);
n += Base62.ALPHABET.indexOf(chars[3]) * (int) Math.pow(62, 1);
n += Base62.ALPHABET.indexOf(chars[4]) * (int) Math.pow(62, 0);
assertEquals(str, Base62.fromBase10(n));
}
@Test
public void testIntegerFromString() throws Exception {
assertEquals(125, Base62.toBase10("cb"));
}
@Test
public void testFromZero() {
assertEquals("a", Base62.fromBase10(0));
assertEquals("b", Base62.fromBase10(1));
}
}
@pajuey
Copy link

pajuey commented Jan 30, 2013

neat O_o

@gerrytan
Copy link

Awesome, thank you. Also using long instead of int might help the code to support larger number

@eugenedw
Copy link

great work - thanks!

@jayceekay
Copy link

jayceekay commented Apr 26, 2016

Base62.fromBase10(0) returns "" but shouldn't it return "a"?
Base62.fromBase10(1) returns "b", which i'd expect.

i think the issue is the loop on https://gist.github.com/jdcrensh/4670128#file-base62-java-L13 ?

@guidomedina
Copy link

guidomedina commented Apr 30, 2016

You could do it like this to benefit to be able to compensate, it'll make it extremely fast,
also, calculating the next power is expensive so why not cache the previous one?

  /**
   * Base62 characters table sorted to quickly calculate decimal equivalency by compensating.
   */
  static final char[] BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".toCharArray();

  /**
   * Returns the base 62 string of an integer.
   *
   * @return the base 62 string of an integer
   */
  public static String base62(int value) {
    final StringBuilder sb = new StringBuilder(1);
    do {
      sb.append(BASE62[value % 62]);
      value /= 62;
    } while (value > 0);
    return sb.reverse().toString();
  }

  /**
   * Returns the base 62 value of a string.
   *
   * @return the base 62 value of a string.
   */
  public static int base62(String value) {
    int result = 0;
    int power = 1;
    for (int i = value.length() - 1; i >= 0; i--) {
      int digit = value.charAt(i) - 48;
      if (digit > 42) {
        digit -= 13;
      } else if (digit > 9) {
        digit -= 7;
      }
      result += digit * power;
      power *= 62;
    }
    return result;
  }

@jdcrensh
Copy link
Author

jdcrensh commented May 5, 2016

jayceekay - you're right, that was definitely an oversight. Added a check for when zero (since zero can't be passed to fromBase10(i, sb)).

@jdcrensh
Copy link
Author

jdcrensh commented May 5, 2016

guidomedina - nice solution!

@nirupam89
Copy link

any string that starts with 'a' is not working

@npetzall
Copy link

@guidomedina I would like to use your code in SchemaSpy, would it be ok? You'll retain copyright and it will be distributed under the terms of LGPL v3 or later.

@cicidi
Copy link

cicidi commented Jan 17, 2021

nice practice I think that makes more sense start with 0, instead of 'a'

@D00mch
Copy link

D00mch commented Oct 5, 2022

about @guidomedina example above. Be careful as his base62(int) method works with O(n^2) asymptotics, because sb.insert(0, ..) does system.arraycopy under the hood. It's better to do sb.append(...) and reverse the result at the end. We would have O(2n) which is O(n) asymptotically.

* assuming math operations are performed in constant time.

@guidomedina
Copy link

guidomedina commented Oct 5, 2022

Edited my initial code sample to append and reverse ;-)

@npetzall I'm so sorry I missed your message, feel free, this is just a simple method, feel free to do with it whatever you want/need

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment