@paranoiq
Last active January 3, 2024 14:30
public RSA key validation regexp
#ssh-rsa AAAA[0-9A-Za-z+/]+[=]{0,3} ([^@]+@[^@]+)#
// This is the simplest case. See more complete regexps in the comments below.
// http://generator.my-addr.com/generate_ssh_public_rsa_key-private_rsa_key-ssh_pair_online_tool.php
// https://help.ubuntu.com/community/SSH/OpenSSH/Keys
// http://www.ietf.org/rfc/rfc4716.txt
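
As a rough illustration, the pattern above could be applied in Python along these lines. This is only a sketch: the `#` delimiters are PHP-style and are dropped, the `^`/`$` anchors are an addition that is not in the gist, and the names `SSH_RSA_RE` and `looks_like_rsa_public_key` are made up for this example.

```python
import re

# Python translation of the gist's pattern. The '#' delimiters are omitted and
# the ^...$ anchors are an assumption added for whole-line matching.
SSH_RSA_RE = re.compile(r"^ssh-rsa AAAA[0-9A-Za-z+/]+[=]{0,3} ([^@]+@[^@]+)$")


def looks_like_rsa_public_key(line: str) -> bool:
    """Return True if the line roughly has the shape 'ssh-rsa <base64> user@host'.

    This is a shape check only; it does not decode or verify the key material.
    """
    return SSH_RSA_RE.match(line.strip()) is not None


if __name__ == "__main__":
    # The key material below is a shortened dummy value, not a real key.
    print(looks_like_rsa_public_key("ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAB user@example.com"))
    print(looks_like_rsa_public_key("ssh-rsa not-base64 user@example.com"))
```

Note that this pattern requires a trailing `user@host` comment, so a key line without a comment is rejected even though the comment is optional in practice.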

MaPePeR commented Jun 19, 2022

@nemchik I left the last character out intentionally, because not all bits in that Base64 character are part of the "known header data". Its last two bits come from whatever follows the header (shown as zero padding below), and they determine which character is emitted:

\0      \0      \0      \x07    s       s       h       -       r       s       a
|------||------||------||------||------||------||------||------||------||------||------|PADDING with 0
0000000000000000000000000000011101110011011100110110100000101101011100100111001101100001??
|----||----||----||----||----||----||----||----||----||----||----||----||----||----||----|
A     A     A     A     B     3     N     z     a     C     1     y     c     2     E?

Forcing the last character to be an E also requires the data that follows to start with the bits 00. I didn't know whether that is always the case, so I didn't want to add that constraint to the regex. As far as I know, any of these characters could be valid:

Last header sextet    Base64 character
000100                E
000101                F
000110                G
000111                H
I didn't see the point in going through all of the key types to determine these extra valid characters when the regex can only give you a rough guideline anyway. But maybe there is a constraint I don't know of that forces these bits to always be 0? In that case, keeping the E in the regex would be fine.

>>> import base64
>>> base64.b64encode(b"\0\0\0\x07ssh-rsa\x00")
b'AAAAB3NzaC1yc2EA'
>>> base64.b64encode(b"\0\0\0\x07ssh-rsa\xFF")
b'AAAAB3NzaC1yc2H/'
#               ^- the 15th character is an H here, not an E.

(I only picked RSA as the example here because it was the shortest. The same applies to all of the other key types, of course.)
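
To make the "same applies to all of them" part concrete, here is a small sketch that computes, for a given key type name, which Base64 characters are fully determined by the header and which characters the next position could take. The function name `header_prefix` is made up for this example, and it assumes the header is simply a big-endian 32-bit length followed by the ASCII type name, as in the diagram above.

```python
import base64
import struct
from typing import Tuple

B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def header_prefix(key_type: str) -> Tuple[str, str]:
    """Return (fixed_prefix, candidates) for an SSH key type name.

    fixed_prefix: Base64 characters fully determined by the header bytes.
    candidates:   possible values of the next character, whose low bits
                  depend on the data following the header ("" if the
                  header ends exactly on a sextet boundary).
    """
    name = key_type.encode("ascii")
    header = struct.pack(">I", len(name)) + name  # uint32 length + type name
    full_sextets, leftover_bits = divmod(len(header) * 8, 6)

    # Append zero bytes so the undetermined bits of the boundary character are 0.
    encoded = base64.b64encode(header + b"\x00\x00").decode("ascii")
    fixed = encoded[:full_sextets]
    if leftover_bits == 0:
        return fixed, ""

    base = B64_ALPHABET.index(encoded[full_sextets])
    free_bits = 6 - leftover_bits  # bits supplied by the data after the header
    candidates = "".join(B64_ALPHABET[base + i] for i in range(1 << free_bits))
    return fixed, candidates


if __name__ == "__main__":
    for key_type in ("ssh-rsa", "ssh-dss", "ssh-ed25519"):
        print(key_type, header_prefix(key_type))
```

For `ssh-rsa` this yields the fixed prefix `AAAAB3NzaC1yc2` plus the candidates `EFGH` from the table above, so a regex could use `[E-H]` at that position instead of a literal `E`. As far as I can tell, the field that follows the type name in the key blob is itself length-prefixed with a 32-bit integer, so its first byte would be zero for any realistically sized key, which would explain why real keys show the first candidate; treat that as an assumption rather than a guarantee.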


nemchik commented Jun 19, 2022

@MaPePeR thanks for the follow-up! That's super helpful! This comment thread is getting a bit large, so I'm considering making a repository where this could be maintained and where issues and pull requests could be opened to improve the information.

I'd want to include information such as removing characters from the end for the reasons you mentioned.


nemchik commented Jun 20, 2022

I've put together a repository here: https://github.com/nemchik/ssh-key-regex
I've credited @paranoiq for this gist and @MaPePeR for all of the amazing information provided here in the comments.

If I am made aware of new supported key types I will be happy to update the information or accept pull requests to improve.

A potential future goal is to use GitHub Pages to present the information as a webpage with nicer formatting.