@DavidBuchanan314
Last active June 26, 2024 12:05
Rabbit R1 Unofficial API Docs

The Rabbit R1 uses a few custom APIs to talk to The Cloud™. Almost nothing happens on-device, and all the AI magic happens on servers.

Consequently, you don't really need the physical device.

TLS Client Fingerprinting

In lieu of an authentication scheme, Rabbit's servers attempt to verify device authenticity by checking the TLS client's JA3 fingerprint, presumably enforced by AWS WAF.

If your TLS client doesn't match an expected fingerprint, you'll get HTTP 403 errors. This fingerprint works:

771,4865-4866-4867-49195-49196-52393-49199-49200-52392-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-51-45-43-21,29-23-24,0

This is a common fingerprint for Android devices and is not exclusive to the R1.
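To make the fingerprint string a little less opaque, here is a minimal sketch that splits it into the five standard JA3 fields (TLS version, cipher suites, extensions, elliptic curves, point formats) and computes the MD5 digest that tools usually display as "the JA3 hash". The helper names `parse_ja3` and `ja3_hash` are made up for illustration. This only inspects the string; it does not make your TLS client actually present this fingerprint.

```python
import hashlib

# The fingerprint string quoted above, split across two literals for readability.
JA3 = (
    "771,4865-4866-4867-49195-49196-52393-49199-49200-52392-49171-49172-156-157-47-53,"
    "0-23-65281-10-11-35-16-5-13-51-45-43-21,29-23-24,0"
)


def _to_ints(field: str) -> list:
    """Turn a dash-separated JA3 field into a list of ints (empty field -> [])."""
    return [int(x) for x in field.split("-")] if field else []


def parse_ja3(ja3: str) -> dict:
    """Split a JA3 string into its five comma-separated fields."""
    version, ciphers, extensions, curves, point_formats = ja3.split(",")
    return {
        "tls_version": int(version),  # 771 == 0x0303, i.e. TLS 1.2
        "ciphers": _to_ints(ciphers),
        "extensions": _to_ints(extensions),
        "curves": _to_ints(curves),
        "point_formats": _to_ints(point_formats),
    }


def ja3_hash(ja3: str) -> str:
    """The JA3 hash is the MD5 hex digest of the full JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


if __name__ == "__main__":
    print(parse_ja3(JA3))
    print(ja3_hash(JA3))
```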

I use utls to replicate this fingerprint like so (drop this into ja3proxy)

Account Activation

Visit https://rabbit.tech/activate in a web browser and set up an account. Follow the registration process, and you should end up with a QR code.

Decode the QR. I like to use zxing. You should get a URL that looks like:

https://hole.rabbit.tech/apis/linkDevice?userId=auth0%7Crandomhex&linkingPasscode=randomhex

The URL must first be modified by appending a deviceId parameter, set to a 15-digit decimal number. It's supposed to be an IMEI, but you can use any value. Alternatively, generate a more realistic value with this.
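As a rough sketch of that step: the snippet below appends deviceId to the decoded QR URL and, optionally, generates a 15-digit value whose last digit is a valid Luhn check digit, which is what real IMEIs use. The function names are hypothetical, and the server may well accept values that fail the Luhn check (the text above suggests it does).

```python
import random
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def luhn_check_digit(digits: str) -> str:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def random_imei(rng=random) -> str:
    """14 random digits plus a Luhn check digit (not a real allocated IMEI)."""
    body = "".join(str(rng.randrange(10)) for _ in range(14))
    return body + luhn_check_digit(body)


def add_device_id(link_url: str, device_id: str) -> str:
    """Append a deviceId query parameter to the linkDevice URL from the QR code."""
    parts = urlsplit(link_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("deviceId", device_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    qr_url = (
        "https://hole.rabbit.tech/apis/linkDevice"
        "?userId=auth0%7Crandomhex&linkingPasscode=randomhex"
    )
    print(add_device_id(qr_url, random_imei()))
```

Remember that the GET request itself still has to go through a TLS client with an accepted fingerprint, so a plain Python HTTP client pointed straight at the server is likely to get a 403.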

Make an HTTP GET request to the modified URL, and on success you'll get a JSON response that looks like this:

{"actualUserId":"auth0|randomhex","userId":"randomhex","accountKey":"randomhex","userName":"blah"}

Keep these values safe, particularly accountKey; you'll need them later.

The account is now activated, and your browser session should have access to the "rabbithole", where you can let them skim your creds for 3rd party service integrations over VNC, and other such features.

The API

The main API is a JSON-based RPC-like mechanism running over a websocket, at wss://r1-api.rabbit.tech/session

The API is clearly based on the GAMA NPC "Quantum Engine AI" integration thing, which you can find partial docs for here (paste it into https://studio.asyncapi.com/), but this is more of a curiosity than useful documentation.

You'll need to set a couple of HTTP headers before it'll work: App-Version and OS-Version. Valid values for these fields change in each update, so I won't list them here, but maybe someone will be nice and leave currently-working values in the comments. It sounds like OS-Version is the more important of the two; App-Version may not matter.

Device-Health (UNTESTED)

In newer updates (v0.8.99+) a timestamp string in the format rabbit_OS_v0.8.99_20240606175556,YYYYMMDDHHMMSSmmm,xx (where mmm is milliseconds, xx is a random 2-digit even integer) is encrypted with the following RSA-3072 public key:

-----BEGIN PUBLIC KEY-----
MIIBojANBgkqhkiG9w0BAQEFAAOCAY8AMIIBigKCAYEAqLNRPcujKw1elkNJc+10
o37YVbb7OjYa4Cv2pG2BzfSV3Ev7LMvaA2w0PAy25DhQU2NI7RU2a51OvTz0DsXM
69oakuN0oSrKa9Eit2GPnX89H702MXGXiRDZWEufAx67AaxK9d80Bajh2Abn06Bw
az9Z4D8vMxUOGsYkVKMW0LrmnW4984XIUqT3+lOiEijBamodU/mORTeuxc5cdan0
0fq8qTOYuGFuKlPJSI3EExFHP3ONHD6z44+PxXmhw532uAiNnT74yKXBoVYU19b8
AAWLiSKyjf1eeus7dTobPKcpMemlJgxHtVHtaSgnUugQ0a3XvmTVQpSeytPw8bL+
/3c5KXfjGxPchoEZi7d71wv/AufDiSXrgaew1KfJZBsr8Somr03b8xsHRJruPT61
iPceh9bTWscwnK3WmDpAxnjdPQiflt/mKkPEETtKGx0X5kUImHnr1jhUdYKmEOHf
wkXBKVc66hpn85WGJ7MPVyixIOpzScAYnKjVsP4ma6iFAgMBAAE=
-----END PUBLIC KEY-----

in RSA_PKCS1_OAEP_PADDING mode (MGF1, SHA1). The resulting value is base64 encoded and stored in the Device-Health header. It's unclear how this measures the health of a device, but it's a feature nonetheless.
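Here is a small sketch of just the plaintext that gets encrypted, following the format string above. Several things are guesses: that the timestamp is in UTC, that the "random 2-digit even integer" is drawn from 10 to 98, and that the leading OS build string is sent verbatim. The RSA-OAEP encryption and base64 steps are deliberately left out because they need a third-party crypto library; `device_health_plaintext` is a made-up name.

```python
import random
from datetime import datetime, timezone

# Copied from the format string above; presumably changes with each OS update.
OS_BUILD = "rabbit_OS_v0.8.99_20240606175556"


def device_health_plaintext(now=None, rng=random) -> str:
    """Build 'OS_BUILD,YYYYMMDDHHMMSSmmm,xx' (the value before encryption)."""
    if now is None:
        now = datetime.now(timezone.utc)  # assumption: UTC, not local time
    # strftime has no millisecond directive, so derive mmm from microseconds.
    stamp = now.strftime("%Y%m%d%H%M%S") + f"{now.microsecond // 1000:03d}"
    nonce = rng.randrange(10, 100, 2)  # assumption: even integer in 10..98
    return f"{OS_BUILD},{stamp},{nonce}"


if __name__ == "__main__":
    print(device_health_plaintext())
```

The returned string would then be encrypted with the public key above (OAEP, MGF1, SHA1), base64 encoded, and sent as the Device-Health header.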

Example code

I haven't thoroughly tested this yet. At present, the API doesn't seem to mind whether Device-Health is correct, or specified at all.

Authentication

To authenticate, send a JSON blob that looks like this:

{
	"global": {
		"initialize": {
			"deviceId": IMEI,
			"evaluate": false,
			"greet": true,
			"language": "en",
			"listening": true,
			"location": {
				"latitude": 0.0,
				"longitude": 0.0
			},
			"mimeType": "wav",
			"timeZone": "GMT",
			"token": "rabbit-account-key+" + ACCOUNT_KEY
		}
	}
}

deviceId is whatever you used during activation, and ACCOUNT_KEY is the value of accountKey from the activation response message. Use your imagination for the other fields (I haven't figured out precisely what "listening", "greet" or "evaluate" do yet).

This should be the first thing you send after initiating the websocket connection.
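A minimal sketch of building that blob in Python, so the placeholders become concrete. `build_initialize` is a hypothetical helper, and sending deviceId as a JSON string (rather than a number) is my assumption, since the template above doesn't say. The other field values are simply copied from the example.

```python
import json


def build_initialize(device_id: str, account_key: str,
                     language: str = "en", time_zone: str = "GMT") -> str:
    """Serialize the first message to send after the websocket opens."""
    message = {
        "global": {
            "initialize": {
                "deviceId": device_id,  # assumption: sent as a string
                "evaluate": False,
                "greet": True,
                "language": language,
                "listening": True,
                "location": {"latitude": 0.0, "longitude": 0.0},
                "mimeType": "wav",
                "timeZone": time_zone,
                # The token is the literal prefix followed by accountKey.
                "token": "rabbit-account-key+" + account_key,
            }
        }
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(build_initialize("490154203237518", "randomhex"))
```

The resulting string is what you would pass to your websocket client's send call as a text frame.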

Send "Terminal" Text Input

Send a JSON message that looks like this:

{
  "kernel": {
    "userText": {
      "text": INPUT
    }
  }
}

Receive "Terminal" Text Output

Text-based responses look like this:

{"kernel": {"assistantResponse": OUTPUT}}
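The two "terminal" messages can be sketched as a pair of helpers: one builds the outgoing userText message, the other pulls assistantResponse out of an incoming frame and returns None for anything else (voice output, binary frames, and so on). The function names are invented here, and the server may send other message types that this ignores.

```python
import json


def make_user_text(text: str) -> str:
    """Build the kernel.userText message for a line of typed input."""
    return json.dumps({"kernel": {"userText": {"text": text}}})


def extract_assistant_response(raw):
    """Return kernel.assistantResponse from a received frame, or None."""
    try:
        msg = json.loads(raw)
    except (TypeError, ValueError):
        return None  # not JSON (for example, a binary frame)
    kernel = msg.get("kernel") if isinstance(msg, dict) else None
    if isinstance(kernel, dict):
        return kernel.get("assistantResponse")
    return None


if __name__ == "__main__":
    print(make_user_text("What's the weather like?"))
    print(extract_assistant_response('{"kernel": {"assistantResponse": "Hi!"}}'))
```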

Receive Voice Output

Example output:

{
  "kernel": {
    "assistantResponseDevice": {
      "text": {
        "language":"en",
        "chars":[" ","H","e","l","l","o",","," ","h","o","w"," ","c","a","n"," ","I"," ","a","s","s","i","s","t"," ","y","o","u"," ","t","o","d","a","y","?"],
        "char_start_times_ms":[0,...],
        "char_durations_ms":[0,...]
      },
      "audio": BASE64_WAV,
      "canned": false
    }
  }
}

NOTE: The text field is actually a stringified JSON object, I'm showing it as plain JSON above for clarity.
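Because of that double encoding, handling a voice message takes two JSON parses. The sketch below, with a hypothetical `decode_voice_output` helper, parses the outer message, parses the stringified text field, joins chars into a transcript, and base64-decodes the audio. It assumes every field shown in the example is present.

```python
import base64
import json


def decode_voice_output(raw: str) -> dict:
    """Unpack a kernel.assistantResponseDevice message."""
    device = json.loads(raw)["kernel"]["assistantResponseDevice"]
    text = device["text"]
    if isinstance(text, str):
        # The text field arrives as a JSON string, so parse it a second time.
        text = json.loads(text)
    return {
        "transcript": "".join(text["chars"]),
        "language": text["language"],
        "start_times_ms": text["char_start_times_ms"],
        "durations_ms": text["char_durations_ms"],
        "wav": base64.b64decode(device["audio"]),
        "canned": device["canned"],
    }


if __name__ == "__main__":
    inner = json.dumps({
        "language": "en",
        "chars": ["H", "i"],
        "char_start_times_ms": [0, 50],
        "char_durations_ms": [50, 50],
    })
    frame = json.dumps({"kernel": {"assistantResponseDevice": {
        "text": inner,
        "audio": base64.b64encode(b"RIFF").decode("ascii"),
        "canned": False,
    }}})
    print(decode_voice_output(frame)["transcript"])
```

The per-character timing arrays could be used to drive a typewriter-style display in sync with audio playback, though that is only a guess at their purpose.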

I wonder what the canned field indicates?

Set Push-to-Talk State

Send a JSON message like this:

{
  "kernel": {
    "voiceActivity": {
      "imageBase64": "",
      "state": STATE
    }
  }
}

Where STATE is one of: inactive, pttButtonPressed, pttButtonReleased.
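A small sketch of a builder for this message that rejects any state outside the three listed above. `make_ptt` and `PTT_STATES` are names I've made up for the example.

```python
import json

# The three states listed above.
PTT_STATES = ("inactive", "pttButtonPressed", "pttButtonReleased")


def make_ptt(state: str, image_base64: str = "") -> str:
    """Build a kernel.voiceActivity message for the given push-to-talk state."""
    if state not in PTT_STATES:
        raise ValueError(f"unknown PTT state: {state!r}")
    return json.dumps({
        "kernel": {
            "voiceActivity": {
                "imageBase64": image_base64,
                "state": state,
            }
        }
    })


if __name__ == "__main__":
    print(make_ptt("pttButtonPressed"))
```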

Streaming Voice Input

Set PTT state to pressed, then send 0.1 second chunks of uncompressed WAV as bytes directly down the websocket, then set PTT state to released. It looks like it uses 16kHz stereo, 16-bit samples.
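For the chunking step, 0.1 seconds of 16 kHz, stereo, 16-bit audio works out to 1600 frames × 2 channels × 2 bytes = 6400 bytes. The sketch below slices a byte buffer accordingly. It treats the input as raw sample data; whether the stream should start with a WAV header, and whether a short final chunk is acceptable, are things I haven't verified. `chunk_audio` is a hypothetical helper.

```python
def chunk_audio(pcm: bytes, rate: int = 16000, channels: int = 2,
                sample_width: int = 2, seconds: float = 0.1) -> list:
    """Split audio bytes into fixed-duration chunks (the last may be shorter)."""
    frames_per_chunk = round(rate * seconds)
    chunk_size = frames_per_chunk * channels * sample_width
    return [pcm[i:i + chunk_size] for i in range(0, len(pcm), chunk_size)]


if __name__ == "__main__":
    one_second_of_silence = bytes(16000 * 2 * 2)
    chunks = chunk_audio(one_second_of_silence)
    print(len(chunks), len(chunks[0]))
```

Each chunk would then be sent as a binary websocket frame between the pttButtonPressed and pttButtonReleased messages.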

Image Input (UNTESTED)

Send a JPEG file encoded as a base64 data URI, nominally 1080x720px at 100% quality (other resolutions, qualities, and formats presumably work too), in the imageBase64 field of a pttButtonReleased PTT message. The image is sent along with a voice input as described above.

{
  "kernel": {
    "voiceActivity": {
      "imageBase64": "",
      "state": "pttButtonReleased"
    }
  }
}
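A sketch of how the image field might be filled in. The exact data URI prefix (`data:image/jpeg;base64,`) is the standard form for a JPEG, but I'm assuming that is what the server expects; like the rest of this section, it is untested. `jpeg_data_uri` and `make_image_ptt` are made-up names.

```python
import base64
import json


def jpeg_data_uri(jpeg_bytes: bytes) -> str:
    """Encode raw JPEG bytes as a base64 data URI."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return "data:image/jpeg;base64," + encoded  # assumption: standard prefix


def make_image_ptt(jpeg_bytes: bytes) -> str:
    """Build the pttButtonReleased message carrying an image."""
    return json.dumps({
        "kernel": {
            "voiceActivity": {
                "imageBase64": jpeg_data_uri(jpeg_bytes),
                "state": "pttButtonReleased",
            }
        }
    })


if __name__ == "__main__":
    # The first three bytes of any JPEG file (the SOI marker plus 0xFF).
    print(jpeg_data_uri(b"\xff\xd8\xff"))
```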
@Pinball3D

Holy f*** this is amazing

@Pinball3D

Pinball3D commented May 28, 2024

Hey, if you see this, how do I set up the proxy? I'm on macOS. I replaced the proxy.go file with the one you provided and ran make, then ./ja3proxy, and it says this: "2024/05/27 19:55:15 copy client to dest error: read tcp 192.168.1.52:52301->18.160.41.19:443: use of closed network connection"

EDIT: Got it working, authenticated my "r1" now just need to get the wss working

@Proton0

Proton0 commented Jun 1, 2024

@Pinball3D how did you manage to fix it? I am on MacOS and same issue

@Pinball3D

@Pinball3D how did you manage to fix it? I am on MacOS and same issue

I’m not entirely sure how I fixed it to be honest. I think I might have rebuilt the executable or something. I will look when I am on my computer later.

@DarthChief394

Has anyone managed to authenticate? I can't figure out how to send the JSON blob
