@danielrosehill
Last active November 25, 2025 15:23
Wayland Virtual Keyboard Solutions on Ubuntu 25.10 KDE Plasma - What Works


System Configuration

  • OS: Ubuntu 25.10
  • Desktop Environment: KDE Plasma
  • Display Server: Wayland
  • Date Tested: November 2025
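To check whether another machine matches this configuration, the session type and desktop can be read from the standard XDG environment variables. This is a minimal sketch; the function name `describe_session` is made up for illustration, and either variable may be unset outside a graphical login.

```shell
# Print the session type and desktop as reported by the login session.
# XDG_SESSION_TYPE is typically "wayland" or "x11"; XDG_CURRENT_DESKTOP
# typically contains "KDE" on Plasma. Unset values are shown as "unknown".
describe_session() {
    printf 'session=%s desktop=%s\n' \
        "${XDG_SESSION_TYPE:-unknown}" \
        "${XDG_CURRENT_DESKTOP:-unknown}"
}
```

Running `describe_session` in a terminal on a setup like the one above should report a wayland session and a KDE desktop.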

The Problem

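Typing text programmatically (virtual keyboard input) on Wayland is notoriously difficult. Tools that worked on X11 typically relied on the X server's XTEST extension to inject events into any window. Wayland deliberately gives ordinary clients no equivalent, so input injection needs either a protocol the compositor chooses to support or a virtual device created at the kernel level.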

What DIDN'T Work

wtype

  • Package: wtype (available in Ubuntu repos)
  • Error: Compositor does not support the virtual keyboard protocol
  • Reason: KDE Plasma's Wayland compositor (KWin) doesn't offer the virtual-keyboard-unstable-v1 protocol (the zwp_virtual_keyboard_manager_v1 / zwp_virtual_keyboard_v1 interfaces) that wtype requires
  • Verdict: ❌ Does not work on KDE Plasma Wayland
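Whether a compositor offers this protocol can be checked up front by looking for the `zwp_virtual_keyboard_manager_v1` global in the listing that `wayland-info` prints. The sketch below assumes `wayland-info` is installed (on Ubuntu it is believed to ship in the `wayland-utils` package); `has_virtual_keyboard` is an illustrative name.

```shell
# Succeeds if the globals listing on stdin mentions the virtual keyboard
# manager interface that wtype binds to. Intended to be fed the output
# of `wayland-info`.
has_virtual_keyboard() {
    grep -q 'zwp_virtual_keyboard_manager_v1'
}
```

For example, `wayland-info | has_virtual_keyboard && echo "protocol offered" || echo "protocol missing"` should print "protocol missing" on a compositor where wtype fails with the error above.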

xdotool

  • Package: xdotool
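  • Reason: X11 only. It injects events through the X server, so it cannot reach native Wayland windows (at best it affects X11 applications running under XWayland)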
  • Verdict: ❌ Does not work on Wayland

What WORKED

Two solutions have been validated on this system. Both use kernel-level uinput to bypass Wayland's security restrictions.

Solution 1: waystt + ydotool ✅

Components:

  • waystt: Wayland speech-to-text tool that outputs transcribed text to stdout
  • ydotool: Virtual input tool using uinput
    • Package: ydotool + ydotoold (Ubuntu repos)

How it works:

  1. waystt records audio and sends it to the OpenAI Whisper API (or a local Whisper model)
  2. The transcribed text is written to stdout
  3. stdout is piped to ydotool, which types the text at the cursor location
  4. ydotool uses the Linux uinput subsystem to create a virtual keyboard at the kernel level

Installation:

# Install ydotool
sudo apt install ydotool ydotoold

# Clone and build waystt
git clone https://github.com/artur-shaik/waystt
cd waystt
cargo build --release

# Configure waystt
mkdir -p ~/.config/waystt
cp .env.example ~/.config/waystt/.env
# Edit .env with your OpenAI API key

Setup Requirements for ydotool:

# User must be in input group
sudo usermod -aG input $USER
# Log out and back in

# Verify uinput permissions
ls -la /dev/uinput
# Should show: crw-rw----+ 1 root input ...

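# Check that the ydotoold socket exists (ydotoold must be running).
# The default path can differ between ydotool versions; recent versions
# honor the YDOTOOL_SOCKET environment variable if it is set.
ls -la "${YDOTOOL_SOCKET:-/tmp/.ydotool_socket}"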
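
The three checks above can be bundled into one preflight function that reports which requirement is missing. This is a sketch under assumptions: the function names are illustrative, the socket path defaults to the one used in this guide, and `YDOTOOL_SOCKET` is assumed to be honored by the installed ydotool version.

```shell
# Succeeds if group $1 appears in the space-separated list $2
# (defaults to the current process's groups from `id -nG`).
in_group() {
    group_list=${2:-$(id -nG)}
    case " $group_list " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the socket path to check: $YDOTOOL_SOCKET if set, otherwise the
# path used in this guide (an assumption; it may differ by version).
ydotool_socket_path() {
    printf '%s\n' "${YDOTOOL_SOCKET:-/tmp/.ydotool_socket}"
}

# Run all three checks, report each result, return non-zero if any fails.
ydotool_preflight() {
    status=0
    if in_group input; then echo "ok: user is in the input group"
    else echo "FAIL: user is not in the input group"; status=1; fi
    if [ -w /dev/uinput ]; then echo "ok: /dev/uinput is writable"
    else echo "FAIL: /dev/uinput is not writable"; status=1; fi
    if [ -S "$(ydotool_socket_path)" ]; then echo "ok: ydotoold socket found"
    else echo "FAIL: no socket at $(ydotool_socket_path)"; status=1; fi
    return "$status"
}
```

Because `in_group` looks at the groups of the running session, it keeps failing after `usermod` until you log out and back in, which matches the requirement above.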

Usage:

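# Start waystt with its output piped to ydotool. Run this from the waystt
# checkout: cargo places the release binary under target/release/.
nohup sh -c './target/release/waystt 2>/tmp/waystt.log | while IFS= read -r line; do ydotool type -- "$line"; done' &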

# Click in a text field, speak, then trigger transcription
pkill --signal SIGUSR1 waystt
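
The inline loop in the usage above can be pulled out into a small function. That makes it easy to skip blank lines and to dry-run the pipeline with a stand-in command before letting it type into a real window. This is a sketch; `type_lines` is a name chosen here for illustration, and with no arguments it calls `ydotool type --` as in the usage above.

```shell
# Read lines from stdin and pass each non-empty one to a typing command.
# The command is taken from the arguments; with none given, it defaults
# to `ydotool type --`.
type_lines() {
    [ "$#" -gt 0 ] || set -- ydotool type --
    # The `|| [ -n "$line" ]` part also handles a final line that lacks
    # a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        # </dev/null keeps the typing command from consuming the piped input
        "$@" "$line" </dev/null
    done
}
```

Piping waystt's stdout into `type_lines` types each transcribed line, while `printf 'test line\n' | type_lines echo` only prints the line to the terminal, which is a safe way to check the plumbing first.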

Pros:

  • Uses OpenAI Whisper API (high accuracy)
  • Can also use local Whisper models
  • Flexible - stdout can be piped to other tools

Cons:

  • Requires manual signal to trigger transcription
  • Not real-time (batch transcription)

Solution 2: Deepgram Voice Keyboard (voice-keyboard-linux) ✅

Repo: https://github.com/deepgram-devs/voice-keyboard-linux

How it works:

  1. Creates a virtual keyboard device directly via /dev/uinput at kernel level
  2. Uses Deepgram's Flux API for real-time streaming STT
  3. Types incrementally as you speak (real-time corrections)

Architecture:

  • Starts with root privileges to create virtual keyboard via uinput
  • Drops privileges to access user's PipeWire/PulseAudio audio session
  • The kernel-level virtual input device looks like a hardware keyboard to the compositor, so it should work with any Linux application that accepts keyboard input

Installation:

git clone https://github.com/deepgram-devs/voice-keyboard-linux
cd voice-keyboard-linux
cargo build --release

# Run (requires sudo for uinput access)
DEEPGRAM_API_KEY=your-key-here sudo -E ./target/release/voice-keyboard

Pros:

  • Real-time streaming transcription
  • Incremental typing with smart corrections
  • No manual triggering needed

Cons:

  • Requires Deepgram API key (paid service)
  • Needs sudo to run (for uinput device creation)

Comparison Table

| Feature | waystt + ydotool | Deepgram Voice Keyboard |
|---|---|---|
| Input Method | Kernel uinput (via ydotool) | Kernel uinput (direct) |
| STT Provider | OpenAI Whisper / Local | Deepgram Flux |
| Real-time | No (batch) | Yes (streaming) |
| Trigger | Manual (SIGUSR1) | Automatic |
| Sudo Required | No | Yes |
| Works on KDE Plasma Wayland | ✅ Yes | ✅ Yes |

Why Kernel-Level uinput Works

Both solutions sidestep Wayland's restrictions on input injection by creating a virtual input device through /dev/uinput. The kernel presents that device to userspace the same way it presents a physical keyboard, so the compositor reads its events like any other hardware input. As a result:

  • Should work with any Wayland compositor (KDE, GNOME, Sway, etc.)
  • Works with any application that accepts keyboard input
  • No compositor protocol support needed
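
One way to see this in practice is to look at the kernel's own device list. Devices created through uinput have no physical bus behind them, so their sysfs path sits under `/devices/virtual/`. The sketch below prints the names of such devices from `/proc/bus/input/devices`; `list_virtual_inputs` is an illustrative name, and virtual devices unrelated to these tools may be listed as well.

```shell
# Print the names of input devices whose sysfs path is under
# /devices/virtual/. Reads /proc/bus/input/devices by default; a different
# file in the same format can be passed as $1.
list_virtual_inputs() {
    awk '
        # Remember the most recent device name, with surrounding quotes removed
        /^N: Name=/ { name = substr($0, 9); gsub(/"/, "", name) }
        # Print that name when the device turns out to be a virtual one
        /^S: Sysfs=\/devices\/virtual\// { print name }
    ' "${1:-/proc/bus/input/devices}"
}
```

With ydotoold or voice-keyboard running, `list_virtual_inputs` should include an entry for the virtual keyboard it created.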

Summary

| Tool | Works on KDE Plasma Wayland | Method |
|---|---|---|
| wtype | ❌ No | Wayland protocol (unsupported) |
| xdotool | ❌ No | X11 only |
| ydotool | ✅ Yes | Kernel uinput |
| voice-keyboard-linux | ✅ Yes | Kernel uinput |

This gist was generated by Claude Code. Please validate this information against your specific system configuration, as results may vary with different Ubuntu versions, desktop environments, or Wayland compositors.
