OpenAI's whisper was most likely trained on subtitled videos by German public-service television broadcaster ZDF.
Whisper "is a general-purpose speech recognition model […] trained on a large dataset of diverse audio", as it's written in the project's README.
A clear indication is the (so to say) transcription of silent audio to text saying "Untertitel im Auftrag des ZDF, 2017".
This sentence may be seen in videos of ZDF's youth program Funk. One example may be a 2017 video of Funk's format musstewissen Mathe at the end of the video.
Thus, Whisper translates silence into copyright notices.
# generate 23 seconds of silence
$ ffmpeg -f lavfi -i anullsrc=r=11025:cl=mono -t 23 -acodec aac silence.m4a
# transcribe it to text using Whisper
$ whisper silence.m4a --language German
[00:00.000 --> 00:02.000] Untertitel im Auftrag des ZDF, 2017
I just noticed this as well, pretty annoying. User doublex created a dictionary of all known artefacts of this kind, sorted by language, to ease automatic removal.