HuggingGPT paper, figure 4, part 1

From HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace, transcribed by Mark Wiemer. All efforts were made to keep text as written. Emojis have been used to replace icons in the response.


Query: generate a video with the title "An astronaut is walking in space" and dub it.

Response: Sure. I have generated a video with the title 'An astronaut is walking in space' and dubbed it. The workflow I used is as follows: First, I used the text-to-video model ' 🤗 damo-vilab/text-to-video-ms-1.7b' to generate a video from the text. The inference result is the generated video ' 📽️ /videos/5696.mp4'. Second, I used the text-to-speech model ' 🤗 facebook/tts_transformer-es-css10' to generate an audio from the text. The inference result is the generated audio ' 🎵 /audios/3f9e.flac '. Finally, I combined the video and audio to generate the dubbed video. I hope this answers your question. Is there anything else I can help you with?

[Series of images of an astronaut walking on a gray background with an audio indicator beneath]
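
The workflow in the response (two independent generation steps followed by a combining step) can be sketched as a small dependency-ordered task plan. This is only an illustrative sketch, not the paper's implementation: the `TASKS` structure, its field names, the `run_task` and `run_plan` helpers, and the placeholder output names are all assumptions introduced here, and the runner returns placeholder strings instead of calling any model.

```python
# Hypothetical sketch of the three-step workflow described in the response.
# Field names, helper functions, and output names are illustrative assumptions.
PROMPT = "An astronaut is walking in space"

TASKS = [
    {"id": 0, "task": "text-to-video",
     "model": "damo-vilab/text-to-video-ms-1.7b", "dep": []},
    {"id": 1, "task": "text-to-speech",
     "model": "facebook/tts_transformer-es-css10", "dep": []},
    # The final step depends on the outputs of both earlier steps.
    {"id": 2, "task": "combine-video-audio", "model": None, "dep": [0, 1]},
]


def run_task(task, prompt, results):
    """Stub runner: returns a placeholder artifact name; no model is called."""
    if task["task"] == "text-to-video":
        return "video.mp4"
    if task["task"] == "text-to-speech":
        return "audio.flac"
    if task["task"] == "combine-video-audio":
        video, audio = (results[d] for d in task["dep"])
        return f"dubbed({video}+{audio})"
    raise ValueError(f"unknown task type: {task['task']}")


def run_plan(tasks, prompt):
    """Run each task once all of its dependencies have produced a result."""
    results = {}
    pending = list(tasks)
    while pending:
        ready = [t for t in pending if all(d in results for d in t["dep"])]
        if not ready:
            raise ValueError("unsatisfied or cyclic dependency in task plan")
        for t in ready:
            results[t["id"]] = run_task(t, prompt, results)
            pending.remove(t)
    return results


if __name__ == "__main__":
    for task_id, artifact in sorted(run_plan(TASKS, PROMPT).items()):
        print(task_id, artifact)
```

Tasks 0 and 1 have no dependencies, so either could run first; only the combining step has to wait for both results.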
