Last active
March 27, 2023 00:32
A British voice assistant: a natural-speech frontend to GPT-3
r"""
GPT-3 British Voice Assistant

This project is a simple voice assistant that takes your natural speech as input
and returns a response generated by the GPT-3 AI model. The response is then
read aloud with a British accent.

Installation:

1. Create a virtual environment and activate it:

    python -m venv venv
    source venv/bin/activate  # Linux/Mac
    venv\Scripts\activate     # Windows

2. Install the required packages:

    pip install -r requirements.txt

3. Set your OpenAI API key as an environment variable:

    export OPENAI_API_KEY="your_api_key_here"  # Linux/Mac
    set OPENAI_API_KEY=your_api_key_here       # Windows (cmd keeps quotes as part of the value)

Finding the Audio Device Index:

Before running the script, you need to find the correct audio device index
for your microphone. Run the following Python script:

    import speech_recognition as sr

    for index, name in enumerate(sr.Microphone.list_microphone_names()):
        print(f"Microphone with name \"{name}\" found for `Microphone(device_index={index})`")

This will display a list of available microphones and their corresponding
device indices. Note the device index for the microphone you want to use.

Running the Voice Assistant:

1. Open the script and replace the `device_index` variable in the `speech_to_text()`
   function with your desired device index:

    device_index = YOUR_DESIRED_DEVICE_INDEX  # Replace with the desired device index

2. Run the script:

    python voice_assistant.py

The voice assistant will now listen for your questions and provide responses
generated by the GPT-3 AI model (`text-davinci-003`).

To exit the script, press `Ctrl + C` in the terminal.
"""
import os

import openai
import speech_recognition as sr
from gtts import gTTS
from pydub import AudioSegment
from pydub.playback import play

# Load the OpenAI API key from the environment (see installation step 3)
openai.api_key = os.getenv("OPENAI_API_KEY")
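

def text_to_speech_british(text, speed_multiplier=1.5):
    """Speak `text` aloud using gTTS with the co.uk (British) voice."""
    # Synthesize the speech and write it to disk as an MP3 file.
    tts = gTTS(text=text, lang='en', tld='co.uk', slow=False)
    tts.save('output.mp3')
    # Reload the file with pydub, speed it up, and play it.
    audio = AudioSegment.from_mp3('output.mp3')
    faster_audio = audio.speedup(playback_speed=speed_multiplier)
    play(faster_audio)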


def ask_gpt4(question):
    """Send `question` to the completion model and return its reply.

    Despite the function name, the request uses text-davinci-003 (GPT-3).
    """
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=question,
        max_tokens=100,
        n=1,
        stop=None,
        temperature=0.8,
    )
    return response.choices[0].text.strip()


def speech_to_text():
    """Record one utterance from the microphone; return its text, or None on failure."""
    recognizer = sr.Recognizer()
    device_index = 7  # Replace with your microphone's device index

    with sr.Microphone(device_index=device_index) as source:
        print("Please ask your question:")
        audio = recognizer.listen(source)

    try:
        text = recognizer.recognize_google(audio)
        print(f"Recognized question: {text}")
        return text
    except sr.UnknownValueError:
        print("Could not understand your speech.")
        return None
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
        return None


if __name__ == '__main__':
    while True:
        question = speech_to_text()
        if question:
            response = ask_gpt4(question)
            print(response)
            text_to_speech_british(response, speed_multiplier=1.5)
        else:
            print("Please try again.")
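
The docstring has you read the microphone list by eye and hard-code `device_index` in `speech_to_text()`. As a possible alternative, the sketch below picks the index by matching part of the device name. `find_device_index` is a hypothetical helper, not part of the original script, and it assumes the names come from `sr.Microphone.list_microphone_names()` in the same order the docstring's listing prints them.

```python
from typing import Optional, Sequence


def find_device_index(names: Sequence[str], wanted: str) -> Optional[int]:
    """Return the index of the first device whose name contains `wanted`.

    Hypothetical helper for illustration only. The match is
    case-insensitive; None means no device name matched.
    """
    needle = wanted.casefold()
    for index, name in enumerate(names):
        if needle in name.casefold():
            return index
    return None
```

Inside `speech_to_text()`, the hard-coded value could then become something like `find_device_index(sr.Microphone.list_microphone_names(), "usb")`. A result of `None` would make `sr.Microphone` fall back to the system default microphone, which may or may not be what you want.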
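# requirements.txt
openai<1.0  # the script calls openai.Completion, which the 1.x SDK removed
gtts
pydub
SpeechRecognition
PyAudio  # needed by sr.Microphone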
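
Before starting the assistant, it can help to confirm that the packages listed above were actually installed into the active virtual environment. The following is a minimal sketch using only the standard library; `missing_modules` and `REQUIRED` are illustrative names, not part of the original gist. Note that the `SpeechRecognition` package is imported as `speech_recognition`, as the script's own import shows.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Import names used by the script (not the pip package names).
REQUIRED = ["openai", "gtts", "pydub", "speech_recognition"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks that the modules can be located. It does not verify versions, the API key, or system tools that the audio libraries may rely on.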