How to Build an Advanced AI Voice Assistant in Python: Step-by-Step Guide

 Creating an Advanced AI Voice Assistant in Python

AI voice assistants like Siri, Alexa, and Google Assistant have become part of daily life. Building your own in Python is a practical way to learn about speech recognition, text-to-speech conversion, and natural language processing (NLP). In this guide, we’ll develop a voice assistant that handles simple tasks, answers queries, and holds natural conversations through the OpenAI API.


Prerequisites

Before we dive into the code, ensure you have the following:

  • Python 3.6 or higher.
  • A good microphone for input.
  • Internet connection (for the OpenAI API and other APIs).
  • Required libraries:
    bash
    pip install SpeechRecognition pyttsx3 pyaudio openai
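
Before moving on, it can help to confirm that the installs succeeded. The sketch below is not part of the original setup: `missing_modules` is a hypothetical helper that uses the standard library's `importlib.util.find_spec` to report which modules cannot be found. Note that the import names differ from some pip package names (`SpeechRecognition` is imported as `speech_recognition`, PyAudio as `pyaudio`).

```python
from importlib.util import find_spec

# Import names (not pip package names) used in this guide.
REQUIRED_MODULES = ["speech_recognition", "pyttsx3", "pyaudio", "openai"]

def missing_modules(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

If anything is reported missing, rerun the pip command above in the same Python environment you use to run the assistant.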


Step 1: Setting Up Text-to-Speech (TTS)

We'll use the pyttsx3 library to give our assistant a voice. This library supports multiple speech engines and allows us to control the speech rate, volume, and voice.

python
import pyttsx3

# Initialize the TTS engine
engine = pyttsx3.init()
engine.setProperty('rate', 150)    # Adjust the speaking speed
engine.setProperty('volume', 0.9)  # Set volume (0.0 to 1.0)

def speak(text):
    """Convert text to speech."""
    engine.say(text)
    engine.runAndWait()
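
The text mentions that the voice can be changed, but the code above only sets rate and volume. As a rough sketch, pyttsx3 exposes the installed voices through `engine.getProperty('voices')`, and each entry has `name` and `id` attributes. The helper `pick_voice_id` and the stand-in voice entries below are hypothetical; which voices actually exist depends on your operating system.

```python
from types import SimpleNamespace

def pick_voice_id(voices, name_fragment):
    """Return the id of the first voice whose name contains `name_fragment`.

    `voices` is any iterable of objects with `name` and `id` attributes,
    such as the list returned by engine.getProperty('voices').
    Returns None when nothing matches.
    """
    fragment = name_fragment.lower()
    for voice in voices:
        if fragment in (voice.name or "").lower():
            return voice.id
    return None

# Stand-in entries so the sketch runs without a speech engine installed.
demo_voices = [
    SimpleNamespace(id="voice.alpha", name="Example Voice Alpha"),
    SimpleNamespace(id="voice.beta", name="Example Voice Beta"),
]
print(pick_voice_id(demo_voices, "beta"))

# With the real engine from Step 1:
#   voice_id = pick_voice_id(engine.getProperty('voices'), "part of a voice name")
#   if voice_id is not None:
#       engine.setProperty('voice', voice_id)
```

Matching on a name fragment rather than a list index keeps the choice stable when the order of installed voices differs between machines.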

Step 2: Implementing Speech Recognition

To capture user commands, we use the speech_recognition library. This module converts spoken words into text.

python
import speech_recognition as sr

def listen():
    """Capture voice input and convert it to text."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        recognizer.adjust_for_ambient_noise(source, duration=1)
        audio = recognizer.listen(source)
    try:
        command = recognizer.recognize_google(audio)
        print(f"You said: {command}")
        return command.lower()
    except sr.UnknownValueError:
        speak("Sorry, I didn't catch that. Could you repeat?")
        return None
    except sr.RequestError:
        speak("Sorry, I am unable to access the service.")
        return None
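
Because `listen()` returns `None` when recognition fails, a caller may want to give the user a few attempts before giving up. The following is a minimal sketch of that idea, not part of the original code: `listen_with_retries` is a hypothetical wrapper, and a scripted stand-in replaces the microphone so the example runs anywhere.

```python
def listen_with_retries(listen_fn, max_attempts=3):
    """Call `listen_fn` until it returns text or the attempts run out.

    `listen_fn` is any zero-argument callable that returns a string on
    success and None on failure, like listen() from Step 2.
    """
    for _ in range(max_attempts):
        command = listen_fn()
        if command:
            return command
    return None

# Stand-in for listen(): fails twice, then "hears" a command.
_scripted = iter([None, None, "what time is it"])
result = listen_with_retries(lambda: next(_scripted))
print(result)  # what time is it
```

In the assistant itself, the call would be `listen_with_retries(listen)`.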

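Step 3: Integrating OpenAI GPT for Natural Language Understanding

For anything the assistant has no built-in command for, we send the user's words to an OpenAI model and speak the reply.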

python
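import openai

openai.api_key = "your_openai_api_key"  # Replace with your OpenAI API key

def get_gpt_response(prompt):
    """Get a response from the OpenAI model."""
    # Note: openai.Completion is the interface of openai releases before 1.0,
    # and text-davinci-003 has since been retired by OpenAI; substitute a
    # model that is currently available to your account.
    try:
        response = openai.Completion.create(
            engine="text-davinci-003",  # GPT model
            prompt=prompt,
            max_tokens=150,
            temperature=0.7,
        )
        return response.choices[0].text.strip()
    except Exception:
        return "Sorry, I couldn't process that request."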
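
The `openai.Completion` call above belongs to releases of the `openai` package before 1.0. From 1.0 onward the package exposes a client object with a `chat.completions.create` method. The sketch below shows roughly how the same function might look with that interface. The name `get_chat_response`, the default model name, and the system prompt are assumptions rather than part of the original article. The client is passed in as an argument so the function can be tried with an offline stand-in.

```python
from types import SimpleNamespace

def get_chat_response(client, prompt, model="gpt-4o-mini"):
    """Send `prompt` through a chat-completions client and return the reply text.

    `client` is expected to behave like openai.OpenAI() from openai>=1.0.
    The default model name is an assumption; use one your account can access.
    """
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": "You are a concise voice assistant."},
                {"role": "user", "content": prompt},
            ],
            max_tokens=150,
            temperature=0.7,
        )
        return response.choices[0].message.content.strip()
    except Exception as error:
        print(f"Request failed: {error}")
        return "Sorry, I couldn't process that request."

# Offline stand-in that mimics the response shape, so this runs without a key.
def _fake_create(**kwargs):
    message = SimpleNamespace(content="  Hello from the stand-in.  ")
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)
reply = get_chat_response(fake_client, "Say hello")
print(reply)  # Hello from the stand-in.

# With the real client (requires openai>=1.0 and an API key):
#   import os
#   from openai import OpenAI
#   client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
#   speak(get_chat_response(client, "Tell me a fun fact."))
```

Reading the key from an environment variable, as in the commented usage, avoids committing it to source control.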

Step 4: Handling Commands

We define a function to interpret and act on the user’s commands. The assistant can perform tasks like opening websites, checking the time, and more.

python
import os
import webbrowser
from datetime import datetime

def process_command(command):
    """Interpret and execute user commands."""
    if "open website" in command:
        speak("Which website should I open?")
        website = listen()
        if website:
            # Spoken names can contain spaces; remove them to form a hostname.
            webbrowser.open(f"https://{website.replace(' ', '')}.com")
            speak(f"Opening {website}.")
    elif "time" in command:
        now = datetime.now().strftime("%H:%M")
        speak(f"The current time is {now}.")
    elif "shutdown" in command:
        speak("Shutting down the system.")
        os.system("shutdown /s /t 1")  # Windows-only command
    elif "your name" in command:
        speak("I am your AI assistant. How can I help?")
    else:
        # Delegate to GPT for conversational responses
        response = get_gpt_response(command)
        speak(response)
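
As the number of commands grows, the `if`/`elif` chain gets harder to maintain. One possible alternative, sketched here as an illustration rather than as the article's method, is a table of keyword and handler pairs. The names `HANDLERS`, `route_command`, `handle_time`, and `handle_name` are hypothetical, and the handlers return text instead of speaking so the sketch runs without audio.

```python
from datetime import datetime

def handle_time(command):
    return f"The current time is {datetime.now().strftime('%H:%M')}."

def handle_name(command):
    return "I am your AI assistant. How can I help?"

# Ordered (keyword, handler) pairs; the first keyword found in the command wins.
HANDLERS = [
    ("time", handle_time),
    ("your name", handle_name),
]

def route_command(command, fallback):
    """Return the reply from the first matching handler, else from `fallback`."""
    for keyword, handler in HANDLERS:
        if keyword in command:
            return handler(command)
    return fallback(command)

# `fallback` would be get_gpt_response in the assistant; a stand-in is used here.
print(route_command("what is your name", fallback=lambda c: "(sent to the model)"))
```

Adding a command then means appending one pair to `HANDLERS`. Keep in mind that substring matching is crude: "time" also matches "sometimes".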

Step 5: Putting It All Together

Finally, we tie all the pieces together in a main loop that continuously listens for commands and processes them.

python
if __name__ == "__main__":
    speak("Hello! I am your AI assistant. How can I assist you today?")
    while True:
        command = listen()
        if command:
            if "exit" in command or "stop" in command:
                speak("Goodbye! Have a great day!")
                break
            process_command(command)
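
The loop above needs a microphone and speakers, which makes it awkward to test. A minimal sketch of one way around that: pass the listening, speaking, and command-handling functions in as arguments, so the same loop can be driven by scripted input. `run_assistant` is a hypothetical name, not something from the original code.

```python
def run_assistant(listen_fn, speak_fn, handle_fn):
    """Run the listen/process loop until the user says "exit" or "stop"."""
    speak_fn("Hello! I am your AI assistant. How can I assist you today?")
    while True:
        command = listen_fn()
        if not command:
            continue  # Nothing recognized; listen again.
        if "exit" in command or "stop" in command:
            speak_fn("Goodbye! Have a great day!")
            break
        handle_fn(command)

# Scripted dry run: no microphone or speakers needed.
scripted_input = iter(["what time is it", None, "stop"])
spoken = []
handled = []
run_assistant(lambda: next(scripted_input), spoken.append, handled.append)
print(handled)     # ['what time is it']
print(spoken[-1])  # Goodbye! Have a great day!
```

With the real functions, the call would be `run_assistant(listen, speak, process_command)`.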

Additional Features for Enhancement

To make the assistant even more advanced, consider adding the following features:

  1. Weather Updates: Integrate APIs like OpenWeatherMap to fetch and read out weather information.

    python
    import requests  # Requires: pip install requests

    def get_weather(city):
        api_key = "your_weather_api_key"
        url = f"http://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}&units=metric"
        response = requests.get(url)
        if response.status_code == 200:
            data = response.json()
            weather = data['weather'][0]['description']
            temp = data['main']['temp']
            speak(f"The weather in {city} is {weather} with a temperature of {temp}°C.")
        else:
            speak("Sorry, I couldn't fetch the weather information.")

  2. Smart Home Integration: Use IoT platforms like MQTT or APIs from smart home devices to control lights, fans, or other appliances.

  3. Task Management: Integrate with Google Calendar or Todoist for scheduling and reminders.

  4. Dynamic Responses: Use NLP libraries like spaCy or NLTK for deeper understanding and context-aware replies.
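
For the weather feature in item 1, the assistant still has to work out which city the user asked about before it can call `get_weather`. The sketch below shows one simple way to do that with a regular expression, plus a helper that builds the spoken sentence from the two response fields the weather code reads. `extract_city`, `describe_weather`, and the hand-written `sample_payload` are illustrative assumptions; the payload is a trimmed stand-in, not real API output.

```python
import re

def extract_city(command):
    """Return the city named after "weather in", or None if absent."""
    match = re.search(r"weather in ([a-z .'-]+)", command.lower())
    if not match:
        return None
    return match.group(1).strip(" .?").title()

def describe_weather(city, data):
    """Build the spoken sentence from the fields used by get_weather."""
    weather = data["weather"][0]["description"]
    temp = data["main"]["temp"]
    return f"The weather in {city} is {weather} with a temperature of {temp}°C."

# Hand-written stand-in containing only the fields read above.
sample_payload = {"weather": [{"description": "clear sky"}], "main": {"temp": 28}}

city = extract_city("what's the weather in bhopal")
print(describe_weather(city, sample_payload))
```

This pattern captures everything after "weather in", so a phrase like "weather in delhi tomorrow" would need extra handling. That is the kind of case where the NLP libraries in item 4 become useful.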


Sample Conversation

Assistant: Hello! I am your AI assistant. How can I assist you today?
You: What time is it?
Assistant: The current time is 14:45.
You: Open website.
Assistant: Which website should I open?
You: Google.
Assistant: Opening google.
You: What's the weather in Bhopal?
Assistant: The weather in Bhopal is clear skies with a temperature of 28°C.
You: Stop.
Assistant: Goodbye! Have a great day!


Challenges and Solutions

  1. Speech Recognition Errors:

    • Issue: Background noise may interfere.
    • Solution: Use recognizer.adjust_for_ambient_noise() and a good-quality microphone.
  2. Text-to-Speech Quality:

    • Issue: Robotic voice quality.
    • Solution: Experiment with different TTS engines or consider using cloud-based TTS services like Google Text-to-Speech.
  3. API Rate Limits:

    • Issue: Frequent calls to the OpenAI API may hit rate limits.
    • Solution: Optimize token usage or implement local NLP for simpler tasks.
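
One way to reduce API calls, in the spirit of the rate-limit solution above, is to answer trivial phrases locally and cache replies to repeated prompts. This is a rough sketch under stated assumptions: `LOCAL_REPLIES`, `make_responder`, and the counting stand-in `fake_remote` are hypothetical, and in the assistant `remote_fn` would be `get_gpt_response`.

```python
from functools import lru_cache

# Canned replies that never need an API call.
LOCAL_REPLIES = {
    "hello": "Hello! How can I help?",
    "thank you": "You're welcome!",
}

def make_responder(remote_fn, cache_size=128):
    """Wrap `remote_fn` so canned replies and repeated prompts skip the API."""
    cached_remote = lru_cache(maxsize=cache_size)(remote_fn)

    def respond(prompt):
        key = prompt.strip().lower()
        if key in LOCAL_REPLIES:
            return LOCAL_REPLIES[key]  # Answered locally, no API call.
        return cached_remote(key)      # API call only on a cache miss.

    return respond

# Stand-in for get_gpt_response that records each call it receives.
calls = []
def fake_remote(prompt):
    calls.append(prompt)
    return f"(model reply to: {prompt})"

respond = make_responder(fake_remote)
respond("hello")
respond("Tell me a joke")
respond("tell me a joke")
print(len(calls))  # 1
```

Caching trades variety for fewer requests: a repeated question gets the same answer every time, and an error message returned by `remote_fn` would be cached too, so a real implementation might skip caching failures.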

Conclusion

Congratulations! You've built an advanced AI voice assistant in Python. This project is not only a practical application of programming skills but also a stepping stone to explore domains like AI, machine learning, and IoT. With further enhancements, your assistant can become a versatile personal companion.