theoopsguy/GSoC-2022-Final-Submission.md

## GSoC-2022-Final-Submission.md

      
GSoC 2022 Final Submission - Jitsi Music Bot

Organization:

Jitsi
Project Name:

Music bot for Jitsi video conference
Mentors:

@saghul
@tmoldovan8x8
Introduction


This summer (2022), I worked with Jitsi to develop a music bot for Jitsi meetings as an open-source contributor under Google Summer of Code. This is my final submission and contains a description of the work I did over the past three months.

Project Overview


The aim of the project was to develop a chat-based music bot for Jitsi meetings that allows users to enjoy music in a meeting by typing simple English commands in the chat box. The bot joins the meeting as a participant and follows the commands. For instance, if the bot is asked to play a song, it will go to a music streaming app, say YouTube Music, search for that song, start playing it and share the audio in the meeting.

Demo


Technical Architecture


GUI:


tkinter, the Python interface to Tcl/Tk, is used to create a simple Graphical User Interface where the user enters the link to the meeting that the bot is supposed to join and the path to the location where chromedriver is stored on their system. Once the ‘Start Bot’ button is clicked, Selenium WebDriver is started in a separate thread.
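A minimal sketch of such a window, assuming a hypothetical `run_bot(meeting_url, chromedriver_path)` function that holds the Selenium logic. The widget labels and layout are illustrative and not taken from the actual repository.

```python
import threading


def start_bot_in_thread(run_bot, meeting_url, chromedriver_path):
    """Run the blocking bot function in a background thread so the GUI stays responsive."""
    worker = threading.Thread(
        target=run_bot, args=(meeting_url, chromedriver_path), daemon=True
    )
    worker.start()
    return worker


def build_gui(run_bot):
    """Build a two-field window; `run_bot` is a hypothetical callable(meeting_url, path)."""
    import tkinter as tk  # imported here so the helper above also works without a display

    root = tk.Tk()
    root.title("Jitsi Music Bot")

    tk.Label(root, text="Meeting link").grid(row=0, column=0, sticky="w")
    meeting_entry = tk.Entry(root, width=50)
    meeting_entry.grid(row=0, column=1)

    tk.Label(root, text="Path to chromedriver").grid(row=1, column=0, sticky="w")
    path_entry = tk.Entry(root, width=50)
    path_entry.grid(row=1, column=1)

    def on_start():
        # Read both fields at click time and hand them to the worker thread.
        start_bot_in_thread(run_bot, meeting_entry.get(), path_entry.get())

    tk.Button(root, text="Start Bot", command=on_start).grid(row=2, column=1, sticky="e")
    return root
```

Calling `build_gui(run_bot).mainloop()` would show the window. Running the bot in a thread matters because the chat loop never returns, and it would otherwise freeze the tkinter event loop.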


Automation:


Selenium is the core of the bot and is used to automate the browser. The bot starts an automated browser window on the user's system, which is driven by Selenium WebDriver. The script currently uses ChromeDriver (WebDriver for Chrome) as the browser driver. The automated browser is navigated to the meeting link entered previously. The driver turns off the camera and changes the mic input device to a particular device, which is described under Audio below.
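Starting the driver could look roughly like this, assuming Selenium 4 (where the chromedriver path is passed through a `Service` object). The `--use-fake-ui-for-media-stream` Chrome flag, which auto-accepts the mic/camera permission prompt, is an assumption added for illustration; the write-up does not say how the bot handles that prompt.

```python
# Assumption: this flag is not mentioned in the write-up; it suppresses
# Chrome's microphone/camera permission dialog.
CHROME_ARGS = ["--use-fake-ui-for-media-stream"]


def start_driver(chromedriver_path, meeting_url):
    """Launch Chrome through ChromeDriver and open the meeting page."""
    # Imported lazily so that defining this sketch does not require Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in CHROME_ARGS:
        options.add_argument(arg)

    driver = webdriver.Chrome(
        service=Service(executable_path=chromedriver_path), options=options
    )
    driver.get(meeting_url)  # navigate to the link entered in the GUI
    return driver
```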


The operations on the webpage are carried out by following the standard Selenium approach of first finding the desired web element using a suitable built-in locator strategy. The next step is to take action on the element, for instance a click.
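The find-then-act pattern can be sketched as follows. The selector is hypothetical and would have to be replaced with one taken from the actual Jitsi Meet page.

```python
# Locators are (strategy, value) pairs. "css selector" is the string behind
# Selenium's By.CSS_SELECTOR; real code would normally write By.CSS_SELECTOR.
JOIN_BUTTON = ("css selector", "button.join-meeting")  # hypothetical selector


def click_element(driver, locator):
    """Step 1: find the element with a locator strategy. Step 2: act on it."""
    element = driver.find_element(*locator)
    element.click()
    return element
```

With a real driver this would be called as `click_element(driver, JOIN_BUTTON)`. On pages that render elements late, wrapping the lookup in Selenium's `WebDriverWait` is a common addition.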


The driver then joins the meeting as a participant and sends a message in the meeting chat box introducing itself. Next, it runs an infinite loop in which it reads the messages in the chat box.
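One way to structure that loop, as a sketch: `read_messages` and `handle_message` are hypothetical callables standing in for the Selenium code that scrapes the chat and reacts to a message, and `should_stop` is an assumption added so the loop can end (the original loop is described as infinite).

```python
import time


def poll_chat(read_messages, handle_message, should_stop, interval=1.0):
    """Repeatedly read the chat and pass each unseen message to `handle_message`."""
    seen = 0
    while not should_stop():
        messages = read_messages()      # full list of chat messages so far
        for text in messages[seen:]:    # only the ones not handled yet
            handle_message(text)
        seen = len(messages)
        time.sleep(interval)            # avoid hammering the page
```

Tracking how many messages were already seen keeps the bot from replaying old commands on every pass.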


Commands:


When the /play command is used, the driver switches to a new tab, navigates to YouTube Music, searches for the song and starts playing the first search result. It then switches back to the meeting tab to read the next commands.
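The tab handling for /play might look like the sketch below, assuming Selenium 4's `switch_to.new_window`. Opening the search through a URL query is an assumption; the bot may instead type into the search box. The click on the first result is left out because its selector is not given here.

```python
from urllib.parse import urlencode

YOUTUBE_MUSIC_SEARCH = "https://music.youtube.com/search"


def build_search_url(song):
    """Return a YouTube Music search URL for the requested song."""
    return f"{YOUTUBE_MUSIC_SEARCH}?{urlencode({'q': song})}"


def open_search_in_music_tab(driver, song):
    """Open the search results in a new tab, then return to the meeting tab."""
    meeting_tab = driver.current_window_handle
    driver.switch_to.new_window("tab")   # Selenium 4: opens and focuses a new tab
    music_tab = driver.current_window_handle
    driver.get(build_search_url(song))
    # Clicking the first search result would go here.
    driver.switch_to.window(meeting_tab)
    return music_tab
```

Returning the handle of the music tab lets later commands such as /pause switch straight back to it.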


The /pause and /resume commands make the driver switch to the music tab and toggle the play-pause button.


When the /help command is used, the driver sends messages in the chat box listing all the supported commands, using Selenium's send keys element interaction.


The /exit command makes the driver click the hangup button to leave the meeting, after which the driver quits.
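Recognizing these commands in chat text can be sketched as a small parser plus a lookup table. The function names and the `handlers` dictionary are illustrative, not taken from the repository.

```python
def parse_command(message):
    """Split '/play some song' into ('play', 'some song'); return None for ordinary chat."""
    text = message.strip()
    if not text.startswith("/"):
        return None
    name, _, argument = text[1:].partition(" ")
    return name.lower(), argument.strip()


def dispatch(message, handlers):
    """Call the handler registered for the command; return True if one ran."""
    parsed = parse_command(message)
    if parsed is None:
        return False
    name, argument = parsed
    handler = handlers.get(name)
    if handler is None:
        return False    # unknown command: ignore it
    handler(argument)
    return True
```

A table such as `{"play": play_song, "pause": toggle_playback, "resume": toggle_playback, "help": send_help, "exit": leave_meeting}` (all hypothetical function names) would map each command to its Selenium routine.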


Audio:


The bot requires a virtual audio driver such as VB-CABLE or BlackHole, which allows applications to pass audio to other applications with zero additional latency. These drivers add virtual audio input and output devices to the system. Audio routed to the input end of the virtual device can be heard at the output end of the virtual device.


The output end of the virtual device was already selected as the mic input device when joining the meeting. The next step is to route the audio from the browser tab playing the music to the input end of the virtual audio device. To achieve this, the driver executes a piece of JavaScript on the music streaming webpage, which uses the MediaDevices method enumerateDevices to list all the audio devices. The script then inspects the device list to get the device ID of the input end of the virtual audio device. This device ID is then set as the sink ID for all the audio elements on the webpage.
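A sketch of that step, with the JavaScript embedded in Python and run through Selenium's `execute_async_script`. Matching the device by a label substring (for example `"CABLE Input"` for VB-CABLE) and including `video` elements alongside `audio` are assumptions, and device labels are generally only visible once the page has media permission.

```python
# JavaScript run inside the music tab. Selenium's execute_async_script appends a
# callback as the last argument; calling it returns a value to Python.
SET_SINK_SCRIPT = """
const label = arguments[0];
const done = arguments[arguments.length - 1];
navigator.mediaDevices.enumerateDevices()
  .then((devices) => {
    const sink = devices.find(
      (d) => d.kind === 'audiooutput' && d.label.includes(label));
    if (!sink) { done(false); return; }
    const players = Array.from(document.querySelectorAll('audio, video'));
    return Promise.all(players.map((el) => el.setSinkId(sink.deviceId)))
      .then(() => done(true));
  })
  .catch(() => done(false));
"""


def route_audio_to_virtual_device(driver, device_label):
    """Point every media element in the current tab at the virtual device.

    Returns True if a matching output device was found and applied.
    """
    return bool(driver.execute_async_script(SET_SINK_SCRIPT, device_label))
```

From the browser's point of view the "input end" of the virtual cable is an audio output device, which is why the script filters on `audiooutput`.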

Links to contribution


Repository Link
Pull Request

Future Work

There is always scope for improvement. These are the issues and features planned for after GSoC:


Add support for more music streaming services, for instance, Spotify.


Add the ability to sign in to the music streaming services, so that users can enjoy the perks of premium subscriptions they may already have and won’t have to sit through ads.


To make the GUI more convenient, pre-fill the input field for the chromedriver path with the default download location for the user's Operating System. If the user moves chromedriver to some other location, they can replace the default with the new path.


Add support for browsers other than Chrome.
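For the default chromedriver path mentioned above, a possible starting point is sketched here. It assumes the download folder is `Downloads` inside the user's home directory, which holds for common setups but is not guaranteed on every system.

```python
import sys
from pathlib import Path


def default_chromedriver_path(platform=sys.platform):
    """Guess where a freshly downloaded chromedriver would be on this OS."""
    # Windows builds of chromedriver carry an .exe suffix.
    name = "chromedriver.exe" if platform.startswith("win") else "chromedriver"
    return Path.home() / "Downloads" / name
```

The result could be inserted into the path field when the window is created, leaving the user free to overwrite it.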