laic/install-htk-ubuntu-linux.md

## install-htk-ubuntu-linux.md

      
Installing HTK on Ubuntu Linux (Nov 2021)

Please note I've not been able to test this on a clean install since my Linux computer is old and already has HTK installed on it!
These instructions are for installing HTK on your own computer.  It's fine just to use the remote desktop though! If you use the remote desktop you don't have to install anything.
Installing on Ubuntu WSL 2 (18/11/2021)

I got HTK to work on Ubuntu WSL 2 by installing a bunch of packages.  To be honest, I'm not sure if I needed all of them, but I'll list them all here anyway.
First you need to make sure you are running WSL version 2. You can find out which version you are using by opening PowerShell in Windows and typing:
wsl -l -v
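
If you're already inside the Ubuntu terminal, a rough cross-check is to look at the kernel release string. This is only a sketch: the function name wsl_flavour is made up for illustration, and the patterns are an assumption based on the kernel names WSL has commonly reported (lowercase "microsoft-standard" for WSL 2, capitalised "Microsoft" for WSL 1). The wsl -l -v output in PowerShell is the one to trust.

```shell
# Classify a kernel release string (the output of `uname -r`).
# ASSUMPTION: WSL 2 kernels contain "microsoft-standard" and WSL 1
# kernels contain "Microsoft". This is a heuristic, not an official API.
wsl_flavour() {
    case "$1" in
        *microsoft-standard*) echo "wsl2" ;;
        *Microsoft*)          echo "wsl1" ;;
        *)                    echo "not-wsl" ;;
    esac
}

# Classify the machine this is run on.
wsl_flavour "$(uname -r)"
```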

If you're using version 1, this page has some instructions for how to update to version 2: https://codefellows.github.io/setup-guide/windows/ (though I haven't had the chance to test).
If you are using version 2, you can proceed in the Ubuntu terminal:
sudo dpkg --add-architecture i386
sudo apt update
sudo apt install linux-libc-dev-i386-cross
sudo apt install libc6-dev-i386
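
To confirm that the first command took effect, you can ask dpkg which extra architectures it knows about. A minimal sketch (has_arch is a helper name I've invented for illustration; dpkg --print-foreign-architectures is a standard dpkg option):

```shell
# Succeeds if architecture $2 appears as a whole line in the list $1.
has_arch() {
    printf '%s\n' "$1" | grep -qx "$2"
}

# Only query dpkg where it exists (it will on Ubuntu).
if command -v dpkg >/dev/null 2>&1; then
    if has_arch "$(dpkg --print-foreign-architectures)" i386; then
        echo "i386 architecture is enabled"
    else
        echo "i386 architecture is NOT enabled"
    fi
fi
```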

I couldn't get X11 to work, but HTK works ok without it (see the configure command below).  Now you can skip to the HTK Download bit.
Install X11 and old gcc compatibility libraries ("normal" Ubuntu)

If you did the bit above for Ubuntu WSL you can skip this bit, and go to "Download the HTK source files"
Update the list of available packages (good to do before you install new things with apt):
sudo apt-get update

Enter your computer password (or your Ubuntu password if you're using a virtual machine or WSL) when prompted.
Now install the libraries:
sudo apt-get install libc6-dev-i386
sudo apt-get install libx11-dev:i386 libx11-dev

The newer apt command does the same job, without the "-get":
sudo apt update
sudo apt install libc6-dev-i386
sudo apt install libx11-dev:i386 libx11-dev
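
If you want to double-check that the libraries actually went in, something like the following lists any that dpkg doesn't know about. It's a rough sketch: missing_packages is a name I've made up, and dpkg -s only tells you the package is known to dpkg, which is good enough for a quick sanity check. No output means nothing is missing.

```shell
# Print each named package that dpkg does not report as installed.
missing_packages() {
    for pkg in "$@"; do
        dpkg -s "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# The packages installed in this section.
missing_packages libc6-dev-i386 libx11-dev libx11-dev:i386
```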

Download the HTK source files

Go to the HTK website:
https://htk.eng.cam.ac.uk
To download the source code and read the documentation (the HTK book), you'll need to register for an account:
https://htk.eng.cam.ac.uk/register.shtml
Once your account is set up, go to the Download page:
https://htk.eng.cam.ac.uk/download.shtml
Download the "stable" version of HTK 3.4.1:
https://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.1.tar.gz
Make a note of where you downloaded it to! I downloaded it to /home/clai/speech_processing/tools
Install HTK

Open your terminal and go to the directory that you downloaded HTK-3.4.1.tar.gz to.  The terminal usually starts off in your home directory (for me /home/clai), but I'll just use the full path here to demonstrate:
cd /home/clai/speech_processing/tools

Use the ls command to check that the archive file is in the directory:
ls
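
To go one step further than ls, you can peek inside the archive without unpacking it, which also catches a download that got cut off (tar will complain). A small sketch, with archive_top_dirs being a helper name of my own:

```shell
# Print the distinct top-level entries of a gzipped tar archive
# without unpacking it.
archive_top_dirs() {
    tar tzf "$1" | cut -d/ -f1 | sort -u
}

# Example, using the download location from this guide:
# archive_top_dirs /home/clai/speech_processing/tools/HTK-3.4.1.tar.gz
```

Going by what the unpacked archive looks like, this should print a single entry called htk.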

Unzip and unpack the source archive:
tar xvzf HTK-3.4.1.tar.gz

If you type ls in the terminal now, you should see a directory called htk.  Go into that directory:
cd htk

Configure the installation parameters

Now configure things before compiling all the code. I'm disabling X and hslab to avoid the X11 errors (you don't need them to run the scripts).
./configure --without-x --disable-hslab

Compile and install the code using make

Now type in the following command, which will start compiling the source code into the actual application we want to use
make all

You might get an error about a makefile having spaces instead of a tab, e.g.:
make[1]: Entering directory '/home/clai/speech_processing/tools/htk/HLMTools'
Makefile:77: *** missing separator (did you mean TAB instead of 8 spaces?). Stop.
make[1]: Leaving directory '/home/clai/speech_processing/tools/htk/HLMTools'
make: *** [Makefile:111: hlmtools] Error 1

If so, open the file HLMTools/Makefile, e.g.
nano HLMTools/Makefile

Or using whatever text editor you like.  Lines 76-77 look like this:
mkinstalldir:
        if [ ! -d $(bindir) -a X_ = X_yes ] ; then mkdir -p $(bindir) ; fi

You'll need to change the 8 spaces at the start of line 77 to a tab.  If you're in nano, press control-x, then y and Enter, to save and exit.
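
If you'd rather not do this by hand, the same change can be made with sed. This is a sketch rather than something I've run against a fresh HTK download: spaces_to_tab is a made-up helper name, and it relies on GNU sed's -i option (which is what Ubuntu ships). It only touches the line number you give it, and only if that line really starts with 8 spaces.

```shell
# Replace exactly 8 leading spaces on line $2 of file $1 with one tab.
# Uses GNU sed's in-place option (-i).
spaces_to_tab() {
    tab=$(printf '\t')
    sed -i "${2}s/^        /${tab}/" "$1"
}

# From the htk directory:
# spaces_to_tab HLMTools/Makefile 77
```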

Then run make all again from the htk directory.
If that all goes well, use the following command to install the binaries into your path (e.g. /usr/local/bin):
sudo make install

As an initial test of whether it's working, try:
HVite

You should see a bunch of help information printed out.
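
HVite is only one of the HTK tools, so it can be worth checking that a few others landed on your PATH too. A minimal sketch: missing_tools is a name I've invented, and the list below is just a selection of standard HTK tool names, not necessarily the exact set the assignment scripts call.

```shell
# Print each named command that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# A selection of HTK tools (ASSUMPTION: adjust to what your scripts use).
missing_tools HCopy HCompV HInit HRest HERest HVite HResults
```

No output means every tool in the list was found.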
Getting the scripts and data from the PPLS AT lab servers

It's good to keep a separation between tools and the actual data you use and generate for the assignment.
So, now we will go to the directory you actually want to do the assignment work in. I'll just try to mirror the directory structure given in the assignment instructions.
Let's get the data:
Make sure you have the University VPN on if you're not on campus.
In the following, set YOUR_UUN to your actual UUN rather than mine
YOUR_UUN=clai

Make the directory ~/Documents/sp if you didn't already do that for assignment 1:
mkdir -p  ~/Documents/sp/

Get the scripts:
rsync -avz $YOUR_UUN@scp1.ppls.ed.ac.uk:/Volumes/Network/courses/sp/digit_recogniser ~/Documents/sp

Get the data, excluding the previous wav files for now (features have already been extracted):
rsync --exclude 'wav' -avz $YOUR_UUN@scp1.ppls.ed.ac.uk:/Volumes/Network/courses/sp/data  ~/Documents/sp

Now go to the directory you just downloaded the scripts to, i.e. digit_recogniser (spelt with an "s", as in the rsync command above):
cd ~/Documents/sp/digit_recogniser
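
If the cd fails, or you just want to check that both rsync commands did their job, a quick look at the directory layout helps. A small sketch, assuming the layout implied by the rsync commands above (digit_recogniser with a scripts directory inside it, and data next to it); missing_dirs is a helper name I've made up.

```shell
# Print every expected path under $1 that is not a directory.
missing_dirs() {
    base=$1
    shift
    for d in "$@"; do
        [ -d "$base/$d" ] || echo "$base/$d"
    done
}

# Directory names follow the rsync source paths used in this guide.
missing_dirs ~/Documents/sp digit_recogniser/scripts data
```

No output means both directories are where the rest of the instructions expect them.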

Things to note about running the scripts:


You'll need to run scripts from the digit_recogniser directory, not the scripts directory below it, e.g.:

./scripts/initialise_models

Remember not to skip the ./ at the beginning of that line. It makes it explicit that you want the file scripts/initialise_models relative to your current directory.  A bare command name with no slash in it (e.g. just initialise_models) would instead be looked up in all the directories listed in your PATH environment variable (and probably give you an error).


The scripts assume you are using the bash shell, but the path to bash at the top of each of the scripts is not right for every system (a Mac, for instance).  The first line in scripts/initialise_models is #!/usr/bin/bash; if that file doesn't exist on your machine, change it to #!/bin/bash instead.
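
Changing that first line in every script by hand gets tedious, so here is one way to do it in bulk. It's a sketch under a couple of assumptions: fix_shebang is a made-up name, it relies on GNU sed's -i option, and it expects every entry under scripts/ to be a regular file (check with ls scripts first). Files whose first line is anything other than #!/usr/bin/bash keep their contents as they are.

```shell
# Rewrite the first line of each given file from #!/usr/bin/bash to
# #!/bin/bash; any other first line is left as it is.
fix_shebang() {
    for f in "$@"; do
        sed -i '1s|^#!/usr/bin/bash$|#!/bin/bash|' "$f"
    done
}

# From the directory that contains scripts/:
# fix_shebang scripts/*
```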


If you are running zsh as your default shell (maybe on newer MacBooks), you might need to change it to bash using the following command:


chsh -s /bin/bash


Since we're not collecting and labelling data as the first step, you'll need to change the scripts to build a speaker-dependent model from an existing speaker.  So, in each of the scripts, change the beginning from this:

#!/usr/bin/bash

# to use your own data, this automatically sets USER to be your username
USER=${USER:-`whoami`}
# and this is the path to where your data was placed by the make_mfccs script
DATA=${DATA:-/Volumes/Network/courses/sp/data_upload}

# later, to use another user's data, for example "simonk"
# USER=simonk
# and to access all data from all years, use this path
# DATA=${DATA:-/Volumes/Network/courses/sp/data}

to this:
#!/bin/bash

## to use your own data, this automatically sets USER to be your username
#USER=${USER:-`whoami`}
## and this is the path to where your data was placed by the make_mfccs script
##DATA=${DATA:-~/Documents/sp/data}

# Use another user's data, for example "simonk"
USER=simonk
# and to access all data from all years, use this path
DATA=~/Documents/sp/data


This assumes you downloaded the data as per the instructions above.
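
It's worth knowing what the ${DATA:-...} syntax in the original scripts does: it uses the value of DATA if one is already set, and only otherwise falls back to the path after the :- bit. A minimal demonstration (resolve_data is a made-up function that just mimics the DATA line from the original script header):

```shell
# Mimics the DATA line from the original scripts: use $DATA if it is
# already set and non-empty, otherwise fall back to the default path.
resolve_data() {
    echo "${DATA:-/Volumes/Network/courses/sp/data_upload}"
}

unset DATA
resolve_data    # prints the default /Volumes/... path

DATA=~/Documents/sp/data
resolve_data    # prints the Documents/sp/data path under your home directory
```

So with the original USER and DATA lines left alone, setting both on the command line, e.g. USER=simonk DATA=~/Documents/sp/data ./scripts/initialise_models, might work as an alternative to editing each script. I haven't tested that, and it assumes the scripts don't set USER or DATA again further down.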