@mikesmales
Created February 20, 2019 20:26
Urban_sounds_feature_extraction
# Load various imports
import os

import librosa
import numpy as np
import pandas as pd


def extract_features(file_name):
    """Return the mean of 40 MFCCs over time for one audio file, or None on failure."""
    try:
        audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast')
        mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)
        # Average each coefficient across frames to get a fixed-length vector
        mfccsscaled = np.mean(mfccs.T, axis=0)
    except Exception as e:
        print("Error encountered while parsing file: ", file_name, e)
        return None
    return mfccsscaled


# Set the path to the full UrbanSound dataset
fulldatasetpath = '/Urban Sound/UrbanSound8K/audio/'
metadata = pd.read_csv(fulldatasetpath + '../metadata/UrbanSound8K.csv')

features = []

# Iterate through each sound file and extract the features
for index, row in metadata.iterrows():
    file_name = os.path.join(os.path.abspath(fulldatasetpath), 'fold' + str(row["fold"]), str(row["slice_file_name"]))
    class_label = row["class_name"]
    data = extract_features(file_name)
    features.append([data, class_label])

# Convert into a pandas DataFrame
featuresdf = pd.DataFrame(features, columns=['feature', 'class_label'])

print('Finished feature extraction from ', len(featuresdf), ' files')
@DeeBul

DeeBul commented Dec 2, 2019

KeyError: 'class_name'

TypeError: 'str' object cannot be interpreted as an integer

for this line class_label = row["class_name"]

Any suggestions?

Getting the same error. I believe class_name has to be defined as something.

@DeeBul

DeeBul commented Dec 2, 2019

After I imported numpy I believe running this:

`def extract_features(file_name):

try:
    audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast') 
    mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)
    mfccsscaled = np.mean(mfccs.T,axis=0)
    
except Exception as e:
    print("Error encountered while parsing file: ", file)
    return None 
 
return mfccsscaled`

takes forever. Is it normal for this part to take so long, or did I do something wrong?

@AndresRicoM

KeyError: 'class_name'

TypeError: 'str' object cannot be interpreted as an integer

for this line class_label = row["class_name"]

Any suggestions?

I was getting the same error. I changed "class_name" to "classID" and it seems to be working fine. The error comes from the column names used in the CSV metadata file. According to the UrbanSound8K website, classID is the column that holds the integer value (0-9) for each sound class.
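
A KeyError on `row["class_name"]` means the loaded CSV has no column with that name. As far as I can tell, the metadata file distributed with UrbanSound8K names the text label column `class` and the integer label column `classID`, so the gist may have been written against a renamed copy; treat that as an assumption and check `metadata.columns` on your own file. A minimal sketch that picks whichever label column is present (`pick_label_column` is a hypothetical helper, not part of the gist):

```python
def pick_label_column(columns, candidates=("class_name", "class", "classID")):
    """Return the first candidate label column that exists in `columns`.

    `columns` can be any collection of column names, e.g. `metadata.columns`.
    The candidate names are assumptions about how the CSV may be labelled.
    """
    for name in candidates:
        if name in columns:
            return name
    raise KeyError(f"None of {candidates} found in columns: {list(columns)}")


# Example with the column names assumed for the stock metadata file
stock_columns = ["slice_file_name", "fsID", "start", "end",
                 "salience", "fold", "classID", "class"]
label_column = pick_label_column(stock_columns)
print(label_column)
```

Inside the gist's loop this would be used as `class_label = row[label_column]`, with `label_column` computed once from `metadata.columns` before the loop starts.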

@AndresRicoM

After I imported numpy I believe running this:

`def extract_features(file_name):

try:
    audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast') 
    mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)
    mfccsscaled = np.mean(mfccs.T,axis=0)
    
except Exception as e:
    print("Error encountered while parsing file: ", file)
    return None 
 
return mfccsscaled`

takes forever. Is it normal for this part to take so long, or did I do something wrong?

It is taking me several minutes to run as well. Defining extract_features() should return immediately; the time is spent in the for loop, which calls it once for every file.

It takes a long time because the loop makes it extract features from more than 8,000 .wav files.
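
To judge whether a long run is expected, one option is to time the extraction on a handful of files and extrapolate to the whole dataset. The sketch below is illustrative only: `estimate_total_seconds` is a hypothetical helper, and `len` stands in for `extract_features` so the example runs without any audio files.

```python
import time


def estimate_total_seconds(items, func, sample_size=20):
    """Time `func` on the first few items and extrapolate to all of them."""
    sample = items[:sample_size]
    if not sample:
        return 0.0
    start = time.perf_counter()
    for item in sample:
        func(item)
    elapsed = time.perf_counter() - start
    # Average time per item, scaled up to the full list
    return elapsed / len(sample) * len(items)


# Stand-in workload: 8,000 fake paths, with `len` in place of extract_features
fake_paths = [f"fold1/clip_{i}.wav" for i in range(8000)]
estimate = estimate_total_seconds(fake_paths, len, sample_size=50)
print(f"Estimated total: {estimate:.4f} seconds")
```

With the real function, passing the list of file paths and `extract_features` gives a rough figure before committing to the full loop. The estimate assumes the sampled files are representative of the rest.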

@HabibRekik93

NameError: name 'file' is not defined
Any suggestions?

@mohsin-noor

NameError: name 'file' is not defined. Any suggestions?

I am getting the same error. I tried to fix it several times, but it still shows up. What should I do now?

@mstakale

Please make sure you are entering the correct file path and that it is in the right format.
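
The NameError itself comes from the `except` branch of the gist: the message prints `file`, a name that does not exist in Python 3, instead of the `file_name` parameter. That branch only runs when loading fails, which is why a wrong path shows up as a NameError rather than as the intended message. A minimal sketch of the fix follows; `safe_extract`, `extractor`, and `failing_extractor` are hypothetical names used so the example runs without audio files or librosa.

```python
def safe_extract(file_name, extractor):
    """Call `extractor(file_name)`; on failure, report the file and return None.

    `extractor` is a hypothetical stand-in for the librosa-based feature code.
    """
    try:
        return extractor(file_name)
    except Exception as e:
        # Use the function's own parameter (file_name), not the undefined name `file`
        print("Error encountered while parsing file:", file_name, "-", e)
        return None


def failing_extractor(path):
    # Simulates a bad path so the except branch is exercised
    raise FileNotFoundError(path)


result = safe_extract("missing.wav", failing_extractor)
```

Printing the caught exception alongside the path also shows the underlying cause, such as a missing file or an unsupported format.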

@mohsin-noor

Please make sure you are entering the correct file path and that it is in the right format.

Thanks for your response!
I worked around this error and the code runs now, though it is not 100% fixed.

Actually, I am not working on UrbanSound classification. I am working on binary classification of a COVID-19 audio dataset, and I am having trouble reshaping an array.

ValueError: cannot reshape array of size 36760 into shape (919,1149,2,1)

I am stuck here and still trying to fix it. I have 1149 rows in total and 2 classes (positive, negative).
Can you please help me identify what needs to change so the array can be reshaped?

@mstakale

Can you please share your dataset with me if it is open source? I will look at it. Also, you have to reshape the input according to your data. I guess in your case it would be (1149,2,1).
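
The numbers in the error message can be checked directly. A reshape only succeeds when the product of the target dimensions equals the array's size, and 36760 = 919 × 40, which would match 919 training rows of 40 MFCC values each (roughly 80% of 1149 rows). This is a sketch under that assumption: `x_train` is a zero-filled stand-in rather than the real data, and the trailing channel dimension of 1 is only one possible layout.

```python
import numpy as np

n_samples, n_mfcc = 919, 40               # assumed from 36760 = 919 * 40
x_train = np.zeros((n_samples, n_mfcc))   # stand-in for the real feature matrix

# The shape from the error message needs far more elements than the array has
requested = 919 * 1149 * 2 * 1
print(x_train.size, requested)

# Any shape whose dimensions multiply to x_train.size works;
# -1 lets NumPy infer the number of samples
x_reshaped = x_train.reshape(-1, n_mfcc, 1)
print(x_reshaped.shape)
```

Under the same assumption, the number of classes belongs to the label array rather than to the feature array, so it would not appear in the input shape.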

@mohsin-noor

Hello there,
Thanks for your response, and sorry for the delay on my end!
Yes, I tried what you suggested, but it did not work because my training data X has 919 rows, which causes the mismatch.
Still, I am glad to have your response.
I can share the dataset link: https://github.com/virufy/virufy-cdf-coughvid/tree/main/virufy-cdf-coughvid
Note that I have customized the dataset to fit my code.
You can email me at mohsinnoor53@gmail.com
and I will share the customized and optimized data there.
Thanks
