My first beginner DS/ML experiment, with music.
############################################################################
# Musical Composition Features Classification - Part 1                     #
############################################################################
###### General Notice ###### | |
+ Author : Nuttapong Punpipat (NP-chaonay) | |
+ Updated On : 2020_05_16 16:03 (UTC) | |
+ General Updating Frequency : Weekly | |
+ Code Snippet Completeness : about 30% | |
+ Stability : <Beta | |
---------- | |
+ Text under sections marked "[CODE]" is designed to be used directly in the Python interactive console; it can be executed by copy-pasting.
###### Brief Table of Content ###### | |
+ General Notice (Line 5) | |
+ Brief Table of Content (Line 14) | |
+ Abstract (Line 32) | |
+ Requirement (Line 34) | |
+ Overall Architecture (Line 52) | |
+ Experiment method (Line 62) | |
+ Glossary (Line 84)
---------- | |
+ Initialization (Line 99) | |
+ Data Decompositions (Line 534) | |
+ Simple Testing (Line 544) | |
+ Testing multiple models (Line 589) | |
+ Analysing audio wave (Line 685) | |
---------- | |
+ Code Snippet (Line 717) | |
!!! The table above requires re-checking !!!
###### Abstract ###### | |
###### Requirement ###### | |
- Recommended to run on a regular Linux desktop distribution (for more information, see the sub-section "OS Compatibility" below)
- Other than Linux, using Mac/Google Colab is recommended instead. (See the Windows issues in the sub-section "OS Compatibility" below)
- FFmpeg (invoked via os.system) (Google Colab has this built in)
- To enable pitch changing, ensure that FFmpeg is built with librubberband. (Most FFmpeg builds should have it.)
- ffmpeg-normalize (used by save_youtube())
- /tmp for keeping temporary files
- Latest version of youtube-dl (only required for the YouTube-related functions)
- Python modules (see section Initialization > Import modules)
- Manual configuration (see Initialization > Define variables)
### OS Compatibility
- This script has been tested and used on Ubuntu 19.10 with Python 3.5.x
- Since Windows doesn't use a root filesystem layout like Linux/Mac, most path-dependent features may break there.
- These features are known not to work on systems other than Linux :
- save_recording() and related functions that involve system recording : they require the arecord binary. On another system, replace arecord and its original parameters with an equivalent tool.
- These features are known not to work on non-UNIX OSes :
- Any feature using the "/tmp" directory : "/tmp" doesn't exist on Windows. However, you may word-substitute "/tmp" with another temporary directory path.
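As a portable alternative to the hard-coded "/tmp" discussed above, the standard-library tempfile module resolves the platform's temporary directory; a minimal sketch (the subdirectory name mirrors the temp_path defined later in this script):

```python
import os
import tempfile

# Resolve the platform's temporary directory instead of hard-coding "/tmp":
# /tmp on Linux/macOS, %TEMP% on Windows.
temp_path = os.path.join(tempfile.gettempdir(), 'np-chaonay.ML', '')
print(temp_path)
```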
###### Overall Architecture ###### | |
< Starting Input : FFmpeg-compatible audio with start,end parameter for cropping > | |
< Process #1 : Convert to 2ch 44100Hz pcm_s16le WAVE audio with only cropped region > | |
< Process #2 : Convert to time-frequency array > | |
< Process #3 : Final prediction ML model > | |
------------ | |
<START> | |
< Process #1 on Starting Input as Output #1 (2ch 44100Hz pcm_s16le WAVE audio with only cropped region) > | |
< Process #2 on Output #1 as Output #2 (Time-Frequency array) > | |
< Process #3 on Output #2 as Final Output (Highest confidence class prediction) > | |
<END> | |
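Process #1 above can be sketched as an FFmpeg argument builder (a hedged sketch, not the script's actual helper; file names are hypothetical, and -ss/-to implement the start/end cropping):

```python
# Build the ffmpeg command for Process #1: convert any input to
# 2ch 44100Hz pcm_s16le WAVE, optionally cropped to [start, end] seconds.
def ffmpeg_args(src, dst, start=None, end=None):
    args = ['ffmpeg', '-y']
    if start is not None:
        args += ['-ss', str(start)]
    if end is not None:
        args += ['-to', str(end)]
    args += ['-i', src, '-f', 'wav', '-acodec', 'pcm_s16le',
             '-ar', '44100', '-ac', '2', dst]
    return args

print(' '.join(ffmpeg_args('input.mp3', 'output.wav', start=10, end=70)))
# ffmpeg -y -ss 10 -to 70 -i input.mp3 -f wav -acodec pcm_s16le -ar 44100 -ac 2 output.wav
```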
###### Experiment method ###### | |
+ Phase 1 : Prepare the song data for testing by gathering from the following training sets :
- Set 1 : Randomly gathered (attempting to choose for the most variance.)
- Set 2 : (May be recommended) Find songs that form major-minor pairs
- Method A : Each pair shares the same root key.
- Method B : Pairs need not share the same root key (unlike Method A).
+ Phase 2 : Convert any audio data to 16bit 44100Hz FLAC (for storage, optional)
+ Phase 3 : Get the data to train ML algorithms
+ Method A:1 : Convert "Phase 2/Any" audio data to 1-ch WAVE format with 5-regions cropping
+ Method A:2 : (Not recommended, as it is slow in both training and testing even with multiprocessing, so I do this only as a case study.) Same as Method A:1, but also convert to a frequency-over-time array using my implemented function, then save the calculated data using pickle. (With 5-regions-cropping pre-processed songs this takes about 454 seconds per song; 50 songs take about 6 hrs 30 mins, due to some limitations.)
+ Method B:1 : With dimension reduction
+ Method B:2 : Without dimension reduction
+ Phase 4 : Test the multiple trained ML algorithms using any of the following testing sets :
* Note 1 : Recommended: use the entire releases of one selected artist/organization/group of genres.
* Note 2 : The list of sets is defined by me. You can use any songs you desire.
+ Set A : 14 institutional songs from "Bodindecha (Sing Singhaseni) 2 high school"
+ Set B : >=20 The Beatles songs
+ Set ... : And so on...
+ Set 0 : Any training set not used in training
+ Phase 5 : Gather more data + do data analysis
+ Phase 6 : Apply this experiment to other experiments or real-life situations
###### Glossary ######
+ 5 regions cropping
5-second crops taken at five positions :
- First part
- At about 25%
- At the center
- At about 75%
- End part
+ Pre-processed audio data
Audio data generated by the function ``save_recording()``, ``save_local()``, ``save_youtube()``, or another function with similar behavior; usable for training directly.
Technically, these are specially cropped (a total of 44100 samples/s x 5 secs x 5 different audio portions; take a look at ``process_wave_data()`` for how they're cropped) 44100Hz mono WAVE data encoded as signed 16bit little-endian.
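The fixed feature length used throughout this script follows directly from the definition above, five 5-second mono regions at 44100 Hz:

```python
# 44100 samples/s x 5 s per region x 5 regions = 1102500 samples,
# which is the column count of the training matrix x defined below.
rate, seconds, regions = 44100, 5, 5
print(rate * seconds * regions)  # 1102500
```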
###### Initialization ###### | |
### [CODE] Import modules | |
# Built-Ins and Pre-Installed | |
import os | |
from time import sleep | |
from threading import Thread as Thread | |
from codecs import escape_decode | |
import glob | |
import pickle | |
import subprocess | |
import multiprocessing | |
from math import floor | |
from functools import reduce | |
# Downloadable modules | |
import scipy.io.wavfile as scp_io_wav | |
import matplotlib.pyplot as plt | |
import numpy as np | |
import pandas as pd | |
from sklearn.svm import LinearSVC | |
from sklearn.metrics import classification_report,confusion_matrix | |
from sklearn.preprocessing import LabelEncoder | |
from sklearn.decomposition import PCA # For using Data Decomposition | |
### [CODE] Define functions | |
def array1d_append(array,value): | |
shape=array.shape | |
if len(shape)!=1: raise np.AxisError('Inputted array should be 1 dimension.') | |
array=np.resize(array,(shape[0]+1,)) | |
array[-1]=value | |
return array | |
def array_replace(array,orig,new): | |
shape=array.shape | |
array=list(np.nditer(array)) | |
for i in range(len(array)): | |
if array[i]==orig: array[i]=np.array(new) | |
return np.array(array).reshape(shape) | |
def print_err(err,action='doing an operation',type_='Warning'):
print('['+type_+'] Exception occurred when '+action+'.') | |
print(' '+type(err).__name__+': '+str(err)) | |
def process_wave_data(path): | |
audio=scp_io_wav.read(path) | |
rate=audio[0] | |
ch1=audio[1][:,0] | |
ch2=audio[1][:,1] | |
del audio | |
duration=rate*5 | |
side=round( (len(ch1)-duration)/2 ) | |
side_1=round( (len(ch1)-duration)*0.25 ) | |
side_2=round( (len(ch1)-duration)*0.75 ) | |
ch1_f=np.array(ch1[:duration],'float') | |
ch1_m=np.array(ch1[side:side+duration],'float') | |
ch1_l=np.array(ch1[-duration:],'float') | |
ch1_1=np.array(ch1[side_1:side_1+duration],'float') | |
ch1_2=np.array(ch1[side_2:side_2+duration],'float') | |
ch2_f=np.array(ch2[:duration],'float') | |
ch2_m=np.array(ch2[side:side+duration],'float') | |
ch2_l=np.array(ch2[-duration:],'float') | |
ch2_1=np.array(ch2[side_1:side_1+duration],'float') | |
ch2_2=np.array(ch2[side_2:side_2+duration],'float') | |
del ch1,ch2 | |
snd_f=(ch1_f+ch2_f)/2 | |
snd_m=(ch1_m+ch2_m)/2 | |
snd_l=(ch1_l+ch2_l)/2 | |
snd_1=(ch1_1+ch2_1)/2 | |
snd_2=(ch1_2+ch2_2)/2 | |
del ch1_f,ch1_m,ch1_l,ch1_1,ch1_2,ch2_f,ch2_m,ch2_l,ch2_1,ch2_2 | |
snd=np.append(snd_f,[snd_1,snd_m,snd_2,snd_l]) | |
return np.array(snd,'int16') | |
def save_youtube(_url,metadata,start=None,end=None,pitch=None): | |
for file in glob.glob(temp_path+'*'): | |
try: os.remove(file) | |
except Exception as err: print_err(err,'deleting unused files') | |
print('Downloading using youtube-dl...') | |
process=subprocess.run(executable='youtube-dl',args=['youtube-dl','-o',temp_path+'youtube_dl-downloaded.%(ext)s','--exec','mv {} '+temp_path+'youtube_dl-downloaded',_url],stdout=subprocess.PIPE,stderr=subprocess.STDOUT,stdin=subprocess.DEVNULL) | |
if process.returncode!=0: | |
print('[Error] youtube-dl process doesn\'t return to 0.') | |
print('<< Start of Process console output>>') | |
print(process.stdout.decode()) | |
print('<< End of Process console output>>') | |
raise ResourceWarning('Error occurred when getting the audio data from Youtube.') | |
print('1st Normalizing/Converting using ffmpeg-normalize...') | |
process=subprocess.run(executable='ffmpeg-normalize',args=['ffmpeg-normalize','-o',temp_path+'converted_phase1','-f','-vn','-ofmt','wav','-nt','peak','-t','0',temp_path+'youtube_dl-downloaded'],stdout=subprocess.DEVNULL,stderr=subprocess.PIPE,stdin=subprocess.DEVNULL) | |
if process.returncode!=0: | |
print('[Error] ffmpeg-normalize process doesn\'t return to 0.') | |
print('<< Start of Process stderr console output>>') | |
print(process.stderr.decode()) | |
print('<< End of Process stderr console output>>') | |
raise ResourceWarning('Error occurred when getting the audio data from Youtube.') | |
print('Converting using ffmpeg...') | |
args=['ffmpeg','-i',temp_path+'converted_phase1','-f','wav','-acodec','pcm_s16le','-ar','44100','-ac','2','-y',temp_path+'converted_phase2'] | |
if pitch!=None: args.insert(3,'-af'); args.insert(4,'rubberband=pitch='+str(pitch)+':pitchq=quality:channels=together') | |
else: pass | |
if start!=None and end!=None: args.insert(1,'-ss'); args.insert(2,str(start)); args.insert(3,'-to'); args.insert(4,str(end)) | |
elif start==None and end!=None: args.insert(1,'-to'); args.insert(2,str(end)) | |
elif end==None and start!=None: args.insert(1,'-ss'); args.insert(2,str(start)) | |
else: pass | |
process=subprocess.run(executable='ffmpeg',args=args,stdout=subprocess.PIPE,stderr=subprocess.STDOUT,stdin=subprocess.DEVNULL) | |
if process.returncode!=0: | |
print('[Error] ffmpeg process doesn\'t return to 0.') | |
print('<< Start of Process console output>>') | |
print(process.stdout.decode()) | |
print('<< End of Process console output>>') | |
raise ResourceWarning('Error occurred when getting the audio data from Youtube.') | |
print('2nd Normalizing using ffmpeg-normalize...') | |
process=subprocess.run(executable='ffmpeg-normalize',args=['ffmpeg-normalize','-o',temp_path+'ready_audio','-f','-c:a','pcm_s16le','-ar','44100','-ofmt','wav','-nt','peak','-t','0',temp_path+'converted_phase2'],stdout=subprocess.DEVNULL,stderr=subprocess.PIPE,stdin=subprocess.DEVNULL) | |
if process.returncode!=0: | |
print('[Error] ffmpeg-normalize process doesn\'t return to 0.') | |
print('<< Start of Process stderr console output>>') | |
print(process.stderr.decode()) | |
print('<< End of Process stderr console output>>') | |
raise ResourceWarning('Error occurred when getting the audio data from Youtube.') | |
print('Insert audio data to program memory...') | |
insert(process_wave_data(temp_path+'ready_audio')) | |
print('Saving to storage...') | |
write(len(x),len(x)) | |
print('Adding/Changing song metadata to database...') | |
change_db(*metadata) | |
global y | |
y=le.fit_transform(db.Key) | |
print('Updating database in storage...') | |
update_to_ldb() | |
def change_db(*args): | |
global db | |
if len(args)==len(db.columns)+1: | |
db.loc[args[0]]=pd.Series(args[1:],index=db.columns) | |
else: raise ValueError('Amount of data inputted isn\'t matched with current database.') | |
update_to_ldb() | |
def update_from_ldb(NoBackup=False): | |
global db,db_,x,x_,y,y_,le,le_ | |
if 'x' in globals() and not NoBackup: x_=x | |
if 'db' in globals() and not NoBackup: db_=db | |
if 'y' in globals() and not NoBackup: y_=y | |
if 'le' in globals() and not NoBackup: le_=le | |
length=len(os.listdir(path)) | |
x=np.empty((length,1102500),'int16') | |
for i in range(length): | |
x[i]=scp_io_wav.read(root+str(i+1))[1] | |
db=pd.read_csv(database,index_col='Name') | |
y=le.fit_transform(db.Key) | |
def get_first_zero_index_pos(wave,start_index): | |
try: | |
first_value=wave[start_index] | |
if first_value>0: | |
for i in range(start_index+1,len(wave)): | |
if wave[i]<0: | |
d=abs(wave[i]-wave[i-1]) | |
first_zero_index_pos=i-1+(abs(wave[i-1])/d) | |
break | |
elif wave[i]==0: | |
first_zero_index_pos=i | |
break | |
elif first_value<0: | |
for i in range(start_index+1,len(wave)): | |
if wave[i]>0: | |
d=abs(wave[i]-wave[i-1]) | |
first_zero_index_pos=i-1+(abs(wave[i-1])/d) | |
break | |
elif wave[i]==0: | |
first_zero_index_pos=i | |
break | |
else: | |
first_zero_index_pos=start_index | |
return first_zero_index_pos | |
except IndexError: raise ResourceWarning('Given start index isn\'t found.') | |
except UnboundLocalError: raise ResourceWarning('No next zero-interception from the given start index.') | |
def get_wavelength(wave,start_index): | |
begin=get_first_zero_index_pos(wave,start_index) | |
end=get_first_zero_index_pos(wave,floor(get_first_zero_index_pos(wave,floor(begin)+1))+1) | |
return abs(end-begin) | |
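get_wavelength() above estimates a period from linearly interpolated zero crossings; the same idea can be checked on a synthetic tone. This is a self-contained NumPy sketch of the principle (not the functions above; it averages all upward crossings instead of taking a single pair):

```python
import numpy as np

# Estimate frequency as sample_rate / wavelength, where the wavelength is
# measured between linearly interpolated zero crossings.
rate = 44100
freq = 441.0                      # exact period: 100 samples
t = np.arange(rate) / rate
wave = np.sin(2 * np.pi * freq * t)

neg = wave < 0
idx = np.where(neg[:-1] & ~neg[1:])[0]          # upward zero crossings
frac = wave[idx] / (wave[idx] - wave[idx + 1])  # fractional crossing offset
zeros = idx + frac
wavelength = np.mean(np.diff(zeros))            # samples per period
print(round(rate / wavelength, 1))              # ~441.0
```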
### [CODE] Define functions (still to be cleaned up)
def convert_local(path,to_path=None): | |
os.system('rm /tmp/ML_test.wav 1>/dev/null 2>/dev/null') | |
os.system('ffmpeg -i "'+path+'" -f wav -acodec pcm_s16le -ar 44100 -ac 2 -y /tmp/ML_test.wav'+quiet_str) | |
if to_path!=None: scp_io_wav.write(to_path,44100,process_wave_data('/tmp/ML_test.wav')) | |
else: scp_io_wav.write(path,44100,process_wave_data('/tmp/ML_test.wav')) | |
def testML(path): | |
return clf.predict([process_wave_data(path)]) | |
def write_thread(t,start=1,end=None): | |
cnt1=0 | |
while True: | |
no=t+8*cnt1+(start-1) | |
if end!=None and no>end: return | |
i=no-1 | |
try: x[i] | |
except IndexError: return | |
scp_io_wav.write(root+str(no),44100,x[i]) | |
cnt1+=1 | |
def read_thread(t,start=1,end=None): | |
global x_ | |
if end==None: end=len(db) | |
for no in range(t+(start-1),end+1,8): | |
i=no-1 | |
path=root+str(no) | |
x_[i]=scp_io_wav.read(root+str(no))[1] | |
# Can add/remove amount of threads manually | |
def write(start=1,end=None): | |
from threading import Thread as Thread | |
wt1=Thread(name='WritingThread1',target=write_thread,args=(1,start,end)) | |
wt2=Thread(name='WritingThread2',target=write_thread,args=(2,start,end)) | |
wt3=Thread(name='WritingThread3',target=write_thread,args=(3,start,end)) | |
wt4=Thread(name='WritingThread4',target=write_thread,args=(4,start,end)) | |
wt5=Thread(name='WritingThread5',target=write_thread,args=(5,start,end)) | |
wt6=Thread(name='WritingThread6',target=write_thread,args=(6,start,end)) | |
wt7=Thread(name='WritingThread7',target=write_thread,args=(7,start,end)) | |
wt8=Thread(name='WritingThread8',target=write_thread,args=(8,start,end)) | |
wt1.start() | |
wt2.start() | |
wt3.start() | |
wt4.start() | |
wt5.start() | |
wt6.start() | |
wt7.start() | |
wt8.start() | |
wt1.join(); wt2.join(); wt3.join(); wt4.join(); wt5.join(); wt6.join(); wt7.join(); wt8.join() | |
# Can add/remove amount of threads manually | |
def read(start=1,end=None): | |
global x_ | |
if end==None: x_=np.empty((len(os.listdir(path))-(start-1),1102500),'int16') | |
else: x_=np.empty((end-start+1,1102500),'int16') | |
rt1=Thread(name='ReadingThread1',target=read_thread,args=(1,start,end)) | |
rt2=Thread(name='ReadingThread2',target=read_thread,args=(2,start,end)) | |
rt3=Thread(name='ReadingThread3',target=read_thread,args=(3,start,end)) | |
rt4=Thread(name='ReadingThread4',target=read_thread,args=(4,start,end)) | |
rt5=Thread(name='ReadingThread5',target=read_thread,args=(5,start,end)) | |
rt6=Thread(name='ReadingThread6',target=read_thread,args=(6,start,end)) | |
rt7=Thread(name='ReadingThread7',target=read_thread,args=(7,start,end)) | |
rt8=Thread(name='ReadingThread8',target=read_thread,args=(8,start,end)) | |
rt1.start() | |
rt2.start() | |
rt3.start() | |
rt4.start() | |
rt5.start() | |
rt6.start() | |
rt7.start() | |
rt8.start() | |
rt1.join(); rt2.join(); rt3.join(); rt4.join(); rt5.join(); rt6.join(); rt7.join(); rt8.join() | |
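The eight hand-wired read/write threads above can also be expressed with a standard-library thread pool; a sketch with a stand-in job function (the real per-item work would be the scp_io_wav read/write calls):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for one unit of read/write work (thread number 1..8 in the
# original code); here it just squares its argument.
def job(no):
    return no * no

# Eight workers, results returned in submission order.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(job, range(1, 9)))
print(results)  # [1, 4, 9, 16, 25, 36, 49, 64]
```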
def insert(snd): | |
x.resize((x.shape[0]+1,x.shape[1]),refcheck=False) | |
pos=len(x)-1 | |
x[pos]=snd | |
def save_local(path): | |
_null=os.system('rm /tmp/ML_test.wav'+quiet_str) | |
_null=os.system('ffmpeg -i "'+path+'" -f wav -acodec pcm_s16le -ar 44100 -ac 2 -y /tmp/ML_test.wav'+quiet_str) | |
insert(process_wave_data('/tmp/ML_test.wav')) | |
write(len(x),len(x)) | |
def test_local(path): | |
_null=os.system('rm /tmp/ML_test.wav 1>/dev/null 2>/dev/null') | |
_null=os.system('ffmpeg -i "'+path+'" -f wav -acodec pcm_s16le -ar 44100 -ac 2 -y /tmp/ML_test.wav'+quiet_str) | |
return clf.predict([process_wave_data('/tmp/ML_test.wav')])[0] | |
def youtube_test(_url): | |
_null=os.system('rm /tmp/ML_test.wav'+quiet_str) | |
_null=os.system("(rm -f /tmp/ML_tmp.*; youtube-dl -o '/tmp/ML_tmp.%(ext)s' '"+_url+"')"+quiet_str) | |
_null=os.system('ffmpeg -i /tmp/ML_tmp.* -f wav -acodec pcm_s16le -ar 44100 -ac 2 -y /tmp/ML_test.wav'+quiet_str) | |
return clf.predict([process_wave_data('/tmp/ML_test.wav')]) | |
def record_test(): | |
_null=os.system('rm /tmp/ML_test.wav'+quiet_str) | |
print('REC...') | |
_null=os.system('arecord -t wav -f S16_LE -r 44100 -c 2 /tmp/ML_test.wav'+quiet_str) | |
return clf.predict([process_wave_data('/tmp/ML_test.wav')]) | |
def save_recording(): | |
_null=os.system('rm /tmp/ML_test.wav'+quiet_str) | |
print('REC...') | |
_null=os.system('arecord -t wav -f S16_LE -r 44100 -c 2 /tmp/ML_test.wav'+quiet_str) | |
insert(process_wave_data('/tmp/ML_test.wav')) | |
write(len(x),len(x)) | |
ask_y() | |
def youtube_askname(_url,name): | |
global db,y | |
_null=os.system('rm /tmp/ML_test.wav'+quiet_str) | |
_null=os.system("(rm -f /tmp/ML_tmp.*; youtube-dl -o '/tmp/ML_tmp.%(ext)s' '"+_url+"')"+quiet_str) | |
_null=os.system('ffmpeg -i /tmp/ML_tmp.* -f wav -acodec pcm_s16le -ar 44100 -ac 2 -y /tmp/ML_test.wav'+quiet_str) | |
insert(process_wave_data('/tmp/ML_test.wav')) | |
write(len(x),len(x)) | |
datas=[] | |
datas+=[name] | |
for cnt in range(len(db.columns)): | |
datas+=[''] | |
with open(database,'r+') as _tmp: | |
_tmp.seek(_tmp.seek(0,2)-1) | |
if _tmp.read()!='\n': _tmp.write('\n'+','.join(datas)) | |
else: _tmp.write(','.join(datas)) | |
update_from_ldb() | |
def update_to_ldb(): | |
open(database+'.bak','w').write(open(database).read()) | |
db.to_csv(database) | |
def ask_y(): | |
global db | |
datas=[] | |
datas+=[input('Name : ')] | |
for column in db.columns: | |
datas+=[input('Data for ('+str(column)+') : ')] | |
with open(database,'r+') as _tmp: | |
_tmp.seek(_tmp.seek(0,2)-1) | |
if _tmp.read()!='\n': _tmp.write('\n'+','.join(datas)) | |
else: _tmp.write(','.join(datas)) | |
update_from_ldb() | |
def record_test_multiple(): | |
global test | |
test=np.array([],'int8') | |
while True: | |
test=np.append(test,record_test()) | |
input('Continue??') | |
### [CODE] Define variables (still to be cleaned up)
# Note : Directory paths should end with '/'
### Configuration zone | |
# Selecting the Main path | |
main_path='/home/np-chaonay/Misc/MusicDataset-2/' | |
# Songs metadata file location | |
database=main_path+'database.csv' | |
# Training data location | |
path=main_path+'Songs/' | |
# Temporary uploading file location | |
uploading_path=main_path+'Uploaded/' | |
# Uploaded song location | |
uploaded_songs=uploading_path+'Songs/' | |
# Uploaded test audio location | |
uploaded_testing=uploading_path+'Testing/' | |
# Prefixing filename of training audio data (Including folder path) | |
root=path+'dsb-' | |
# Temporary directory | |
temp_path='/tmp/np-chaonay.ML/' | |
### [Configuration zone END] | |
# Label encoding class | |
le=LabelEncoder() | |
# Quiet string for ignore console I/O of os.system() | |
quiet_str=' </dev/null 1>/dev/null 2>/dev/null' | |
### [CODE] Defining personal modules/functions/variables | |
# Recording_1 | |
def record_1(): | |
global test | |
test=np.array([],'int8') | |
while True: | |
print('(Odd)') | |
test=np.append(test,record_test()) | |
input('Continue?? (Even)') | |
print('(Even)') | |
test=np.append(test,record_test()) | |
input('Continue?? (Odd)') | |
# Youtube_1 | |
# Note: Check current csv file header and this function before use. | |
def YT(): | |
_url=input('URL:') | |
name=input('NAME:') | |
emotion=input('EMOTION:') | |
key=input('KEY:') | |
global db,y,clf | |
_null=os.system('rm /tmp/ML_test.wav'+quiet_str) | |
_null=os.system("(rm -f /tmp/ML_tmp.*; youtube-dl -o '/tmp/ML_tmp.%(ext)s' '"+_url+"')"+quiet_str) | |
_null=os.system('ffmpeg -i /tmp/ML_tmp.* -f wav -acodec pcm_s16le -ar 44100 -ac 2 -y /tmp/ML_test.wav'+quiet_str) | |
snd=process_wave_data('/tmp/ML_test.wav') | |
print('RESULT:',clf.predict([snd])) | |
insert(snd) | |
write(len(x),len(x)) | |
datas=[] | |
datas+=[name,emotion,key] | |
for cnt in range(len(db.columns)-2): | |
datas+=[''] | |
with open(database,'r+') as _tmp: | |
_tmp.seek(_tmp.seek(0,2)-1) | |
if _tmp.read()!='\n': _tmp.write('\n'+','.join(datas)) | |
else: _tmp.write(','.join(datas)) | |
update_from_ldb() | |
def readURL(url): | |
url=list(url) | |
mark=-1 | |
while True: | |
for i,char in enumerate(url): | |
if i<=mark: continue | |
if char=='%': | |
mark=i | |
url[i]=escape_decode('\\x'+''.join(url[i+1:i+3]))[0] | |
del url[i+2]; del url[i+1] | |
break | |
else: break | |
for i,char in enumerate(url): | |
if type(char) is str: continue | |
c=1 | |
while True: | |
bytes_sum=bytes() | |
for byte in url[i:i+c]: | |
bytes_sum+=byte | |
try: url[i]=bytes_sum.decode() | |
except UnicodeDecodeError: c+=1 | |
else: del url[i+1:i+c]; break | |
return ''.join(url)[7:] | |
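readURL() above hand-rolls percent-decoding of a file:// URL; the standard library's urllib.parse.unquote performs the same decoding, so an equivalent sketch (including the 7-character "file://" strip) is:

```python
from urllib.parse import unquote

# Percent-decode a file:// URL and drop the 7-character "file://" prefix,
# mirroring what readURL() returns.
def read_url(url):
    return unquote(url)[7:]

print(read_url('file:///tmp/a%20song.wav'))  # /tmp/a song.wav
```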
### [CODE] Getting things ready for use | |
# Getting song metadata from .csv file | |
update_from_ldb() | |
# Loading training audio data | |
read() | |
# Rename variable | |
x_raw=x_ | |
del x_ | |
# Set x_raw as default x | |
x=x_raw | |
# Sanity check: amount of database entries must match the amount of pre-processed audio data
if x.shape[0]==y.shape[0]==len(os.listdir(path)): pass
else : raise ResourceWarning('Length of database entries and amount of the pre-processed audio data aren\'t matched.')
# [CODE END] | |
###### Data Decompositions ###### | |
### [CODE] Dimensions Reduction | |
dr=PCA() | |
dr.fit(x[:50]) | |
x_dr=dr.transform(x) | |
### [CODE] Set x_dr as default x | |
x=x_dr | |
###### Simple Testing ###### | |
### [CODE] Define and train the model | |
clf=LinearSVC(max_iter=10**4*5) | |
clf.fit(x,y) | |
### [CODE] Alarm when training ends | |
print('\a'); sleep(0.75); print('\a'); sleep(0.75); print('\a'); | |
### [CODE] Testing | |
# Define the true classification | |
y_t=np.array([ 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]) | |
# Create predicted list | |
y_p=[] | |
# For loop for testing | |
for no in range(1,15): | |
y_p+=[clf.predict([process_wave_data('/home/np-chaonay/Music/Imported/โรงเรียนบดินทรเดชา (สิงห์ สิงหเสนี) ๒/Track '+str(no)+'.wav')])[0]] | |
# Convert to Numpy for easily comparing | |
y_p=np.array(y_p) | |
# Report the score | |
while True: | |
print('# Array comparison') | |
print('Predicted:',y_p) | |
print(' True:',y_t) | |
print('\n# Simple statistics comparison') | |
_n=len(y_t[y_t==y_p]) | |
_d=len(y_t) | |
_p=_n/_d | |
print('Total Score:',_n,'of',_d,str(round(100*_p,1))+'%') | |
_n=len(y_t[np.logical_and(y_t==0,y_p==0)]) | |
_d=len(y_t[y_t==0]) | |
_p=_n/_d | |
print('Major Score:',_n,'of',_d,str(round(100*_p,1))+'%') | |
_n=len(y_t[np.logical_and(y_t==1,y_p==1)]) | |
_d=len(y_t[y_t==1]) | |
_p=_n/_d | |
print('Minor Score:',_n,'of',_d,str(round(100*_p,1))+'%') | |
print('\n# Confusion matrix comparison') | |
	print(confusion_matrix(y_t,y_p))
break | |
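The Total/Major/Minor scores printed above are overall accuracy and per-class recall; the same bookkeeping in compact NumPy form (y_t is the truth vector from this section, while y_p here is a hypothetical prediction vector for illustration):

```python
import numpy as np

y_t = np.array([0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_p = np.array([0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1])  # hypothetical

def recall(cls):
    # Fraction of true members of `cls` that were predicted as `cls`
    mask = y_t == cls
    return np.mean(y_p[mask] == cls)

print(round(float(np.mean(y_t == y_p)), 3))  # 0.857  (Total Score)
print(round(float(recall(0)), 3))            # 0.889  (Major Score)
print(round(float(recall(1)), 3))            # 0.8    (Minor Score)
```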
###### Testing multiple models ###### | |
# Models to train : LinearSVC,SVC,SGDClassifier,DecisionTreeClassifier,KNeighborsClassifier,GaussianNB,MultinomialNB,ComplementNB,BernoulliNB,CategoricalNB,MLPClassifier,BernoulliRBM | |
### [CODE] Define the true classification | |
y_t=np.array([ 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]) | |
### [CODE] Import necessary modules | |
import sklearn.svm | |
import sklearn.linear_model | |
import sklearn.tree | |
import sklearn.neighbors | |
import sklearn.naive_bayes | |
import sklearn.neural_network | |
### [CODE] Define multiple models | |
Types=[ | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='relu'), | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='tanh'), | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='logistic'), | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='identity'), | |
] | |
### [CODE] Define lists of model predictions and failed models + define a variable for the identity strings of the created models and retrieve each identity string into that list
model_p=[] | |
Iden=[] | |
Error_Models=[] | |
Error_Models_Indexes=[] | |
for Type in Types: | |
Iden+=[repr(Type)] | |
### [CODE] Training | |
for i,Type in enumerate(Types): | |
# Training | |
clf=Type | |
# Catch exception for anything that would go wrong | |
try: clf.fit(x[:50],y[:50]) | |
except Exception as err: | |
print_err(err,'training the model.') | |
		# Announce the model that didn't work
Error_Models+=[Iden[i]] | |
Error_Models_Indexes+=[i] | |
# Delete unused tested model | |
Types[i]=None | |
else: | |
# Create predicted list | |
y_p=[] | |
# For loop for testing | |
for no in range(1,15): | |
# IN CASE of not using dimension reduction | |
#y_p+=[clf.predict([process_wave_data('/home/np-chaonay/Music/Imported/โรงเรียนบดินทรเดชา (สิงห์ สิงหเสนี) ๒/Track '+str(no)+'.wav')])[0]] | |
# IN CASE of using dimension reduction | |
y_p+=[clf.predict(dr.transform([process_wave_data('/home/np-chaonay/Music/Imported/โรงเรียนบดินทรเดชา (สิงห์ สิงหเสนี) ๒/Track '+str(no)+'.wav')]))[0]] | |
# Convert to Numpy for easily comparing | |
y_p=np.array(y_p) | |
# Add to models prediction list | |
model_p+=[y_p] | |
### [CODE] Console alarm when training ends | |
print('\a'); sleep(0.75); print('\a'); sleep(0.75); print('\a'); | |
### [CODE] If any models experienced an error, report them and remove them from Iden
if Error_Models: | |
	print('[WARNING] Some of the defined models experienced errors.')
while Error_Models_Indexes: | |
del Iden[Error_Models_Indexes[-1]] | |
del Error_Models_Indexes[-1] | |
### [CODE] Report the results | |
for i in range(len(Iden)): | |
print('\n'+'#'*80+'\n') | |
print('Model #'+str(i+1)+': '+Iden[i]+'\n') | |
print('# Array comparison') | |
print('Predicted:',model_p[i]) | |
print(' True:',y_t) | |
print('\n# Simple statistics comparison') | |
_n=len(y_t[y_t==model_p[i]]) | |
_d=len(y_t) | |
_p=_n/_d | |
print('Total Score:',_n,'of',_d,str(round(100*_p,1))+'%') | |
_n=len(y_t[np.logical_and(y_t==0,model_p[i]==0)]) | |
_d=len(y_t[y_t==0]) | |
_p=_n/_d | |
print('Major Score:',_n,'of',_d,str(round(100*_p,1))+'%') | |
_n=len(y_t[np.logical_and(y_t==1,model_p[i]==1)]) | |
_d=len(y_t[y_t==1]) | |
_p=_n/_d | |
print('Minor Score:',_n,'of',_d,str(round(100*_p,1))+'%') | |
print('\n# Confusion matrix comparison') | |
print(confusion_matrix(y_t,model_p[i])) | |
# [CODE END] | |
###### Analyzing audio wave ###### | |
### [CODE] Define threading function and variables | |
def thread(no,Queue): | |
range_=len(wave)//4 | |
for i in range(range_): | |
try: freq_list[i]=44100/get_wavelength(wave,i+(no-1)*range_) | |
except ResourceWarning as err: | |
if no!=4: print('[Warning] Couldn\'t find wavelength even in the early threads. (Wave may too short) [At Thread:',no,']') | |
freq_list[i]=freq_list[i-1] | |
Queue.put(freq_list) | |
songs=[path+'dsb-'+str(i+1) for i in range(50)] | |
freqarrays=[main_path+'FreqArrays/dsb-'+str(i+1) for i in range(50)] | |
### [CODE] Do a multi-threading audio wave analyzing and save as pickled file | |
for i in range(50): | |
wave=scp_io_wav.read(songs[i])[1] | |
freq_list=np.zeros((len(wave)//4),'float32') | |
Queue_Process1 = multiprocessing.Queue() | |
Queue_Process2 = multiprocessing.Queue() | |
Queue_Process3 = multiprocessing.Queue() | |
Queue_Process4 = multiprocessing.Queue() | |
Process1 = multiprocessing.Process(target=thread, args=(1,Queue_Process1)) | |
Process2 = multiprocessing.Process(target=thread, args=(2,Queue_Process2)) | |
Process3 = multiprocessing.Process(target=thread, args=(3,Queue_Process3)) | |
Process4 = multiprocessing.Process(target=thread, args=(4,Queue_Process4)) | |
Process1.start(); Process2.start(); Process3.start(); Process4.start() | |
Obj_1=Queue_Process1.get(); Obj_2=Queue_Process2.get(); Obj_3=Queue_Process3.get(); Obj_4=Queue_Process4.get() | |
pickle.dump(reduce(np.append,(Obj_1,Obj_2,Obj_3,Obj_4)),open(freqarrays[i],'wb')) | |
### [CODE] Load pickled files | |
freqarray=np.column_stack([pickle.load(open(file,'rb')) for file in freqarrays]) | |
###### Code Snippet ###### | |
# save_youtube Ex.1 | |
Name=''; Rate=None; Genre=None; EmotionType=None; Key=''; KeyDegree=None; BPM=None; BeatsPerMeasure=None; IsCompound=None; save_youtube('',(Name,Rate,Genre,EmotionType,Key,KeyDegree,BPM,BeatsPerMeasure,IsCompound),start=None,end=None,pitch=None) | |
# save_youtube Ex.2 (Init_Before_Use,For_Execute) | |
1) | |
Rate=None; Genre=None; EmotionType=None; KeyDegree=None; BPM=None; BeatsPerMeasure=None; IsCompound=None; | |
2) | |
print('[INFO] Starting saving data of the song.') | |
while True: | |
try: Data=0; Name=''+' ('+['Maj','Min'][Data]+')'; Key=['Maj','Min'][Data]; save_youtube('',(Name,Rate,Genre,EmotionType,Key,KeyDegree,BPM,BeatsPerMeasure,IsCompound),start=None,end=None,pitch=None) | |
except Exception as err: print_err(err,action='saving data, retrying',type_='Error'); continue | |
else: break | |
# SVC with different C constant | |
sklearn.svm.SVC(C=0.1), # C must be strictly positive; C=0.0 would raise ValueError
sklearn.svm.SVC(C=0.5), | |
sklearn.svm.SVC(C=1.0), | |
sklearn.svm.SVC(C=10.0), | |
sklearn.svm.SVC(C=100.0), | |
# LinearSVC with different C constant | |
sklearn.svm.LinearSVC(C=0.5), | |
sklearn.svm.LinearSVC(C=1.0), | |
sklearn.svm.LinearSVC(C=10.0), | |
sklearn.svm.LinearSVC(C=100.0), | |
# MLPClassifier with different activation functions | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='relu'), | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='tanh'), | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='logistic'), | |
sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(10,10,10,10,10,10,10,10,10,10),activation='identity'), |