Darshan M.S. MSDarshan91

## stopwords-kn.txt
ಮತ್ತು
ಈ
ಒಂದು
ರಲ್ಲಿ
ಹಾಗೂ
ಎಂದು
ಅಥವಾ
ಇದು
ರ
ಅವರು

## blog.html
<!DOCTYPE html>
<html>
<body>

<h1>Extracting Skills from Personal Communication Data using StackExchange Dataset</h1>

<p>In this blog post, we will see how to use the publicly available Stack Exchange data dump to extract skills from communication data.
First, download the Stack Exchange dataset, which is available <a href="https://archive.org/details/stackexchange">here</a>. There are many Stack Exchange sites, such as stackoverflow, cs, datascience, physics, history, and so on. You can download only the compressed files you need, or download the entire dump using torrents. Since we were using Linux on an OpenStack framework, we had to download the torrent files from the terminal; more information about downloading torrents from the command line is <a href="https://www.learn2crack.com/2013/10/download-torrent-using-terminal.html">here</a>. After downloading the files, extract the 7z archives (this can be done in one script). Each 7z file corresponds to a stackexchange

## KNN.py
import csv
import random
import math
import operator

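def loadDataset(filename, split, trainingSet=None, testSet=None):
    # Use None defaults: mutable default lists would be shared across calls.
    trainingSet = [] if trainingSet is None else trainingSet
    testSet = [] if testSet is None else testSet
    # Assuming Python 3: csv.reader expects a text-mode file opened with newline=''.
    with open(filename, newline='') as csvfile: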
        lines = csv.reader(csvfile)
        dataset = list(lines)
        for x in range(len(dataset)-1):