For details, see http://www.cc.iitd.ernet.in/ and http://mirror.iitd.ernet.in
Thanks to Vibhav Sinha for helping compile this.
The paper can be found here.
This paper explores an alternative paradigm for machine learning that more closely models the diversity, competence and cumulative nature of human learning (called never-ending learning). It compares current machine learning systems, which have a very narrow scope and learn only a single function from very specific and limited training examples in a particular format, with the broad learning that humans undergo. It also presents a case study of NELL (Never-Ending Language Learner) from CMU and discusses the achievements and shortcomings of the system. The paper has little formalism and nearly no mathematical concreteness, but it explores a powerful machine learning paradigm, backed by intuitive reasoning, that may see the light of day in the future. For some concreteness we start by defining any general purpose agent (supervised learning) in machine learning c
The paper can be found here.
The paper is written in a very elegant and easy-to-understand way and is divided into 6 parts, covering topics from implementation to usage at Google. It exposes the reader to the immense power of functional languages and explains how the MapReduce programming model is inspired by them.
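To make the functional inspiration concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases, using word count as the example. The function names (`mapReduce`, `wordCountMap`, `wordCountReduce`) are made up for illustration, and nothing here is distributed or fault tolerant as a real implementation would be.

```javascript
// Single-process sketch of the MapReduce programming model.
// mapFn turns one input record into a list of [key, value] pairs;
// reduceFn folds all values that share a key into one result.
function mapReduce(inputs, mapFn, reduceFn) {
  // Map phase: apply mapFn to every record and flatten the pairs.
  const intermediate = inputs.flatMap((record) => mapFn(record));

  // Shuffle phase: group the intermediate values by key.
  const groups = new Map();
  for (const [key, value] of intermediate) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }

  // Reduce phase: one reduceFn call per distinct key.
  const output = {};
  for (const [key, values] of groups) {
    output[key] = reduceFn(key, values);
  }
  return output;
}

// Word count: emit [word, 1] for each word, then sum the ones per word.
const wordCountMap = (line) =>
  line.toLowerCase().split(/\s+/).filter(Boolean).map((word) => [word, 1]);
const wordCountReduce = (_word, ones) => ones.reduce((a, b) => a + b, 0);

const result = mapReduce(
  ["the quick fox", "the lazy dog", "The end"],
  wordCountMap,
  wordCountReduce
);
console.log(result);
```

The user only writes the two small functions; grouping by key is the framework's job, which is what lets a real system spread the map and reduce calls over many machines.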
Maintaining mutual consistency across sites under updates, insertions and deletions is a non-trivial and significant problem when a database is replicated. Although it sounds reasonable to maintain a list of all replication servers and send updates directly to all of them whenever an update occurs at a site, this can put a large network load on the link of the node where the update originates. Also, when nodes are constantly joining and leaving, keeping a consistent list of a few hundred thousand or a million nodes at every site is itself difficult. In the face of these problems, the algorithms described in the paper come in handy.
The described algorithms have been used in the clearinghouse servers of the Xerox Corporate Internet and have proven to be very useful.
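As a rough illustration of the alternative to direct updates, the sketch below has every site periodically reconcile with one randomly chosen peer, so an update spreads like an epidemic without any site contacting everyone. This is a simplified gossip-style exchange written for this summary, not necessarily the paper's exact algorithm: the helper names, the integer timestamps and the "newer timestamp wins" rule are all assumptions, and deletions (which would need some kind of tombstone marker) are not handled.

```javascript
// Each site holds a replica: key -> { value, timestamp }.
function makeSite(name) {
  return { name, db: new Map() };
}

// Local write at a single site, stamped with a (hypothetical) logical timestamp.
function write(site, key, value, timestamp) {
  site.db.set(key, { value, timestamp });
}

// Two-way reconciliation: for every key, both sites end up with the newer entry.
function reconcile(a, b) {
  const keys = new Set([...a.db.keys(), ...b.db.keys()]);
  for (const key of keys) {
    const ea = a.db.get(key);
    const eb = b.db.get(key);
    if (!ea || (eb && eb.timestamp > ea.timestamp)) a.db.set(key, eb);
    else if (!eb || ea.timestamp > eb.timestamp) b.db.set(key, ea);
  }
}

// One round: every site contacts one random other site (needs >= 2 sites).
function gossipRound(sites, random = Math.random) {
  for (const site of sites) {
    let peer;
    do {
      peer = sites[Math.floor(random() * sites.length)];
    } while (peer === site);
    reconcile(site, peer);
  }
}

// True when every replica holds exactly the same entries.
function converged(sites) {
  const snapshot = (s) =>
    JSON.stringify([...s.db.entries()].sort(([x], [y]) => (x < y ? -1 : x > y ? 1 : 0)));
  return sites.every((s) => snapshot(s) === snapshot(sites[0]));
}

const sites = ["A", "B", "C", "D", "E"].map((name) => makeSite(name));
write(sites[0], "printer-42", "building 3", 1);
write(sites[3], "printer-42", "building 7", 2); // newer update at another site

let rounds = 0;
while (!converged(sites) && rounds < 100) {
  gossipRound(sites);
  rounds++;
}
console.log(`converged after ${rounds} round(s)`);
console.log(sites.map((s) => s.db.get("printer-42").value));
```

No site needs the full membership list to be perfectly up to date, and the load of spreading an update is shared across all nodes instead of falling on the originating one.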
var wikipedia = require("wikipedia-js"); // using wikipedia-js library for search
var options = {query: query, format: "html", summaryOnly: false, lang: "en"};
wikipedia.searchArticle(options, function(err, htmlWikiText){ // searching wikipedia with options
  if(err){
    console.log("An error occurred[query=%s, error=%s]", query, err);
    return;
  }
  callback('<div style="text-align: justify !important;">'+htmlWikiText+'</div>'); // Pretty print the wiki-html-text
});
var watson = require('watson-developer-cloud');
var NaturalLanguageUnderstandingV1 = require('watson-developer-cloud/natural-language-understanding/v1.js');
var nlu = new NaturalLanguageUnderstandingV1(<API SECRET>);
function getConcepts(text,callback){
  nlu.analyze({
    'html': text, // Search concepts related to the text in the buffer/String
    'features': { // Search concepts related to these keywords/concepts.
      'concepts': {}, 'keywords': {},}
The paper can be found here.
GFS is a scalable distributed file system for large, distributed, data-intensive applications. Fault tolerance and high streaming performance are built in, even though it runs on commodity-grade hardware.
The paper can be found here
Put in simple words: the paper presents a way to classify text without any annotated data (i.e. unsupervised), using only some minimal domain knowledge. The paper uses the domain of reviews, where the domain knowledge is knowing that "excellent" is positive while "poor" is negative sentiment.
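One way to turn those two seed words into a classifier is to score every word by how much more strongly it co-occurs with "excellent" than with "poor" in unlabelled text, and then average the scores over a review. The sketch below does this with a pointwise-mutual-information style score. The tiny corpus, the document-level co-occurrence window, the smoothing constant and all helper names are assumptions made for illustration; the paper's own counting scheme may differ.

```javascript
// Lowercase word tokens; a deliberately crude tokenizer.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Document-level hit counts over an unlabelled corpus (hypothetical helper).
function buildCounts(corpus) {
  const docs = corpus.map((doc) => new Set(tokenize(doc)));
  return {
    hits: (w) => docs.filter((d) => d.has(w)).length,
    coHits: (w1, w2) => docs.filter((d) => d.has(w1) && d.has(w2)).length,
  };
}

// SO(word) = PMI(word, "excellent") - PMI(word, "poor")
//          = log2( co(word, excellent) * hits(poor) / (co(word, poor) * hits(excellent)) )
// A small smoothing term avoids division by zero.
function semanticOrientation(word, counts, smoothing = 0.01) {
  const num = (counts.coHits(word, "excellent") + smoothing) * (counts.hits("poor") + smoothing);
  const den = (counts.coHits(word, "poor") + smoothing) * (counts.hits("excellent") + smoothing);
  return Math.log2(num / den);
}

// Average the orientation of the words in a review; positive average => positive review.
function classifyReview(review, counts) {
  const words = tokenize(review);
  const total = words.reduce((sum, w) => sum + semanticOrientation(w, counts), 0);
  const score = total / Math.max(1, words.length);
  return { score, label: score > 0 ? "positive" : "negative" };
}

// Unlabelled reviews: no sentiment annotations are used anywhere.
const corpus = [
  "Excellent camera with sharp photos",
  "Sharp screen and excellent battery",
  "Poor battery and blurry photos",
  "Blurry screen, poor build",
];
const counts = buildCounts(corpus);

console.log(classifyReview("sharp photos", counts).label);
console.log(classifyReview("blurry screen", counts).label);
```

Here "sharp" only ever appears alongside "excellent" and "blurry" only alongside "poor", so they get positive and negative scores respectively, while words that co-occur equally with both seeds score zero.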
The paper can be found here
Automatic question generation for sentences from passages in reading comprehension
The paper can be found here
Put in simple words: the paper presents a method for training a model when you have only a small amount of (labelled) data in the domain you are working on, but have access to loads of (labelled) data from some other domain. The paper is so named because the author suggests it can be frustrating to discover that simple methods like those illustrated perform reasonably well and are such difficult benchmarks to beat.
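One simple method of this kind is feature augmentation: every feature vector is copied into a "shared" block plus a block specific to its own domain, and an ordinary classifier is then trained on the combined data. The sketch below shows only that mapping. Whether it matches the paper's exact formulation is an assumption here, and the `augment` helper and the example vectors are hypothetical.

```javascript
// Map a feature vector into the augmented space
// [shared copy, source-only copy, target-only copy].
function augment(features, domain) {
  const zeros = new Array(features.length).fill(0);
  if (domain === "source") return [...features, ...features, ...zeros];
  if (domain === "target") return [...features, ...zeros, ...features];
  throw new Error(`unknown domain: ${domain}`);
}

// Plenty of labelled data from the other domain, only a little from ours.
const sourceExamples = [[1, 0], [0, 1], [1, 1]];
const targetExamples = [[1, 0]];

// The combined training set can be fed to any standard supervised learner.
const trainingSet = [
  ...sourceExamples.map((x) => augment(x, "source")),
  ...targetExamples.map((x) => augment(x, "target")),
];
console.log(trainingSet);
```

The shared block lets the learner reuse whatever behaves the same in both domains, while the domain-specific blocks let it learn different weights where the domains disagree.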