Ömer Kırnap kirnap

## dataset.md

      
Last active: May 4, 2023 17:33

Training data

Sentence-transformers uses a concatenation of multiple datasets to fine-tune the model. The total number of sentence pairs is above 1 billion.
Each dataset was sampled with a weighted probability; the configuration is detailed in the `data_config.json` file.


| Dataset | Paper | Number of training tuples |
|---|---|---|
| Reddit comments (2015-2018) | paper | 726,484,430 |
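
As a rough illustration of what sampling datasets by weighted probability can look like, here is a minimal sketch. The function name `sample_dataset_names`, the dataset keys, and the weights are all hypothetical and are not taken from `data_config.json`, whose actual format is not shown here.

```python
import random
from collections import Counter


def sample_dataset_names(weights, num_draws, seed=0):
    """Draw dataset names with probability proportional to their weights."""
    rng = random.Random(seed)  # local RNG so the draws are reproducible
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=num_draws)


# Hypothetical weights for illustration only (assumption, not the real config).
example_weights = {"reddit_comments": 3.0, "other_dataset": 1.0}

draws = sample_dataset_names(example_weights, num_draws=10_000)
counts = Counter(draws)
print(counts)  # the heavier-weighted dataset should be drawn more often
```

In a training loop, each drawn name would decide which dataset the next batch comes from, so larger weights translate into a larger share of batches.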


## acl17.md

      
Last active: July 31, 2017 03:26

A user manual and summary of the ACL 2017 conference

Day 1


There were several tutorials; I attended two of them:
- NLP for precision medicine

They apply machine learning algorithms along with NLP techniques to the cancer cure detection task. They borrow sequence tagging, dependency parsing, and word embeddings, and apply them to their current research. They talked, as usual, about graph LSTMs and biLSTMs. (The presenter)
- Deep Learning for Dialogue systems

They briefly introduced end-to-end, conversation-based personal assistants such as Apple's Siri and Amazon's Alexa. In other words, this tutorial was a detailed overview of how to build dialogue systems on top of deep learning. The emphasis was on RL and on different structured LSTM-like architectures. Their slides (at least take a look at the references between slides 100 and 120) are quite explanatory, and the cited papers try to keep up with current research in that area.