Efficient Estimation of Word Representations in Vector Space (2013 Sept) link
Distributed Representations of Words and Phrases and their Compositionality (2013 Oct) link
GloVe: Global Vectors for Word Representation (2014 Jan) link
The Text2Shape task:
A note for sentence/doc embeddings: Summary for sentence/doc embeddings
This note briefly introduces three statistical learning methods used in NLP applications: Latent Dirichlet Allocation for topic modeling, conditional random fields for named entity recognition, and matrix factorization for latent semantic analysis.
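As a quick illustration of the matrix-factorization view, here is a minimal latent semantic analysis sketch with scikit-learn; the corpus and the number of latent dimensions are made-up placeholders, not taken from the note above:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

# Toy corpus (placeholder documents).
docs = [
    "word embeddings map words to vectors",
    "topic models describe documents as mixtures of topics",
    "matrix factorization reveals latent semantic structure",
]

# Build the term-document matrix, then factorize it with truncated SVD (LSA).
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)           # shape: (n_docs, n_terms)
svd = TruncatedSVD(n_components=2)      # 2 latent "semantic" dimensions
doc_vectors = svd.fit_transform(X)      # low-dimensional document representations
print(doc_vectors.shape)                # (3, 2)
```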
The Dirichlet distribution is a multivariate generalization of the beta distribution.
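Concretely, its density over the probability simplex has the standard form

$$
p(x_1,\dots,x_K \mid \alpha_1,\dots,\alpha_K)
= \frac{\Gamma\!\left(\sum_{i=1}^{K}\alpha_i\right)}{\prod_{i=1}^{K}\Gamma(\alpha_i)}
\prod_{i=1}^{K} x_i^{\alpha_i-1},
\qquad x_i \ge 0,\ \ \sum_{i=1}^{K} x_i = 1,
$$

and for $K=2$ it reduces to the beta distribution $\mathrm{Beta}(\alpha_1,\alpha_2)$.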
A general definition of representation: transform the original data (word/sentence/doc, or even images) into vectors that carry information and are useful as input for models.
Similar words/sentences/docs tend to be close to each other as measured by cosine distance (i.e., the original data are mapped to a meaningful vector space). This property is nice-to-have but not a must. It is useful for tasks like semantic similarity comparison, clustering, and information retrieval via semantic search.
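A minimal sketch of that cosine-similarity measurement with NumPy; the vectors here are made-up placeholders standing in for embeddings of two sentences:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity = dot product of the L2-normalized vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder embeddings (in practice they come from a word/sentence encoder).
v1 = np.array([0.2, 0.7, 0.1])
v2 = np.array([0.3, 0.6, 0.2])
print(cosine_similarity(v1, v2))   # close to 1.0 => semantically similar
```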
Examples of representations with/without the cosine-similarity property: word2vec provides it for word embeddings, and furthermore it preserves linear regularities among words (vector("King") - vector("Man") + vector("Woman") ≈ vector("Queen")). But averaged or last-layer embeddings from BERT are not guaranteed to have this property, and thus perform poorly when you apply cosine distance directly to measure sentence/doc similarity (often even worse than sentence embeddings obtained by simply averaging static word vectors such as GloVe).
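A minimal sketch of that analogy arithmetic with gensim; the pretrained-vector file path is a placeholder assumption:

```python
from gensim.models import KeyedVectors

# Load pretrained word2vec vectors (path is a placeholder; any word2vec-format
# file, e.g. the GoogleNews vectors, would work here).
kv = KeyedVectors.load_word2vec_format("word2vec.bin", binary=True)

# vector("King") - vector("Man") + vector("Woman") ~= vector("Queen")
print(kv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```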
As with word embeddings, for sentences and articles there are sequence auto-encoder models, which turn the text into a vector representation, and sequence decoder models, which unfold a vector representation and return something meaningful like text, tags, or labels.
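A minimal sketch of such an encoder/decoder pair in PyTorch (a GRU-based sequence auto-encoder; all sizes and names are illustrative assumptions, not a reference implementation from any of the papers above):

```python
import torch
import torch.nn as nn

class SeqAutoEncoder(nn.Module):
    """Encode a token sequence into one vector, then decode it back to token logits."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        emb = self.embed(tokens)
        _, h = self.encoder(emb)               # h: (1, batch, hidden) = sentence vector
        dec_out, _ = self.decoder(emb, h)      # teacher-forced reconstruction
        return self.out(dec_out), h.squeeze(0) # reconstruction logits + sentence embedding

model = SeqAutoEncoder()
tokens = torch.randint(0, 1000, (2, 7))        # toy batch of token ids
logits, sent_vec = model(tokens)
print(logits.shape, sent_vec.shape)            # (2, 7, 1000) (2, 128)
```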
In the famous paper "Attention Is All You Need" published in 2017, researchers at Google proposed the Transformer, an encoder-decoder model built only on the attention mechanism. Before this paper, there was already plenty of prior work on neural network encoders and decoders. However, unlike the Transformer, which is based solely on attention mechanisms, most of those earlier encoders/decoders relied on recurrent or convolutional structures. Compared with a 1-D CNN, which can only focus on fixed-length parts of the sentence sequence due to the limited size of its convolution kernels, attention, computed as a weighted average over all positions, can handle the whole sentence sequence.
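A minimal sketch of that weighted-average view, the scaled dot-product attention from the Transformer paper, written in NumPy with toy shapes as assumptions:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # similarity of each query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over all positions
    return weights @ V                                 # weighted average of the values

# Toy example: 4 positions, dimension 8; every output position can see the whole sequence.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 8)
```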
Before starting the discussion, go through the following three questions:
With vim,