airMeng/Add model support.md

## Add model support.md

      
1. Model weight conversion

  1.1 PyTorch weight parsing

  1.2 Tokenizer

2. Model enablement

  2.1 Model loading

      model class

      model struct (ffn, attn, norm tensors), tensor name mapping

      model_context

      model_load_internal (explanation of variables)

  2.2 Inference process

      model_eval_internal (explanation of variables)

  2.3 Application

      CMakeLists.txt

      Python bindings and scripts

3. Performance optimization

  3.0 Introduction to JBLAS, ISA

      https://github.com/intel-sandbox/jblas/blob/main/README.md

  3.1 Int4

      quantize (sym, bits, compute type, fallback)
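
To make the `sym` and `bits` options concrete, here is a minimal pure-Python sketch of symmetric per-group quantization. It is an illustration only, not the JBLAS kernel: the function names, the `group_size` parameter, and the list-based layout are all assumptions, and compute type and fallback handling are not modeled.

```python
def quantize_sym(weights, bits=4, group_size=32):
    """Symmetric per-group quantization (hypothetical helper, not a JBLAS API).

    Each group shares one scale = max|w| / qmax, and values are stored as
    signed integers in [-qmax, qmax] with no zero point.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    qvals, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        amax = max(abs(w) for w in group)
        scale = amax / qmax if amax > 0 else 1.0  # avoid division by zero
        scales.append(scale)
        for w in group:
            q = round(w / scale)
            qvals.append(max(-qmax, min(qmax, q)))  # clamp to the signed range
    return qvals, scales


def dequantize_sym(qvals, scales, group_size=32):
    """Recover approximate weights: w ~= q * scale of the owning group."""
    return [q * scales[i // group_size] for i, q in enumerate(qvals)]
```

With `bits=4` every stored value fits in the range -7..7, and the reconstruction error per weight is bounded by half of the group's scale.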
  
  3.2 MHA fusion

      basic MHA (check jblas_mha_xxx_support)

      GQA

      link-to-MHA-doc
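
As a small illustration of how GQA differs from basic MHA, the sketch below maps each query head to the key/value head it shares. The function name and the `n_head` / `n_head_kv` parameter names are assumptions chosen for this example; the fused kernel itself is not shown.

```python
def kv_head_for_query_head(q_head, n_head, n_head_kv):
    """Return the KV head index used by query head `q_head` (hypothetical helper).

    Consecutive query heads are grouped, and each group of
    n_head // n_head_kv query heads shares one KV head.
    """
    if n_head % n_head_kv != 0:
        raise ValueError("n_head must be a multiple of n_head_kv")
    if not 0 <= q_head < n_head:
        raise ValueError("q_head out of range")
    group_size = n_head // n_head_kv
    return q_head // group_size


# 8 query heads sharing 2 KV heads: the first four use KV head 0, the rest KV head 1.
mapping = [kv_head_for_query_head(h, n_head=8, n_head_kv=2) for h in range(8)]
```

Setting `n_head_kv == n_head` gives the identity mapping of basic MHA, and `n_head_kv == 1` gives multi-query attention.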
  
  3.3 FFN fusion

      basic FFN

      link-to-FFN-doc

  3.4 Tensor Parallel

      link to TP doc
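
For orientation, one common tensor-parallel scheme splits a weight matrix by columns across ranks. The sketch below simulates that in a single process with plain lists; it is an assumption about the general technique, not a description of the scheme in the linked TP doc, and all names are hypothetical.

```python
def matvec(x, w):
    """Compute x @ w, where x has length k and w is a k x n list of rows."""
    n = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n)]


def split_columns(w, world_size):
    """Split a k x n matrix into world_size column shards of equal width."""
    n = len(w[0])
    if n % world_size != 0:
        raise ValueError("column count must be divisible by world_size")
    width = n // world_size
    return [[row[r * width:(r + 1) * width] for row in w]
            for r in range(world_size)]


def column_parallel_matvec(x, w, world_size):
    """Simulate column-parallel x @ w: each shard stands in for one rank."""
    out = []
    for shard in split_columns(w, world_size):
        # Each rank computes its slice of the output independently;
        # concatenating the slices plays the role of an all-gather.
        out.extend(matvec(x, shard))
    return out
```

Because every rank holds full input `x` and only a slice of the columns, the concatenated result matches the single-device product.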