Name | Neel Shah
---|---
Organisation | CERN-HSF (ROOT Project)
Mentor | Dr. Lorenzo Moneta, Sanjiban Sengupta
Project | ROOT - TMVA SOFIE Developments - Inference Code Generation for Deep Learning models
Proposal Link: Inference Code Generation for Deep Learning models
This project focused on developing several missing deep learning operators, allowing more complex networks, such as Transformer-based models and Graph Network models, to be built and parsed within TMVA SOFIE.

The expected result is a working implementation of modular operator classes that implement the operators defined by the ONNX standard in the code-generation format. The project also requires writing the corresponding unit tests needed to validate the code.
Since this was my second GSoC with the same organisation, the process was similar to 2022 and I was already familiar with it. I have written a detailed blog on how I got into GSoC 2022 at CERN-HSF here.
- The ML ecosystem mostly focuses on model training.
- Machine learning inference & deployment is often neglected.
- Inference in TensorFlow & PyTorch:
  - supports only their own models
  - is difficult to use from a C++ environment
  - carries heavy dependencies
SOFIE (System for Optimized Fast Inference code Emit) is a deep learning inference engine that:

- takes ONNX files as input
- produces C++ code as output
TMVA SOFIE ("System for Optimized Fast Inference code Emit") generates C++ functions that can be easily invoked for fast inference of trained neural network models. It takes ONNX model files as input and produces C++ header files that can be included and used in a "plug-and-go" style. This is a new development in TMVA and is currently at an early, experimental stage.
1) Fix the implementation of the MatMul operator: added support for a standalone MatMul operator to be handled by the Gemm operator.
- Definition: MatMul ONNX Documentation
- Issue: Our earlier implementation only handled MatMul when fused with a following Add operator inside the Gemm operator, and it gave a runtime error when a standalone MatMul operator was used with a zero bias matrix.
- Solution: A zero bias matrix is now allowed, so the output of the MatMul operator alone can also be passed to Gemm.
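The semantics behind the fix can be sketched in numpy (an illustration of the ONNX operator definitions, not SOFIE's generated C++): Gemm computes alpha * A @ B + beta * C, so with a zero bias C it must reduce to a plain MatMul.

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

# Gemm as defined by ONNX: alpha * (A @ B) + beta * C (defaults alpha = beta = 1)
def gemm(A, B, C, alpha=1.0, beta=1.0):
    return alpha * (A @ B) + beta * C

# With a zero bias matrix, Gemm must match a standalone MatMul
zero_bias = np.zeros((2, 2))
assert np.allclose(gemm(A, B, zero_bias), np.matmul(A, B))
```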
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Support for Mat-Mul ONNX Operator | #12894 |
The Swish activation is defined as swish(x) = x * sigmoid(x). For example, in TensorFlow:

```python
import tensorflow as tf

# Element-wise x * sigmoid(x)
a = tf.constant([-20, -1.0, 0.0, 1.0, 20], dtype=tf.float32)
b = tf.keras.activations.swish(a)
b.numpy()
```
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Swish Activation Function | #12198 |
- Definition: Range Operator ONNX Documentation
- Generates a tensor containing a sequence of numbers that begins at start and extends by increments of delta up to limit (exclusive).
- The number of elements in the output of Range is computed as:

  number_of_elements = max( ceil( (limit - start) / delta ) , 0 )
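A minimal numpy sketch of this element-count rule (an illustration of the ONNX definition, not the SOFIE implementation):

```python
import math
import numpy as np

def onnx_range(start, limit, delta):
    # number_of_elements = max(ceil((limit - start) / delta), 0)
    n = max(math.ceil((limit - start) / delta), 0)
    return np.array([start + i * delta for i in range(n)])

onnx_range(3, 9, 3)  # -> array([3, 6]); 9 is excluded
```

Note that a negative delta counts downwards, and an empty tensor is produced when limit is already passed.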
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Range ONNX Operator | #13457 |
- Definition: TopK Operator ONNX Documentation
- Retrieves the top-K largest or smallest elements along a specified axis. Given an input tensor of shape [a_1, a_2, ..., a_n, r] and an integer argument k, it returns two outputs:
  - a Values tensor of shape [a_1, a_2, ..., a_{axis-1}, k, a_{axis+1}, ..., a_n] containing the values of the top k elements along the specified axis;
  - an Indices tensor of the same shape containing the indices of the top k elements (original indices from the input tensor).
- If "largest" is 1 (the default value), the k largest elements are returned.
- If "sorted" is 1 (the default value), the resulting k elements are sorted.
- If "sorted" is 0, the order of the returned Values and Indices is undefined.
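These semantics can be sketched in numpy (illustrative only; `topk` here is a hypothetical helper, not SOFIE's implementation):

```python
import numpy as np

def topk(x, k, axis=-1, largest=True):
    # Sort along `axis` (descending when largest=True) and keep the first k entries.
    order = np.argsort(-x if largest else x, axis=axis)
    idx = np.take(order, range(k), axis=axis)
    vals = np.take_along_axis(x, idx, axis=axis)
    return vals, idx

x = np.array([[1, 3, 2],
              [5, 4, 6]])
values, indices = topk(x, k=2)
# values -> [[3, 2], [6, 5]], indices -> [[1, 2], [2, 0]]
```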
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
TopK ONNX Operator | #12942 |
- Definition: Log Operator ONNX Documentation
- Calculates the natural logarithm of the given input tensor, element-wise.
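In numpy terms, the operator corresponds to `np.log`:

```python
import numpy as np

x = np.array([1.0, np.e, np.e ** 2])
y = np.log(x)  # element-wise natural logarithm, as in the ONNX Log operator
```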
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Log ONNX Operator | #12945 |
- Definition: Erf Operator ONNX Documentation
- Computes the error function of the given input tensor, element-wise.
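An illustrative sketch using Python's `math.erf` (not the SOFIE implementation):

```python
import math
import numpy as np

x = np.array([-1.0, 0.0, 1.0])
# Element-wise error function; erf is odd, so erf(-x) = -erf(x)
y = np.array([math.erf(v) for v in x])
```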
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Erf ONNX Operator | #13104 |
- Definition: Where Operator ONNX Documentation
- Returns elements chosen from either X or Y depending on condition. Where behaves like numpy.where with three parameters.
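Since the behaviour matches `numpy.where` directly, a one-line example covers it:

```python
import numpy as np

condition = np.array([True, False, True])
X = np.array([1, 2, 3])
Y = np.array([10, 20, 30])
out = np.where(condition, X, Y)  # picks from X where True, else from Y
```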
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Where ONNX Operator | #13093 |
- A weight file isn't created by default when generating the code.
- Options were added to save the weights either in a text file or in a ROOT binary file.
- Earlier, reading and writing were supported only for .dat (text data) files; now the weights can also be stored in .root files, with the corresponding read and write facilities available as well.
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Save Weights in Root Binary files | #13423 |
- Definition: Equal Operator ONNX Documentation
- Returns the tensor resulting from performing the equal logical operation element-wise on the input tensors A and B (with Numpy-style broadcasting support).
- The other comparison operators are implemented similarly; their documentation is linked below.
- Definition: Less Operator ONNX Documentation
- Definition: LessOrEqual Operator ONNX Documentation
- Definition: Greater Operator ONNX Documentation
- Definition: GreaterOrEqual Operator ONNX Documentation
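The Numpy-style broadcasting these operators support can be shown directly in numpy (a sketch of the semantics, not SOFIE's generated code):

```python
import numpy as np

A = np.array([[1, 2, 3]])   # shape (1, 3)
B = np.array([[2], [3]])    # shape (2, 1); broadcasts with A to (2, 3)

less = A < B                # element-wise Less with broadcasting
equal = A == B              # element-wise Equal with broadcasting
```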
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Comparison ONNX Operators (Less, Greater, LessOrEqual, GreaterOrEqual, Equal) | #13171 |
- Definition: ConstantOfShape Operator ONNX Documentation
- Generates a tensor with a given value and shape.
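In numpy terms this corresponds to `np.full`:

```python
import numpy as np

shape = (2, 3)
value = 0.5
# Tensor of the given shape, every element set to the given value
out = np.full(shape, value, dtype=np.float32)
```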
- Definition: Elu Operator ONNX Documentation
- Elu takes one input tensor and produces one output tensor, where the function

  f(x) = alpha * (exp(x) - 1) for x < 0, f(x) = x for x >= 0

  is applied to the tensor element-wise.
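The formula above can be sketched in numpy (illustrative; `elu` is a hypothetical helper, not SOFIE's code):

```python
import numpy as np

def elu(x, alpha=1.0):
    # f(x) = alpha * (exp(x) - 1) for x < 0, f(x) = x for x >= 0
    return np.where(x < 0, alpha * (np.exp(x) - 1.0), x)

elu(np.array([-1.0, 0.0, 2.0]))
```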
- PR Status:
Pull Request | PR Number | Status |
---|---|---|
Elu ONNX Operator | #13544 |
- Definition: Tile Operator ONNX Documentation
- Constructs a tensor by tiling a given tensor. This is the same as the tile function in NumPy, but without broadcasting. For example, with A = [[1, 2], [3, 4]] and B = [1, 2], tile(A, B) = [[1, 2, 1, 2], [3, 4, 3, 4]].
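The example above can be reproduced directly with `np.tile`:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
repeats = [1, 2]            # number of copies along each axis
out = np.tile(A, repeats)   # -> [[1, 2, 1, 2], [3, 4, 3, 4]]
```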
- Branch Link: https://github.com/Neel-Shah-29/root-1/tree/Tile
- Python tutorials for various C files of Tutorials/TMVA
- Documentation on RModelParser_ONNX.cxx
- Getting into GSoC 2022
- All about the Community Bonding Period
- Implementing the Operators in SOFIE
- Final Project Presentation GSoC 2022
- Mid-Term Presentation GSoC 2023
- Lightning Talk Presentation GSoC 2023
- Proposal for GSoC 2023
I really enjoyed working on this project! I would like to thank my mentors Lorenzo Moneta, Sitong An, Omar, Ahmat Hamdan and Sanjiban Sengupta for always being a great support. Whenever I needed help or guidance, they were there for me! I am proud to be surrounded by so many bright minds, and every single day I learn something new from them. In the end, I owe my success to the best wishes of my parents, seniors and friends, so a big thanks to them as well.

I hope you all enjoyed reading my blog and learnt a lot.
Thanks and Regards,
Neel Shah