@ritesh99rakesh
Last active August 27, 2019 10:39
GSoC 2019 Report Ritesh : Statistics

Google Summer of Code 2019


Project: Enhancement of Statistics Module

Organization: SymPy

This report summarizes the work done in my GSoC 2019 project, Enhancement of Statistics Module, with SymPy. My mentors were Francesco Bonazzi and Sidhant Nagpal. A blog with a step-by-step account of the project's development is available at ritesh99rakesh.github.com.

About Me

I am Ritesh Kumar, a fourth-year Computer Science and Engineering student at the Indian Institute of Technology Kanpur.

Project Outline

The project plan focused on the following areas of statistics that were to be added to sympy.stats.

  1. Community Bonding - I was supposed to add the Inverse Gaussian, Levy, Non-central Beta, Beta-Binomial, Poisson binomial, Wishart, Inverse Wishart, Dirichlet, Inverted Dirichlet, Multivariate Pareto, Normal Inverse Gamma, Normal Inverse Wishart and Normal Wishart distributions.
  2. Phase 1 - I was supposed to work on exporting expressions of RVs to external libraries like NumPy, SciPy and PyMC3, including the API design and implementation.
  3. Phase 2 - I was expected to work on compound distributions, including their API design and implementation.
  4. Phase 3 - I planned to write a wrapper for exporting RVs to external libraries and to implement Random Walk.

Pull Requests

This section describes the actual work done during the coding period in terms of PRs.

  1. Community Bonding
  • #16814 : This PR implemented Inverse Normal distribution in sympy/stats/crv_types.py.

  • #16815 : This PR added Beta-Binomial distribution under sympy/stats/frv_types.py.

  • #16827 : This PR implemented Non-central Beta distribution under sympy/stats/crv_types.py.

  • #16843 : This PR enhanced the functions module by implementing the Multivariate Gamma function, which is used in the Wishart, Inverse Wishart and Matrix variate beta distributions. The code is under sympy/functions/special/gamma_functions.py.

  • #16848 : This PR added Log-logistic distribution under sympy/stats/crv_types.py.
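
For context on #16843, the multivariate gamma function has the closed form Γ_p(a) = π^(p(p−1)/4) · ∏_{j=1..p} Γ(a + (1 − j)/2). The sketch below evaluates that formula numerically using only the Python standard library. It illustrates the formula and is not the symbolic implementation from the PR; the name `multivariate_gamma` and the restriction to real a > (p − 1)/2 are assumptions made for this example.

```python
import math


def multivariate_gamma(a: float, p: int) -> float:
    """Numerically evaluate the multivariate gamma function Gamma_p(a).

    Uses Gamma_p(a) = pi**(p*(p-1)/4) * prod_{j=1..p} Gamma(a + (1 - j)/2).
    Hypothetical helper for illustration; restricted to real a > (p - 1)/2.
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    if a <= (p - 1) / 2:
        raise ValueError("a must exceed (p - 1)/2")
    result = math.pi ** (p * (p - 1) / 4)
    for j in range(1, p + 1):
        result *= math.gamma(a + (1 - j) / 2)
    return result


# For p = 1 the function reduces to the ordinary gamma function.
print(multivariate_gamma(3.0, 1))
# For p = 2 and a = 3/2 the product is sqrt(pi) * Gamma(3/2) * Gamma(1).
print(multivariate_gamma(1.5, 2))
```

The p = 1 case is a convenient sanity check, since the power of π becomes 1 and the product has a single factor.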

  2. Phase 1
  • #16858 : This PR added tests for the moment generating function and missing checks in crv_types.py.

  • #16935 : This PR enhanced the RV function interface by implementing the Kurtosis function under sympy/stats/rv_interface.py.

  • #16942 : This issue discusses the API design and implementation details for exporting expression of RVs to external libraries like NumPy, SciPy and PyMC3.

  • #17068 : This PR added Log Normal distribution to sympy/stats/crv_types.py.

  • #17070 : This PR added Exponential Power distribution under sympy/stats/crv_types.py.

  • #17076 : This PR enhanced continuous distributions by adding the Exponentially modified Gaussian distribution under sympy/stats/crv_types.py.

  • #17077 : This PR added Factorial Moment function to sympy/stats/rv_interface.py.

  • #17099 : This PR implemented Skellam distribution under sympy/stats/drv_types.py.
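
To make the two interface functions from #16935 and #17077 concrete, here is a minimal sketch of the quantities they compute, written for a finite distribution with exact rational arithmetic. It assumes the non-excess definition of kurtosis, E[(X − μ)^4] / σ^4, and the usual factorial moment E[X(X − 1)···(X − n + 1)]. The helper names and the `{value: probability}` representation are choices made for this example, not the code in sympy/stats/rv_interface.py.

```python
from fractions import Fraction


def expectation(pmf, g):
    """E[g(X)] for a finite distribution given as {value: probability}."""
    return sum(prob * g(x) for x, prob in pmf.items())


def kurtosis(pmf):
    """Kurtosis E[(X - mu)^4] / sigma^4 (not the excess kurtosis)."""
    mu = expectation(pmf, lambda x: x)
    var = expectation(pmf, lambda x: (x - mu) ** 2)
    fourth = expectation(pmf, lambda x: (x - mu) ** 4)
    return fourth / var ** 2


def factorial_moment(pmf, n):
    """n-th factorial moment E[X (X - 1) ... (X - n + 1)]."""
    def falling(x):
        out = 1
        for i in range(n):
            out *= x - i
        return out
    return expectation(pmf, falling)


# A fair six-sided die with exact rational probabilities.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(kurtosis(die))             # 303/175
print(factorial_moment(die, 2))  # 35/3
```

Using `Fraction` keeps the results exact, which mirrors the symbolic flavour of the module better than floating point would.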

  3. Phase 2
  • #16970 : This PR added PERT distribution in sympy/stats/crv_types.py.

  • #17004 : This PR enhances the function module by implementing the Incomplete Beta function under sympy/functions/special/beta_functions.py.

  • #17033 : This PR finalizes the prototype for exporting RVs to external libraries.

  • #17036 : This is one of the major PRs of the project. In this PR, I conceptualized compound distributions by implementing CompoundPSpace and CompoundDistribution. The code is in sympy/stats/compound_rv.py.

  • #17057 : This is one of the most difficult PRs, since it involved designing the API for sampling and interaction with external libraries. Refer to the comment. It implemented sample methods for continuous RVs under sympy/stats/crv_types.py and added tests.
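
As background for #17036, a compound distribution arises when a parameter of a distribution is itself a random variable, and the marginal is obtained by averaging over that parameter. The sketch below works this out for a Binomial(n, p) variable whose success probability p has a small finite prior. It is a standard-library illustration of the idea only: the function names are hypothetical and it does not use or reproduce the CompoundPSpace and CompoundDistribution classes.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def compound_binomial_pmf(k, n, p_prior):
    """Marginal P(X = k) when X | p ~ Binomial(n, p) and p has a finite prior.

    p_prior maps each possible value of p to its probability.
    """
    return sum(weight * binomial_pmf(k, n, p) for p, weight in p_prior.items())


# p is 1/4 or 3/4 with equal probability; X | p ~ Binomial(2, p).
prior = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
marginal = {k: compound_binomial_pmf(k, 2, prior) for k in range(3)}
print(marginal)
```

With a continuous prior the sum over p becomes an integral; for example, a Beta prior on p yields the Beta-Binomial distribution.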

  4. Phase 3
  • #17197 : This issue discusses the API design and implementation details of Random Walk.

  • #17199 : This PR refactored sympy/stats/joint_rv_types.py.

  • #17204 : This PR added Wishart distribution under sympy/stats/joint_rv_types.py.

  • #17210 : This PR added MultivariateNormal and MultivariateLaplace functions in sympy/stats/joint_rv_types.py.

  • #17257 : This PR added sample methods using libraries like NumPy, SciPy and PyMC3 for Discrete RVs in sympy/stats/drv_types.py.

  • #17268 : This was an enhancement PR to allow symbolic parameters for Joint RVs in sympy/stats/joint_rv_types.py.

  • #17445 : This PR added sample methods using libraries like NumPy, SciPy and PyMC3 for Finite RV in sympy/stats/frv_types.py.
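
For readers unfamiliar with the Random Walk topic raised in #17197, the following is a minimal numerical sketch of a simple one-dimensional random walk, where each step is +1 with probability `p_up` and −1 otherwise. It uses only the standard library's `random` module. The function name, parameters and seeding approach are assumptions for illustration, and the API actually discussed in the issue is not reproduced here.

```python
import random


def simple_random_walk(n_steps, p_up=0.5, seed=None):
    """Return the positions S_0, S_1, ..., S_n of a simple random walk.

    Hypothetical helper: starts at 0 and moves +1 with probability p_up,
    otherwise -1. A seed makes the path reproducible.
    """
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += 1 if rng.random() < p_up else -1
        path.append(position)
    return path


path = simple_random_walk(10, seed=0)
print(path)
```

A private `random.Random` instance is used so that seeding the walk does not disturb the global random state.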

Miscellaneous Work

This section contains some of my PRs related to miscellaneous issues such as bug fixes and workflow improvements.

  • #16820 : This was a bug-fix PR to solve an issue with the Range function in sympy/sets/fancysets.py.

  • #16886 : This was a bug-fix PR to correct the morse code for 1 and add tests for morse code in sympy/crypto/crypto.py.

  • #16888 : This PR added powsimp in _combine_inverse under sympy/core/mul.py and added corresponding tests.

  • #17037 : This PR was related to a change in the isqrt function in sympy/core/power.py.

  • #17239 : This was an enhancement PR to add relational operator printers for various languages in sympy/printing.

Future Work

Some more work is needed to complete sampling from external libraries. The last part of the compound distribution work also remains and will be completed very soon. In the longer term, more compound distributions could be added by following the present implementation pattern.
