Skip to content

Instantly share code, notes, and snippets.

@sak-codes
Created August 21, 2023 15:39
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save sak-codes/2c2d948f8ec285360247db6606cb4ad3 to your computer and use it in GitHub Desktop.
Save sak-codes/2c2d948f8ec285360247db6606cb4ad3 to your computer and use it in GitHub Desktop.

Extending the data structures and algorithms along with providing C++ backend

Project Mentors: Gagandeep Singh, Ivan Ogasawara, Smit Lunagariya, Ever Vino, Alexandre de Siqueira, Agustina Pesce, Saransh Chopra

This is the final report of the entire GSoC 2023 for the project titled as Extending the data structures and algorithms along with providing C++ backend. The complete description of the project implementation and discussions of the entire program is available at my personal blog site.

About Me

I am Sakshi Oza, final year Masters student at Indian Institute of Technology, Gandhinagar.

Project Overview: Phase-wise short description

I divided the whole project into three main phases to work on the specific topics that were supposed to be implemented during the GSoC.

  1. Community Bonding & Pre-GSOC: During this phase, I mainly focused on the easier topics that help me to get started with the official coding timeline. I dedicated my efforts to gaining a thorough understanding of the codebase.

  2. Phase 1: This phase consists of the official coding before prior to the mid evaluation. This phase was mainly focused on extending the existing scope of data structures and algorithms as well as fixing the existing issues.

  3. Phase 2: This phase consists of the official coding after the the mid evaluation. This phase included working on some new data structures along with adding the CPP backend to the current algorithms under the linear_data_structures submodule.

Pull Requests

This section describes the actual work done during the coding period in terms of merged pull requests.

Community Bonding & Pre-GSOC

  • #516 - This PR implements a method in DSU to find the size of the group of the given key.

  • #517 - This PR adds a method in the class Trie to check if the string as inserted before.

  • #521 - This PR implements a method to print all the members of the same group in DSU.

Phase 1

  • #530 - This PR closes the issue-478 by implementing the network flow algorithms: Edmond Karp.

  • #534 - This PR was built on the top of the above PR to add the Dinic algorithm for network flow.

  • #535 - This PR closes the issue-436 by implementing a test function that helps in testing the library installation as well as supports submodule specific tests.

  • #537 - This PR implements lower bound and upper bound in the BST data structure similar to CPP's STL.

Phase 2

  • #538 - This PR closes an issue-390 on adding a Multiset data structure similar to CPP's STL.

  • #539 - This PR adds Lazy Segment Tree implementation.

  • #540 - This PR focuses on the CPP backend of the project targeting the sorting algorithm by starting with bubble_sort.

  • #542 - This PR completed CPP backend of selection_sort.

  • #543 - This PR completed CPP backend of insertion_sort and is_ordered.

  • #544 - This PR extends CPP backend for search algorithms: linear_search, binary_search, and jump_search.

Examples

  1. Using the test function
>>> from pydatastructs import test
>>> test(["graphs"])
============================= test session starts ==============================
platform darwin -- Python 3.8.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /Users/thebigbool/repos/pydatastructs
plugins: cov-4.1.0, xdist-3.3.1, anyio-3.7.0
collected 11 items

pydatastructs/graphs/tests/test_adjacency_list.py .                      [  9%]
pydatastructs/graphs/tests/test_adjacency_matrix.py .                    [ 18%]
pydatastructs/graphs/tests/test_algorithms.py .........                  [100%]

============================== 11 passed in 0.05s ==============================

We can also test an imported submodule in the similar way:

>>> from pydatastructs import graphs
>>> test([graphs])
============================= test session starts ==============================
platform darwin -- Python 3.8.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /Users/thebigbool/repos/pydatastructs
plugins: cov-4.1.0, xdist-3.3.1, anyio-3.7.0
collected 11 items

pydatastructs/graphs/tests/test_adjacency_list.py .                      [  9%]
pydatastructs/graphs/tests/test_adjacency_matrix.py .                    [ 18%]
pydatastructs/graphs/tests/test_algorithms.py .........                  [100%]

============================== 11 passed in 0.02s ==============================
  1. Using a Multiset
>>> from pydatastructs.miscellaneous_data_structures import Multiset
>>> ms = Multiset()
>>> ms.add(5)
>>> ms.add(5)
>>> ms.add(3)
>>> ms.add(7)
>>> len(ms)
4
>>> 5 in ms
True
>>> 2 in ms
False
  1. Using a CPP backend
>>> from pydatastructs import OneDimensionalArray, Backend, linear_search
>>> array = OneDimensionalArray(int, [1, 2, 5, 7, 10, 29, 40])
>>> linear_search(array, 5, backend=Backend.CPP)
2
>>> linear_search(array, -5, backend=Backend.CPP) # not found
>>>

Conclusion

This summer has been a great learning experience. I am grateful to my mentors, Gagandeep Singh, and Smit Lunagariya for always helping me with the new concepts, reviewing my PRs and providing quick responses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment