@naviji
Last active October 31, 2021 16:41
GSoC 2020 Work Product Submission for Joplin

Naveen M V - GSoC 2020

What got done

The project consisted of three parts:

  1. Make search better by introducing additional search filters. (e.g., tags, notebook, type)
  2. Make the ranking of search results better by implementing the Okapi BM25 relevance function.
  3. Make fuzzy search possible.

Of these, the first two have been merged. The third is nearing completion.

The documentation for the new search filters can be found here.

Discussions and weekly project reports can be found here.
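
To give a feel for what a filter query involves, here is a minimal Python sketch that splits a query string into filter terms and plain-text terms. It is an illustration only, not Joplin's parser: the `Term` class, the `parse_query` function, and the `KNOWN_FILTERS` set are hypothetical names, the filter list is limited to the three filters mentioned above, and the `name:value` syntax with a leading `-` for negation is assumed from the examples in this report.

```python
from dataclasses import dataclass
from typing import List

# Assumption: only the filters named in this report; the real list is longer.
KNOWN_FILTERS = {"tag", "notebook", "type"}


@dataclass(frozen=True)
class Term:
    name: str              # filter name, or "text" for a plain search word
    value: str
    negated: bool = False  # True when the token was prefixed with "-"


def parse_query(query: str) -> List[Term]:
    """Split a whitespace-separated query into filter and text terms."""
    terms = []
    for token in query.split():
        negated = token.startswith("-")
        body = token[1:] if negated else token
        name, sep, value = body.partition(":")
        if sep and name in KNOWN_FILTERS and value:
            # Recognised "name:value" filter.
            terms.append(Term(name, value, negated))
        elif body:
            # Anything else is treated as a plain search word.
            terms.append(Term("text", body, negated))
    return terms


if __name__ == "__main__":
    for term in parse_query("notebook:work tag:urgent -tag:draft report"):
        print(term)
```

Keeping each term as a small structured value, rather than passing the raw string to the search backend, is what makes it straightforward to add new filters later.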

What's left to do

Deploying the fuzzy search on all supported platforms is the main problem to be solved.

Because we are using the Spellfix extension, which needs to be compiled from source, there are some difficulties in packaging and deploying the feature.

Code contributions

  1. All: Add search filters
    Joplin's search previously used the full-text search (FTS) offered by SQLite almost directly, which lacked flexibility. For example, we couldn't restrict the search scope to a particular notebook or search based on tags.

    The current search implementation fixes most of these problems by providing a better abstraction over FTS, supporting many new filters while remaining flexible enough to add new ones in the future easily.

    Known limitations:

    1. We can't exclude a notebook from the search directly. E.g., -notebook:archive doesn't work.
    2. The search bar is not interactive. Providing suggestions or listing the recent searches could be useful.
  2. All: Weigh notes using Okapi BM25 score
    Okapi BM25 is a popular ranking function used by search engines to estimate a document's relevance to a given search query. It ranks a set of documents based on the query terms appearing in each document, regardless of their proximity.

  3. Desktop: Fuzzy search
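    Fuzzy search gives a better user experience by tolerating misspelled search terms: notes can still be found when the query is only close to the words they contain.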

    Known limitations:

    1. We need to build the spellfix extension from source.
    2. No support for mobile.
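
The Okapi BM25 ranking described in item 2 can be sketched in a few lines of Python. This is a standalone illustration of the scoring function, not the code merged into Joplin: the function name `bm25_scores`, the pre-tokenized inputs, and the parameter defaults `k1 = 1.2` and `b = 0.75` are assumptions, and the IDF term uses the common non-negative variant `log((N - n + 0.5) / (n + 0.5) + 1)`.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Return one BM25 score per document for the given query terms."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query:
            tf = term_freq[term]
            if tf == 0:
                continue  # a term absent from the document adds nothing
            n_t = doc_freq[term]
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
            # Longer-than-average documents are penalized through b.
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    notes = [
        ["apple", "banana"],
        ["apple", "apple", "cherry"],
        ["banana", "cherry", "date"],
    ]
    print(bm25_scores(["apple"], notes))
```

Because each document is treated as a bag of words, reordering its words leaves the score unchanged, which is the "regardless of their proximity" property noted above.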

Challenges and learnings

  1. Managing complexity and creating abstractions was the main challenge. Mentor feedback was helpful here.

  2. Another lesson was the value of test-driven development. It saved me hours of frustration when refactoring the code while implementing the search filters. (It was also quite satisfying seeing all those tests pass.)

  3. I'm also going to spend more time studying software architecture and design patterns. These become impossible to ignore as the code base grows larger. I have a lot to learn here.

@Anubhav-developr

great sir your project is interesting ...

@kirtanlab

what do you guide a beginner to start contributing to it?

@naviji
Author

naviji commented Oct 31, 2021

Look at issues tagged as good for beginners and try solving them. Make a PR with a fix and get feedback on how to make it better.

Finally, if everything goes well, the maintainers will merge your PR to the code base. That's a good feeling.

@kirtanlab

okk, it will be definitely thrilling experience for me @naviji
