@kanha-gupta
Last active August 29, 2023 07:52
Final work report - Google Summer of Code 2023 - Apache Software Foundation

GSoC 2023 - Final Project Report

The following report summarizes the work I did during Google Summer of Code 2023 with Apache ShardingSphere.

Contributor details

  • Name: Kanha Gupta

  • GitHub: kanha-gupta

  • Organization: Apache ShardingSphere (part of Apache Software Foundation)

  • Project Title: Enhance SQLNodeConverterEngine to support more MySQL SQL statements

  • Project listing on GSoC website

Tech Stack:

  • Advanced Java, Unit testing
  • Maven
  • ANTLR4
  • MySQL

About the project

Apache ShardingSphere acts as a database proxy and supports various databases such as MySQL, PostgreSQL, openGauss, and SQL Server. It parses SQL before sending queries to the databases for further processing. Its SQL parser engine parses the SQL given by the user, converts it into an AST node through the lexer and parser process, and then into a SQL statement from which features are extracted. That statement is the core input of the kernel processor, which converts the AST node to a SQL node using the federation engine in its architecture.

Because the federation engine is responsible for converting all types of AST nodes to SQL nodes, it has to support the different SQL queries of each database. Only when the conversion succeeds can Calcite be used for SQL optimization and the query finally be sent to the database for processing. At present, many SQL queries are not yet supported, so when a user runs one of them, an OptimizationSQLNodeConvertException is thrown.

This project therefore proposes to add support for 36 MySQL SELECT queries and to optimize the grammar wherever needed.
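To make the failure mode described above concrete, here is a minimal sketch of a converter that dispatches on the type of a parsed segment and throws when no converter is registered. Every name in it (`Segment`, `ColumnSegment`, `NotSegment`, `CollateSegment`, `UnsupportedConvertException`, `convert`) is a hypothetical stand-in chosen for illustration; it is not the ShardingSphere or Calcite API, and the real engine produces Calcite SQL nodes rather than strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ConverterSketch {

    /** Hypothetical stand-in for a parsed expression segment. */
    interface Segment { }

    static final class ColumnSegment implements Segment {
        final String name;
        ColumnSegment(String name) { this.name = name; }
    }

    static final class NotSegment implements Segment {
        final Segment inner;
        NotSegment(Segment inner) { this.inner = inner; }
    }

    /** A segment type for which no converter has been registered yet. */
    static final class CollateSegment implements Segment {
        final Segment inner;
        final String collation;
        CollateSegment(Segment inner, String collation) {
            this.inner = inner;
            this.collation = collation;
        }
    }

    /** Hypothetical analogue of the exception raised for unsupported SQL. */
    static final class UnsupportedConvertException extends RuntimeException {
        UnsupportedConvertException(String segmentType) { super(segmentType); }
    }

    // One converter per supported segment type; the string result stands in for a SQL node.
    private static final Map<Class<? extends Segment>, Function<Segment, String>> CONVERTERS = new HashMap<>();

    static {
        CONVERTERS.put(ColumnSegment.class, s -> ((ColumnSegment) s).name);
        CONVERTERS.put(NotSegment.class, s -> "NOT(" + convert(((NotSegment) s).inner) + ")");
    }

    /** Converts a segment, or throws if its type has no registered converter. */
    static String convert(Segment segment) {
        Function<Segment, String> converter = CONVERTERS.get(segment.getClass());
        if (converter == null) {
            throw new UnsupportedConvertException(segment.getClass().getSimpleName());
        }
        return converter.apply(segment);
    }

    public static void main(String[] args) {
        // Supported: prints NOT(is_active)
        System.out.println(convert(new NotSegment(new ColumnSegment("is_active"))));
        try {
            convert(new CollateSegment(new ColumnSegment("name"), "utf8mb4_bin"));
        } catch (UnsupportedConvertException ex) {
            // Unsupported: prints unsupported: CollateSegment
            System.out.println("unsupported: " + ex.getMessage());
        }
    }
}
```

In this simplified picture, adding support for a new kind of query amounts to registering one more converter so that the lookup no longer falls through to the exception.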

Goals:

Support for 36 SQL queries had to be implemented in the federation engine. These included special function queries, operator queries, expression queries, GROUP BY and ORDER BY queries, table queries, and some queries that Apache Calcite's SQL parser does not natively support, such as the MatchAgainst function and Collate expression queries.

Approach and workflow:

[Figure: ShardingSphere workflow]

Pull request and issues:

Coding Period

PRs and Issues related to GSoC project

  • Project listing on GitHub
  • Project listing on Jira
  • #24888: support for SELECT special functions
  • #25189: Trim functions support
  • #25228: Support for Extract, BitExpr Mod
  • #25268: Support for substring, dual, spatial function
  • #25318: Support for bitExpr with plus & minus interval
  • #25513: support for predicate with IN subquery
  • #25564: support for boolean primary SQL
  • #25650: support for EXPR with Is & Is not
  • #25594: support for NOT BETWEEN, NOT IN, COMPARISON SUBQUERY
  • #25684: support for NotSegment & NotConverter
  • #25773: support for OR sign
  • #25865: support for MOD, Vertical Bar

My honourable mentor, Mr. Zhengqiang Duan, refactored the SQLNodeConverterEngineIT file to handle SQL that the Calcite parser does not support (#25882).

  • #25921: support for TABLE prefix operator
  • #25953: support for operators that Calcite does not support
  • #26038: support for WindowSegment
  • #26246: NotConverter enhancements
  • #26256: support for MySQL operators
  • #26330: support for Match expression
  • #27361: support for CollateExpression
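Several of the PRs above concern MySQL operator spellings such as the OR sign, MOD, and the vertical bar. As a rough illustration of the kind of mapping involved, the sketch below normalizes a few MySQL operator tokens to canonical names through a lookup table. The class name, the `normalize` method, and the canonical names (`OR`, `AND`, `MOD`, `BIT_OR`) are assumptions made for this example and are not taken from the ShardingSphere code base. It also assumes MySQL's default SQL mode, in which `||` means logical OR rather than string concatenation.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class OperatorNormalizerSketch {

    // Hypothetical mapping from MySQL operator spellings to canonical operator names.
    private static final Map<String, String> CANONICAL = new HashMap<>();

    static {
        CANONICAL.put("||", "OR");      // logical OR in MySQL's default SQL mode
        CANONICAL.put("&&", "AND");     // logical AND
        CANONICAL.put("%", "MOD");      // modulo, symbol form
        CANONICAL.put("MOD", "MOD");    // modulo, keyword form
        CANONICAL.put("|", "BIT_OR");   // bitwise OR
    }

    /** Returns the canonical name for an operator token, or throws if it is unknown. */
    static String normalize(String operator) {
        String key = operator.trim().toUpperCase(Locale.ROOT);
        String canonical = CANONICAL.get(key);
        if (canonical == null) {
            throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
        return canonical;
    }

    public static void main(String[] args) {
        System.out.println(normalize("||"));   // prints OR
        System.out.println(normalize("mod"));  // prints MOD
    }
}
```

A table like this keeps dialect-specific spellings in one place, so the code that builds the converted node only has to deal with the canonical names.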

Result:

At the end of my term, all the proposed work was successfully completed. SQL parsing for SELECT queries was achieved.

Future Scope:

ShardingSphere supports query parsing for MySQL, Oracle, openGauss, PostgreSQL, SQL92, and SQL Server. Future work can add support for more of the currently unsupported queries in these databases, optimize the parsing logic, and improve support for other statements such as UPDATE, DELETE, INSERT, and EXPLAIN.

Conclusion:

I attribute my successful completion of this work to the invaluable guidance and unwavering support I received from my exceptional mentors: Mr. Zhengqiang Duan, Ms. Trista Pan, and Mr. Chuxin Chen. Their mentorship was instrumental in helping me navigate every aspect of this project. I am profoundly grateful to them and to the ShardingSphere community for giving me this incredible opportunity. I extend my heartfelt thanks to everyone involved.
