SHA-4096 revised this gist
Aug 25, 2024
# GSoC 2024 Report

Project: Unified IDL for Apache Dubbo-Go

# Project Description

This project aims to provide Apache Dubbo with a Protobuf IDL translation tool that unifies interface definitions across IDL and Non-IDL modes. The toolset is meant to generate code that invokes the Server and Client APIs of Dubbo regardless of the protocol mode (IDL or Non-IDL). In addition, a toolset for translating Java interface definitions to Protobuf IDL is being developed to simplify the coordination of microservice development across programming languages. The project has the potential to grow into an even more unified microservice development workflow by letting users generate code directly from existing interface definitions written in various programming languages.

# Project Progress Overview

Work Done:

- Unified IDL extension files for Hessian type mapping and transport protocol specification
- Enable developers to define Hessian2-serialized services in Protobuf IDL
- Support the translation from Java interfaces to Protobuf service definitions
- Fix bugs in interoperation between Dubbo-Java and Dubbo-Go

Work Yet to Be Done:

- Support more complex Java types in the Java2Proto plugin
- Support direct interface migration from Dubbo-Java to Dubbo-Go
- Enable dubbo-cli to manage protoc plugins and code generation options

# Project Usage & Code Insight

The Unified-IDL project provides developers with the following features:

1. Generate invocation functions using Hessian2 serialization from user-defined Protobuf IDL files
2. Generate Protobuf IDL files from Java interface definitions
3. Specify the transport protocol (e.g. DUBBO/TRIPLE) in Protobuf IDL files

## Unified IDL for Hessian2

### Usage

The following two links are examples of developing Hessian2-serialized services with Protobuf IDL as the interface definition:

- [dubbo-go-samples/java_interop/non-protobuf-dubbo](https://github.com/apache/dubbo-go-samples/tree/main/java_interop/non-protobuf-dubbo)
- [dubbo-go-samples/java_interop/non-protobuf-triple](https://github.com/apache/dubbo-go-samples/tree/main/java_interop/non-protobuf-triple)

In these samples, the invocation files are already generated in the proto folder. In non-protobuf-dubbo, they are generated by running the following command:

```
protoc -I ./ \
  --go-hessian2_out=./ --go-hessian2_opt=paths=source_relative \
  --go-dubbo_out=./ --go-dubbo_opt=paths=source_relative \
  ./greet.proto
```

This command uses the (previously installed) protoc plugins [protoc-gen-go-dubbo](https://github.com/dubbogo/protoc-gen-go-dubbo) and [protoc-gen-go-hessian2](https://github.com/dubbogo/protoc-gen-go-hessian2) for code generation. The former generates the invocation part, and the latter handles type registration and Hessian2 serialization.

### Code Insight

How does the code work? In protoc-gen-go-hessian2, the .proto file is passed to the `ProcessProtoFile` function, which extracts the contents of the `file` variable into a `Hessian2Go` struct:

```
func ProcessProtoFile(g *protogen.GeneratedFile, file *protogen.File) (*Hessian2Go, error) {
	hessian2Go := &Hessian2Go{
		File:         file,
		Source:       file.Proto.GetName(),
		ProtoPackage: file.Proto.GetPackage(),
	}
	// collect top-level enums
	for _, enum := range file.Enums {
		hessian2Go.Enums = append(hessian2Go.Enums, processProtoEnum(g, enum))
	}
	// collect top-level messages
	for _, message := range file.Messages {
		hessian2Go.Messages = append(hessian2Go.Messages, processProtoMessage(g, file, message))
	}
	return hessian2Go, nil
}
```

The `Hessian2Go` struct is then passed to the `generator.GenHessian2` function for code generation.
```
func GenHessian2(g *protogen.GeneratedFile, hessian2Go *Hessian2Go) {
	genPreamble(g, hessian2Go)
	genPackage(g, hessian2Go)
	g.QualifiedGoIdent(protogen.GoIdent{
		GoName:       "dubbo-go-hessian2",
		GoImportPath: "github.com/apache/dubbo-go-hessian2",
	})
	for _, enum := range hessian2Go.Enums {
		genEnum(g, enum)
	}
	for _, message := range hessian2Go.Messages {
		genMessage(g, message)
	}
	genRegisterInitFunc(g, hessian2Go)
}
```

The genXxx functions use `g.P` to emit lines of generated code; the plugin writes the result to its stdout stream, where the protoc binary collects it. For example, genPreamble uses `g.P` to generate the preamble of the generated file:

```
func genPreamble(g *protogen.GeneratedFile, hessian2Go *Hessian2Go) {
	g.P("// Code generated by protoc-gen-go-dubbo. DO NOT EDIT.")
	g.P()
	g.P("// Source: ", hessian2Go.Source)
	g.P("// Package: ", strings.ReplaceAll(hessian2Go.ProtoPackage, ".", "_"))
	g.P()
}
```

To support Hessian2 serialization, the generated code relies on the [dubbo-go-hessian2](https://github.com/apache/dubbo-go-hessian2) repository:

```
func genRegisterInitFunc(g *protogen.GeneratedFile, hessian2Go *Hessian2Go) {
	g.P("func init() {")
	// register every message (and its nested messages) as a POJO
	for _, message := range hessian2Go.Messages {
		if message.Desc.IsMapEntry() || message.ExtendArgs {
			continue
		}
		g.P("dubbo_go_hessian2.RegisterPOJO(new(", message.GoIdent.GoName, "))")
		for _, inner := range message.Messages {
			if inner.Desc.IsMapEntry() {
				continue
			}
			g.P("dubbo_go_hessian2.RegisterPOJO(new(", inner.GoIdent.GoName, "))")
		}
	}
	// register enums that are mapped to a Java class
	for _, e := range hessian2Go.Enums {
		g.P()
		if e.JavaClassName != "" {
			g.P("for v := range ", e.GoIdent.GoName, "_name {")
			for range e.Values {
				g.P("dubbo_go_hessian2.RegisterJavaEnum(", e.GoIdent.GoName, "(v))")
			}
			g.P("}")
		}
	}
	g.P("}")
	g.P()
}
```

When it comes to protoc-gen-go-dubbo, an extra option for transport protocol specification is introduced.
Take service-level transport protocol specification as an example:

```
service GreetingsService {
  option (unified_idl_extend.service_protocol) = {
    protocol_name: "DUBBO";
  };
  ...
}
```

This option is used in the ProcessProtoFile method in protoc-gen-go-dubbo:

```
for _, service := range file.Services {
	serviceProtocolOpt, ok := proto.GetExtension(service.Desc.Options(), unified_idl_extend.E_ServiceProtocol).(*unified_idl_extend.ServiceProtocolTypeOption)
	if serviceProtocolSpecFlag && ok {
		// check for nil first, then compare the protocol name
		if serviceProtocolOpt == nil || serviceProtocolOpt.GetProtocolName() != unified_idl_extend.ProtocolType_DUBBO.String() {
			// skip the service which is not dubbo protocol or does not have a service option
			continue
		}
	}
	...
}
```

serviceProtocolSpecFlag is a CLI argument set by the user, whose default value is false. When it is set to true, the generator checks whether a service has the option unified_idl_extend.service_protocol. If the option is absent or its protocol name is not "DUBBO", the related invocation functions are not generated.

## Java2Proto

[Java2Proto](https://github.com/dubbogo/dubbo-java2proto) is a tool that generates Protobuf IDL files from Java interface definitions.
The project is still at an early stage, but it can already generate methods with simple Java types as arguments.

### Usage

Suppose we have a set of Java files that define an interface:

```
//GreetingsService.java
package org.apache.dubbo.tri.hessian2.api;

public interface GreetingsService {
    GreetResponse greet(GreetRequest req);
}

//GreetRequest.java
package org.apache.dubbo.tri.hessian2.api;

public class GreetRequest implements java.io.Serializable {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

//GreetResponse.java
package org.apache.dubbo.tri.hessian2.api;

public class GreetResponse implements java.io.Serializable {
    private String greeting;

    public String getGreeting() {
        return greeting;
    }

    public void setGreeting(String greeting) {
        this.greeting = greeting;
    }
}
```

We can run the following commands in a terminal:

```
go run github.com/dubbogo/dubbo-java2proto --file ./GreetingsService.java
go run github.com/dubbogo/dubbo-java2proto --file ./GreetRequest.java
go run github.com/dubbogo/dubbo-java2proto --file ./GreetResponse.java
```

This will generate the following files:

```
//GreetingsService.proto
// Code generated by dubbo-java2proto. DO NOT EDIT.
syntax = "proto3";

package api;

option go_package = "org/apache/dubbo/tri/hessian2/api;api";

import "unified_idl_extend/unified_idl_extend.proto";

service GreetingsService {
  option (unified_idl_extend.service_extend) = {
    interface_name: "org.apache.dubbo.tri.hessian2.api.GreetingsService";
  };

  rpc greet(GreetRequest) returns (GreetResponse) {
    option (unified_idl_extend.method_extend) = {
      method_name: "greet";
    };
  }
}

//GreetRequest.proto
// Code generated by dubbo-java2proto. DO NOT EDIT.
syntax = "proto3";

package api;

option go_package = "org/apache/dubbo/tri/hessian2/api;api";

import "unified_idl_extend/unified_idl_extend.proto";

message GreetRequest {
  string name = 1;

  option (unified_idl_extend.message_extend) = {
    java_class_name: "org.apache.dubbo.tri.hessian2.api.GreetRequest";
  };
}

//GreetResponse.proto
// Code generated by dubbo-java2proto. DO NOT EDIT.
syntax = "proto3";

package api;

option go_package = "org/apache/dubbo/tri/hessian2/api;api";

import "unified_idl_extend/unified_idl_extend.proto";

message GreetResponse {
  string greeting = 1;

  option (unified_idl_extend.message_extend) = {
    java_class_name: "org.apache.dubbo.tri.hessian2.api.GreetResponse";
  };
}
```

These .proto files can then be used by a variety of protoc plugins to generate code in other languages.

### Code Insight

The most important problem to solve is how to obtain the syntax tree of the Java code. We chose [tree-sitter](https://tree-sitter.github.io/tree-sitter/) for the job. With [go-tree-sitter](http://github.com/smacker/go-tree-sitter) we were able to parse Java source code into a syntax tree. In the parser package, we define a JavaParser struct and parse the interface definition (and related class definitions) step by step.

```
type JavaParser struct {
	content     []byte       // raw source of the Java file
	ast         *sitter.Node // root node of the syntax tree
	PackageName string
	Interfaces  []*JavaInterface
	Classes     []*JavaClass
}
```

The NewParser function reads the source file, obtains the syntax tree with tree-sitter, and wraps this information in a JavaParser:

```
func NewParser(filepath string) (*JavaParser, error) {
	bytes, err := os.ReadFile(filepath)
	if err != nil {
		return nil, err
	}

	parser := sitter.NewParser()
	parser.SetLanguage(java.GetLanguage())
	tree, err := parser.ParseCtx(context.Background(), nil, bytes)
	if err != nil {
		return nil, err
	}

	jp := &JavaParser{
		content: bytes,
		ast:     tree.RootNode(),
	}
	return jp, nil
}
```

Then ParseFile is called to walk the child nodes of the AST root and parse them according to their types.
These parseXxx functions are then called recursively and eventually save the parsed data into the fields of the JavaParser:

```
func (jp *JavaParser) ParseFile() {
	for i := 0; i < int(jp.ast.ChildCount()); i++ {
		child := jp.ast.Child(i)
		typ := child.Type()
		switch typ {
		case PackageDecl:
			jp.parsePackage(child)
		case InterfaceDecl:
			jp.parseInterface(child)
		case ClassDecl:
			jp.parseClass(child)
		}
	}
}

func (jp *JavaParser) parseInterface(node *sitter.Node) {
	ji := new(JavaInterface)
	for i := 0; i < int(node.ChildCount()); i++ {
		child := node.Child(i)
		typ := child.Type()
		switch typ {
		case Modifiers:
			// check if the interface is public
			// don't generate code for non-public interface
			if child.Content(jp.content) != "public" {
				// return, not break: a bare break would only leave the switch
				return
			}
		case Identifier:
			identifier := child.Content(jp.content)
			ji.Name = identifier
		case InterfaceBody:
			jp.parseInterfaceBody(child, ji)
			jp.Interfaces = append(jp.Interfaces, ji)
		}
	}
}
```

# Related Repositories

- [dubbo-go](https://github.com/apache/dubbo-go)
- [dubbo-go-samples](https://github.com/apache/dubbo-go-samples)
- [protoc-gen-go-dubbo](https://github.com/dubbogo/protoc-gen-go-dubbo)
- [protoc-gen-go-hessian2](https://github.com/dubbogo/protoc-gen-go-hessian2)
- [Java2Proto](https://github.com/dubbogo/dubbo-java2proto)
SHA-4096 created this gist
Aug 22, 2024