Project: Unified IDL for Apache Dubbo-Go
This project aims to provide Apache Dubbo with a Protobuf IDL translation tool that unifies interface definition across IDL and Non-IDL modes. The toolset can generate code that invokes the Server and Client APIs of Dubbo regardless of the protocol mode (IDL or Non-IDL). In addition, a tool for translating Java interface definitions to Protobuf IDL is being developed to simplify coordination when microservices are written in different programming languages. The project has the potential to grow into a more unified microservice development workflow by letting users generate code directly from existing interface definitions written in various programming languages.
Work Done:
- Unified IDL extension files for Hessian type mapping and transport protocol specification
- Enable developers to define Hessian2-serialized services in Protobuf IDL
- Support the translation from Java interfaces to Protobuf service definitions
- Fix bugs in interoperation between Dubbo-Java and Dubbo-Go
Work Yet to Be Done:
- Support more complex Java types in the Java2Proto plugin
- Support direct interface migration from Dubbo-Java to Dubbo-Go
- Enable dubbo-cli to manage protoc plugins and code generation options
The Unified-IDL project provides developers with the following features:
- Generate invocation functions using Hessian2 serialization from user-defined Protobuf IDL files
- Generate Protobuf IDL files from Java interface definition
- Specify the transport protocol (e.g. DUBBO/TRIPLE) in Protobuf IDL files
The following two links are examples of developing Hessian2-serialized services with Protobuf IDL as the interface definition:
dubbo-go-samples/java_interop/non-protobuf-dubbo
dubbo-go-samples/java_interop/non-protobuf-triple
In these samples, the invocation files are already generated in the proto folder. In non-protobuf-dubbo, they were generated by running the following command:
protoc -I ./ \
    --go-hessian2_out=./ --go-hessian2_opt=paths=source_relative \
    --go-dubbo_out=./ --go-dubbo_opt=paths=source_relative \
    ./greet.proto
This command uses the (previously installed) protoc plugins protoc-gen-go-dubbo and protoc-gen-go-hessian2 for code generation. The former generates the invocation part, and the latter handles type registration and Hessian2 serialization.
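The link between the flags in the command and the installed plugins follows protoc's general convention: a flag of the form --NAME_out makes protoc look for an executable called protoc-gen-NAME on the PATH (generators built into protoc are the exception). The minimal sketch below only illustrates that naming rule; pluginBinary is a hypothetical helper and is not part of protoc or Dubbo-Go.

```go
package main

import (
	"fmt"
	"strings"
)

// pluginBinary is a hypothetical helper that returns the executable name
// protoc would look up for a flag of the form --NAME_out[=value].
// The boolean is false for any other flag, such as --NAME_opt.
func pluginBinary(flag string) (string, bool) {
	if !strings.HasPrefix(flag, "--") {
		return "", false
	}
	key := strings.TrimPrefix(flag, "--")
	if i := strings.Index(key, "="); i >= 0 {
		key = key[:i] // drop the "=value" part
	}
	if !strings.HasSuffix(key, "_out") || key == "_out" {
		return "", false
	}
	return "protoc-gen-" + strings.TrimSuffix(key, "_out"), true
}

func main() {
	flags := []string{
		"--go-hessian2_out=./",
		"--go-hessian2_opt=paths=source_relative",
		"--go-dubbo_out=./",
		"--go-dubbo_opt=paths=source_relative",
	}
	for _, f := range flags {
		if name, ok := pluginBinary(f); ok {
			fmt.Printf("%s -> %s\n", f, name)
		}
	}
}
```

The _opt flags do not select a plugin; they only pass parameters to the plugin chosen by the matching _out flag.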
How does the code work?
In protoc-gen-go-hessian2, the parsed .proto file is passed to the ProcessProtoFile function, which extracts the contents of the file variable into a Hessian2Go struct:
func ProcessProtoFile(g *protogen.GeneratedFile, file *protogen.File) (*Hessian2Go, error) {
	hessian2Go := &Hessian2Go{
		File:         file,
		Source:       file.Proto.GetName(),
		ProtoPackage: file.Proto.GetPackage(),
	}
	for _, enum := range file.Enums {
		hessian2Go.Enums = append(hessian2Go.Enums, processProtoEnum(g, enum))
	}
	for _, message := range file.Messages {
		hessian2Go.Messages = append(hessian2Go.Messages, processProtoMessage(g, file, message))
	}
	return hessian2Go, nil
}
The hessian2Go struct is then passed to the generator.GenHessian2 function for code generation:
func GenHessian2(g *protogen.GeneratedFile, hessian2Go *Hessian2Go) {
	genPreamble(g, hessian2Go)
	genPackage(g, hessian2Go)
	g.QualifiedGoIdent(protogen.GoIdent{
		GoName:       "dubbo-go-hessian2",
		GoImportPath: "github.com/apache/dubbo-go-hessian2",
	})
	for _, enum := range hessian2Go.Enums {
		genEnum(g, enum)
	}
	for _, message := range hessian2Go.Messages {
		genMessage(g, message)
	}
	genRegisterInitFunc(g, hessian2Go)
}
The genXxx functions use g.P to append lines to the in-memory buffer of the generated file. When the plugin finishes, the buffered output is written to the plugin's stdout, where the protoc binary reads it. For example, genPreamble uses g.P to generate the preamble of the code:
func genPreamble(g *protogen.GeneratedFile, hessian2Go *Hessian2Go) {
	g.P("// Code generated by protoc-gen-go-dubbo. DO NOT EDIT.")
	g.P()
	g.P("// Source: ", hessian2Go.Source)
	g.P("// Package: ", strings.ReplaceAll(hessian2Go.ProtoPackage, ".", "_"))
	g.P()
}
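To make the behavior of g.P concrete without pulling in the protobuf module, here is a minimal sketch that uses only the standard library. The lineBuffer type is a hypothetical stand-in for protogen.GeneratedFile, and the file name greet.proto and package greet.v1 are illustrative values, not taken from the samples. It shows that each call to P writes its arguments back to back and ends the line.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// lineBuffer is a hypothetical stand-in for protogen.GeneratedFile.
// It only collects generated lines in memory.
type lineBuffer struct {
	buf bytes.Buffer
}

// P writes its arguments one after another with no separator,
// then terminates the line. Calling P() emits an empty line.
func (b *lineBuffer) P(v ...interface{}) {
	for _, x := range v {
		fmt.Fprint(&b.buf, x)
	}
	b.buf.WriteByte('\n')
}

// genPreamble mirrors the shape of the function above, but takes
// plain strings instead of a Hessian2Go struct.
func genPreamble(b *lineBuffer, source, protoPackage string) {
	b.P("// Code generated by protoc-gen-go-dubbo. DO NOT EDIT.")
	b.P()
	b.P("// Source: ", source)
	b.P("// Package: ", strings.ReplaceAll(protoPackage, ".", "_"))
	b.P()
}

func main() {
	var b lineBuffer
	genPreamble(&b, "greet.proto", "greet.v1")
	fmt.Print(b.buf.String())
}
```

Keeping the output in a buffer until generation is complete means a failure partway through never leaves a half-written file behind.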
To use Hessian2 serialization, the generated code relies on the dubbo-go-hessian2 repository. The generator emits an init function that registers every message and Java enum with it:
func genRegisterInitFunc(g *protogen.GeneratedFile, hessian2Go *Hessian2Go) {
	g.P("func init() {")
	for _, message := range hessian2Go.Messages {
		if message.Desc.IsMapEntry() || message.ExtendArgs {
			continue
		}
		g.P("dubbo_go_hessian2.RegisterPOJO(new(", message.GoIdent.GoName, "))")
		for _, inner := range message.Messages {
			if inner.Desc.IsMapEntry() {
				continue
			}
			g.P("dubbo_go_hessian2.RegisterPOJO(new(", inner.GoIdent.GoName, "))")
		}
	}
	for _, e := range hessian2Go.Enums {
		g.P()
		if e.JavaClassName != "" {
			g.P("for v := range ", e.GoIdent.GoName, "_name {")
			for range e.Values {
				g.P("dubbo_go_hessian2.RegisterJavaEnum(", e.GoIdent.GoName, "(v))")
			}
			g.P("}")
		}
	}
	g.P("}")
	g.P()
}
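The message loop above can be sketched with plain structs to show which messages end up registered. Everything in this sketch is a simplified assumption: the message type is hypothetical, the ExtendArgs flag is carried over by name only (its meaning is not described here), and the nested map entry GreetResponse_LabelsEntry is an invented example. Enum registration is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a hypothetical, simplified view of a parsed proto message.
type message struct {
	GoName     string
	IsMapEntry bool      // synthetic message created for a map field
	ExtendArgs bool      // flag from the original generator; meaning assumed opaque
	Nested     []message // messages declared inside this one
}

// renderInit returns the text of an init function that registers every
// eligible message, applying the same skip rules as the loop above.
func renderInit(msgs []message) string {
	var sb strings.Builder
	sb.WriteString("func init() {\n")
	for _, m := range msgs {
		if m.IsMapEntry || m.ExtendArgs {
			continue
		}
		fmt.Fprintf(&sb, "\tdubbo_go_hessian2.RegisterPOJO(new(%s))\n", m.GoName)
		for _, in := range m.Nested {
			if in.IsMapEntry {
				continue
			}
			fmt.Fprintf(&sb, "\tdubbo_go_hessian2.RegisterPOJO(new(%s))\n", in.GoName)
		}
	}
	sb.WriteString("}\n")
	return sb.String()
}

func main() {
	fmt.Print(renderInit([]message{
		{GoName: "GreetRequest"},
		{GoName: "GreetResponse", Nested: []message{
			{GoName: "GreetResponse_LabelsEntry", IsMapEntry: true},
		}},
	}))
}
```

Map entry messages are skipped because they only exist to model map fields and have no Java class of their own to register.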
For protoc-gen-go-dubbo, an extra option for specifying the transport protocol is introduced. Take service-level transport protocol specification as an example:
service GreetingsService {
    option (unified_idl_extend.service_protocol) = {
        protocol_name: "DUBBO";
    };
    ...
}
This option is used in the ProcessProtoFile method in protoc-gen-go-dubbo:
for _, service := range file.Services {
	serviceProtocolOpt, ok := proto.GetExtension(service.Desc.Options(), unified_idl_extend.E_ServiceProtocol).(*unified_idl_extend.ServiceProtocolTypeOption)
	if serviceProtocolSpecFlag && ok {
		if serviceProtocolOpt.GetProtocolName() != unified_idl_extend.ProtocolType_DUBBO.String() || serviceProtocolOpt == nil {
			// skip the service which is not dubbo protocol or does not have a service option
			continue
		}
	}
	...
}
serviceProtocolSpecFlag is a CLI argument set by the user; its default value is false. When it is set to true, the generator checks whether each service has the option unified_idl_extend.service_protocol. If the option is absent or its protocol_name is not "DUBBO", the invocation code for that service is not generated.
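The effect of the flag can be sketched with plain structs. This is a simplified illustration under stated assumptions, not the plugin's code: the service type and selectServices function are hypothetical, an empty Protocol field stands for a service without the service_protocol option, and the service names other than GreetingsService are invented.

```go
package main

import "fmt"

// service is a hypothetical stand-in for a parsed proto service.
// Protocol is "" when no service_protocol option is set.
type service struct {
	Name     string
	Protocol string
}

// selectServices returns the names of the services for which invocation
// code would be generated. With specFlag false every service passes; with
// specFlag true only services explicitly marked "DUBBO" pass.
func selectServices(services []service, specFlag bool) []string {
	var out []string
	for _, s := range services {
		if specFlag && s.Protocol != "DUBBO" {
			continue // not DUBBO, or no option set
		}
		out = append(out, s.Name)
	}
	return out
}

func main() {
	services := []service{
		{Name: "GreetingsService", Protocol: "DUBBO"},
		{Name: "TripleService", Protocol: "TRIPLE"},
		{Name: "PlainService"},
	}
	fmt.Println("flag off:", selectServices(services, false))
	fmt.Println("flag on: ", selectServices(services, true))
}
```

Leaving the flag off by default keeps existing .proto files working unchanged, since services without the option are still generated.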
Java2Proto is a tool that generates Protobuf IDL files from Java interface definitions. The project is still at an early stage, but it can already generate methods that take simple Java types as arguments.
Suppose we have a set of Java files that define an interface:
//GreetingsService.java
package org.apache.dubbo.tri.hessian2.api;

public interface GreetingsService {
    GreetResponse greet(GreetRequest req);
}

//GreetRequest.java
package org.apache.dubbo.tri.hessian2.api;

public class GreetRequest implements java.io.Serializable {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

//GreetResponse.java
package org.apache.dubbo.tri.hessian2.api;

public class GreetResponse implements java.io.Serializable {
    private String greeting;

    public String getGreeting() {
        return greeting;
    }

    public void setGreeting(String greeting) {
        this.greeting = greeting;
    }
}
We can run the following commands in a terminal:
go run github.com/dubbogo/dubbo-java2proto --file ./GreetingsService.java
go run github.com/dubbogo/dubbo-java2proto --file ./GreetRequest.java
go run github.com/dubbogo/dubbo-java2proto --file ./GreetResponse.java
This will generate the following files:
//GreetingsService.proto
// Code generated by dubbo-java2proto. DO NOT EDIT.
syntax = "proto3";
package api;
option go_package = "org/apache/dubbo/tri/hessian2/api;api";
import "unified_idl_extend/unified_idl_extend.proto";

service GreetingsService {
    option (unified_idl_extend.service_extend) = {
        interface_name: "org.apache.dubbo.tri.hessian2.api.GreetingsService";
    };
    rpc greet(GreetRequest) returns (GreetResponse) {
        option (unified_idl_extend.method_extend) = {
            method_name: "greet";
        };
    }
}

//GreetRequest.proto
// Code generated by dubbo-java2proto. DO NOT EDIT.
syntax = "proto3";
package api;
option go_package = "org/apache/dubbo/tri/hessian2/api;api";
import "unified_idl_extend/unified_idl_extend.proto";

message GreetRequest {
    string name = 1;
    option (unified_idl_extend.message_extend) = {
        java_class_name: "org.apache.dubbo.tri.hessian2.api.GreetRequest";
    };
}

//GreetResponse.proto
// Code generated by dubbo-java2proto. DO NOT EDIT.
syntax = "proto3";
package api;
option go_package = "org/apache/dubbo/tri/hessian2/api;api";
import "unified_idl_extend/unified_idl_extend.proto";

message GreetResponse {
    string greeting = 1;
    option (unified_idl_extend.message_extend) = {
        java_class_name: "org.apache.dubbo.tri.hessian2.api.GreetResponse";
    };
}
Then these .proto files can be used by a variety of protoc plugins to generate code in other languages.
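In the generated files above, the Java package org.apache.dubbo.tri.hessian2.api becomes the proto package api and the go_package value org/apache/dubbo/tri/hessian2/api;api. The sketch below reproduces that one observed mapping. The rule it encodes (last segment as the proto package, dots replaced by slashes for the import path) is inferred from this single example and may not match how dubbo-java2proto handles other cases; protoNames is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// protoNames is a hypothetical helper that derives the proto package and
// the go_package option value from a Java package name, following the
// pattern seen in the generated files above.
func protoNames(javaPackage string) (protoPackage, goPackage string) {
	short := javaPackage
	if i := strings.LastIndex(javaPackage, "."); i >= 0 {
		short = javaPackage[i+1:] // last segment, e.g. "api"
	}
	importPath := strings.ReplaceAll(javaPackage, ".", "/")
	return short, importPath + ";" + short
}

func main() {
	pkg, goPkg := protoNames("org.apache.dubbo.tri.hessian2.api")
	fmt.Printf("package %s;\n", pkg)
	fmt.Printf("option go_package = %q;\n", goPkg)
}
```

The full Java class name is not lost in this mapping: it is kept in the interface_name and java_class_name options, which is what lets the Go side address the Java types.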
The most important problem to solve is how to obtain the syntax tree of the Java code. We chose tree-sitter for the job. With go-tree-sitter we were able to parse Java source code into a syntax tree. In the parser package, we defined a JavaParser struct that parses the interface definition (and related class definitions) bit by bit.
type JavaParser struct {
	content     []byte
	ast         *sitter.Node
	PackageName string
	Interfaces  []*JavaInterface
	Classes     []*JavaClass
}
The NewParser function reads the source file, obtains the syntax tree with tree-sitter, and wraps this information in a JavaParser:
func NewParser(filepath string) (*JavaParser, error) {
	bytes, err := os.ReadFile(filepath)
	if err != nil {
		return nil, err
	}
	parser := sitter.NewParser()
	parser.SetLanguage(java.GetLanguage())
	tree, err := parser.ParseCtx(context.Background(), nil, bytes)
	if err != nil {
		return nil, err
	}
	jp := &JavaParser{
		content: bytes,
		ast:     tree.RootNode(),
	}
	return jp, nil
}
Then ParseFile is called to iterate over the child nodes of the AST root and parse each one according to its type. The parseXxx functions are called recursively and eventually save the data into the fields of the JavaParser:
func (jp *JavaParser) ParseFile() {
	for i := 0; i < int(jp.ast.ChildCount()); i++ {
		child := jp.ast.Child(i)
		typ := child.Type()
		switch typ {
		case PackageDecl:
			jp.parsePackage(child)
		case InterfaceDecl:
			jp.parseInterface(child)
		case ClassDecl:
			jp.parseClass(child)
		}
	}
}

func (jp *JavaParser) parseInterface(node *sitter.Node) {
	ji := new(JavaInterface)
	for i := 0; i < int(node.ChildCount()); i++ {
		child := node.Child(i)
		typ := child.Type()
		// check if the interface is public
		// don't generate code for non-public interface
		switch typ {
		case Modifiers:
			if child.Content(jp.content) != "public" {
				// return, not break: break would only leave the switch,
				// and the interface would still be parsed and recorded
				return
			}
		case Identifier:
			identifier := child.Content(jp.content)
			ji.Name = identifier
		case InterfaceBody:
			jp.parseInterfaceBody(child, ji)
			jp.Interfaces = append(jp.Interfaces, ji)
		}
	}
}
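The dispatch pattern used by ParseFile and parseInterface can be tried without tree-sitter. The sketch below walks a hand-built tree; the node type is a hypothetical stand-in for *sitter.Node, the node type strings are assumptions modeled on the constants above, and Text holds the bare name rather than the full source text a real node would cover.

```go
package main

import "fmt"

// node is a hypothetical stand-in for *sitter.Node: a type tag, the text
// it covers, and its children.
type node struct {
	Type     string
	Text     string
	Children []*node
}

// Node type names are assumptions modeled on the constants used above.
const (
	packageDecl   = "package_declaration"
	interfaceDecl = "interface_declaration"
	classDecl     = "class_declaration"
	modifiers     = "modifiers"
	identifier    = "identifier"
)

// summary collects what the walk finds, similar to the JavaParser fields.
type summary struct {
	Package    string
	Interfaces []string
	Classes    []string
}

// declName returns the identifier of a declaration and whether it is public.
func declName(decl *node) (name string, public bool) {
	for _, c := range decl.Children {
		switch c.Type {
		case modifiers:
			public = c.Text == "public"
		case identifier:
			name = c.Text
		}
	}
	return name, public
}

// summarize dispatches on the type of each top-level child and records
// only public declarations.
func summarize(root *node) summary {
	var s summary
	for _, c := range root.Children {
		switch c.Type {
		case packageDecl:
			s.Package = c.Text
		case interfaceDecl:
			if name, ok := declName(c); ok {
				s.Interfaces = append(s.Interfaces, name)
			}
		case classDecl:
			if name, ok := declName(c); ok {
				s.Classes = append(s.Classes, name)
			}
		}
	}
	return s
}

func main() {
	root := &node{Children: []*node{
		{Type: packageDecl, Text: "org.apache.dubbo.tri.hessian2.api"},
		{Type: interfaceDecl, Children: []*node{
			{Type: modifiers, Text: "public"},
			{Type: identifier, Text: "GreetingsService"},
		}},
		{Type: interfaceDecl, Children: []*node{
			{Type: identifier, Text: "PackagePrivate"},
		}},
	}}
	fmt.Printf("%+v\n", summarize(root))
}
```

Returning the visibility from a helper, instead of checking it inside the switch, keeps the decision to skip a declaration in one place.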
How long is this project? Is it a 3-month project or a 6-month one?