RBS is an easy target for code generation because of its simple syntax and the lack of dependencies between files. In addition, a large number of type definitions are still missing, so code generation for RBS is highly important.
Various attempts have been made to generate RBS, including generation from JSON files and static and dynamic analysis of Ruby code. However, each method has its advantages and disadvantages, and not all problems have been solved. Open issues include:
- Large number of undefined gems
- Determination of generics
- Dynamic method generation by eval-type methods
- Dynamic module include/extend/prepend using Module.new or included
- Various extension requests
I propose combining small classes through a plugin mechanism modeled on Rack middleware. I refer to this mechanism as RBSG (RBS Generator).
As with a Rack application, you write a small amount of code and run it.
# The loader receives env like the other examples, so it must accept one argument.
loader = -> (_env) {
  require 'foo'

  class Bar
    include Foo
  end
}

RBSG::Builder.new do
  use RBSG::Logger
  use RBSG::CreateFileByName,
      base_dir: "sig/out",
      header: "# !!! GENERATED CODE !!!"
  use RBSG::ObjectSpaceDiff
  use RBSG::IncludeExtendPrepend
  use RBSG::Result
  run loader
end
sig/out/foo.rbs
# !!! GENERATED CODE !!!
module Foo
end
sig/out/bar.rbs
# !!! GENERATED CODE !!!
class Bar
  include Foo
end
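module RBSG
  # Registers every class/module that appears while the loader runs.
  class ObjectSpaceDiff
    def initialize(loader, if: nil)
      @loader = loader
      # `if` is a reserved word, so the keyword argument is read via the binding.
      @if = binding.local_variable_get(:if)
    end

    def call(env)
      # Snapshot all modules before and after loading, then take the difference.
      modules_before = ObjectSpace.each_object(Module).to_a
      result = @loader.call(env)
      modules_after = ObjectSpace.each_object(Module).to_a
      (modules_after - modules_before).each do |mod|
        next unless @if.nil? || @if.call(mod)
        result[mod.to_s] # accessing the key registers it via the default value
      end
      result
    end
  end
end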
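To see the class above in action, here is a minimal, self-contained sketch. `SketchResult` is a hypothetical stand-in for `RBSG::Result`, whose implementation is not shown here; it is assumed to ignore the loader's return value and to hand back a Hash whose default block registers an empty body list for each name. `SketchObjectSpaceDiff` repeats the logic shown above so that the example runs on its own, and the `if:` filter used here (skip modules without a name) is only an illustration.

```ruby
# Hypothetical stand-in for RBSG::Result (assumption): runs the loader,
# ignores its return value, and returns a Hash with a default block.
class SketchResult
  def initialize(loader)
    @loader = loader
  end

  def call(env)
    @loader.call(env)
    Hash.new { |hash, name| hash[name] = [] }
  end
end

# Same logic as ObjectSpaceDiff above, copied so this sketch is runnable alone.
class SketchObjectSpaceDiff
  def initialize(loader, if: nil)
    @loader = loader
    @if = binding.local_variable_get(:if)
  end

  def call(env)
    modules_before = ObjectSpace.each_object(Module).to_a
    result = @loader.call(env)
    modules_after = ObjectSpace.each_object(Module).to_a
    (modules_after - modules_before).each do |mod|
      next unless @if.nil? || @if.call(mod)
      result[mod.to_s] # the default block stores an empty entry
    end
    result
  end
end

# The loader defines one module and one class while it runs.
loader = ->(_env) {
  module SketchFoo; end
  class SketchBar
    include SketchFoo
  end
}

# Skip singleton classes and other modules that have no name.
named_only = ->(mod) { !mod.name.nil? }

SKETCH_RESULT = SketchObjectSpaceDiff.new(SketchResult.new(loader), if: named_only).call({})
puts SKETCH_RESULT.keys.sort.inspect
```

Only the names are registered at this stage; in the stacks shown in this document, the bodies are filled in by middleware placed further out, such as `RBSG::IncludeExtendPrepend`.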
- Middleware can be stacked by function, and can be easily added or removed.
- Each middleware is a simple class, and that simplicity lets it be reused in a variety of situations.
- Because loading the code is an independent phase, code can be run both before and after loading. It is also easy to wrap the loading in a block.
- The Rack architecture is widely accepted by Rubyists, so the learning cost is expected to be low.
- Unnecessary output can be controlled by middleware.
- File output can also be written as middleware, so any output format can be supported.
- The developer must write the loader and the middleware stack, as for an application.
- Middleware using TracePoint will not work as intended if the load timing is off.
- High extensibility and flexibility come with a trade-off: they also raise the cost of understanding.
sequenceDiagram
Middleware1 ->> Middleware2: .call
Middleware2 ->> Result: .call
Result ->>+ loader: .call
loader ->>- Result: no result
Result ->> Middleware2: result
Middleware2 ->> Middleware1: result
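The call flow in the diagram can be reproduced with a few lines of Ruby. This is only a sketch: `SketchBuilder`, `Trace`, and `TracedResult` are hypothetical names, and the builder follows the general Rack-style pattern of wrapping middleware from the inside out; it is not taken from RBSG's source.

```ruby
CALL_LOG = []

# Hypothetical builder: `use` records a middleware, `run` records the loader,
# and `call` wraps the loader from the innermost middleware outwards.
class SketchBuilder
  def initialize(&block)
    @middlewares = []
    @loader = nil
    instance_eval(&block) if block
  end

  def use(klass, *args, **kwargs)
    @middlewares << ->(inner) { klass.new(inner, *args, **kwargs) }
  end

  def run(loader)
    @loader = loader
  end

  def call(env)
    app = @middlewares.reverse.inject(@loader) { |inner, wrap| wrap.call(inner) }
    app.call(env)
  end
end

# Middleware that records when it is entered and when the result comes back.
class Trace
  def initialize(loader, label:)
    @loader = loader
    @label = label
  end

  def call(env)
    CALL_LOG << "#{@label} -> call"
    result = @loader.call(env)
    CALL_LOG << "#{@label} <- result"
    result
  end
end

# Innermost middleware (assumption): ignores what the loader returns
# and creates the result Hash that travels back up the stack.
class TracedResult
  def initialize(loader)
    @loader = loader
  end

  def call(env)
    CALL_LOG << "Result -> call"
    @loader.call(env)
    Hash.new { |hash, name| hash[name] = [] }
  end
end

loader = ->(_env) { CALL_LOG << "loader"; nil }

stack = SketchBuilder.new do
  use Trace, label: "Middleware1"
  use Trace, label: "Middleware2"
  use TracedResult
  run loader
end

stack.call({})
puts CALL_LOG
```

The first `use` ends up outermost, so code placed before `@loader.call` runs on the way in and code placed after it runs on the way out, once the result exists.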
- Class/module definition using ObjectSpace
- Static and dynamic addition of include/extend/prepend modules
- Output data filtering
- Debugging display of output
- File output per class/module
- Constant definition and type guessing
- Logger configuration
- Rails extensions to support class_attribute and mattr_accessor
- Automatic support for method delegation
The proposed method is expected to be applied to a variety of use cases because of its simple and powerful mechanism.
When generating definitions for ActiveRecord
loader = -> (_env) {
  # code loading
  require 'active_record'
  ActiveRecord.eager_load!
}

RBSG::Builder.new do
  use RBSG::Logger
  use RBSG::CreateFileByName, # output
      base_dir: "sig/out",
      header: "# !!! GENERATED CODE !!!"
  use RBSG::Clean, if: -> (name, bodies) {
    if RBSG.rbs_defined?(name, library: "stdlib")
      bodies.empty? # skip empty definitions
    else
      !(name.start_with?("ActiveRecord")) # skip out-of-scope names
    end
  }
  use RBSG::ObjectSpaceDiff # class definition
  use RBSG::IncludeExtendPrepend # imported modules
  use RBSG::Rails::ClassAttribute # extension for Rails
  use RBSG::Result
  run loader
end
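The `if:` option passed to `RBSG::Clean` above reads as "drop the entry when the block returns true". A minimal sketch of such a filter follows. It assumes that the result is a Hash from a class/module name to an Array of body strings, which the source does not spell out; `SketchClean`, the stub inner stack, and the sample entries are hypothetical.

```ruby
# Hypothetical filter middleware: removes every entry for which the
# `if:` block returns true. With no block, nothing is removed.
class SketchClean
  def initialize(loader, if: nil)
    @loader = loader
    @if = binding.local_variable_get(:if)
  end

  def call(env)
    result = @loader.call(env)
    result.delete_if { |name, bodies| @if.call(name, bodies) } unless @if.nil?
    result
  end
end

# Stub for the inner part of the stack; the entries are made-up sample data.
inner = ->(_env) {
  {
    "ActiveRecord::Base" => ["include ActiveRecord::Core"],
    "ActiveSupport::Concern" => [],
    "String" => []
  }
}

# Keep only names in the ActiveRecord namespace.
out_of_scope = ->(name, _bodies) { !name.start_with?("ActiveRecord") }

CLEANED = SketchClean.new(inner, if: out_of_scope).call({})
puts CLEANED.keys.inspect
```

Because the filter runs after the inner stack returns, it sees the fully populated bodies, which is what makes a rule such as "skip empty definitions" possible.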
Also, methods that Rails adds through its own extensions can be supported by writing dedicated middleware for them. Furthermore, by switching the branch of the code being loaded, definitions can easily be output for each version.
env = {}
loader = -> (_env) {
  Rails.application.eager_load!
}

RBSG::Builder.new do
  use RBSG::Logger
  use RBSG::CreateFileByName,
      base_dir: Rails.root.join("sig/out"),
      header: "# !!! GENERATED CODE !!!"
  use RBSG::Clean, if: -> (name, _bodies) {
    RBSG.rbs_defined?(name, collection: true) # skip existing definitions
  }
  use RBSG::ObjectSpaceDiff
  use RBSG::IncludeExtendPrepend
  use RBSG::Rails::ClassAttribute
  use CustomGenerator::Rolify # user customized
  use RBSG::Result
  run loader
end.call(env)
As in the gem_rbs_collection example, the same middleware can be used for application code by changing only the loader. In addition, users can add and try out their own extensions, and it is easy to turn them into gems once they have proven useful.
The loader only needs to load the code and does not need to worry about the return value.
Create it as a class that has a `#call` method, like Rack middleware.
Example of simple middleware
class SampleMiddleware
  def initialize(loader)
    @loader = loader
  end

  # Pass the call through and return the inner result unchanged.
  def call(env)
    @loader.call(env)
  end
end
The interface is limited, but what a middleware does is not: it can generate code, filter output, change the output format, display debug information, read documentation, configure the Logger, and so on.
The return value is basically a Hash object with the class/module name as the key and the content of each class/module as the value. RBS is constructed by adding output code to this result. Because multiple classes/modules can be output, it can be used for libraries that extend core classes, such as active_support, and of course for Rails applications.
Output is done using the result. Output can also be middleware, so a variety of output requests can be handled, for example "write to a file for each class/module name" or "write everything to standard output".
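As an illustration of output written as middleware, here is a sketch of the "write everything to standard output" case. It assumes that each value in the result is an Array of RBS member lines and that the constant named by each key is already loaded; `SketchPrintAll`, `DemoFoo`, and `DemoBar` are hypothetical names, and the inner stack is replaced by a stub lambda.

```ruby
require "stringio"

module DemoFoo; end

class DemoBar
  include DemoFoo
end

# Hypothetical output middleware: prints every entry of the result as an
# RBS declaration and passes the result through unchanged.
class SketchPrintAll
  def initialize(loader, io: $stdout)
    @loader = loader
    @io = io
  end

  def call(env)
    result = @loader.call(env)
    result.each do |name, bodies|
      # Look up the loaded constant to choose between `class` and `module`.
      keyword = Object.const_get(name).is_a?(Class) ? "class" : "module"
      @io.puts "#{keyword} #{name}"
      bodies.each { |line| @io.puts "  #{line}" }
      @io.puts "end"
    end
    result
  end
end

# Stub for the inner stack (assumption about the result format).
inner = ->(_env) { { "DemoFoo" => [], "DemoBar" => ["include DemoFoo"] } }

PRINTED = StringIO.new
SketchPrintAll.new(inner, io: PRINTED).call({})
puts PRINTED.string
```

Swapping the `io:` argument, or replacing this class with one that opens a file per name, changes the output destination without touching the rest of the stack.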
@pocke
Thank you for your thorough review.
I often refer to your code when implementing.
The phase of writing RBSs
There is indeed a problem with the order.
In Rack, servers such as puma are responsible for output, so Rack was not a helpful reference for this problem.
Currently, RBSG does not provide any special support for output, in order to keep the architecture simple.
I also think that the Recommended configuration discussed next may be able to handle part of this problem.
Recommended configuration
I feel this is a very good idea, and a necessary one.
As the trade-off for its high flexibility, RBSG needs to be customized for each situation.
I initially thought that providing an `example` directory to copy and paste from would be enough, but as you say, it looks like a bit more support is needed.
The loader is enough?
You are right.
RBSG focuses mainly on functionality, so it will not be good at handling files.
The interface I have in mind to deal with this is the following.
Loading existing RBSs
I also felt the need for this functionality and, after much effort, was able to implement it.
The existing RBS is parsed and loaded into a temporary RBS::Environment, then converted back to a string with RBS::Writer and added to the existing result.
Duplicates are eliminated just before output.
The stored string is parsed as RBS, `#uniq!` is called on the member declarations using their class and name as the key, and the result is stringified again with RBS::Writer for output.
For these, it will be quickest to look at the implementation.
How to filter output classes
I am impressed that you noticed that problem.
I was initially thinking of a filter built around `Object.const_source_location`, but the needs turn out to be very complex.
1. Unnamed modules added by `Module.new` etc., such as `User::AttributeMethods::GeneratedAttributeMethods`.
2. `User::GeneratedAssociationMethods`, etc.
3. Cases such as `User::DeleteJob`: I want to keep `app/models`, but eliminate `app/jobs`.
It looks like I only need to look at the path of `User` to deal with 1., but that does not eliminate the definition in 3.
If I try to deal with 3. as well, I end up eliminating 1. and 2. as well.
This issue is unresolved, but it seems to me that if we had middleware that can filter either way, we could then handle the rest by providing users with an example implementation.
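The trade-off described above can be reproduced in a small experiment. The sketch below is not RBSG code: the helper `defined_under?`, the constants `SketchUser`, `SketchUser::DeleteJob`, and `SketchUser::GeneratedMethods`, and the temporary directory layout are all hypothetical. It compares two filtering strategies built on `Object.const_source_location`: checking the constant's own location, and checking only the location of its top-level namespace.

```ruby
require "tmpdir"
require "fileutils"

# True when the constant `name` was defined in a Ruby file under `dir`.
def defined_under?(name, dir)
  path, _line = Object.const_source_location(name)
  return false if path.nil? || !File.exist?(path)

  File.realpath(path).start_with?(File.join(File.realpath(dir), ""))
end

FILTER_DEMO = Dir.mktmpdir do |root|
  models = File.join(root, "app", "models")
  jobs = File.join(root, "app", "jobs")
  FileUtils.mkdir_p([models, jobs])
  File.write(File.join(models, "sketch_user.rb"), "class SketchUser; end\n")
  File.write(File.join(jobs, "delete_job.rb"), "class SketchUser::DeleteJob; end\n")
  load File.join(models, "sketch_user.rb")
  load File.join(jobs, "delete_job.rb")

  # A module created at run time and attached to the model (like case 1).
  SketchUser.const_set(:GeneratedMethods, Module.new)

  names = %w[SketchUser SketchUser::DeleteJob SketchUser::GeneratedMethods]
  names.to_h do |name|
    top_level = name.split("::").first
    [name, {
      own: defined_under?(name, models),       # strategy A: the constant's own file
      root: defined_under?(top_level, models)  # strategy B: the file of its namespace
    }]
  end
end

FILTER_DEMO.each { |name, verdict| puts "#{name}: #{verdict}" }
```

In this sketch, strategy A keeps only `SketchUser` and drops the generated module together with the job, while strategy B keeps all three, including the job defined under `app/jobs`. Neither rule alone matches the desired outcome, which is why a filter middleware that accepts a user-supplied rule seems the more practical direction.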
Questions
I finally removed the `env` argument from the spec.
At first I was quoting from the Rack spec.
I thought it could be used if there were configuration items shared between middleware, but the opportunity never arose.
I read the Rack specification and interpreted `env` as something for handling HTTP requests.
RBSG has no input concept like an HTTP request.
Thus I decided to remove it.
I decided to ignore the return value of loader.
The loader would be written by the user, and I figured the experience would be better if the user did not have to know the return value specification.
The implementation of `use` is just as follows: