I'm not very familiar with LSP/LSIF so far, but I gave the docs a quick read, and here's a summary of LSP/LSIF vs Kythe:
- Documentation: LSP/LSIF protocol seems well documented. Kythe schema is a bit more dense, protocol needs digging around in .proto files (which are OK though).
- Generally, Kythe pipeline needs more implicit knowledge to use - some online posts might address these though.
- Windows: Kythe serving tools run on Linux, though some Docker magic might be available.
- In Kythe, the storage format and the serving protocol are more separated, while LSIF tries to maintain serialized LSP responses.
- In fact, Kythe has no standard storage format (the reference implementation uses some columnar protobufs AFAIK)
- Kythe uses protobufs for storage/protocol, LSP uses a JSON-y/IDL-y format (though Kythe exposes a JSON-y API too).
- By and large, not much practical difference here.
- Minor nit: protobufs eventually sneak into Kythe's JSON-only API too, as the structured type hovers are serialized protobuf blobs.
- This implies that serving Kythe data to a browser calls for an intermediate transformation server, as browser client code usually shouldn't deal with protobufs.
- LSIF operates more on the text editor level (code spans), Kythe more on an AST-like / semantic level.
- The Kythe schema at https://kythe.io/docs/schema/ seems quite impenetrable, but after a long hard stare it starts to make sense.
- Its main additions are
- Nitty-gritty details that can cover C++ (template specializations, macro expansions). Might be handy for C++.
- Encoding type-level information (parametric abstraction, etc).
- There are some deficiencies here: the schema strives to be general, so obviously Haskell's type system would be too complex for it.
- One could force a dumbed-down type into it
- Question is, what would be the benefit? For code navigation the type level is rarely used (ok, sometimes).
- Monomorphic typed navigation could work out of the box (= tell me other functions that have this monomorphic type)
- I'm not so sure about polymorphic, as last time I pondered it, abstractions captured the specific location of type vars in source
- Can tell more if interested, but TLDR: for efficient/useful (à la Hoogle) type searches, an external tool is likely a better place.
- Extra features: code outline
- LSP seems able to provide Code Outline using the textDocument/documentSymbol request.
- Kythe might be able to, using the experimental explore service. I haven't used this yet.
- Cross-project references:
- LSIF hints that projects could create dumps that are easy to pull in. Quote:
The generated moniker [ a handle in string format ] must be position independent and stable so that it can be used to identify the symbol in other projects or documents. It should be sufficiently unique so as to avoid matching other monikers in other projects unless they actually refer to the same symbol. Its content is up to the programming language and is not part of the LSIF specification.
- At Google, such an approach ran into problems due to codebase size. Even outside Google, I could imagine a codebase having two versions of lib X checked out (different tools depending on different versions: bad, but it happens). You get ambiguity instantly.
- Kythe thus uses build-specific ids for xrefing to avoid ambiguity. This usually also implies that you are expected to index all your deps using the same env.
- Though some id (Kythe calls them vname) hackery might be available if really needed.
- Cross-language references:
- Likely suffers from the same problem as mentioned under cross-project.
- Kythe's solution, AFAIR, was to specially prepare glue code generators (protobuf language bindings, FFI) to output extra info when run in Kythe-mode, which can be used to correlate the foreign use.
- This is again more complication, but again for the sake of reducing ambiguity.
- See consideration on how to add proto-lens - protobuf xref support to haskell-indexer in https://github.com/google/haskell-indexer/issues/15.
- Generally, the Kythe team was very attentive to production use and usage at scale. Not saying the MS folks aren't, just to potentially justify some of the complexities present.
TLDR: for me, LSP/LSIF seems quick to get some results with, but Kythe seems more prepared for the ambiguity issues that arise at scale (note the disclaimer at top that I'm biased as well).
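To make the cross-project ambiguity point a bit more concrete, here's a rough sketch (not actual LSIF or Kythe code). The moniker string format, the repo layout and the `index_by` helper are all made up for illustration. The `VName` tuple only borrows the field names Kythe uses for its vnames (corpus, root, path, language, signature).

```python
from collections import defaultdict
from typing import NamedTuple


class VName(NamedTuple):
    """Hypothetical stand-in for a Kythe-style build-specific node id."""
    corpus: str
    root: str
    path: str
    language: str
    signature: str


# Two checked-out versions of the same library define the same symbol.
# Every string and path below is invented for this example.
definitions = [
    {
        "moniker": "libx:Foo.bar",
        "vname": VName("myrepo", "third_party/libx_v1", "src/Foo.hs", "haskell", "Foo.bar"),
    },
    {
        "moniker": "libx:Foo.bar",
        "vname": VName("myrepo", "third_party/libx_v2", "src/Foo.hs", "haskell", "Foo.bar"),
    },
]


def index_by(key, defs):
    """Group definitions by the chosen identifier (hypothetical helper)."""
    table = defaultdict(list)
    for d in defs:
        table[d[key]].append(d)
    return table


by_moniker = index_by("moniker", definitions)
by_vname = index_by("vname", definitions)

# Monikers that resolve to more than one definition are ambiguous.
ambiguous = [m for m, ds in by_moniker.items() if len(ds) > 1]
# With build-specific ids, each key maps to exactly one definition.
unique = all(len(ds) == 1 for ds in by_vname.values())
print(ambiguous, unique)
```

The position-independent string can't tell the two checkouts apart, while the id that bakes in where the build found the file can. The price is that every dependency has to be indexed under the same build setup, otherwise the ids won't line up.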
Refs:
- Kythe protos: https://github.com/kythe/kythe/tree/master/kythe/proto (see xref, explore, graph for example).
- Kythe schema: https://kythe.io/docs/schema/
- LSIF issue/proposal: https://github.com/Microsoft/language-server-protocol/issues/623, https://github.com/Microsoft/language-server-protocol/blob/master/indexFormat/specification.md
robinp commented Jan 22, 2019
