@jonathanrobie
Last active December 9, 2016 17:46
Let's make a list of structures that we want to represent, and compare various approaches to:
- Create them
- View them with various tools
- Query a treebank to find or use them
Use headers to create separate categories.
# Determiners
- Articular infinitives
- Articular participles
# Participles
- Supplementary
- Circumstantial
  - Connected
  - Genitive Absolute
- Periphrastic
- Attributive
  - Substantive
  - Adjectival
# Speaking and writing
- Identifying what was said for a verb of communication
# Word Order (VO, OV, VSO, SVO, etc.)
- see http://www.hf.uio.no/ifikk/english/research/projects/proiel/Activities/proiel/publications/haug-nijmegen.pdf
@rkjtan

rkjtan commented Dec 8, 2016

GBI's approach has been to use minimal terminology.
At the phrase level, when an np & an adjp combine to form a larger np, we are saying that the np is doing one of the things np's do (usually name an entity that can then be related to the verb as an argument or adjunct [directly or specified by a preposition]) & the adjp is doing one of the things adjp's do (usually adjectivally attributing some quality to an np) & the result is a larger np. The primary attributes of the larger np belong to the head, which happens to be the head of the smaller np--that is why the resultant combination is not a larger adjp.
When an np & another np that match in case, & usually gender & number too, combine to form a larger np, the second np further defines the first np, but the head of the larger np is the head of the first np. In our Rule attribute, we do indicate that this is known as apposition by using the Rule name Np-Appos (basically further defining/identifying the head np--just a subtype of naming an entity in my book).
When two np's combine & the second np is in the genitive case, the larger np has the head of the first np as head. In our Rule attribute, we indicate that this is genitive modification by using the Rule name NPofNP (usually simply relating one np to the head np, restricting its scope).
From one perspective, in effect the labels np & adjp can function as both a class & a role. Likewise, the genitive case can indicate both form & function.
So, when it comes to the various types of participles:
A participle is always classed as a verb.
Because it is a verb, it always does what verbs do (e.g., predicate an activity or state, & can have arguments & adjuncts) as head of its own clause.
Participial clauses can also do what np's, adjp's, & advp's do as an embedded clause functioning as an np, adjp or advp.
So, circumstantial participles have Role = ADV.
Substantival participles are first promoted to an np node (to make clear they are substantival) & then can do what np's do (S, O, apposition or genitive modification to another np to form a larger np, etc.).
Adjectival participles are first promoted to an adjp node (to make clear they are adjectival) & then they can do what adjp's do (adjectivally modify an np).
Supplementary participles are treated like arguments of the finite verb & so, as clauses, they have Role = O (they are not turned into an np first, which distinguishes them from substantival participles that function as O).
Genitive absolute participles are just treated as separate clauses from the main clause--the participle would be genitive case, be a verb, then head of its own clause--it is then connected to the main clause with a larger clause node (basically forming part of a clause complex), but there is no Role, indicating its "absolute" nature in relation to the main clause.
Periphrastics are just treated as two verbs (one of them typically a "to be" verb) combining to form one verb, with the participle as head & the Role = V for the combined node.
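
To make the summary above easier to compare against other representations, here is a minimal sketch of it as a lookup table in Python. The key names, the `Encoding` type, and the `describe` helper are hypothetical labels for this illustration only; they are not GBI attribute values, and "no fixed Role" covers both the substantival case (Role depends on context) and the genitive absolute case (no Role at all).

```python
from typing import NamedTuple, Optional


class Encoding(NamedTuple):
    promoted_to: Optional[str]  # phrase node placed above the participial clause, if any
    role: Optional[str]         # fixed Role value, or None when there is no fixed Role


# Hypothetical keys; the values restate the description in the comment above.
PARTICIPLE_ENCODING = {
    "circumstantial": Encoding(promoted_to=None, role="ADV"),
    "substantival": Encoding(promoted_to="np", role=None),    # then S, O, apposition, etc.
    "adjectival": Encoding(promoted_to="adjp", role=None),    # then modifies an np
    "supplementary": Encoding(promoted_to=None, role="O"),
    "genitive absolute": Encoding(promoted_to=None, role=None),  # deliberately no Role
    "periphrastic": Encoding(promoted_to=None, role="V"),     # Role on the combined verb node
}


def describe(kind: str) -> str:
    """Return a one-line summary of how a participle type is encoded."""
    enc = PARTICIPLE_ENCODING[kind]
    promotion = f"promoted to {enc.promoted_to}" if enc.promoted_to else "not promoted"
    role = f"Role = {enc.role}" if enc.role else "no fixed Role"
    return f"{kind}: {promotion}, {role}"


for kind in PARTICIPLE_ENCODING:
    print(describe(kind))
```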

In light of the previous discussion, my preference, to avoid having a special attribute for substantival or adjectival participles, is to preserve the np or adjp node that is above the nodes that form the participial clause. The class would be np for substantival participial clauses & adjp for adjectival participial clauses. Adjectival participial clauses would need no Role, since "adjectival" is basically sufficient to describe both class & role (different users can always add their preferred Role label automatically for adjp, if desired; I'm just saying that it is sometimes redundant because the name for a class sometimes also describes function fairly well). Substantival participial clauses would have Role = S, O, or IO, as appropriate, by moving the S, O, or IO node info into the Role attribute & removing that extra node layer currently above it in the GBI trees. If the substantival participial clause is functioning at a phrase level (e.g., it is the object of a preposition, or it is an np in apposition to another np, or a genitive np modifying another np), then we either need no Role or we can automatically translate Rule = Np-Appos & Rule = NPofNP on the overall node into something like Role = Definer & Role = Qualifier, respectively, for the 2nd of the two nodes combined under that Rule.
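
A rough sketch of the two automatic steps just described, written in Python with the standard-library `xml.etree.ElementTree`. The element name `Node`, the attribute name `Cat`, and the sample tree are assumptions made for illustration; only the `Rule` and `Role` attribute names and the values Np-Appos, NPofNP, Definer, & Qualifier come from the discussion above. The real GBI trees may differ in detail.

```python
import xml.etree.ElementTree as ET

ROLE_CATS = {"S", "O", "IO"}  # wrapper layers to fold into a Role attribute
RULE_TO_ROLE = {"Np-Appos": "Definer", "NPofNP": "Qualifier"}


def fold_role_wrappers(node):
    """Replace each single-child S/O/IO wrapper with its child, tagged with Role."""
    for i, child in enumerate(list(node)):
        if child.get("Cat") in ROLE_CATS and len(child) == 1:
            inner = child[0]
            inner.set("Role", child.get("Cat"))
            node[i] = inner  # one-for-one replacement keeps sibling order
            child = inner
        fold_role_wrappers(child)


def label_second_members(root):
    """Give the 2nd of two nodes joined by Np-Appos or NPofNP a Role."""
    for node in root.iter("Node"):
        role = RULE_TO_ROLE.get(node.get("Rule"))
        if role is not None and len(node) == 2:
            node[1].set("Role", role)


# Hypothetical input: a clause with an S wrapper over a substantival
# participial clause, and an O wrapper over an np built by NPofNP.
SAMPLE = (
    '<Node Cat="CL">'
    '<Node Cat="S"><Node Cat="np"><Node Cat="CL"/></Node></Node>'
    '<Node Cat="O"><Node Cat="np" Rule="NPofNP">'
    '<Node Cat="np"/><Node Cat="np"/></Node></Node>'
    '</Node>'
)

root = ET.fromstring(SAMPLE)
fold_role_wrappers(root)
label_second_members(root)
print(ET.tostring(root, encoding="unicode"))
```

The np node above the participial clause is kept, so the substantival reading stays visible in the class while the function moves into Role.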

P.S. I definitely agree with Jonathan's comment below. My comment above serves to clarify some aspects of the GBI trees that may not be transparent to all & also to indicate my initial preferences (which may change as we examine various representations).

@jonathanrobie
Author

jonathanrobie commented Dec 8, 2016

It's not hard to transform GBI into various formats and try them out. I think we should take a little time to do that.

The goal of use cases is to give us a way to identify what we mean by "good" and measure how good various representations are. I'd like to avoid coming to conclusions at this point. I'd rather start by:

  • Filling out the list of use cases until we think we have a representative list of things people might want to do
  • Showing how candidate representations would be presented and how the underlying structure would be explained to students
  • Writing queries against candidate representations
  • Discussing what it would take to create a candidate structure with a treebank editor for a new text like the LXX

I think we should do that for each representation that at least one of us thinks is promising and compare the results. We are going to be building on this format, so taking the time to get it right is important. Of course, we can have one "working format" that we use in the meantime, and I would probably update Lowfat with that format. In the long run, I hope the format we agree on will replace Lowfat, since there is no reason to maintain them both.
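
As one illustration of the "writing queries" step, here is a minimal Python sketch that looks for circumstantial participles in a candidate representation. Everything about the markup is an assumption for the sake of the example: the element name `Node`, the attributes `Cat`, `Role`, and `Mood`, and the sample tree are hypothetical, not the actual Lowfat or GBI format. The same use case would be rewritten for each candidate so the queries can be compared side by side.

```python
import xml.etree.ElementTree as ET

# Hypothetical candidate representation: clause nodes carry Role directly,
# and verb nodes carry a Mood attribute.
SAMPLE = """
<Node Cat="CL">
  <Node Cat="CL" Role="ADV">
    <Node Cat="verb" Role="V" Mood="participle"/>
  </Node>
  <Node Cat="advp" Role="ADV"/>
  <Node Cat="verb" Role="V" Mood="indicative"/>
</Node>
"""


def circumstantial_participles(root):
    """Return clauses with Role = ADV whose own verb is a participle."""
    hits = []
    for node in root.iter("Node"):
        if node.get("Cat") != "CL" or node.get("Role") != "ADV":
            continue
        # Look only at the clause's direct children for its verb.
        for child in node:
            if child.get("Role") == "V" and child.get("Mood") == "participle":
                hits.append(node)
                break
    return hits


root = ET.fromstring(SAMPLE)
hits = circumstantial_participles(root)
print(len(hits))
```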
