# Custom scalars and enums
scalar Date
scalar Language
scalar URI

enum MIMEType {
  TEXT_PLAIN
  TEXT_HTML
}

enum FeatureType {
  WORD
  FULL_FORM
  HDWD
  PART
  NUMBER
  CASE
  GRMCASE
  DECLENSION
  GENDER
  TYPE
  CLASS
  GRMCLASS
  CONJUGATION
  COMPARISON
  TENSE
  VOICE
  MOOD
  PERSON
  FREQUENCY
  MEANING
  SOURCE
  FOOTNOTE
  DIALECT
  NOTE
  PRONUNCIATION
  AGE
  AREA
  GEO
  KIND
  DERIVTYPE
  STEMTYPE
  MORPH
  VAR
  RADICAL
  KAYLO
  STATE
}
# Output types
type FeatureValue {
  value: String!
}

type Feature {
  id: ID!
  schema: String!
  type: FeatureType!
  value: [FeatureValue!]!
}

type User {
  id: ID!
  nickname: String
}

type ResourceProvider {
  uri: URI!
  description: String!
  right: String!
}

type Comment {
  id: ID!
  text: String!
  language: Language!
  dateTime: Date!
  author: User!
  replies: [Comment!]
}

type Assertion {
  id: ID!
  confidence: Int!
  dateTime: Date!
  author: User!
}

type Negation {
  id: ID!
  confidence: Int!
  dateTime: Date!
  author: User!
}

type Definition {
  id: ID!
  text: String!
  language: Language!
  format: MIMEType!
  lemma: Lemma! # The lemma that is defined by the definition text
  comments: [Comment!]
}

type DefinitionConnection {
  id: ID!
  definitions: [Definition!]
  assertions: [Assertion!]
  negations: [Negation!]
  comments: [Comment!]
}

type DefinitionSet {
  id: ID!
  lemmaWord: String!
  language: Language!
  shortDefs: [DefinitionConnection!]
  fullDefs: [DefinitionConnection!]
}

type Inflection {
  id: ID!
  language: Language!
  stem: String
  prefix: String
  suffix: String
  features: [String!]
  example: String
  comments: [Comment!]
}

type Lemma {
  id: ID!
  word: String!
  language: Language!
  principalParts: [String!]
  variants: [Lemma!] # Alternative versions of the lemma
  partOfSpeech: Feature!
  features: [Feature!] # Part of speech is not included in the features list
  comments: [Comment!]
}

type Lexeme {
  id: ID!
  lemma: Lemma!
  inflections: [Inflection!]
  meaning: DefinitionSet! # If there are no definitions, the DefinitionSet will be empty
  comments: [Comment!]
}

type Word {
  id: ID!
  targetWord: String!
  lexemes: [Lexeme!]
}
# Input types
input AssertionInput {
  confidence: Int!
  dateTime: Date!
  authorID: ID!
}

input NegationInput {
  confidence: Int!
  dateTime: Date!
  authorID: ID!
}

input CommentInput {
  text: String!
  language: Language!
  dateTime: Date!
  authorID: ID!
}

input CommentReplyInput {
  commentID: ID!
  comment: CommentInput
}

input LexemeCommentInput {
  lexemeID: ID!
  comment: CommentInput
}

input DefinitionCommentInput {
  definitionID: ID!
  comment: CommentInput
}

input DefinitionConnectionCommentInput {
  definitionConnectionID: ID!
  comment: CommentInput
}

input DefinitionAssertionInput {
  definitionConnectionID: ID!
  assertion: AssertionInput
}

input DefinitionNegationInput {
  definitionConnectionID: ID!
  negation: NegationInput
}
# Mutations
type Mutation {
  # Creates a new comment and attaches it to the specified lexeme
  commentOnLexeme(input: LexemeCommentInput): Comment
  # Creates a new assertion and attaches it to the specified definitionConnection
  assertDefinitionConnection(input: DefinitionAssertionInput): Assertion
  # Creates a new negation and attaches it to the specified definitionConnection
  negateDefinitionConnection(input: DefinitionNegationInput): Negation
  # Creates a new comment and attaches it to the specified definitionConnection
  commentOnDefinitionConnection(input: DefinitionConnectionCommentInput): Comment
  # Creates a new comment and attaches it to the specified definition
  commentOnDefinition(input: DefinitionCommentInput): Comment
  # Creates a new reply to the existing comment
  replyToComment(input: CommentReplyInput): Comment
}
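As a rough illustration of how a client might call one of the mutations above, here is a minimal sketch of a `commentOnLexeme` request. The ID values, the language code, and the string formats used for the custom `Language` and `Date` scalars are assumptions; the schema does not say how those scalars are serialized.

```graphql
# Hypothetical client operation; all literal values are placeholders.
mutation {
  commentOnLexeme(
    input: {
      lexemeID: "lexeme-1"
      comment: {
        text: "This lexeme looks correct for the given context."
        language: "eng"                   # assumed serialization of the Language scalar
        dateTime: "2020-01-01T12:00:00Z"  # assumed serialization of the Date scalar
        authorID: "user-1"
      }
    }
  ) {
    # Fields selected from the returned Comment
    id
    text
    dateTime
    author {
      id
      nickname
    }
  }
}
```

Because `commentOnLexeme` returns a nullable `Comment`, a client would presumably need to handle a `null` result as well as a populated one.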
for ResourceProvider, I'm not sure there is any difference between ID and URI. The URI is the unique identifier of the provider. In addition, I think the ResourceProvider needs to be part of the graph of the objects it provides. E.g. so a Definition would have a ResourceProvider, as would Inflection, etc. etc.
Let's leave the `uri` field as an ID (we can rename it to `id` if that's what the DB/GraphQL implementation requires). Maybe we should also add an optional `rights` field (though I'm not sure about the rights translations and where they would come from)?
for Feature, I'm not sure about SortOrder -- I don't think it should be required. I think I would also like to have a property on Feature which we can use to specify the ontology or schema the feature belongs to. The Feature names/values that we use now adhere to the AlpheiosLexicon schema, but we are going to need to be able to map to and/or support other ontologies (such as the Universal Dependencies tagset) so we should be explicit about that.
Agreed that sort order might not be needed. We can apply it later (maybe even dynamically) by matching the feature name with the sort order.
What do you think might be a good name for the field specifying the schema? Would we adhere to any standards here, such as `schema.org`?
For Word, we need Language, and also the ability to optionally include context (prefix, suffix, source).
Agree, will add those fields.
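One possible shape for these additions is sketched below. The `WordContext` type name and the `String` type chosen for `source` are assumptions made for illustration; the thread only names the pieces of context (prefix, suffix, source) and says they should be optional.

```graphql
# Hypothetical type grouping the optional context of a word.
type WordContext {
  prefix: String   # text preceding the target word
  suffix: String   # text following the target word
  source: String   # where the word was found (type is an assumption)
}

# Type extension adding the two requested fields to the existing Word type.
extend type Word {
  language: Language!
  context: WordContext   # nullable, since context is optional
}
```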
For Lexeme, I don't know if altLemmas belongs in the Lexeme graph. We need to be able to be explicit about the isLemmaVariant relationship between lemmas. So maybe this actually belongs in the Lemma graph? Also, I think the meaning (DefinitionSet) needs to be optional on a Lexeme because we might not have it in all cases (right now we often have that scenario, when we can't find the definition for a lemma).
I'm for moving `altLemmas` to the `Lemma` object; it would make more sense, in my opinion. Also, maybe we can rename it to something like `variants`, without the word `lemma`, because it will be clear enough that this pertains to the lemma?
Regarding `DefinitionSet`: currently, if there are no definitions available, we attach an empty `DefinitionSet` to the `meaning` field of the `Lexeme`. As a result, the `Lexeme` always has a `DefinitionSet` object in the `meaning` field, even when this object is empty. Should we keep it the same in GraphQL?
For Lemma, I am wondering if we need to pull part of speech out of features. a Lemma needs at a minimum the Part of Speech feature and may have other optional features.
I think it would be good to separate the part of speech out of the features list. Would the name `partOfSpeech` be appropriate for the field? I'm afraid `pos` might be too ambiguous.
I'm not sure about including the InflectionConstraints .. those are mostly used for matching for purposes of the InflectionTables and I'm not sure they belong here.
That's a good point, I will remove it.
For Definition, we probably need both lemmaLanguage and definitionLanguage
If we need both the lemma word and the language, maybe it's better to include a simple `Lemma` object with the obligatory fields only (`ID`, `word`, and `language`)? I think the structure of the `Definition` would be simpler this way. What do you think?
For Inflection, we should have a required Form: String property. Up until now we have been overloading the stem property, using it to hold the form when we can't identify the stem. Sometimes the stem and the form are one and the same when there isn't a suffix or prefix but sometimes that's not what is meant.
Will add that.
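A minimal sketch of that change, assuming the new field is simply named `form`: the form becomes required, while `stem`, `prefix`, and `suffix` stay nullable so that `stem` no longer has to double as the form when the stem cannot be identified.

```graphql
# Type extension on the existing Inflection type; the field name is an assumption.
extend type Inflection {
  # The complete inflected form, always present
  form: String!
}
```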
For the Annotation, I'm not sure what we would have in the text field for the AnnotationTypes identified here so far. If it's an assertion of the validity of a Definition as belonging to a Lexeme it's just that. And vice-versa for the negation. The Comment is where someone would supply commentary text.
If we use `Comment` for the commentary text, then we don't need the `text` field within an `Annotation`. I will remove it.
In addition for Annotation we're going to need a property for Confidence. I think it can be a Number.
Will add that.
Also, I know we are focused on the Definition use case at the moment, but just a note that probably we are going to need a similar structure for Inflections (so that we can have annotations which assert or negate the relationship between a Lexeme and an Inflection, etc.)
I can add related fields into the schema.
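For illustration only, the Inflection counterpart could mirror `DefinitionConnection` and its assertion mutation. Every name introduced below (`InflectionConnection`, `InflectionAssertionInput`, `assertInflectionConnection`) is hypothetical and is modeled on the Definition structures in the schema above.

```graphql
# Hypothetical analogue of DefinitionConnection for inflections.
type InflectionConnection {
  id: ID!
  inflections: [Inflection!]
  assertions: [Assertion!]
  negations: [Negation!]
  comments: [Comment!]
}

# Hypothetical input, shaped like DefinitionAssertionInput.
input InflectionAssertionInput {
  inflectionConnectionID: ID!
  assertion: AssertionInput
}

extend type Mutation {
  # Creates a new assertion and attaches it to the specified inflectionConnection
  assertInflectionConnection(input: InflectionAssertionInput): Assertion
}
```

A matching negation input and mutation would follow the same pattern.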
I'm also working on adding some mutations for the `Definition` use cases.
for ResourceProvider, I'm not sure there is any difference between ID and URI. The URI is the unique identifier of the provider. In addition, I think the ResourceProvider needs to be part of the graph of the objects it provides. E.g. so a Definition would have a ResourceProvider, as would Inflection, etc. etc.
Let's leave the `uri` field as an ID (we can rename it to `id` if that's what the DB/GraphQL implementation requires). Maybe we should also add an optional `rights` field (though I'm not sure about the rights translations and where they would come from)?
Yes, we probably need `description` and `rights` fields. Currently these are defined in the adapter config, but we will need to be able to fully define a resource provider in the data store.
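To make the provider "part of the graph of the objects it provides", one option is a field on each provided type that points back to its `ResourceProvider`. The sketch below assumes a field name of `provider` and leaves it nullable; neither choice is settled in the discussion.

```graphql
# Type extensions linking provided objects to the existing ResourceProvider type.
extend type Definition {
  provider: ResourceProvider   # hypothetical field name
}

extend type Inflection {
  provider: ResourceProvider   # hypothetical field name
}
```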
for Feature, I'm not sure about SortOrder -- I don't think it should be required. I think I would also like to have a property on Feature which we can use to specify the ontology or schema the feature belongs to. The Feature names/values that we use now adhere to the AlpheiosLexicon schema, but we are going to need to be able to map to and/or support other ontologies (such as the Universal Dependencies tagset) so we should be explicit about that.
Agreed that sort order might not be needed. We can apply it later (maybe even dynamically) by matching the feature name with the sort order.
What do you think might be a good name for the field specifying the schema? Would we adhere to any standards here, such as `schema.org`?
Probably not schema.org, but yes, it's possible that at some point we would switch to Universal Dependencies (https://universaldependencies.org/format.html#morphological-annotation) or another standard such as Lexinfo (lexinfo.net).
For the field name, I think `schema` works as well as anything.
For Lexeme, I don't know if altLemmas belongs in the Lexeme graph. We need to be able to be explicit about the isLemmaVariant relationship between lemmas. So maybe this actually belongs in the Lemma graph? Also, I think the meaning (DefinitionSet) needs to be optional on a Lexeme because we might not have it in all cases (right now we often have that scenario, when we can't find the definition for a lemma).
I'm for moving `altLemmas` to the `Lemma` object; it would make more sense, in my opinion. Also, maybe we can rename it to something like `variants`, without the word `lemma`, because it will be clear enough that this pertains to the lemma?
Yes agree.
Regarding `DefinitionSet`: currently, if there are no definitions available, we attach an empty `DefinitionSet` to the `meaning` field of the `Lexeme`. As a result, the `Lexeme` always has a `DefinitionSet` object in the `meaning` field, even when this object is empty. Should we keep it the same in GraphQL?
I guess that's fine. We should probably have a standard approach to this across the board (i.e., whether to use nullable fields or empty lists).
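For reference, the choice between nullable fields and empty lists is expressed in SDL through the placement of `!`. The type below is purely illustrative (its name and field names are made up); only the four list forms matter.

```graphql
# Illustrative only: the four ways to declare a list field.
type ListNullabilityExample {
  a: [Definition]     # the list may be null, and items may be null
  b: [Definition!]    # the list may be null, but items may not
  c: [Definition]!    # the list is never null (it may be empty); items may be null
  d: [Definition!]!   # neither the list nor its items may be null; "no data" is []
}
```

The schema above mostly uses form `b` (for example `comments: [Comment!]`), which allows "no data" to be reported as either `null` or `[]`. Form `d`, already used for `Feature.value`, would make the empty list the only way to say "none".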
For Lemma, I am wondering if we need to pull part of speech out of features. a Lemma needs at a minimum the Part of Speech feature and may have other optional features.
I think it would be good to separate the part of speech out of the features list. Would the name `partOfSpeech` be appropriate for the field? I'm afraid `pos` might be too ambiguous.
`partOfSpeech` is fine.
For Definition, we probably need both lemmaLanguage and definitionLanguage
If we need both the lemma word and the language, maybe it's better to include a simple `Lemma` object with the obligatory fields only (`ID`, `word`, and `language`)? I think the structure of the `Definition` would be simpler this way. What do you think?
Yes.
Thanks.
I think we should still make `sortOrder` on `Feature` optional.
Regarding the mutations, I understand the recommendation from the cited article (https://www.apollographql.com/blog/designing-graphql-mutations-e09de826ed97/) (and others I've read) to suggest that mutations should have single, nested inputs and outputs, as in
commentOnLexeme(input:
{
lexemeID: ID!
comment: CommentInput!
}) {
comment: Comment!
}
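The shorthand above is not valid SDL as written, so here is one way it might be spelled out. The names `CommentOnLexemeInput` and `CommentOnLexemePayload` are assumptions that follow the common "Input"/"Payload" naming habit. The `Mutation` definition is meant as a replacement for the existing `commentOnLexeme` field, not as something to load alongside it.

```graphql
# Hypothetical single input object wrapping all arguments of the mutation.
input CommentOnLexemeInput {
  lexemeID: ID!
  comment: CommentInput!
}

# Hypothetical payload object wrapping the result, leaving room for more fields later.
type CommentOnLexemePayload {
  comment: Comment!
}

type Mutation {
  # Creates a new comment and attaches it to the specified lexeme
  commentOnLexeme(input: CommentOnLexemeInput!): CommentOnLexemePayload
}
```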
I think we should still make `sortOrder` on `Feature` optional.
I removed it from `Feature`, but left it within `FeatureValue`. Should I remove it from there too? Would we not need it there either? What do you think?
Regarding the mutations, I understand the recommendation from the cited article (https://www.apollographql.com/blog/designing-graphql-mutations-e09de826ed97/) (and others I've read) to suggest that mutations should have single, nested inputs and outputs
That's correct, but I was not sure whether to follow it or not. It seemed a little too radical to me. It felt contrary to the ideas behind the SDL, which allows multiple input parameters. I agree that it's good to keep the number of parameters to a minimum, but I'm not sure we should always use only one: creating a wrapper around several variables in order to present them as a single argument seemed like a source of extra verbosity. I also do not see any benefit for versioning, because for that we can create a new mutation. Some other guides I've seen use several mutation parameters: https://www.apollographql.com/docs/apollo-server/schema/schema/#designing-mutations. The GitLab GraphQL API style guide also uses multiple arguments: https://docs.gitlab.com/ee/development/api_graphql_styleguide.html#arguments. So it does not seem to be a clear-cut solution. I don't have a strong opinion on this, nor do I have sufficient experience with GraphQL, so I try to keep an open mind. What do you think? Would it be better for us to always use a single input variable?
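For comparison, the multi-argument alternative described here could look roughly like the sketch below. It is a hypothetical variant of the existing `commentOnLexeme` field, shown only to contrast the two styles.

```graphql
type Mutation {
  # Same operation, with each value passed as its own argument instead of one wrapper object
  commentOnLexeme(lexemeID: ID!, comment: CommentInput!): Comment
}
```

With this style no wrapper input type is needed, but adding or changing arguments later alters the field signature directly rather than the contents of one input object.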
I removed it from the Feature, but left it within the FeatureValue. Should I remove it from there too? Would we not need it there either? What do you think?
The thing about sortOrder is that it doesn't make sense in the context of a single feature. It belongs to the display domain rather than the data. Its presence in the morphology service output is really a legacy thing.
Regarding the mutations, I understand the recommendation from the cited article (https://www.apollographql.com/blog/designing-graphql-mutations-e09de826ed97/) (and others I've read) to suggest that mutations should have single, nested inputs and outputs
That's correct, but I was not sure whether to follow it or not. It seemed a little too radical to me. It felt contrary to the ideas behind the SDL, which allows multiple input parameters. I agree that it's good to keep the number of parameters to a minimum, but I'm not sure we should always use only one: creating a wrapper around several variables in order to present them as a single argument seemed like a source of extra verbosity. I also do not see any benefit for versioning, because for that we can create a new mutation. Some other guides I've seen use several mutation parameters: https://www.apollographql.com/docs/apollo-server/schema/schema/#designing-mutations. The GitLab GraphQL API style guide also uses multiple arguments: https://docs.gitlab.com/ee/development/api_graphql_styleguide.html#arguments. So it does not seem to be a clear-cut solution. I don't have a strong opinion on this, nor do I have sufficient experience with GraphQL, so I try to keep an open mind. What do you think? Would it be better for us to always use a single input variable?
I have read a number of things which support the nesting concept. But as with everything, there are always multiple perspectives. I think we'll see what works best for us as we go.
I have read a number of things which support the nesting concept. But as with everything, there are always multiple perspectives. I think we'll see what works best for us as we go.
I think we'd better use the nested input, then. If it doesn't work for us, we can switch to using multiple variables.
I've also removed the `sortOrder` field.
some initial thoughts:
for ResourceProvider, I'm not sure there is any difference between ID and URI. The URI is the unique identifier of the provider. In addition, I think the ResourceProvider needs to be part of the graph of the objects it provides. E.g. so a Definition would have a ResourceProvider, as would Inflection, etc. etc.
for Feature, I'm not sure about SortOrder -- I don't think it should be required. I think I would also like to have a property on Feature which we can use to specify the ontology or schema the feature belongs to. The Feature names/values that we use now adhere to the AlpheiosLexicon schema, but we are going to need to be able to map to and/or support other ontologies (such as the Universal Dependencies tagset) so we should be explicit about that.
For Word, we need Language, and also the ability to optionally include context (prefix, suffix, source).
For Lexeme, I don't know if altLemmas belongs in the Lexeme graph. We need to be able to be explicit about the isLemmaVariant relationship between lemmas. So maybe this actually belongs in the Lemma graph? Also, I think the meaning (DefinitionSet) needs to be optional on a Lexeme because we might not have it in all cases (right now we often have that scenario, when we can't find the definition for a lemma).
For Lemma, I am wondering if we need to pull part of speech out of features. a Lemma needs at a minimum the Part of Speech feature and may have other optional features.
I'm not sure about including the InflectionConstraints .. those are mostly used for matching for purposes of the InflectionTables and I'm not sure they belong here.
For Definition, we probably need both lemmaLanguage and definitionLanguage
For Inflection, we should have a required Form: String property. Up until now we have been overloading the stem property, using it to hold the form when we can't identify the stem. Sometimes the stem and the form are one and the same when there isn't a suffix or prefix but sometimes that's not what is meant.
For the Annotation, I'm not sure what we would have in the text field for the AnnotationTypes identified here so far. If it's an assertion of the validity of a Definition as belonging to a Lexeme it's just that. And vice-versa for the negation. The Comment is where someone would supply commentary text.
In addition for Annotation we're going to need a property for Confidence. I think it can be a Number.
Also, I know we are focused on the Definition use case at the moment, but just a note that probably we are going to need a similar structure for Inflections (so that we can have annotations which assert or negate the relationship between a Lexeme and an Inflection, etc.)