Upon researching GraphQL, the question immediately came to my mind:
Why do we need another query language when we already have SQL?
A second question then came to mind:
Can we use SQL for API endpoints?
After asking this question, it now seems odd that REST style APIs would even be considered a good choice.
Some of the obvious limitations of the current SQL syntax are:
- flat table style responses
- unable to return nested data structures and related records without multiplying the records through a join
- verbosity of joining to other tables
- the order of the commands is unintuitive. Eg selecting the shape of the data returned before we know from where the data will come from
- We can not query for data to be returned from an insert query like we can in GraphQL.
So what would SQL need to look like to make it useful for use with API endpoints?
Queries should be restructured so that the query for the data come first, and the structuring of the response comes last:
select people p
where p.id = 1
return p.id, p.name
This query selects any people with the id of 1
. The data returned is an array of objects containing the id
and name
properties.
The terms table
and record
I feel has is too tightly coupled to the persistence of data in a database table.
So lets introduce some slight variations to our terminology.
First, a table
generally represents a collection of something.
The something may have a realworld object it is describing, or be an abstract concept such as a relationship form one thing to another thing.
As such we should use the term collection
when referring to a collection of things.
Some ideas for terminology for what collections contain:
record
comes from traditional databasesentity
ormodel
comes from ORMs, but brings to mind a physical object. We also want to store abstract concepts.element
comes from arrays. Usually thought of linear and unconnectednode
comes from graph theory. Nodes have connections to other nodes. The name is somewhat unfamiliar to the general populationitem
is relatively generic and also simple to understandobject
comes from OOP to represent an instanciated instance of some type. In this case the type would be the typw of the collection
We often refer to the singular data contained within a database record as a column
.
Some ideas for replacement terminoligy includes:
property
attribute
collections of objects with attributes an attribute can be a connection from one entity to another
friend is a connection. connection/join tables may also have attributes on them we want to access friend.name - accessed the name property of the connected entity friend~id - accesses the id property of the connection entity
In relational databases, tables can join with other tables. A key consiting of one or more columns is used to join the records of one table to the records in another. Additionally, keys can be made as constraints on tables and be used for indexing. Further when querying for data, the joins in the query are always explicit. They must be defined in every query.
This seems odd to me. We should be able to define a connection
between items in one collection
to another.
For example, lets say we have a collection of people. Its safe to assume that people will have friendships with one another. We should be able to define explicit connections from one collection to another. There are various types of connections: one-one, one-many, many-many, one-at least one. We should be able define an explicit name for these connections so that we can use them in our queries.
Eg
add connection friends on person
select friendships f
join people p1 on f.person1 = p1
join people p2 on f.person2 = p2
return
ALTER COLLECTION people (
id int REQUIRED PRIMARY KEY,
name string REQUIRED,
has many friendships select friendships f where f.person1 = id
has many friends select this.friendships f return f.person2
);
ALTER COLLECTION friendships (
person1 REFERENCES people REQUIRED # implicitly uses primary key as the the connection... no need to specify the attribute
person2 REFERENCES people REQUIRED
)
When we query data, we generally want to query for data that has structure. We often want to see how data from one collection connects to data in another collection. We can query collections as well as the connections they define.
select people p
where p.id = 1
return
p.id,
p.name,
friends:
select p.friends f return # explicit select that returns specific attributes of each friend
f.id
f.name
Or,
select people p
where p.id = 1
return
p.*, # returns all attributes of p
friends: select p.friends f return f.* # explicit select that returns all attributes of each friend
Or,
select people p
where p.id = 1
return
p.*, # returns all attributes of p
p.friends # implicitly return all attributes of each friend
Response:
- id: 1
name: Shuan
friends:
- id: 2
name: Brent
update people p
where p.id = 7
set
p.name = "blah"
update p.friends f
The join column values are automatically set in the new inserted records where the select returns multiple entities, entities will be inserted for each
select people p
where p.id = 7
insert p.friends (person2)
from
- 8
- 9
- 10
Because person2
references people
, and the primary key of people is id of type int, the person with matching id is set to person2
.
Note that from
is an optional keyword. What matters is that either a single item or a collection is provided
select people p where p.id = 7
insert p.friends (person2) f
from select people where id in [8, 9, 10]
Because person2
references people
, and our from select
query returns records from the people
collection, we do not need to specify a return attribute to assign to person2
.
We should be able to return query data from our update queries.
select people p where p.id = 7
insert p.friends (person _id) f
from select people where id in [8, 9, 10]
return
f.name
QraphQL does not allow custom querying. The querying API is predefined. The client has no mechanism to request additional filters on the data.
- custom methods with defined parameters