br3nt/query_language.md

## query_language.md

      
    Raw
  

              query_language.md
            
          
    Upon researching GraphQL, the question immediately came to my mind:

Why do we need another query language when we already have SQL?

A second question then came to mind:

Can we use SQL for API endpoints?

After asking this question, it now seems odd that REST style APIs would even be considered a good choice.
Some of the obvious limitations of the current SQL syntax are:

flat table style responses
unable to return nested data structures and related records without multiplying the records through a join
verbosity of joining to other tables
the order of the commands is unintuitive.  Eg selecting the shape of the data returned before we know from where the data will come from
We can not query for data to be returned from an insert query like we can in GraphQL.

So what would SQL need to look like to make it useful for use with API endpoints?
Simple querying

Queries should be restructured so that the query for the data come first, and the structuring of the response comes last:
select people p
where p.id = 1
return p.id, p.name

This query selects any people with the id of 1.  The data returned is an array of objects containing the id and name properties.
Terminology

The terms table and record I feel has is too tightly coupled to the persistence of data in a database table.
So lets introduce some slight variations to our terminology.
First, a table generally represents a collection of something.
The something may have a realworld object it is describing, or be an abstract concept such as a relationship form one thing to another thing.
As such we should use the term collection when referring to a collection of things.
Some ideas for terminology for what collections contain:

record comes from traditional databases
entity or model comes from ORMs, but brings to mind a physical object.  We also want to store abstract concepts.
element comes from arrays.  Usually thought of linear and unconnected
node comes from graph theory.  Nodes have connections to other nodes. The name is somewhat unfamiliar to the general population
item is relatively generic and also simple to understand
object comes from OOP to represent an instanciated instance of some type.  In this case the type would be the typw of the collection

We often refer to the singular data contained within a database record as a column.
Some ideas for replacement terminoligy includes:

property
attribute

collections of objects with attributes
an attribute can be a connection from one entity to another
friend is a connection.
connection/join tables may also have attributes on them we want to access
friend.name - accessed the name property of the connected entity
friend~id - accesses the id property of the connection entity
Connections and relationships

In relational databases, tables can join with other tables.
A key consiting of one or more columns is used to join the records of one table to the records in another.
Additionally, keys can be made as constraints on tables and be used for indexing.
Further when querying for data, the joins in the query are always explicit.  They must be defined in every query.
This seems odd to me.  We should be able to define a connection between items in one collection to another.
For example, lets say we have a collection of people.  Its safe to assume that people will have friendships with one another.
We should be able to define explicit connections from one collection to another.
There are various types of connections: one-one, one-many, many-many, one-at least one.
We should be able define an explicit name for these connections so that we can use them in our queries.
Eg
add connection friends on person
select friendships f
join people p1 on f.person1 = p1
join people p2 on f.person2 = p2
return 

ALTER COLLECTION people (
  id int REQUIRED PRIMARY KEY,
  name string REQUIRED,
  
  has many friendships select friendships f where f.person1 = id
  has many friends select this.friendships f return f.person2
);

ALTER COLLECTION friendships (
  person1 REFERENCES people REQUIRED # implicitly uses primary key as the the connection... no need to specify the attribute
  person2 REFERENCES people REQUIRED
)

Querying related entities and returning structured data

When we query data, we generally want to query for data that has structure.
We often want to see how data from one collection connects to data in another collection.
We can query collections as well as the connections they define.
select people p
where p.id = 1
return
  p.id,
  p.name,
  friends:
    select p.friends f return # explicit select that returns specific attributes of each friend
      f.id
      f.name

Or,
select people p
where p.id = 1
return
  p.*, # returns all attributes of p
  friends: select p.friends f return f.* # explicit select that returns all attributes of each friend

Or,
select people p
where p.id = 1
return
  p.*, # returns all attributes of p
  p.friends # implicitly return all attributes of each friend

Response:
- id: 1
  name: Shuan
  friends:
    - id: 2
      name: Brent

Updating collections

update people p
where p.id = 7

set
  p.name = "blah"
  update p.friends f

The join column values are automatically set in the new inserted records
where the select returns multiple entities, entities will be inserted for each
select people p
where p.id = 7
insert p.friends (person2)
from
  - 8
  - 9
  - 10

Because person2 references people, and the primary key of people is id of type int, the person with matching id is set to person2.
Note that from is an optional keyword.  What matters is that either a single item or a collection is provided
select people p where p.id = 7
insert p.friends (person2) f
from select people where id in [8, 9, 10]

Because person2 references people, and our from select query returns records from the people collection, we do not need to specify a return attribute to assign to person2.
Returning structured data from insert queries

We should be able to return query data from our update queries.
select people p where p.id = 7
insert p.friends (person _id) f
from select people where id in [8, 9, 10]
return 
  f.name

Comparisons to GraphQL

QraphQL does not allow custom querying.  The querying API is predefined.  The client has no mechanism to request additional filters on the data.
Whats missing?


custom methods with defined parameters