This article gives an overview of the second and third months of the coding period for my project, Journal Policy Tracker Backend, under the Open Bioinformatics Foundation during Google Summer of Code 2022.
Our team had regular meetings and discussions. My immediate mentor Pritish and I met every other week, where I explained the work I was doing and told him about any problems I was facing. We also had a few combined team meetings where all the mentors and mentees in our team caught up and talked about the progress we had made and the direction in which development should move. Many of the decisions made in these meetings shaped our next steps.
One of the key decisions from these meetings was that every user will have a dashboard where they can see their details and all the journals they have submitted. They can also visit other people’s profiles and see the journals those people have submitted. We also decided not to require an account just to look up a journal policy by its ISSN, which makes the app more convenient for everyone.
After I was done with implementing sessions and handling CORS, I moved towards integrating the User and the Journal entities in our database.
I made significant changes to the user and journal resolvers to accomplish this. Because sessions are now implemented and working on our backend, we have access to the userId of the user who is currently logged in. Here, userId is the string representation of the ObjectId, which is a unique identifier that MongoDB auto-generates for each document in a collection.
The next step was to restrict access to the Journal CRUD API to people who are logged in. I accomplished this with the help of a package called GraphQL Shield. I explain this process in more detail later in this article.
From here, only the accounts that are registered in our app can Create, Read, Update and Delete journals from our database. This is a good start for integrating the User and Journal entities.
I integrated the User and Journal entities as follows:
- Every journal now has a field called createdBy, which is populated by the userId of the user who created that journal. That userId is fetched from the session that gets created when a user logs in or when a new user registers.
- Every user has a journals field that contains an array as the value. That array contains the ObjectId of each journal that was created by that particular user.
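The two-way link described above can be sketched with plain in-memory objects. This is only an illustration and not the project's resolver code: the users and journals maps and the createJournal helper below are hypothetical stand-ins for the MongoDB collections and the real resolver, and the session is reduced to an object holding a userId.

```javascript
// Hypothetical in-memory stand-ins for the User and Journal collections.
const users = new Map([["u1", { id: "u1", username: "alice", journals: [] }]]);
const journals = new Map();

// Sketch of a create step: the creator's id comes from the session,
// not from the client-supplied input.
function createJournal(session, input) {
  if (!session.userId) throw new Error("Not authenticated");
  const journal = {
    ...input,
    id: `j${journals.size + 1}`,
    createdBy: session.userId,
  };
  journals.set(journal.id, journal);
  // Mirror the link on the user side.
  users.get(session.userId).journals.push(journal.id);
  return journal;
}

const created = createJournal(
  { userId: "u1" },
  { title: "Example Journal", issn: "12345678" }
);
console.log(created.createdBy); // u1
console.log(users.get("u1").journals.includes(created.id)); // true
```

Keeping the link on both sides makes both lookups cheap, but it also means every write has to update two places.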
Once the user and journal entities were integrated, a problem appeared: deleting a journal did not remove the journal’s id from its creator’s journals array. A relational database would typically avoid this, because the link would be stored only once, as a foreign key on the journal row pointing to its creator (a one-to-many relationship), so deleting the journal would leave nothing behind on the user side.
MongoDB, like most NoSQL databases, does not enforce this kind of referential integrity between collections for us. The simplest solution was to add a middleware that runs whenever the deleteJournal function is called.
This middleware function runs before our deleteJournal resolver function and resolves the inconsistency that we were facing.
The middleware function looks like the following:
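// Journal and User are the Mongoose models defined elsewhere in the project.
export const journalMiddleware = {
  Mutation: {
    // Runs before the deleteJournal resolver.
    deleteJournal: async (resolve, parent, args, context, info) => {
      try {
        const journalToDelete = await Journal.findOne({
          issn: args.issnToDelete,
        });
        // Only touch the user if the journal actually exists.
        if (journalToDelete) {
          // Remove the journal's id from its creator's journals array.
          await User.findByIdAndUpdate(
            journalToDelete.createdBy,
            {
              $pull: {
                journals: journalToDelete._id,
              },
            },
            { safe: true }
          );
        }
      } catch (error) {
        console.log(error);
      }
      // Hand over to the actual resolver.
      return resolve(parent, args, context, info);
    },
  },
};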
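To see the effect of this pattern without a database, here is a small sketch using in-memory data. The user object, the journalsByIssn map and the stand-in resolver are all hypothetical. Only the argument order (resolve, parent, args, context, info) and the issnToDelete argument are taken from the middleware above.

```javascript
// Hypothetical in-memory data: one user who owns two journals.
const user = { id: "u1", journals: ["j1", "j2"] };
const journalsByIssn = new Map([
  ["11111111", { id: "j1", issn: "11111111", createdBy: "u1" }],
  ["22222222", { id: "j2", issn: "22222222", createdBy: "u1" }],
]);

// Stand-in for the deleteJournal resolver: it only removes the journal.
const deleteJournalResolver = async (parent, args) =>
  journalsByIssn.delete(args.issnToDelete);

// Middleware: clean up the creator's array first, then call the resolver.
const deleteJournalMiddleware = async (resolve, parent, args, context, info) => {
  const journal = journalsByIssn.get(args.issnToDelete);
  if (journal) {
    // Same effect as $pull: drop every occurrence of the journal's id.
    user.journals = user.journals.filter((id) => id !== journal.id);
  }
  return resolve(parent, args, context, info);
};

const done = deleteJournalMiddleware(
  deleteJournalResolver,
  null,
  { issnToDelete: "11111111" },
  {},
  null
).then((deleted) => {
  console.log(deleted, user.journals);
  return deleted;
});
```

After the promise settles, the journal is gone from the map and its id is gone from the user's journals array, so the two sides stay consistent.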
The updated schema for the journal entity looks like the following:
Journal: {
  id: ID
  title: String
  url: String
  issn: String
  domainName: String
  policies: {
    title: String
    firstYear: Int
    lastYear: Int
    policyType: enum: {
      NUMBER_ONE: "Number One"
      NUMBER_TWO: "Number Two"
      NUMBER_THREE: "Number Three"
      NUMBER_FOUR: "Number Four"
    }
    isDataAvailabilityStatementPublished: Boolean
    isDataShared: Boolean
    isDataPeerReviewed: Boolean
    enforced: enum: {
      YES: "Yes - Before Publication"
      SOMETIMES: "Sometimes - Post-Publication Audit"
      NO: "No - Not Enforced"
    }
    enforcedEvidence: String
  }
  createdAt: String
  updatedAt: String
  createdBy: ID
}
In a couple of fields of our updated journal schema, we wanted to restrict the values to a small, fixed set of options.
Although MongoDB does not enforce any schema out of the box, we use Mongoose to communicate with our MongoDB database, and it gives us the option to declare enums in our schema.
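In Mongoose this kind of restriction is typically declared with the enum option on a String field. The sketch below does not use Mongoose. It only mimics the effect in plain JavaScript so that it can run on its own. The validateEnum helper is hypothetical, and the allowed values are copied from the enforced field of the schema above.

```javascript
// Allowed values, copied from the enforced field of the schema above.
const ENFORCED = Object.freeze({
  YES: "Yes - Before Publication",
  SOMETIMES: "Sometimes - Post-Publication Audit",
  NO: "No - Not Enforced",
});

// Hypothetical helper: accept a value only if it is one of the options.
function validateEnum(fieldName, value, allowed) {
  if (!Object.values(allowed).includes(value)) {
    throw new Error(`"${value}" is not a valid value for ${fieldName}`);
  }
  return value;
}

validateEnum("enforced", ENFORCED.NO, ENFORCED); // passes

try {
  validateEnum("enforced", "Maybe", ENFORCED);
} catch (error) {
  console.log(error.message); // "Maybe" is not a valid value for enforced
}
```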
Pagination is the process of splitting the contents of a website into discrete pages. This feature will let the users browse through all the journals with a lot of ease.
The type of pagination implemented in our project is commonly known as skip-limit pagination. The skip-limit approach is very common and straightforward to implement.
In this approach, to implement pagination we require two values. These are:
- Skip → It refers to the number of items we want to skip from the response array to get to the desired items of a particular page.
- Limit → It refers to the number of items we want to fetch after the skipped items.
To calculate the skip value, we take the page number that the user is currently on and plug it into this formula:
Skip Value = (Current Page Number - 1) × Limit Value
I decided to implement the pagination feature in all of the queries that respond with an array of objects. Those response arrays are either user arrays or journal arrays. Therefore now whenever we query a list of users or journals, it is nicely divided into smaller pages.
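As a minimal sketch of the formula above, the following helpers page through an in-memory array. The getSkip and paginate names are hypothetical. With Mongoose, the same two values would typically be passed to a query's skip() and limit() methods instead of slicing an array.

```javascript
// Skip Value = (Current Page Number - 1) × Limit Value
function getSkip(page, limit) {
  return (page - 1) * limit;
}

// Hypothetical helper that returns one page of an in-memory array.
function paginate(items, page, limit) {
  const skip = getSkip(page, limit);
  return items.slice(skip, skip + limit);
}

const allJournals = Array.from({ length: 25 }, (_, i) => `Journal ${i + 1}`);

console.log(getSkip(3, 10)); // 20
console.log(paginate(allJournals, 3, 10).length); // 5 (Journal 21 to Journal 25)
```

With 25 items and a limit of 10, page 3 skips the first 20 items and returns the remaining 5. A page past the end simply returns an empty array.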
In order to better manage and maintain our app, we needed to have a hierarchy for all users where some of the user accounts will have elevated privileges. This is because we cannot let everyone with an account be able to execute all the functions that are available on our server.
To solve this problem, we put all user accounts in a hierarchy in which each account’s designated role determines the functions it is able to execute.
To implement this, I first added the role field to the user schema, which gave me the ability to assign a level to every user. Each user’s role field can have one of three values: USER, MODERATOR, or ADMIN, where ADMIN has access to all the functions on our server. Every user account has the role USER by default.
This feature was implemented with the help of the GraphQL Shield package. I also got a lot of help understanding its implementation from this video by Ben Awad.
Checking the role of a user is done with the rule function that comes with GraphQL Shield, and looks like the following:
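import { and, or, rule, shield } from "graphql-shield";

// Passes when the session holds a logged-in user's id.
const isAuthenticated = rule()((_, __, { req }) => {
  return !!req.session.userId;
});

// Passes when the logged-in user has the ADMIN role.
const isAdmin = rule()(async (_, __, { req }) => {
  const user = await User.findById(req.session.userId);
  return Boolean(user) && user.role === "ADMIN";
});

// Passes when the logged-in user has the MODERATOR role.
const isModerator = rule()(async (_, __, { req }) => {
  const user = await User.findById(req.session.userId);
  return Boolean(user) && user.role === "MODERATOR";
});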
Currently, the role-to-function distribution looks like this:
export const authMiddleware = shield({
  Query: {
    // user queries
    getCurrentUser: isAuthenticated,
    getAllUsers: and(isAuthenticated, or(isAdmin, isModerator)),
    // journal queries
    getAllJournalsByUserId: isAuthenticated,
    getAllJournalsByCurrentUser: isAuthenticated,
  },
  Mutation: {
    // user mutations
    addMockUserData: and(isAuthenticated, isAdmin),
    logout: isAuthenticated,
    // journal mutations
    addMockJournalData: and(isAuthenticated, isAdmin),
    createJournal: isAuthenticated,
    updateJournal: and(isAuthenticated, or(isAdmin, isModerator)),
    deleteJournal: and(isAuthenticated, or(isAdmin, isModerator)),
  },
});
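To make the logic of a combination like and(isAuthenticated, or(isAdmin, isModerator)) easier to follow, here is a simplified sketch in plain JavaScript. The and and or functions below are my own stand-ins and not GraphQL Shield's implementation. They ignore the library's caching and error handling, and the rules read the role straight from a hypothetical context object instead of loading the user from the database.

```javascript
// Simplified stand-ins for the and/or combinators.
// Each rule is an async function from a context object to true/false.
const and = (...rules) => async (ctx) => {
  for (const rule of rules) {
    if (!(await rule(ctx))) return false;
  }
  return true;
};

const or = (...rules) => async (ctx) => {
  for (const rule of rules) {
    if (await rule(ctx)) return true;
  }
  return false;
};

// Hypothetical rules working on a plain context object.
const isAuthenticated = async (ctx) => Boolean(ctx.userId);
const isAdmin = async (ctx) => ctx.role === "ADMIN";
const isModerator = async (ctx) => ctx.role === "MODERATOR";

const canDeleteJournal = and(isAuthenticated, or(isAdmin, isModerator));

const checks = Promise.all([
  canDeleteJournal({ userId: "u1", role: "USER" }),
  canDeleteJournal({ userId: "u2", role: "MODERATOR" }),
  canDeleteJournal({ role: "ADMIN" }), // no userId, so not logged in
]).then((results) => {
  console.log(results); // [ false, true, false ]
  return results;
});
```

A regular user is rejected, a moderator is allowed, and a request without a session is rejected regardless of the role it claims.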
It looks very simple at the moment, but its complexity will probably increase over time as more functions are added to our server.
In our meetings, we discussed error handling and decided that the only errors handled on the backend will be the ones that need data from the database to verify something. The rest can be handled on the front-end with the help of a validation library.
Error handling was added to the createJournal and updateJournal mutations. In the createJournal mutation, if we create a new journal with an ISSN that is already present in the database, it responds with the error “A journal with the same ISSN already exists”. The same is true of the updateJournal mutation: we get exactly the same error if we update an existing journal with an ISSN that is already present in the database. If we try to update a journal that is not present in the database, it responds with the error “ISSN not available”.
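The checks described above can be sketched with an in-memory map keyed by ISSN. This is an illustration under assumptions and not the actual resolver code: the journalsByIssn store and the issnToUpdate and newData parameter names are hypothetical. Only the two error messages are taken from the text.

```javascript
// Hypothetical in-memory store keyed by ISSN.
const journalsByIssn = new Map();

function createJournal(input) {
  if (journalsByIssn.has(input.issn)) {
    throw new Error("A journal with the same ISSN already exists");
  }
  journalsByIssn.set(input.issn, { ...input });
  return journalsByIssn.get(input.issn);
}

// issnToUpdate identifies the existing journal; newData may change its ISSN.
function updateJournal(issnToUpdate, newData) {
  const existing = journalsByIssn.get(issnToUpdate);
  if (!existing) throw new Error("ISSN not available");
  const issnChanged = newData.issn !== undefined && newData.issn !== issnToUpdate;
  if (issnChanged && journalsByIssn.has(newData.issn)) {
    throw new Error("A journal with the same ISSN already exists");
  }
  const updated = { ...existing, ...newData };
  journalsByIssn.delete(issnToUpdate);
  journalsByIssn.set(updated.issn, updated);
  return updated;
}

createJournal({ issn: "11111111", title: "First" });
createJournal({ issn: "22222222", title: "Second" });

// Each of these attempts fails with one of the two messages.
const attempts = [
  () => createJournal({ issn: "11111111", title: "Duplicate" }),
  () => updateJournal("22222222", { issn: "11111111" }),
  () => updateJournal("99999999", { title: "Missing" }),
];
for (const attempt of attempts) {
  try {
    attempt();
  } catch (error) {
    console.log(error.message);
  }
}
```

Both checks need to look at existing data, which is why they belong on the backend rather than in a front-end validation library.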
While working on the project, I had some trouble understanding the difference between _id and id. I wanted to find out what the actual difference is and what the use cases are for each, so I did a good amount of research on the topic. This Stack Overflow answer gives some clarity, explaining that _id returns a value of type ObjectId and id returns the string representation of that ObjectId.
- _id will return a value like ObjectId("5349b4ddd2781d08c09890f3")
- id will return a value like 5349b4ddd2781d08c09890f3
- If we want to call a function like Model.findById(), we can directly pass in the id, that is, the string representation.
- If we are looking something up by its id, we should simply use the .findById() method and not .find().
- id will not work with .find(), as in .find({ id: "5349b4ddd2781d08c09890f3" }), because id is a virtual getter that Mongoose adds to documents and not a field stored in the database.
- _id will work with .find(), as in .find({ _id: "5349b4ddd2781d08c09890f3" })
When I was working on the user authentication system, I needed a front-end that I could connect my backend to in order to test all the functions. So I made a very simple front-end using Next.js and MUI and added the register and login pages to it. After the front-end was ready, I connected it to the journal policy backend and integrated the GraphQL resolvers with the front-end pages.
I wanted not only the login and register functions to work but also field-level error handling, which displays each error message under the particular field where the wrong data was entered. This is a very common practice nowadays. Another reason for making this mock front-end is that it will make things easier when we integrate our backend with the real front-end, because the same approach can be replicated there.
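One common way to do field-level error handling is for the backend to return a list of errors that each name a field, and for the front-end to turn that list into a lookup object. The sketch below assumes that shape. The response object, its field names, the messages and the toErrorMap helper are all hypothetical and may differ from what the project actually returns.

```javascript
// Assumed shape of a mutation response that failed validation.
const response = {
  errors: [
    { field: "username", message: "Username is already taken" },
    { field: "password", message: "Password is too short" },
  ],
  user: null,
};

// Hypothetical helper: turn the list into { fieldName: message } so that
// each form input can look up its own error.
function toErrorMap(errors) {
  const errorMap = {};
  for (const { field, message } of errors) {
    errorMap[field] = message;
  }
  return errorMap;
}

const fieldErrors = toErrorMap(response.errors);
console.log(fieldErrors.username); // Username is already taken
console.log(fieldErrors.email); // undefined (no error for this field)
```

Each input on the register or login page can then render fieldErrors[name] underneath itself when it is defined.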
Having a convenient way of adding mock data to the database is crucial for testing our GraphQL resolvers.
While I was implementing pagination, I realized that I had no convenient way to populate the database with mock data to test the pagination feature.
Therefore I implemented two mutation functions to add mock data to our database. These functions are called addMockUserData and addMockJournalData. To implement these mutations, I used faker-js.
Faker is a popular library that generates fake (but reasonable) data that can be used for things such as:
- Unit Testing
- Performance Testing
- Building Demos
- Working without a completed backend
Config for generating mock user data:
import { faker } from "@faker-js/faker";

// Builds one user object filled with fake data.
const generateMockUser = () => {
  return {
    fullName: faker.fake("{{name.firstName}} {{name.lastName}}"),
    username: faker.internet.userName(),
    email: faker.internet.email(),
    // Mock users never get the ADMIN role.
    role: faker.helpers.arrayElement(["USER", "MODERATOR"]),
    password: faker.internet.password(),
    createdAt: faker.date.past(),
  };
};
Config for generating mock journal data:
// Builds one journal object filled with fake data, owned by the given user.
const generateMockJournals = (userId) => {
  return {
    title: faker.animal.cat(),
    url: faker.internet.url(),
    issn: faker.datatype.number({ min: 10000000, max: 99999999 }),
    domainName: faker.internet.domainName(),
    policies: {
      title: faker.animal.cat(),
      firstYear: faker.datatype.number({ min: 2000, max: 2010 }),
      lastYear: faker.datatype.number({ min: 2011, max: 2022 }),
      // Fixed enum values; plain strings need no faker call.
      policyType: "Number One",
      isDataAvailabilityStatementPublished: faker.datatype.boolean(),
      isDataShared: faker.datatype.boolean(),
      isDataPeerReviewed: faker.datatype.boolean(),
      enforced: "No - Not Enforced",
      enforcedEvidence: faker.lorem.sentence(10),
    },
    createdAt: faker.date.past(),
    updatedAt: faker.date.past(),
    // The creator's id is passed through unchanged.
    createdBy: userId,
  };
};
The generateMockJournals function requires a userId to generate the mock journals. As I explained above in the Integration of User and the Journal section, every journal has a createdBy field that needs to be populated with the userId of the user who created it. The userId passed to this function is written to the createdBy field of every mock journal that is generated.
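As a rough sketch of how a batch of mock journals for one user could be built, the example below uses a stub generator in place of the faker-based function so that it runs without any dependencies. The generateMockJournal stub, the generateMockJournalBatch helper and the count parameter are hypothetical. How the actual addMockJournalData mutation loops over the generator is not shown in this article.

```javascript
// Stub standing in for the faker-based generator above; only the fields
// needed for this illustration are filled in.
const generateMockJournal = (userId, index) => ({
  title: `Mock Journal ${index + 1}`,
  issn: String(10000000 + index),
  createdBy: userId,
});

// Hypothetical batch helper: build `count` journals owned by one user.
const generateMockJournalBatch = (userId, count) =>
  Array.from({ length: count }, (_, index) =>
    generateMockJournal(userId, index)
  );

const ownerId = "5349b4ddd2781d08c09890f3";
const batch = generateMockJournalBatch(ownerId, 3);

console.log(batch.length); // 3
console.log(batch.every((journal) => journal.createdBy === ownerId)); // true
```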
At this point, the project is almost finished. I have completed the work that I mentioned in my GSoC proposal and finished writing the documentation. The only thing left now is to write tests for the project. In the forthcoming week, I will be writing a few tests and making sure that everything works well. At the end I will write a Final Report that will summarize and showcase all the work that I carried out during GSoC this year. I'm excited to get it up and running on a server and see it work. I want to thank my mentors Pritish and Yo for giving me this kind of leeway to work on this project. It was a lot of fun and I learned a lot during this time.