d-e-v-esh/GSoC 2022 Blog - 2.md

## GSoC 2022 Blog - 2.md

      
    GSoC 2022 Blog - 2

This article gives an overview of the second and third months of the coding period for my project, Journal Policy Tracker Backend, under the Open Bioinformatics Foundation organization during Google Summer of Code 2022.
Meetings and Discussions

Our team held meetings and discussions regularly. My immediate mentor Pritish and I met every other week, where I explained the work I was doing and told him about any problems I was facing. We also had a few combined team meetings where all the mentors and mentees in our team caught up, talked about the progress we had made, and decided the direction the development should take next.
One of the key decisions from these meetings was that every user will have a dashboard showing their details and all the journals they have submitted. Users can also visit other people’s profiles and see the journals those people have submitted. We also decided that looking up a journal policy by its ISSN will not require an account, which makes the tool more convenient for everyone.
Integration of User and the Journal

After I was done with implementing sessions and handling CORS, I moved towards integrating the User and the Journal entities in our database.
I made significant changes to the user and journal resolvers to accomplish this. Now that sessions are implemented and working on our backend, we have access to the userId of the user who is currently logged in. Here, userId is the string representation of the ObjectId, the unique identifier that MongoDB auto-generates for every document in a collection.
The next step was to restrict access to the Journal CRUD API to people who are logged in. I accomplished this with the help of a package called GraphQL Shield. I explain this process in more detail in the Role Based Authorization section later in this article.
From here, only the accounts that are registered in our app can Create, Read, Update and Delete journals from our database. This is a good start for integrating the User and Journal entities.
I integrated the User and Journal like the following:

Every journal now has a field called createdBy which is populated by the userId of the user who created that journal. That userId is fetched from the session that gets created when a user logs in or when a new user registers.
Every user has a journals field that contains an array as the value. That array contains the objectId of the journals that were created by that particular user.
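
The two-way link described above can be sketched without a database. This is only an illustration under stated assumptions: the Map objects stand in for the MongoDB collections, the ids are simple strings instead of ObjectIds, and createJournalForUser is a hypothetical name, not a function from the project.

```javascript
// In-memory stand-ins for the user and journal collections (assumption).
const users = new Map([["u1", { id: "u1", journals: [] }]]);
const journals = new Map();

// sessionUserId plays the role of the userId read from the session.
const createJournalForUser = (sessionUserId, journalInput) => {
  const user = users.get(sessionUserId);
  if (!user) {
    throw new Error("Not logged in");
  }

  // The journal records who created it...
  const journal = {
    id: `j${journals.size + 1}`,
    ...journalInput,
    createdBy: sessionUserId,
  };
  journals.set(journal.id, journal);

  // ...and the creator keeps a list of the journals they created.
  user.journals.push(journal.id);
  return journal;
};

const created = createJournalForUser("u1", { title: "Sample", issn: "12345678" });
console.log(created.createdBy); // u1
console.log(users.get("u1").journals); // [ 'j1' ]
```

Keeping the reference on both sides makes it cheap to list a user's journals, at the cost of having to update both documents whenever a journal is created or deleted.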

Adding journalMiddleware

Once the user and journal entities were integrated, a new problem appeared: deleting a journal did not remove the journal’s id from its creator’s journals array. In a relational database this would typically not occur, because the journals table would reference the users table through a foreign key (a one-to-many relationship) instead of each user storing a list of journal ids.
But in a NoSQL database like MongoDB, references between documents are not kept in sync automatically. The simplest solution to this problem was to add a middleware that runs whenever the deleteJournal mutation is called.
This middleware function runs before our deleteJournal resolver function and removes the journal’s id from its creator’s journals array, which resolves the inconsistency.
The middleware function looks like the following:
export const journalMiddleware = {
  Mutation: {
    deleteJournal: async (resolve, parent, args, context, info) => {
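      try {
        // Look up the journal first so we know who created it.
        const journalToDelete = await Journal.findOne({
          issn: args.issnToDelete,
        });

        // Only update the user if the journal actually exists.
        if (journalToDelete) {
          // Remove the journal's _id from its creator's journals array.
          await User.findByIdAndUpdate(
            journalToDelete.createdBy,
            {
              $pull: {
                journals: journalToDelete._id,
              },
            },
            { safe: true }
          );
        }
      } catch (error) {
        console.log(error);
      }

      // Hand over to the actual deleteJournal resolver.
      return resolve(parent, args, context, info);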
    },
  },
};
Implementation of the Updated Journal Schema

The updated schema for the journal entity looks like the following:
Journal: {
  id: ID
  title: String
  url: String
  issn: String
  domainName: String
  policies: {
    title: String
    firstYear: Int
    lastYear: Int
    policyType: enum: {
      NUMBER_ONE: "Number One"
      NUMBER_TWO: "Number Two"
      NUMBER_THREE: "Number Three"
      NUMBER_FOUR: "Number Four"
    }
    isDataAvailabilityStatementPublished: Boolean
    isDataShared: Boolean
    isDataPeerReviewed: Boolean
    enforced: enum: {
      YES: "Yes - Before Publication"
      SOMETIMES: "Sometimes - Post-Publication Audit"
      NO: "No - Not Enforced"
    }
    enforcedEvidence: String
  }
  createdAt: String
  updatedAt: String
  createdBy: ID
}
In a couple of fields of our updated journal schema (policyType and enforced), we wanted to restrict the values to a small, fixed set of options.
Although MongoDB does not enforce any schema out of the box, we use Mongoose to communicate with our MongoDB database, and it gives us the option to define enums in our schema.
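
As a rough sketch of how such enums can be declared with Mongoose: the allowed values below are copied from the schema listing above, but the sub-schema layout and the model name PolicySketch are assumptions made for illustration, not the project's actual model file. It uses require, so adapt it to import if your setup uses ES modules.

```javascript
const mongoose = require("mongoose");

// Allowed values, copied from the schema listing above.
const POLICY_TYPES = ["Number One", "Number Two", "Number Three", "Number Four"];
const ENFORCED_VALUES = [
  "Yes - Before Publication",
  "Sometimes - Post-Publication Audit",
  "No - Not Enforced",
];

// Hypothetical sub-schema that contains only the two enum fields.
const policySchema = new mongoose.Schema({
  policyType: { type: String, enum: POLICY_TYPES },
  enforced: { type: String, enum: ENFORCED_VALUES },
});

const Policy = mongoose.model("PolicySketch", policySchema);

// validateSync() runs the schema validators without a database connection.
const error = new Policy({ policyType: "Number Nine" }).validateSync();
console.log(error.errors.policyType.message);
```

With this in place, saving a document whose policyType or enforced value is outside the list is rejected by Mongoose before it reaches MongoDB.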
Pagination

Pagination is the process of splitting the contents of a website into discrete pages. This feature lets users browse through all the journals easily.
The type of pagination implemented in our project is commonly known as skip-limit pagination. The skip-limit approach is very common and straightforward to implement.
In this approach, to implement pagination we require two values. These are:

Skip → It refers to the number of items we want to skip from the response array to get to the desired items of a particular page.
Limit → It refers to the number of items we want to fetch after the skipped items.

To calculate the skip value, we take the page number that the user is currently on and plug it into this formula:
Skip Value = (Current Page Number - 1) × Limit Value
I decided to implement the pagination feature in all of the queries that respond with an array of objects. Those response arrays are either user arrays or journal arrays. Therefore now whenever we query a list of users or journals, it is nicely divided into smaller pages.
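
The skip formula from this section can be sketched as a small helper. This is a minimal illustration rather than the project's resolver code: getPaginationParams and paginate are hypothetical names, and the plain array stands in for a database collection. With Mongoose, the same two values would typically be passed to a query's .skip() and .limit() methods.

```javascript
// Hypothetical helper: turns a 1-based page number into skip/limit values.
const getPaginationParams = (currentPage, limit) => {
  if (!Number.isInteger(currentPage) || currentPage < 1) {
    throw new RangeError("currentPage must be an integer >= 1");
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be an integer >= 1");
  }
  // Skip Value = (Current Page Number - 1) × Limit Value
  return { skip: (currentPage - 1) * limit, limit };
};

// Applies skip-limit pagination to a plain array (stand-in for a collection).
const paginate = (items, currentPage, limit) => {
  const { skip } = getPaginationParams(currentPage, limit);
  return items.slice(skip, skip + limit);
};

const journals = ["A", "B", "C", "D", "E"];
console.log(getPaginationParams(3, 2)); // { skip: 4, limit: 2 }
console.log(paginate(journals, 2, 2)); // [ 'C', 'D' ]
```

Note that the last page can contain fewer items than the limit, as page 3 does in this example with only one journal left.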
Role Based Authorization → Authorization Middleware

In order to better manage and maintain our app, we needed to have a hierarchy for all users where some of the user accounts will have elevated privileges. This is because we cannot let everyone with an account be able to execute all the functions that are available on our server.
To solve this problem, we put all the users’ accounts in a hierarchy system where their account’s designated role will determine the functions they will be able to execute.
To implement this, I first added a role field to the user schema, which gave me the ability to assign a level to every user. Each user’s role field can have one of three values: USER, MODERATOR, or ADMIN, where ADMIN has access to all the functions on our server. Every user account has the USER role by default.
This feature was implemented with the help of the GraphQL Shield package. I also got a lot of help understanding its implementation from this video by Ben Awad.
Checking whether a user is logged in and which role they have is done with the rule function that comes with GraphQL Shield, and looks like the following:
const isAuthenticated = rule()((_, __, { req }) => {
  return !!req.session.userId;
});

const isAdmin = rule()(async (_, __, { req }) => {
  const user = await User.findById(req.session.userId);
  return user && user.role === "ADMIN";
});

const isModerator = rule()(async (_, __, { req }) => {
  const user = await User.findById(req.session.userId);
  return user && user.role === "MODERATOR";
});
Currently, the role-to-function distribution looks like this:
export const authMiddleware = shield({
  Query: {
    // user queries
    getCurrentUser: isAuthenticated,
    getAllUsers: and(isAuthenticated, or(isAdmin, isModerator)),

    // journal queries
    getAllJournalsByUserId: isAuthenticated,
    getAllJournalsByCurrentUser: isAuthenticated,
  },

  Mutation: {
    // user mutations
    addMockUserData: and(isAuthenticated, isAdmin),
    logout: isAuthenticated,

    // journal mutations
    addMockJournalData: and(isAuthenticated, isAdmin),
    createJournal: isAuthenticated,
    updateJournal: and(isAuthenticated, or(isAdmin, isModerator)),
    deleteJournal: and(isAuthenticated, or(isAdmin, isModerator)),
  },
});
Currently it looks very simple, but its complexity will probably increase over time as more functions are added to our server.
Error Handling in Journal CRUD API

In our meetings, we discussed error handling and decided that the only errors handled on the backend will be the ones that need to fetch data from the database to verify something. The rest can be handled on the front-end with the help of a validation library.
Error handling was added to the createJournal and updateJournal mutations. In the createJournal mutation, if we create a new journal with an ISSN that is already present in the database, it responds with an error saying “A journal with the same ISSN already exists”. The same is true for the updateJournal mutation: we get the exact same error if we change an existing journal’s ISSN to one that is already present in the database. But if we try to update a journal that is not present in the database, it responds with an error saying “ISSN not available”.
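
The checks described above can be sketched with an in-memory store. This is only an illustration under stated assumptions: the Map stands in for the journals collection, and the parameter names issnToUpdate and newJournalDetails are hypothetical. The real resolvers would query the database instead.

```javascript
// In-memory stand-in for the journals collection, keyed by ISSN (assumption).
const journalsByIssn = new Map();

const createJournal = (journal) => {
  // This check needs stored data, so it belongs on the backend.
  if (journalsByIssn.has(journal.issn)) {
    throw new Error("A journal with the same ISSN already exists");
  }
  journalsByIssn.set(journal.issn, { ...journal });
  return journalsByIssn.get(journal.issn);
};

const updateJournal = (issnToUpdate, newJournalDetails) => {
  const existing = journalsByIssn.get(issnToUpdate);
  if (!existing) {
    throw new Error("ISSN not available");
  }

  // Changing the ISSN to one that belongs to another journal is a conflict.
  const newIssn = newJournalDetails.issn ?? issnToUpdate;
  if (newIssn !== issnToUpdate && journalsByIssn.has(newIssn)) {
    throw new Error("A journal with the same ISSN already exists");
  }

  const updated = { ...existing, ...newJournalDetails, issn: newIssn };
  journalsByIssn.delete(issnToUpdate);
  journalsByIssn.set(newIssn, updated);
  return updated;
};

createJournal({ issn: "12345678", title: "First" });
createJournal({ issn: "87654321", title: "Second" });

try {
  updateJournal("12345678", { issn: "87654321" });
} catch (error) {
  console.log(error.message); // A journal with the same ISSN already exists
}
```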
Difference between _id and id

While working on the project, I had some trouble understanding the difference between _id and id, and what the use cases for each are. I did a good amount of research on this topic. This Stack Overflow answer gives some clarity, explaining that _id returns a value of type ObjectId and id returns the string representation of that ObjectId.

_id will return a value like ObjectId("5349b4ddd2781d08c09890f3")
id will return a value like 5349b4ddd2781d08c09890f3
If we want to call a function like Model.findById() then we can directly pass in the id that is the string representation.
If we are looking something up by its id then we should simply use .findById() method and not .find().
id will not work with .find() like .find({ id: "5349b4ddd2781d08c09890f3" })
_id will work with .find() like .find({ _id: "5349b4ddd2781d08c09890f3" })

Making a Local Front-End for Testing Purposes

When I was working on the user authentication system, I needed a front-end that I could connect my backend to in order to test all the functions. So I made a very simple front-end using Next.js and MUI and added the register and login pages to it. Once the front-end was ready, I connected it to the journal policy backend and integrated the GraphQL resolvers with the front-end pages.
I wanted not only the login and register functions to work but also field-level error handling, which displays the error message under the particular field where the wrong data was entered. This is a very common practice nowadays. I also built this mock front-end because it should make things easier when we integrate our backend with the real front-end, since the same approach can be replicated there.
link to front-end repo
Mutations for adding Mock Data

Having a convenient way to add mock data to the database is crucial for testing our GraphQL resolvers.
While I was implementing pagination, I realized that I had no convenient way to populate the database with mock data to test the pagination feature.
Therefore I implemented two mutation functions to add mock data to our database. These functions are called addMockUserData and addMockJournalData. To implement these mutations, I used faker-js.
Faker is a popular library that generates fake (but reasonable) data that can be used for things such as:

Unit Testing
Performance Testing
Building Demos
Working without a completed backend

Config for generating mock user data:
const generateMockUser = () => {
  return {
    fullName: faker.fake("{{name.firstName}} {{name.lastName}}"),
    username: faker.internet.userName(),
    email: faker.internet.email(),
    role: faker.helpers.arrayElement(["USER", "MODERATOR"]),
    password: faker.internet.password(),
    createdAt: faker.date.past(),
  };
};
Config for generating mock journal data:
const generateMockJournals = (userId) => {
  return {
    title: faker.animal.cat(),
    url: faker.internet.url(),
    issn: faker.datatype.number({ min: 10000000, max: 99999999 }),
    domainName: faker.internet.domainName(),
    policies: {
      title: faker.animal.cat(),
      firstYear: faker.datatype.number({ min: 2000, max: 2010 }),
      lastYear: faker.datatype.number({ min: 2011, max: 2022 }),
      policyType: faker.fake("Number One"),
      isDataAvailabilityStatementPublished: faker.datatype.boolean(),
      isDataShared: faker.datatype.boolean(),
      isDataPeerReviewed: faker.datatype.boolean(),
      enforced: faker.fake("No - Not Enforced"),
      enforcedEvidence: faker.lorem.sentence(10),
    },
    createdAt: faker.date.past(),
    updatedAt: faker.date.past(),
    createdBy: faker.fake(userId),
  };
};
In the generateMockJournals function, we require a userId to generate the mock journals. As I explained above in the Integration of User and the Journal section, every journal has a createdBy field that needs to be populated with the userId of the user who created it. The userId passed to this function is assigned to the createdBy field of every mock journal that is generated.
FIN

At this point, the project is almost finished. I have completed the work that I mentioned in my GSoC proposal and finished writing the documentation. The only thing left now is to write tests for the project. In the forthcoming week, I will be writing a few tests and making sure that everything works well. At the end I will write a Final Report that will summarize and showcase all the work that I carried out during GSoC this year. I'm excited to get it up and running on a server and see it work. I want to thank my mentors Pritish and Yo for giving me this kind of leeway to work on this project. It was a lot of fun and I learned a lot during this time.