@melvin15may
Last active August 28, 2017 00:39
GSoC 17 report

GraphSpace Notifications

Project Overview

Notifications are integral to web applications because they are how users keep track of important events. A notification system works best when it is decoupled from the main application flow, so that the responsibilities of the two are clearly separated. This project calls for three primary types of notification: Owner, Group, and Watching. Users automatically receive an owner notification when they create a resource (graph, layout, or group) on GraphSpace. Users automatically receive a group notification when they join a group or when a member of a group they belong to shares a resource. Lastly, users receive a watching notification for a graph or layout that they actively interact with but do not necessarily own; this type is sent only to users who have manually signed up for it.

To implement this notification system, I will use the asynchronous messaging service Apache Kafka. I considered RabbitMQ as an alternative, but its license, the Mozilla Public License (MPL), is incompatible with GraphSpace’s General Public License (GPL) v3. Because the system should be decoupled from the main application flow, the main application will act as the producer, creating the desired notifications and adding them to the messaging queue (see Figure 1). The producer will send different types of notification messages under different topics, and the consumer will add these notifications, along with their status and type (taken from the topic of the message), to the same PostgreSQL database used by GraphSpace. Another component of the project is implementing sockets in Django using the django-socketio package. A socket is one endpoint of a two-way communication link between two programs running on a network. It allows real-time, bidirectional communication between the web client and the server, so logged-in users can see notifications as they arrive without refreshing the page. Each notification will drop down from the top-right corner of the page the user is currently on.

A scheduled task, implemented using django-crontab, will email each user their unread notifications that have not been emailed before. Django-crontab lets us write cron logic in Python to schedule tasks. The tasks will be scheduled to run when the GraphSpace server is under less load; I will decide the exact criteria after discussion with my mentors and the GraphSpace team. Emails will be sent only to users who have opted in to this service. Notifications that have been emailed will be marked as such in the database so that they are not sent again. In addition to a notification button or icon on the user’s dashboard that displays the number of unread notifications, a webpage showing all notifications will be created and can be reached by clicking on the notification icon. There, a user can mark a notification as “read” or browse all notifications by read state, content, and type. The page will be populated by an API call to the notification application; the request attributes of that call act as filters that select notifications of a specific state, content, and type.

Summing up the requirements:

  • Implement the notification system using Apache Kafka as the asynchronous messaging service.

  • The notification system should serve three types of notification:

    • Owner
    • Group
    • Watching
  • Group large numbers of similar notifications to give the user a more concise view

  • Allow logged-in users to see notifications in real time

  • Schedule tasks to send notification emails to users who opt in to this service

  • Create a page that displays notifications and lets the user mark them as read

Work Completed/Results

All of the requirements that remained in scope were satisfied; the team decided against pursuing some of the original requirements while we were designing the system. The requirements that have been satisfied are:

  • Implement the notification system using Apache Kafka as the asynchronous messaging service.
  • Serve two of the three planned types of notification:
    • Owner
    • Group
  • Group large numbers of similar notifications to give the user a more concise view
  • Allow logged-in users to see notifications in real time
  • Schedule tasks to send notification emails to users who opt in to this service
  • Create a page that displays notifications and lets the user mark them as read

Screenshots of the implemented features:

  • Upload graph owner notification

owner_notification

  • Click on owner notification

notification_click

  • Mark as read

mark_read

  • Click on a grouped/clustered notification

bulk_notification_click

  • Real-time owner notification

real-time-notification-owner

  • Real-time group notification

real-time-group-notification

  • Share graph in a group

share-graph-group

  • Group member accessing shared graph

group-member-graph-access

  • Mark all read

mark-all-read

  • Share layout notification

share-layout-notification

  • Click shared layout group notification

shared-layout-group

  • Click delete shared layout notification

delete-shared-layout

Table 1: Trigger and type of notifications

| Action | Notification | Type of notification |
|---|---|---|
| User uploads a graph | New graph uploaded. | Owner |
| User creates a new layout | New layout created. | Owner |
| User creates a new group | New group created. | Owner |
| User deletes a graph | Graph deleted. | Owner |
| User deletes a group | Group deleted. | Owner |
| User deletes a layout | Layout deleted. | Owner |
| User updates a graph | Graph updated. | Owner |
| User updates a group | Group updated. | Owner |
| User shares graph in a group | Graph shared. | Group |
| User shares layout in a group | Layout shared. | Group |
| User un-shares graph from a group | Graph removed. | Group |
| User un-shares layout from a group | Layout removed. | Group |
| User adds new group member | New group member added. | Group |
| User removes group member | Group member removed. | Group |

Another requirement was added to the project later: creating a Docker image for GraphSpace. This meant writing a Dockerfile that lets users build the Docker image on their own system and run GraphSpace with a single command, instead of installing each service individually.

Watching notifications were not implemented because the team learned that most of the events this type would cover were already being notified through either Owner or Group notifications. This type would also have complicated the database schema and the GraphSpace architecture.

All the changes mentioned above have been submitted to the GraphSpace repository through pull request #315. (My fork of the GraphSpace repo: melvin15may/GraphSpace.) Other resources that have been added are as follows:

Dataflow

Diagram 1: Architecture diagram

dataflow

Four steps in the notification process are:

  • Notification creation
  • Notification consumption
  • Real-time notification delivery
  • Daily notification email

Notification creation

As noted above, two types of notification were implemented: Owner and Group. Both can be created with a single function call, send_message(), which is defined in the graphspace/producer.py file. Example usage is as follows:

import graphspace.producer as producer

# Publish an owner notification on the 'owner' topic.
producer.send_message(
    topic='owner',
    message={
        'owner_email': 'abc@example.com',
        'message': 'New graph XYZ uploaded.',
        'resource': 'graph',
        'resource_id': 1,
        'type': 'upload',
    },
)

In the call above, we create an owner notification for the upload of graph XYZ by user abc@example.com. The message is sent to the Kafka queue through the kafka-python producer.
We use Apache Kafka as the asynchronous messaging queue service. We considered RabbitMQ as an alternative, but its license, the Mozilla Public License (MPL), is incompatible with GraphSpace’s General Public License (GPL) v3. Kafka-python is the client used to send and receive messages from Django. Initially, confluent-kafka-python was the client of choice because of its speed in delivering messages, but after facing reliability issues we switched to the more stable kafka-python.
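For orientation, a function like send_message() could be written with kafka-python roughly as follows. This is a minimal sketch under stated assumptions, not the code in graphspace/producer.py: the broker address localhost:9092, the JSON encoding, the serialize_message() helper, and the optional producer parameter (added so the function can be exercised without a broker) are all assumptions.

```python
import json


def serialize_message(message):
    """Encode a notification dict as UTF-8 JSON bytes for Kafka."""
    return json.dumps(message, sort_keys=True).encode("utf-8")


def send_message(topic, message, producer=None):
    """Publish one notification on the given topic ('owner' or 'group').

    `producer` is a hypothetical injection point; when it is omitted, a
    kafka-python KafkaProducer is created for an assumed local broker.
    """
    if producer is None:
        # Imported lazily so this module loads even without kafka-python.
        from kafka import KafkaProducer
        producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send(topic, value=serialize_message(message))
    producer.flush()  # block until buffered messages have been sent
```

Passing the producer in makes it possible to reuse one connection across calls and to substitute a stub in tests; creating a new producer on every call, as the fallback branch does, would be wasteful in a real deployment.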

Notification consumption

To consume messages from the Kafka queue, we need a consumer that runs in parallel with the GraphSpace project. We create a consumer class that inherits from Python's Thread class; it is defined in the applications/notifications/consumer.py file. We use two objects of this class, one for consuming owner notifications and another for group notifications. These threads are created and started in the graphspace/asgi.py file, so in short they start when the Daphne server starts. Whenever a consumer gets a message from the Kafka queue, it calls add_owner_notification() or add_group_notification(), which creates a record in the owner_notification or group_notification table. If the consumer fails, it fails silently and does not affect the other applications, but there is currently no provision to restart it automatically. Leaving it stopped is intended to keep bugs from going undetected.
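The thread-per-topic pattern described above can be sketched as follows. This is an illustration, not the class in applications/notifications/consumer.py: the name NotificationConsumer, its constructor arguments, and the JSON decoding are assumptions. The record source is passed in as any iterable so the sketch runs without a broker.

```python
import json
import threading


class NotificationConsumer(threading.Thread):
    """Thread that hands every record from one topic to a handler."""

    def __init__(self, records, handler):
        super().__init__()
        self.daemon = True      # do not block process shutdown
        self.records = records  # iterable of objects with a bytes `.value`
        self.handler = handler  # e.g. add_owner_notification

    def run(self):
        # Runs until the record source is exhausted or the handler raises.
        for record in self.records:
            self.handler(json.loads(record.value.decode("utf-8")))
```

With kafka-python, `records` could be a `KafkaConsumer('owner', bootstrap_servers='localhost:9092')` instance (broker address assumed), since iterating over a KafkaConsumer yields records that carry the message bytes in `.value`. An uncaught exception in the handler ends run() and therefore the thread, without stopping the rest of the process, which corresponds to the no-restart behavior described above.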

Real-time notification delivery

Real-time delivery means a user can see the latest notification without refreshing the page. There are two indicators, depending on which page the user is on:

  • If the user is on the notification page, the latest notification appears in the "My Resource" table or in one of the group tables, depending on the type of notification.
  • If the user is on any other page, an indicator appears on the bell icon (similar to GitHub notifications).

This is implemented with Channels. As stated in the proposal, we initially planned to use django-socketio, but decided against it because it does not work on a WSGI server and lacks support for socket.io 0.8. Moreover, Channels is an official Django project. Channels does not work on a WSGI server either, so we use Daphne. To learn more about how this has been implemented, read this post from my blog. To summarize the dataflow: the client establishes a websocket connection through Daphne. The Apache server acts as a reverse proxy, redirecting all websocket connections to the Daphne interface server, which hands each task (sending a message, receiving a message, etc.) to the workers.
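The core idea behind real-time delivery, one named group of open connections per user with the server pushing a JSON text frame to every connection in that group, can be illustrated without Channels. The sketch below is a plain-Python stand-in, not Channels code and not the GraphSpace implementation: GroupRegistry, group_name(), the naming scheme, and the payload fields are all hypothetical, and each "connection" is simply a callable that receives text.

```python
import json
from collections import defaultdict


def group_name(user_email):
    """Derive a per-user group name (the naming scheme is an assumption)."""
    return "notifications-" + user_email.replace("@", "-at-")


class GroupRegistry:
    """Maps a group name to the send callables of its open connections."""

    def __init__(self):
        self._groups = defaultdict(list)

    def add(self, name, send):
        """Register a connection, e.g. when a websocket connects."""
        self._groups[name].append(send)

    def discard(self, name, send):
        """Drop a connection, e.g. when a websocket disconnects."""
        if send in self._groups[name]:
            self._groups[name].remove(send)

    def send(self, name, payload):
        """Push one JSON text frame to every connection in the group."""
        text = json.dumps(payload)
        for send in list(self._groups[name]):
            send(text)
```

In the real system, Channels' own group abstraction plays the role of GroupRegistry, and each websocket's reply channel plays the role of the send callable. Keying the group on the user rather than on the connection means that every open tab of the same user receives the notification.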

Daily notification email

This is an optional feature that a user can activate with the checkbox on the notification page. If it is activated, the user gets a daily email (currently sent at 00:00) containing the new notifications that the user has not seen or read. This is implemented using django-crontab and Django's mail feature. Django-crontab is used to run tasks at a specific time. We had initially decided on django-cron, but moved away from it because of the extra commands it requires (e.g. registering a crontab job that executes python manage.py runcrons to run the Python cron jobs).
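A possible shape for the daily job is sketched below. The CRONJOBS setting is how django-crontab registers jobs, as (schedule, dotted path) pairs, and "0 0 * * *" is the cron expression for 00:00 every day. Everything else is an assumption made for illustration: the dotted path, the function name pending_by_recipient(), and the representation of notifications as plain dicts using column names from the owner_notification table.

```python
from collections import defaultdict

# settings.py fragment. The dotted path to the job function is hypothetical.
CRONJOBS = [
    ("0 0 * * *", "applications.notifications.cron.send_daily_emails"),
]


def pending_by_recipient(notifications, opted_in):
    """Group unread, not-yet-emailed messages by recipient email.

    Only recipients listed in `opted_in` are included.
    """
    pending = defaultdict(list)
    for n in notifications:
        if n["owner_email"] not in opted_in:
            continue  # user has not ticked the email checkbox
        if n["is_read"] or n["is_email_sent"]:
            continue  # already seen, or already emailed once
        pending[n["owner_email"]].append(n["message"])
    return dict(pending)
```

With 'django_crontab' in INSTALLED_APPS, running python manage.py crontab add installs the jobs listed in CRONJOBS. The job itself would then send one email per recipient (for example with django.core.mail.send_mail) and set is_email_sent and emailed_at on the rows it covered.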

Database schema

owner_notification table

| Column name | Data type | Description |
|---|---|---|
| message | String | Message of the notification. E.g. Graph XYZ updated. |
| type | Enum (create,upload,update) | Type of notification |
| resource | Enum (graph,group,layout) | Type of notification's resource |
| resource_id | Integer | ID of the notification's resource |
| is_read | Boolean | Notification read status |
| owner_email | String | Foreign key to User table |
| created_at | Datetime | Date-time when notification was created |
| updated_at | Datetime | Date-time when notification was updated |
| is_email_sent | Boolean | Notification email status |
| emailed_at | Datetime | Date-time when notification was emailed |

group_notification table

| Column name | Data type | Description |
|---|---|---|
| message | String | Message of the notification. E.g. Graph XYZ shared. |
| type | Enum (share,unshare,remove,add) | Type of notification |
| resource | Enum (graph,group_member,layout) | Type of notification's resource |
| resource_id | Integer | ID of the notification's resource |
| group_id | Integer | ID of the notification's group |
| is_read | Boolean | Notification read status |
| owner_email | String | Email of creator of the notification |
| member_email | String | Foreign key to User table |
| created_at | Datetime | Date-time when notification was created |
| updated_at | Datetime | Date-time when notification was updated |
| is_email_sent | Boolean | Notification email status |
| emailed_at | Datetime | Date-time when notification was emailed |

Future Work

One aspect that can be worked on is the UI of the notification page. It is rudimentary and lacks any level of personalization. Improving the UI means starting from scratch and redesigning the entire page. A good example of personalization would be changing a notification message from, say, "Graph XYZ shared" to "Graph XYZ shared by you". This can be done by modifying the bootstrap-table formatter to detect whether the notification was triggered by the logged-in user.
