gdchamal/gist:0fcfa10a38b3b5eb2b31

## gistfile1.md

      
Caliopen Architecture Guide
Context

In this document, Caliopen refers to the technical messaging infrastructure
developed to achieve the aims of the Caliopen project.
Caliopen is designed as a scalable online platform for managing messages
across many protocols. Mail and its related protocols are the first implementation.
Security is a main concern, particularly the principles of personal data security.
Design

Caliopen is basically a three-layer architecture:


Storage layer, which uses a data engine and an index engine. Cassandra
and Elasticsearch are currently supported.


Core layer. This is where all the logic is built. This layer must be used
by all clients of Caliopen.


Protocol layers. Protocol-specific logic is built in these layers,
which use the core layer to manage data. The REST API, the LMTP MDA and SMTP
are such layers.
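The layering above can be illustrated with a minimal sketch. All class and method names here are hypothetical stand-ins and are not taken from the Caliopen code base; the storage is a plain in-memory dictionary rather than Cassandra or Elasticsearch. The point is only the direction of the dependencies: protocol layers call the core, and only the core calls storage.

```python
class Storage:
    """Stand-in for the storage layer (Cassandra and Elasticsearch in Caliopen)."""

    def __init__(self):
        self.rows = {}

    def put(self, user_id, key, row):
        self.rows[(user_id, key)] = row

    def get(self, user_id, key):
        return self.rows[(user_id, key)]


class Core:
    """Stand-in for the core layer: the only code that talks to storage."""

    def __init__(self, storage):
        self._storage = storage

    def add_message(self, *, user_id, message_id, subject):
        self._storage.put(user_id, message_id, {"subject": subject})

    def get_message(self, *, user_id, message_id):
        return self._storage.get(user_id, message_id)


class LmtpDelivery:
    """Stand-in for one protocol layer: accepts incoming mail."""

    def __init__(self, core):
        self._core = core

    def deliver(self, user_id, message_id, subject):
        self._core.add_message(user_id=user_id, message_id=message_id, subject=subject)


class RestApi:
    """Stand-in for another protocol layer: serves messages to clients."""

    def __init__(self, core):
        self._core = core

    def get_message(self, user_id, message_id):
        return self._core.get_message(user_id=user_id, message_id=message_id)


# Two protocol layers share one core, which owns the single storage instance.
core = Core(Storage())
LmtpDelivery(core).deliver("alice", "m1", "Hello")
print(RestApi(core).get_message("alice", "m1"))  # {'subject': 'Hello'}
```

Neither protocol class holds a reference to the storage object, so replacing the storage engine would only affect the core.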


Code repositories reflect this architecture. It is a protocol-component-oriented
architecture, not a microservice one.

All protocol layers must use state-of-the-art security mechanisms. For example,
whenever Caliopen is not interoperating with a third-party HTTP(S) service,
HTTPS ciphers must be restricted to a very high security level (an A+ grade
on the SSL Labs site).
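As an illustration of what restricting ciphers can look like, here is a minimal sketch using the Python standard library `ssl` module. The function name, the chosen minimum version and the cipher string are assumptions made for this example, not settings taken from Caliopen, and a restricted cipher list alone does not guarantee any particular SSL Labs grade, which also depends on how the service is deployed.

```python
import ssl


def make_server_context(certfile=None, keyfile=None):
    """Build a server-side TLS context limited to modern versions and ciphers."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse anything older than TLS 1.2 (assumed policy for this sketch).
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Forward-secret AEAD suites only. This string affects TLS 1.2;
    # TLS 1.3 suites are configured separately by OpenSSL.
    ctx.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)
    return ctx


ctx = make_server_context()
print(ctx.minimum_version)
```

The returned context can then be passed to whichever server component terminates HTTPS.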
Layers

Storage layer

All data is stored in a Cassandra cluster, encrypted with the user's main
encryption key whenever possible. Unencrypted data is stored in an index engine
to permit fast retrieval of information per user.
Models are related to only one user in most cases, and each user has their
own index containing the models that can be indexed. A user index must be
rebuildable at any time from the Cassandra data. Consequently, much updatable
data has to be updated in both Cassandra and Elasticsearch (tags are a good example).
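The dual-write and rebuild requirements can be sketched as follows. This is a minimal illustration with plain dictionaries standing in for Cassandra (the source of truth) and for the per-user Elasticsearch indexes; the function names and the tag fields are hypothetical.

```python
data_store = {}    # stand-in for Cassandra: (user_id, tag_name) -> tag
user_indexes = {}  # stand-in for Elasticsearch: user_id -> {tag_name: tag}


def save_tag(user_id, name, label):
    """Write a tag to the data store, then mirror it into the user's index."""
    tag = {"name": name, "label": label}
    data_store[(user_id, name)] = tag
    user_indexes.setdefault(user_id, {})[name] = dict(tag)


def rebuild_index(user_id):
    """Recreate one user's index from the data store alone."""
    user_indexes[user_id] = {
        name: dict(tag)
        for (owner, name), tag in data_store.items()
        if owner == user_id
    }


save_tag("alice", "work", "Work")
save_tag("alice", "family", "Family")
user_indexes.pop("alice")  # simulate a lost or corrupted index
rebuild_index("alice")
print(sorted(user_indexes["alice"]))  # ['family', 'work']
```

Because the index is derived data, losing it costs only the time needed to rebuild it; nothing is stored there that is not also in the data store.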
Core layer

This layer, and only this one, must be used to read from and write to storage.
All models have their equivalent in this layer, and these equivalents must be
used to manage the related data.
This layer must not call storage-specific object methods; that logic belongs
strictly to the storage layer (NB: this is not the current state of the code).
All inputs must be validated and cleaned beforehand. All methods in this layer
must be strictly declarative and must not accept anonymous arguments, especially
when requesting indexed data.
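One possible reading of "strictly declarative" is that every parameter is declared by name, with no `*args` or `**kwargs` passed through to the index. The sketch below follows that reading; the function name, the filter names and the limit bounds are assumptions made for the example and do not come from the Caliopen code.

```python
from typing import Optional


def build_message_query(
    *,
    user_id: str,
    thread_id: Optional[str] = None,
    tag: Optional[str] = None,
    limit: int = 20,
) -> dict:
    """Validate and clean the inputs, then return a query description."""
    user_id = user_id.strip()
    if not user_id:
        raise ValueError("user_id is required")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    query = {"user_id": user_id, "limit": limit}
    if thread_id is not None:
        query["thread_id"] = thread_id.strip()
    if tag is not None:
        query["tag"] = tag.strip().lower()
    return query


print(build_message_query(user_id=" u1 ", tag="Inbox"))
# {'user_id': 'u1', 'limit': 20, 'tag': 'inbox'}
```

Since the signature is keyword-only and has no catch-all parameter, a positional call or an undeclared filter fails with a `TypeError` before any index request is built.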
Protocols layer

This layer consists of many packages, one per protocol.
The protocols currently supported are:

HTTPS REST API
LMTP Mail Delivery Agent
SMTP for mail sending
vCard import and export for address book management

And more to come (XMPP, Twitter, Facebook, LinkedIn, ...)
Models

The storage layer defines many models. Most are related to only one user, but
some are shared by the whole platform.
User management and configuration

User

All users are stored using this model. This model is used for user authentication
and lookup.

Counter

This model stores all counters related to one user.
Tag

This model stores all tags defined by a user.

FilterRule

This model stores the filtering rules defined by one user.
The current design is not correct, because executable Python code is stored directly.
This was needed for the proof of concept; a better solution is required.
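To illustrate the kind of alternative that avoids stored executable code, here is a minimal sketch in which a rule is plain data interpreted by a fixed set of operators. This is only one possible direction and not a design chosen by the project; the rule fields, operator names and the `apply_rules` function are all hypothetical.

```python
# Fixed set of operators: a rule can only name one of these, never supply code.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "endswith": lambda actual, expected: actual.endswith(expected),
}


def apply_rules(message, rules):
    """Return the tags of every rule whose condition matches the message."""
    tags = []
    for rule in rules:
        test = OPERATORS[rule["operator"]]  # an unknown operator raises KeyError
        actual = str(message.get(rule["field"], ""))
        if test(actual, rule["value"]):
            tags.append(rule["tag"])
    return tags


rules = [
    {"field": "from", "operator": "endswith", "value": "@example.org", "tag": "work"},
    {"field": "subject", "operator": "contains", "value": "invoice", "tag": "billing"},
]
message = {"from": "alice@example.org", "subject": "hello"}
print(apply_rules(message, rules))  # ['work']
```

Rules in this form can be stored as ordinary rows and validated on input, which is harder to do for arbitrary code.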
Message management

RawMessage

Stores any message in raw format; one raw message can be related to many users.
No modification of a received or sent message may be made before
it is stored in this model.

Message

Message processed for one user. Only the data relevant for display
is stored in this model.

Thread

Threads related to one user. Every user message belongs to a new or
an existing thread.

MessageLookup

Lookup table (index) to retrieve a user message by its external ID.
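The relationship between RawMessage, Message and MessageLookup can be sketched as below. The field names and in-memory tables are hypothetical, and treating the external ID as something like a mail Message-ID header is an assumption made for the example.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class RawMessage:
    raw_id: str
    data: bytes  # stored exactly as received or sent


@dataclass
class Message:
    user_id: str
    message_id: str
    raw_id: str
    subject: str  # example of data kept for display


raw_messages = {}    # raw_id -> RawMessage, shared by all recipients
messages = {}        # (user_id, message_id) -> Message
message_lookup = {}  # (user_id, external_id) -> message_id


def deliver(raw, external_id, subject, user_ids):
    """Store one raw message, then one processed message per user."""
    raw_id = str(uuid.uuid4())
    raw_messages[raw_id] = RawMessage(raw_id, raw)
    for user_id in user_ids:
        message_id = str(uuid.uuid4())
        messages[(user_id, message_id)] = Message(user_id, message_id, raw_id, subject)
        message_lookup[(user_id, external_id)] = message_id
    return raw_id


def find_by_external_id(user_id, external_id):
    """Resolve an external ID to the user's own message, or None."""
    message_id = message_lookup.get((user_id, external_id))
    return None if message_id is None else messages[(user_id, message_id)]


raw_id = deliver(b"Subject: hi\r\n\r\nbody", "<abc@example.org>", "hi", ["alice", "bob"])
print(find_by_external_id("alice", "<abc@example.org>").subject)  # hi
```

Both recipients get their own Message, while the unmodified bytes are stored once and referenced through `raw_id`.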
Contact management

XXX