You're the first data engineer and find yourself in the following scenario:
Your company has three user-facing clients: Web, iOS, and Android. Your data science team is interested in analyzing the following data:
- Support messages
- Client interactions (clicks, touches, how users move through the app, etc.)
The data scientists need to be able to join these two data streams on a common user_id to perform their analysis. Currently, support messages go to a service owned by the backend team: they arrive through standard HTTP endpoints and are written to PostgreSQL. You will be responsible for the service that receives the client interactions.
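To make the join requirement concrete, here is a minimal sketch in Python. It assumes both streams have already been loaded as lists of dictionaries. The sample values and the join_on_user_id helper are hypothetical; only user_id, the message body, and the event and target fields (named in Q1 below) come from the scenario.

```python
from collections import defaultdict

# Hypothetical sample records. Only user_id, body, event, and target
# are taken from the scenario; the values are made up for illustration.
support_messages = [
    {"user_id": 1, "body": "I can't log in"},
    {"user_id": 2, "body": "How do I export my data?"},
]
client_interactions = [
    {"user_id": 1, "event": "click", "target": "login button"},
    {"user_id": 1, "event": "click", "target": "forgot password link"},
    {"user_id": 3, "event": "click", "target": "signup button"},
]


def join_on_user_id(messages, interactions):
    """Inner-join two lists of records on their shared user_id key."""
    # Index interactions by user so each message lookup is O(1).
    by_user = defaultdict(list)
    for interaction in interactions:
        by_user[interaction["user_id"]].append(interaction)

    joined = []
    for message in messages:
        # Users with no interactions produce no rows (inner-join semantics).
        for interaction in by_user.get(message["user_id"], []):
            joined.append({**message, **interaction})
    return joined


joined = join_on_user_id(support_messages, client_interactions)
```

In this sketch only user 1 appears in both streams, so only that user's rows survive the join. Whether unmatched users should be kept (an outer join) is a design choice the data scientists would need to weigh in on.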
Q1: Knowing that you're going to be in charge of getting this data into some sort of downstream data store, what would your schemas look like? The only hard requirements are that support messages must include the message body and that client interactions must include event and target fields to represent actions like click on login button
and t