@sijie
Created August 23, 2017 21:43
PIP-5: Add EventTime to Pulsar Message
  • Status: Initial Draft
  • Discussion Thread: N/A
  • Issue: #xxx

Motivation

Use cases such as stream processing need a timestamp in each message to work with. This timestamp is called event time: the time at which the event occurred, as opposed to publish time, the time at which the message was published. The event time is typically provided and set by the application.

To support these use cases, we propose adding an event time to Pulsar messages.

Changes

Public Interfaces

  • Add a method #setEventTime(long timestamp) to MessageBuilder.java
/**
 * Set the event time for a given message.
 * <p> Applications can retrieve the event time by calling `Message#getEventTime()`.
 * This field is useful for stream processing.
 * <p> Note: currently Pulsar doesn't support an event-time based index, so subscribers can't
 * seek to messages by event time.
 */
MessageBuilder setEventTime(long timestamp);
  • Add a method #getEventTime() to Message.java
/**
 * Get the event time associated with this event. It is typically set by the applications.
 * <p>If there isn't any event time associated with this event, it will return `-1`.
 */
long getEventTime();
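
To illustrate how the two proposed methods fit together, here is a minimal, self-contained sketch. `SketchMessage` and `SketchMessageBuilder` are hypothetical stand-ins written for this example, not the real Pulsar client classes; only `setEventTime`, `getEventTime`, and the `-1` default come from the proposal. The timestamp unit (epoch milliseconds) is also an assumption, since the proposal does not specify one.

```java
// Hypothetical stand-ins for the proposed API; NOT the real Pulsar client classes.
final class EventTimeApiSketch {

    /** Value returned by getEventTime() when no event time was set (per the proposal). */
    static final long NO_EVENT_TIME = -1L;

    /** Minimal immutable message carrying a payload and an optional event time. */
    static final class SketchMessage {
        private final byte[] data;
        private final long eventTime;

        SketchMessage(byte[] data, long eventTime) {
            this.data = data.clone();
            this.eventTime = eventTime;
        }

        byte[] getData() {
            return data.clone();
        }

        /** Mirrors the proposed Message#getEventTime(). */
        long getEventTime() {
            return eventTime;
        }
    }

    /** Minimal builder mirroring the proposed MessageBuilder#setEventTime(long). */
    static final class SketchMessageBuilder {
        private byte[] data = new byte[0];
        private long eventTime = NO_EVENT_TIME;

        SketchMessageBuilder setContent(byte[] content) {
            this.data = content.clone();
            return this;
        }

        SketchMessageBuilder setEventTime(long timestamp) {
            this.eventTime = timestamp;
            return this;
        }

        SketchMessage build() {
            return new SketchMessage(data, eventTime);
        }
    }

    public static void main(String[] args) {
        // Assumption: the application measures event time in epoch milliseconds.
        long occurredAt = 1503524580000L;

        SketchMessage withEventTime = new SketchMessageBuilder()
                .setContent("sensor-reading".getBytes())
                .setEventTime(occurredAt)
                .build();
        SketchMessage withoutEventTime = new SketchMessageBuilder()
                .setContent("no-timestamp".getBytes())
                .build();

        System.out.println("with event time: " + withEventTime.getEventTime());
        System.out.println("without event time: " + withoutEventTime.getEventTime());
    }
}
```

The builder defaults to `-1` so that a consumer can tell "no event time was set" apart from any non-negative timestamp an application supplies.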

Wire Protocol

We propose introducing an optional field called event_time in MessageMetadata.

message MessageMetadata {

    ...

    // The timestamp at which this event occurred. It is typically set by applications.
    // If this field is omitted, `publish_time` can be used for the purpose of `event_time`.
    // A 64-bit type is used so the field can carry the `long` value exposed by the Java API.
    optional int64 event_time = 12 [default = -1];

}
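
The field comment above says that `publish_time` can stand in for `event_time` when the field is omitted. A consumer-side sketch of that fallback might look like the following. `effectiveEventTime` is a hypothetical helper introduced for illustration and is not part of the proposal; it assumes an omitted field surfaces as the default `-1`.

```java
// Hypothetical helper illustrating the publish_time fallback; not part of the proposal.
final class EventTimeFallbackSketch {

    /** Default carried by event_time when the producer did not set it. */
    static final long NO_EVENT_TIME = -1L;

    /**
     * Returns the event time when it was set, otherwise falls back to the publish time.
     *
     * @param eventTime   value of event_time, or -1 if omitted
     * @param publishTime value of publish_time
     */
    static long effectiveEventTime(long eventTime, long publishTime) {
        return eventTime == NO_EVENT_TIME ? publishTime : eventTime;
    }

    public static void main(String[] args) {
        // Assumption: both timestamps use the same unit (for example, epoch milliseconds).
        long publishTime = 1503524580000L;

        // Producer set an event time earlier than the publish time.
        System.out.println(effectiveEventTime(1503524000000L, publishTime));

        // Producer omitted the event time, so the publish time is used instead.
        System.out.println(effectiveEventTime(NO_EVENT_TIME, publishTime));
    }
}
```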

Compatibility, Deprecation, and Migration Plan

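This change is backward compatible: event_time is an optional field, so existing producers and consumers that neither set nor read it are unaffected. No special migration plan is required.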

Not Covered

This proposal doesn't cover the following areas:

  • We don't provide any event-time index, which means subscribers can't rewind based on event time.

joefk commented Aug 23, 2017

I have conflicting views on this one. On the one hand, it is valuable. So we should have a property like this. On the other hand, event generation happens completely outside Pulsar. At most, we can consider events that occur within the client library as falling within Pulsar. So I think this should go as a generic metadata, driven by getKey() and setKey()
