History
- NATS is a high-performance messaging system created by Derek Collison in
2010 written in Ruby.
- It was originally built to serve as the message bus for
Cloud Foundry
- Handling internal communication among components of the system
It Addresses
- Service Discovery
- Low latency communication
- Load balancing
- Notification and events handling
Overview
- Client publishes message on
foo
subject - Only clients subscribed to
foo
receive the message
How Simple!
INFO {"server_id":"NCNS32OEKOMMHTBVUYC7QWASFAP3BPEKGZG3QDIYRCBUL23WE3OVVBBO","server_name":"NCNS32OEKOMMHTBVUYC7QWASFAP3BPEKGZG3QDIYRCBUL23WE3OVVBBO","version":"2.1.7","proto":1,"git_commit":"bf0930e","go":"go1.13.10","host":"0.0.0.0","port":4222,"max_payload":1048576,"client_id":1,"client_ip":"172.17.0.1"}
SUB greetings 1
+OK
PUB greetings 12
Hello World!
+OK
MSG greetings 1 12
Hello World!
Request/Response
- Client connected to NATS publishes a request on a
help
subject - Worker client subscribed to
help
, processes requests, then sends
response - Audit client is subscribed to all subjects via a wildcard, so also
receives message but does not reply
- Publisher receieves the response from the worker client
Request/Response (cont'd)
- A publisher sends a message on a particular subject by attaching a reply
subject name along with it.
- Then, the subscribers who receive this message send the response to the
reply subject that was specified in the request message.
- Reply subjects have a unique name called
inbox
and the NATS server will send the message comming into these subjects to
the relevant publisher.
Queue
- Publisher sends multiple help requests on a
help
subject - Multiple worker clients subscribed to the
help
subject from a
distributed queue - Each published request is randomly balanced to only one of the worker
clients in the distributed queue
NATS As an Always Available Dial Tone
- The main design constraints that define the style of the NATS project are:
- Simplicity
- Performance
- Reliability
- It does not offer any Persistence or
Buffering
- It is true Fire and
Forget
Availability
- NATS will try to protect itself at all costs to be available for all
users
- NATS client libraries, internally try to have an always established
connection to one of the available NATS servers
- In case a server fails, NATS will reconnect to another available server
in the pool
- NATS supports high-availablity via a clustering mode that is set up as a
full-mesh of the servers
Slow Consumer
- By default, if a client fails to drain the pending data that the server is
holding for the client for over two seconds, the server will disconnect
the client
- Read more about it
here
Ping/Pong
- There is a PING/PONG interval happing that the client has to follow
otherwise the server will reset the connection in case there are many PONG
replies missing.
- Server sends PING messages to client in each interval and client has to
respond
- Client also can send PING messages to the server
Subject Names
- The dot character (.) has special meaning as part of the subject.
- When using it as part of the subject, we can create namespaces
(hierarchy) that can be later matched via wildcards.
Subject | Matches |
---|
foo.*.bar | foo.hello.bar ✅, foo.hi.bar ✅ |
foo.* | foo.hello ✅, foo.hi.bar ❌ |
foo.> | foo.hello.bar ✅, foo.hi.bar ✅, foo.hello ✅ |
*
: matches a single token>
: matches one or more tokens in the subject, and it can
only appear at the end
Queue Subscribtions: Hands-on
Worker 1 👷
INFO {"server_id":"NARXYLU3XEZSFYCFRV5WJ63GQS5WDV3WC4WAKCYDYY2IBGIETWJADZVS","server_name":"NARXYLU3XEZSFYCFRV5WJ63GQS5WDV3WC4WAKCYDYY2IBGIETWJADZVS","version":"2.1.8","proto":1,"git_commit":"c0b574f","go":"go1.14.8","host":"0.0.0.0","port":4222,"max_payload":1048576,"client_id":4,"client_ip":"172.22.0.1"}
SUB requests workers 5
+OK
MSG requests 5 6
second
Worker 2 👷
INFO {"server_id":"NARXYLU3XEZSFYCFRV5WJ63GQS5WDV3WC4WAKCYDYY2IBGIETWJADZVS","server_name":"NARXYLU3XEZSFYCFRV5WJ63GQS5WDV3WC4WAKCYDYY2IBGIETWJADZVS","version":"2.1.8","proto":1,"git_commit":"c0b574f","go":"go1.14.8","host":"0.0.0.0","port":4222,"max_payload":1048576,"client_id":5,"client_ip":"172.22.0.1"}
SUB requests workers 55
+OK
MSG requests 55 5
first
Publisher ✍️
INFO {"server_id":"NARXYLU3XEZSFYCFRV5WJ63GQS5WDV3WC4WAKCYDYY2IBGIETWJADZVS","server_name":"NARXYLU3XEZSFYCFRV5WJ63GQS5WDV3WC4WAKCYDYY2IBGIETWJADZVS","version":"2.1.8","proto":1,"git_commit":"c0b574f","go":"go1.14.8","host":"0.0.0.0","port":4222,"max_payload":1048576,"client_id":6,"client_ip":"172.22.0.1"}
PUB requests 5
first
+OK
PUB requests 6
second
+OK
Even More Features!
- Jetstream 🚀
- Clustering ✨
- Super-Cluster with Gateways 🗺️
Jetstream
- JetStream is a NATS built-in distributed persistence
system.
- JetStream is built-in to nats-server and you only need 1 (or 3 or 5 if you
want fault-tolerance against 1 or 2 simultaneous NATS server failures) of
your NATS server(s) to be JetStream enabled for it to be available to all
the client applications.
Functionalities enabled by JetStream
- Streaming: temporal decoupling between the publishers and subscribers
- Replay policies
- Retention policies and limits
- Persistent distributed storage
- Stream replication factor
- Mirroring between streams
- De-coupled flow control
- Exactly once semantics
Streaming
temporal decoupling between the publishers and subscribers
Subscribers only receive the messages that are published when they are
actively connected to the messaging system
(i.e. they do not receive messages that are published while they are not
subscribing or not running or disconnected).
However, nowadays a new way to provide this temporal de-coupling has been
devised and has become 'mainstream':
streaming.
Streams capture and store messages published on
one (or more) subject and allow client applications to create
subscribers (i.e. JetStream consumers) at any
time to replay (or consume) all or some of the
messages stored in the stream
Replay policies
JetStream consumers support multiple replay
policies.
- all of the messages currently stored in the
stream, meaning a complete replay and you can select
the replay policy(i.e. the speed of the replay) to be
either:
- instant
- the messages are delivered to the consumer as fast as it can take
them.
- original
- the messages are delivered to the consumer at the rate they were
published into the stream, which can be very useful for example for
staging production traffic.
- the last message stored in the stream, or the
last message for each subject (as streams can
capture more than on subject).
- starting from a specific sequence number.
- starting from a specific start time.
Retention policies and limits
Practically speaking, streams cannot always just keep growing forever and
therefore JetStream support multiple retention policies as well as the
ability to impose size limits on streams.limit
You can impose the following limits on a stream
- Maximum message age
- Maximum total stream size (in bytes)
- Maximum number of messages in the stream
- Maximum individual message size
- You can also set limits on the number of consumers that can be defined
for the stream at any given point in time.
You must also select a discard policy which specifies
what should happen once the stream has reached one of its limits and a new
message is published
- discard old means that the stream will
automatically delete the oldest message in the stream to make room for
the new messages.
- discard new means that new message is
discarded (and the JetStream publish call retuns an error indicating
that a limit was reached)
Retention policy
You can choose what kind of retention you want for each stream
- limits (the default)
- interest (messages are kept in the stream for
as long as there are consumers that haven't delivered the message yet).
- work queue (the stream is used as a shared
queue and messages are removed from it as they consumed)
Regardless of the retention policy selected, the limits (and the discard
policy) always apply.
Persistent distributed storage
You can choose the durability as well as the resilience of the message
store according to your needs.
- Memory storage
- File storage
- Replication (1 (none), 2, 3) between nats servers for Fault Tolerance
JetStream uses a NATS optimized
RAFT distributed quorom algorithm to distribute
the persistence service between nats servers in a cluster while
maintaining immediate consistency even in the face of Byzantine failures.
In JetStream the configuration for
storing messages is defined separately from how
they are consumed.
Storage is defined in a Stream and consuming messages is defined by
multiple Consumers.
Stream replication factor
A stream's replication factor (R, often referred to as the number
Replicas) determines how many places it is stored allowing you to tune to
balance risk with resource usage and performance.
A stream that is easily rebuilt or
temporary might be
memory based with a
R=1 and a stream that can
tolerate some downtime might be
file based with R=1.
Replicas | Description |
---|
1 | Cannot operate during an outage of the server servicing the stream.
Highly performant. |
2 | No significant benefit at this time. We recommend using Replicas=3
instead. |
3 | Can tolerate loss of one server servicing the stream. An ideal
balance between risk and performance. |
4 | No significant benefit over Replicas=3 except marginally in a 5 node
cluster. |
5 | Can tolerate simultaneous loss of two servers servicing the stream.
Mitigates risk at the expense of performance. |
Mirroring between streams
JetStream also allows server administrators to easily
mirror streams, for example between
different JetStream domains in order to offer disaster recovery.
You can also define a stream as one of the
sources for another stream.
De-coupled flow control
JetStream provides de-coupled flow control over streams, the flow control is
not 'end to end' where
the publisher(s) are limited to publish no faster than the slowest of all
the consumers (i.e. the lowest common denominator)
can receive, but is instead
happening individually between each client application (publishers or
consumers) and the nats server.
Exactly once semantics
Because publications to streams using the JetStream publish calls are
acknowledged by the server, the base quality of service offered by streams
is at least once.
Meaning that while
reliable and normally duplicate free
there are some specific failure scenarios:
- That could result in a
publishing application believing (wrongly) that
a message was not published successfully and therefore publishing it
again.
- That could result in a client application's consumption
acknowledgement getting lost and therefore the
message being re-sent to the consumer by the server.
Therefore, JetStream also offers an
exactly once.
- For the publishing side it relies on the publishing application
attaching a unique message or publication id in a message header and on
the server keeping track of those ids for a configurable rolling period
of time in order to detect the publisher publishing the smae message
twice.
- For the subscribers a
double acknowledgement mechanisim is used to
avoid a message being erroneously re-sent to a subscriber by the server
after some kinds of failures
Consumers
JetStream consumers are views on a stream, they are
subscribed to (or pulled) by client applications to receive copies of (or to
consume if the stream is set as a working queue) message stored in the strem.Fast push consumers
Client applications can choose to use fast un-acknowledged
push (ordered) consumers to receive messages as
fast as possible (for the selected replay policy) on a specific delivery
subject or to an inbox.
Those consumers are meant to be used to
replay rather than
consume
the messages in a stream.
Horizentally scalable pull consumers with batching
Client applications can also use and share
pull consumers that are demand-driven, support
batching and must explicitly acknowledge message reception and processing
which means that they can be used to consume (ie.e use the stream as a
distributed queue) as well as process the messages in a stream.
Pull consumers can and are meant to be shared between applications (just
like queue groups) in order to provide easy and transparent horizental
scalability of the processing or consumption of messages in a stream without
having (for example) to worry about having to define partitions or worry
about fault-tolerance.