Uncontrolled data flowing through an application can blow the runtime to bits. When a component or service in a system cannot keep up with its load, it can fail catastrophically because work piles up in memory. Since the component cannot keep up and must not fail, it should tell upstream components that it is under stress so that they slow down. The backpressure may cascade all the way to the user. (The component that receives commands from the user can only slow the user down by returning an error such as "system busy".) At this point the experience of some users will suffer, but the system as a whole stays more resilient and responsive. In other words, backpressure lets a component or service apply the brakes on incoming data by controlling the producer. Backpressure is a key concept in reactive programming.
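
As a minimal sketch of the "system busy" idea above, the following uses a bounded queue from the JDK so that a full buffer is reported to the caller instead of growing in memory. The class name `BusySignalDemo`, the method names, and the capacity of 2 are assumptions made for illustration only.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical service front end: accepts commands into a bounded buffer.
public class BusySignalDemo {
    // The capacity is the most pending work we agree to hold in memory.
    private final BlockingQueue<String> pending;

    public BusySignalDemo(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    /** Called by the upstream component; never blocks and never grows memory. */
    public String submit(String command) {
        // offer() returns false immediately when the queue is full.
        return pending.offer(command) ? "accepted" : "system busy";
    }

    /** Called by the worker when it is ready for the next command. */
    public String takeNext() {
        return pending.poll(); // null when nothing is pending
    }

    public static void main(String[] args) {
        BusySignalDemo service = new BusySignalDemo(2);
        System.out.println(service.submit("a")); // accepted
        System.out.println(service.submit("b")); // accepted
        System.out.println(service.submit("c")); // system busy
        service.takeNext();                      // worker frees one slot
        System.out.println(service.submit("d")); // accepted
    }
}
```

The caller that receives "system busy" is expected to slow down or retry later, which is how the pressure travels upstream.
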
When the producer emits tasks faster than the consumer can process them, the following strategies can be used to handle the overproduced items:
- Autoscaling can be used for horizontally scaling the consumer.
- Note that while stateless application servers can scale horizontally, they may interact with services or stateful databases that have connection, processing-capacity, or rate limits. Read more here: https://cloud.google.com/blog/products/serverless/6-strategies-for-scaling-your-serverless-applications . In a nutshell, not all things scale the same way.
- Buffer overproduced tasks (accumulate incoming task spikes temporarily).
- Note that unbounded buffers that hold tasks in memory are a common source of out-of-memory crashes in servers. Buffering only helps with temporary spikes.
- Drop the overproduced tasks.
- Control the producer. The following strategies can be used to control the producer:
- Pull - With pull-based streams, the consumers control the producers. The consumer/subscriber decides when and how many tasks it is able and willing to receive. In reactive terminology such a producer is called a cold observable. Cold observables are ideal for the reactive pull model.
- The pulling consumer can control whether the producer should buffer or drop the overproduced items. For example, in RxJava the onBackpressureBuffer() operator is used for buffering, and onBackpressureLatest() can be used for dropping (it keeps only the most recent item). In RxJava, if the producer drops items because of overproduction, it can send an exception signal after sending all tasks. Hence the consumer knows that items/tasks were dropped (and can scale out).
- consumer wants buffering on backpressure
- consumer wants dropping on backpressure.
- Attribution: Diagrams from RxJava's wiki
- Push - With push-based streams, the producer is in control and pushes data to the consumer when it’s available. Such a producer is called a hot observable in reactive terminology. The producer/observable emits at its own pace, and consumers/observers must keep up.
- Limited Push - The publisher can send only a limited number of items to the client at once.
- Since the Observable/producer pushes data rather than having it pulled by the consumer, the following flow control strategies can be used:
- Buffer - Periodically gather items emitted by an Observable into bundles and emit these bundles rather than emitting the items one at a time. That is, the Buffer operator transforms an Observable that emits items into an Observable that emits buffered collections of those items.
- Attribution : http://reactivex.io/documentation/operators/buffer.html
- Sample - Emit the most recent item emitted by an Observable within periodic time intervals.
- Debounce - Only emit an item from an Observable if a particular timespan has passed without it emitting another item.
- Window - Periodically subdivide items from an Observable into Observable windows and emit these windows rather than emitting the items one at a time. Read more here: http://reactivex.io/documentation/operators/window.html
- Cancel the data stream when the consumer cannot process more events.
- In this case, the receiver can abort the transmission at any time and subscribe to the stream again later.
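
The pull model described in the list above can be sketched with the JDK's `java.util.concurrent.Flow` API (Java 9+), where the subscriber states its demand through `request(n)`. This is not RxJava; it is a stand-in chosen so the example runs with the standard library alone. The class names `PullDemo` and `OneAtATime` and the helper `run` are assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class PullDemo {
    /** Hypothetical subscriber that pulls one item at a time via request(1). */
    static class OneAtATime implements Flow.Subscriber<Integer> {
        final List<Integer> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);            // initial demand: exactly one item
        }
        @Override public void onNext(Integer item) {
            received.add(item);      // "process" the item
            subscription.request(1); // ask for the next one only when ready
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    /** Publishes 1..count and returns what the subscriber received. */
    public static List<Integer> run(int count) {
        OneAtATime subscriber = new OneAtATime();
        // Closing the publisher (end of try block) signals onComplete.
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i); // blocks while the subscriber's buffer is full
            }
        }
        try {
            subscriber.done.await(); // wait until all items were delivered
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // [1, 2, 3, 4, 5]
    }
}
```

Here `submit` makes the producer wait when the subscriber falls behind. `SubmissionPublisher` also offers an `offer` method with a drop handler, which would be the analogue of the dropping strategy.
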