Resulting context

The primary benefit of this solution is that it provides proper bulkheads between components to make downstream components resilient to failures in upstream components. If an upstream becomes unavailable then it will have zero impact on downstream components. Queries will continue to return results from the materialized views. If an upstream component deploys a bug that produces invalid events, then the Stream Circuit Breaker pattern will handle these faults and the materialized views will continue to return the last known values. The data will eventually become consistent once the issue is resolved.

Materialized views are highly responsive. They are tuned to exactly the needs of the component and deployed using the most appropriate polyglot persistence. The performance of queries is not impacted by the calculations for joining data from multiple events because the calculations are only performed when the data changes and the result is stored in the view. A materialized view acts as a local cache that is always hot and always up to date with the last known values. It is still advised to set an appropriate cache-control header on the response of every query. The materialized view is a local resource that is owned by the component and thus no unnecessary layers must be traversed to retrieve the data. Retries due to capacity throttling are minimized since there is no inter-component competition for these resources, as they are not shared with other components.

Self-sufficient, full-stack teams are not reliant on other teams to define capacity requirements. A team is in full control of allocating the capacity for each and every one of its materialized views. Furthermore, as each materialized view is an independently provisioned cloud-native database, each view is tuned independently. A team must monitor and tune its capacity, but it is at liberty to do so.

Materialized views are the primary facilitator of global scalability. No matter how well we improve our ability to scale the other layers of a system, retrieving data from a database always has the final word on performance and scalability. For the longest time, we have only treated the symptoms by compensating at the higher layers with multiple layers of caching and through coarse-grained, database-wide replication. With cloud-native databases and materialized views, we are treating the root cause by turning the database inside out and ultimately turning the cloud into the database.

We are in essence scaling by replicating data at the fine-grained level of individual query definitions. This is actually pretty amazing when you stop and think about it, especially considering it is also cost effective. Each query definition leverages the most appropriate database type and has its own dedicated capacity that is not shared across components. It is updated with changes in near real time and its joins are pre-calculated, and it is further replicated across regions for optimized latency and regional failover.

A common complaint about some cloud-native databases and the use of materialized views is that the expressiveness of the queries is too limited and that managing the materialized views is too cumbersome. It is true that generic, monolithic databases, by design, make development easier. We can focus on getting the data into the database and then lazily create queries as we need them. However, this is what gets us into trouble. Queries have performance implications and each additional query begins to add up. Of course, there is a need to support ad hoc queries, but this should be the responsibility of specific analytics-oriented components. Individual components, however, have very specific query requirements. We, in essence, want to apply the single responsibility principle to each query definition. From the consumer's standpoint, a component-specific query should be responsive and provide simple filters. All the expressiveness of joins and groups should be pre-calculated. Our event streaming architecture provides the foundation for performing these calculations and creating materialized views. Next, we will cover some examples of techniques for joining and grouping events and handling events out of order.