Context, problem, and forces

You are building a reactive, cloud-native system that is composed of bounded isolated components. Each component leverages value-added cloud services to implement one or more cloud-native databases that it owns exclusively. This isolation empowers your self-sufficient, full-stack team to rapidly and continuously deliver innovation with confidence.

This isolation also makes it challenging to work with data across components. First, making synchronous requests between components to retrieve data is problematic. Second, making multiple synchronous requests to retrieve and join data from multiple components is even more problematic. Inter-component synchronous requests increase latency and reduce responsiveness because of the additional overhead required to traverse the layers of a component to retrieve its data. Latency and responsiveness are further impacted when we have to join data across multiple components.

Leveraging asynchronous, non-blocking I/O to make these requests in parallel minimizes the impact, but it is still necessary to process the responses and stitch them together to achieve the desired join. This response processing logic is repeated over and over again, regardless of whether or not the responses have changed. The typical solution is to add a caching layer to each consuming component, but this approach has significant drawbacks. We have the added headache of maintaining the additional caching infrastructure, such as Redis or Memcached, plus the added logic to check the cache for hits and update the cache on misses. We must guard against stale cache entries, because the data does change, and we still incur added latency on cache misses and cold starts. All in all, caching adds a lot of complexity to an already complex situation.
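To make the stitching cost concrete, here is a minimal Python sketch of a consuming component that requests data from two other components in parallel and then joins the responses. The component names, the in-memory data, and the functions `fetch_customer`, `fetch_orders`, and `customer_with_orders` are all hypothetical stand-ins for real remote calls; they are assumptions for illustration only.

```python
import asyncio

# Hypothetical in-memory stand-ins for data owned by two other components.
CUSTOMERS = {"c1": {"id": "c1", "name": "Ada"}}
ORDERS = {"c1": [{"id": "o1", "total": 40}, {"id": "o2", "total": 2}]}


async def fetch_customer(customer_id: str) -> dict:
    await asyncio.sleep(0.01)  # simulated network latency
    return CUSTOMERS[customer_id]


async def fetch_orders(customer_id: str) -> list:
    await asyncio.sleep(0.01)  # simulated network latency
    return ORDERS.get(customer_id, [])


async def customer_with_orders(customer_id: str) -> dict:
    # Issue both requests concurrently, so we wait for the slower one
    # rather than for the sum of both.
    customer, orders = await asyncio.gather(
        fetch_customer(customer_id),
        fetch_orders(customer_id),
    )
    # The join itself still runs on every call, even when neither
    # response has changed since the previous call.
    return {
        **customer,
        "orders": orders,
        "order_total": sum(order["total"] for order in orders),
    }
```

Calling `asyncio.run(customer_with_orders("c1"))` performs both requests and the join every time it is invoked. Parallelism shortens the wait, but it does nothing to remove the repeated work.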

Synchronous inter-component communication also increases coupling and reduces resilience. At its worst, synchronous inter-component communication is coupled to the availability of the component being called. A component may be highly available most of the time, but when it experiences an outage, then the requesting components will need to handle this condition appropriately. The problem is further complicated when we are aggregating or joining requests across multiple components. We must account for the failure of any single request and include compensation logic for each and every request. This logic can easily become a maintenance issue. Service Discovery and Circuit Breakers are common tools for handling some of these issues with synchronous communication. However, much like caching, these tools add their own complexity and their own issues.
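The per-request failure handling described above might look something like the following Python sketch. It assumes a hypothetical dashboard that joins a required profile request with an optional recommendations request; `ComponentUnavailable`, `fetch_profile`, `fetch_recommendations`, and `load_dashboard` are illustrative names, and the recommendations call is hard-coded to fail in order to simulate an outage.

```python
import asyncio


class ComponentUnavailable(Exception):
    """Raised by a hypothetical client when the called component is down."""


async def fetch_profile(user_id: str) -> dict:
    return {"user_id": user_id, "name": "Ada"}


async def fetch_recommendations(user_id: str) -> list:
    # Simulate an outage in the component that owns recommendations.
    raise ComponentUnavailable("recommendations component is down")


async def load_dashboard(user_id: str) -> dict:
    # return_exceptions=True hands back failures as values, so one failed
    # request does not discard the results of the others.
    profile, recommendations = await asyncio.gather(
        fetch_profile(user_id),
        fetch_recommendations(user_id),
        return_exceptions=True,
    )
    degraded = False
    if isinstance(profile, Exception):
        # Assumed to be required: without it there is nothing to show.
        raise profile
    if isinstance(recommendations, Exception):
        # Assumed to be optional: compensate with an empty default.
        recommendations = []
        degraded = True
    return {
        "profile": profile,
        "recommendations": recommendations,
        "degraded": degraded,
    }
```

Every additional request in the join needs its own decision about whether it is required or optional and what its fallback should be, which is how this logic grows into a maintenance issue.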

At its best, synchronous inter-component communication is subject to competition for what amounts to a shared resource. This is the same problem we have at the database level, where we want to eliminate shared database clusters because they do not provide proper bulkheads. Here we have just moved the problem up to the component level, but the end result is similar. At the component level, this will manifest itself as throttling errors. The shared component will ultimately protect itself by implementing throttling that is consistent with its available capacity. Certainly, the capacity of a shared component can be increased and auto-scaled, but the allocation of capacity and throttling is not within the direct control of the consuming teams.
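As a rough illustration of this competition, the following Python sketch models a shared component that protects itself with a fixed request budget per time window. The class `SharedComponent`, the consumer names, and the capacity of three requests are assumptions chosen for the example, not a description of any particular cloud service.

```python
class ThrottledError(Exception):
    """Raised when the shared component has no capacity left in this window."""


class SharedComponent:
    def __init__(self, capacity_per_window: int) -> None:
        self.capacity = capacity_per_window
        self.used = 0

    def handle(self, consumer: str) -> str:
        # The component protects itself; it does not know or care which
        # consumer used up the capacity.
        if self.used >= self.capacity:
            raise ThrottledError(f"throttled: {consumer}")
        self.used += 1
        return "ok"

    def next_window(self) -> None:
        # Capacity is replenished at the start of each window.
        self.used = 0


def simulate() -> dict:
    shared = SharedComponent(capacity_per_window=3)
    outcomes = {"reporting": [], "checkout": []}
    # A chatty consumer arrives first and consumes the whole window.
    for _ in range(3):
        outcomes["reporting"].append(shared.handle("reporting"))
    # A second consumer is throttled through no fault of its own.
    try:
        outcomes["checkout"].append(shared.handle("checkout"))
    except ThrottledError as err:
        outcomes["checkout"].append(str(err))
    return outcomes
```

In this sketch the checkout consumer is rejected solely because the reporting consumer arrived first, and neither consuming team controls the capacity setting.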

The scalability of synchronous inter-component communication also comes into question. With multiple components competing for the resources of a shared component, that shared component becomes a bottleneck that can impact all the dependent components. It most certainly is possible to sufficiently scale shared components, but at what cost and with what complexity? The synchronous solution is certainly inefficient because we are repeating the same requests over and over again. We also need to increase team collaboration to communicate capacity requirements. Collaboration is generally not a bad thing, but we want to empower self-sufficient, full-stack teams, not impede them. Ultimately, bounded isolated components that communicate synchronously are not actually isolated. The bottom line is that synchronous inter-component communication is best avoided.