Tag Archives: architecture

Decentralization: Be Unstoppable and Ungovernable

The truckers’ freedom convoy in Canada has revealed how individuals are vulnerable to tyrannical (rights-violating) actions by governments and corporations cooperating with authoritarian diktats across jurisdictional boundaries. Maajid Nawaz warns of totalitarian power over the populace using a social credit system imposed via central bank digital currency (CBDC) regimes being developed to eliminate cash. “Programmable” tokens will give the state power to control who may participate in financial transactions, with whom, when, for what, and how much. Such a regime would enable government tyranny to reign supreme over everyone and across everything within its reach.

Centralized dictatorial power is countered by decentralization, especially that which is designed into technology to become unchangeable by humans after the technology proliferates. The design principle is known as Code is law. The Proof of Work (PoW) consensus algorithm in Bitcoin is one such technology. CBDC is an attempt to prevent Bitcoin from becoming dominant. Criticism of PoW using too much electricity is another tactic. National and supranational powers (above nation states) are working against decentralization in order to preserve their dominance. The World Economic Forum (WEF) is installing its people into national legislatures and administrations to enact policies similar to those of the Chinese Communist Party (CCP) to concentrate globalized power for greater centralization of control.

We look toward Web3 and beyond to enable decentralization of digital services. As we explore decentralized applications, we must consider the intent behind distributed architectures for decentralization. What do we want from Web3?

Unstoppable Availability

Traditionally, we think about availability with regard to failure modes in the infrastructure, platform services, and application components. Ordinarily, we do not design for resiliency to the total loss of infrastructure and platform, because we don’t consider our suppliers to be potentially hostile actors. However, global integration and the unholy nexus of multinational corporations with foreign governments now allow extrajudicial punishments to be imposed on individuals who reside outside the jurisdiction of governments hostile to their cause, putting them within the reach of unjust laws and tyrannical diktats of authoritarians. It is clear now that this is one of the greatest threats that must be mitigated.

Web3 technologies, such as blockchain, grew out of recognition that fiat is the enemy of the people, and we must decentralize by becoming trustless and disintermediated. Eliminate single points of failure everywhere. Run portably on compute, storage, and networking that are distributed across competitive providers in adversarial jurisdictions across the globe without cooperation. When totalitarianism comes, Bitcoin is the countermove. Decouple from centralized financial systems, including central banking and fiat currencies. Become unstoppable and ungovernable, resistant to totalitarianism.

To become unstoppable, users need to gain immunity from de-platforming and supply chain disruption through decentralization. Users need to be able to keep custody of their own data. Users need to self-host the application logic that operates on their data. Users need to compose other users’ data for collaboration without going through intermediaries (service providers who can block or limit access). Then, to achieve resiliency, users need to be able to migrate their software components to alternative infrastructure and platform providers, while maintaining custody of their data across providers. At a minimum, this migration must be doable by performing a human procedure with some acceptable interruption of service. Ideally, the possible deployment topologies would have been pre-configured to fail-over or switch-over automatically as needed with minimal service disruption. Orchestrating the name resolution, deployment, and configuration of services across multiple heterogeneous (competitive) clouds is a key ingredient.
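The fail-over idea above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not a real orchestration tool: the provider names, the `Deployment` class, and the `is_healthy` flag are all hypothetical, and a real system would probe endpoints over the network and update name resolution when it switches over.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    """A copy of the user's service on one provider (names are hypothetical)."""
    provider: str
    endpoint: str
    is_healthy: bool = True  # stand-in for a real health probe


def select_active(deployments: List[Deployment]) -> Optional[Deployment]:
    """Return the first healthy deployment in priority order, or None."""
    for deployment in deployments:
        if deployment.is_healthy:
            return deployment
    return None


# Pre-configured topology: the same service deployed with competing providers.
topology = [
    Deployment("provider-a", "https://a.example/app"),
    Deployment("provider-b", "https://b.example/app"),
    Deployment("self-hosted", "https://home.example/app"),
]

primary = select_active(topology)

# Simulate de-platforming by the first provider; traffic switches over.
topology[0].is_healthy = False
fallback = select_active(topology)
```

The point of the sketch is that the list of alternatives exists before the outage. The user's data custody is assumed to be handled separately, so that every deployment in the list can reach the same data.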

Custody of data means that the owner must maintain administrative control over its storage and access, as well as having the option of keeping a copy of it on physical hardware that the owner controls. Self-hosting means that the owner must maintain administrative control over the resources and access for serving the application functions to its owner, and for that hosting to be unencumbered and technically practical to migrate to alternative resources (computing, financial, and human).

If Venezuela can be blocked from “some Ethereum services”, that is a huge red flag. Service providers should be free to block undesirable users. But if the protocol and platform enables authorities to block users from hosting and accessing their own services, then the technology is worthless for decentralization. Decentralization must enable users to take their business elsewhere.

Ungovernable Privacy

Privacy is a conundrum. Users need a way to identify themselves and authenticate themselves to exert ownership over their data and resources. Simultaneously, a user may have good reason to keep their identity hidden, presenting only a pseudonym or remaining cloaked in anonymity in public, where appropriate. Meanwhile, governments are becoming increasingly overbearing in their imposition of “Know Your Customer” (KYC) regulations on businesses ostensibly to combat money laundering. This is at odds with the people’s right to privacy and being free from unreasonable searches and surveillance. Moreover, recruiting private citizens to spy on and enforce policy over others is commandeering, which is also problematic.

State actors have opposed strong encryption, and they have sought to undermine cryptography by demanding government access to backdoors. Such misguided, technologically ignorant, and morally bankrupt motivations disqualify them from being taken seriously, when it comes to designing our future platforms and the policies that should be applied. Rights are natural (a.k.a. “God-given” or inalienable), and therefore they (including privacy) are not subject to anyone’s opinion regardless of their authority or stature. Cryptographic technology should disregard any influence such authorities want to exert, and design for maximum protection of confidentiality, integrity, and availability. Do not comply. Become ungovernable.

Composability

While the capabilities and qualities of the platform are important, we should also reconsider the paradigm for how we interact with applications. Web2 brought us social applications for human networking (messaging, connecting), media (news, video, music, podcasts), and knowledge (wikis). With anything social, group dynamics invariably also expose us to disharmony. Web2 concentrated power into a few Big Tech platforms; the acronym FAANG was coined to represent Facebook (now Meta), Amazon, Apple, Netflix, and Google (now Alphabet). With centralized control comes disagreement over how such power should be wielded as well as corruption and abuse of power. It also creates a system that is vulnerable to indirect aggression, where state actors can interfere or collude with private actors to side-step Constitutional protections that prohibit governments from certain behaviors.

David Sacks speaks with Bari Weiss about Big Tech’s assault on free speech and the hazard of financial technologies being used to deny service to individuals, as was done to the political opponents of Justin Trudeau in Canada in response to the freedom convoy protests.

Our lesson, after enduring years of rising tension in the social arena and culminating in outright tyranny, is that centralized control must disappear. Social interactions and all forms of transactions must be disintermediated (Big Tech must be removed as the middlemen). The article Mozilla unveils vision for web evolution shows Mozilla’s commitment to an improved experience from a browser perspective. However, we also need a broader vision from an application (hosted services) perspective.

The intent behind my thoughts on Future Distributed Applications and Browser based capabilities is composability. The article Ceramic’s Web3 Composability Resurrects Web 2.0 Mashups talks about how Web2 composability of components enabled mashups, and it talks about Web3 enabling composability of data. The focus is shifting from the ease of developing applications from reusable components to satisfying the growing needs of end users. Composability is how users with custody of their own data can collaborate with one another in a peer-to-peer manner to become social, replacing centralized services with disintermediated transactions among self-hosted services. The next best alternative to self-hosting is enabling users to choose between an unlimited supply of community-led hosted services that can be shared by like-minded mutually supportive users. The key is to disintermediate users from controlling entities run by people who hate them.

State of Technology

The article My First Web3 Webpage is a good introduction to Web3 technologies. This example illustrates some very basic elements, including name resolution, content storage and distribution, and the use of cryptocurrency to pay for resources. It is also revealing of how rudimentary this stuff is relative to the maturity of today’s Web apps. Web3 and distributed apps (dApps) are extremely green. Here is a more complicated example. Everyone is struggling to understand what Web3 is. Even search is something that needs to be rethought.

The article Why decentralization isn’t the ultimate goal of Web3 should give us pause. Moxie Marlinspike, Jack Dorsey, Marc Andreessen, and other industry veterans are warning us about the current crop of Web3 technologies being fraudulent and conflicted. Vitalik Buterin’s own views confess that the technology may not be going in the right direction. Ethereum’s deficiencies are becoming evident. This demands great caution and high suspicion.

Here is a great analysis of the critiques against today’s Web3 technologies. It is very clarifying. One important point is the ‘mountain man fantasy’ of self-hosting; no one wants to run their own servers. The cost and burden of hosting and operating services today is certainly prohibitive.

Even if the mountain man fantasy is an unrealistic expectation for the vast majority, so long as the threat of deplatforming and unpersoning is real, people will have a critical need for options to be available. When Big Tech censors and bans, when the mob mobilizes to ruin businesses and careers, when tyrannical governments freeze bank accounts and confiscate funds, it is essential for those targeted to have a safe haven that is unassailable. Someone living in the comfort of normal life doesn’t need a cabin in the woods, off-grid power, and a buried arsenal. But when you need to do it, living as a mountain man won’t be fantastic. Prepping for that fallback is what decentralization makes possible.

In the long term, self-hosting should be as easy, effortless, and affordable as installing desktop apps and mobile apps. We definitely need to innovate to make running our apps as cloud services cheap, one-click, and autonomous, before decentralization with self-hosting can become ubiquitous. Until then, our short-term goal should be to at least make decentralization practical, even if it is only accessible initially to highly motivated, technologically savvy early adopters. We need pioneers to blaze the trail in any new endeavor.

As I dive deeper into Web3, it is becoming clear the technology choices lean toward Ethereum blockchain to the exclusion of all else. Is Ethereum really the best blockchain to form a DAO? In Ethereum, writing application logic is expected to be smart contracts. Look at the programming languages available for smart contracts. Even without examining any of these languages, my immediate reaction is revulsion. Who would want to abandon popular general purpose programming languages and their enormous ecosystems? GTFO.

We need a general purpose Web architecture for dApps that are not confined to a niche. I imagine container images served by IPFS as a registry, and having a next-gen Kubernetes-like platform to orchestrate container execution across multicloud infrastructures and consuming other decentralized platform services (storage, load balancing, access control, auto-scaling, etc.). If the technology doesn’t provide a natural evolution for existing applications and libraries of software capabilities, there isn’t a path for broad adoption.

We are early in the start of a new journey in redesigning the Web. There is so much more to understand and invent, before we have something usable for developing real-world distributed apps on a decentralized platform. The technology may not exist yet to do so, despite the many claims to the contrary. This will certainly be more of a marathon, rather than a sprint.

Browser based capabilities

One approach to better empowering users and upstart services to avoid Big Tech censorship, suppression, and control is to build capabilities into the browser for mashing up and mixing in complementary services. This would provide a client-side (browser based) approach for third party complementary services to extend incumbent services without needing the incumbent’s authorization or cooperation. This would be one element of building Future Distributed Applications.

Using this approach, social media sites (Facebook, Instagram, Twitter, YouTube, Reddit, etc.) that enforce authoritarian content moderation policies can be complemented by alternative services, where prohibited users and comments can be linked. Users could see the conversation with content merged from every desired source beyond Big Tech control. This approach for distributing comments that form a single conversation would be applicable to many services.

  • Comment on content where a user’s comments would be suppressed.
  • Annotate or review an article where commenting is not enabled. Allow an annotation to link precisely to a specific range of text, so it can be presented inline.
  • Add links to relevant content not referenced by the original.

This paradigm would enable end users to control how content is consumed, so that Web sites cannot censor or bias what information is presented about controversial topics.
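To make the merged conversation concrete, here is a rough sketch of what a browser add-on might do on the client side. Everything in it is an assumption for illustration: the `Comment` fields, the source names, and the idea that each service exposes its comments as a simple list. No existing add-on API is implied.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Comment:
    source: str      # which service hosts the comment (hypothetical names)
    author: str
    posted_at: int   # seconds since the epoch
    text: str


def merge_conversation(
    feeds: Iterable[List[Comment]],
    muted_authors: Set[str],
) -> List[Comment]:
    """Combine comments from every chosen source into one timeline.

    The reader, not the hosting site, decides which sources are included
    and which authors are hidden.
    """
    merged = [c for feed in feeds for c in feed if c.author not in muted_authors]
    return sorted(merged, key=lambda c: c.posted_at)


incumbent_feed = [
    Comment("incumbent", "alice", 100, "Great article."),
    Comment("incumbent", "troll", 130, "Spam spam spam."),
]
alternative_feed = [
    Comment("alternative", "bob", 120, "A point the site removed."),
]

thread = merge_conversation([incumbent_feed, alternative_feed], muted_authors={"troll"})
```

Both the list of feeds and the mute list belong to the reader. The incumbent site never sees either one, which is why its cooperation is not needed.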

Applying browser add-ons that mix-in complementary services would also enable end users to take information and process it in personalized ways, such as for fact-checking, reputation, rating, gaining insights through analytics, and discovering related (or contrarian) information. Complementary content could be presented by injecting HTML, or by rendering additional layers, frames, tabs, or windows, as appropriate.

Browser add-ons are mostly supported only on the desktop; support on mobile browsers is limited at best. Mobile devices would need to be supported for this paradigm to become broadly useful.

Future Distributed Applications

Big Tech censorship and cancel culture are becoming intolerable. Politicization of business is destroying the fabric of society. Corporate oligarchs are implementing partisan agendas to shape public discourse by applying so-called “community standards” for social media content moderation. They de-platform personalities who express opinions that run counter to approved narratives. They silence dissent. Free speech and freedom of association are under threat, as private companies are coerced by state regulatory action, looming threats of state intervention, and mob rule through heckler’s veto, bullying, harassment, doxxing, and cancel culture. Concentration of power and control in a few dominant platforms, such as Google, YouTube, Facebook, Twitter, Wikipedia, and their peers has harmed consumer choice. Anti-competitive behavior, such as collusion among platform and infrastructure services to deny service to competitive upstarts and undesirable non-conformists, has suppressed alternatives like Parler, Gab, and BitChute.

The current generation of dominant platforms does not allow editorial control to be retained by content creators. The platform is viewed as the ultimate authority, and users are limited in their ability to assert control to form self-moderated communities and to set their own community standards. Control is asserted by the central platform authorities.

Control needs to be decoupled from centralized platform authorities and put back in the hands of content creators (authors, podcasters, video makers) and end users (content consumers and social participants). Editorial control over legal content does not belong with Big Tech. What constitutes legal content is dependent on the user’s jurisdiction, not Big Tech’s harmonization of globalist attitudes. To Americans, hate speech is protected speech, and it needs to be freely expressible. Similarly, users in other jurisdictions should be governed according to their own standards.

We need to develop apps with peer-to-peer protocols and end-to-end encryption to cut out the middlemen, which will exterminate today’s generation of social media companies. Better yet, application logic itself should be deployable on user-controlled compute with user-controlled encrypted storage on any choice of infrastructure providers (providing a real impetus for the adoption of Edge Computing), so that centralized technology monopolies cannot dominate as they do today. This approach needs to be applied to decentralize all apps, including video, audio podcasts, music, messaging, news, and other content distribution.

I believe the next frontier for the Internet will be the development of a generalized approach on top of HTTP or as an adjunct to HTTP (like BitTorrent) to enable distributed apps that put app logic and data storage at end-points controlled by users. This would eliminate control by middlemen over what content can be created and shared.

Applications must be distributed in a topology where a node is dedicated to each user, so that the user maintains control over the processing and data storage associated with their own content. Applications must be portable across cloud infrastructures available from multiple providers. A user should be able to deploy an application node on any choice of infrastructure provider. This would enable users to be immune from being de-platformed.

With an application whose logic and data are distributed in topology and administrative control, the content should be digitally signed so that it can be authenticated (verified to be produced by the user who owns it). This is necessary, so that a user’s application node can be moved to an alternative infrastructure (compute and storage) without other application nodes needing to establish any form of trust. Consumers (the audience with whom the owner shares content) and processors (other computational services that may operate on the content) of the information would be able to verify that the information is authentic, not forged or tampered with. The relationship between users and among application nodes, as well as processors, is based on zero trust.
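A small sketch may help show what "digitally signed so that it can be authenticated" means in practice. I am deliberately swapping in a Lamport one-time signature here, only because it can be written with nothing beyond Python's standard library; it is not a recommendation, and a real platform would more likely use a mainstream scheme such as Ed25519 through a vetted library. The function names and the sample message are my own for illustration.

```python
import hashlib
import secrets
from typing import List, Tuple

Pair = Tuple[bytes, bytes]


def _digest_bits(message: bytes) -> List[int]:
    """Return the 256 bits of SHA-256(message), most significant bit first."""
    digest = hashlib.sha256(message).digest()
    return [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]


def generate_keypair() -> Tuple[List[Pair], List[Pair]]:
    """Create a one-time key: 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(hashlib.sha256(a).digest(), hashlib.sha256(b).digest()) for a, b in private]
    return private, public


def sign(message: bytes, private: List[Pair]) -> List[bytes]:
    """Reveal one secret per digest bit. A key must sign only one message."""
    return [pair[bit] for pair, bit in zip(private, _digest_bits(message))]


def verify(message: bytes, signature: List[bytes], public: List[Pair]) -> bool:
    """Check that each revealed secret hashes to the published value."""
    if len(signature) != 256:
        return False
    return all(
        hashlib.sha256(secret).digest() == pair[bit]
        for secret, pair, bit in zip(signature, public, _digest_bits(message))
    )


private_key, public_key = generate_keypair()
post = b"My node moved to a new provider; this content is still mine."
signature = sign(post, private_key)

authentic = verify(post, signature, public_key)
tampered = verify(post + b" (edited by someone else)", signature, public_key)
```

Verification needs only the owner's public key, not any knowledge of where the node is hosted. That is what lets the node move between providers without other nodes having to trust the new host.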

Processing of information often involves mirroring and syndication. Mirroring with locality for low latency access gives certain types of transaction processing, such as search indexing, the performance characteristics they need. Authorizing a search engine to index one’s content does not automatically grant users of the search engine access to the content. Perhaps only an excerpt is presented by the search engine along with the owner’s address, where the user may request access. A standard protocol is needed to enable this negotiation to be efficient and automated, if the content owner chooses to forego human review and approval.

We need to change how social applications control the relationship between content producers and content consumers. First, for original source content, the root of a new discussion thread, the owner must control how broadly it is published. Second, consumers of content must control what sources of information they consume and how it is presented. Equally important, consumers of an article become producers of reviews and comments, when interacting in a social network. The same principle must apply universally to the follow-on interactions, so that the article’s author should not be able to block haters from commenting, but the author is not obligated to read them. Similarly, readers are not obligated to see hateful commenters, whom they want to exclude from their network. The intent should be to enable each person to control their own content and experience, ceding no control to others.

Social applications need self-managed communities with member administered access control and content moderation. Community membership tends to be fluid with subgroups merging and splitting regularly. Each member’s access and content should follow their own memberships rather than being administered by others in those communities. The intent is to mitigate a blacklisted individual being cancelled by mobs. If a cancelled individual can form their own community and move their allies there with ease, cancel culture becomes powerless as a tool of suppression with global reach. Its reach is limited to communities that quarantine themselves.

This notion of social network or community is decentralized. A social application may support a registry of members, which would serve as a superset of potential relationships for content distribution. This would enable a new member to join a social network and request access to their content. Presumably, most members would enable automatic authorization of new members to see their content, if the new member has not been blocked previously. That is, enable a community to default to public square with open participation. However, honor freedom of association, so that no one is forced to interact with those with whom there is no desire to associate, and no one can be banned from forming their own mutually agreed relationships.
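The default-open, owner-controlled access described above could look something like the following. This is a sketch under my own assumptions: `MemberNode`, `auto_approve`, and the member names are hypothetical, and a real protocol would authenticate the requester rather than trust a plain string.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class MemberNode:
    """One user's node; the owner alone decides who may read their content."""
    owner: str
    auto_approve: bool = True                          # public-square default
    blocked: Set[str] = field(default_factory=set)     # owner's own block list
    approved: Set[str] = field(default_factory=set)

    def request_access(self, requester: str) -> bool:
        """Grant access unless this owner has blocked the requester."""
        if requester in self.blocked:
            return False
        if self.auto_approve:
            self.approved.add(requester)
        return requester in self.approved


carol = MemberNode("carol", blocked={"mallory"})
dave = MemberNode("dave", auto_approve=False)

newcomer_sees_carol = carol.request_access("newcomer")
mallory_sees_carol = carol.request_access("mallory")
newcomer_sees_dave = dave.request_access("newcomer")
```

Note that Carol blocking Mallory has no effect on Dave's node. Each block list is local to its owner, so no one can ban a person from the network as a whole.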

We need software innovations to address this urgent need to counter the censors, the cancellers, the de-platformers, the prohibitionists, the silencers of dissent, and the government oppressors. We don’t yet have a good understanding of the requirements which I’ve touched upon above, as I have only scratched the surface. We need an architecture to enable the unstoppable Open Internet that we failed to preserve from the early days. We need to develop a platform that realizes this vision to restore a healthy social fabric for our online communities.

system integration

There is something terribly wrong with software development in the enterprise application space. No one is able to release working software without coordinating across all product development teams to align the version of every product in the universe, because end-to-end workflows can’t be made to work as products are released on independent life cycles.

I believe we are missing architectural design principles. We talk about forward and backward compatibility of APIs, but I’m not sure the industry deeply understands what that entails. The problem goes beyond teams within an organization, because the software industry doesn’t even understand what compatibility entails.

The issue lies in how the base application (e.g., product catalog, store front, sales automation, care, order fulfillment, customer and subscription management, charging, billing, revenue management) is horizontal (generic) and hollow, expecting after-market extensibility to provide the vertical behavior that is specialized for the industry and the enterprise’s business model. The intent of the application vendor is to provide a general purpose platform that can be tailored after-market to the peculiarities of any enterprise. The application will implement an API defined by industry standards (say, tmforum.org for the communications industry) that reflects this general purpose hollowness. The application doesn’t have any real substance until it is customized to model the business. For example, a product catalog would not come populated with 5G mobile product specifications that are branded and priced according to a 5G service provider’s business model.

When extending entities with data that have hidden meaning, implied behavior, constraints, and statefulness (life cycle, workflow), these contribute to the API in ways that were not defined by the original specification. Each new element introduces some degree of incompatibility. Industry standards can never specify in a precise and rigorous manner things they did not foresee.

Stateful behavior is especially troublesome to specify in a manner that ensures compatibility. This includes conversational state and persistent state. Conversational state is where linked information is implicitly kept across multiple requests involved in the same session. A cursor for iterating through a collection of query results is an example of conversational state. Persistent state is durable across transactions, having memory that spans the life of a transaction, a session, a process, and even the life of a compute instance. When methods can only act against objects in certain states, but not others, this constraint must be honored for compatibility across collaborating components.

Objects and attributes are allowed to take on certain values at various points in their life cycles, and transactional behavior and workflow (the steps performed by business processes) are conditional upon the state of these objects. For example, when equipment is installed, it may be in various states of readiness for production use, but when not installed the equipment’s operational characteristics and configuration are irrelevant. Every component with access to that object must understand these semantics and enforce them consistently, otherwise there is no compatibility. Unfortunately, even these very simple conditional constraints and the ones in the previous paragraph are beyond the capability of today’s prevailing interface specification languages and entity modeling frameworks.

Immutability is often conditional on the life cycle state of an entity. For example, an order can be edited during information capture, but its captured intent cannot be edited after the order is firm and in the process of being fulfilled. Again, this constraint cannot be specified in a manner that ensures compatibility across collaborating components.
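The order example can be written down as code, which is exactly where such rules tend to live today. The sketch below is illustrative only; the state names, the `Order` class, and `IllegalStateError` are my own, not taken from any standard.

```python
from enum import Enum
from typing import List


class OrderState(Enum):
    CAPTURING = "capturing"
    FIRM = "firm"
    FULFILLED = "fulfilled"


class IllegalStateError(Exception):
    """Raised when a method is called in a state that does not allow it."""


class Order:
    def __init__(self) -> None:
        self.state = OrderState.CAPTURING
        self.items: List[str] = []

    def add_item(self, sku: str) -> None:
        # Editing is only allowed while the order is being captured.
        if self.state is not OrderState.CAPTURING:
            raise IllegalStateError(f"cannot edit an order in state {self.state.value}")
        self.items.append(sku)

    def submit(self) -> None:
        # Only a non-empty order that is still being captured can become firm.
        if self.state is not OrderState.CAPTURING or not self.items:
            raise IllegalStateError("only a non-empty order being captured can be submitted")
        self.state = OrderState.FIRM


order = Order()
order.add_item("5g-plan")
order.submit()

try:
    order.add_item("handset")
    edit_rejected = False
except IllegalStateError:
    edit_rejected = True
```

The constraint is enforced inside one implementation. A second component that edits the same order through its own code has no machine-readable way to learn the rule, which is the compatibility gap described above.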

Methods have failure modes, usually specified as failure responses, error codes, or exceptions. Some kinds of failures are recoverable using techniques like retrying, while others are non-recoverable. This too is usually not expressible for compatibility.
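As an illustration of why the distinction matters to a caller, here is a minimal retry helper. The two exception classes are hypothetical stand-ins for whatever an API would need to document; typical interface specifications list error codes without saying which of them are safe to retry.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RecoverableError(Exception):
    """A transient failure, such as a timeout; retrying may succeed."""


class NonRecoverableError(Exception):
    """A permanent failure, such as invalid input; retrying cannot help."""


def call_with_retry(operation: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Retry only recoverable failures; let everything else propagate."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RecoverableError:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # simple linear back-off
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


def flaky() -> str:
    """Fail twice with a transient error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RecoverableError("timeout")
    return "ok"


result = call_with_retry(flaky)
```

If the provider and the caller disagree about which failures are recoverable, the caller either gives up too early or retries an operation that can never succeed.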

Methods have performance expectations in terms of latency, concurrency, and transaction volume. Methods have resource consumption expectations in terms of memory, CPU, storage, network, and I/O. Methods that involve data sets have expectations about how much data can be passed with corresponding performance and scalability characteristics. This too is usually not expressible for compatibility.

Objects and their attributes are often persistent on durable storage. Subsets of attributes may be persistent, while others are volatile or derived (computed based on the value of other attributes, such as a rolled-up status or a count of a collection). This too is usually not expressible for compatibility.
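A short sketch shows the three kinds of attribute side by side. The `Subscription` class and its field names are invented for illustration; the comments carry information that a typical schema definition has no place for.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Subscription:
    account_id: str                                      # persistent
    services: List[str] = field(default_factory=list)   # persistent
    session_token: str = ""                              # volatile, never stored

    @property
    def service_count(self) -> int:
        """Derived: computed from `services`, not stored."""
        return len(self.services)

    def to_record(self) -> Dict[str, object]:
        """Return only the attributes that belong on durable storage."""
        return {"account_id": self.account_id, "services": list(self.services)}


sub = Subscription("acct-1", ["voice", "data"], session_token="temporary")
record = sub.to_record()
```

A client that tries to write `service_count`, or expects `session_token` to survive a restart, is incompatible with this component even though every field name and type matches.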

Methods must trade off consistency, availability, and partition tolerance. The expectation of what trade-offs should be chosen is usually not expressible for compatibility.

Methods expect the caller to be authenticated and they are expected to enforce access control to verify that the caller is authorized. Moreover, the method is expected to enforce data permissions and data privacy. This too is usually not expressible for compatibility.

The list of requirements and constraints that contribute to compatibility goes on. The above is a sampling to give the reader a sense of the problem, not to be comprehensive. The intent is to show how formal specifications are grossly insufficient to ensure a high degree of compatibility across heterogeneous suppliers and independently developed implementations.

Because API compatibility is so unreliable based on specifications and contract testing, the promise of a microservice architecture (within an application) or a service-oriented architecture (for integrating applications across the enterprise) cannot be achieved naively. System integration continues to be plagued by a waterfall model of requiring a complete line-up of application versions to be tested end-to-end, before we have any confidence that they work together. The benefits of agile development and independent life cycles are not achievable, because the prerequisite compatibility guarantees cannot be met. System integration of enterprise applications remains in the stone age because of this crippling deficiency.

vertical integration

Applications have been pursuing operational efficiency through vertical integration for years. This is generally understood to mean assembling infrastructure (machine and operating system) with platform components (database, middleware) and application components into an engineered system that is pre-integrated and optimized to work together.

Now, the evolution to cloud services is following the same pattern. IaaS is integrated into PaaS. IaaS and PaaS are integrated with application components to deliver SaaS. However, just as we see in on-premise enterprise information systems, applications do not operate in silos. They are integrated by business processes, and they must collaborate to enforce business policies across business functions and organizations.

Marketing is deeply interwoven with sales. Product configuration, pricing, and quotation are tied to order capture and fulfillment. Fulfillment involves inventory, shipping, provisioning, billing, and financial accounting. Customer service is linked with various service assurance components, billing care, and also quote and order capture. All components need views of accounts, assets (products and services subscribed to), agreements, contracts, and warranties. Service usage and demand all feed analytics to drive marketing campaigns that generate more sales. What a tangled web.

What is clear from this picture is that vertical integration does not end with infrastructure, platform, and a software application. Applications contribute components that combine with business processes and business policies to construct higher level applications. This may continue for many layers of integration according to the self-similar paradigm.

The evolution to cloud should recognize the need for integration of SaaS components with business processes and business policies. However, it does not appear as though cloud services have anticipated the need for vertical integration to continue in layers. To construct assemblies, the platform should provide a means of defining such assemblies, so that they can be replicated by packaging and deploying them on infrastructure at various scales. The platform should provide a consistent programming and configuration model for extending and customizing applications in ways that are natural to being reapplied layer by layer.

Vertical integration is not an elegantly solved problem for on-premise applications. On-premise application integration is notoriously complex due to heterogeneity and vastly inconsistent interfaces and programming models. One component’s notion of customer is another’s notion of party. Two components with notions of customer do not agree on its schema and semantics. A product to one component is an offer to another. System integration projects routinely cost five to ten times the software license cost of the application components, because of the difficulty of overcoming impedance mismatches, gaps in functional capabilities, duct tape, and bubblegum.
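The customer versus party mismatch is easy to picture in code. Both schemas below are hypothetical, made up for this sketch rather than taken from any product, and the mapping function is the kind of glue that integration projects end up writing over and over.

```python
from dataclasses import dataclass


@dataclass
class CrmParty:
    """Hypothetical schema from a CRM component."""
    party_id: str
    given_name: str
    family_name: str


@dataclass
class BillingCustomer:
    """Hypothetical schema from a billing component."""
    customer_no: str
    full_name: str


def party_to_customer(party: CrmParty) -> BillingCustomer:
    """Translate one component's 'party' into another's 'customer'."""
    return BillingCustomer(
        customer_no=party.party_id,
        full_name=f"{party.given_name} {party.family_name}".strip(),
    )


party = CrmParty(party_id="P-1001", given_name="Ada", family_name="Lovelace")
customer = party_to_customer(party)
```

Even this trivial mapping loses information: once the names are joined, the billing side cannot reliably split them again. Multiply that by every entity and every pair of components, and the integration cost becomes easier to understand.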

Examining today’s cloud platforms and the applications built upon them, it looks as though we have not learned much from our past mistakes. We face the same costly and clunky integration nightmare with no breakthrough in sight.

encapsulation

Encapsulation is the packing of data and functions into a single component.

Under this definition, encapsulation means that the internal representation of an object is generally hidden from view outside of the object’s definition.

In 2004, I wrote OOPs, here comes SOA (again) to comment on how Service-Oriented Architecture (SOA) stands in stark contrast to Object-Oriented Programming (OOP).

In my previous article, Going Meta, we can see that beyond the scale of a single object, the architecture of an enterprise application breaks encapsulation by deliberately exposing data representations of entities. If data hiding through encapsulation is a fundamental principle of Object-Oriented Programming, that principle certainly breaks down at several levels. Technical challenges are partly to blame. However, I believe there are non-technical motivations to abandon data hiding.

From a technical perspective, the impedance mismatch between the middle tier and the database tier demands that the database schema be a central design consideration that is agreed upon by all stakeholders. The boundary between the middle tier and the database tier is all about the data and CRUD operations. This boundary may be more service-oriented, if the logic is implemented in the database tier as stored procedures, but the programming language available in the database is seldom expressive enough or natural enough for most developers to embrace. In the middle tier, domain services encapsulate the application logic, but the boundary between the middle tier and its clients (Web browsers and machine-to-machine integration with other applications) again is all about serialized data in the form of request, response, and fault messages for remote procedure calls (SOAP and RESTful services) or event messages (publish-subscribe). The entire data model (with very few exceptions for data that is used only for computations that are private to the logic) is exposed through these interfaces.

From a business perspective, it is natural to think in terms of business processes and the data artifacts (e.g., documents, files) that flow between tasks. The application services are merely a means of implementing those tasks; another means of implementing a task may be for a human to perform it either without the aid of software or with the aid of software that lacks the integration to enable the task to be performed automatically. Users do not think of their interactions with the software in terms of data hiding. They are very aware of the data structures that are relevant to the business.

Object-Oriented Programming is relegated to the micro level of an enterprise application. Encapsulation or data hiding is a concept that is relevant only to modules of logic, and these concepts do not extend naturally throughout the architecture for an enterprise application for both technical and business reasons. When developing enterprise applications and systems that involve business processes that integrate across applications, Object-Oriented Programming is a paradigm that sadly has little impact and relevance at a macro level, where SOA fills the vacuum. It seems like the software industry and computer science are still at a very immature stage of evolution, as we go without a programming paradigm that can unify how software is developed at the micro and macro levels.

Reliable Messaging with REST

Marc de Graauw’s article Nobody Needs Reliable Messaging remains as relevant today as it was in 2010, when it was first published. It echoes the principles outlined in Scalable, Reliable, and Secure RESTful services from 2007.

It basically says that REST does not need to support WS-ReliableMessaging delivery requirements, because reliable delivery can be accomplished by the business logic through retries, so long as the methods in the REST layer are idempotent (repeating the same request has the same effect as making it once). Let’s examine the implications in more detail.

First, we must design the REST methods to be idempotent. This is no small feat. This is a huge topic that deserves its own separate examination. But let’s put this topic aside for now, and assume that we have designed our REST web services to support idempotence.

If we are developing components that call REST web services for process automation, the above principle says that the caller is responsible for retrying on failure.

The caller must be able to distinguish a failure to deliver the request from a failure by the server to perform the requested method. The former should be retried, expecting that the failure is temporary. The latter is permanent.
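A minimal sketch of this distinction follows. It assumes that a connection error or timeout means the request may never have reached the server, and that HTTP statuses 502, 503, and 504 signal a temporary condition. That set of statuses is my assumption, not something the cited articles prescribe, and the names `Outcome` and `classify` are illustrative.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SUCCESS = "success"
    RETRY = "retry"          # delivery failed or was temporarily refused; try again
    PERMANENT = "permanent"  # the server received the request and could not perform it


# Assumption: these statuses indicate a temporary condition worth retrying.
TRANSIENT_STATUSES = {502, 503, 504}


def classify(status: Optional[int] = None,
             error: Optional[BaseException] = None) -> Outcome:
    """Decide what the caller should do with the result of one attempt."""
    if error is not None:
        # No response at all: the request may not have been delivered.
        if isinstance(error, (ConnectionError, TimeoutError)):
            return Outcome.RETRY
        return Outcome.PERMANENT
    if status is None:
        raise ValueError("either status or error is required")
    if 200 <= status < 300:
        return Outcome.SUCCESS
    if status in TRANSIENT_STATUSES:
        return Outcome.RETRY
    return Outcome.PERMANENT
```

Note that a timeout is ambiguous: the server may have processed the request before the reply was lost, which is why retrying is safe only when the method is idempotent.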

The caller must be able to implement retry in an efficient manner. If the request is retried immediately in a tight loop, it is likely to continue to fail for the same reason. Network connectivity issues sometimes take a few minutes to be resolved. Worse, if the failure is caused by an overloaded server, having all clients retry in a tight loop will exacerbate the problem by slamming the server with a flood of requests when it is least able to process them. It would be helpful if clients would behave better by backing off for some time and retrying after a delay. Relying on clients to behave nicely on their honor is sure to fail, if their retry logic is coded ad hoc without following a standard convention.
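One common convention for backing off is exponential delay with random jitter, so that clients spread out rather than retrying in lockstep. The sketch below is one possible shape for it, not a standard. The function name, the default values, and the choice of "full jitter" (a random fraction of the current bound) are my assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(send: Callable[[], T], *,
                    base: float = 1.0,      # first delay bound, in seconds (assumed)
                    cap: float = 300.0,     # never wait longer than this (assumed)
                    max_attempts: int = 6,
                    sleep: Callable[[float], None] = time.sleep,
                    rng: Callable[[], float] = random.random) -> T:
    """Call send(), retrying delivery failures with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return send()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller escalate
            # The bound doubles each time: base, 2*base, 4*base, ... up to cap.
            bound = min(cap, base * 2 ** attempt)
            sleep(rng() * bound)  # full jitter: wait a random fraction of the bound
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` and `rng` parameters exist only so the behavior can be exercised without waiting. This version blocks the caller and keeps its state in memory, so it does not survive a crash.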

The caller must be able to survive crashes and restarts, so that an automated task can be relied upon to reach a terminal state (success or failure) after starting. Therefore, message delivery must be backed by a persistent store. Delivery must be handled asynchronously so that it can be retried across restarts (including service migration to replacement hardware after a hardware failure), and so that the caller is not blocked waiting.
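A persistent outbox is one way to meet this requirement: the caller records the request durably, returns immediately, and a background worker delivers it. The sketch below uses SQLite only because it is self-contained. The table layout and function names are hypothetical, and `send` stands in for whatever HTTP client the application uses.

```python
import sqlite3
import time
from typing import Callable, Optional


def open_outbox(path: str) -> sqlite3.Connection:
    """Open (or create) the durable store that backs message delivery."""
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS outbox (
            id              INTEGER PRIMARY KEY,
            url             TEXT NOT NULL,
            body            TEXT NOT NULL,
            state           TEXT NOT NULL DEFAULT 'pending',
            attempts        INTEGER NOT NULL DEFAULT 0,
            next_attempt_at REAL NOT NULL DEFAULT 0
        )""")
    db.commit()
    return db


def enqueue(db: sqlite3.Connection, url: str, body: str) -> int:
    """Record the request durably; the caller is not blocked on delivery."""
    with db:  # commits on success, rolls back on error
        cur = db.execute("INSERT INTO outbox (url, body) VALUES (?, ?)", (url, body))
    return cur.lastrowid


def deliver_due(db: sqlite3.Connection, send: Callable[[str, str], None],
                now: Optional[float] = None, base_delay: float = 1.0) -> None:
    """Attempt every pending message whose retry time has arrived."""
    now = time.time() if now is None else now
    rows = db.execute(
        "SELECT id, url, body, attempts FROM outbox "
        "WHERE state = 'pending' AND next_attempt_at <= ? ORDER BY id",
        (now,)).fetchall()
    for msg_id, url, body, attempts in rows:
        try:
            send(url, body)
        except (ConnectionError, TimeoutError):
            # Persist the retry schedule so it survives a restart.
            with db:
                db.execute(
                    "UPDATE outbox SET attempts = attempts + 1, next_attempt_at = ? "
                    "WHERE id = ?",
                    (now + base_delay * 2 ** attempts, msg_id))
        else:
            with db:
                db.execute("UPDATE outbox SET state = 'delivered' WHERE id = ?",
                           (msg_id,))
```

If the process crashes after `send` succeeds but before the row is marked delivered, the message will be sent again after the restart, so this design gives at-least-once delivery and still depends on idempotent methods.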

The caller must be able to detect when too many retry attempts have failed, so that it does not get stuck waiting forever for the request to be delivered. Temporary problems that take too long to be resolved need to be escalated for intervention. These requests should be diverted for special handling, and the caller should continue with other work, until someone can troubleshoot the problem. Poison message handling is essential so that retrying does not result in an infinite loop that would gum up the works.
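Poison message handling can be sketched as an attempt counter plus a dead-letter list. The example below is deliberately in-memory and omits the delay between attempts in order to isolate the diversion logic. The names `Message`, `process_queue`, and `dead_letters`, and the default limit of five attempts, are illustrative assumptions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Message:
    body: str
    attempts: int = 0


def process_queue(queue: Deque[Message], dead_letters: List[Message],
                  handle: Callable[[str], None], max_attempts: int = 5) -> None:
    """Drain the queue, diverting messages that keep failing."""
    while queue:
        msg = queue.popleft()
        try:
            handle(msg.body)
        except Exception:
            msg.attempts += 1
            if msg.attempts >= max_attempts:
                # Stop retrying; park the message for someone to troubleshoot.
                dead_letters.append(msg)
            else:
                # Requeue at the back so other work is not held up.
                queue.append(msg)
```

Because a failing message goes to the back of the queue, healthy messages behind it continue to flow, and the loop is guaranteed to terminate.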

POST methods are not idempotent, so retry must be handled very carefully to account for side-effects. Even if the request is guaranteed to be delivered, and it is processed properly (exactly once) by the server, the caller must be able to determine reliably whether the method succeeded, because the reply can be lost. One approach is to deliver the reply reliably from the server back to the caller. Again, all of the above reliable delivery qualities apply. The interactions to enable this round trip message exchange certainly look very foreign to the simple HTTP synchronous interaction. Either the caller would poll for the reply, or a callback mechanism would be needed. Another approach is to enable the caller to confirm that the original request was processed. With either approach, the reliable execution requirement needs to alter the methods of the REST web services. To achieve better quality of service in the transport, the definition of the methods needs to be radically redesigned. (If you are having a John McEnroe “you cannot be serious” moment right about now, it is perfectly understandable.)
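The second approach, letting the caller confirm that the original request was processed, is often realized with a caller-supplied request identifier. The sketch below models the server side in memory. `OrderService`, its methods, and the shape of the result are hypothetical, and a real service would keep the results in durable storage.

```python
import uuid
from typing import Dict, List, Optional


class OrderService:
    """Hypothetical server that remembers each request's outcome by its ID."""

    def __init__(self) -> None:
        self._results: Dict[str, dict] = {}
        self.orders: List[dict] = []

    def post_order(self, request_id: str, order: dict) -> dict:
        """Create an order once per request_id, however often it is retried."""
        if request_id in self._results:
            # A retry of a request already processed: replay the stored reply
            # instead of repeating the side effect.
            return self._results[request_id]
        self.orders.append(order)
        result = {"order_number": len(self.orders)}
        self._results[request_id] = result
        return result

    def get_result(self, request_id: str) -> Optional[dict]:
        """Let a caller that lost the reply ask whether the request was processed."""
        return self._results.get(request_id)


def new_request_id() -> str:
    """The caller generates the ID once and reuses it for every retry."""
    return str(uuid.uuid4())
```

This shows how much the method definition changes: the request now carries an identifier, and the service grows a second operation whose only purpose is to recover a lost reply.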

Taking these requirements into consideration, it is clear that it is not true that “nobody needs reliable messaging”. Enterprise applications with automated processes that perform mission-critical tasks need the ability to perform those tasks reliably. If reliable message delivery is not handled at the REST layer, the responsibility for retry falls to the message sender. We still need reliable messaging; we must implement the requirement ourselves above REST, and this becomes troublesome without a standard framework that behaves nicely. If we accept that REST can provide only idempotence toward this goal, we must implement a standard framework to handle delivery failures, retry with exponential backoff, and divert poison messages for escalation. That is to say, we need a reliable messaging framework on top of REST.

[Note that when we speak of a “client” above, we are not talking about a user sitting in front of a Web browser. We are talking about one mission-critical enterprise application communicating with another in a choreography to accomplish some business transaction. An example of a choreography is the interplay between a buyer and a seller through the systems for commerce, quote, procurement, and order fulfillment.]

OLTP database requirements

Here is what I want from a database in support of enterprise applications for online transaction processing.

  1. ACID transactions – Enterprise CRM, ERP, and HCM applications manage data that is mission critical. People’s jobs, livelihoods, and businesses rely on this data to be correct. Real money is on the line.
  2. Document oriented – A JSON or XML representation should be the canonical way that we should think of objects stored in the database.
  3. Schema aware – A document should conform to a schema (JSON Schema or XML Schema). Information has a structure and meaning, and it should have a formal definition.
  4. Schema versioned – A document schema may evolve in a controlled manner. Software is life cycle managed, and its data needs to evolve with it for compatibility, upgrades, and migration.
  5. Relational – A subset of a document schema may be modeled as relational tables with foreign keys and indexes to support SQL queries, which can be optimized for high performance.

The fundamental shift is from a relational to a document paradigm as the primary abstraction. Relational structures continue to play an adjunct role to improve query performance for those parts of the document schema that are heavily involved in query criteria (WHERE clauses). The document paradigm enables the vast majority of data to be stored and retrieved without having to rigidly conform to relational schema, which cannot evolve as fluidly. That is not to say that data stored outside of relational tables is less important or less meaningful. To the contrary, some of the non-relational data may be the most critical to the business. This approach simply recognizes that information that is not directly involved in query criteria can be treated differently to take advantage of greater flexibility in schema evolution and life cycle management.
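A rough sketch of the idea: the whole document is stored as JSON, and only the field used in query criteria is copied into an indexed relational column. SQLite is used here purely to keep the example self-contained, not because it is the database I have in mind. The table, the column names, and the document shape are hypothetical, and schema validation is left out.

```python
import json
import sqlite3
from typing import List


def open_store() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE orders (
            id             INTEGER PRIMARY KEY,
            schema_version INTEGER NOT NULL,  -- which document schema the doc conforms to
            customer_id    TEXT NOT NULL,     -- adjunct column, extracted for queries
            doc            TEXT NOT NULL      -- canonical JSON document
        )""")
    db.execute("CREATE INDEX orders_by_customer ON orders (customer_id)")
    return db


def save_order(db: sqlite3.Connection, doc: dict, schema_version: int = 1) -> int:
    """Store the document and its queryable subset in one ACID transaction."""
    with db:
        cur = db.execute(
            "INSERT INTO orders (schema_version, customer_id, doc) VALUES (?, ?, ?)",
            (schema_version, doc["customer"]["id"], json.dumps(doc)))
    return cur.lastrowid


def orders_for_customer(db: sqlite3.Connection, customer_id: str) -> List[dict]:
    """Query through the relational column, return whole documents."""
    rows = db.execute(
        "SELECT doc FROM orders WHERE customer_id = ? ORDER BY id", (customer_id,))
    return [json.loads(row[0]) for row in rows]
```

Fields such as line items or notes can be added to the document without altering the table, while the query on `customer_id` still benefits from the index.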

Ideally, the adjunct relational tables and SQL queries would be confined by the database to its internal implementation. When exposing a document abstraction to applications, the database should also present a document-oriented query language, such as XQuery or its equivalent for JSON, which would be implemented as SQL where appropriate, as an optimization technique.

NoSQL database technology is often cited as supporting a document paradigm. NoSQL technologies as they exist today do not meet the need, because they do not support ACID transactions and they do not support adjunct structures (i.e., relational tables and indexes) to improve query performance in the manner described above.

Perhaps the next best thing would be to provide a Java persistent entity abstraction, much like EJB3/JPA, which would encapsulate the underlying representation in a document part (e.g., as an XMLType or a JSON CLOB column) and a relational part, all stored in a SQL database. This would also provide JAXB-like serialization and deserialization to and from JSON and XML representations. This is not far from what EclipseLink does today.

Applied Cosmology: The Holographic Principle

The Holographic Principle says that a full description of a volume of space is encoded in the surface that bounds it. This arises from black hole thermodynamics, where the black hole entropy increases with its surface area, not its volume. Everything there is to know about the black hole’s internal content is on its boundary.

Software components have boundaries that are defined by interfaces, which encapsulate everything an outsider needs to know to use it. Everything about its interior is represented by its surface at the boundary. It can be treated like a black box.

Applied Cosmology: The Self Similar Paradigm

Robert Oldershaw’s research on The Self Similar Cosmological Paradigm [http://www3.amherst.edu/~rloldershaw/concepts.html] recognizes that nature is organized in a stratified hierarchy, where every level is similar. The shapes and motions of atoms are similar to those of stellar systems. Similarities extend from the stellar scale to the galactic scale, and beyond.

Managing complexity greatly influences software design. Stratified hierarchy is familiar to this discipline.

At the atomic level, we organize our code into units. Each unit is a module with a boundary, which exposes its interface that governs how clients interact with this unit. The unit’s implementation is hidden behind this boundary, enabling it to undergo change independently of other units, as much as possible.

We build upon units by reusing modules and integrating them together into larger units, which themselves are modular and reusable in the same way. Assembly of modules into integrated components is the bread and butter of object-oriented programming. This approach is able to scale up to the level of an application, which exhibits uniformity of platform technologies, programming language, design metaphors, conventions, and development resources (tools, processes, organizations).

The next level of stratification exists because of the need to violate the uniformity across applications. However, the similarity is unbroken. We remain true to the principles of modular reuse. We continue to define a boundary with interfaces that encapsulate the implementation. We continue to integrate applications as components into larger scale components that themselves can be assembled further.
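In code, this self-similarity looks like the composite pattern: an assembly of components exposes the same interface as its parts, so it can be assembled again at the next level. The sketch below is only an illustration. The `Component` interface and the unit names are invented.

```python
from typing import List, Protocol, Sequence


class Component(Protocol):
    """The boundary: all a client needs to know about a component."""

    def handle(self, request: str) -> str: ...


class Unit:
    """An atomic module whose implementation is hidden behind handle()."""

    def __init__(self, name: str) -> None:
        self._name = name

    def handle(self, request: str) -> str:
        return f"{self._name}({request})"


class Assembly:
    """Integrates components and is itself a component."""

    def __init__(self, parts: Sequence[Component]) -> None:
        self._parts: List[Component] = list(parts)

    def handle(self, request: str) -> str:
        # Pass the request through each part in order.
        for part in self._parts:
            request = part.handle(request)
        return request


# A module built from units, then an application built from that module.
module = Assembly([Unit("parse"), Unit("validate")])
application = Assembly([module, Unit("store")])
```

A client of `application` cannot tell, and does not need to know, how many layers of assembly lie behind the interface.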

Enterprises are attempting to enable even higher levels of stratification. They define how an organization functions and how it interfaces with other organizations. This is with respect to protocols for human interaction as well as information systems. Organizations are integrated into business units that are integrated into businesses at local, national, multi-national, and global scales. Warren Buffett’s Berkshire Hathaway has demonstrated how entire enterprises exhibit such modular assembly.

This same pattern manifests itself across enterprises and across industries. A company exposes its products and services through an interface (branding, pricing, customer experience) which encapsulates its internal implementation. Through these protocols, we integrate across industries to create supply chains that provide ever more complex products and services.