Category Archives: software

service and resource

I have written about TM Forum SID before in What is wrong with TM Forum SID? My criticisms were focused on deficiencies in behavioral modeling. In this article, I turn my attention to the structural model itself.

Let’s start with the concept of Resource. SID defines a model for resources to represent communications network functions. [GB922 Logical and Compound Resource R14.5.0 §1.1.2] This approach seems self-evident. So far, so good. (My intent is not to evaluate how effective the SID resource model is in achieving its goal.)

When we examine the concept of Service, we run into difficulties. The overview of “service” in [GB922 Service Overview R14.5.0 §1.1.3] makes no attempt to provide a precise definition of the term. The section references other standards efforts that have attempted to address the topic, lists the various eTOM process areas that apply to service, and discusses the things that surround and derive from service. All the while, “service” remains undefined as the document proceeds to a detailed structural decomposition. I don’t consider this a fatal flaw: since SID circles around the abstraction rather than nailing down its definition, we can fill the gap ourselves.

I would define “service” as something of value that can be delivered as a subscription by the resources of a communications network. That wasn’t too difficult. In the context of SID, this definition of “service” is not intended to include human activities that are provided to clients; that is an entirely different concept.

SID specializes “service” into two concepts: (1) customer-facing service and (2) resource-facing service. A CFS is a service that may be commercialized (branded, priced, and sold) as a product to customers. An RFS is not commercialized.

Here is where we begin to see things go wrong. When we model a service such as network connectivity, it may be a CFS under certain circumstances and an RFS under others. At this point, SID should have recognized that “service” and “resource” are roles that entities take on, not superclasses to be specialized. Using our example of network connectivity: when it is commercialized, it acts as a service; when it is used to enable (directly or indirectly) something else to be delivered, it acts as a resource. “Service” and “resource” should be thought of more like “manager” and “employee”. A person is not intrinsically a manager or an employee; a person may take on one or both of these roles contextually. By not recognizing this pattern, SID has made modeling very awkward for many types of communications network technologies, especially layered services and services built from other services (which are treated as resources).
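
To make the role pattern concrete, here is a minimal Java sketch of the alternative being argued for. The interface and class names are my own invention, not SID entities.

```java
// Hypothetical sketch (names are mine, not SID's): an entity such as network
// connectivity is not intrinsically a Service or a Resource; it takes on those
// roles contextually.
interface ServiceRole {
    String commercialName();   // branding/pricing concerns live with the role
}

interface ResourceRole {
    boolean isAvailable();     // capacity/allocation concerns live with the role
}

// The underlying entity is role-neutral.
class NetworkConnectivity {
    private final String id;

    NetworkConnectivity(String id) { this.id = id; }

    // When commercialized, expose it through a ServiceRole.
    ServiceRole asService(String brandName) {
        return () -> brandName + " (" + id + ")";
    }

    // When used to enable something else to be delivered, expose it through a ResourceRole.
    ResourceRole asResource() {
        return () -> true; // availability logic elided
    }
}
```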

social applications

Facebook and Twitter have flourished in our personal lives, but their usefulness for work is limited to advertising and other marketing activities. Engagement happens through sharing status updates, links, photos, likes, and comments. This is a decade-old approach that has not advanced much.

In Mark Zuckerberg’s interview for Startup School, he shows that he understands Facebook to be a social platform for building social apps. However, in my opinion none of the players in the social networking space has a good vision of the future. Facebook and Twitter treat social interactions as ends in themselves. That is why they present information in a timeline and seek out trending topics. Information is like news that is stale after it is read. Engagement is a vehicle for targeted marketing.

Google has tried to compete with Facebook, but it can’t seem to find a formula for success. The article Why Google+ failed, according to Google insiders outlines its failure to achieve mass adoption and engagement. Providing an alternative to Facebook without a discernible improvement is not competitive, because users have no good reason to migrate away from an established network of friends.

Facebook “friend” relationships are more likely to be friends, family, and casual acquaintances. Facebook “follow” and “like” relationships are more likely to be public figures, celebrities, and business-to-consumer connections. Facebook is not the platform for professional relationships, work-related interactions, and business associations. LinkedIn is used for professional relationships, with recruiting as its primary function. We should recognize that none of these platforms provides an application platform for actually doing work using social tools. Google failed to recognize this opportunity as it began to integrate G+ with mail, storage, and other services. Providing a wall for posting information and comments is an extremely limited form of social interaction. It seems that no one has bothered to analyze how workers engage with each other to perform their jobs, in order to identify how social tools could make those interactions more productive.

We do see companies like Atlassian developing tools like JIRA and Confluence for assisting teams to work together. These tools recognize how social interactions are embedded into the information and processes that surround business functions. We need this kind of innovation applied across the board throughout the tools that we use in the enterprise.

Productive work relies on effective communication, coordination, and collaboration. These are social functions. Social networking is already mature in project management, wikis (crowd-sourcing information), and discussion forums. But these are often peripheral to the tools that many workers use to perform their primary job functions. We need to look at the social interactions that surround these tools and redevelop the tools themselves to facilitate improvements in social interaction.

Let’s explore where social interactions are poor in our work environments today.

As our businesses expand across the globe, our teams are composed of workers who reside in different places and time zones. Remote interactions between non-collocated teams can be extremely challenging and inefficient compared to the regular face-to-face interactions of collocated workers with tools like whiteboards and pens. There is a huge opportunity for tablet applications to better support remote workers.

As businesses scale, we may discover that the traditional organizational structures are too rigid to support the ever-accelerating pace of agility that we demand. Perhaps social tools can facilitate innovations in how workers organize themselves. As highly skilled and experienced workers mature, they become more capable of taking the initiative, making good decisions independently, and behaving in a self-motivated manner. Daniel Pink has identified that autonomy, mastery, and purpose are the intrinsic motivators that lead to happy and productive employees. Perhaps with social tooling, it is possible for organizations to evolve to take advantage of spontaneous order among workers instead of relying mostly on top-down management practices for assigning work.

These are two areas where social networking could apply to enterprises but is not well supported today. All we have to do is examine the pain points in our work environments to identify innovations that may be possible. It is quite surprising to me that we are not already seeing social tools revolutionize the workplace, especially in the technology sector, where start-ups do not have an entrenched culture and management style.

Reliable Messaging with REST

Marc de Graauw’s article Nobody Needs Reliable Messaging remains as relevant today as when it was first published in 2010. It echoes the principles outlined in Scalable, Reliable, and Secure RESTful services from 2007.

It basically says that REST does not need to support WS-ReliableMessaging delivery requirements, because reliable delivery can be accomplished by the business logic through retries, so long as the REST methods are idempotent (the same request produces the same result). Let’s examine the implications in more detail.

First, we must design the REST methods to be idempotent. This is no small feat; it is a huge topic that deserves its own separate examination. But let’s put it aside for now and assume that we have designed our REST web services to support idempotence.
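
Before moving on, here is a sketch of one common idempotence technique (not the only one): a JAX-RS style resource where the caller supplies the identifier and uses PUT, so a retried request creates or overwrites the same resource rather than a duplicate. The class, path, and store are illustrative, and the jakarta.ws.rs package assumes a recent JAX-RS implementation.

```java
import jakarta.ws.rs.*;
import jakarta.ws.rs.core.Response;

// Illustrative only: the caller chooses the order id (e.g., a UUID it generated),
// so repeating the same PUT creates or overwrites the same resource, and the
// request can be retried safely without creating duplicates.
@Path("/orders")
public class OrderResource {

    private final OrderStore store = new OrderStore(); // hypothetical persistence facade

    @PUT
    @Path("/{orderId}")
    @Consumes("application/json")
    public Response upsertOrder(@PathParam("orderId") String orderId, String orderJson) {
        boolean created = store.putOrReplace(orderId, orderJson);
        return created ? Response.status(Response.Status.CREATED).build()
                       : Response.noContent().build();
    }

    // Minimal in-memory stand-in for a persistence layer.
    static class OrderStore {
        private final java.util.concurrent.ConcurrentMap<String, String> map =
                new java.util.concurrent.ConcurrentHashMap<>();

        boolean putOrReplace(String id, String doc) {
            return map.put(id, doc) == null; // true if newly created
        }
    }
}
```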

If we are developing components that call REST web services for process automation, the above principle says that the caller is responsible for retrying on failure.

The caller must be able to distinguish a failure to deliver the request from a failure by the server to perform the requested method. The former should be retried, expecting that the failure is temporary. The latter is permanent.

The caller must be able to implement retry in an efficient manner. If the request is retried immediately in a tight loop, it is likely to continue to fail for the same reason. Network connectivity issues sometimes take a few minutes to be resolved. However, if the reason for failure is because the server is overloaded, having all clients retry in a tight loop will exacerbate the problem by slamming the server with a flood of requests, when it is least able to process them. It would be helpful if clients would behave better by backing off for some time and retrying after a delay. Relying on clients to behave nicely on their honor is sure to fail, if their retry logic is coded ad hoc without following a standard convention.
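
A minimal sketch of what “behaving nicely” could look like: retry with exponential back-off, a cap on the delay, and random jitter so that many clients do not retry in lock-step. The class and method names are illustrative.

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of retry with exponential back-off and jitter. The tryOnce() call
// stands in for delivering one HTTP request; it returns true on success and
// false on a transient failure.
final class BackoffRetry {

    static boolean deliverWithRetry(Delivery attempt, int maxAttempts) throws InterruptedException {
        long delayMillis = 500;                           // initial back-off ceiling
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.tryOnce()) {
                return true;                              // delivered
            }
            if (i == maxAttempts) {
                break;                                    // give up; caller escalates
            }
            // Full jitter: sleep a random duration up to the current ceiling,
            // so clients spread out their retries instead of flooding the server.
            Thread.sleep(ThreadLocalRandom.current().nextLong(delayMillis));
            delayMillis = Math.min(delayMillis * 2, 60_000); // cap the back-off
        }
        return false;
    }

    @FunctionalInterface
    interface Delivery {
        boolean tryOnce();
    }
}
```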

The caller must be able to survive crashes and restarts, so that an automated task can be relied upon to reach a terminal state (success or failure) after starting. Therefore, message delivery must be backed by a persistent store. Delivery must be handled asynchronously so that it can be retried across restarts (including service migration to replacement hardware after a hardware failure), and so that the caller is not blocked waiting.

The caller must be able to detect when too many retry attempts have failed, so that it does not get stuck waiting forever for the request to be delivered. Temporary problems that take too long to be resolved need to be escalated for intervention. These requests should be diverted for special handling, and the caller should continue with other work, until someone can troubleshoot the problem. Poison message handling is essential so that retrying does not result in an infinite loop that would gum up the works.
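
Putting the last two requirements together, here is a sketch of the bookkeeping a persistent, crash-tolerant retry mechanism might keep per outbound request, including diversion of poison messages after too many attempts. In a real system this record would live in a persistent store (an outbox table) rather than in memory; the types are illustrative.

```java
import java.time.Instant;

// Per-request delivery state, so that an asynchronous worker can resume retries
// after a crash or restart, and poison messages are diverted instead of being
// retried forever.
final class OutboundRequest {
    enum Status { PENDING, DELIVERED, DEAD_LETTER }

    final String requestId;       // also usable as an idempotency key
    final String payload;
    int attempts = 0;
    Status status = Status.PENDING;
    Instant nextAttemptAt = Instant.now();

    OutboundRequest(String requestId, String payload) {
        this.requestId = requestId;
        this.payload = payload;
    }

    // Called by the delivery worker when an attempt fails.
    void recordFailure(int maxAttempts, long backoffMillis) {
        attempts++;
        if (attempts >= maxAttempts) {
            status = Status.DEAD_LETTER;   // divert for troubleshooting and escalation
        } else {
            nextAttemptAt = Instant.now().plusMillis(backoffMillis);
        }
    }

    void recordSuccess() {
        status = Status.DELIVERED;
    }
}
```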

POST methods are not idempotent, so retry must be handled very carefully to account for side effects. Even if the request is guaranteed to be delivered and is processed properly (exactly once) by the server, the caller must still be able to determine reliably whether the method succeeded, because the reply can be lost. One approach is to deliver the reply reliably from the server back to the caller; again, all of the above reliable delivery qualities apply. The interactions that enable this round-trip message exchange certainly look very foreign to the simple synchronous HTTP interaction: either the caller would poll for the reply, or a callback mechanism would be needed. Another approach is to enable the caller to confirm that the original request was processed. With either approach, the reliable execution requirement alters the methods of the REST web services. To achieve better quality of service in the transport, the definition of the methods needs to be radically redesigned. (If you are having a John McEnroe “you cannot be serious” moment right about now, it is perfectly understandable.)
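
For the second approach, here is a sketch of how a server could let callers safely repeat a non-idempotent POST: the caller sends a unique request id with each request, and the server remembers the outcome keyed by that id, so a retry returns the recorded reply instead of executing the side effect a second time. The names are illustrative, and a production version would persist the processed-request table rather than hold it in memory.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Sketch: exactly-once effect for a non-idempotent operation, keyed by a
// caller-supplied request id. A retried request with the same id returns the
// previously recorded result rather than re-executing the business operation.
final class ExactlyOnceEffect {

    private final ConcurrentMap<String, String> processed = new ConcurrentHashMap<>();

    // businessOperation stands in for the non-idempotent side effect (e.g., create an order).
    String handlePost(String requestId, String body, Function<String, String> businessOperation) {
        return processed.computeIfAbsent(requestId, id -> businessOperation.apply(body));
    }
}
```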

Taking these requirements into consideration, it is clear that it is not true that “nobody needs reliable messaging”. Enterprise applications with automated processes that perform mission-critical tasks need the ability to perform those tasks reliably. If reliable message delivery is not handled at the REST layer, the responsibility for retry falls to the message sender. We still need reliable messaging; we must implement the requirement ourselves above REST, and this becomes troublesome without a standard framework that behaves nicely. If we accept that REST can provide only idempotence toward this goal, we must implement a standard framework to handle delivery failures, retry with exponential back-off, and divert poison messages for escalation. That is to say, we need a reliable messaging framework on top of REST.

[Note that when we speak of a “client” above, we are not talking about a user sitting in front of a Web browser. We are talking about one mission-critical enterprise application communicating with another in a choreography to accomplish some business transaction. An example of a choreography is the interplay between a buyer and a seller through the systems for commerce, quote, procurement, and order fulfillment.]

OLTP database requirements

Here is what I want from a database in support of enterprise applications for online transaction processing.

  1. ACID transactions – Enterprise CRM, ERP, and HCM applications manage data that is mission critical. People’s jobs, livelihoods, and businesses rely on this data to be correct. Real money is on the line.
  2. Document oriented – A JSON or XML representation should be the canonical way we think of objects stored in the database.
  3. Schema aware – A document should conform to a schema (JSON Schema or XML Schema). Information has a structure and meaning, and it should have a formal definition.
  4. Schema versioned – A document schema may evolve in a controlled manner. Software is life cycle managed, and its data needs to evolve with it for compatibility, upgrades, and migration.
  5. Relational – A subset of a document schema may be modeled as relational tables with foreign keys and indexes to support SQL queries, which can be optimized for high performance.

The fundamental shift is from a relational to a document paradigm as the primary abstraction. Relational structures continue to play an adjunct role to improve query performance for those parts of the document schema that are heavily involved in query criteria (WHERE clauses). The document paradigm enables the vast majority of data to be stored and retrieved without having to rigidly conform to a relational schema, which cannot evolve as fluidly. That is not to say that data stored outside of relational tables is less important or less meaningful; on the contrary, some of the non-relational data may be the most critical to the business. This approach simply recognizes that information not directly involved in query criteria can be treated differently, to take advantage of greater flexibility in schema evolution and life cycle management.

Ideally, the adjunct relational tables and SQL queries would be confined by the database to its internal implementation. When exposing a document abstraction to applications, the database should also present a document-oriented query language, such as XQuery or its JSON equivalent, implemented over SQL where appropriate as an optimization technique.

NoSQL database technology is often cited as supporting a document paradigm. NoSQL technologies as they exist today do not meet the need, because they do not support ACID transactions and they do not support adjunct structures (i.e., relational tables and indexes) to improve query performance in the manner described above.

Perhaps the next best thing would be to provide a Java persistent entity abstraction, much like EJB3/JPA, which would encapsulate the underlying representation in a document part (e.g., an XMLType or a JSON CLOB column) and a relational part, all stored in a SQL database. It would also provide JAXB-like serialization and deserialization to and from JSON and XML representations. This is not far from what EclipseLink does today.
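
A sketch of what such an entity might look like with plain JPA annotations (assuming a JPA 2.1+ provider; the table, column, and class names are illustrative): the canonical document is stored whole as a CLOB, while the few fields used in query criteria are promoted to indexed relational columns.

```java
import jakarta.persistence.*;

// Hybrid entity: document part plus relational adjunct, as described above.
@Entity
@Table(name = "CUSTOMER_ORDER",
       indexes = @Index(name = "IX_ORDER_STATUS", columnList = "STATUS"))
public class CustomerOrder {

    @Id
    @Column(name = "ORDER_ID")
    private String orderId;

    // Relational "adjunct" columns: cheap to index and query with SQL.
    @Column(name = "STATUS")
    private String status;

    @Column(name = "CUSTOMER_ID")
    private String customerId;

    // Canonical document representation of the whole object (JSON or XML).
    @Lob
    @Column(name = "DOCUMENT")
    private String documentJson;

    // Getters/setters and JSON/XML (de)serialization via a JAXB- or Jackson-style binder elided.
}
```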

Applied Cosmology: The Holographic Principle

The Holographic Principle says that a full description of a volume of space is encoded in the surface that bounds it. This arises from black hole thermodynamics, where the black hole entropy increases with its surface area, not its volume. Everything there is to know about the black hole’s internal content is on its boundary.

A software component has a boundary defined by its interfaces, which encapsulate everything an outsider needs to know to use it. Everything about its interior is represented by its surface at the boundary. It can be treated like a black box.

Applied Cosmology: The Self Similar Paradigm

Robert Oldershaw’s research on The Self Similar Cosmological Paradigm [http://www3.amherst.edu/~rloldershaw/concepts.html] recognizes that nature is organized in a stratified hierarchy, where every level is similar. The shapes and motions of atoms are similar to those of stellar systems, and the similarities extend from the stellar scale to the galactic scale and beyond.

Managing complexity greatly influences software design. Stratified hierarchy is familiar to this discipline.

At the atomic level, we organize our code into units. Each unit is a module with a boundary, which exposes an interface that governs how clients interact with the unit. The unit’s implementation is hidden behind this boundary, enabling it to change independently of other units, as much as possible.

We build upon units by reusing modules and integrating them together into larger units, which themselves are modular and reusable in the same way. Assembly of modules into integrated components is the bread and butter of object-oriented programming. This approach is able to scale up to the level of an application, which exhibits uniformity of platform technologies, programming language, design metaphors, conventions, and development resources (tools, processes, organizations).

The next level of stratification exists because this uniformity breaks down across applications. However, the similarity is unbroken. We remain true to the principles of modular reuse. We continue to define a boundary with interfaces that encapsulate the implementation. We continue to integrate applications as components into larger-scale components that can themselves be assembled further.

Enterprises are attempting to enable even higher levels of stratification. They define how an organization functions and how it interfaces with other organizations. This is with respect to protocols for human interaction as well as information systems. Organizations are integrated into business units that are integrated into businesses at local, national, multi-national, and global scales. Warren Buffett’s Berkshire Hathaway has demonstrated how entire enterprises exhibit such modular assembly.

This same pattern manifests itself across enterprises and across industries. A company exposes its products and services through an interface (branding, pricing, customer experience) which encapsulates its internal implementation. Through these protocols, we integrate across industries to create supply chains that provide ever more complex products and services.

Applied Cosmology: Machian Dynamics in Configuration Management

Julian Barbour wrote the book “The End of Time: The Next Revolution in Physics” [http://www.platonia.com/ideas.html]. He explains that our failure to unify General Relativity with Quantum Theory stems from our ill-conceived preoccupation with time as a necessary component of such a theory. A proper description of reality is composed of the relationships between real things, not a description with respect to an imaginary background (space and time). Therefore, all you have is a configuration of things, which undergoes a change in arrangement. The path through this configuration-space is what we perceive as the flow of time.

We apply this very model of the universe in configuration management.

Software release management is a configuration management problem, where the things in configuration-space are source files. A path through configuration space captures the versions of these source files relative to each other as releases of software are built. Our notion of time is with respect to these software releases.
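
A minimal sketch of that idea, with types of my own: a release is an immutable point in configuration-space mapping each source file to its version, and the ordered sequence of releases is the path we perceive as time.

```java
import java.util.Map;

// A release is a point in configuration-space: an immutable snapshot mapping
// each source file to the version it had when the release was built.
record Release(String name, Map<String, String> fileVersions) { }

// Example path through configuration-space (version ids are illustrative):
//   new Release("1.0", Map.of("app/Main.java", "a1b2c3", "lib/Util.java", "d4e5f6"))
//   new Release("1.1", Map.of("app/Main.java", "a1b2c3", "lib/Util.java", "0f9e8d"))
```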

Enterprise resource management in the communications industry involves many configuration management problems in various domains. We normally refer to such applications as Operations Support Systems.

In network resource management, the configuration-space includes the devices and other resources in the network, their connectivity, and the metadata associated with that connectivity arrangement (what is normally called a “device configuration”, a term we avoid here to prevent confusion with configuration-space).

In service resource management, the configuration-space includes services, their resource allocations, and the subscription metadata, or “design” (what is normally called a “service configuration”, a term we avoid here for the same reason).

We develop such applications using a notion of configuration-space, because such systems cannot operate in a world limited to a fixed background of space and time. We need to be able to travel backward and forward in time arbitrarily, to see how the world looked in the past from the perspective of a particular transaction. We need to be able to hypothesize many possible futures, perhaps only one of which is brought into reality through a rigorous process of analysis, design, planning, procurement, construction, and project management. Reality is always from the perspective of the observer, and one’s frame of reference is always somewhere on the path in configuration-space.
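
A rough sketch of the same idea in code, with types that are entirely my own: the system keeps a path of configuration states, the observer’s “now” is just a position on that path, and several hypothetical future paths can coexist until one is committed.

```java
import java.util.List;

// One arrangement of the things being managed (devices, services, allocations, ...).
record ConfigurationState(String id, String description) { }

// An ordered path through configuration-space.
record ConfigurationPath(List<ConfigurationState> states) {
    // The observer's frame of reference: how the world looked at a given step.
    ConfigurationState asOf(int step) {
        return states().get(step);
    }
}

// Committed history plus candidate futures, only one of which may become real.
record Plan(String name, ConfigurationPath committedHistory, List<ConfigurationPath> hypotheticalFutures) { }
```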

Software engineering is applied cosmology

Engineering is applied science. Some people believe that software engineering is applied computer science. In a limited sense, it is. But software is not entirely separated from hardware. Applications are not entirely separated from processes. Systems are not entirely separated from enterprises. Corporations are not entirely separated from markets. For this reason, I believe what we do is not software engineering at all. It is not limited to applied computer science. Our engineering discipline is actually applied cosmology.

what is wrong with TM Forum SID?

What is TM Forum?

The TM Forum is a standards organization for the communications industry. Rather than defining standards for particular network technologies, TM Forum is most interested in how to manage the processes across the entire service provider’s business. Their approach to modeling the communications problem domain divides the space into processes (eTOM), data (SID), applications (TAM), and integration. TM Forum has gained wide acceptance in the industry, and it has become the only game in town at the size and scale of its membership and audience.

What are eTOM and SID?

The business process framework and information framework together serve as an analysis model. They provide a common vocabulary and model for conceptualizing the communications problem space. As with any standards organization, the specifications are an amalgam of contributions from its diverse membership, and the result is formed by consensus.

The business process framework (eTOM) decomposes the business starting at the macro level and explores the parts at more granular levels of activities. Processes are described in terms of the actors, roles, responsibilities, goals, and the types of relevant information involved. There is no presumption of whether activities are performed by humans or automated systems.

The information framework (SID) decomposes the data into the product, service, and resource domains. Each domain is a model of entities and relationships presented as an object model using UML as the notation. The product domain is concerned with customer-facing and commercial interests. The service domain is concerned with abstracting the network resources into capabilities that can be parameterized for commercialization and that are of value to deliver to subscribers. The resource domain is concerned with the networks that enable services to be delivered.

What is wrong with TM Forum SID?

As an analysis model, the eTOM and SID do a decent job of helping people understand the problem space. However, these standards are problematic, because proponents promote them as being detailed and precise enough to be considered design models that can be translated directly into a software implementation. More accurately, the framework is promoted as a starting point, which implies the ability to extend it in a robust manner, and it is this position that I challenge. This position is stated in principle and demonstrated in practice through code generation tools. The separation of behavioral and structural modeling seems like a natural approach to organizing concepts, but at the level of detail required to design software, behavior and data need to come together cohesively as transactions. Object-oriented or service-oriented techniques would normally do this.

Because SID is represented in UML, it has the appearance of being an object model, but it omits behavior in the form of operations and their signatures. This is explained away by delegating the responsibility for interface specifications to the integration framework. That certainly contradicts the position that SID is a design model that translates into implementation.

Missing detailed transactional behavior

The integration framework tries to define software interfaces that directly use the SID entities. The result is a collection of interfaces that only superficially represents the behavior of the system. Because the methodology separates process modeling from data modeling, the interfaces are very CRUD-like, and the majority of behavior is assumed to continue residing in process integration (i.e., BPEL). This works well for service-provider-specific business processes, but it is the wrong approach for defining transactional behavior that is intrinsic to the domain. These behaviors include life cycle management, capacity management, utilization and availability, compatibility and eligibility, and topology (rules and constraints for designing services and networks).

Because eTOM does not describe behavior to this level of detail, SID does not define the entities and relationships to a level of detail that supports these essential behaviors, and consequently neither do the interfaces defined by the integration framework. The problem is that many in the community view the model as robust, with rich and unambiguous semantics, when this is far from true. If the very generic-sounding elements of the SID are used to model some of the behaviorally rich capabilities like resource utilization, we quickly discover missing attributes, missing relationships, and missing entities. Even worse, we may find that the existing elements conflict with how the model needs to be defined to support the behavior. Sometimes a simple attribute needs to become an entity (or a pattern of entities) with a richer set of attributes and relationships. Sometimes relationships on a generalized entity are repeated in a specialized way on a specialized entity, which is ambiguous for implementation. These issues undermine the extensibility of the framework and make the idea that it can easily be used as a starting point (extend by adding) very suspect, because extensions require radical, destabilizing redesign.

Software design is a complex set of trade-offs that balances many opposing forces, including performance, scalability, availability, and maintainability, alongside the functional requirements. SID cannot possibly be a design model, because none of these forces are considered. I strongly believe it would be an error to accept SID as a design model for direct implementation in software. I believe that any reasonable software implementation will have a data model that looks radically different from SID, but elements of the software interface will have a conceptual linkage to SID as its intellectual ancestry. If we think of SID in this capacity, we have some hope for a viable implementation.

designs are useless

“In preparing for battle I have always found that plans are useless, but planning is indispensable.” -Eisenhower (from Planning Extreme Programming)

I believe this is true, because “no plan survives contact with the enemy”. In software, the enemy is in the form of dark spirits hidden within the code, both legacy and yet to be written. Because plans (the schedule of work to be done) are intimately tied to designs (models of the software), it must also be true that no design survives contact with the enemy. Any programmer who begins writing code based on a preconceived design will almost immediately feel the pain of opposing forces that beg to be resolved through refactoring; programmers who lack this emotional connection with their code are probably experiencing a failure of imagination (to improve the design).

Therefore, I think we can return to the original quote and state its corollary: designs are useless, but designing is indispensable.

All this is to say that, for the above reasons, I think the process artifacts (e.g., Functional Solution Approach, Functional Design, Technical Design) are more or less useless, because as soon as they are written they are obsoleted by improvements discovered contemporaneously with each line of code written; but the act of producing them (the thought invested in designing) is indispensable to producing good software. This leads me to conclude that we should not fuss so much about the actual content of the artifacts, so long as they capture the essence of a fruitful journey through designing: that problems have been thought through and that decisions have been based on good reasoning. Worrying about the content being perfectly precise, comprehensive, and consistent ends up being a waste of effort, since the unrelenting act of designing will have already moved beyond the snapshot captured in the artifact.

Coincidentally, this theme also aligns with the notion of a learning organization espoused by Lean. The value of designing is the facilitation of learning.