
Applied Cosmology: Self Similar Paradigm

Robert Oldershaw’s research on The Self Similar Cosmological Paradigm recognizes that nature is organized in a stratified hierarchy, where every level is similar. The shapes and motions of atoms resemble those of stellar systems, and the similarities extend from the stellar scale to the galactic scale and beyond.

Managing complexity greatly influences software design. Stratified hierarchy is familiar to this discipline.

At the atomic level, we organize our code into units. Each unit is a module with a boundary, which exposes an interface governing how clients interact with the unit. The unit’s implementation is hidden behind this boundary, enabling it to change independently of other units, as much as possible.
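
As a minimal Java sketch of this idea (the names are invented for illustration): the interface is the boundary that clients depend on, and the implementation stays package-private so that it can change without any client being aware.

    // RateCalculator.java - the unit's boundary: clients see only this.
    public interface RateCalculator {
        double rateFor(String planId);
    }

    // TieredRateCalculator.java - the hidden implementation: package-private,
    // free to change independently of every client.
    class TieredRateCalculator implements RateCalculator {
        @Override
        public double rateFor(String planId) {
            // Internal details may evolve behind the boundary.
            return planId.startsWith("premium") ? 0.05 : 0.10;
        }
    }

    // Rates.java - even the choice of implementation stays behind the boundary.
    public final class Rates {
        private Rates() {}
        public static RateCalculator standard() {
            return new TieredRateCalculator();
        }
    }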

We build upon units by reusing modules and integrating them together into larger units, which themselves are modular and reusable in the same way. Assembly of modules into integrated components is the bread and butter of object-oriented programming. This approach is able to scale up to the level of an application, which exhibits uniformity of platform technologies, programming language, design metaphors, conventions, and development resources (tools, processes, organizations).

The next level of stratification exists because uniformity cannot be maintained across applications. The similarity, however, is unbroken. We remain true to the principles of modular reuse. We continue to define a boundary with interfaces that encapsulate the implementation. We continue to integrate applications as components into larger scale components that themselves can be assembled further.

Enterprises are attempting to enable even higher levels of stratification. They define how an organization functions and how it interfaces with other organizations. This is with respect to protocols for human interaction as well as information systems. Organizations are integrated into business units that are integrated into businesses at local, national, multi-national, and global scales. Warren Buffett’s Berkshire Hathaway has demonstrated how entire enterprises exhibit such modular assembly.

This same pattern manifests itself across enterprises and across industries. A company exposes its products and services through an interface (branding, pricing, customer experience) which encapsulates its internal implementation. Through these protocols, we integrate across industries to create supply chains that provide ever more complex products and services.

Applied Cosmology: Machian Dynamics

Julian Barbour wrote the book titled “The End of Time: The Next Revolution in Physics” [http://www.platonia.com/ideas.html]. He argues that our failure to unify General Relativity with Quantum Theory stems from an ill-conceived preoccupation with time as a necessary component of such a theory. According to Machian Dynamics, a proper description of reality consists of the relationships between real things, not a description against an imaginary background (space and time). All you have, then, is a configuration of things that undergoes changes in arrangement. The path through this configuration-space is what we perceive as the flow of time.

We apply this very model of the universe in configuration management.

Software release management is a configuration management problem. The things in configuration-space are source files. A path through configuration space captures the versions of these source files relative to each other as releases of software are built. Our notion of time is with respect to these software releases.
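
A hedged sketch of this view, with invented types in Java: a release is a point in configuration-space (a map from source file to version), and the sequence of releases is the path that we perceive as time.

    import java.util.List;
    import java.util.Map;

    // A point in configuration-space: every source file pinned to a version.
    record Configuration(Map<String, String> fileVersions) {}

    // A path through configuration-space: releases in the order they were built.
    // Our notion of time is simply position along this path.
    record ReleaseHistory(List<Configuration> path) {
        Configuration release(int index) {
            return path.get(index);
        }
    }

    class ReleaseDemo {
        public static void main(String[] args) {
            var r1 = new Configuration(Map.of("Main.java", "1.0", "Util.java", "1.0"));
            var r2 = new Configuration(Map.of("Main.java", "1.1", "Util.java", "1.0"));
            var history = new ReleaseHistory(List.of(r1, r2));
            System.out.println(history.release(1).fileVersions()); // the world at release 2
        }
    }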

Enterprise resource management in the communications industry involves many configuration management problems in various domains. We normally refer to such applications as Operations Support Systems.

In network resource management, the configuration-space includes the devices and other resources in the network, their connectivity, and the metadata associated with that connectivity arrangement (normally called a “device configuration,” a term avoided here to prevent confusion with configuration-space).

In service resource management, the configuration-space includes services, their resource allocations, and the subscription metadata or “design” (normally called a “service configuration,” a term avoided here for the same reason).

Such applications have a notion of configuration-space because they cannot operate in a world limited to a dependence on a background of space and time. We need to be able to travel backward and forward in time arbitrarily, to see how the world looked in the past from the perspective of a particular transaction. These applications enable users to hypothesize many possible futures, perhaps only one of which is brought into reality through a rigorous process of analysis, design, planning, procurement, construction, and project management. Reality is always from the perspective of the observer, and one’s frame of reference is always somewhere on the path in configuration-space.
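
To make the time-travel requirement concrete, here is a small sketch with invented types: every change is appended to the path through configuration-space, and a view of the world is reconstructed as of any position on that path.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // One unit of change in configuration-space: an entity takes a new state.
    record ChangeEvent(String entityId, String newState) {}

    final class ConfigurationSpace {
        private final List<ChangeEvent> path = new ArrayList<>();

        void apply(ChangeEvent event) {
            path.add(event);
        }

        // Reconstruct the world as of any position on the path; this is how a
        // transaction travels backward and forward in time arbitrarily.
        Map<String, String> worldAsOf(int position) {
            Map<String, String> world = new HashMap<>();
            for (int i = 0; i < position && i < path.size(); i++) {
                ChangeEvent e = path.get(i);
                world.put(e.entityId(), e.newState());
            }
            return world;
        }
    }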

Software engineering is applied cosmology

Engineering is applied science. Some people believe that software engineering is applied computer science. In a limited sense, it is. But software is not entirely separated from hardware. Applications are not entirely separated from processes. Systems are not entirely separated from enterprises. Corporations are not entirely separated from markets. For this reason, I believe what we do is not software engineering at all. It is not limited to applied computer science. Our engineering discipline is actually applied cosmology.

what is wrong with TM Forum SID?

What is TM Forum?

The TM Forum is a standards organization for the communications industry. Rather than defining standards for particular network technologies, TM Forum is most interested in how to manage processes across a service provider’s entire business. Its approach to modeling the communications problem domain divides the space into processes (eTOM), data (SID), applications (TAM), and integration. TM Forum has gained wide acceptance in the industry, and at the size and scale of its membership and audience it has become the only game in town.

What are eTOM and SID?

The business process framework and information framework together serve as an analysis model. They provide a common vocabulary and model for conceptualizing the communications problem space. As with any standards organization, the specifications are an amalgam of contributions from its diverse membership, and the result is formed by consensus.

The business process framework (eTOM) decomposes the business starting at the macro level and explores the parts at increasingly granular levels of activity. Processes are described in terms of the actors, roles, responsibilities, goals, and the types of relevant information involved. There is no presumption about whether activities are performed by humans or by automated systems.

The information framework (SID) decomposes the data into the product, service, and resource domains. Each domain is a model of entities and relationships presented as an object model using UML as the notation. The product domain is concerned with customer-facing and commercial interests. The service domain is concerned with abstracting the network resources into capabilities that can be parameterized for commercialization and that are of value to deliver to subscribers. The resource domain is concerned with the networks that enable services to be delivered.

What is wrong with TM Forum SID?

As an analysis model, the eTOM and SID do a decent job of helping people understand the problem space. However, these standards are problematic, because proponents promote them as detailed and precise enough to be considered design models that can be translated directly into a software implementation. More accurately, the framework is promoted as a starting point, which implies the ability to extend it in a robust manner, and it is this position that I challenge. The position is stated in principle and demonstrated in practice through code generation tools. The separation of behavioral and structural modeling seems like a natural approach to organizing concepts, but at the level of detail needed to design software, behavior and data must come together cohesively as transactions. Object-oriented or service-oriented techniques would normally do this.

Because SID is represented in UML, it has the appearance of an object model, but it omits behavior in the form of operations and their signatures. This is explained away by delegating responsibility for interface specifications to the integration framework, which plainly contradicts the position that SID is a design model that translates into an implementation.

Missing detailed transactional behavior

The integration framework tries to define software interfaces that directly use the SID entities. The result is a collection of interfaces that only superficially represents the behavior of the system. Because the methodology separates process modeling from data modeling, the interfaces are very CRUD-like, and the majority of behavior is assumed to continue residing in process integration (i.e., BPEL). That would work well for service-provider-specific business processes, but it is the wrong approach for defining transactional behavior that is intrinsic to the domain. These behaviors include life cycle management, capacity management, utilization and availability, compatibility and eligibility, and topology (rules and constraints for designing services and networks).
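
To make the contrast concrete, here is a hedged Java sketch; the interfaces and types are invented for illustration and come from no TM Forum specification. The first interface is CRUD-like, saying nothing about the domain’s rules, so the behavior must live in orchestration; the second expresses behavior intrinsic to the domain, such as allocation against capacity, as an operation in its own right.

    // Minimal stand-in types for the sketch.
    record Resource(String id, int totalCapacity) {}
    record Allocation(String resourceId, int units) {}
    class InsufficientCapacityException extends Exception {}

    // CRUD-like: the interface says nothing about the domain's rules, so the
    // real behavior must live elsewhere (e.g., in BPEL process integration).
    interface ResourceRepository {
        Resource read(String id);
        void create(Resource resource);
        void update(Resource resource);
        void delete(String id);
    }

    // Transactional behavior intrinsic to the domain, expressed first class:
    // allocation enforces capacity as part of the operation itself.
    interface ResourceManager {
        Allocation allocate(String resourceId, int units) throws InsufficientCapacityException;
        void release(Allocation allocation);
    }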

Because eTOM does not describe behavior to this level of detail, SID does not define the entities and relationships to a level of detail that supports these essential behaviors. Consequently, neither do the interfaces defined by the integration framework. The problem is that many in the community view the model as robust, with rich and unambiguous semantics, when this is far from true. If the very generic-sounding elements of the SID are used to model behaviorally rich capabilities like resource utilization, we quickly discover missing attributes, missing relationships, and missing entities. Even worse, we may find that the existing elements conflict with how the model needs to be defined to support the behavior. Sometimes a simple attribute needs to become an entity (or a pattern of entities) with a richer set of attributes and relationships. Sometimes relationships on a generalized entity are repeated in a specialized way on a specialized entity, which is ambiguous for implementation. These issues undermine the robust extensibility of the framework and make the idea that it is easily usable as a starting point (extend by adding) very suspect, because extensions require radical, destabilizing redesign.
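
An illustrative Java sketch of the attribute-to-entity promotion just described; the classes are invented for this example and are not actual SID elements.

    import java.time.Instant;
    import java.util.ArrayList;
    import java.util.List;

    // Before: utilization looks like a simple attribute. A bare number cannot
    // say who consumes the capacity, how much of it, or over what interval.
    class PortBefore {
        String id;
        int utilizationPercent;
    }

    // After: supporting real utilization behavior forces the attribute to grow
    // into an entity with its own attributes and relationships.
    class PortAfter {
        String id;
        List<UtilizationRecord> utilization = new ArrayList<>();
    }

    class UtilizationRecord {
        String consumerId;  // which service consumes the capacity
        int units;          // how much of it is consumed
        Instant from;       // when the consumption starts
        Instant until;      // when it ends
    }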

Software design is a complex set of trade-offs that balances many opposing forces, including performance, scalability, availability, and maintainability, alongside the functional requirements. SID cannot possibly be a design model, because none of these forces is considered. I strongly believe it would be an error to accept SID as a design model for direct implementation in software. I believe that any reasonable software implementation will have a data model that looks radically different from SID, but elements of the software interface will have a conceptual linkage to SID as their intellectual ancestry. If we think of SID in this capacity, we have some hope for a viable implementation.

designs are useless – like planning

“In preparing for battle I have always found that plans are useless, but planning is indispensable.” -Eisenhower (from Planning Extreme Programming)

I believe this is true, because “no plan survives contact with the enemy”. In software, the enemy takes the form of dark spirits hidden within the code, both legacy and yet to be written. Because plans (the schedule of work to be done) are intimately tied to designs (models of the software), it must also be true that no design survives contact with the enemy. Any programmer who begins writing code from a preconceived design will almost immediately feel the pain of opposing forces that beg to be resolved through refactoring; programmers who lack this emotional connection with their code are probably experiencing a failure of imagination (to improve the design).

Therefore, I think we can return to the original quote and state its corollary: designs are useless, but designing is indispensable.

All this is to say that, for the above reasons, the process artifacts (e.g., Functional Solution Approach, Functional Design, Technical Design) are more or less useless: as soon as they are written, they are rendered obsolete by improvements discovered contemporaneously with each line of code written. But the act of producing them (the thought invested in designing) is indispensable to producing good software.

This leads me to conclude that we should not fuss so much over the actual content of the artifacts, so long as they capture enough to show a fruitful journey through designing: that problems have been thought through and decisions have been based on good reasoning. Worrying about the content being perfectly precise, comprehensive, and consistent is a waste of effort, since the unrelenting act of designing will have already moved beyond the snapshot captured in the artifact.

Coincidentally, this theme also aligns with the notion of a learning organization espoused by Lean. The value of designing is the facilitation of learning.

transparent persistence

Advantages

Transparent persistence has moved into the mainstream over the past few years with the popularity of JDO and JPA for enterprise application development. This approach offers the following advantages.

  1. Domain modeling – the domain is expressed naturally as plain old Java objects (POJOs), without hand-coding the SQL or JDBC calls that persistence traditionally requires.
  2. Navigation through relationships – objects are naturally related through references, and navigating a relationship automatically loads the related object on demand.
  3. Automatic storage – modified objects are stored automatically when the transaction is committed.
  4. Persistence by reachability – related objects are automatically stored if they are reachable from another persistent object.

The programming model is improved by eliminating the tedium that is traditionally associated with object persistence. Loading, storing, and querying are all expressed in terms of the Java class and field names, as opposed to the physical schema names. The programmer is largely insulated from the impedance mismatch between Java objects and the relational database. The software can be expressed purely in terms of the domain model, as represented by Java objects.
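
A brief sketch of these advantages in JPA; the entity names are invented for illustration. Note that in JPA, persistence by reachability is opted into through cascade settings, whereas JDO provides it by default.

    import javax.persistence.CascadeType;
    import javax.persistence.Entity;
    import javax.persistence.FetchType;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.ManyToOne;

    @Entity
    public class Equipment {
        @Id @GeneratedValue
        private Long id;

        private String name;

        // Navigating getLocation() loads the related Location on demand.
        // The cascade setting opts this relationship into persistence by
        // reachability: a new Location reachable from a persistent Equipment
        // is stored automatically.
        @ManyToOne(fetch = FetchType.LAZY, cascade = CascadeType.PERSIST)
        private Location location;

        public Location getLocation() { return location; }
        public void setLocation(Location location) { this.location = location; }
        public void setName(String name) { this.name = name; }
    }

    @Entity
    class Location {
        @Id @GeneratedValue
        private Long id;
        private String address;
    }

    // Within a transaction, modifications are stored automatically at commit;
    // no explicit save call is needed:
    //   Equipment e = entityManager.find(Equipment.class, 42L);
    //   e.setName("Excavator");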

Challenges

When developing domain objects, persistence is only one aspect. The business logic that applies to the graph of related objects is the most important concern. Transparent persistence makes it challenging to execute the business logic that enforces constraints and complex business rules when persistent objects are created, updated, and deleted through reachability.

For example, an equipment rental application may need to enforce the following constraints:

  • When creating equipment, it must be related to a location.
  • When creating a rental, it must be related to a customer, and the equipment must be available for the duration of the rental.
  • When updating a rental, the equipment must remain available for the duration of the rental.

JPA 2.0 does not provide sufficient mechanisms for enforcing these constraints when creating or updating these entities through reachability. The responsibility falls on a service object that manages these graphs of entities, and the constraint checking must be enforced by the service object per transaction. Java EE 5 provides no assistance in ensuring that the constraint checks (implemented in Java) are deferred until commit, so that they are not repeated when performing a sequence of operations in the same transaction.
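
A sketch of what that burden looks like in practice, using invented types from the rental example above and an in-memory stand-in for the persistence layer. Every operation that can touch a rental must repeat these checks; nothing defers them to commit.

    import java.time.LocalDate;
    import java.util.ArrayList;
    import java.util.List;

    // Minimal stand-ins for the entities named in the constraints above.
    record DateRange(LocalDate start, LocalDate end) {
        boolean overlaps(DateRange other) {
            return start.isBefore(other.end()) && other.start().isBefore(end);
        }
    }
    record Customer(String id) {}
    record Equipment(String id) {}
    record Rental(Customer customer, Equipment equipment, DateRange period) {}

    class RentalService {
        private final List<Rental> rentals = new ArrayList<>(); // stand-in for the data store

        // The service object must repeat checks like these in every operation
        // that creates or updates a rental.
        Rental createRental(Customer customer, Equipment equipment, DateRange period) {
            if (customer == null) {
                throw new IllegalArgumentException("a rental requires a customer");
            }
            boolean clash = rentals.stream().anyMatch(r ->
                    r.equipment().equals(equipment) && r.period().overlaps(period));
            if (clash) {
                throw new IllegalStateException("equipment not available for the period");
            }
            Rental rental = new Rental(customer, equipment, period);
            rentals.add(rental);
            return rental;
        }
    }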

Adding a preCommit event to a persistent object would provide a good place for expressing constraints. Allowing this event to be deferred until transaction commit would provide the proper optimization for good performance. Of course, preCommit would need to prevent any further modifications to the persistent objects enlisted in the transaction. This would factor out many of the invariants so that they are expressed per entity, removing the responsibility from every operation on service objects, which is prone to programmer error. The domain model would be greatly improved.

java christmas wish list 2008

JPA preCommit

Similar to the other events (PrePersist, PreRemove, PostPersist, PostRemove, PreUpdate, PostUpdate, and PostLoad), JPA needs to add a preCommit event. This would be useful for enforcing constraints (invariants) using Java logic, similar to how less expressive deferred constraints can be enforced in SQL.
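
Here is a sketch of what such a callback might look like, modeled on the existing annotations; @PreCommit is hypothetical and does not exist in JPA.

    import java.time.LocalDate;
    import javax.persistence.Entity;
    import javax.persistence.Id;

    // HYPOTHETICAL annotation, analogous to javax.persistence.PreUpdate;
    // it does not exist in JPA.
    @interface PreCommit {}

    @Entity
    class Rental {
        @Id
        private Long id;
        private LocalDate start;
        private LocalDate end;

        // HYPOTHETICAL: invoked once at commit, after all modifications in the
        // transaction are complete, so the invariant is checked exactly once
        // no matter how many operations touched this entity.
        @PreCommit
        void checkInvariants() {
            if (start == null || end == null || !start.isBefore(end)) {
                throw new IllegalStateException("rental period must be non-empty");
            }
        }
    }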

read-only transaction

javax.transaction.UserTransaction needs the ability to begin a transaction with an awareness of whether the transaction will be read-only or read-write. A read-only transaction would prevent writes (inserts, updates, and deletes) from being done.
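
A sketch of one possible API shape; this overload of begin is hypothetical, since the standard javax.transaction.UserTransaction offers only begin().

    import javax.transaction.NotSupportedException;
    import javax.transaction.SystemException;
    import javax.transaction.UserTransaction;

    // HYPOTHETICAL extension of the standard interface.
    interface ModalUserTransaction extends UserTransaction {
        enum Mode { READ_ONLY, READ_WRITE }

        // A READ_ONLY transaction would reject inserts, updates, and deletes,
        // and could also skip dirty-checking and write-locking overhead.
        void begin(Mode mode) throws NotSupportedException, SystemException;
    }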

dynamic immutability

It would be helpful if an instance of an object could be mutable when used by some classes (e.g., builder, factory, repository, deserializer) and immutable when used by others. This would make it possible to load persistent objects from a data store, derive transient fields from persistent fields, and then mark the instance as immutable if the transaction is read-only. I do not want to develop entities that have both a mutable class and an immutable class, and access control (private, protected) is not sufficient when mutability depends on context (e.g., a read-only transaction).
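
The closest approximation today is a hand-written freeze flag, sketched below with invented names; the wish is for the language or runtime to enforce immutability by context instead of requiring a check in every setter.

    class Account {
        private String owner;
        private boolean frozen;

        // Builders, factories, repositories, and deserializers mutate freely,
        // then freeze the instance before handing it to a read-only context.
        void setOwner(String owner) {
            if (frozen) {
                throw new IllegalStateException("instance is immutable in this context");
            }
            this.owner = owner;
        }

        void freeze() { frozen = true; }
        String getOwner() { return owner; }
    }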

Java technology challenges

Here is my Java wish list.

  1. module deployment – Java has done well to define an archive format and class loading system that enables code to be organized into libraries. However, a modular application needs more than the deployment of Java classes. It also needs to accommodate the following:
    • SQL DDL for initial database schema creation and on-going evolution
    • SQL DML and possibly Java code for upgrading data as the schema evolves and the application features are upgraded
    • XML documents and binary resources containing data that needs to be processed by the application and possibly loaded into the database
    • HTML pages or templates (e.g., Tapestry) that can be dynamically added to a Web application
    • Scripts (e.g., Groovy), rules, or other forms of code that can be dynamically executed.
  2. module dependency – Java needs a better way to dynamically enable or disable behaviors depending on whether modules of code are available, similar to how function_exists(f) works in PHP, without resorting to invoking methods through Java reflection. A static programming model should be possible that still allows a block of code to execute conditionally, only when another class is loadable (the current reflective workaround is sketched below).
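
For contrast, today’s workaround is sketched below; the module class name is hypothetical. Probing for the class is easy, but actually using it from statically compiled code still requires reflection or a separate compilation unit, which is exactly the awkwardness the wish addresses.

    final class OptionalFeature {
        // Roughly the Java analogue of PHP's function_exists(f): probe whether
        // a module's class is loadable before enabling dependent behavior.
        static boolean isAvailable(String className) {
            try {
                Class.forName(className);
                return true;
            } catch (ClassNotFoundException e) {
                return false;
            }
        }

        public static void main(String[] args) {
            if (isAvailable("org.example.reporting.ReportEngine")) { // hypothetical module
                System.out.println("reporting module enabled");
            } else {
                System.out.println("reporting module not installed");
            }
        }
    }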

universe of events – cosmology in software

On my second reading of Three Roads to Quantum Gravity by Lee Smolin, the concept of a relational universe stands out as something fundamentally important.

Each measurement is supposed to reveal the state of the particle, frozen at some moment of time. A series of measurements is like a series of movie stills — they are all frozen moments.

The idea of a state in Newtonian physics shares with classical sculpture and painting the illusion of the frozen moment. This gives rise to the illusion that the world is composed of objects. (p.53)

In object oriented programming, the objects correspond to the particles. The focus is on capturing the state of the object, frozen at some moment of time. As methods are called on the object, changes to its state (variables) are like a series of movie stills.

Lee Smolin goes on to write:

If this were really the way the world is, then the primary description of something would be how it is, and change in it would be secondary. Change would be nothing but alterations in how something is. But relativity and quantum theory each tell us that this is not how the world is. They tell us — no, better they scream at us — that our world is a history of processes. Motion and change are primary. Nothing is, except in a very approximate and temporary sense. How something is, or what its state is, is an illusion. It may be a useful illusion for some purposes, but if we want to think fundamentally we must not lose sight of the essential fact that ‘is’ is an illusion. So to speak the language of the new physics we must learn a vocabulary in which process is more important than, and prior to, stasis. Actually, there is already available a suitable and very simple language which you will have no trouble understanding.

From this new point of view, the universe consists of a large number of events. An event may be thought of as the smallest part of a process, a smallest unit of change. But do not think of an event happening to an otherwise static object. It is just a change, no more than that.

The universe of events is a relational universe. That is, all its properties are described in terms of relationships between the events. The most important relationship that two events can have is causality. This is the same notion of causality that we found was essential to make sense of stories.

If objects are merely an illusion, and it is really causal events that are fundamental to modeling a universe that is relational and dynamical, then perhaps we should re-examine how effective object-oriented programming is at producing software that models real world processes. Classes of objects definitely focus on the static structure of the universe. The methods on these classes can be considered to correspond to events, which carry information in, perform some computation, and carry information out. However, the causal relationships between events are buried in the procedural code within each method; they are not expressed in a first-class manner.

Personal productivity applications like spreadsheets and word processors model objects (e.g., documents) and relationships that undergo relatively simple processes involving only a few actors. The causal history of events is not as important, because a document contains only one set of objects among which to maintain integrity, and the series-of-frozen-moments model of the universe works rather well. Enterprise applications such as Enterprise Resource Planning (ERP) facilitate a multitude of parallel business processes that involve many actors and sophisticated collaborations. Each actor performs transactions against some subset of objects, each of which progresses through a distinct life cycle. Maintaining integrity among the objects changed by these many concurrent events is incredibly complicated. It becomes important to keep a causal history of events in addition to the current state of the universe, as well as a schedule of future events (for planning) that have not yet come to pass. A series of frozen moments becomes less appealing, whereas a set of processes and events seems like a better description of the universe.
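
A hedged Java sketch of events and their causal relationships as first-class data; the types are invented. Each event records what changed and which events caused it, so the causal history is queryable rather than buried in procedural code.

    import java.util.ArrayList;
    import java.util.List;

    // An event: the smallest unit of change, plus the events that caused it.
    record Event(String id, String description, List<String> causedBy) {}

    final class CausalHistory {
        private final List<Event> events = new ArrayList<>();

        void record(Event event) {
            events.add(event);
        }

        // Causality is queryable as data, rather than implicit in the ordering
        // of procedural method calls.
        List<Event> effectsOf(String causeId) {
            List<Event> effects = new ArrayList<>();
            for (Event e : events) {
                if (e.causedBy().contains(causeId)) {
                    effects.add(e);
                }
            }
            return effects;
        }
    }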

programming for non-programmers

Users want to express their desires to a computer and have the computer fulfill their wishes on demand. We frequently see this desire manifested as software “requirements” for configurability or customization without the need for programming. Users view programming as a highly technical, error prone, and cumbersome chore that should be left to professional software developers.

Users would like developers to produce software that is capable of intelligently adapting its structure and behavior to the needs of the day, as the environment and the business requirements change. They believe they should be able to inform the software of these requirements through intuitive visual techniques. Click on a tool and fill in some information. Drag and drop some icons across the screen. New computations, new algorithms, and new ways of doing business should be configurable with a few clicks by users, who know nearly nothing about how a computer executes instructions, the nature of such instructions, and how such instructions come into existence.

What users do not realize is that the act of expressing themselves through clicks and gestures to produce instructions for the computer is itself an act of programming. The complexity involved in formulating those instructions is proportional to the degree of intelligence intrinsic to the computation. The breadth of requirements that can be satisfied by those instructions is a function of how much can be expressed by the user. Narrow coverage of requirements implies an inflexible system that targets a niche problem domain. Broad coverage of requirements implies a general purpose system.

General purpose programming languages have traditionally been expressed as text. Like all languages, programming languages have a vocabulary (words formed from a character set) and a grammar, which governs how those words are organized in meaningful ways. Textual programming languages allow humans to express themselves concisely in a form that is both precise enough for a computer to comprehend and similar enough to natural language (e.g., English) to be intuitive.

General purpose programming languages have naturally evolved from primitive instructions close to the machine (e.g., assembly language) toward abstractions that correspond to natural language. Text is a highly efficient and flexible medium for communicating ideas. Graphical techniques may be suitable for expressing ideas that can be tangibly visualized, but images are wholly inappropriate for non-visual ideas. An image may convey a thousand words, but controlling exactly which words are formulated is not easy with visual techniques.

We can expect programming languages to continue to advance, becoming more intuitive to humans and more efficient at expressing complex instructions to machines. We can also expect graphical techniques to continue to evolve, making programming easier for both professional programmers and users who do not want to program. What we must accept is that instructing a machine is programming, regardless of whether it is expressed as text or through visual techniques.