OLTP database requirements

Here is what I want from a database in support of enterprise applications for online transaction processing.

  1. ACID transactions – Enterprise CRM, ERP, and HCM applications manage data that is mission critical. People’s jobs, livelihoods, and businesses rely on this data to be correct. Real money is on the line.
  2. Document oriented – A JSON or XML representation should be the canonical way that we should think of objects stored in the database.
  3. Schema aware – A document should conform to a schema (JSON Schema or XML Schema). Information has a structure and meaning, and it should have a formal definition.
  4. Schema versioned – A document schema may evolve in a controlled manner. Software is life cycle managed, and its data needs to evolve with it for compatibility, upgrades, and migration.
  5. Relational – A subset of a document schema may be modeled as relational tables with foreign keys and indexes to support SQL queries, which can be optimized for high performance.

The fundamental shift is from a relational to a document paradigm as the primary abstraction. Relational structures continue to play an adjunct role to improve query performance for those parts of the document schema that are heavily involved in query criteria (WHERE clauses). The document paradigm enables the vast majority of data to be stored and retrieved without having to rigidly conform to relational schema, which cannot evolve as fluidly. That is not to say that data stored outside of relational tables is less important or less meaningful. To the contrary, some of the non-relational data may be the most critical to the business. This approach is simply recognizing information that is not directly involved in query criteria can be treated differently to take advantage of greater flexibility in schema evolution and life cycle management.

Ideally, the adjunct relational tables and SQL queries would be confined by the database to its internal implementation. When exposing a document abstraction to applications, the database should also present a document-oriented query language, such as XQuery or its equivalent for JSON, which would be implemented as SQL, where appropriate as an optimization technique.

NoSQL database technology is often cited as supporting a document paradigm. NoSQL technologies as they exist today do not meet the need, because they do not support ACID transactions and they do not support adjunct structures (i.e., relational tables and indexes) to improve query performance in the manner described above.

Perhaps the next best thing would be to provide a Java persistent entity abstraction, much like EJB3/JPA, which would encapsulate the underlying representation in a document part (e.g., as a XMLType or a JSON CLOB column) and a relational part, all stored in a SQL database. This would also provide JAXB-like serialization and deserialization to and from JSON and XML representations. This is not far from what EclipseLink does today.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.