
The fallacy of the data layer -- or, a new architectural model for software designs

Most data sources are not exclusive to a specific application. They are shared by numerous applications. Thus, advanced architects will challenge the traditional three-layer/three-tier view of applications and consider a new architectural model for software designs.

It is commonly held as a truth that applications have a UI layer, a business layer and a data layer. In most of my presentations and writing I use a four-layer model: UI, business, data access and data storage. In this case the "data storage" layer is really the same as the traditional data layer in a three-layer model.

But I want to challenge this idea of a data layer. Over the past few months, in discussing service-oriented architecture (SOA) as well as distributed object-oriented architecture, I have become increasingly convinced that the idea of a data tier, data layer or data storage layer is fundamentally flawed.

Note that in this article I use the word "layer" to describe logical separation between concepts regardless of physical configuration, while "tier" means a physical separation. That said, the reality is that a typical data layer really is also a data tier, because most data layers exist in the form of SQL Server, Oracle or some other server-based database engine.

Service-oriented design leads us to the idea that any software component used by more than one other software component (that is, by more than one "client") should be a service. A service is a powerful unit of reuse, providing a contractual interface by which clients can interact with the service.

More importantly, a service defines a trust boundary. This means that a service protects itself against invalid usage by clients. This is one of the defining elements of service-orientation and is a key benefit. The benefit is that a service can safely service many disparate clients because it trusts none of them. The service ensures that all clients get the same behaviors, and that any data sent to the service by any client is checked for validity based on the rules of the service.
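As a rough illustration of a service that trusts none of its clients, here is a minimal Python sketch. The `OrderService` class, its rules (a known product code and a positive quantity) and every other name are hypothetical, invented purely for this example. The only point is that the service runs its own checks on every request, whichever client sent it.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    """Raised when a request breaks the service's own rules."""


@dataclass(frozen=True)
class OrderRequest:
    product_code: str
    quantity: int


class OrderService:
    """Hypothetical service: trusts no client, validates every request."""

    KNOWN_PRODUCTS = {"A100", "B200"}  # illustrative rule data, not from the article

    def __init__(self) -> None:
        self.accepted = []

    def place_order(self, request: OrderRequest) -> int:
        # The same checks run for every client, however trustworthy it looks.
        if request.product_code not in self.KNOWN_PRODUCTS:
            raise ValidationError(f"unknown product: {request.product_code!r}")
        if request.quantity <= 0:
            raise ValidationError("quantity must be positive")
        self.accepted.append(request)
        return len(self.accepted)  # running order number
```

Because the checks live inside the service, a Windows client, a web client and a batch job all get the same behavior, even if one of them skips its own validation.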

My contention is that the traditional "data layer" is the ideal candidate to be a service.

Consider all the layers in a typical application:

  • User or other consumer
  • Presentation technology
  • UI logic
  • Business logic
  • Data access logic
  • Data source

We tend to think of these as a stack, from user down to persistent data storage, with our application logic like a sandwich in the middle.

But consider which of these we really control as part of a single application. Obviously we don't control the user or other consumers -- they are autonomous, external entities outside our sphere of influence.

We do control our presentation to a large degree. We also control our UI and business layers -- after all, they are written in code by us for our particular application. The same is true of the data access layer -- it is just code we write to interact with the data source.

But now we get to the data source itself. Do we, in a typical application scenario, control the data source? More often than not the answer is no.

In most cases the data source is a database, a set of related tables. And the reality is that other applications or people almost always have direct access to those same tables. They are not under our control, and we cannot reliably ensure that they contain valid information based on the rules and practices established by our application.

This isn't to say that the data doesn't remain consistent based on the rules and practices set forth inside the data source, but those rules and practices rarely match those of our application -- at least not perfectly. It seems that there's always a mismatch between business logic and the rules embedded in the database itself.

What I'm getting at here is the thought that we can control neither end of the spectrum. From the perspective of our application the user and the data source are similarly outside our control.

In short, I am suggesting that the data layer is an external entity.

Microsoft's Pat Helland talks about services being autonomous entities that contain business behavior or logic. He also makes a point of noting that a service owns its data source. Were it true that a given service (application) had exclusive control and access to its data source, I'd buy into what he says, but that is rarely the case in real organizations.

Instead, a service (application) has shared use of a data source. Other services (applications) also have access to the data source. Other people (administrators, power users, etc.) often have direct access to the data source -- bypassing all application logic.

I hold it to be a truth that layers in a single application trust each other. I've written in my blog about trust boundaries and how any time we cross a trust boundary we must consider that we have two separate applications. This is because an application is defined by its trust boundary.

I realize that I'm defining the word application in a specific way, and that is intentional. Most of the problem with SOA discussions and even n-tier architecture discussions flows from semantic or terminology issues. We must define some set of terms for discussion.

So, an application is hereby defined as a set of code, objects, components, services, layers or tiers that exist within a trust boundary. They trust each other, and have no need to replicate business logic (validation, calculation, manipulation, authorization) due to lack of trust. If data is validated or manipulated in one layer, all the other layers just assume it was done correctly.

A corollary of the above statement is that the constituent entities making up an application are encapsulated -- at least in a logical sense. In other words, external entities (users or other applications) cannot directly interact with code, objects, components, services, layers or tiers except through the application's formal interfaces (UI/presentation or service interfaces).

Assuming we agree on the above definition and corollary, it is very obvious that the only time a data source can be a layer within an application is if that data source is exclusive to the application.

If the data source is not exclusive, then other applications or users can directly interact with it without going through our application's presentation/UI layer. In such a case (which is the norm, not the exception), the data source cannot be a data layer within an application. It exists outside the trust boundary and thus must be considered as an external entity (like a user or another application).

What does this mean from a practical perspective?

Well, it means that our application can't trust the data source. But more than that, it means that we need to fundamentally rethink the architecture of an application (or service).
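To make "can't trust the data source" concrete, here is a small Python sketch using the standard `sqlite3` module. The `customers` table, its columns and the validation rule are assumptions made up for illustration. One row is written the way our application would write it, and a second is written directly by some other party, bypassing our rules. The application re-checks every row it reads, just as it would check user input.

```python
import sqlite3


def is_valid_customer(name, credit_limit) -> bool:
    """Hypothetical application rule: non-blank name, non-negative limit."""
    return bool(name and name.strip()) and credit_limit is not None and credit_limit >= 0


def load_customers(conn: sqlite3.Connection):
    """Read rows from the shared table and re-check each one.

    Returns (valid, rejected) so the caller can decide what to do with
    rows that some other writer stored in violation of our rules.
    """
    valid, rejected = [], []
    query = "SELECT id, name, credit_limit FROM customers ORDER BY id"
    for row in conn.execute(query):
        _id, name, credit_limit = row
        if is_valid_customer(name, credit_limit):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected


# Simulate a shared database: an in-memory table that anyone can write to.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, credit_limit REAL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Acme', 5000)")  # written via our application
conn.execute("INSERT INTO customers VALUES (2, '', -100)")      # written directly by someone else

valid, rejected = load_customers(conn)
```

Here row 1 ends up in `valid` and row 2 in `rejected`. What to do with rejected rows (log them, repair them, refuse to load) is a business decision this sketch deliberately leaves open.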

What is a user? From the perspective of an application, a user is a data source and an event source. In an object-oriented world view the user could easily be characterized as just another object in our model. In a service-oriented world view the user could easily be characterized as just another service with which we interact.

In fact, we can go so far as to regard ADO.NET, XML, Web services, HTML and a GUI as all being nothing more than external interface technologies. What I'm suggesting here is that maybe there is no such thing as a presentation/UI layer -- at least not in a substantively different way from a data access layer (or a service agent layer). This idea is illustrated below.



In Figure 1 the brick square represents our application's trust boundary. Our business logic and any other trusted objects, components or services are represented by the blocks contained within the trust boundary box. Outside the trust boundary are the user, any data sources, external services and legacy systems. Our application interacts with all of these external entities through the trust boundary, meaning that none of them are trusted by our application's code.

I find this particularly compelling when considering this from the perspective of distributed object-oriented architecture. I've talked quite a lot about the idea that applications and services alike can, and should, be created using an object-oriented design approach. But consider what happens if we mentally put ourselves purely in the domain of the objects themselves and look at the world from that perspective. In that case, what is the difference between a user, a data source, a web service or any other uncontrollable external entity?

Nothing. Each of them is merely an external entity with which we need to interact. In each case we interact through some specialized bit of interface software -- be that ADO.NET, a Windows GUI or an XML parser.

By focusing on our application as the center of any architecture, we can standardize the way we think about all external entities. Data or events coming from a user are no more or less valuable than those coming from a data source or external service. Our business logic must validate, calculate and manipulate data as appropriate based on these interactions regardless of the source.

This means that the layers in a typical application are much simpler:

  • External interfaces
  • Business logic

External interfaces can take numerous forms, including:

  • Data access (like ADO.NET)
  • Service agents (such as those that call web services)
  • HTML interfaces (to interact with web users)
  • GUI interfaces (to interact with Windows users)

In each case, an external interface is responsible for sending and receiving data and possibly for raising and receiving events. To generalize this we can say that the external interfaces send and receive messages, allowing interaction with external and un-trusted entities.
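One way to picture this two-layer model is sketched below in Python. It assumes a hypothetical `ExternalInterface` protocol with `receive` and `send` methods and a simple `Message` type; none of these names come from any real framework. A single in-memory adapter stands in for a GUI, a data access component or a service agent, and the business logic applies the same rule to whatever arrives.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Message:
    """A unit of data exchanged with an external, untrusted entity."""
    source: str   # e.g. "gui", "database", "web-service"
    payload: dict


class ExternalInterface(Protocol):
    """Hypothetical common shape shared by every external interface."""

    def receive(self) -> Message: ...
    def send(self, message: Message) -> None: ...


class InMemoryInterface:
    """Stand-in for a GUI, data access or service agent adapter."""

    def __init__(self, source: str, incoming: dict) -> None:
        self.source = source
        self._incoming = incoming
        self.sent = []

    def receive(self) -> Message:
        return Message(self.source, self._incoming)

    def send(self, message: Message) -> None:
        self.sent.append(message)


def process(interface: ExternalInterface) -> bool:
    """Business logic: apply the same rule whatever the message's origin."""
    message = interface.receive()
    quantity = message.payload.get("quantity")
    ok = isinstance(quantity, int) and quantity > 0
    interface.send(Message("application", {"accepted": ok}))
    return ok
```

For example, `process(InMemoryInterface("gui", {"quantity": 3}))` and `process(InMemoryInterface("database", {"quantity": -1}))` go through exactly the same check. The business logic neither knows nor cares which kind of external entity sits on the other side of the interface.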

In summary, most data sources are not exclusive to our specific application. They are shared by numerous applications, and thus are an ideal candidate to be a service from a service-oriented perspective. The same is true of any shared external entity, including data sources, users and other services. I think this means we need to challenge the traditional three-layer/three-tier view of applications and consider a new architectural model -- one not reliant on a data layer.

About the author

Rockford Lhotka is the Principal Technology Evangelist for Magenic Technologies, a company focused on delivering business value through applied technology and one of the nation's premier Microsoft Gold Certified Partners. Lhotka is the author of several books, including Expert One-on-One Visual Basic .NET Business Objects and Expert C# Business Objects, both from Apress. He is a Microsoft Software Legend, Regional Director, MVP and INETA speaker. He is a columnist for MSDN Online and contributing author for Visual Studio Magazine, and he regularly presents at major conferences around the world -- including Microsoft PDC, Tech Ed, VS Live! and VS Connections. Lhotka has worked on many projects in various roles, including software architecture, design and development, network administration and project management. Over his career he has designed and helped to create systems for bio-medical manufacturing, agriculture, point of sale, credit card fraud tracking, general retail, construction and healthcare.
