Agile Data


There are several techniques that enable software architecture efforts. These techniques include standard modeling languages such as the Unified Modeling Language (UML); frameworks such as the Model-Driven Architecture (MDA) and the Zachman Framework; and software processes such as the Enterprise Unified Process (EUP). In this article I describe each briefly, summarize its strengths and weaknesses, and then discuss how to apply it in an agile manner. My goal is not to present a detailed overview of each technique, as each one is worthy of a book in its own right. Instead, my goal is to make you aware of each technique and to suggest further avenues of investigation if you're interested.

1. The Unified Modeling Language (UML)

The Unified Modeling Language (UML) defines the industry-standard notation and semantics for object-oriented and component-based systems. First published by the Object Management Group (OMG) in 1997 it has undergone several revisions, the latest of which is UML 2.0. The strengths of the UML include:

Unfortunately the UML suffers from several challenges:

Figure 1. Modeling artifacts for business application development.

2. The Model-Driven Architecture (MDA)

Warning: Since this section was originally written, around 2004, MDA has pretty much disappeared from the marketplace. It seems that my estimate of 5% of organizations being sufficiently sophisticated to adopt MDA was wildly optimistic.

The Model-Driven Architecture (MDA) defines an approach to modeling that separates the specification of system functionality from the specification of its implementation on a specific technology platform. In short, it defines guidelines for structuring specifications expressed as models. The MDA promotes an approach where the same model specifying system functionality can be realized on multiple platforms through auxiliary mapping standards, or through point mappings to specific platforms. It also supports the concept of explicitly relating the models of different applications, enabling integration and interoperability and supporting system evolution as platform technologies come and go.

The MDA is part of a collection of modeling-oriented standards from the OMG. These standards include:

  1. XML Metadata Interchange (XMI). XMI is a standard interchange mechanism used between various tools, repositories and middleware. XMI can also be used to automatically produce XML DTDs and XML schemas from UML and Meta Object Facility (MOF) models, providing an XML serialization mechanism for these artifacts.
  2. Unified Modeling Language (UML). The UML defines the notation and semantics for modeling diagrams, as you learned earlier.
  3. Common Warehouse Metamodel (CWM). CWM is the OMG data warehouse standard, and is a good example of applying the MDA paradigm to an application area.
  4. Meta Object Facility (MOF). The MOF provides the standard modeling and interchange constructs that are used in the MDA. Because the UML and CWM are both MOF-based, this common foundation enables potential model/metadata interchange and interoperability between tools. The MOF is the mechanism through which models are analyzed in XMI.

Figure 2 presents an overview of the MDA. The MDA is based on the idea that a system or component can be modeled via two categories of models: Platform Independent Models (PIMs) and Platform Specific Models (PSMs). PIMs are further divided into Computation Independent Business Models (CIBMs) and Platform Independent Component Models (PICMs). As the name implies, PIMs do not take technology-specific considerations into account, a concept that is equivalent to logical models in the structured methodologies (Gane and Sarson 1979; Yourdon 1989) and to essential models within usage-centered techniques (Constantine and Lockwood 1999). The CIBMs represent the business requirements and processes that the system or component supports, and the PICMs are used to model the logical business components, also called domain components (Ambler 2004). PSMs take technology issues into account and are effectively transformations of PIMs to reflect platform-specific considerations.

Figure 2. The Model-Driven Architecture (MDA).

The most interesting aspect of Figure 2 is the relationships that it depicts. There are mappings between the various types of models that describe a system/component, mappings that represent transformations. For example, there is a mapping between the CIBM and the PICM for a given system, and from the PICM to the PSM for a system. It is these mappings that provide the glue between the various representations of a system/component, mappings that a sophisticated development tool could potentially use to generate system code.

The relationships between models in different systems/components are also a critical aspect of the MDA. For example, you see in Figure 2 that there are relationships between the CIBMs of each system, their PICMs, and their PSMs. These relationships support integration between the systems, and thus promote large-scale reuse within your organization.

It's interesting to note that a CIBM, PICM, or PSM may be comprised of several artifacts. For example a CIBM could include a collection of business rule definitions and a UML activity diagram used to describe the overall business process supported by the system/component. A PICM could be described via a UML component model and a collection of interface definitions. A PSM could be comprised of a UML class diagram, several UML sequence diagrams, several UML state chart diagrams, and a physical data model. Furthermore, you may decide to have several PSMs for any given system/component, one for each platform that you intend to deploy it to.
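
The idea of one platform-independent model with several platform-specific models can be sketched in a few lines of Python. This is only an illustration of the concept, not an MDA tool or any OMG-defined mapping: the `Entity` and `Attribute` classes, the two type-mapping tables, and the `Customer` example are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    type: str  # platform-independent type: "string", "integer", or "date"


@dataclass(frozen=True)
class Entity:
    """A tiny stand-in for a platform-independent model element."""
    name: str
    attributes: tuple[Attribute, ...]


# Hypothetical mapping tables, one per target platform (assumptions for this sketch).
SQL_TYPES = {"string": "VARCHAR(255)", "integer": "INTEGER", "date": "DATE"}
JAVA_TYPES = {"string": "String", "integer": "int", "date": "java.time.LocalDate"}


def to_sql_psm(entity: Entity) -> str:
    """Transform the platform-independent entity into a relational PSM (DDL text)."""
    columns = ",\n".join(
        f"  {attr.name} {SQL_TYPES[attr.type]}" for attr in entity.attributes
    )
    return f"CREATE TABLE {entity.name} (\n{columns}\n);"


def to_java_psm(entity: Entity) -> str:
    """Transform the same entity into a Java-flavored PSM (class skeleton text)."""
    fields = "\n".join(
        f"  private {JAVA_TYPES[attr.type]} {attr.name};" for attr in entity.attributes
    )
    return f"public class {entity.name} {{\n{fields}\n}}"


# One platform-independent definition, two platform-specific realizations.
customer = Entity(
    "Customer",
    (Attribute("name", "string"), Attribute("birthDate", "date")),
)
print(to_sql_psm(customer))
print(to_java_psm(customer))
```

The point of the sketch is that the logical definition of `Customer` is written once, and each mapping table plays the role of a transformation to one platform. Supporting a new platform means adding a mapping, not rewriting the logical model.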

The MDA has four primary strengths:

  1. The MDA defines a coarse separation of views. When you are modeling it is important to consider issues such as “what should the system do” and “what processes should the system support” without having to worry about how the system will support them, because this enables you to consider a wide range of options. It can also be beneficial to consider the logical domain architecture separately from the physical implementation architecture because the logical architecture will change much more slowly than the underlying technologies, allowing you to reuse your logical model as the system evolves over time.
  2. The MDA defines a viable strategy for system integration. The MDA's explicit support for modeling relationships between systems/components can help you to address system integration issues, and hence promote greater levels of reuse via system integration, within your organization. My experience is that integration, in particular legacy integration, is a significant issue for most software development efforts. Yet few books, including most of my own, cover this critical topic; most are written under the apparent assumption that all systems are developed in “green-field” environments where everything is being built for the first time.
  3. The MDA may motivate a new breed of modeling tools. The separation of views offers the potential for tools that automatically generate the “next model down”, in particular PSMs from PIMs, based on automated mappings.
  4. The MDA may support tool integration. Part of the overall vision of the OMG is to provide a set of standards for tool vendors to follow and thereby support integration of those tools.

Although the MDA is interesting, and in the near term I suspect we'll hear a lot about it from the OMG and its supporters, there are several potential challenges that you need to be aware of. However, I also think that it's possible to take an agile approach to the MDA. In the end, gut feel tells me that fewer than 5% of all organizations can truly take advantage of the MDA.

3. The Zachman Framework (ZF)

The Zachman Framework (ZF) (ZIFA 2002; Hay 2003) summarizes a collection of perspectives pertinent to enterprise architecture, a modified version of which is depicted in Figure 3. The rows represent the views of different types of stakeholders, summarized in Table 1. The columns represent different aspects or views of your architecture, summarized in Table 2. There are three important concepts to understand about the ZF:

  1. Within a column the models are evolved/translated to reflect the views of the stakeholders for each row.
  2. The models within a single row should be consistent with one another, with the caveat that agile models are just barely good enough so the models may not be perfectly consistent with an agile instantiation of the ZF.
  3. The ZF does not define a process for enterprise architecture; it only defines the various perspectives that an enterprise architecture should encompass. You'll need to define your own process around the ZF, hopefully an agile one.

The modifications that I have made to the ZF are straightforward:

  1. I have adopted David Hay's (2003) interpretation of the framework as in my opinion he has evolved it in a good direction.
  2. I have renamed the first column from Data to Structure. The original name reflected the dominant structural paradigm at the time, that of data. Hay kept this name in his version because he's a data professional and because this name supports the data-driven approach to development that he knows and is fond of. However, as you see in Different Projects Require Different Strategies there are several ways to approach development – data-driven, object-driven, and component-driven – and I have no doubt that more will be proposed in the future. Therefore a more generic name is required so as not to prejudice your implementation.
  3. Unlike other methodologists, I have not filled in the cells with suggested artifacts; instead I indicate the perspective that each cell represents. Remember, Agile Modeling implores you to follow the practice Apply the Right Artifact(s) but it doesn't tell you what those artifacts are. For example, in the structure column David Hay suggests that you create a language divergent data model in row 2, a convergent entity/relationship model in row 3, and a database design in row 4. Moriarty (2001) suggests a business class diagram, a class diagram, and a schema data model respectively in the same rows. Based on previous writings (Ambler 1998; Ambler and Constantine 2002b) I would have suggested a component model, a class diagram, and another class diagram for these rows. All three approaches are valid, but all three represent the experiences and prejudices of the individual methodologists. Far better advice would be to understand the perspective represented by each cell, understand the strengths and weaknesses of each type of modeling artifact (e.g. adopt the AM principle Know Your Models), and then follow the practice Apply the Right Artifact(s) to meet your actual needs.
  4. I have mapped the rows to the terms used by the MDA, indicating which models should be platform independent versus platform specific.

Figure 3. The Modified Zachman Framework (ZF).

Table 1. The rows of the Zachman Framework.



1. Objectives/Scope (Planner's view) Defines your organization's direction and purpose, defining the boundaries of your enterprise architecture efforts.
2. Enterprise Model (Business owner's view) Defines in business terms the nature of your organization, including its structure, processes, and organization.
3. Model of Fundamental Concepts (Architect's view) Defines the enterprise in more rigorous terms than row 2, basically taking the model to a greater level of detail. This row was called “information system designer's view” in the original version of the ZF.
4. Technology Model (Designer's view) Defines how technology will be applied to address the needs defined by the previous rows.
5. Detailed Representation (Builder's view) Defines the detailed design, taking implementation language, database storage, and middleware considerations into account.
6. Functioning System These are the actual, working systems within your organization.

Table 2. The columns of the Zachman Framework.


1. Structure (What) Focus is on the entities/objects/components of significance, and the relationships between them, within your organization. This column was called Data in the original version of the framework.
2. Activities (How) Focus is on what your organization does to support itself and its customers. This column was called Function in the original version of the framework.
3. Locations (Where) Focus is on the geographical distribution of your organization's activities. This column was called Network in the original version of the framework.
4. People (Who) Focus is on who is involved in the business of your organization.
5. Time (When) Focus is on the effects that time, such as planning and events, has on your organization.
6. Motivation (Why) Focus is on the translation of business goals, strategies, and constraints into specific implementations.
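
The grid described by Tables 1 and 2 can be sketched as a simple data structure. The following Python is a minimal illustration, not part of the ZF itself: the `uncovered_cells` helper and the dictionary layout are assumptions made for this sketch, while the artifact names are the suggestions attributed to Hay and Moriarty earlier in this section.

```python
from itertools import product

# Rows (Table 1) and columns (Table 2) of the modified framework.
ROWS = [
    "Objectives/Scope",
    "Enterprise Model",
    "Model of Fundamental Concepts",
    "Technology Model",
    "Detailed Representation",
    "Functioning System",
]
COLUMNS = ["Structure", "Activities", "Locations", "People", "Time", "Motivation"]

# Every (row, column) pair is one cell, i.e. one perspective to consider.
CELLS = list(product(ROWS, COLUMNS))

# Artifacts that two methodologists suggest for rows 2-4 of the Structure column.
HAY = {
    ("Enterprise Model", "Structure"): "language divergent data model",
    ("Model of Fundamental Concepts", "Structure"): "convergent entity/relationship model",
    ("Technology Model", "Structure"): "database design",
}
MORIARTY = {
    ("Enterprise Model", "Structure"): "business class diagram",
    ("Model of Fundamental Concepts", "Structure"): "class diagram",
    ("Technology Model", "Structure"): "schema data model",
}


def uncovered_cells(chosen):
    """Return the cells for which no artifact has been chosen (hypothetical helper)."""
    return [cell for cell in CELLS if cell not in chosen]


print(len(CELLS))                 # 36
print(len(uncovered_cells(HAY)))  # 33
```

Keeping the cells separate from the artifact choices mirrors the advice above: the framework fixes the perspectives, and which artifact (if any) you apply to each cell remains your decision.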

The primary strength of the ZF is that it explicitly shows that there are many views that need to be addressed by an enterprise architecture. An immediate benefit of Figure 3 is that it provides a reminder of the issues that you need to consider in your architecture, whether or not you decide to adopt the ZF. Another implication is that one model does not fit all, as AM's Multiple Models principle implores. Furthermore, a single level of detail isn't sufficient either. Hay's data-driven approach requires several flavors of data model in the first column, whereas Moriarty's object-driven approach requires several different flavors of class model. That makes sense because your audience changes in each row. A related strength is that the ZF explicitly communicates that there are several stakeholders in an enterprise architecture, not just the enterprise architects and developers. An implication is that you need to involve your stakeholders in the development of your architecture to ensure that it meets their needs, and ideally you want to follow the practice Active Stakeholder Participation.

Unfortunately there are several potential problems with the ZF:

  1. The ZF can lead to a documentation-heavy approach. There are 36 cells in Figure 3, each one of which needs to be supported by one or more artifacts. This is potentially a lot of documentation so you need to really think about what information you actually need versus what information is nice to have – in other words adopt AM's Travel Light principle. If the cost of creating and maintaining a document exceeds its value then find a way to either reduce the cost or increase the benefit, the implication being that you might decide not to create the documentation at all.
  2. The ZF can lead to a methodology-biased approach. It is very easy to use the ZF to promote your preferred way of working, to use it to beat your methodological drum. That might work very well for you, but is it really the best option for your organization or your clients? I doubt it. Agile enterprise architects have a wide range of modeling techniques to work with; they don't need to follow just a data-driven approach or just an object-driven approach. To be effective you need to choose the right artifacts for your situation, artifacts that reflect your organizational culture, your business environment, your technical environment, and the skillsets of the people involved. Furthermore, you need to be prepared to change your approach over time. Perhaps a data-driven approach is your best option today because you're still working with older technologies, whereas a few years from now, as you move to a more modern technical environment, a component-driven or object-driven approach will prove a better fit. As the times change so should your techniques.
  3. The ZF can lead to a process-heavy approach to development. Looking at Figure 3 you can instantly see the opportunity to define a collection of rigorous processes to support it. What a wonderful opportunity for your quality group to insist on an array of model and documentation reviews; there are 36 cells after all. And what about traceability between the artifacts in those 36 cells? Clearly you need to develop and maintain a detailed traceability matrix or database of metadata. Yikes, what a potential nightmare. It sounds good in theory, but these sorts of activities quickly add overhead that grinds progress to a halt. Agile IT professionals follow the AM principle Maximize Stakeholder ROI and only perform activities that add value. Just because something like maintaining traceability sounds like a good idea doesn't mean that it is a good idea.
  4. The ZF is not well accepted within the development community. Although the ZF seems to be growing in popularity within the IT architecture community it doesn't seem to have made it into the mainstream development culture. Worse yet, I'm not convinced it ever will until its proponents find a way to slim it down to something that's practical. I hope that the insights I'm providing in this writing help, but it's only a start. An agile approach to implementing the ZF can work; a prescriptive/heavy approach will likely fail or at least never achieve the potential that its proponents claim. Keep it simple.
  5. The ZF can lead to a top-down approach to development. When people first read about the ZF they tend to think that it implies a top-down approach where you start with the models in row 1, then work on row 2 models, and so on. This doesn't have to be the case. In some situations a top-down approach to development works quite well, although there is nothing wrong with a bottom-up approach in other situations. A “middle out” approach where you start working from one of the middle rows, perhaps the architect's view, is also viable. Frankly, the best approach is to start in whatever cell works best for you and then iterate from there.

You might be interested in Extending the RUP with the Zachman Framework.

4. The Enterprise Unified Process (EUP)

The Enterprise Unified Process (EUP) is an instantiation of the Unified Process, as are the Rational Unified Process (RUP) and the Agile Unified Process (AUP). As you can see in Figure 4 the EUP is an extension to the RUP – the focus of the RUP is on the software development process whereas the EUP includes a Production phase in which you operate and support your software. More importantly, it includes an Infrastructure Management discipline that covers cross-system activities such as reuse management, programme/portfolio management, software process management, and enterprise architecture, to name a few.

Figure 4. Augmented Lifecycle for the Enterprise Unified Process (EUP).

The greatest strength of the EUP is that it explicitly brings enterprise architecture and enterprise administration issues into the RUP. This is important because the RUP is the most popular rigorous software process within the IT industry, hence there is the potential for the EUP to have a very large impact on the industry. It shows how to include enterprise issues in your day-to-day software development activities, a subject that many organizations struggle with.

However, there are several problems with the EUP:

  1. Lack of industry acceptance. Although the RUP has gained significant industry mindshare, the EUP has yet to catch on. There are several reasons for this. First, IBM Rational Corporation is only slowly recognizing the EUP. Second, many organizations are struggling to succeed at the development of single systems, let alone worrying about enterprise-level issues.
  2. The EUP is typically instantiated as a prescriptive/rigorous process, not an agile process. The strength of the RUP is that it is a well-defined, rigorous process. Although it is possible to tailor it down to an agile process this is something that is rarely done in practice – organizations that are interested in rigorous processes tend towards adoption of the RUP, whereas those that are interested in being agile tend toward agile processes such as eXtreme Programming (XP), Feature-Driven Development (FDD), or the Agile Unified Process (AUP). Therefore, because the EUP is an extension to the RUP, it is very likely that the EUP will also be instantiated in a prescriptive manner. This will very likely lead to a documentation-heavy process that does not support agile enterprise architecture efforts well. An agile instantiation of the EUP, perhaps based on the AUP instead of the RUP, would on the other hand support such efforts.

5. Are They Agile?

This is a trick question because the answer depends on how you implement them. Each of these frameworks makes it really easy to overcomplicate your modeling efforts, to make them harder than they need to be, and to motivate you to write more documentation than you really need. For example, you can apply the techniques defined by the UML in a very agile manner, something that Agile Modeling (AM) aptly demonstrates, or you can apply them in an incredibly dysfunctional manner. You could easily apply the principles and practices of AM to the Zachman Framework and make it very agile. Or you could use the Zachman Framework to justify a complex and documentation-heavy modeling process that requires a large enterprise architecture group to administer and control. It's your choice. Table 3 summarizes strategies for how each technique could be used in an agile manner.

Table 3. Strategies for applying each technique in an agile manner.



Enterprise Unified Process (EUP)

  • Keep it simple.
  • Instantiate an agile, or at least near-agile, version of the RUP.
  • Add agile instantiations of the Production phase, the Operations and Support discipline, and the Infrastructure/Enterprise Management discipline.
  • Tailor Agile Modeling and Agile Data Method practices and techniques into the EUP.
  • Start with the AUP, or better yet Disciplined Agile Delivery (DAD), not the RUP, as the base software process to extend.

Model-Driven Architecture (MDA)

  • Keep it simple.
  • Adopt the concept of platform independent and platform specific models, with mappings in between, but keep them agile by ensuring that they are just barely good enough.
  • Adopt good tools based on the MDA that add value to your efforts.
  • Avoid documentation-heavy, bureaucratic processes centered around the MDA – remember, your goal is to develop working software, not to create lots of fancy models and documentation.
  • Follow my Roadmap for Agile MDA advice.

Unified Modeling Language (UML)

  • Keep it simple.
  • All developers should have a thorough grasp of UML modeling techniques, including knowing when to use them and when not to.
  • Adopt the values, principles, and practices of Agile Modeling (AM).
  • Recognize that the UML is not complete but that it defines the core of your modeling techniques.

Zachman Framework (ZF)

  • Keep it simple.
  • Adopt the concept that your enterprise architecture efforts must reflect a wide range of perspectives.
  • Adopt the augmented form of the ZF to avoid methodology bias.
  • Avoid documentation-heavy, bureaucratic processes centered around the ZF – remember, your goal is to develop working software, not to create lots of fancy models and documentation.