Object-oriented technology supports the building of applications
out of objects that have both data and behavior.
Relational technologies support the storage of data in tables
and manipulation of that data via a data manipulation language (DML), both internally
within the database via stored procedures and externally via SQL calls.
Some relational databases go further and now support objects internally
as well, a trend that will only grow stronger over time.
It is clear that object technologies and relational technologies are in
common use in most organizations, that both are here to stay for quite a while,
and that both are being used together to build complex software-based systems.
It is also clear that the fit between the two technologies isn't
perfect, that there is an "impedance mismatch" between the two.
In the early 1990s the difference between the two approaches was labeled the
"object-relational impedance mismatch", or simply
"impedance mismatch" for short, labels that are still in common use today.
Much of the conversation about the impedance mismatch focuses on the
technical differences between object and relational technologies, and rightfully
so: although there are deceptive
similarities, there are also subtle yet important differences. Luckily,
there are strategies for overcoming the O/R impedance mismatch.
Why does this impedance mismatch exist?
The object-oriented paradigm is based on proven software engineering
principles. The relational
paradigm, however, is based on proven mathematical principles.
Because the underlying paradigms are different the two technologies do
not work together seamlessly. The impedance mismatch becomes apparent when you look at the
preferred approach to access: With the object paradigm you traverse objects via
their relationships whereas with the relational paradigm you join the data rows
of tables. This fundamental
difference results in a non-ideal combination of object and relational
technologies, although when have you ever used two different things together
without a few hitches?
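The two access styles can be put side by side in a minimal sketch. It uses Python with an in-memory SQLite database purely so that it is self-contained; the Name and Street columns and the sample values are assumptions made for illustration, and the relationship is simplified to one-to-many.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Address:
    street: str


@dataclass
class Customer:
    name: str
    addresses: list = field(default_factory=list)  # object references, not keys


# Object paradigm: follow references from one object to the next.
alice = Customer("Alice", [Address("1 Main St"), Address("9 Elm Ave")])
streets_via_traversal = sorted(a.street for a in alice.addresses)

# Relational paradigm: combine rows of separate tables on matching key values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Address  (AddressID INTEGER PRIMARY KEY,
                           CustomerID INTEGER REFERENCES Customer,
                           Street TEXT);
    INSERT INTO Customer VALUES (1, 'Alice');
    INSERT INTO Address  VALUES (10, 1, '1 Main St');
    INSERT INTO Address  VALUES (11, 1, '9 Elm Ave');
""")
rows = conn.execute("""
    SELECT a.Street
    FROM Customer c JOIN Address a ON a.CustomerID = c.CustomerID
    WHERE c.Name = ?
    ORDER BY a.Street
""", ("Alice",)).fetchall()
streets_via_join = [street for (street,) in rows]
```

Both routes reach the same street names, but the first navigates pointers held in memory while the second asks the database to match key values across tables.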
To succeed using objects and relational databases together you
need to understand both paradigms, and their differences, and then make
intelligent tradeoffs based on that knowledge.
Databases 101 overviews relational databases and Data
Modeling 101 describes the basics of data modeling, providing you with
sufficient background to understand the relational paradigm.
101 overviews object-orientation and the UML, explaining the basics of the
object-oriented paradigm. Until you
understand both paradigms, and gain real-world experience working in both
technologies, it will be very difficult to see past the deceptive similarities
between the two.
The easiest similarity/difference to observe is the different types in object
languages and in relational databases. On the subtle side, Java has a
string and an int whereas Oracle has a varchar and a smallint. Although
values are stored and manipulated differently, it's fairly straightforward to
convert back and forth and DB access libraries such as JDBC handle them
automatically. However, on the not-so-subtle side Java has collections
whereas Oracle has tables, clearly not the same concepts. Oracle has blobs
whereas Java has objects, once again clearly not the same concepts.
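The same contrast can be sketched outside of Java and Oracle. The example below uses Python's standard sqlite3 driver as a stand-in, so the type names and conversion rules are SQLite's rather than Oracle's, and the Age column is a hypothetical addition. Scalar values convert in both directions without any help, whereas a collection has no column type to map to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Name VARCHAR(40), Age SMALLINT)")

# Scalars: the driver converts str and int to column values and back.
conn.execute("INSERT INTO Customer VALUES (?, ?)", ("Alice", 42))
name, age = conn.execute("SELECT Name, Age FROM Customer").fetchone()
scalars_round_trip = (name, age) == ("Alice", 42)

# Collections: a list has no column equivalent, so the driver refuses it.
try:
    conn.execute("INSERT INTO Customer VALUES (?, ?)", (["Alice", "Bob"], 42))
    list_was_rejected = False
except sqlite3.Error:
    list_was_rejected = True
```

To store the list you would have to decide how it maps onto rows and tables, which is exactly where the mismatch starts to cost effort.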
Figure 1 depicts a physical data model (PDM) using a UML
data modeling notation. Figure
2 depicts a UML class model. On
the surface they look like very similar diagrams, and on the surface they in
fact are. It's how you arrive at
the two diagrams that can be very different.
Figure 1. A physical data model (UML notation).
Let's consider the deceptive similarities between the two
diagrams. Both diagrams depict
structure: the PDM shows four database tables and the relationships between them,
whereas the UML class diagram shows four classes and their corresponding
relationships. Both diagrams depict
data: the PDM shows the columns within the tables and the class model shows the
attributes of the classes. Both
diagrams also depict behavior: the Customer table of Figure
1 includes a delete trigger and the Customer class of Figure
2 includes two operations. The
two diagrams also use similar notations, something that I did on purpose,
although the UML data modeling notation is little different from other industry
notations.
Figure 2. A UML class model.
Differences in your modeling approaches will result in subtle differences
between your object schema and your data schema:
There are differences in the types of relationships that
each model supports, with class diagrams being slightly more robust than
physical data models for relational databases.
This is because of the inherent nature of the technologies.
For example, you see that there is a many-to-many relationship between
Customer and Address in Figure 2, a
relationship that was resolved in Figure 1 via the CustomerAddress
associative table. Object
technology natively supports this type of relationship but relational databases do not,
which is why the associative table was introduced.
The treatment of keys also reveals a schism within the object community.
It is common practice to not show
keys on class diagrams (Ambler
2003/2005); for example, none are shown in Figure
2. However, the reality is that
when you are using a relational database to store your objects then each object
must maintain enough information to be able to successfully write itself, and
the relationships it is involved with, back out to the database.
This is something that I call "shadow information", which you can see
has been added in Figure 3 in
the form of attributes with implementation visibility (no visibility symbol is
shown). For example the Address
class now includes the attribute addressID which corresponds to AddressID
in the Address table (the attributes customers, state, and zipCode
are required to maintain the relationships to the Customer, State,
and ZipCode classes respectively).
Figure 3. A fully attributed UML class model.
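Both differences can be seen in a minimal sketch, written here in Python against an in-memory SQLite database. The Customer, Address, and CustomerAddress names come from the figures; the Name and Street columns, the customer_id attribute, and the link and save helpers are assumptions made for illustration. The objects hold the many-to-many relationship directly as collections, the key attributes are the shadow information, and saving turns each link into a row of the associative table.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Address:
    street: str
    address_id: Optional[int] = None               # shadow information (key)
    customers: list = field(default_factory=list)  # many-to-many, held directly


@dataclass
class Customer:
    name: str
    customer_id: Optional[int] = None              # shadow information (key)
    addresses: list = field(default_factory=list)


def link(customer, address):
    """Objects record a many-to-many link by holding references both ways."""
    customer.addresses.append(address)
    address.customers.append(customer)


def save(conn, customers):
    """Write objects out, resolving each link into a CustomerAddress row."""
    for c in customers:
        c.customer_id = conn.execute(
            "INSERT INTO Customer (Name) VALUES (?)", (c.name,)).lastrowid
    for c in customers:
        for a in c.addresses:
            if a.address_id is None:  # each address is inserted only once
                a.address_id = conn.execute(
                    "INSERT INTO Address (Street) VALUES (?)",
                    (a.street,)).lastrowid
            conn.execute("INSERT INTO CustomerAddress VALUES (?, ?)",
                         (c.customer_id, a.address_id))


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Address  (AddressID  INTEGER PRIMARY KEY, Street TEXT);
    CREATE TABLE CustomerAddress (
        CustomerID INTEGER REFERENCES Customer,
        AddressID  INTEGER REFERENCES Address,
        PRIMARY KEY (CustomerID, AddressID));
""")

alice, bob = Customer("Alice"), Customer("Bob")
home, office = Address("1 Main St"), Address("9 Elm Ave")
link(alice, home)
link(bob, home)
link(alice, office)
save(conn, [alice, bob])
link_count = conn.execute("SELECT COUNT(*) FROM CustomerAddress").fetchone()[0]
```

Without the key attributes the save step would have nothing to write into CustomerAddress, which is why the objects must carry them even when the class diagram does not show them.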
The schism is that the object community has a tendency to
underestimate the importance of object persistence.
Symptoms of this problem include:
The lack of an official data model in the UML (see Unofficial UML Data Modeling Profile)
The practice of not modeling keys on class diagrams
The misguided belief that you can model the persistent aspects of your system by applying a few stereotypes to a UML class diagram
The many popular OOA&D books that spend little or no time discussing object persistence issues
Yet in reality object developers discover that they need to
spend significant portions of their time making their objects persistent, perhaps
because they've run into performance problems after improper
mappings or perhaps because they've discovered that they didn't take legacy
data constraints into account in their design.
My experience is that persistence is a significant blind spot for many
object developers, one that promotes the
cultural impedance mismatch.
3. Strategies for Overcoming the Object-Relational Impedance Mismatch
Object and relational technologies are real, you are very likely
working with both, and they are here to stay.
Unfortunately the two technologies differ, these differences being
referred to as "the object-relational impedance mismatch".
In this article
you learned that there are two aspects to the impedance
mismatch: technical and cultural.
The technical mismatch can be overcome by ensuring that project
team members, including both application developers and Agile DBAs, understand
the basics of both technologies. Furthermore,
you should actively try to reduce the coupling that your database schema is
involved with by encapsulating access to your database(s) as best you can, by
designing your database well, and by keeping the design clean.
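One way to encapsulate database access is to route every query through a small data access class, so that table and column names appear in one place only. The following is a minimal sketch under that assumption, in Python with an in-memory SQLite database; the CustomerStore class, its methods, and the Name column are hypothetical names chosen for illustration, not part of any particular framework.

```python
import sqlite3


class CustomerStore:
    """Hypothetical data access class: the only code that knows the schema."""

    def __init__(self, conn):
        self._conn = conn

    def add(self, name):
        """Insert a customer and return the generated key."""
        cur = self._conn.execute(
            "INSERT INTO Customer (Name) VALUES (?)", (name,))
        return cur.lastrowid

    def find_name(self, customer_id):
        """Return the customer's name, or None if no such row exists."""
        row = self._conn.execute(
            "SELECT Name FROM Customer WHERE CustomerID = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")

# Application code calls the store and never writes SQL itself.
store = CustomerStore(conn)
new_id = store.add("Alice")
found = store.find_name(new_id)
missing = store.find_name(999)
```

If a column is later renamed, only the SQL inside the store has to change; the calling code is untouched, which is the reduction in coupling described above.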
Unfortunately, less attention has been paid to the cultural
differences between the object-oriented community and the data community.
These differences are often revealed when object professionals and data
professionals argue with each other regarding the approach that should be taken
by a project team.
For a detailed discussion, see
Cultural Impedance Mismatch Between Data Professionals and Application