Agile Data Tools and Scripts for Better Ways of Working (WoW)
To implement the Agile Data method ways of working (WoW) you will need to adopt, build, and/or modify a collection of tools. Tools are just a start; you also need an effective technical environment in which to use them. This environment should comprise several “sandboxes” in which you will work. Finally, Agile data engineers will discover that they need several different types of scripts to support their development efforts.
1. Tools for Agile Data
Having an effective toolset is a critical success factor for any software development effort. Table 1 lists categories of tools, the target audience for the tool, how you would use the tool, and links to a representative sample of such tools. Chances are very good that you already have many of these tools in house, although you will undoubtedly need to obtain several of them.
Agile isn’t in the tool, it’s in the way that you use the tool.
Table 2 lists tools that, to my knowledge, did not yet exist at the time of this writing but are needed to support the Agile Data method. My hope is that we will see both commercial and open source tools available in the near future.
Table 1. Potential Tools That Support Agile Data Efforts.
Tool Category | Role | Purpose | |
CASE Tool – Development Modeling | Application Developer, Agile data engineer | To support your application development efforts. | |
CASE Tool – Physical Data Modeling | Agile data engineer | To define and manage your physical database schema. Many data modeling tools support the generation and deployment of DDL code, making it easier to change your database schema. They also produce visual representations of your schema and support your documentation efforts. | |
Configuration Management | Everyone | You need to place all data definition language (DDL), source code, models, scripts, documents, and so on under version control. | |
Database Refactoring Tools | Application Developer, Agile data engineer | To evolve your database schema in small, safe steps. | |
Development IDE/Refactoring Browser | Application Developer, Agile data engineer | To support your programming and testing efforts. | |
Extract Transform Load (ETL) | Agile data engineer, Operations engineer | ETL tools can automate the data cleansing and migration efforts that accompany changes to your database schema. | |
Persistence Frameworks | Application Developer, Agile data engineer | Persistence frameworks/layers encapsulate your database schema, minimizing the chance that database refactorings will force code refactorings in external applications. | |
Release Tools | Application Developer, Agile data engineer | You need to deploy your database between sandboxes, including production. | |
Test Data Generator | Application Developer, Agile data engineer | Developers need test data against which to validate their systems. Test data generators can be particularly useful when you need large amounts of data, perhaps for stress and load testing. | |
Testing tools for load testing, user interface testing, system testing | Application Developer, Agile data engineer | You will need to go beyond unit testing to perform a more robust set of tests. The Full Lifecycle Object-Oriented Testing (FLOOT) method encapsulates a wide range of traditional and agile testing techniques. | |
Traceability Management/ Repository | Everyone | ||
Unit testing tools for your applications | Application Developer | Developers must be able to unit test their work, and to support iterative development they must be able to easily regression test. | |
Unit testing tools for your database | Agile data engineer | Whenever you change your database schema, perhaps as the result of a database refactoring, you must be able to regression test your database to ensure that it still works. | |
Other | Agile data engineer |
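The "Unit testing tools for your database" row of Table 1 can be illustrated with a small regression test. This is only a minimal sketch, assuming Python's standard unittest and sqlite3 modules; the Customer table and the create_schema helper are hypothetical and stand in for whatever your own change scripts build.

```python
import sqlite3
import unittest


def create_schema(conn):
    """Hypothetical stand-in for running your real database change scripts."""
    conn.execute(
        "CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )


class CustomerSchemaTest(unittest.TestCase):
    """Regression tests to rerun after every schema change."""

    def setUp(self):
        # A fresh in-memory database keeps each test independent.
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_customer_columns(self):
        # PRAGMA table_info returns one row per column; index 1 is the name.
        columns = [row[1] for row in self.conn.execute("PRAGMA table_info(Customer)")]
        self.assertEqual(columns, ["id", "name"])

    def test_name_is_required(self):
        # The NOT NULL constraint should reject a customer without a name.
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("INSERT INTO Customer (name) VALUES (NULL)")
```

Saved in a test module, the suite can be run with `python -m unittest` after each database refactoring.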
Table 2. Tools Needed to Support Agile Data.
Tool Category | Discussion |
Automated Schema Traceability Management Tools | Although Table 1 includes traceability management tools, the reality is that most tools are geared either towards requirements traceability or data access traceability (as in the case of repositories such as Rochade and Advantage). Neither is suited to the fine-grained traceability required for database refactoring. Ideally you need a tool that can trace a wide range of application features, such as COBOL procedures and Java operations, to database features such as stored procedures and table columns. Because of the complexity of this task, the less manual intervention the better: ideally the tool should be able to parse your application and database code and create the traceability matrix automatically. |
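To make the idea of a traceability matrix concrete, here is a deliberately naive sketch. It is not the tool described in Table 2: it matches database feature names textually instead of parsing code, and the function name, source snippets, and feature names are all hypothetical.

```python
import re


def build_traceability_matrix(app_sources, db_features):
    """Map each database feature to the application units that mention it.

    app_sources: dict of unit name -> source text (hypothetical input format).
    db_features: iterable of table or stored procedure names.
    Matching is a whole-word, case-insensitive text search, so it will
    miss dynamically built SQL and may report false positives.
    """
    matrix = {}
    for feature in db_features:
        pattern = re.compile(rf"\b{re.escape(feature)}\b", re.IGNORECASE)
        units = sorted(
            unit for unit, code in app_sources.items() if pattern.search(code)
        )
        if units:
            matrix[feature] = units
    return matrix


# Hypothetical application code fragments, keyed by the unit they came from.
SOURCES = {
    "CustomerDao.load (Java)": "SELECT name FROM Customer WHERE id = ?",
    "AccountDao.close (Java)": "UPDATE Account SET status = 'C' WHERE customer_id IN (SELECT id FROM Customer)",
    "BILLING-PARA (COBOL)": "EXEC SQL CALL calc_interest(:ACCT-ID) END-EXEC",
}

MATRIX = build_traceability_matrix(SOURCES, ["Customer", "Account", "calc_interest"])
```

A real tool would need language-aware parsers for each application platform; the sketch only shows the shape of the output, a map from each database feature to the code that depends on it.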
2. Sandboxes
This section has been replaced by the Sandboxes Core Practice article.
3. Scripts for Agile Data
Data engineers should maintain a database change log and an update log. These are the minimum that you require for simple stovepipe initiatives where a single application accesses your database. However, to support more complex environments where many applications access your database, you also require a data migration log. Let’s explore how you use each log:
- Database change log. This log contains the data definition language (DDL) source code that implements all database schema changes in the order that they were applied throughout the course of an initiative. This includes structural changes such as adding, dropping, renaming, or modifying things such as tables, views, columns, and indices.
- Update log. This log contains the source code for future changes to the database schema that are to be run after the deprecation period for database changes. The Process of Database Refactoring argues that changing your database schema is inherently more difficult than changing application source code: other developers on your team need time to update their own code and, worse yet, other applications may access your database and therefore need to be modified and deployed as well. Therefore, you will find that you need to maintain both the original and changed portions of your schema, as well as any scaffolding code to keep your data in sync, for a period of time called the “deprecation period.”
- Data migration log. This log contains the data manipulation language (DML) to reformat or cleanse the source data throughout the course of your initiative. You may choose to implement these changes using data cleansing utilities, often the heart of extract-transform-load (ETL) tools, examples of which are listed in Table 1.
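The three logs above can be sketched as ordered lists of SQL statements. This is a minimal illustration, assuming Python's built-in sqlite3 module; the LogEntry class, the run_log function, the Cust and Customer names, and the deprecation date are all hypothetical. The example renames a table, keeps a view under the old name as scaffolding, and schedules the view's removal for after the deprecation period.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    iteration: int
    description: str
    sql: str
    # Only update-log entries carry a date: the end of the deprecation period.
    run_on_or_after: Optional[date] = None


# Database change log: DDL in the order it was applied.
change_log = [
    LogEntry(1, "Create Cust table",
             "CREATE TABLE Cust (id INTEGER PRIMARY KEY, phone TEXT)"),
    LogEntry(2, "Rename Cust to Customer",
             "ALTER TABLE Cust RENAME TO Customer"),
    LogEntry(2, "Scaffolding view under the old name",
             "CREATE VIEW Cust AS SELECT id, phone FROM Customer"),
]

# Update log: changes deferred until the deprecation period has ended.
update_log = [
    LogEntry(2, "Drop deprecated Cust view", "DROP VIEW Cust",
             run_on_or_after=date(2030, 1, 1)),
]

# Data migration log: DML that reformats or cleanses existing data.
migration_log = [
    LogEntry(2, "Strip dashes from phone numbers",
             "UPDATE Customer SET phone = REPLACE(phone, '-', '')"),
]


def run_log(conn, entries, today=None):
    """Run entries in order, skipping any whose deprecation period is still open."""
    today = today or date.today()
    applied = []
    for entry in entries:
        if entry.run_on_or_after is not None and today < entry.run_on_or_after:
            continue
        conn.execute(entry.sql)
        applied.append(entry.description)
    conn.commit()
    return applied
```

During the deprecation period, applications that still query Cust keep working through the view, while updated applications use Customer directly.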
You may choose to implement each logical script as a collection of physical scripts, perhaps one for each development iteration or even one for each individual database refactoring, or you may choose to implement it as a single script that includes the ability to run only a portion of the changes. You need to be able to apply subsets of your changes so that you can put your database schemas into known states. For example, you may find yourself in development iteration 10 and discover that you want to roll back your schema to the way it was at the beginning of iteration 8.
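One way to apply only a subset of changes is to tag each change with its iteration and pair it with a script that reverses it. The following is a sketch under those assumptions, using Python's sqlite3 module; the Change tuple, the migrate_to function, the schema_change_history bookkeeping table, and the sample schema are hypothetical, and reverting a change this way discards any data held in the dropped objects.

```python
import sqlite3
from typing import NamedTuple


class Change(NamedTuple):
    change_id: int    # position in the change log
    iteration: int    # development iteration that introduced the change
    apply_sql: str
    revert_sql: str   # assumption: every change has a reversing script


CHANGES = [
    Change(1, 7, "CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT)",
           "DROP TABLE Customer"),
    Change(2, 8, "CREATE INDEX idx_customer_name ON Customer (name)",
           "DROP INDEX idx_customer_name"),
    Change(3, 9, "CREATE TABLE Account (id INTEGER PRIMARY KEY, customer_id INTEGER)",
           "DROP TABLE Account"),
    Change(4, 10, "CREATE VIEW CustomerAccount AS SELECT c.name, a.id AS account_id "
                  "FROM Customer c JOIN Account a ON a.customer_id = c.id",
           "DROP VIEW CustomerAccount"),
]


def migrate_to(conn, last_iteration):
    """Put the schema in the state it had at the end of last_iteration."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_change_history (change_id INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT change_id FROM schema_change_history")}
    # Revert later changes first, newest to oldest.
    for change in sorted(CHANGES, key=lambda c: c.change_id, reverse=True):
        if change.iteration > last_iteration and change.change_id in applied:
            conn.execute(change.revert_sql)
            conn.execute("DELETE FROM schema_change_history WHERE change_id = ?",
                         (change.change_id,))
    # Then apply any missing earlier changes, oldest to newest.
    for change in sorted(CHANGES, key=lambda c: c.change_id):
        if change.iteration <= last_iteration and change.change_id not in applied:
            conn.execute(change.apply_sql)
            conn.execute("INSERT INTO schema_change_history VALUES (?)",
                         (change.change_id,))
    conn.commit()


def schema_objects(conn):
    """Names of user-defined tables, views, and indexes, minus bookkeeping."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE name NOT LIKE 'sqlite_%' AND name != 'schema_change_history'"
    )
    return {row[0] for row in rows}
```

With this layout, the rollback in the example above is `migrate_to(conn, 7)`: the schema as it stood at the beginning of iteration 8 is the schema as it stood at the end of iteration 7.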