Architecture and Data Blog

Thoughts about the intersection of data, devops, design and software architecture

Long Running Data Migrations during Database Refactorings

How to manage long running data migrations

When you are refactoring large databases, some tables will have millions of rows. Let's say we are doing the Move Column refactoring, moving the TaxAmount column from the Charge table, which has millions of rows, to the TaxCharge table. First we create the TaxAmount column in the TaxCharge table. Then we have to move the data from the TaxAmount column in the Charge table to the TaxAmount column we created in the TaxCharge table.
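As a rough illustration of the two steps, here is a minimal sketch in Python using the standard sqlite3 module. The excerpt does not show the schema or say how the data is moved, so everything beyond the table and column names is an assumption: the ChargeId key linking the two tables, the helper names build_demo_db, add_tax_amount_column and migrate_tax_amount, and the choice to copy rows in small committed batches so that no single transaction has to touch millions of rows.

```python
import sqlite3


def build_demo_db(n_rows=2500):
    """Create an in-memory database with a hypothetical Charge/TaxCharge schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Charge (ChargeId INTEGER PRIMARY KEY, TaxAmount REAL)")
    # Assumption: each TaxCharge row points at exactly one Charge row.
    conn.execute(
        "CREATE TABLE TaxCharge (TaxChargeId INTEGER PRIMARY KEY, ChargeId INTEGER UNIQUE)"
    )
    ids = range(1, n_rows + 1)
    conn.executemany("INSERT INTO Charge VALUES (?, ?)", [(i, i * 0.5) for i in ids])
    conn.executemany("INSERT INTO TaxCharge (ChargeId) VALUES (?)", [(i,) for i in ids])
    conn.commit()
    return conn


def add_tax_amount_column(conn):
    """Step 1: create the new TaxAmount column on TaxCharge."""
    conn.execute("ALTER TABLE TaxCharge ADD COLUMN TaxAmount REAL")


def migrate_tax_amount(conn, batch_size=1000):
    """Step 2: copy Charge.TaxAmount into TaxCharge.TaxAmount, one batch at a time.

    Batches are walked in ChargeId order and committed individually, so an
    interrupted run can be resumed and each transaction stays short.
    Returns the number of Charge rows processed.
    """
    last_id = 0
    processed = 0
    while True:
        rows = conn.execute(
            "SELECT ChargeId, TaxAmount FROM Charge "
            "WHERE ChargeId > ? ORDER BY ChargeId LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        conn.executemany(
            "UPDATE TaxCharge SET TaxAmount = ? WHERE ChargeId = ?",
            [(tax, charge_id) for charge_id, tax in rows],
        )
        conn.commit()  # one short transaction per batch
        last_id = rows[-1][0]
        processed += len(rows)
    return processed


if __name__ == "__main__":
    conn = build_demo_db()
    add_tax_amount_column(conn)
    print("rows migrated:", migrate_tax_amount(conn, batch_size=1000))
```

On a real system the batch size, the key used to walk the table, and whether the old column is dropped afterwards would all depend on the database and the application.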


Moscow

How to use stored procedures as interface to the data

Last week I was at SD Best Practices in Moscow, doing a presentation on “Refactoring Databases: Evolutionary Database Design”. Moscow seems like an interesting place, with loads of huge buildings, squares, fountains and roads. Things somehow feel run-down; the city feels like a player trying to regain his former ability or glory.

The opening keynote by Jim McCarthy, about how teams should operate, was interesting. He proposed 11 principles, or protocols as he calls them, to be followed by the members of a team so that the team becomes more productive. Many of these protocols are about avoiding waste and promoting clear communication channels.


Database Testing revisited

Testing the database layer that interacts with the database is critical.

Some time ago I wrote about what it means to do database testing. The more I think about this, and having had some strange situations recently, the more I want to add to the list of things we should be testing.

Persistence Layer: We should persist the objects to the database using the application's persistence layer, retrieve the objects using the same mechanism, and test that we get the same object back. If we have a lot of business logic in our persistence layer, we may also want to retrieve the object using direct SQL and test that the correct values got persisted.
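The two checks described above could look something like the following sketch, written in Python with the standard sqlite3 module. The Customer class, the CustomerRepository persistence layer and its email normalization rule are all hypothetical stand-ins invented for the example; they are not taken from any particular application.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    name: str
    email: str


class CustomerRepository:
    """Hypothetical persistence layer with a little business logic in save()."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
        )

    def save(self, customer):
        # Assumed business rule: emails are stored trimmed and lower-cased.
        self.conn.execute(
            "INSERT OR REPLACE INTO customer (id, name, email) VALUES (?, ?, ?)",
            (customer.id, customer.name, customer.email.strip().lower()),
        )
        self.conn.commit()

    def find(self, customer_id):
        row = self.conn.execute(
            "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None


def test_round_trip():
    """Save and load through the same mechanism; expect an equal object back."""
    repo = CustomerRepository(sqlite3.connect(":memory:"))
    original = Customer(1, "Ada", "ada@example.com")
    repo.save(original)
    assert repo.find(1) == original


def test_persisted_values_with_direct_sql():
    """Bypass the persistence layer and inspect what actually reached the table."""
    conn = sqlite3.connect(":memory:")
    repo = CustomerRepository(conn)
    repo.save(Customer(2, "Bob", "  Bob@Example.COM "))
    row = conn.execute("SELECT name, email FROM customer WHERE id = 2").fetchone()
    assert row == ("Bob", "bob@example.com")


if __name__ == "__main__":
    test_round_trip()
    test_persisted_values_with_direct_sql()
    print("persistence tests passed")
```

The direct SQL check earns its keep when save() transforms the data: a round trip through the same layer can hide a bug that both save() and find() share.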


Database Migration Utility

Tools to enable database migration

I have taken up the hobby of searching the open source landscape for tools that help me do Agile database development. I’m going to write about all the tools I come across that help me; my preference is open source software, but I am not limited to it. I will try to provide some sound examples, share my experiences with the tools, and share the example code I used.


Database Testing

Enable testing the database layer

What does it mean to test your database? Usually when someone mentions database testing, what is it that they want to test? The application code that interacts with the database, or the SQL code that resides in the database, like stored procedures and triggers? I see all these aspects of database testing as important.

Testing the application's persistence mechanism: We should test that the application persists what it's supposed to save, then retrieve the data using SQL and see if the database contains the same information that was saved. This kind of testing makes sense when the application has a complex persistence layer. This type of testing can be achieved using unit tests, functional tests, etc.


Move your DBAs to the Project team locations

Many IT organizations I have seen have groups of specialists; typical ones are the UNIX group, the DBA group, etc.

When a project starts, the developers on the project have to meet with all the groups they need to interact with (I have more experience with the DBA group, so I write with the DBA group in mind) and explain to them their design, the project's operational needs and other requirements. Later, when development starts, they have to email these groups about all the setup that needs to be done and also the day-to-day changes that are needed. This way of working slows down the productivity of the team and the organization.


Refactoring Data

Refactoring is often talked about in the context of code; recently I finished working with Scott Ambler on Database Refactoring.

Lately I have been working on changing data in a production database, and have been wondering how to define it. Data Refactoring? What are the patterns of Data Refactoring? First let me talk about what I mean by Data Refactoring.

When a given application goes into production and starts life as a live application, we find bugs in it, and these bugs create weird data in the database. People also change data through the app and sometimes directly in the database (yikes), and these changes lead to bad data too. How do you go about fixing these data problems? Are there patterns to these fixes?