Architecture and Data Blog

Thoughts about intersection of data, devops, design and software architecture

Presentation on Database Refactoring

My Presentation on Database Refactoring at QCon was recorded and is live now on infoQ here


Considerations when using stored procedures for doing CRUD

How to use stored procedures as interface to the data

Some environments like to have access to the database tables routed via stored procedures. Instead of using Create/Read/Update/Delete aka CRUD with DML, stored procedures are invoked with the parameters to perform the required operation. I’m not arguing about the benefits/pitfalls of this approach, if you have to do stored procedures, here are some things to look at.


Using Explicit Order By in your SQL Statements

Don't rely on the database to order data.

Recently when our test databases where upgraded new version of Oracle, we started noticing that the order in which some drop down lists were being displayed was not correct. It turns out that the SELECT statement we had, did not have a ORDER BY clause and the data was being returned in the ORDER of the creation of rows (in the order of ROWID) when the database was upgraded these ROWID’s got changed and hence the ORDER of the data being shown in the drop down lists.


Version Control your work..

Using version control for non code assets

So we version control/source control everything on our project.. code/data/artifacts/diagrams etc. yesterday I said why not extend it to my writings to everything I have. So I started this long journey of refactoring my folder layout and making a nice folder structure to hold all the things I have written about have other artifacts in the process of writing and moved them all to subversion, now all my example code and writings are all under version control that gets backed up everyday…. feels liberating


Writing a SQL to generate a SQL

Generating SQL code using SQL

We had a weird requirement on our project recently..

Find all the Rows in All the tables that do not comply with the Constraints that we have in development but not in QA environments


Enforcing your assumptions about database functionality

Testing the behavior of the database

When you are writing xUnit tests you are in certain ways trying to make sure that the test breaks when the code that is being tested changes the assumptions you made when writing the Test and Production code.

Similarly if you are relying on the database to throw a error when you put invalid data, then you should write a test around this assumption, so that when someone changes the database to not behave the way you assumed it to behave, the test you wrote will break and it will force the team to think about the change to the database that is being undertaken. If the change is really required, then the team would fix the test else rollback the change being made.


Database Testing revisited

Testing the database layer that interacts with the database is critical.

Some time ago I wrote about what it means to do database testing.. more I think about this and having had some strange situations recently I want to add more to the list of things we should be testing.

Persistence Layer We should persist the objects to the database using the applications persistence layer and retrieve the objects using the same mechanism and test that we get the same object back. If we have a lot of business logic in out persistence layer we may also want to retrieve the object using Direct SQL and test that the correct values got persisted.


Move your DBAs to the Project team locations"

Many IT organizations I have seen have groups of specialists, typical are UNIX Group, DBA group etc.

When a project starts the developers on the project have to meet with all the Groups (I have more experience with the DBA group, so I write with the DBA group in mind) that they need to interact with and explain to the groups their design, the projects operational needs and other requirements, later when development starts they have to email these groups about all the setup that needs to be done and also the day to day changes that are needed. This way of working slows down the productivity of the team and the organization.


To allow NULLs or NOT

When to allow NULLs or NOT, is critical domain discussion

Database designs I have seen tend to not constrain the data in the database. For example make the Item.ManufacturerID non-nullable and make it a foreign key to the Manufacturer table. Similarly Manufacturer.Name and Item.Rate as non-nullable columns. In any greenfield application (existing production application is a topic for another post). When you design table(s) lets say you have Item and Manufacturer table as shown below

CREATE TABLE Item
(ItemID NUMBER NOT NULL,
ManufacturerID NUMBER,
Name VARCHAR2(128),
Rate NUMBER,
CONSTRAINT PK_ITEM
              PRIMARY KEY (ItemID)
);

CREATE TABLE Manufacturer
(ManufacturerID NUMBER NOT NULL,
Name VARCAHR2(128),
CONSTRAINT PK_MANUFACTURER
              PRIMARY KEY (ManufacturerID)
);

For now let’s talk about NOT NULL constraint. Many say they don’t make columns non-nullable, because they don’t know the requirements at design time, or they don’t want their tests to create elaborate sets of data, or that the application enforces the constraint then why enforce the constraint on the database?