January 29, 2010

Workshop at Enterprise Data World 2010

I am running a workshop on Agile Database Development at Enterprise Data World 2010 in San Francisco. See you there.

November 18, 2009

Testing in conversion projects

Projects that convert data, or migrate it from a legacy database, carry an enormous testing effort that takes a lot of time. Some test automation can help.

Since data is moved or changed from a source database to a destination database, we can write SQL that returns the values we want to verify. For example: one query that returns the number of customers, another that returns the balance of a specific account.

These queries can be run against both the source and the destination database and the results compared programmatically, giving us an easy way to compare the state of the data before and after the conversion or migration. Running the comparison from a CI engine turns it into a regression test suite.

Here is an example implementation using Ruby.

We have two databases, SOURCE and DESTINATION, and two SQL files named source.sql and destination.sql. The Ruby program reads the SQL from these two files and runs each against its own database: SQL from source.sql runs against SOURCE and SQL from destination.sql runs against DESTINATION. The results of the two queries are compared and a failure is raised when they do not match.

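results = []
# get_sql_statement_to_execute returns a pair: [source_sql, destination_sql]
statement = get_sql_statement_to_execute
begin
  source_statement, destination_statement = statement
  source_rows = exec_sql_in_source_return_rows(source_statement)
  destination_rows = exec_sql_in_destination_return_rows(destination_statement)
  # Assumption: compare_rows returns nil when the rows match
  # and a description of the mismatch otherwise.
  result = compare_rows(source_rows, destination_rows, destination_statement, source_statement)
  results << result unless result.nil?
rescue => e
  # statement is an array, so interpolate it instead of concatenating
  Log.log("Could not process: #{statement.inspect} (#{e.message})")
end
unless results.empty?
  Log.log("Results do not match in source and destination")
end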

The sample Ruby code above shows one way to implement the solution and automate database conversion and migration testing.
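One possible shape for the compare_rows helper used above, as a minimal sketch. It assumes each result set arrives as an array of row arrays and that row order does not matter. The return convention (nil on a match, a hash describing the mismatch otherwise) is an assumption for illustration, not taken from the original program.

```ruby
# Hypothetical helper: compares two result sets given as arrays of row arrays.
# Returns nil when they match, otherwise a hash describing the difference.
def compare_rows(source_rows, destination_rows, destination_statement, source_statement)
  # Sort on the string form of each row so that row order and nil values
  # do not break the comparison.
  source_sorted = source_rows.sort_by { |row| row.map(&:to_s) }
  destination_sorted = destination_rows.sort_by { |row| row.map(&:to_s) }
  return nil if source_sorted == destination_sorted

  {
    source_statement: source_statement,
    destination_statement: destination_statement,
    source_count: source_rows.size,
    destination_count: destination_rows.size,
    missing_in_destination: source_sorted - destination_sorted,
    unexpected_in_destination: destination_sorted - source_sorted
  }
end
```

The row counts are included because the two difference lists can both be empty when the only mismatch is a duplicated row.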

October 8, 2009

Ruby OCI 2.0 Array binding

We have been moving some data around lately using Ruby and ruby-oci8. We started with ruby-oci8 1.0 and used prepared statements with bind variables, since we pull data from one Oracle database and push it to another. Later we found a really cool feature in ruby-oci8 2.0: you can bind a whole array and make just one database trip for many database operations.

Let's say you want to insert 10 rows. Inserting one row at a time takes 10 trips to the database.

def save_accounts(accounts)
  stmt = $connection.parse "INSERT INTO account (accountid,name) values (:account_id,:name)"
  accounts.each do |account|
    stmt.bind_param(:account_id, account[0], Float)
    stmt.bind_param(:name, account[1], String)
    stmt.exec
  end
  $connection.commit
end

With the array bind feature it is just one trip to the database. (Of course this depends on the size of the array you bind, but you get the picture: it reduces database trips.)

def save_accounts(account_ids, account_names)
  stmt = $connection.parse "INSERT INTO account (accountid,name) values (:account_id,:name)"
  # max_array_size must be set before the arrays are bound
  stmt.max_array_size = account_ids.size
  stmt.bind_param_array(:account_id, account_ids)
  stmt.bind_param_array(:name, account_names)
  stmt.exec_array
  $connection.commit
end

We saw a 100% improvement in performance by changing the way we bind the variables in just one place. This looks like a feature to look out for.
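The array version of save_accounts takes one array per column, while the data usually arrives as one array per row. A minimal sketch of the conversion, with batching so that each bind stays at a bounded size; the column_batches helper and the batch size are hypothetical, not part of ruby-oci8.

```ruby
# Hypothetical helper: turns row-oriented data into per-column arrays,
# split into batches so each array bind stays at a bounded size.
def column_batches(accounts, batch_size)
  accounts.each_slice(batch_size).map do |batch|
    # transpose turns [[id, name], [id, name]] into [[id, id], [name, name]]
    account_ids, account_names = batch.transpose
    { account_ids: account_ids, account_names: account_names }
  end
end

accounts = [[1, "Savings"], [2, "Checking"], [3, "Brokerage"]]
column_batches(accounts, 2).each do |batch|
  # With a live connection, this is where
  # save_accounts(batch[:account_ids], batch[:account_names]) would be called.
  puts "#{batch[:account_ids].size} rows: #{batch[:account_ids].inspect}"
end
```

Since save_accounts commits at the end of each call, every batch would be committed separately under this arrangement.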

September 3, 2009

Create an Index for all FK Columns in the database

Most of the time I see foreign key constraints defined on tables without indexes on the constrained columns. Let's say the application is trying to delete a row from the CUSTOMER table:
DELETE FROM CUSTOMER WHERE CUSTOMERID = 1000;
Before it can delete the customer with customerId 1000, the database has to check every table that has a foreign key referencing customerId to see whether the value 1000 is still in use. Let's say the ORDER table has a customerId column; the database effectively issues
SELECT ... FROM ORDER WHERE CUSTOMERID = 1000;
If there is no index on ORDER.CUSTOMERID, the database has to do a full table scan, which is very expensive in terms of I/O and other resources. If customerId is used in lots of tables, the problem multiplies. In a multi-user scenario this can lead to lock contention and deadlocks, since the same tables are being read and locked to find dependent children. Creating an index on every foreign key column helps a lot in this case.
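Finding the unindexed foreign key columns can be scripted. The sketch below assumes the foreign key columns and the already indexed columns have been read from the data dictionary into [table, column] pairs; the missing_fk_index_ddl helper and the IDX_ naming convention are illustrative assumptions. It handles single-column foreign keys only and does not check identifier length limits.

```ruby
# Hypothetical helper: given [table, column] pairs for foreign key columns and
# for columns that already lead an index, returns CREATE INDEX statements
# for the foreign key columns that have no index.
def missing_fk_index_ddl(fk_columns, indexed_columns)
  normalize = lambda { |pair| pair.map { |name| name.to_s.upcase } }
  indexed = indexed_columns.map { |pair| normalize.call(pair) }
  candidates = fk_columns.map { |pair| normalize.call(pair) }.uniq

  candidates.reject { |pair| indexed.include?(pair) }.map do |table, column|
    # The index naming convention here is illustrative only.
    "CREATE INDEX IDX_#{table}_#{column} ON #{table} (#{column})"
  end
end

# Sample data; ORDERS is used instead of ORDER because ORDER is a reserved word.
fk_columns = [["ORDERS", "CUSTOMERID"], ["invoice", "customerid"]]
indexed_columns = [["INVOICE", "CUSTOMERID"]]
puts missing_fk_index_ddl(fk_columns, indexed_columns)
```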

August 10, 2009

Materialized views and database links in Oracle

Recently one of my colleagues, Jeff Norris, hit a weird error. He was trying to build a materialized view over some tables in his local database and some tables in a remote database, using database links. The SQL for the view ran fine on its own and returned the expected results, but when it was put inside a CREATE MATERIALIZED VIEW statement, Oracle complained with ORA-00942.

Let's say the two databases in question are local and remote. The SQL to create a materialized view that loads immediately and refreshes every day is

CREATE MATERIALIZED VIEW MV_CUSTOMERBALANCE 
BUILD IMMEDIATE
REFRESH FORCE START WITH ROUND(SYSDATE) + 23/24
NEXT SYSDATE + 1
AS
SELECT customer.name , account.balance, accounttype.name 
FROM customer , account@remotedb account, accounttype@remotedb accounttype
WHERE
customer.id = account.customerid
AND account.accounttyppeid = accounttype.id
/
Oracle rejected the statement above with ORA-00942: table or view does not exist, yet the same SQL without the CREATE MATERIALIZED VIEW command ran fine and gave the expected results.
SELECT customer.name , account.balance, accounttype.name 
FROM customer , account@remotedb account, accounttype@remotedb accounttype
WHERE
customer.id = account.customerid
AND account.accounttyppeid = accounttype.id
/
After some searching and experimenting I found that, in our case, a database link name could be used only once in the CREATE MATERIALIZED VIEW statement, which meant we could reference "remotedb" only once. We got around this restriction by creating two database links to the remote database, REMOTEACCOUNT and REMOTEACCOUNTTYPE, and using them in the materialized view as shown below.
CREATE MATERIALIZED VIEW MV_CUSTOMERBALANCE 
BUILD IMMEDIATE
REFRESH FORCE START WITH ROUND(SYSDATE) + 23/24
NEXT SYSDATE + 1
AS
SELECT customer.name , account.balance, accounttype.name 
FROM customer , account@remoteaccount account, accounttype@remoteaccounttype accounttype
WHERE
customer.id = account.customerid
AND account.accounttyppeid = accounttype.id
/

August 5, 2009

Perfectly good data.. wasted

Okay, this is kind of a rant. Maybe I'm too picky, or maybe I just hate to see perfectly good data not being used. This is how it goes.

I go regularly to this store to get Horizon organic milk for my family, and about 60% of the time the milk I need is NOT in stock. Okay, I can live with that once in a while; maybe lots of folks are buying organic milk. But not when it happens this frequently. The store knows how much milk was ordered (or supplied from the warehouse) and how much was sold, so it should be able to figure out that organic milk sells out pretty fast. Putting my Business Intelligence (BI) hat on, I think the store should be able to predict when it is going to run out of organic milk (or any product, for that matter). It's especially frustrating when they have all the data they need to get it done.

One more non-use of data that really makes me see red is when the organic milk in the store is already expired (past the sell-by date). I mean, how hard is it for someone to generate a list of all the products that expire today and ask the store associates to remove them from the shelves by the end of the day, especially when they are edible items?


May 26, 2009

Explicitly rollback when you encounter a deadlock.

A deadlock occurs in the database when sessions (connections) are each waiting for another session to release locks on rows they need, so that none of them can proceed. Oracle automatically detects deadlocks and resolves them by rolling back the statement that detected the deadlock. The thing to remember is that only that last statement is rolled back, not the whole transaction. If the transaction made other modifications, those rows are still locked, and the application should make sure it does an explicit rollback on the connection.

For example, let's assume there are two tables, Parent(ParentID) and Child(ChildID).

SESSION_A >create table parent (parentId number(10));
Table created.
SESSION_A >create table child (childId number(10));
Table created.
SESSION_A >insert into parent values (100);
1 row created.
SESSION_A >insert into child values (200);
1 row created.
SESSION_A >commit;
Commit complete.
SESSION_A >select * from parent;
  PARENTID
----------
       100

SESSION_A >select * from child;
   CHILDID
----------
       200

SESSION_A >

Now let's create a situation where a deadlock happens. Two sessions, SESSION_A and SESSION_B, are connected to the same database as the same user.

SESSION_A >update parent set parentid = 1000 where parentid=100;
1 row updated.
SESSION_B >update child set childid = 2000 where childid = 200;
1 row updated.
SESSION_B >update parent set parentid = 2001 where parentid=100;
--Waiting For Lock on Row in Parent Table, held by SESSION_A
SESSION_A >update child set childid = 1001 where childid = 200;
update child set childid = 1001 where childid = 200
       *
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
--SESSION_A requesting lock on row, held by SESSION_B causing deadlock.
SESSION_A >

After you get the ORA-00060 error, the statement update child set childid = 1001 where childid = 200; is rolled back, but SESSION_A still holds its lock on the Parent row, so SESSION_B is still waiting for it to be released.

So when your application gets ORA-00060, or the equivalent deadlock exception in any other database, explicitly roll back the transaction (not just the current statement) so that all the changes made in the transaction are undone and all the locks it holds are released.
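In application code, the advice above might look like the following sketch. DeadlockError is a hypothetical stand-in for whatever exception your driver raises for ORA-00060, and the connection is only assumed to respond to commit and rollback; StubConnection exists purely so the example runs without a database.

```ruby
# Hypothetical error class standing in for the driver's deadlock exception
# (with Oracle this would correspond to ORA-00060).
class DeadlockError < StandardError; end

# Runs the block as one transaction on a connection that is assumed
# to respond to commit and rollback.
def in_transaction(connection)
  result = yield connection
  connection.commit
  result
rescue DeadlockError
  # The database has only undone the failing statement. Roll back the whole
  # transaction so the locks taken by earlier statements are released.
  connection.rollback
  raise
end

# Stand-in connection that records the calls it receives.
class StubConnection
  attr_reader :calls

  def initialize
    @calls = []
  end

  def commit
    @calls << :commit
  end

  def rollback
    @calls << :rollback
  end
end

connection = StubConnection.new
begin
  in_transaction(connection) { raise DeadlockError, "ORA-00060: deadlock detected" }
rescue DeadlockError => e
  puts "rolled back after: #{e.message}"
end
```

The error is re-raised after the rollback so the caller can decide whether to retry the transaction.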

May 14, 2009

Oracle for the Mac

Ever since I moved to the Mac, I have had to run another OS inside a VM just to use Oracle, since Oracle was not available for the Mac. That is no longer the case: Oracle 10gR2 (10.2.0.4) is now available for the Mac.

This is especially nice since Oracle for the Mac was the most voted requirement on mix.oracle.com.

May 6, 2009

In Oracle 11g passwords are case sensitive

In Oracle 10g and earlier, as we all know, passwords are not case sensitive: PASSWORD, Password and password would all let you in and everything would be okay.

If you upgrade to Oracle 11g (I know a lot of you are waiting for 11gR2), you will find that passwords are case sensitive. Here is an example.

c:\Software>sqlplus bddd/bddd@dosa
SQL*Plus: Release 11.1.0.6.0 - Production on Wed May 6 15:17:43 2009
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning and OLAP options
BDDD@dosa >

Let's try to connect with an upper case password.

c:\Software>sqlplus bddd/BDDD@dosa
SQL*Plus: Release 11.1.0.6.0 - Production on Wed May 6 15:19:25 2009
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
ERROR:
ORA-01017: invalid username/password; logon denied
Enter user-name:

So what does this mean for apps that run on 10g and get ported to 11g? Make sure the password set in the properties files has the correct case.

You can also revert to the 10g behavior by setting the sec_case_sensitive_logon parameter to FALSE; it is TRUE by default.

alter system set sec_case_sensitive_logon=FALSE;

March 31, 2009

Oracle Metadata can be misleading

Oracle keeps metadata about all its objects in various tables and views. One such view is USER_OBJECTS (or ALL_OBJECTS), which has a column named STATUS that shows whether a given object is VALID or INVALID. The status applies to DB code (stored procedures, functions, triggers, etc.). To find all the INVALID objects in the schema, issue SELECT * FROM USER_OBJECTS WHERE STATUS='INVALID'.

One problem with the way Oracle maintains this metadata: when the underlying table that the DB code depends on changes, Oracle marks the dependent objects as INVALID even though the change may not affect the DB code at all (like adding a new column, or making a column nullable). Here is some code that shows what I mean. Run it through SQL*Plus.
COLUMN OBJECT_NAME FORMAT A30
COLUMN STATUS FORMAT A15
spool objects.log

CREATE TABLE FOO (ID NUMBER(10), NAME VARCHAR2(30));

CREATE OR REPLACE TRIGGER TRIG_FOO
BEFORE INSERT OR UPDATE
ON FOO
REFERENCING OLD AS OLD NEW AS NEW
FOR EACH ROW
BEGIN
	IF :NEW.name IS NULL THEN
		:NEW.name := 'NOT AVAILABLE';
	END IF;
END;
/

CREATE OR REPLACE FUNCTION FUNCTION_GET_NAME_FOR_FOOID(inFooId number)
RETURN VARCHAR2
IS
	fooName VARCHAR2(30);
BEGIN
	BEGIN
		SELECT name INTO fooName FROM foo WHERE id = inFooId;
	EXCEPTION
		WHEN NO_DATA_FOUND THEN
			RETURN 'NOT FOUND';
	END;
	RETURN fooName;
END;
/

SELECT OBJECT_NAME,STATUS FROM USER_OBJECTS WHERE STATUS='INVALID';

ALTER TABLE FOO ADD ( DESCRIPTION VARCHAR2(100));

SELECT OBJECT_NAME,STATUS FROM USER_OBJECTS WHERE STATUS='INVALID';

spool off
To get the objects back to VALID status, all that needs to be done is
ALTER TRIGGER TRIG_FOO COMPILE;
ALTER FUNCTION FUNCTION_GET_NAME_FOR_FOOID COMPILE;

SELECT OBJECT_NAME,STATUS FROM USER_OBJECTS WHERE STATUS='INVALID';
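With more than a handful of invalid objects, the recompile statements can be generated instead of typed. A minimal sketch, assuming the rows come from a query such as SELECT OBJECT_NAME, OBJECT_TYPE FROM USER_OBJECTS WHERE STATUS='INVALID'; the recompile_statements helper is hypothetical and covers only common PL/SQL object types, so other object types may need different syntax.

```ruby
# Hypothetical helper: builds recompile statements from
# [object_name, object_type] rows describing invalid objects.
def recompile_statements(invalid_objects)
  invalid_objects.map do |name, type|
    case type
    when "PACKAGE BODY"
      # A package body is recompiled through ALTER PACKAGE ... COMPILE BODY.
      "ALTER PACKAGE #{name} COMPILE BODY"
    else
      "ALTER #{type} #{name} COMPILE"
    end
  end
end

invalid_objects = [
  ["TRIG_FOO", "TRIGGER"],
  ["FUNCTION_GET_NAME_FOR_FOOID", "FUNCTION"]
]
puts recompile_statements(invalid_objects)
```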