Architecture and Data Blog

Thoughts about intersection of data, devops, design and software architecture

Automated testing of custom sql

Avoid silent failures

Lots of time hand coded SQL is written in the applications for performance sake, getting data using complex SQL out of the tables. As the database structure changes it becomes really hard to find out all the SQL that needs to be changed. In other cases SQL is generated based on certain conditions in the code and its hard to find if the generated SQL is valid after database changes are done.


Publish data models in CI Pipeline

Enable wider usage of data models by publishing the latest models on team Wikis

Many a times ER models are created by the data team and are not shared outside of the data team generally for the lack of tools licenses, since its not feasible for the entire team to purchase licenses for the ER modelling tools such as Erwin Data Modeller or Er Studio


SSL Connection to AWS Aurora

Creating secure connection to the AWS aurora involves multiple steps

On a recent project we had to connect to AWS Aurora postgres 10.6 version of the database in SSL mode using JDBC and Java 11 JRE. When the Aurora cluster is setup, we can force all connections to use SSL by using the options group settings (forceSSL=true), establishing secure connection from the application to the database is not as easy as it looks.


Static Analysis of PL/SQL code

Applying modern development practices to database code

In all new development and sometimes during legacy codebase modernization, developers tend to add code quality checks and static analysis of codebase such as style checks, bug finders, cyclomatic complexity checking etc. into the CI/CD pipeline. When we inherit a codebase that has much PL/SQL and there is a desire to put the PL/SQL code base through the same types of code analysis, what options does a developer/dba have?


Using liquibase to load data and ignore some columns

Loading data into tables is needed many times on projects to load test, Liquibase provides a method to load data into tables with lots of customization. In the example shown below, I’m loading zip code data with the following column layout

"Zipcode","ZipCodeType","City","State","LocationType","Lat","Long","Location","Decommisioned","TaxReturnsFiled","EstimatedPopulation","TotalWages"

Synonyms as abstraction layer

In many development shops, developers are not allowed to access the database schema directly, and are not allowed to create tables, indexes, views etc, instead are given access via a different schema that allows SELECT, UPDATE and DELETE access on data. The general reason is to avoid developers creating database objects without


Using rake and activerecord to generate boilerplate DB Code

Generating code that is necessary for standard code

IN many projects, there are tables which need default audit columns such as Created_By, Created_Date, Modified_By, Modified_date and other columns that need to be updated every time some actions are done against the tables and/or columns. This type of functionality can be implemented using triggers.


Migrations in NoSQL databases

In relational database usage the pattern of migrations is well understood and has gained widespread acceptance. Frameworks such as DBDeploy, DBMaintain, MyBatis migrations, Flyway, Liquibase, Active Record Migrations and many others. These tools allow to migrate the database and maintain the version history of the database in the database.

With the rise of NoSQL Databases and their adoption in development teams we are faced with the problem of migrations in NoSQL databases. What are the patterns of data migrations that work in NoSQL databases? as NoSQL databases are schema free and the database does not enforce any schema validation, the schema of the data is in the application and thus allows for different techniques of data migration.


10 node mongodb ReplicaSet on a Single Machine"

While doing evalauation of NoSQL databases, we had a 10 node riak cluster and wanted check how a similar setup would work with mongodb. So started to setup a 10 node mongodb cluster. Since this was for initial spikes, we decided to set this up on a single machine as with the other test setup using Riak.

Before I explain how we setup 10 node mongodb ReplicaSet, let me talk about replica sets. MongoDB implements replication, providing high availability using replica sets. In a replica set, there are two or more nodes participating in an asynchronous master-slave replication. The replica-set nodes elect the master node, or primary node, among themselves and when the primary node goes down, the rest of the node elect the new primary node.


10 node Riak cluster on a single machine

When trying to evaluate NoSQL databases, its usually better to try them out. While trying them out, its better to use them with multiple node configurations instead of running single node. Such as clusters in Riak or Replica-set in mongodb maybe even a sharded setup. On our project we evaluated a 10 node Riak cluster so that we could experiment with N, R and W values and decide which values where optimal for us. In Riak here is what N, R and W mean.