@jiuks
Created March 6, 2013 07:14
ideas for an article entitled Don't Repeat Yourself

My background is writing code, and one of the most useful things I learnt was Don't Repeat Yourself. I read this in a book called The Pragmatic Programmer, which I would thoroughly recommend to anyone who has anything to do with software.

What interests me now is how we should relate this to the Business Intelligence, Data Integration and Analytics worlds. A conversation I had yesterday reminded me of several others I have had over the years: how can we automate ETL and Business Intelligence metadata generation?

A traditional Data Warehousing project will follow some or all of these steps, whether in repeated Agile sprints or as one big Waterfall project:

  • elicit, document and validate requirements.
  • design logical and then physical model.
  • identify source data attribute(s) and transformations for target model.
  • write ETL packages to move data through whichever data warehouse architecture you are following.
  • build metadata in the reporting tool to reflect logical model.
  • build reports and dashboards to fulfil requirements.
  • perform various kinds of testing (integration, regression, user acceptance, performance).
  • release and adopt.

I am simplifying the process above: I am assuming perfect data quality and have bucketed all kinds of testing together.

This process has remained remarkably unchanged for a number of years, and in the Oracle Business Intelligence and Data Warehousing world it generally remains pretty manual. The end result: we are repeating ourselves.
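To make the repetition concrete: the physical model, the ETL and the reporting metadata each restate the same list of columns. Below is a minimal sketch, in Python, of declaring a source-to-target mapping once and deriving all three from it. Every name here (ColumnMapping, TableMapping, dim_customer, src_customers and the helper functions) is hypothetical and invented for illustration; it is not how any particular tool or project works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One target column and the source expression that populates it."""
    target: str
    data_type: str
    source_expr: str


@dataclass(frozen=True)
class TableMapping:
    """A single declaration of a target table, its source and its columns."""
    target_table: str
    source_table: str
    columns: tuple


def create_table_sql(m: TableMapping) -> str:
    """Physical model: derive the DDL from the mapping."""
    cols = ",\n".join(f"  {c.target} {c.data_type}" for c in m.columns)
    return f"CREATE TABLE {m.target_table} (\n{cols}\n)"


def load_sql(m: TableMapping) -> str:
    """ETL: derive the INSERT ... SELECT from the same mapping."""
    targets = ", ".join(c.target for c in m.columns)
    exprs = ",\n".join(f"  {c.source_expr} AS {c.target}" for c in m.columns)
    return (
        f"INSERT INTO {m.target_table} ({targets})\n"
        f"SELECT\n{exprs}\nFROM {m.source_table}"
    )


def report_metadata(m: TableMapping) -> dict:
    """Reporting layer: derive display names by naming convention."""
    return {c.target: c.target.replace("_", " ").title() for c in m.columns}


# Hypothetical mapping, declared once and reused by all three generators.
customer_dim = TableMapping(
    target_table="dim_customer",
    source_table="src_customers",
    columns=(
        ColumnMapping("customer_id", "INTEGER", "cust_no"),
        ColumnMapping("customer_name", "VARCHAR(100)", "TRIM(cust_nm)"),
    ),
)

if __name__ == "__main__":
    print(create_table_sql(customer_dim))
    print(load_sql(customer_dim))
    print(report_metadata(customer_dim))
```

Adding a column to customer_dim changes the DDL, the load statement and the display names together, which is the point of Don't Repeat Yourself: one authoritative definition, with everything else generated from it.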

If we look at how development has changed in other areas, the most obvious example being web applications, we see that a host of frameworks and accelerators have been written: look at Ruby on Rails, Merb, Ember and Node.js. Significant sites have been, and are, written using these tools. LinkedIn uses Node.js and Twitter was originally written in Rails.

As well as accelerating the development process, these tools impose standards, such as naming conventions, on the application. Most significantly, they stop a lot of repetition.

So when are we going to see something similar from Oracle? To be honest I don't think we ever will, so this is one of the internal projects we are starting this year at Rittman Mead.

We already have a project called Transcend...
