Database Design and Modeling I
Throughout my experience I come across so many database designing problems. And today I want to share some of the common influences and impacts of most common designing mistakes that bring big disaster to the whole project.
One of the most common designing mistakes is the “Poor Planning / Architecture” of the database or mart.
Let me start with an example. Suppose you brought a piece of land for construction of house today and today itself you hired a contractor. Now tell me, do you want the contractor to start putting foundations from the same day or the next day itself. Do you want him to build your house without any sketch or plans or Blue Prints? Or even without knowing that the land you got is not going to sink or if it is legal to build house on that land. Or even without thinking how much should be the thickness of the wall so as to prevent any unwanted foreign activities. I guess you would say “NO” but in case you have YES in your mind then I don’t think reading further will help you anymore.
A proper design and planning is needed in order to make sure that at the end of the day, your dream house get built without any hassles or obstacles.
The same way a proper planning and designing is required for a robust database to be built. For any business to become success and sustain in the profitable margin, databases plays a vital role, it’s the pillar stone for any project or business. It has to be under proper care and attention and not to be like to get it done in one-night.
If proper care was not taken at the beginning stage for planning and mapping of all the resources, project requirements, etc.; it will become worse to bring any changes at the later stages resulting in more instability down the line. Moreover chances are there that it may divert the project to the wrong direction. It’s more like hunting in dark night expecting dears to be everywhere.
In most cases, just for the sake of getting it done, the databases got deployed and by the time it is realized that it would have done under proper techniques and guidance, no more time is there to go back and fix it.
Mapping is a big problem in many cases for complex applications. Hence most of the designers prefer to put all the columns into one table and not preferring normalization techniques as a good tool for design. But a good design should cover all the databases functionalities and qualities like:
- Performance with respect of ETL operation (loading); and especially for query extraction (reporting).
- Database Extensibility to cover future needs.
- Perfect balance of normalized (we may lose more space i.e. hard drive cost) and de-normalized (may need more processing i.e. CPU and RAM cost) tables to feed every requirement of the project.
- Proper testing analysis for both known and unknown data set.
One of the common situations in designing is the proper implementation of 1-to-Many relationship that we usually do to avoid repeated fields. But in real world scenario, how much this is
significant. Let me give one example on this. In your customer database, some customers may have more than one phone numbers. But the question is; do all the systems or applications really need to track all the phone numbers of a particular customer.
Let’s diagnose the problem little deeply. How many customers do have more than one phone numbers? May be 1 in 100th or 2 in 1000th. That means on an average of only 0.5 percent of customers have more than one phone numbers and those also at most 2 numbers for each customer. Now if we provide PhoneNo1 and PhoneNo2 as two fields in our database to meet the requirement. Then is it good to waste the rest 99.5 percent as NULL values.
Well after little diagnostics, let’s decide to take one child table to keep information about phone numbers, resulting in better space management; like the one that shown below:
But the real question here is that “Is it required to have more than one phone numbers for a customer?” If the answer is YES, then we have to go for child table; but if it is NO then it is always good to avoid such complex design process and the effort needed to achieve this. Instead we can use the same effort in other productiveness. Moreover, in above case, the real issue is not about the wasted space but rather of wasted effort. What I mean to say is that while designing, main focus should be there on business critical measures and requirements.
Another question is about the sustainability and stability of the database i.e. is the above design good enough to answer the growth and expansion of the business or not. In order to have a foresight of your designing process to keep it sustainable over long period, a little analysis about the client business and the market behavior will be much helpful. Business requirements and hence the data structure may change from time to time, may be sooner or later. The model that is designed today may support tomorrow or maybe not. If we know that the data structure is changing eventually, then it’s always better to wait till the requirement becomes stable; instead of rebuilding the design which may or may not be nearer to the actual requirements.
Well for the above problem, if we think little out of box, then we can see that although we have taken child table to keep information about the phone numbers. But it can be optimized further to a better design.
Recall once again the requirement, what it says, a customer may have more than one phone numbers but at the same time, it’s very true that a customer will have at least one phone number (not having a phone number is very rare case). Hence, if I keep first phone number (customers with only single phone number) inside the main table and will move the second one (customers with double phone numbers) to the child table, it not only solves the space problem as well as increases the efficiency and performance of the extraction queries. As shown below:
All phone numbers has its particular pattern and it will never match with the CustID. As we already concluded that 0.5% of customers have more than one phone numbers, then for the rest of the customers, we can directly fetch their phone numbers from the main table itself. Only for 0.5% customers, we need to map with the child table to get their phone numbers. And it can be easily done by conditional statements (i.e. case statement) for phone numbers where it ties to the Customer IDs.
Although, it’s true that it’s practically impossible to get around all the requirements and problems; and plan accordingly in the design phase. Possibilities are there that some unknown issues may arise at the later stage of deployment itself but it is important to militate against potential problems as much as possible, by careful planning.
Register yourself or Subscribe to get the updates for better designing practices.