In my previous posts << SSIS vs Text File Importing –I >> and << SSIS vs Text File Importing –II >> , we discussed about the Row Delimiter issue of SSIS while dealing with text files and the workaround for the same. In this post I will show you how to achieve the same through SSIS package.
Read more…
In my previous post named “<<SSIS vs Text File Importing –I>>”, we discussed about the problem statement for the strange behavior of SSIS over importing of Text Files. In this post, I will disclose the reason behind it. Let’s see how Excel and SSIS treats text files while importing data from it.
Read more…
Last week I come across a serious problem with SSIS while dealing with text files especially with tab delimited text files. It is showing the wrong data after importing into SQL SERVER database. I was shocked by this behavior of SSIS. The total row count in the destination table is much smaller than the source file row count. When I opened the same file in Excel or notepad, I saw the data is correct and properly aligned. Then I deeply dived into the problem and tried to figure out what may be the cause for such an unexpected behavior. Even I am surprised more when I implemented the same process in DTS package (SQL 2000); there it is showing me correct number of records at the destination. Pretty strange, all three products i.e. SSIS, DTS and Excel are from the same manufacturer (Microsoft), but showing different behavior for same type of process (importing data from text file).
Read more…
In SQL server 2000, scheduling a DTS package is not as easy as compare to SSIS packages in SQL 2005 or 2008. As the DTS packages are basically stored on the server itself and are being managed on the server only, it’s little tricky to schedule more than one packages in a single Job.
In a job we don’t have any options to directly link a DTS package other then “Operating System Command (CmdExec)” under which DTSRun command is used to run the package. The drawback here is the way Steps are linked to packages.
Read more…
Here comes one more challenging yet interesting topic to tide over. The requirement goes something like this: You have some sources, let it be some sales data or some Call center data coming from different sources. Data can be of different media. But you are able to load it in your staging tables i.e. one staging database you are maintaining like StageDB for storing these incremental data.
So, here we have both master i.e. Dimensional data and Detail i.e. Fact data in our stage database. But before processing further and loading it into our Mart or Data warehouse, we need to check if there is any redundancy at the row level for each of these tables in the staging database. Here comes the main problem, what if you don’t know how many tables are there in the staging database and how many columns are there for each table in the staging database but still you have to keep only unique records for each table.
Read more…
SQL Server 2005 has some unique features to deal with the Transaction system in the database world. It has some unique sets to take care of every possibility of transactions or types of transaction. Technically, it will give us discrete ways to isolate the transactions from occurrence of deadlocks or crashes.
Before going deeper to the Isolation level that SQL Server provides to distinguish types of transaction, let’s have a look on the definition of the TRANSACTION. What does transaction means in real world and in a database scenario?
Read more…
In my previous article named “Programming Foreach Loop Container – Enumerating Excel Files”, we discussed about enumerating Excel files in a folder and load data from each excel file to the destination by using Foreach File enumerator.
But there may be cases where we may have data in the form of sheets in a single excel file. For example, the total sales of study materials over different countries.
To load all the sheets from a single excel file in Foreach Loop Container, we have to use “Foreach ADO.NET Schema Rowset Enumerator”. This special enumerator is used to enumerate through schema information about a data source i.e. the schema information supported by the OLEDB provider. For example, to get list of tables, schemas, column information, constraints, etc from a database.
Read more…
Sometimes we were not aware of the contents of any object and we have to do certain tasks as many times as the no. of contents of that object. For example, a folder contains some text files with employee information data and we have to load all the text files data to our SQL server employee table. Here, we don’t know about the count of the source but we have to roll over every file. For this type of situation, SSIS have a unique container called “Foreach Loop Container” that will enumerate with each text file in the folder.
It can be used in some other cases too like enumerating each sheet of a excel file, or parsing each row of a table, enumerating each child node of a given tree, etc. In SQL server 2000 DTS package, it is bit difficult to loop over any task but fortunately SSIS has a special container to accomplish such tasks.
Read more…
While building an ETL package, sometimes we come across to repeat same types of tasks many times i.e. like to do certain task depending upon the number of students in the class, or any other condition, etc. So instead of developing same containers or tasks so many times, it’s always better to iterate it with certain parameters.
To resolve the above situation, SQL Server Integration Service (SSIS) comes with one special container called “For Loop Container“. This container has the same logic that a “For loop” keyword have in leading programming languages. It will iterate through all the tasks inside it till the condition satisfies. It acts as a repeating control flow. For each iteration of the loop, the For Loop container evaluates an expression that is specified and repeats its workflow until the expression evaluates to False.
Read more…
While building an ETL package, it’s very important to control the flow of work as per the requirement. And this can be done by using a SSIS component called Sequence container. The sequence container is a part of Control Flow and a sub part of the global container i.e. Control Flow container. Here we can set the flow of our package by precedence constraints.
The Sequence Container is basically used for grouping same type of tasks and to control them under a separate group. It can group the whole contents of the package into multiple separate control flows. The sequence container can be nested to any level to give more clarity to the process flow.
Here in this article, we will see the steps to control the tasks using Sequence container as well as how to do the same by programming Integration Services.
Read more…
Recent Comments