Tuesday, April 30, 2013

How to find the total number of rows in your Database and the rate at which new rows are being added

I was working on a data migration project and needed a way to quantify the progress. Since 1 particular issue we were facing was that the whole process slowed down gradually as it went, I wanted a way to collect data points to know the current insertion rate. Kind of how you'd want to monitor the download speed when downloading a movie or something :).

In this thread, I found the following script that would return the total number of rows contained within a database using clustered indices (i.e. the primary keys), and the heap. So the only reason this would not work is if the database has design flaws.

SELECT SUM(row_count)
    FROM sys.dm_db_partition_stats
    WHERE index_id IN (0,1)
    AND OBJECTPROPERTY([object_id], 'IsMsShipped') = 0;

I added the NOLOCK hint because I don't want this to add to the problem by locking the table. Besides it is just a monitoring tool and I can compromise on the accuracy.

So the logic is,
1. collect 2 data points with an interval in between. Higher the interval, the more accurate the rate is. But you don't want to wait for too long either. So I've found that a 10 second interval works best for me.
2. get the delta which tells us the number of rows added within that period
3. divide it by the time taken for the whole process to get the number of rows per second.

DECLARE @startcount INT, @endcount INT
DECLARE @startdate DATETIME, @enddate DATETIME

SELECT @startcount = SUM(row_count),
@startdate = GETDATE()
FROM sys.dm_db_partition_stats WITH(NOLOCK)
WHERE index_id IN (0,1)
AND OBJECTPROPERTY([object_id], 'IsMsShipped') = 0

WAITFOR DELAY '00:00:10'

SELECT @endcount = SUM(row_count),
@enddate = GETDATE()
FROM sys.dm_db_partition_stats WITH(NOLOCK)
WHERE index_id IN (0,1)
AND OBJECTPROPERTY([object_id], 'IsMsShipped') = 0

SELECT @startcount StartCount,
@endcount EndCount,
@startdate StartTime,
@enddate EndTime,
(@endcount - @startcount)/DATEDIFF(second, @startdate, @enddate) [Rec/s]

Please note that you need to run this query with a user that has access to sys objects.

Wednesday, April 17, 2013

Adding a custom dataflow control to SSIS 2008 and 2012 (Visual Studio BIDS 2008 and 2010)

So recently i wanted to create a custom SSIS data source that could use an Azure Table storage as the data source. i found an excellent (possibly the only) implementation of this here:


The biggest problem I ran in to, being a SSIS NOOB, was getting this custom source to BIDS (Business Intelligence Development Studio)

I was using both SSIS 2008 and 2012. which meands BIDS 2008 and 2010 respectively.

Here's a copy of the thread on this here for convenience (thread already covers all the keywords which i don't want to type :))


I've created a custom data source using the instructions given in the following link:http://msdn.microsoft.com/en-us/library/ms136088.aspx
However once i build the project and try to add the new datasource to the toolbox, i get the following error:
There are no components in 'C:\Somepath\bin\Debug\CustomDataSource.dll' that can be placed on the toolbox
following are the tools i've used:
To create and build the custom datasource; Visual Studio 2010 & build in Framework 3.5
To create the SSIS package SSIS - SQL Server 2008 R2 & SQL Server Business Intelligence Development Studio (VS2008)


i found the answer to my problem; here's the solution and hope it helps someone.
  1. copy the dll to the following folder(s). the exact folder depends on whether your on x64 or x86 (doesn't hurt to have it on both):
    C:\Program Files (x86)\Microsoft SQL Server\100\DTS\PipelineComponents
    C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents
  2. Install the dll to your GAC. you will need to have it signed to do this.
  3. In BI Dev Studio, go to Tools > Choose Toolbox Items. you will find your data source in the "SSIS Dataflow Items" tab
For steps 1 & 2 you might find it helpful to have a post build event like follows:
"C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\gacutil.exe" -u $(TargetName) 
"C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\gacutil.exe" -iF $(TargetFileName) 
copy $(TargetPath) "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents" 
copy $(TargetPath) "C:\Program Files (x86)\Microsoft SQL Server\100\DTS\PipelineComponents"

If you're using SSIS 2012 with BIDS 2010, the toolbox items get added automatically (i.e. u don't have to go to Tools > Choose Toolbox Items) you just need to copy it to the dll right places (ref - http://fendy-huang.blogspot.com/2011/09/how-to-add-ssis-custom-component-to.html):
C:\Program Files\Microsoft SQL Server\110\DTS\PipelineComponents   
C:\Program Files (x86)\Microsoft SQL Server\110\DTS\PipelineComponents   
C:\Program Files\Microsoft SQL Server\110\DTS\Tasks   
C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Tasks