Thursday, March 29, 2012

ETL Package Problem

I'm going through the Integration Services tutorial, Lesson 1: Creating a Simple ETL Package. Everything goes fine until I run the package. I get an error about inserting NULL into the CurrencyKey field, which is set to NOT NULL. The flat file looks okay.

Any help would be appreciated.

SSIS package "Lesson 1.dtsx" starting.
Information: 0x4004300A at Extract Sample Currency Data, DTS.Pipeline: Validation phase is beginning.
Information: 0x4004300A at Extract Sample Currency Data, DTS.Pipeline: Validation phase is beginning.
Information: 0x40043006 at Extract Sample Currency Data, DTS.Pipeline: Prepare for Execute phase is beginning.
Information: 0x40043007 at Extract Sample Currency Data, DTS.Pipeline: Pre-Execute phase is beginning.
Information: 0x402090DC at Extract Sample Currency Data, Extract Sample Currency Data [1]: The processing of file "C:\Program Files\Microsoft SQL Server\90\Samples\Integration Services\Tutorial\Creating a Simple ETL Package\Sample Data\SampleCurrencyData.txt" has started.
Information: 0x400490F4 at Extract Sample Currency Data, Lookup Currency Key [30]: component "Lookup Currency Key" (30) has cached 105 rows.
Information: 0x400490F4 at Extract Sample Currency Data, Lookup Date Key [124]: component "Lookup Date Key" (124) has cached 992 rows.
Information: 0x4004300C at Extract Sample Currency Data, DTS.Pipeline: Execute phase is beginning.
Information: 0x402090DE at Extract Sample Currency Data, Extract Sample Currency Data [1]: The total number of data rows processed for file "C:\Program Files\Microsoft SQL Server\90\Samples\Integration Services\Tutorial\Creating a Simple ETL Package\Sample Data\SampleCurrencyData.txt" is 1097.
Information: 0x402090DF at Extract Sample Currency Data, Sample OLE DB Destination [158]: The final commit for the data insertion has started.
Error: 0xC0202009 at Extract Sample Currency Data, Sample OLE DB Destination [158]: An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "The statement has been terminated.".
An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Cannot insert the value NULL into column 'CurrencyKey', table 'AdventureWorksDW.dbo.FactCurrencyRate'; column does not allow nulls. INSERT fails.".
Information: 0x402090E0 at Extract Sample Currency Data, Sample OLE DB Destination [158]: The final commit for the data insertion has ended.
Error: 0xC0047022 at Extract Sample Currency Data, DTS.Pipeline: The ProcessInput method on component "Sample OLE DB Destination" (158) failed with error code 0xC0202009. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.
Error: 0xC0047021 at Extract Sample Currency Data, DTS.Pipeline: Thread "WorkThread0" has exited with error code 0xC0202009.
Information: 0x40043008 at Extract Sample Currency Data, DTS.Pipeline: Post Execute phase is beginning.
Information: 0x402090DD at Extract Sample Currency Data, Extract Sample Currency Data [1]: The processing of file "C:\Program Files\Microsoft SQL Server\90\Samples\Integration Services\Tutorial\Creating a Simple ETL Package\Sample Data\SampleCurrencyData.txt" has ended.
Information: 0x40043009 at Extract Sample Currency Data, DTS.Pipeline: Cleanup phase is beginning.
Information: 0x4004300B at Extract Sample Currency Data, DTS.Pipeline: "component "Sample OLE DB Destination" (158)" wrote 1097 rows.
Task failed: Extract Sample Currency Data
Warning: 0x80019002 at Lesson 1: The Execution method succeeded, but the number of errors raised (3) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
SSIS package "Lesson 1.dtsx" finished: Failure.
The program '[3372] Lesson 1.dtsx: DTS' has exited with code 0 (0x0).

The CurrencyKey column is set to NOT NULL, which means it cannot accept NULL values.

If I remember right, the CurrencyKey column in the AdventureWorksDW database is a primary key.

Thanks,

S Suresh

|||Then there is a problem with the tutorial. Do you know how to fix it?
|||

Hi kinny_k,

I was just wondering if you ever got an answer to this problem with the SSIS ETL tutorial. I've run into the same issue and I've checked and rechecked my steps and cannot see anything wrong.

If you have the solution/workaround I'd appreciate the input!

-mayalu

|||

Did you check whether any of the mappings in your destination component are missing?

Rafael Salas

|||

I had a problem with this too. It turns out that the flat file dates are in MM/DD/YYYY format and, since my default is DD/MM/YYYY, most dates aren't correctly interpreted.

To resolve this:

1. In the ‘Flat File Connection Manager’, change the CurrencyDate DataType to string [DT_STR].

2. In the ‘Data Flow’ diagram, a warning appears on the ‘Extract Sample Currency Data’ data flow. Double-click to open it. A message warns that the format has changed; click Yes, and when the ‘Flat File Source Editor’ is displayed, click OK to close it.

3. In the Date Lookup Transformation Editor, Reference Table tab, select 'Use results of an SQL query' and paste in:

SELECT TimeKey, FullDateAlternateKey,
CAST(CAST(CONVERT(VARCHAR(2), FullDateAlternateKey, 101) AS INT) AS VARCHAR) + '/' +
CAST(CAST(SUBSTRING(CONVERT(VARCHAR(5), FullDateAlternateKey, 101), 4,2) AS INT) AS VARCHAR) + '/' +
CAST(CAST(RIGHT(CONVERT(VARCHAR(10), FullDateAlternateKey, 101),4) AS INT) AS VARCHAR) + ' 0:00' USFormatDate
FROM dbo.DimTime

Check that the output looks OK by clicking the Preview button.

4. In the Date Lookup Transformation Editor, Columns tab, link the CurrencyDate to the new USFormatDate field.

I'd imagine there's a nicer way of fixing this but I'm not at all familiar with SSIS yet.

|||

Hi Tim,

Thanks for the input. I tried it and all goes well until the last step, where I try to map CurrencyDate to the new USFormatDate field and get a data type mismatch. I guess there's something else peculiar to my setup.

At least I know where to look closer!

-Elsie

|||

Elsie,

Hmm, I think I'd check the first step again. You have to change the flat file CurrencyDate to be a string, otherwise the comparison won't work and you'll get a type mismatch. Other than that, I'm afraid I don't know.

You could always change the FactCurrencyRate table definition to allow NULL in the TimeKey field so you don't get the error (a rough sketch of what I mean is below). In retrospect I should have done that, as it's a lot quicker to get the lesson working.
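
Purely as an untested sketch of that workaround - it assumes TimeKey is an int, and if TimeKey is part of the table's primary key you would have to drop that constraint first - so this is only for getting the lesson to run:

-- Hypothetical workaround sketch: relax the NOT NULL constraint on TimeKey
-- so rows whose date lookup fails can still be inserted while working
-- through the tutorial. Not something you'd do to a production fact table.
USE AdventureWorksDW;
GO
ALTER TABLE dbo.FactCurrencyRate
ALTER COLUMN TimeKey INT NULL;
GO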

Tim

|||It seems that the only solution is to use SSIS on a DB instance with English (US) as the default language.
At least for DB instances with non-English default languages (I have a Russian one), nothing else works:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1566456&SiteID=1&mode=1|||

In both step 9s of adding the lookup transformations, be sure to click the box to select the (CurrencyKey) lookup value. You should have a checked box before moving on. Then you can map this lookup to the column in the later step.

Enjoy!

|||I am sure I verified that (and everything else) dozens of times.
BTW, it does not address the error or the underlying problem (incompatible datetime fields).|||

To get this fully working, the following steps are required:

1. In the Sample Flat File Source Data connection manager, change the CurrencyDate Data Type to string [DT_STR].

2. In the data flow task editor, double-click the data flow source 'Extract Sample Currency Data', then accept the warning.

3. In the Lookup Time Key task, under the Reference Table tab, choose 'Use results of an SQL query', then use the following SQL statement, which I modified from an earlier post:

SELECT TimeKey, FullDateAlternateKey,
CAST(CAST(CONVERT(VARCHAR(2), FullDateAlternateKey, 101) AS INT) AS VARCHAR) + '/' +
CAST(CAST(SUBSTRING(CONVERT(VARCHAR(5), FullDateAlternateKey, 101), 4, 2) AS INT) AS VARCHAR) + '/' +
CAST(CAST(RIGHT(CONVERT(VARCHAR(10), FullDateAlternateKey, 101), 4) AS INT) AS VARCHAR) + ' 00:00:00' AS USFormatDate
FROM dbo.DimTime

4. In the Lookup Time Key task, link CurrencyDate from Available Input Columns to USFormatDate in Available Lookup Columns.

5. Make sure you link the output of the lookup to the destination columns.

Good luck,

Celine

|||

Another solution:

on http://www.microsoft.com/downloads/details.aspx?familyid=e719ecf7-9f46-4312-af89-6ad8702e4e6e&displaylang=en

you can find databases for both case-sensitive and case-insensitive collations. I attached the other database and it works now.

|||

That link says the samples download has moved to CodePlex.

And CodePlex has hundreds of samples.

Which one?

|||http://www.codeplex.com/MSFTDBProdSamples

ETL library

We are looking for an ETL library or callable application that would
allow us to transfer and transform data between platforms but mainly
into a SQL server database. We would be calling this tool from
javascript programs. Is there something out there in the midrange
price-wise? So far I am finding just high-end ETL tools that are a
little overkill for us.

Thanks."elizabeth" <ezelasky@.hotmail.com> wrote in message
news:78393913.0409150439.818af56@.posting.google.co m...
> We are looking for an ETL library or callable application that would
> allow us to transfer and transform data between platforms but mainly
> into a SQL server database. We would be calling this tool from
> javascript programs. Is there something out there in the midrange
> price-wise? So far I am finding just high-end ETL tools that are a
> little over kill for us.
>
> Thanks.

Have you looked at DTS, which is included with MSSQL? It has a command-line
executable and a COM interface available. See "Data Transformation Services"
in Books Online for more details.

If DTS isn't suitable for you, you might want to be more specific about what
features you require.

Simon|||Folks here felt that DTS was slow (copied record by record) and buggy;
that is why I was asked to search for something else. What we need is
fairly basic... to be able to move large amounts of data (in some
cases > 2 gigs) between a variety of sources (flat files, FoxPro &
Dbase) into a SQL Server database. We need to be able to set or
specify the table structure "on-the-fly" thus the reason for calling
it from a scripting environment or another environment.

Thanks.
> Have you looked at DTS, which is included with MSSQL? It has a command-line
> executable and a COM interface available. See "Data Transformation Services"
> in Books Online for more details.
> If DTS isn't suitable for you, you might want to be more specific about what
> features you require.
> Simon|||DTS can use 'fast load' bulk insert techniques to import data. This is the
fastest way to get data into SQL Server. You can create DTS packages 'on
the fly' programmatically and have control over the object model.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"elizabeth" <ezelasky@.hotmail.com> wrote in message
news:78393913.0409151028.7fcc0aca@.posting.google.com...
> Folks here felt that DTS was slow (copied record by record) and buggy
> that is why I was asked to search for something else. What we need is
> fairly basic...to be able to move large amounts of data (in some
> cases > 2 gigs) between a variety of sources (flat files, FoxPro &
> Dbase) into a SQL Server database. We need to be able to set or
> specify the table structure "on-the-fly" thus the reason for calling
> it from a scripting environment or another environment.
> Thanks.
>>
>> Have you looked at DTS, which is included with MSSQL? It has a
>> command-line
>> executable and a COM interface available. See "Data Transformation
>> Services"
>> in Books Online for more details.
>>
>> If DTS isn't suitable for you, you might want to be more specific about
>> what
>> features you require.
>>
>> Simon|||One alternative is a "roll your own" solution. Create linked servers and
just run UPDATE / INSERT / SELECT INTO statements (a rough sketch follows below). The problem is always
making sure you have the relevant drivers on the dev and production
environments. The enclosed link gives most of the connection strings needed.

http://www.able-consulting.com/ADO_Conn.htm
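
To make the idea concrete, here is a minimal untested sketch of the linked-server approach; the linked server name FOXPRO_SRV and every table and column name below are just placeholders for illustration:

-- Hypothetical sketch: pull rows from a linked server (here a FoxPro source
-- registered as FOXPRO_SRV) straight into an existing SQL Server table.
INSERT INTO dbo.Customers (CustomerID, CustomerName)
SELECT CustomerID, CustomerName
FROM OPENQUERY(FOXPRO_SRV, 'SELECT CustomerID, CustomerName FROM customers');

-- Or create the destination table on the fly with SELECT INTO,
-- which is one way to get the "on-the-fly" table structure mentioned above.
SELECT *
INTO dbo.Customers_Staging
FROM OPENQUERY(FOXPRO_SRV, 'SELECT * FROM customers');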

I'll warn you before you start this can get messy.

Regards

Adrian

elizabeth wrote:

> Folks here felt that DTS was slow (copied record by record) and buggy
> that is why I was asked to search for something else. What we need is
> fairly basic...to be able to move large amounts of data (in some
> cases > 2 gigs) between a variety of sources (flat files, FoxPro &
> Dbase) into a SQL Server database. We need to be able to set or
> specify the table structure "on-the-fly" thus the reason for calling
> it from a scripting environment or another environment.
> Thanks.
>>Have you looked at DTS, which is included with MSSQL? It has a command-line
>>executable and a COM interface available. See "Data Transformation Services"
>>in Books Online for more details.
>>
>>If DTS isn't suitable for you, you might want to be more specific about what
>>features you require.
>>
>>Simon

ETL Guidance

Hi,

We've decided to use SSIS for an ETL type project. We are going to convert some files to our special flat file format. Our special flat file format will not change, but we will need to convert about 40 different style flat files into this format. We will receive any of these files at random intervals. We will likely have to add more file formats to the 40 at some point, too.

I'd like to use a flexible design, but I just started with SSIS and it's going to be hard to test all the possible combinations, so I hope I can find some guidance in this forum.

What would be a good approach to allow for the most reuse of work to support these various file formats?

Jamie Thomson's excellent SSIS blog has an "SSIS Best Practices" list that might be a good place to start: http://blogs.conchango.com/jamiethomson/archive/2006/01/05/2554.aspx

|||At worst you'll need to make a conversion from each style of incoming file to your formatted style. More details would be helpful.|||Thanks, I read Jamie's blog every day and I recommend it to everyone, too.

ETL from SQL server to SQL CE

How can I transform data from an existing SQL Server 2005 DB to a (.sdf) SQL Server Compact Edition DB?

I don't know the data provider name for connecting to a SQL CE DB.

Any suggestions?

You can use the SQL Server Compact Edition Destination component (with a File Connection Manager) in a Data Flow Task to "export" data to SQL Server Mobile. You can place some transformations between your data source and the destination component.

ETL from multiple Access databases -SSIS?

Howdy Folks,

I have an ETL project that I want to ask a question about. I want to loop through a folder containing several Access databases, and extract and load each one into a single SQL Server 2005 database. The table layout is the same for all the Access dbs, as well as the SQL db. The number of Access dbs in the folder, their records, and their names will change. Is there a way I can use the SSIS Foreach tool to move through the folder and set the connection string inside the loop?

Thanks for your reply,

Chris

Yes, you can do this.

Use the ForEach File Enumerator to enumerate your list of files. You can use wildcards (e.g. *.mdb).

Store the file path of each file in a variable. You can then use the contents of that variable to dynamically build your connection string using an expression.

Lots of information via Google about this stuff if you want to go hunting or ask here with any pertinent questions.

Regards

-Jamie

|||

Thanks Jamie (a lot) for your suggestion. I tried using the Foreach File Enumerator with the root path for the Access files and a *.mdb wildcard. Then I created a mapped string variable varFileName to hold the file path. Inside the loop I have my Data Flow Task. In the data source for the task, I was using the variable in an expression as the ConnectionString property, which didn't work. So I added an OLE DB connection string in the expression like "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @[User::varFileName] but I don't think that's the answer.

I've been googling on this but can't find solutions for multiple .mdb's; most of what I find deals with flat files. Great blogs btw.

|||

Should be able to find all you need here: http://www.connectionstrings.com/

-Jamie

|||

Yep, I went there. Thanks Jamie. My string was OK. Actually, what got me running was setting Foreach Loop Container > Properties > Execution > DelayValidation = True. Works great running the SSIS package on the box with SQL Server destinations (as opposed to OLE DB destinations).

ETL Delta Pulling Huge data.. Right approach ?

Hi all,

While building an ETL tool, we are in a situation where a table has to be loaded on an incremental basis. On the first run, all the records (approx. 100 lakh, i.e. about 10 million) have to be loaded. From the next run on, only the records that were updated since the last run of the package, or newly added, are to be pulled from the source database. One idea we had was to use two OLE DB Source components: in one, get the records that were updated or newly added (since we have update columns in the DB, getting them is fairly simple); in the other OLE DB source, load all the records from the destination; pass both into a Merge Join, then have a Conditional Split down the pipeline and handle the updates and inserts.

Now the question is, how slow is this going to be? Could it happen that the source DB returns records pretty fast and the Merge Join fails while waiting for all the records from the destination?

What might be the ideal way to go about my scenario? Please advise.

Thanks in advance.

Or maybe just use a Lookup component to check if there are any updates and redirect the errored rows as new records to the destination. Sounds good?

|||

MShetty wrote:

Or maybe just use a Lookup component to check if there are any updates and redirect the errored rows as new records to the destination. Sounds good?

The lookup component might be the way to go; or at least it has been the most popular approach for most scenarios. However, there have been several interesting discussions around this issue.

If you decide to use lookup transforms, be sure to understand how that component uses memory resources. Also, be aware that the lookup transform is case sensitive when 'matching' the columns in the pipeline against the lookup table.

Jamie has an interesting article on his blog; Greg has also written about an alternative approach using a script component:

http://blogs.conchango.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx

Using two OLE DB source components and a merge join would, I think, be slower; but you can test it and gain a better understanding of that.

|||

Rafael Salas wrote:

The lookup component might be the way to go; or at least it has been the most popular approach for most scenarios. However, there have been several interesting discussions around this issue.

If you decide to use lookup transforms, be sure to understand how that component uses memory resources. Also, be aware that the lookup transform is case sensitive when 'matching' the columns in the pipeline against the lookup table.

Looks like memory usage is going to be an issue in the lookup, especially when the record count is pretty huge.

|||

Maybe it is not an issue at all; my advice on that: test it!

Make sure you provide a query in the lookup transform with only the columns to be used for the lookup operation, so the result set is as small as possible column-wise.

|||

Rafael Salas wrote:

Maybe it is not an issue at all; my advice on that: test it!

Make sure you provide a query in the lookup transform with only the columns to be used for the lookup operation, so the result set is as small as possible column-wise.

I strongly agree. The Lookup transform performs very well if you use it right.

|||I did some testing with about 100,000+ rows in my forum post Merge Join vs. Lookup vs. Custom Script.

This should give you an overview - but please do your own testing before going live, as the performance may vary across configurations.|||Thanks for all the valuable inputs. I think the Lookup is not a worry; it's the components down the pipeline, the OLE DB Command, that I am worried about. If it's going to update a lot of rows, it's going to hit the performance big time. Sad|||

TheViewMaster wrote:

I did some testing with about 100,000+ rows in my forum post Merge Join vs. Lookup vs. Custom Script.

This should give you an overview - but please do your own testing before going live, as the performance may vary across configurations.

Amen to that. It's no good asking on here - the only person that can provide a definitive answer is yourself.

-Jamie

|||

MShetty wrote:

Thanks for all the valuable inputs. I think the Lookup is not a worry; it's the components down the pipeline, the OLE DB Command, that I am worried about. If it's going to update a lot of rows, it's going to hit the performance big time.

Yup - updating a large number of rows can take forever.
I also came to the conclusion that with a large number of records the OLE DB Command object won't perform well. The solution I use is to load the data which needs to be updated into a staging table - and then an Execute SQL Task to update it,
e.g.
UPDATE a1
SET a1.Field = a2.Field
FROM Prod_Table a1
JOIN Staging_Table a2 ON a1.PrimaryKey = a2.PrimaryKey|||

TheViewMaster wrote:

MShetty wrote:

Thanks for all the valuable inputs. I think the Lookup is not a worry; it's the components down the pipeline, the OLE DB Command, that I am worried about. If it's going to update a lot of rows, it's going to hit the performance big time.

Yup - updating a large number of rows can take forever.
I also came to the conclusion that with a large number of records the OLE DB Command object won't perform well. The solution I use is to load the data which needs to be updated into a staging table - and then an Execute SQL Task to update it,
e.g.
UPDATE a1
SET a1.Field = a2.Field
FROM Prod_Table a1
JOIN Staging_Table a2 ON a1.PrimaryKey = a2.PrimaryKey

The approach ViewMaster is describing is exactly what I do in such cases. OLE DB commands are simply slow by nature - the command gets executed once per row - so that implementation is just not adequate for a large number of rows. You can use a lookup to separate your inserts from updates; the insert pipe can then go directly to the destination table, and the other one to a stage table that is used back in the control flow in an Execute SQL Task (a one-time, set-based update); a rough sketch of the whole pattern is below.
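
As an untested sketch of that pattern, reusing the placeholder table and column names from the example quoted above:

-- Hypothetical staging-table update pattern: the data flow bulk-loads the
-- changed rows into Staging_Table, then a single Execute SQL Task runs one
-- set-based update instead of one OLE DB Command call per row.
UPDATE a1
SET a1.Field = a2.Field
FROM Prod_Table a1
JOIN Staging_Table a2 ON a1.PrimaryKey = a2.PrimaryKey;

-- Clear the staging table so the next package run starts empty.
TRUNCATE TABLE Staging_Table;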

|||Thanks for the inputs... Will definitely post the test results once I test out the packages. Smile