Sunday, March 11, 2012

Errors from training mining models in SSIS

I am creating an Integration Services package to automate training mining models and getting predictions. Until recently, I have been processing the models directly from Business Intelligence Development Studio without any problems. However, when I use the exact same training set as the input to the Data Mining Model Training destination, I get several errors. Here is the output:

[Mining Models [1]] Error: Parser: An error occurred during pipeline processing.

[Mining Models [1]] Error: Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.

[Mining Models [1]] Error: Errors in the OLAP storage engine: An error occurred while the 'CPT MODIFIER' attribute of the 'BCCA DMS ~MC-CLAIM LIN~5' dimension from the 'BCCA LRG DMS TEST' database was being processed.

[Mining Models [1]] Error: File system error: The record ID is incorrect. Physical file: . Logical file: .

[Mining Models [1]] Error: Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.

[Mining Models [1]] Error: Errors in the OLAP storage engine: An error occurred while the 'BILL TYPE' attribute of the 'BCCA DMS ~MC-CLAIM LIN~5' dimension from the 'BCCA LRG DMS TEST' database was being processed.

[Mining Models [1]] Error: File system error: The record ID is incorrect. Physical file: . Logical file: .

[DTS.Pipeline] Error: The ProcessInput method on component "Mining Models" (1) failed with error code 0x80004005. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.

I have not been able to find an answer as to why this is happening. I found a post regarding a similar problem with processing an OLAP cube in SSIS, but it seems that the author of that post never found an answer. Has anyone else here seen similar errors when processing mining models from Integration Services?

Also, if I process the mining models manually then try to run only predictions in SSIS, I get many of the same errors. I'll keep looking into the problem myself, but I would be very grateful if someone in this forum could shed some light on this issue.

Can you try sorting your input in ascending order by the column that is mapped to the model's key column?

If this does not work, could you please post some additional information:

- what algorithm is used by your mining model?

- are there other mining models in the same mining structure? If so, what algorithms do they use?

- the columns of your mining model, their data type and content type

- the structure of the Integration Services pipeline (data types for the columns and their mappings)

|||

Bogdan,

Thanks for your quick reply! I was actually able to fix this problem by eliminating one of the columns from my mining structure. It was a varchar field, but it turns out that there was no useful information in that column (every row contained a blank entry). SSIS was throwing an error when trying to build a dimension for that field. Obviously it's no loss to eliminate a column with no real values in it, but I'm still a little confused as to why it would process through an Analysis Services project but not in SSIS.
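For anyone who hits the same errors, a quick check for columns that contain nothing but blank entries may be worth running on the training set before it reaches the Data Mining Model Training destination. The following is only a rough sketch in Python, not part of SSIS or Analysis Services: the function name `find_blank_columns` and the sample column names are made up for illustration, and it assumes the rows have already been read into dictionaries keyed by column name.

```python
from typing import Any, Iterable, List, Mapping, Optional, Set


def find_blank_columns(rows: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the names of columns whose value is None or blank in every row.

    Hypothetical helper for illustration; rows are dicts keyed by column name.
    """
    candidates: Optional[Set[str]] = None
    for row in rows:
        # Columns that are blank in this particular row.
        blank_here = {
            name
            for name, value in row.items()
            if value is None or (isinstance(value, str) and value.strip() == "")
        }
        # A column stays a candidate only if it has been blank in every row so far.
        candidates = blank_here if candidates is None else candidates & blank_here
        if not candidates:
            break  # No column can be blank everywhere; stop early.
    return sorted(candidates or [])


# Made-up sample data: "notes" is blank in every row, "bill_type" only in one.
sample_rows = [
    {"claim_id": 1, "bill_type": "A", "notes": ""},
    {"claim_id": 2, "bill_type": "", "notes": "   "},
    {"claim_id": 3, "bill_type": "B", "notes": None},
]
blank_columns = find_blank_columns(sample_rows)
```

Any column this reports is a candidate for removal from the mining structure, as was done with the empty varchar field above.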

|||

SSIS has different data handling and transformation logic than AS. It may just be that the conversion from the database type to the internal type failed in SSIS but normally works in AS.
