找出导致错误的行(Find out which row caused the error)

编程入门 行业动态 更新时间:2024-10-22 20:32:55
找出导致错误的行(Find out which row caused the error)

我有一个很大的胖查询,它是动态编写的,用于集成一些数据。 基本上它的作用是查询一些表,加入其他表,处理一些数据,然后将其插入到最终表中。

问题是数据太多,我们无法真正信任这些数据源,因为可能存在一些错误或不一致的数据。

例如,我在使用客户数据库开发时花了将近一个小时寻找错误,因为在我的大胖查询中间某处将一些varchar转换为datetime时出错。 事实证明,他们有一些销售约会'2009-02-29',一个超出范围的日期。 是的,我知道。 为什么存储为varchar? 那么,源数据库有3列日期,'月','日'和'年'。 我不知道为什么会这样,但它仍然是。

但是,如果来源不可靠,我该如何对待呢?

我无法处理异常,我真的需要它与原始消息达到另一个级别,但我想提供更多信息,以便用户在致电我们之前至少可以尝试解决它。

所以我想向用户显示行号,或者某个ID,至少可以让他知道他必须纠正的记录。 这也是一项艰苦的工作,因为有时候集成将运行多达80000条记录。 在80000记录集成中,单个虚拟错误消息: '将varchar数据类型转换为日期时间数据类型导致超出范围的日期时间值'意味着什么都没有。

所以任何想法都将不胜感激。

哦,我正在使用带有Service Pack 3的SQL Server 2005。


编辑:

好的,所以对于我所读到的答案,最好的办法是检查每一列对于引发错误至关重要,如果他们确实参加了这个条件,我应该自己提出一个错误,我发现这个消息更具描述性,并添加一些可能存储在单独的表或某些变量中的信息,例如行的ID或其他一些根信息。

I have a big fat query that's written dynamically to integrate some data. Basically what it does is query some tables, join some other ones, treat some data, and then insert it into a final table.

The problem is that there's too much data, and we can't really trust the sources, because there could be some errored or inconsistent data.

For example, I've spent almost an hour looking for an error while developing using a customer's database because somewhere in the middle of my big fat query there was an error converting some varchar to datetime. It turned out to be that they had some sales dating '2009-02-29', an out-of-range date. And yes, I know. Why was that stored as varchar? Well, the source database has 3 columns for dates, 'Month', 'Day' and 'Year'. I have no idea why it's like that, but still, it is.

But how the hell would I treat that, if the source is not trustable?

I can't HANDLE exceptions, I really need that it comes up to another level with the original message, but I wanted to provide some more info, so that the user could at least try to solve it before calling us.

So I thought about displaying to the user the row number, or some ID that would at least give him some idea of what record he'd have to correct. That's also a hard job because there will be times when the integration will run up to 80000 records. And in an 80000 records integration, a single dummy error message: 'The conversion of a varchar data type to a datetime data type resulted in an out-of-range datetime value' means nothing at all.

So any idea would be appreciated.

Oh I'm using SQL Server 2005 with Service Pack 3.


EDIT:

Ok, so for what I've read as answers, best thing to do is check each column that could be critical to raising errors, and if they do attend the condition, I should myself raise an error, with the message I find more descriptive, and add some info that could have been stored in a separate table or some variables, for example the ID of the row, or some other root information.

最满意答案

这听起来像标准的ETL问题:提取,转换和加载。 (除非你必须一遍又一遍地对同一组数据运行这个查询,在这种情况下,你几乎一遍又一遍地做同样的事情。那么性能有多重要?)

您可以提供哪种错误处理和/或“报告错误数据”? 如果您将所有内容都视为“一个大胖查询”,您的选项将变得非常有限 - 无论是查询是否有效,如果不是,我猜你最多只能获得一条RAISERROR消息来告诉来电者什么是。

在这种情况下,我尝试设置的一般框架是:

从源表开始 生成一组临时表(SQLMenace的临时表),您知道这些表是一致的并且是正确形成的(有效数据,密钥等) 对这些表写下“不太大而且很胖的查询”

通过这种方式,您应该始终能够返回(或存储)有效数据集...即使它是空的。 诀窍在于确定例程何时失败 - 何时数据太糟糕而无法处理并产生所需的结果,因此您返回一个措辞恰当的错误消息呢?

This sounds like a standard ETL issue: Extract, Transform, and Load. (Unless you have to run this query over and over again against the same set of data, in which case you'd pretty much do the same thing, only over and over again. So how critical is performance?)

What kind of error handling and/or "reporting of bad data" are you allowed to provide? If you have everything as "one big fat query", your options become very limited -- either the query works or it doesn't, and if it doesn't I'm guessing you get at best one RAISERROR message to tell the caller what's what.

In a situation like this, the general framework I'd try to set up is:

Starting with the source table(s) Produce an interim set of tables (SQLMenace's staging tables) that you know are consistant and properly formed (valid data, keys, etc.) Write the "not quite so big and fat query" against those tables

Done this way, you should always be able to return (or store) a valid data set... even if it is empty. The trick will be in determining when the routine fails -- when is the data too corrupt to process and produce the desired results, so you return a properly worded error message instead?

更多推荐

本文发布于:2023-07-31 00:10:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1340345.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:错误   Find   row   caused   error

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!