需要帮助从C#诊断SQL Server奇怪的查询超时

编程入门 行业动态 更新时间:2024-10-28 10:33:49
本文介绍了需要帮助从C#诊断SQL Server奇怪的查询超时的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我已经开发了许多.NET/SQL Server应用程序,但是我遭受了SQL查询超时的困扰,无法了解到底.我在查找有问题的查询并为它们重新编制索引/重新编写方面具有丰富的经验.我的Web应用程序使用SQL Server的RDS和Web应用程序的EC2托管在AWS上.我们每天有100-200个唯一身份用户,数据库约为15GB,其中有两个表大于1GB.

I have developed many .NET / SQL Server applications but I'm suffering from SQL query timeouts that I can't get to the bottom of. I have lots of experience in this area of finding the offending queries and re-indexing / re-writing them. My web app is hosted on AWS using RDS for SQL Server and EC2 for the Web App. We have 100-200 unique users per day and the database is around 15GB with a couple of tables > 1GB.

我全天都会看到以下异常消息:

I see exceptions throughout the day with the message:

'Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.'

遭受超时的查询与发生超时的时间一样随机.它似乎与任何显而易见的事情(备份运行一整夜等)都不相符.

The queries that suffer from timeouts are as random as the time the timeouts occur. It doesn't seem to coincide with anything obvious (backups run overnight etc).

我尝试从C#应用程序获取每个查询,然后直接在SQL中运行(使用与Arith Abort相同的SET选项),它们都运行良好.有些查询本质上是较慢的查询,但是最慢的查询在大约2秒钟内运行,并且具有约40万个逻辑读取.但是,我还看到查询超时在15毫秒内运行,并且<10个逻辑读取.

I have tried taking each query from the C# app and running it directly in SQL (with the same SET options like Arith Abort) and they all run just fine. Some are slower queries by nature but the slowest one runs in about 2 seconds and has ~400k logical reads. However, I also see queries timeout that run in 15ms and have < 10 logical reads.

我看到的最奇怪的事情是我已经从Web应用程序中获取了一个查询,并将其编码到已经运行24小时的控制台应用程序中,每秒调用一次查询.即使我已经看到主系统在运行期间同一查询的超时,它也没有一个异常/超时.

The most odd thing I've seen is I've taken a query from the web app and coded it up into a console app which has been running for 24 hours, calling the query once per second. It has not had a single exception / timeout even though I've seen the main system have timeouts for the same query during the time it's been running.

我最近将RDS服务器升级到了M5 Large,并且每天都在一夜之间重建所有索引.我已经运行DBCC FREEPROCCACHE,以确保没有造成问题的过时查询计划.

I have recently upgraded the RDS server to an M5 Large and all indexes are rebuilt overnight every day. I have run DBCC FREEPROCCACHE at some point to ensure there are no stale query plans causing the problem.

我觉得这是参数嗅探,或者我最后的想法是硬件/网络故障,但这确实吸引了稻草人!

I feel it's parameter sniffing or my last thought is hardware / network glitches but that really clutching at straws!

我得到的堆栈跟踪看起来像是中间查询,而不是在连接阶段.

The stack trace I get looks like it's mid-query and not during the connection phase.

at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error) at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer() at System.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Byte[] buff, Int32 offset, Int32 len, Int32& totalRead) at System.Data.SqlClient.TdsParserStateObject.TryReadString(Int32 length, String& value) at System.Data.SqlClient.TdsParser.TryReadSqlStringValue(SqlBuffer value, Byte type, Int32 length, Encoding encoding, Boolean isPlp, TdsParserStateObject stateObj) at System.Data.SqlClient.TdsParser.TryReadSqlValue(SqlBuffer value, SqlMetaDataPriv md, Int32 length, TdsParserStateObject stateObj, SqlCommandColumnEncryptionSetting columnEncryptionOverride, String columnName) at System.Data.SqlClient.SqlDataReader.TryReadColumnInternal(Int32 i, Boolean readHeaderOnly) at System.Data.SqlClient.SqlDataReader.TryReadColumn(Int32 i, Boolean setTimeout, Boolean allowPartiallyReadColumn) at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i) at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)

在某种程度上解决问题的任何技术方面的帮助将不胜感激,我担心它会突然变得更糟.

Any help with some techniques to get to the bottom of this would be much appreciated as it's unsettling and I fear it's suddenly going to get a lot worse.

谢谢

编辑1

我试图通过每10ms运行一次测试应用程序(如上)并同时在SSMS中运行缓慢的阻塞事务来在本地创建相同的问题.

I have tried to create the same problem locally by running the test app (as above) once every 10ms and running a slow blocking transaction in SSMS at the same time.

从应用查询

SELECT TOP 10 * FROM MyTable WHERE LastModifiedBy = 'Stu'

在SSMS中查询

BEGIN TRAN UPDATE TOP (10000) MyTable SET LastModifiedBy = 'Me' where LastModifiedBy = 'Me' WAITFOR DELAY '00:00:35' COMMIT

当出现此错误时,我会看到我通常希望在SQL Profiler中看到的内容,其中应用程序查询恰好花费30000ms,并且我在应用程序中遇到异常.但是,此操作的有用输出是堆栈跟踪与我在生产中(上图)看到的不同.

When this errors I see what I'd usually expect to see in SQL Profiler where the app query takes exactly 30000ms and I get an exception in the app. However, the useful output from this is the stack trace is different from the one I see in production (above).

at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData() at System.Data.SqlClient.SqlDataReader.get_MetaData() at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method) at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method) at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior) at System.Data.Common.DbCommand.System.Data.IDbCommand.ExecuteReader(CommandBehavior behavior)

我正在读取此堆栈跟踪,因为查询从未开始执行,因为它仍在尝试读取查询的元数据.但是,这与生产中的堆栈跟踪形成对比(在我看来),该跟踪似乎在从列读取数据的中间,但是执行中途超时.

I'm reading this stack trace as the query never started to execute since it's still trying to read meta-data for the query. However, this contrasts with the stack trace from production that (to my eyes) appears to be in the middle of reading data from columns but has a timeout mid execution.

我还一直在阅读有关.NET 4.6.2的信息,这是我们正在使用的版本.我将在今天晚上将所有内容升级到4.7.2,以排除这种情况.(连接到远程SQL Server将Web服务器升级到 Framework 4.6.1时出现故障)

I've also been reading about .NET 4.6.2 which is the version we're using. I'll upgrade everything to 4.7.2 this evening to rule that out. (Connection to remote SQL server breaks when upgrading web server to framework 4.6.1)

推荐答案

经过一周的紧张调查后,问题已解决!它已经运行了2多个小时,没有一个超时时间:-)

After a week of stressful investigation it's fixed!! It's been running now for over 2 hours without a single timeout :-)

原来是与.NET v4.6.2.的某种错误或不匹配.

Turned out to be some kind of bug or mismatch with .NET v4.6.2.

我的配置为:

  • AWS RDS上的SQL Server 2017 Web Edition
  • .NET v4.6.2
  • Dapper v1.50.5

我的更改是:

  • 在Web服务器上安装.NET 4.7.2
  • 升级Web App和Visual Studio中的所有DLL项目以使用.NET 4.7.2(确保将web.config更新为< httpRuntime targetFramework ="4.7.2"/> )
  • 通过Nuget将Dapper升级到最新的v.1.60.0(我认为Dapper没错,我只是在升级它的同时进行了其他所有操作,因为它与数据库有关)

这些问题帮助我指出了这个方向:

These questions helped point me in this direction:

  • SqlDataReader.GetValue挂起
  • ADO.Net SQLCommand.ExecuteReader()变慢或挂起
  • SqlDataReader挂在GetValue()方法和SNIReadSyncOverAsync

谢谢您的互联网-在您开始学习之前,我是如何编码的

THANK YOU INTERNET - HOW ON EARTH DID I CODE BEFORE YOU CAME ALONG

更多推荐

需要帮助从C#诊断SQL Server奇怪的查询超时

本文发布于:2023-11-07 22:17:32,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1567586.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:奇怪   SQL   Server

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!