Find duplicates in a DataGridView

Updated: 2024-10-24 14:15:35

I want to search a dgv for duplicates and collect the row numbers of the duplicates in a list (to show them to the user if necessary). This is my code:

    Function Check(ByVal dgv As DataGridView)
        Dim Duplicates As New List(Of Tuple(Of Integer, Integer))
        For i As Integer = 1 To dgv.RowCount
            For k As Integer = 1 To dgv.RowCount
                For j As Integer = 1 To dgv.ColumnCount
                    Dim l As Integer
                    If dgv.Rows(i).Cells(j).Value = dgv.Rows(k + 1).Cells(j).Value Then
                        l += l + 1
                        If l = dgv.ColumnCount Then
                            Duplicates.Add(Tuple.Create(i, k))
                        End If
                    End If
                Next
            Next
        Next
        Return Duplicates
    End Function

I actually have two questions:

Since I am a beginner, I would like to know whether this is the best way to search for duplicates.

I always get the error Operator '=' is not defined for type 'DBNull'. I understand the error but don't know how to handle it. I tried:

    Dim l As Integer
    'DbNull - Check
    Dim first As String
    If IsDBNull(dgv.Rows(i).Cells(j).Value) Then
        first = 0
    Else
        first = dgv.Rows(i).Cells(j).Value
    End If

I then compared first = second instead of dgv.Rows(i).Cells(j).Value = dgv.Rows(k + 1).Cells(j).Value. But now I have a type problem: the database column types are Date, varchar, integer and so on, which conflicts with Dim first As String. Does anyone know a way to get rid of the error?

Additional information: my dgv is bound to a DataTable that is connected to a SQL Server, and it has 6 visible and 4 invisible columns.

Best answer

You need to find duplicates in the underlying data table, not in the DataGridView itself.

Your current approach is inefficient, because for every row it loops over all other rows, which is O(N^2). It can be optimized to a single pass using a Dictionary(Of String, List(Of DataRow)).

Here is a code sample to illustrate the idea. Notice how the keyColumns hash set specifies which columns are used to decide whether a row is a duplicate (together they form what is also known as a unique key).

    Dim dt As New DataTable
    dt.Columns.Add("col1")
    dt.Columns.Add("col2")
    dt.Columns.Add("col3")
    dt.Rows.Add({"val1", "val2", "val3"})
    dt.Rows.Add({"val1", "val3", "val3"})
    dt.Rows.Add({"val1", "val3", "val4"})

    Dim dict As New Dictionary(Of String, List(Of DataRow))
    ' Columns that together form the unique key
    Dim keyColumns As New HashSet(Of String)({"col1", "col3"})

    For Each dr As DataRow In dt.Rows
        ' Build the key from the values of the key columns
        Dim sbKey As New System.Text.StringBuilder
        For Each col As DataColumn In dt.Columns
            Dim colName As String = col.ColumnName
            If Not keyColumns.Contains(colName) Then Continue For
            Dim colValue As String = dr.Field(Of String)(colName)
            sbKey.Append(colValue & "@")
        Next
        Dim key As String = sbKey.ToString

        ' Add the row to the group that shares its key
        Dim drList As List(Of DataRow) = Nothing
        If Not dict.TryGetValue(key, drList) Then
            drList = New List(Of DataRow)
            dict.Add(key, drList)
        End If
        drList.Add(dr)
    Next

In the end, dict holds all data rows, grouped by key. A key with a single row has no duplicates; a key with more than one row marks a set of duplicates. You can tweak it further to find only rows which have N duplicates (N>=1), so this, for example:

    Dim p = dict.Where(Function(x) x.Value.Count > 1)

will give you the subset of all data rows for which at least one duplicate was found, and it includes all conflicting rows (including the original).
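The sample above reads every key value with dr.Field(Of String), which fits its all-string demo table. The table in the question has Date, integer and varchar columns plus NULLs, where that call would be expected to fail with a cast error. The following is only a sketch of one way to adapt the same dictionary idea, not a definitive implementation: the name FindDuplicateRowIndices and the "N"/"V" markers are made up for illustration, and it assumes the key columns hold simple values whose string form identifies them (it would not compare binary columns by content).

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module DuplicateFinder

    ' Hypothetical helper (not from the original answer): groups row indices
    ' by the values in keyColumns and returns only groups with 2+ rows.
    Function FindDuplicateRowIndices(dt As DataTable,
                                     keyColumns As IEnumerable(Of String)) As List(Of List(Of Integer))
        Dim groups As New Dictionary(Of String, List(Of Integer))

        For i As Integer = 0 To dt.Rows.Count - 1
            Dim row As DataRow = dt.Rows(i)
            Dim parts As New List(Of String)

            For Each colName As String In keyColumns
                ' Read the cell as Object so the column type does not matter.
                Dim value As Object = row(colName)
                If value Is Nothing OrElse value Is DBNull.Value Then
                    ' Separate marker for NULL so it never equals a real value.
                    parts.Add("N")
                Else
                    ' Invariant culture keeps date and number formatting stable.
                    Dim text As String = Convert.ToString(
                        value, System.Globalization.CultureInfo.InvariantCulture)
                    If text Is Nothing Then text = ""
                    ' Length prefix keeps ("ab", "c") distinct from ("a", "bc").
                    parts.Add("V" & text.Length.ToString() & ":" & text)
                End If
            Next

            Dim key As String = String.Join("|", parts)
            Dim indices As List(Of Integer) = Nothing
            If Not groups.TryGetValue(key, indices) Then
                indices = New List(Of Integer)
                groups.Add(key, indices)
            End If
            indices.Add(i)
        Next

        ' Keep only the groups that actually contain duplicates.
        Return groups.Values.Where(Function(g) g.Count > 1).ToList()
    End Function

End Module
```

Called as FindDuplicateRowIndices(dt, {"col1", "col3"}), it returns one list of row indices per set of duplicates. Those indices refer to positions in the DataTable; they line up with the rows shown in the grid only as long as the grid is neither sorted nor filtered, which is an assumption to check before showing them to the user.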

Published: 2023-08-07 02:08:00
Link: https://www.elefans.com/category/jswz/34/1458809.html