大型数据集的唯一计数公式

编程入门 行业动态 更新时间:2024-10-24 13:23:07
本文介绍了大型数据集的唯一计数公式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我在确定将 1 或 0 输入相邻单元格以指示在处理大型数据集时值是否唯一的方法时遇到了麻烦.我已经阅读了多种方法来实现此目的,但是对于我来说,这些方法似乎都不有效:我使用的是Excel 2010实例(因此,我没有拥有不同的计数数据透视表中的功能,当我尝试使用PowerPivot时,由于处理限制,它会使我的文件崩溃.

I am having trouble determining a way to enter a 1 or 0 into an adjacent cell to indicate whether or not a value is unique when working with a large dataset. I have read of multiple methods for accomplishing this, however none of them seem efficient for my purposes: I am using an instance of Excel 2010 (so I do not have the Distinct Count feature in PivotTables, and when I try to use PowerPivot it crashes my file due to processing limitations.

在此StackOverflow问题中:要计算唯一计数的简单数据透视表值有一些建议使用 SUMPRODUCT 或 COUNTIF ,但是当按原样使用50,000+行时,这将导致糟糕的性能,文件大小约为35MB,而不是〜3 MB.我想知道对于大型动态数据集,不管是公式还是VBA,是否有更好的解决方案.

In this StackOverflow question: Simple Pivot Table to Count Unique Values there are suggestions to use SUMPRODUCT or COUNTIF, but when working with 50,000+ rows as I am, this causes terrible performance and a file size of ~35 MB instead of ~3 MB. I wanted to know if there is a better solution for a large, dynamic dataset whether it is a formula or VBA.

我要完成的一个示例是(以 Unique 列为相邻单元格):

An example of what I would like to accomplish is (with the Unique column being the adjacent cell):

Name Week Unique John 1 1 Sally 1 1 John 1 0 Sally 2 1

我试图编写 COUNTIF 的相同功能,但没有成功:

I attempted to script the same functionality of COUNTIF but with no success:

For Each Cell In ThisWorkbook.Worksheets("Overtime & Type Data").Range("Z2:Z" & DataLastRow) If Worksheets("Overtime & Type Data").Cells(Cell.Row, 26) <> Worksheets("Overtime & Type Data").Cells(Cell.Row - 1, 26) Then FirstCell = Cell.Row End If If (Worksheets("Overtime & Type Data").Range(Cells(FirstCell, 26), Cells(Cell.Row, 26)) = Worksheets("Overtime & Type Data").Range(Cells(Cell.Row, 26))) = True Then Cell.Value = 1 Else Cell.Value = 0 End If Next Cell

推荐答案

此代码在不到3秒的时间内成功运行了130,000多行.调整列字母以适合您的数据集.

This code ran on over 130,000 rows successfully in less than 3 seconds. Adjust the column letters to fit your dataset.

Sub tgr() Const colName As String = "A" Const colWeek As String = "B" Const colOutput As String = "C" Dim ws As Worksheet Dim rngData As Range Dim DataCell As Range Dim rngFound As Range Dim collUniques As Collection Dim arrResults() As Long Dim ResultIndex As Long Dim UnqCount As Long Set ws = ThisWorkbook.Sheets("Overtime & Type Data") Set rngData = ws.Range(colName & 2, ws.Cells(Rows.Count, colName).End(xlUp)) Set collUniques = New Collection ReDim arrResults(1 To rngData.Cells.Count, 1 To 1) On Error Resume Next For Each DataCell In rngData.Cells ResultIndex = ResultIndex + 1 collUniques.Add ws.Cells(DataCell.Row, colName).Value & ws.Cells(DataCell.Row, colWeek).Value, ws.Cells(DataCell.Row, colName).Value & ws.Cells(DataCell.Row, colWeek).Value If collUniques.Count > UnqCount Then UnqCount = collUniques.Count arrResults(ResultIndex, 1) = 1 Else arrResults(ResultIndex, 1) = 0 End If Next DataCell On Error GoTo 0 ws.Cells(rngData.Row, colOutput).Resize(rngData.Cells.Count).Value = arrResults End Sub

更多推荐

大型数据集的唯一计数公式

本文发布于:2023-10-30 19:07:21,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1543731.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:公式   数据

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!