Improving the performance of sorting files by extension

Updated: 2024-10-26 13:34:00

Problem description

Given an array of file names, the simplest way to sort it by file extension is like this:

Array.Sort(fileNames, (x, y) => Path.GetExtension(x).CompareTo(Path.GetExtension(y)));

The problem is that on a very long list (~800k entries) this takes a very long time to sort, while sorting by the whole file name is faster by a couple of seconds!

Theoretically, there is a way to optimize it: instead of using Path.GetExtension() and comparing the newly created extension-only strings, we can provide a Comparison that compares the existing file name strings starting from LastIndexOf('.'), without creating new strings.

Now, suppose I have found the LastIndexOf('.'). I want to reuse .NET's native StringComparer and apply it only to the part of the string after the LastIndexOf('.'), to preserve all culture considerations. I didn't find a way to do that.

Any ideas?

Edit:

Using tanascius's idea of the char.CompareTo() method, I came up with my Uber-Fast-File-Extension-Comparer, and sorting by extension is now 3x faster! It is even faster than all the methods that use Path.GetExtension() in some manner. What do you think?

Edit 2:

I found that this implementation does not consider culture, since the char.CompareTo() method does not consider culture, so this is not a perfect solution.

Any ideas?
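The culture issue from Edit 2 can be seen in a tiny sketch. The class and method names below are hypothetical, chosen only for illustration: char.CompareTo() compares UTF-16 code units, so an uppercase 'Z' sorts before a lowercase 'a', while a culture-aware comparison orders them linguistically.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class OrdinalVsCulture
{
    // Ordinal: compares the numeric values of the chars ('Z' is 90, 'a' is 97).
    public static int Ordinal(char x, char y)
    {
        return x.CompareTo(y);
    }

    // Culture-aware: compares the chars as one-character strings
    // using the linguistic rules of the given culture.
    public static int Cultural(char x, char y, CultureInfo culture)
    {
        return string.Compare(x.ToString(), y.ToString(), culture, CompareOptions.None);
    }
}
```

With the invariant culture, Ordinal('Z', 'a') is negative while Cultural('Z', 'a', CultureInfo.InvariantCulture) is positive, so the two approaches can produce different sort orders for mixed-case extensions.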

    public static int CompareExtensions(string filePath1, string filePath2)
    {
        // Nulls sort first.
        if (filePath1 == null && filePath2 == null) { return 0; }
        else if (filePath1 == null) { return -1; }
        else if (filePath2 == null) { return 1; }

        // Start just past the last dot; with no dot, the extension is empty.
        int i = filePath1.LastIndexOf('.');
        int j = filePath2.LastIndexOf('.');
        if (i == -1) { i = filePath1.Length; } else { i++; }
        if (j == -1) { j = filePath2.Length; } else { j++; }

        // Ordinal, character-by-character comparison (not culture-aware).
        for (; i < filePath1.Length && j < filePath2.Length; i++, j++)
        {
            int compareResults = filePath1[i].CompareTo(filePath2[j]);
            if (compareResults != 0) { return compareResults; }
        }

        // One extension is a prefix of the other: the shorter one sorts first.
        if (i >= filePath1.Length && j >= filePath2.Length) { return 0; }
        else if (i >= filePath1.Length) { return -1; }
        else { return 1; }
    }

Solution

You can write a comparer that compares each character of the extension. char has a CompareTo() method, too.

Basically, you loop until at least one string has no characters left or a CompareTo() call returns a value != 0.
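As a possible shortcut for the loop described above, the sketch below uses the string.CompareOrdinal overload that takes start indexes, so no substrings are allocated. The class and helper names are hypothetical, and the sketch assumes the inputs are bare file names (a dot in a directory name would be mistaken for the start of the extension). Like char.CompareTo(), this comparison is ordinal and ignores culture.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class OrdinalExtensionComparison
{
    // Index of the first extension character, or the string length
    // when the name contains no dot (empty extension).
    private static int ExtensionStart(string path)
    {
        int dot = path.LastIndexOf('.');
        return dot < 0 ? path.Length : dot + 1;
    }

    public static int Compare(string x, string y)
    {
        // Nulls sort first.
        if (x == null || y == null)
        {
            return x == null ? (y == null ? 0 : -1) : 1;
        }

        int i = ExtensionStart(x);
        int j = ExtensionStart(y);

        // Upper bound on the number of characters to compare:
        // the length of the longer extension.
        int length = Math.Max(x.Length - i, y.Length - j);
        return string.CompareOrdinal(x, i, y, j, length);
    }
}
```

It could be passed to the sort as a method group, for example Array.Sort(fileNames, OrdinalExtensionComparison.Compare). Whether it is actually faster than a hand-written loop would need to be measured on the real data.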

EDIT: In response to the edits of the OP

The performance of your comparer method can be significantly improved. See the following code. Additionally I added the line

string.Compare( filePath1[i].ToString(), filePath2[j].ToString(), m_CultureInfo, m_CompareOptions );

to enable the use of CultureInfo and CompareOptions. However, this slows everything down by about a factor of 2 compared to a version using a plain char.CompareTo(). But according to my own SO question, this seems to be the way to go.

    using System.Collections.Generic;
    using System.Globalization;

    public sealed class ExtensionComparer : IComparer<string>
    {
        private readonly CultureInfo m_CultureInfo;
        private readonly CompareOptions m_CompareOptions;

        public ExtensionComparer() : this( CultureInfo.CurrentUICulture, CompareOptions.None ) {}

        public ExtensionComparer( CultureInfo cultureInfo, CompareOptions compareOptions )
        {
            m_CultureInfo = cultureInfo;
            m_CompareOptions = compareOptions;
        }

        public int Compare( string filePath1, string filePath2 )
        {
            // Nulls sort first.
            if( filePath1 == null || filePath2 == null )
            {
                if( filePath1 != null ) { return 1; }
                if( filePath2 != null ) { return -1; }
                return 0;
            }

            // Index just past the last dot; 0 means the name has no dot.
            var i = filePath1.LastIndexOf( '.' ) + 1;
            var j = filePath2.LastIndexOf( '.' ) + 1;
            if( i == 0 || j == 0 )
            {
                if( i != 0 ) { return 1; }
                return j != 0 ? -1 : 0;
            }

            while( true )
            {
                // End of at least one extension: the shorter one sorts first.
                if( i == filePath1.Length || j == filePath2.Length )
                {
                    if( i != filePath1.Length ) { return 1; }
                    return j != filePath2.Length ? -1 : 0;
                }

                var compareResults = string.Compare( filePath1[i].ToString(), filePath2[j].ToString(), m_CultureInfo, m_CompareOptions );
                //var compareResults = filePath1[i].CompareTo( filePath2[j] );
                if( compareResults != 0 ) { return compareResults; }
                i++;
                j++;
            }
        }
    }

Usage:

fileNames1.Sort( new ExtensionComparer( CultureInfo.GetCultureInfo( "sv-SE" ), CompareOptions.StringSort ) );
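A possible variation, offered only as a sketch and not as part of the original answer: CompareInfo has a Compare overload that takes an offset into each string and compares the two tail sections, which would avoid the per-character ToString() calls while still honoring CultureInfo and CompareOptions. The class name OffsetExtensionComparer is hypothetical, and names without a dot are treated as having an empty extension, following the question's CompareExtensions method.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Hypothetical alternative comparer, for illustration only.
public sealed class OffsetExtensionComparer : IComparer<string>
{
    private readonly CompareInfo m_CompareInfo;
    private readonly CompareOptions m_CompareOptions;

    public OffsetExtensionComparer( CultureInfo cultureInfo, CompareOptions compareOptions )
    {
        m_CompareInfo = cultureInfo.CompareInfo;
        m_CompareOptions = compareOptions;
    }

    // Index of the first extension character, or the string length
    // when the name contains no dot (empty extension).
    private static int ExtensionStart( string path )
    {
        var dot = path.LastIndexOf( '.' );
        return dot < 0 ? path.Length : dot + 1;
    }

    public int Compare( string filePath1, string filePath2 )
    {
        // Nulls sort first.
        if( filePath1 == null || filePath2 == null )
        {
            return filePath1 == null ? ( filePath2 == null ? 0 : -1 ) : 1;
        }

        // Compare the tail of each string, starting at the given offsets,
        // without creating substrings.
        return m_CompareInfo.Compare(
            filePath1, ExtensionStart( filePath1 ),
            filePath2, ExtensionStart( filePath2 ),
            m_CompareOptions );
    }
}
```

Because the extensions are compared as whole sections rather than one character at a time, the result may differ from the per-character version for cultures where several characters collate as a unit. Any speed difference is untested here and would have to be measured.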

Published: 2023-11-23 09:02:14

Article link: https://www.elefans.com/category/jswz/34/1620860.html