Given an array of file names, the simplest way to sort it by file extension is this:

    Array.Sort(fileNames, (x, y) => Path.GetExtension(x).CompareTo(Path.GetExtension(y)));

The problem is that on a very long list (~800k names) this takes a very long time, while sorting by the whole file name is a couple of seconds faster!
In theory, there is a way to optimize it: instead of calling Path.GetExtension() and comparing the newly created extension-only strings, we can provide a Comparison that compares the existing file name strings starting from LastIndexOf('.'), without creating new strings.
Now, suppose I have found the LastIndexOf('.'). I want to reuse .NET's native StringComparer and apply it only to the part of the string after the LastIndexOf('.'), to preserve all culture considerations. I didn't find a way to do that.
Any ideas?
Edit:
Using tanascius's idea of the char.CompareTo() method, I came up with my Uber-Fast-File-Extension-Comparer, and sorting by extension is now 3x faster! It is even faster than all the methods that use Path.GetExtension() in some manner. What do you think?
Edit 2:
I found that this implementation does not take culture into account, since the char.CompareTo() method does not consider culture, so it is not a perfect solution.
Any ideas?
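To make the culture issue from Edit 2 concrete, here is a minimal sketch. The class and method names are hypothetical and only for illustration; it assumes the invariant culture, and other cultures may order some characters differently.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original question.
public static class OrdinalVersusCultureDemo
{
    // char.CompareTo compares the numeric UTF-16 values of the two chars.
    public static int Ordinal(char x, char y)
    {
        return x.CompareTo(y);
    }

    // string.Compare with a CultureInfo applies linguistic ordering rules.
    public static int Cultural(char x, char y)
    {
        return string.Compare(
            x.ToString(), y.ToString(),
            CultureInfo.InvariantCulture, CompareOptions.None);
    }
}
```

Ordinal('a', 'B') is positive because 'a' (97) has a larger code than 'B' (66), while Cultural('a', 'B') is negative because "a" sorts before "B" linguistically. An extension such as "bmp" can therefore land on different sides of "TXT" depending on which comparison is used.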
The comparer from the first edit:

    public static int CompareExtensions(string filePath1, string filePath2)
    {
        // Nulls sort before everything else.
        if (filePath1 == null && filePath2 == null)
        {
            return 0;
        }
        else if (filePath1 == null)
        {
            return -1;
        }
        else if (filePath2 == null)
        {
            return 1;
        }

        // Start right after the last '.'; without a '.' the extension is empty.
        int i = filePath1.LastIndexOf('.');
        int j = filePath2.LastIndexOf('.');

        if (i == -1)
        {
            i = filePath1.Length;
        }
        else
        {
            i++;
        }

        if (j == -1)
        {
            j = filePath2.Length;
        }
        else
        {
            j++;
        }

        // Compare the extensions char by char (ordinal, not culture-aware).
        for (; i < filePath1.Length && j < filePath2.Length; i++, j++)
        {
            int compareResults = filePath1[i].CompareTo(filePath2[j]);

            if (compareResults != 0)
            {
                return compareResults;
            }
        }

        // One extension is a prefix of the other: the shorter one sorts first.
        if (i >= filePath1.Length && j >= filePath2.Length)
        {
            return 0;
        }
        else if (i >= filePath1.Length)
        {
            return -1;
        }
        else
        {
            return 1;
        }
    }

Solution
You can write a comparer that compares the extensions character by character. char has a CompareTo() method, too.
Basically, you loop until at least one of the strings has no characters left or a CompareTo() call returns a value != 0.
EDIT: In response to the edits of the OP
The performance of your comparer method can be improved significantly; see the code below. Additionally, I added the line

    string.Compare( filePath1[i].ToString(), filePath2[j].ToString(), m_CultureInfo, m_CompareOptions );

to enable the use of CultureInfo and CompareOptions. However, this makes everything about a factor of 2 slower than a version using a plain char.CompareTo(). But according to my own SO question, this seems to be the way to go.
    using System.Collections.Generic;
    using System.Globalization;

    public sealed class ExtensionComparer : IComparer<string>
    {
        private readonly CultureInfo m_CultureInfo;
        private readonly CompareOptions m_CompareOptions;

        public ExtensionComparer() : this( CultureInfo.CurrentUICulture, CompareOptions.None ) {}

        public ExtensionComparer( CultureInfo cultureInfo, CompareOptions compareOptions )
        {
            m_CultureInfo = cultureInfo;
            m_CompareOptions = compareOptions;
        }

        public int Compare( string filePath1, string filePath2 )
        {
            // Nulls sort before non-null paths.
            if( filePath1 == null || filePath2 == null )
            {
                if( filePath1 != null )
                {
                    return 1;
                }
                if( filePath2 != null )
                {
                    return -1;
                }
                return 0;
            }

            // Index of the first extension char; 0 means there is no '.'.
            var i = filePath1.LastIndexOf( '.' ) + 1;
            var j = filePath2.LastIndexOf( '.' ) + 1;

            // Names without an extension sort before names with one.
            if( i == 0 || j == 0 )
            {
                if( i != 0 )
                {
                    return 1;
                }
                return j != 0 ? -1 : 0;
            }

            while( true )
            {
                // At least one extension is exhausted: the shorter one sorts first.
                if( i == filePath1.Length || j == filePath2.Length )
                {
                    if( i != filePath1.Length )
                    {
                        return 1;
                    }
                    return j != filePath2.Length ? -1 : 0;
                }

                // Culture-aware comparison of the current pair of chars.
                var compareResults = string.Compare( filePath1[i].ToString(), filePath2[j].ToString(), m_CultureInfo, m_CompareOptions );
                //var compareResults = filePath1[i].CompareTo( filePath2[j] );
                if( compareResults != 0 )
                {
                    return compareResults;
                }
                i++;
                j++;
            }
        }
    }

Usage:

    fileNames1.Sort( new ExtensionComparer( CultureInfo.GetCultureInfo( "sv-SE" ), CompareOptions.StringSort ) );
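A possible alternative to calling string.Compare on one-character strings is sketched below. It relies on the CompareInfo.Compare overload that takes an offset and a length for each string, so the extensions are compared in place under the chosen culture without allocating substrings. The class name SubstringExtensionComparer is hypothetical and not part of the original answer, the sketch has not been benchmarked against the versions above, and it assumes plain file names (a '.' inside a directory segment would be mistaken for the start of the extension).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical comparer: culture-aware, compares the extensions in place.
public sealed class SubstringExtensionComparer : IComparer<string>
{
    private readonly CompareInfo m_CompareInfo;
    private readonly CompareOptions m_CompareOptions;

    public SubstringExtensionComparer(CultureInfo cultureInfo, CompareOptions compareOptions)
    {
        m_CompareInfo = cultureInfo.CompareInfo;
        m_CompareOptions = compareOptions;
    }

    public int Compare(string filePath1, string filePath2)
    {
        // Nulls sort before non-null names.
        if (filePath1 == null || filePath2 == null)
        {
            if (filePath1 != null) return 1;
            return filePath2 != null ? -1 : 0;
        }

        // Start right after the last '.'; no '.' means an empty extension.
        int i = filePath1.LastIndexOf('.');
        int j = filePath2.LastIndexOf('.');
        i = i < 0 ? filePath1.Length : i + 1;
        j = j < 0 ? filePath2.Length : j + 1;

        int length1 = filePath1.Length - i;
        int length2 = filePath2.Length - j;

        // Handle empty extensions here so the offsets passed below
        // always point at an existing character.
        if (length1 == 0 || length2 == 0)
        {
            return length1.CompareTo(length2);
        }

        // Compare the two extension ranges without creating new strings.
        return m_CompareInfo.Compare(
            filePath1, i, length1,
            filePath2, j, length2,
            m_CompareOptions);
    }
}
```

It can be passed to a sort in the same way as the comparer above, for example:

    Array.Sort( fileNames, new SubstringExtensionComparer( CultureInfo.CurrentCulture, CompareOptions.None ) );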