Convert a string to Morse code

The challenge

The shortest code by character count, that will input a string using only alphabetical characters (upper and lower case), numbers, commas, periods and question mark, and returns a representation of the string in Morse code. The Morse code output should consist of a dash (-, ASCII 0x2D) for a long beep (AKA 'dah') and a dot (., ASCII 0x2E) for short beep (AKA 'dit').

Each letter should be separated by a space (' ', ASCII 0x20), and each word should be separated by a forward slash (/, ASCII 0x2F).

Morse code table:

(Image: Morse code table, liranuna/junk/morse.gif)

Test cases:

Input: Hello world
Output: .... . .-.. .-.. --- / .-- --- .-. .-.. -..

Input: Hello, Stackoverflow.
Output: .... . .-.. .-.. --- --..-- / ... - .- -.-. -.- --- ...- . .-. ..-. .-.. --- .-- .-.-.-

Code count includes input/output (that is, the full program).
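For comparison with the golfed answers, here is a deliberately un-golfed sketch of what the challenge asks for. The names `morse_for` and `to_morse` are made up for this illustration, it writes into a caller-supplied buffer instead of doing its own I/O, and it silently skips characters outside the required set, which the challenge does not specify.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the Morse code for one character, or NULL
 * if the character is outside the set the challenge requires. */
static const char *morse_for(int ch)
{
    static const char *letters[26] = {
        ".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....", "..",
        ".---", "-.-", ".-..", "--", "-.", "---", ".--.", "--.-", ".-.",
        "...", "-", "..-", "...-", ".--", "-..-", "-.--", "--.."
    };
    static const char *digits[10] = {
        "-----", ".----", "..---", "...--", "....-",
        ".....", "-....", "--...", "---..", "----."
    };
    ch = toupper((unsigned char)ch);
    if (ch >= 'A' && ch <= 'Z') return letters[ch - 'A'];
    if (ch >= '0' && ch <= '9') return digits[ch - '0'];
    if (ch == ',') return "--..--";
    if (ch == '.') return ".-.-.-";
    if (ch == '?') return "..--..";
    return NULL;
}

/* Hypothetical helper: writes the Morse form of `text` into `out`, which
 * holds `cap` bytes. Codes are separated by one space and a space in the
 * input becomes "/". Returns 0 on success, -1 if `out` is too small. */
int to_morse(const char *text, char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0) return -1;
    out[0] = '\0';
    for (; *text; text++) {
        const char *code = (*text == ' ') ? "/" : morse_for(*text);
        size_t need;
        if (code == NULL) continue;           /* skip unsupported input */
        need = strlen(code) + (used ? 1 : 0); /* code plus separator */
        if (used + need + 1 > cap) return -1;
        if (used) out[used++] = ' ';
        strcpy(out + used, code);
        used += strlen(code);
    }
    return 0;
}
```

Calling `to_morse("Hello world", buf, sizeof buf)` should leave the first test case's output in `buf`. A per-character string table like this is easy to read but far too long for golf, which is why the answers pack each code into a single byte.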

Solution

C (131 characters)

Yes, 131!

main(c){for(;c=c?c:(c=toupper(getch())-32)?
"•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&@/4)'18=,*%.:0;?5"
[c-12]-34:-3;c/=2)putch(c/2?46-c%2:0);}

I eked out a few more characters by combining the logic from the while and for loops into a single for loop, and by moving the declaration of the c variable into the main definition as an input parameter. I borrowed this latter technique from strager's answer to another challenge.
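The single-loop idea is easier to follow when spelled out. The sketch below is not the author's code: `morse_single_loop` and `pending` are names invented here, it reads from a string rather than the keyboard, it uses the ASCII-escaped form of the table, and it adds a range check that the golfed version lacks. The variable `pending` plays the role of `c`: while it is nonzero the loop keeps emitting symbols, and once it reaches zero the loop fetches the next input character.

```c
#include <ctype.h>
#include <stddef.h>

/* Encoded table from the answer, covering ',' (44) through 'Z' (90).
 * The non-ASCII entries are written as escapes, and the array is
 * unsigned so that values above 127 stay positive. */
static const unsigned char table[] =
    "\x95#\x8CKa`^ZRBCEIQiw#S#nx(37+$6-2&@/4)'18=,*%.:0;?5";

/* Hypothetical helper: translates `in` into `out` with one loop.
 * `out` must be large enough; returns the number of bytes written. */
size_t morse_single_loop(const char *in, char *out)
{
    size_t n = 0;
    int pending = 0;
    for (;;) {
        if (pending == 0) {                  /* fetch the next character */
            int ch = toupper((unsigned char)*in);
            if (ch == 0) break;
            in++;
            if (ch == ' ')
                pending = -3;                /* emits '/' and then ' ' */
            else if (ch < ',' || ch > 'Z')
                pending = 1;                 /* guard bit only: a space */
            else
                pending = table[ch - ','] - 34;
        }
        /* Same expression as the golfed putch() argument, with a real
         * space (32) instead of 0 at the guard bit. */
        out[n++] = (pending / 2) ? (char)(46 - pending % 2) : ' ';
        pending /= 2;
    }
    out[n] = '\0';
    return n;
}
```

Like the golfed program, this emits a space after every code, so `"SOS"` becomes `"... --- ... "` with a trailing space.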

For those trying to verify the program with GCC or with ASCII-only editors, you may need the following, slightly longer version:

main(c){for(;c=c?c:(c=toupper(getchar())-32)?c<0?1:
"\x95#\x8CKa`^ZRBCEIQiw#S#nx(37+$6-2&@/4)'18=,*%.:0;?5"
[c-12]-34:-3;c/=2)putchar(c/2?46-c%2:32);}

This version is 17 characters longer (weighing in at a comparatively huge 148), due to the following changes:

+4: getchar() and putchar() instead of the non-portable getch() and putch()
+6: escape codes for two of the characters instead of non-ASCII characters
+1: 32 instead of 0 for space character
+6: added "c<0?1:" to suppress garbage from characters less than ASCII 32 (namely, from '\n'). You'll still get garbage from any of !"#$%&'()*+[\]^_`{|}~, or anything above ASCII 126.

This should make the code completely portable. Compile with:

gcc -std=c89 -funsigned-char morse.c

The -std=c89 is optional. The -funsigned-char is necessary, though, or you will get garbage for comma and full stop.
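To see why the flag matters, consider the table byte for the comma, 0x95. The two functions below are hypothetical illustrations, not part of the answer. They assume a typical two's-complement platform, where converting 0x95 to `signed char` gives a negative number.

```c
/* Offset used by the encoded table in the answer. */
#define OFFSET 34

/* Decodes a table byte as the golfed code would if plain char were
 * signed: a byte such as 0x95 is read as a negative number. */
int decode_as_signed(unsigned char raw)
{
    return (signed char)raw - OFFSET;
}

/* The same decode with the byte treated as unsigned, which is what
 * -funsigned-char arranges for plain char. */
int decode_as_unsigned(unsigned char raw)
{
    return raw - OFFSET;
}
```

With the unsigned reading, 0x95 decodes to 115, the guard-bit encoding of "--..--". With the signed reading the result is negative, so the divide-by-two loop prints slashes and dots until the value truncates to zero.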

135 characters

c;main(){while(c=toupper(getch()))for(c=c-32?
"•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&@/4)'18=,*%.:0;?5"
[c-44]-34:-3;c;c/=2)putch(c/2?46-c%2:0);}

In my opinion, this latest version is much more visually appealing, too. And no, it's not portable, and it's no longer protected against out-of-bounds input. It also has a pretty bad UI, taking character-by-character input and converting it to Morse Code and having no exit condition (you have to hit Ctrl+Break). But portable, robust code with a nice UI wasn't a requirement.

A brief-as-possible explanation of the code follows:

main(c){
  while(c = toupper(getch())) /* well, *sort of* an exit condition */
    for(c =
        c - 32 ? // effectively: "if not space character"
          "•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&@/4)'18=,*%.:0;?5"[c - 44] - 34
          /* This array contains a binary representation of the Morse Code
           * for all characters between comma (ASCII 44) and capital Z.
           * The values are offset by 34 to make them all representable
           * without escape codes (as long as chars > 127 are allowed).
           * See explanation after code for encoding format.
           */
        : -3; /* if input char is space, c = -3
               * this is chosen because -3 % 2 = -1 (and 46 - -1 = 47)
               * and -3 / 2 / 2 = 0 (with integer truncation)
               */
      c; /* continue loop while c != 0 */
      c /= 2) /* shift down to the next bit */
    putch(c / 2 ? /* this will be 0 if we're down to our guard bit */
        46 - c % 2 /* We'll end up with 45 (-), 46 (.), or 47 (/).
                    * It's very convenient that the three characters
                    * we need for this exercise are all consecutive.
                    */
      : 0 /* we're at the guard bit, output blank space */
    );
}

Each character in the long string in the code contains the encoded Morse Code for one text character. Each bit of the encoded character represents either a dash or a dot. A one represents a dash, and a zero represents a dot. The least significant bit represents the first dash or dot in the Morse Code. A final "guard" bit determines the length of the code. That is, the highest one bit in each encoded character represents end-of-code and is not printed. Without this guard bit, characters with trailing dots couldn't be printed correctly.
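The decoding side of this format can be sketched as a small standalone function. `decode_guarded` is a name invented for this illustration. It takes the value with the offset already removed and writes into a caller-supplied buffer instead of printing.

```c
#include <stddef.h>

/* Decodes one encoded value (offset already removed) into dots and
 * dashes. Bits are consumed from the least significant end, and the loop
 * stops when only the guard bit (value 1) is left. Returns the number of
 * symbols written to `out`, which must have room for them plus a NUL. */
size_t decode_guarded(unsigned value, char *out)
{
    size_t n = 0;
    while (value > 1) {
        out[n++] = (value & 1) ? '-' : '.';
        value >>= 1;            /* one operation per symbol */
    }
    out[n] = '\0';
    return n;
}
```

For example, the value 8 (binary 1000) decodes to "...". Without the guard bit, those three dots would be stored as plain zero and their count would be lost.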

For instance, the letter 'L' is ".-.." in Morse Code. To represent this in binary, we need a 0, a 1, and two more 0s, starting with the least significant bit: 0010. Tack one more 1 on for a guard bit, and we have our encoded Morse Code: 10010, or decimal 18. Add the +34 offset to get 52, which is the ASCII value of the character '4'. So the encoded character array has a '4' as the 33rd character (index 32).
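The worked example can be checked mechanically. The sketch below shows one way the table bytes could be generated. `encode_guarded` and `table_byte` are hypothetical names, since the answer does not show how its table was produced.

```c
/* Packs a Morse string such as ".-.." into the guard-bit format:
 * first symbol in bit 0, dash = 1, dot = 0, and a final 1 just above the
 * last symbol. Returns the value before the +34 offset is applied. */
unsigned encode_guarded(const char *morse)
{
    unsigned value = 0;
    unsigned bit = 0;
    for (; morse[bit] != '\0'; bit++) {
        if (morse[bit] == '-')
            value |= 1u << bit;
    }
    return value | (1u << bit);     /* guard bit */
}

/* Applies the offset the answer uses to keep the table printable. */
unsigned char table_byte(const char *morse)
{
    return (unsigned char)(encode_guarded(morse) + 34);
}
```

Here `encode_guarded(".-..")` gives 18 and `table_byte(".-..")` gives '4', matching the arithmetic above. The comma's code "--..--" comes out as 0x95, which is why that entry needs an 8-bit character or an escape.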

This technique is similar to the encoding used in ACoolie's, strager's (2), Miles's, pingw33n's, Alec's, and Andrea's solutions, but is slightly simpler, requiring only one operation per bit (shifting/dividing), rather than two (shifting/dividing and decrementing).

EDIT: Reading through the rest of the implementations, I see that Alec and Anon came up with this encoding scheme—using the guard bit—before I did. Anon's solution is particularly interesting, using Python's bin function and stripping off the "0b" prefix and the guard bit with [3:], rather than looping, anding, and shifting, as Alec and I did.

As a bonus, this version also handles hyphen (-....-), slash (-..-.), colon (---...), semicolon (-.-.-.), equals (-...-), and at sign (.--.-.). As long as 8-bit characters are allowed, these characters require no extra code bytes to support. No more characters can be supported with this version without adding length to the code (unless there are Morse codes for the greater-than and less-than signs).

Because I find the old implementations still interesting, and the text has some caveats applicable to this version, I've left the previous content of this post below.

Okay, presumably, the user interface can suck, right? So, borrowing from strager, I've replaced gets(), which provides buffered, echoed line input, with getch(), which provides unbuffered, unechoed character input. This means that every character you type gets translated immediately into Morse Code on the screen. Maybe that's cool. It no longer works with either stdin or a command-line argument, but it's pretty damn small.

I've kept the old code below, though, for reference. Here's the new.

New code, with bounds checking, 171 characters:

W(i){i?W(--i/2),putch(46-i%2):0;}c;main(){while(c=toupper(getch())-13)
c=c-19?c>77|c<31?0:W("œ*~*hXPLJIYaeg*****u*.AC5+;79-@6=0/8?F31,2:4BDE"
[c-31]-42):putch(47),putch(0);}

Enter breaks the loop and exits the program.

New code, without bounds checking, 159 characters:

W(i){i?W(--i/2),putch(46-i%2):0;}c;main(){while(c=toupper(getch())-13)
c=c-19?W("œ*~*hXPLJIYaeg*****u*.AC5+;79-@6=0/8?F31,2:4BDE"[c-31]-42):
putch(47),putch(0);}

Below follows the old 196/177 code, with some explanation:

W(i){i?W(--i/2),putch(46-i%2):0;}main(){char*p,c,s[99];gets(s);
for(p=s;*p;)c=*p++,c=toupper(c),c=c-32?c>90|c<44?0:W(
"œ*~*hXPLJIYaeg*****u*.AC5+;79-@6=0/8?F31,2:4BDE"[c-44]-42):
putch(47),putch(0);}

This is based on Andrea's Python answer, using the same technique for generating the Morse code as in that answer. But instead of storing the encodable characters one after another and finding their indexes, I stored the indexes one after another and looked them up by character (similarly to my earlier answer). This prevents the long gaps near the end that caused problems for earlier implementors.
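One way to read W() is as a walk through a binary tree of Morse codes numbered level by level: the empty code is 0, a dot leads to child 2n+1, and a dash to child 2n+2. The sketch below spells that out. The function names are invented here, and the tree description is inferred from the W() code above rather than stated by the author.

```c
#include <stddef.h>

/* Maps a Morse string to its index in the level-order numbering:
 * start at 0, then a dot moves to 2n+1 and a dash to 2n+2. */
unsigned morse_to_index(const char *morse)
{
    unsigned n = 0;
    for (; *morse; morse++)
        n = 2 * n + (*morse == '-' ? 2 : 1);
    return n;
}

/* Ungolfed counterpart of W(): recurse toward the root first so the
 * symbols come out in transmission order. Appends to `out` at `*len`
 * and keeps the buffer NUL-terminated. */
void index_to_morse(unsigned n, char *out, size_t *len)
{
    if (n == 0)
        return;
    n--;                                    /* the --i in W() */
    index_to_morse(n / 2, out, len);
    out[(*len)++] = (n % 2) ? '-' : '.';    /* the 46-i%2 in W() */
    out[*len] = '\0';
}
```

With this numbering, "." is 1, "-" is 2 and ".-" is 4. The comma's "--..--" is 114, and adding the offset of 42 gives 156, or 0x9C. Each symbol costs a decrement plus a divide, which is the two-operations-per-bit cost mentioned earlier.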

As before, I've used a character that's greater than 127. Converting it to ASCII-only adds 3 characters. The first character of the long string must be replaced with \x9C. The offset is necessary this time, otherwise a large number of characters are under 32, and must be represented with escape codes.

Also as before, processing a command-line argument instead of stdin adds 2 characters, and using a real space character between codes adds 1 character.

On the other hand, some of the other routines here don't deal with input outside the accepted range of [ ,.0-9\?A-Za-z]. If such handling were removed from this routine, then 19 characters could be removed, bringing the total down as low as 177 characters. But if this is done, and invalid input is fed to this program, it may crash and burn.

The code in this case could be:

W(i){i?W(--i/2),putch(46-i%2):0;}main(){char*p,s[99];gets(s);
for(p=s;*p;p++)*p=*p-32?W(
"œ*~*hXPLJIYaeg*****u*.AC5+;79-@6=0/8?F31,2:4BDE"
[toupper(*p)-44]-42):putch(47),putch(0);}
