克尼汉 &函数式语言中的 Ritchie 字数统计示例程序

编程入门行业动态更新时间:2024-10-18 19:27:58

本文介绍了克尼汉 &函数式语言中的 Ritchie 字数统计示例程序的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我最近在网上阅读了一些关于函数式编程的文章，我想我对它背后的概念有了基本的了解.

I have been reading a little bit about functional programming on the web lately and I think I got a basic idea about the concepts behind it.

我很好奇涉及某种状态的日常编程问题是如何用纯函数式编程语言解决的.

I'm curious how everyday programming problems which involve some kind of state are solved in a pure functional programing language.

例如:《C 编程语言》一书中的字数统计程序如何用纯函数式语言实现?

For example: how would the word count program from the book 'The C programming Language' be implemented in a pure functional language?

欢迎任何贡献，只要解决方案是纯函数式的.

Any contributions are welcome as long as the solution is in a pure functional style.

这是书中的字数统计 C 代码:

Here's the word count C code from the book:

#include <stdio.h>

#define IN  1 /* inside a word */
#define OUT 0 /* outside a word */

/* count lines, words, and characters in input */
main()
{
  int c, nl, nw, nc, state;

  state = OUT;
  nl = nw = nc = 0;
  while ((c = getchar()) != EOF) {
    ++nc;
    if (c == '
')
      ++nl;
    if (c == ' ' || c == '
' || c = '	')
      state = OUT;
    else if (state == OUT) {
      state = IN;
      ++nw;
    }
  }

  printf("%d %d %d
", nl, nw, nc);
}

推荐答案

基本上，在函数式风格中，您需要将获取数据流的 IO 操作与基于当前的一些有状态转换的纯操作分开角色和当前状态.

Basically, in a functional styly you'll want to divide the IO operation of getting your stream of data from the pure operation of some stateful transistion based on the current character and the current state.

来自 Tikhon 的 Haskell 解决方案非常干净，但对您的输入数据执行三遍，并将导致整个输入包含在内存中，直到计算出结果.您可以以增量方式处理数据，我在下面使用 Text 包执行此操作，但没有使用其他高级 Haskell 工具(可能会以非 Haskell 人员的可理解性为代价进行清理).

The Haskell solution from Tikhon is very clean but performs three passes on your input data and will result in the entire input being contained in memory until the result is computed. You can process data incrementally, which I do below using the Text package but no other advanced Haskell tools (which could clean this up at the expense of understandability by non-Haskellers).

首先我们有我们的序言:

First we have our preamble:

{-# LANGUAGE BangPatterns #-}

import Data.Text.Lazy as T
import Data.Text.Lazy.IO as TIO

然后我们定义我们的数据结构来保存进程的状态(字符数、单词数和行数以及状态输入/输出):

Then we define our data structure to hold the state of the process (number of characters, words, and lines along with the State IN/OUT):

data Counts = Cnt { nc, nl, nw :: !Int
                  , state :: State  }
        deriving (Eq, Ord, Show)

data State = IN | OUT
        deriving (Eq, Ord, Show)

现在我定义一个零"状态只是为了方便使用.我通常会创建一些辅助函数或使用像 lense 这样的包来使 Counts 结构中的每个字段的递增变得简单，但对于这个答案将不使用:

Now I define a "zero" state just for easy use. I'd normally make a number of helper functions or use a package like lense to make incrementing each field in the Counts structure simple, but will go without for this answer:

zeros :: Counts
zeros = Cnt 0 0 0 OUT

现在我将你的一系列 if/else 语句翻译成一个纯粹的状态机:

And now I translate your series of if/else statements into a pure state machine:

op :: Counts -> Char -> Counts
op c '
' = c { nc = nc c + 1, nw = nw c + 1, nl = nl c + 1, state=OUT }
op c ch | ch == ' ' || ch == '	' = c { nc = nc c + 1, state=OUT }
        | state c == OUT = c { nc = nc c + 1, nw = nw c + 1, state = IN }
        | otherwise  = c { nc = nc c + 1 }

最后，main 函数只是获取输入流并将我们的操作折叠到字符上:

Finally the main function just gets the input stream and folds our operation over the characters:

main = do
        contents <- TIO.getContents
        print $ T.foldl' op zeros contents

您提到不了解语法.这是我将解释的更简单的版本:

You mentioned not understanding the syntax. Here is an even simpler version that I will explain:

import Data.Text.Lazy as T
import Data.Text.Lazy.IO as TIO

op (nc, nw, nl, st) ch
  | ch == '
'              = (nc + 1, nw + 1 , nl + 1, True)
  | ch == ' ' || ch == '	' = (nc + 1, nw     , nl    , True)
  | st                      = (nc + 1, nw + 1 , nl    , False)
  | otherwise               = (nc + 1, nw     , nl    , st)

main = do
        contents <- TIO.getContents
        print $ T.foldl' op (0,0,0,True) contents

import 语句让我们可以访问我们使用的 getContents 和 foldl' 函数.

The import statements give us access to the getContents and foldl' functions we use.

op 函数使用了一堆保护——像 | 这样的部分ch = ' ' - 这基本上就像一个 C if/elseif/else 系列.

The op function uses a bunch of guards - parts like | ch = ' ' - which is basically like a C if/elseif/else series.

元组 ( ... , ... , ... , ... ) 包含我们所有的状态.Haskell 变量是不可变的，因此我们通过向先前变量的值添加一个(或不添加)来创建新元组.

The tuples ( ... , ... , ... , ... ) contain all our state. Haskell variables are immutable, so we make new tuples by adding one (or not) to the values of the previous variables.

这篇关于克尼汉 &函数式语言中的 Ritchie 字数统计示例程序的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

更多推荐

[db:关键词]

本文发布于:2023-03-27 08:21:11，感谢您对本站的认可！

本文链接:https://www.elefans.com/category/jswz/34/686656.html