Could you please tell me the difference between Apache Spark and Akka? I know that both frameworks are meant for programming distributed and parallel computations, yet I don't see the link or the difference between them.
Moreover, I would like to know the use cases suitable for each of them.
Recommended answer
Apache Spark was actually built on Akka originally: up to version 1.5 it used Akka for communication between nodes.
Akka is a general-purpose framework for building reactive, distributed, parallel, and resilient concurrent applications in Scala or Java. It uses the Actor model to hide all the thread-related code and gives you simple, helpful interfaces for implementing a scalable and fault-tolerant system. A good use case for Akka is a real-time application that consumes and processes data coming from mobile phones and sends it to some kind of storage.
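To make the Actor model idea a bit more concrete, here is a minimal sketch in plain Python. It is not Akka code and does not use Akka's API; `CounterActor`, `tell`, `stop`, and `simulate_phones` are hypothetical names chosen for illustration. The point it tries to show is that callers only send messages to a mailbox, while a single thread owned by the actor updates the private state, so the calling code needs no locks.

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state, a mailbox, and one thread that drains it."""

    _STOP = object()  # sentinel message that tells the actor to shut down

    def __init__(self):
        self._mailbox = queue.Queue()
        self._events_seen = 0  # private state, touched only by the actor thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send; callers never touch the actor's state."""
        self._mailbox.put(message)

    def stop(self):
        """Let the actor finish its queued messages, then return its count."""
        self._mailbox.put(self._STOP)
        self._thread.join()
        return self._events_seen

    def _run(self):
        # Messages are handled one at a time, so no locking is needed here.
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._events_seen += 1  # a real system might write to storage


def simulate_phones(actor, n_phones, events_per_phone):
    """Several 'phone' threads send events to the same actor concurrently."""
    def phone(phone_id):
        for i in range(events_per_phone):
            actor.tell(f"phone-{phone_id}:event-{i}")

    senders = [threading.Thread(target=phone, args=(p,)) for p in range(n_phones)]
    for sender in senders:
        sender.start()
    for sender in senders:
        sender.join()
```

Calling `simulate_phones(actor, 3, 10)` followed by `actor.stop()` should report every event that was sent, even though the senders ran concurrently. Akka adds much more on top of this idea (supervision, remoting, clustering), none of which is modelled here.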
Apache Spark (not Spark Streaming) is a framework for processing batch data using a generalized version of the map-reduce algorithm. A good use case for Apache Spark is computing metrics over stored data to gain better insight into it. The data is loaded and processed on demand.
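As a rough illustration of the map-reduce style of batch processing, the following plain-Python sketch computes one metric over stored records. It does not use Spark's API; the sample data, the `country` field, and the function names are assumptions made up for the example. Each partition is summarised independently (the map step) and the partial results are then merged (the reduce step), which is the shape of computation a framework like Spark distributes across machines.

```python
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map step: summarise one partition independently of the others."""
    return Counter(record["country"] for record in records)


def merge(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def count_by_country(partitions):
    """Count records per country across all partitions."""
    partials = map(map_partition, partitions)  # each could run on its own node
    return reduce(merge, partials, Counter())


# Hypothetical stored data, already split into two partitions.
stored_data = [
    [{"user": "a", "country": "DE"}, {"user": "b", "country": "FR"}],
    [{"user": "c", "country": "DE"}],
]

if __name__ == "__main__":
    print(count_by_country(stored_data))
```

Because `merge` is associative, the partial results can be combined in any grouping, which is what lets the work be split across nodes in the first place.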
Apache Spark Streaming can perform similar actions and functions on small, near-real-time batches of data, the same way you would if the data were already stored.
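The micro-batch idea can be sketched in plain Python as well. This is not Spark Streaming code: the helper names are hypothetical, and the batches here are cut by a fixed number of events purely to keep the example deterministic, whereas Spark Streaming groups incoming data by time interval. What it tries to show is that the same function written for stored data (`average`) is simply applied to each small batch as it arrives.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Group an iterable of incoming events into small lists."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def average(values):
    """The same function you would run on a stored, complete dataset."""
    return sum(values) / len(values)


def streaming_averages(stream, batch_size):
    """Apply the batch function to every micro-batch of the stream."""
    return [average(batch) for batch in micro_batches(stream, batch_size)]
```

For example, `streaming_averages([1, 2, 3, 4, 5, 6, 7], 3)` processes the batches `[1, 2, 3]`, `[4, 5, 6]`, and `[7]` one after another.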
Update, April 2016
As of Apache Spark 1.6.0, Apache Spark no longer relies on Akka for communication between nodes. Thanks to @EugeneMi for the comment.