Getting a steady stream of messages from Twitter

Updated: 2024-10-27 08:27:03

Problem description

I'd like to try to make a simple Twitter client that learns my tastes and automatically finds friends and interesting tweets to provide me with relevant information.

To get started, I need a good stream of random Twitter messages so I can test a few machine learning algorithms on them.

What API methods should I use for this? Do I have to poll regularly to get messages, or is there a way to get Twitter to push messages as they are published?

I'd also be interested in learning about any similar projects.

Recommended answer

I use tweepy to access the Twitter API and listen to the public stream they provide, which should be a one-percent sample of all tweets. Here is the sample code that I use myself. You can still use the basic auth mechanism for streaming, though they may change that soon. Set the USERNAME and PASSWORD variables accordingly, and make sure you respect the error codes that Twitter returns (this sample code might not respect the exponential backoff mechanism that Twitter wants in some cases).

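    # Uses the tweepy API as it existed when this answer was written
    # (BasicAuthHandler, StreamListener).
    import sys
    import time

    import tweepy

    # Placeholders: replace with your own credentials.
    USERNAME = 'your_username'
    PASSWORD = 'your_password'


    def log_error(msg):
        # Write a timestamped message to stderr.
        timestamp = time.strftime('%Y%m%d:%H%M:%S')
        sys.stderr.write("%s: %s\n" % (timestamp, msg))


    class StreamWatcherListener(tweepy.StreamListener):
        def on_status(self, status):
            # Called once for every tweet delivered by the stream.
            print(status.text.encode('utf-8'))

        def on_error(self, status_code):
            log_error("Status code: %s." % status_code)
            time.sleep(3)
            return True  # keep stream alive

        def on_timeout(self):
            log_error("Timeout.")


    def main():
        auth = tweepy.BasicAuthHandler(USERNAME, PASSWORD)
        listener = StreamWatcherListener()
        stream = tweepy.Stream(auth, listener)
        stream.sample()  # blocks while reading the public sample stream


    if __name__ == '__main__':
        # Reconnect after unexpected errors; Ctrl-C leaves the loop.
        while True:
            try:
                main()
            except KeyboardInterrupt:
                break
            except Exception as e:
                log_error("Exception: %s" % str(e))
                time.sleep(3)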
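The answer notes that its sample code may not honor the exponential backoff Twitter expects. As a rough illustration of that idea only, here is a minimal sketch in which the wait doubles after each failed attempt, up to a cap. The names `backoff_delays` and `run_with_backoff` are hypothetical, and the 1-second start, factor of 2, and 60-second cap are assumed values, not figures taken from Twitter's documentation.

```python
import time


def backoff_delays(initial=1.0, factor=2.0, maximum=60.0):
    """Yield an endless series of delays: initial, initial*factor, ... capped at maximum.

    The default numbers are illustrative assumptions, not Twitter's official values.
    """
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def run_with_backoff(connect, max_attempts=5, sleep=time.sleep):
    """Call connect() until it returns normally, waiting longer after each failure.

    Returns the list of delays that were slept. Re-raises the last error
    if every attempt fails. KeyboardInterrupt is not caught, so Ctrl-C still works.
    """
    slept = []
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return slept
        except Exception:
            if attempt == max_attempts:
                raise
            delay = next(delays)
            sleep(delay)
            slept.append(delay)
    return slept
```

In a script like the one above, `connect` could be the `main` function, so that repeated failures are retried at growing intervals instead of every 3 seconds. The `sleep` parameter exists only so the waiting can be replaced in tests.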

I also set a default timeout on the socket module. I believe I had some problems with Python's default timeout behavior, so be careful.

    import socket

    socket.setdefaulttimeout(timeout)  # timeout: a number of seconds that you choose
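To make the effect of that call concrete, the following small sketch shows that sockets created after `socket.setdefaulttimeout` inherit the new value. The 60-second figure is an assumption chosen for illustration; the answer does not say which timeout it used.

```python
import socket

# Remember the current default (normally None, meaning "block indefinitely").
previous = socket.getdefaulttimeout()

# 60 seconds is an assumed, illustrative value.
socket.setdefaulttimeout(60)

# Sockets created from now on start with the default timeout.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
inherited = sock.gettimeout()
sock.close()

# Restore the earlier default so unrelated code is unaffected.
socket.setdefaulttimeout(previous)
```

Because the default applies process-wide to every socket created afterwards, it is worth setting it once, early, before the stream is opened.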

Published: 2023-10-05 09:38:19
Link: https://www.elefans.com/category/jswz/34/1467351.html