I am trying to access a private repository on GitHub using httr. I can do so without any problem if I add my GitHub token (stored in the GITHUB_TOKEN environment variable):

    httr::GET("https://api.github.com/repos/aammd/miniature-meme/releases/assets/2859674",
              httr::write_disk("test.rds", overwrite = TRUE),
              httr::progress("down"),
              httr::add_headers(Authorization = paste("token", Sys.getenv("GITHUB_TOKEN"))))

However, if I try to specify another header, I get an error. In this case, I want to download the binary file associated with a release (the "asset", in github terminology):

    httr::GET("https://api.github.com/repos/aammd/miniature-meme/releases/assets/2859674",
              httr::write_disk("test.rds", overwrite = TRUE),
              httr::progress("down"),
              httr::add_headers(Authorization = paste("token", Sys.getenv("GITHUB_TOKEN"))),
              httr::add_headers(Accept = "application/octet-stream"))

    <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>InvalidArgument</Code><Message>Only one auth mechanism allowed; only the X-Amz-Algorithm query parameter, Signature query string parameter or the Authorization header should be specified</Message>

That's only part of the message (the rest includes my token).
Apparently my authorization is being sent twice! How can I prevent this? Is it related to httr::handle_pool()?
It appears that the original request receives a redirect reply that contains a signature. This signature, along with my token, is then sent on to the redirect target, causing the error. A similar thing happened to these people

    -> GET /repos/aammd/miniature-meme/releases/assets/2859674 HTTP/1.1
    -> Host: api.github.com
    -> User-Agent: libcurl/7.43.0 r-curl/2.3 httr/1.2.1.9000
    -> Accept-Encoding: gzip, deflate
    -> Authorization: token tttttttt
    -> Accept: application/octet-stream
    ->
    <- HTTP/1.1 302 Found
    <- Server: GitHub
    <- Date: Tue, 17 Jan 2017 13:28:12 GMT
    <- Content-Type: text/html;charset=utf-8
    <- Content-Length: 0
    <- Status: 302 Found
    <- X-RateLimit-Limit: 5000
    <- X-RateLimit-Remaining: 4984
    <- X-RateLimit-Reset: 1484662101
    <- location: https://github-cloud.s3.amazonaws.com/releases/76993567/aee5d0d6-c70a-11e6-9078-b5bee39f9fbc.RDS?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAISTNZFOVBIJMK3TQ%2F20170117%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20170117T132812Z&X-Amz-Expires=300&X-Amz-Signature=ssssssssss&X-Amz-SignedHeaders=host&actor_id=1198242&response-content-disposition=attachment%3B%20filename%3Dff.RDS&response-content-type=application%2Foctet-stream
    <- Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
    <- Access-Control-Allow-Origin: *
    <- Content-Security-Policy: default-src 'none'
    <- Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
    <- X-Content-Type-Options: nosniff
    <- X-Frame-Options: deny
    <- X-XSS-Protection: 1; mode=block
    <- Vary: Accept-Encoding
    <- X-Served-By: 3e3b9690823fb031da84658eb58aa83b
    <- X-GitHub-Request-Id: 82782802:6E1B:E9F0BE:587E1BEC
    <-
    -> GET /releases/76993567/aee5d0d6-c70a-11e6-9078-b5bee39f9fbc.RDS?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAISTNZFOVBIJMK3TQ%2F20170117%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20170117T132812Z&X-Amz-Expires=300&X-Amz-Signature=sssssssssssssss&X-Amz-SignedHeaders=host&actor_id=1198242&response-content-disposition=attachment%3B%20filename%3Dff.RDS&response-content-type=application%2Foctet-stream HTTP/1.1
    -> Host: github-cloud.s3.amazonaws.com
    -> User-Agent: libcurl/7.43.0 r-curl/2.3 httr/1.2.1.9000
    -> Accept-Encoding: gzip, deflate
    -> Authorization: token ttttttttttttt
    -> Accept: application/octet-stream
    ->
    <- HTTP/1.1 400 Bad Request
    <- x-amz-request-id: FA56B3D23B468704
    <- x-amz-id-2: 49X1mT5j5BrZ4HApeR/+wb7iVOWA8yn1obrgMoeOy44RH414bo/Ov8AAWSx2baEXO0H/WHX5jK0=
    <- Content-Type: application/xml
    <- Transfer-Encoding: chunked
    <- Date: Tue, 17 Jan 2017 13:28:12 GMT
    <- Connection: close
    <- Server: AmazonS3
    <-

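The trace suggests a simple rule: a URL that already carries an X-Amz-Signature query parameter should not also be sent an Authorization header. A minimal sketch of that rule in base R follows. The helper names has_query_signature and headers_for are hypothetical, not part of httr, and the check only looks for the one parameter name that appears in the trace.

```r
# Hypothetical helper: TRUE when the URL already carries query-string
# authentication, detected here only by an X-Amz-Signature parameter.
has_query_signature <- function(url) {
  grepl("[?&]X-Amz-Signature=", url)
}

# Hypothetical helper: choose the headers to send to a given URL.
# The Authorization header is added only when the URL is not already signed.
headers_for <- function(url, token) {
  headers <- c(Accept = "application/octet-stream")
  if (!has_query_signature(url)) {
    headers <- c(headers, Authorization = paste("token", token))
  }
  headers
}
```

The result is a named character vector, which could be passed on with httr::add_headers(.headers = headers_for(url, token)).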
gh doesn't work either
I created a public repo to test this idea out. The JSON can be returned from the API, but not the binary file:

    # this works fine
    gh::gh("https://api.github.com/repos/aammd/test_idea/releases/assets/2998763")

    # this does not
    gh::gh("https://api.github.com/repos/aammd/test_idea/releases/assets/2998763",
           .send_headers = c("Accept" = "application/octet-stream"))

wget might work, however
I've found a gist that shows how to do this with wget. The key component seems to be:

    wget -q --auth-no-challenge --header='Accept:application/octet-stream' \
      https://$TOKEN:@api.github.com/repos/$REPO/releases/assets/$asset_id \
      -O $2

However, if I try to replicate that in httr::GET, I am not successful:

    auth_url <- sprintf("https://%s:@api.github.com/repos/aammd/miniature-meme/releases/assets/2859674",
                        Sys.getenv("GITHUB_TOKEN"))

    httr::GET(auth_url,
              httr::write_disk("test.rds", overwrite = TRUE),
              httr::progress("down"),
              httr::add_headers(Accept = "application/octet-stream"))

Calling wget from R DOES work, but this solution is not totally satisfying because I can't guarantee that all my users have wget installed (unless there is a way to do that?).

    system(sprintf("wget --auth-no-challenge --header='Accept:application/octet-stream' %s -O testwget.rds", auth_url))

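On the question of whether users have wget installed: before shelling out like this, base R can at least check whether the command is on the PATH. The sketch below uses Sys.which(), which returns an empty string for commands it cannot find. The helper names command_available and choose_downloader are hypothetical, and falling back to httr is an assumption about what the caller would want.

```r
# Hypothetical helper: TRUE when `cmd` is found on the PATH.
command_available <- function(cmd) {
  nzchar(Sys.which(cmd)[[1]])
}

# Hypothetical helper: pick a download strategy based on what is installed.
choose_downloader <- function() {
  if (command_available("wget")) "wget" else "httr"
}
```

This only detects wget; it does not install it, so a pure-R download path is still needed for users without it.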
The output of wget (note the absence of -q above) is included here (again with tokens and signatures redacted, hopefully):

    --2017-01-18 13:21:55--  https://ttttt:*password*@api.github.com/repos/aammd/miniature-meme/releases/assets/2859674
    Resolving api.github.com... 192.30.253.117, 192.30.253.116
    Connecting to api.github.com|192.30.253.117|:443... connected.
    HTTP request sent, awaiting response... 302 Found
    Location: https://github-cloud.s3.amazonaws.com/releases/76993567/aee5d0d6-c70a-11e6-9078-b5bee39f9fbc.RDS?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAISTNZFOVBIJMK3TQ%2F20170118%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20170118T122156Z&X-Amz-Expires=300&X-Amz-Signature=SSSSSSSS-Amz-SignedHeaders=host&actor_id=1198242&response-content-disposition=attachment%3B%20filename%3Dff.RDS&response-content-type=application%2Foctet-stream [following]
    --2017-01-18 13:21:55--  https://github-cloud.s3.amazonaws.com/releases/76993567/aee5d0d6-c70a-11e6-9078-b5bee39f9fbc.RDS?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAISTNZFOVBIJMK3TQ%2F20170118%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20170118T122156Z&X-Amz-Expires=300&X-Amz-Signature=SSSSSSSSSSSS-Amz-SignedHeaders=host&actor_id=1198242&response-content-disposition=attachment%3B%20filename%3Dff.RDS&response-content-type=application%2Foctet-stream
    Resolving github-cloud.s3.amazonaws.com... 52.216.226.120
    Connecting to github-cloud.s3.amazonaws.com|52.216.226.120|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 682 [application/octet-stream]
    Saving to: ‘testwget.rds’

    0K 100% 15.5M=0s

    2017-01-18 13:21:56 (15.5 MB/s) - ‘testwget.rds’ saved [682/682]

Recommended answer
It turns out that there are two possible solutions to this problem!
As suggested by @user7433058, we can indeed pass the token as a URL parameter! Note, however, that we have to build the URL with paste0. This was the approach GitHub itself suggested in its API documentation at the time; GitHub has since deprecated passing access_token in the query string.

    ## pass oauth in the url
    httr::GET(paste0("https://api.github.com/repos/aammd/miniature-meme/releases/assets/2859674?access_token=",
                     Sys.getenv("GITHUB_TOKEN")),
              httr::write_disk("test.rds", overwrite = TRUE),
              httr::progress("down"),
              httr::add_headers(Accept = "application/octet-stream"))

    tt <- readRDS("test.rds")

Solution the second: ask again
Another solution is to make the request the first time, then extract the URL and use it to make a second request. Since the problem is caused by sending Authorization information twice -- once in the URL, once in the header -- we can avoid the problem by only using the URL.

    ## alternatively, get the query url (containing the signature) from the
    ## (failed) http request made the first time
    firsttry <- httr::GET("https://api.github.com/repos/aammd/miniature-meme/releases/assets/2859674",
                          httr::add_headers(Authorization = paste("token", Sys.getenv("GITHUB_TOKEN")),
                                            Accept = "application/octet-stream"))

    ## request the signed url again, this time without the Authorization header
    httr::GET(firsttry$url,
              httr::write_disk("test2.rds", overwrite = TRUE),
              httr::progress("down"),
              httr::add_headers(Accept = "application/octet-stream"))

    tt2 <- readRDS("test2.rds")

This is, I suppose, a bit less efficient (making 3 requests in total instead of 2). However, since only the first request goes to the actual GitHub API, it only counts as 1 request towards your rate limit.
We can make only 2 HTTP requests instead of 3 by telling httr not to follow redirects. To do this, use httr::config(followlocation = FALSE) in the first of the two requests (i.e. the one that gets firsttry).
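A minimal sketch of that two-request variant follows. It rests on assumptions not confirmed in the original: that with followlocation = FALSE the first response is the 302 itself, so the signed URL has to be read from its Location header rather than from firsttry$url, and that the GITHUB_TOKEN environment variable is set. The function name download_release_asset and its arguments are hypothetical.

```r
# Hypothetical helper: download a release asset in exactly two HTTP requests.
download_release_asset <- function(asset_url, dest,
                                   token = Sys.getenv("GITHUB_TOKEN")) {
  # Request 1: authenticated call to the GitHub API, redirect NOT followed.
  first <- httr::GET(
    asset_url,
    httr::config(followlocation = FALSE),
    httr::add_headers(
      Authorization = paste("token", token),
      Accept = "application/octet-stream"
    )
  )
  if (httr::status_code(first) != 302) {
    stop("Expected a 302 redirect, got status ", httr::status_code(first))
  }

  # The signed URL is assumed to be in the Location header of the 302 reply.
  signed_url <- httr::headers(first)$location

  # Request 2: the signed URL carries its own credentials in the query
  # string, so no Authorization header is sent this time.
  second <- httr::GET(signed_url, httr::write_disk(dest, overwrite = TRUE))
  httr::stop_for_status(second)
  invisible(dest)
}
```

It could be called as download_release_asset("https://api.github.com/repos/aammd/miniature-meme/releases/assets/2859674", "test2.rds"), after which readRDS("test2.rds") should read the file as before.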