Download a file from a file-type field?

Updated: 2024-10-25 20:22:17

Question

I am looking for a way to download files from several web pages and store them in a particular folder on my local machine. I am using Python 2.7.

Here is the HTML content of the field:

<input type="hidden" name="supplierProfiles(1152444).location.locationPurposes().extendedAttributes(Upload_RFI_Form).value.filename" value="Screenshot.docx">
<a style="display:inline; position:relative;" href=" /aems/file/filegetrevision.do?fileEntityId=8120070&cs=LU31NT9us5P9Pvkb1BrtdwaCrEraskiCJcY6E2ucP5s.xyz"> Screenshot.docx </a>

One approach I just tried: take the href from the HTML content above and prefix it with the host, say xyz.test, to build a URL like this:

xyz.test/aems/file/filegetrevision.do?fileEntityId=8120070&cs=LU31NT9us5P9Pvkb1BrtdwaCrEraskiCJcY6E2ucP5s.xyz

Pasting that URL into the browser and pressing Enter lets me download the file. But how can I find all such aems/file/filegetrevision.do?fileEntityId=8120070&cs=LU31NT9us5P9Pvkb1BrtdwaCrEraskiCJcY6E2ucP5s.xyz values on a page, however many of them are present?

Code I have tried so far

The only pain point is how to download the file. This script constructs the URLs:

for a in soup.find_all('a', {"style": "display:inline; position:relative;"}, href=True):
    href = a['href'].strip()
    href = "xyz.test/" + href
    print(href)
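Concatenating "xyz.test/" with an href that already starts with a slash produces a double slash, and the result has no scheme, so a downloader cannot open it. A minimal sketch of one way to join the two parts, using the standard library's urljoin. The function name build_download_url is illustrative, and "http://xyz.test/" is an assumed placeholder, since the real scheme and host are not given in the question:

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin      # Python 2.7


def build_download_url(base_url, href):
    """Join a site root and an href taken from the page into one absolute URL."""
    # The href in the question has a leading space, so strip it first.
    return urljoin(base_url, href.strip())


# "http://xyz.test/" is an assumed placeholder host, not a real site.
url = build_download_url(
    "http://xyz.test/",
    " /aems/file/filegetrevision.do?fileEntityId=8120070"
    "&cs=LU31NT9us5P9Pvkb1BrtdwaCrEraskiCJcY6E2ucP5s.xyz",
)
print(url)
```

Because the href is root-relative, urljoin keeps only the scheme and host of the base and replaces the path, so the same call also works when the base is the URL of the page the link was found on.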

Please help me with this.

Let me know if you need any more information; I am happy to share it.

Thanks in advance!

Recommended answer

As @JohnZwinck suggested, you can use urllib.urlretrieve, together with the re module, to build a list of the links on a given page and download each file. Below is an example.

#!/usr/bin/python
"""
This script scrapes a page and downloads the files behind its anchor links.
"""

# Imports
import os, re, sys
import urllib, urllib2
import urlparse

# Config
# Placeholder host taken from the question; replace it with the real site,
# scheme included.
base_url = "http://xyz.test/"
destination_directory = "downloads"


def _usage():
    """
    Print the usage information.
    """
    print "USAGE: %s <url>" % sys.argv[0]


def _create_url_list(url):
    """
    Build the list of download URLs from the anchor links found on the
    page at the given URL.
    """
    raw_data = urllib2.urlopen(url).read()
    raw_list = re.findall('<a style="display:inline; position:relative;" href="(.+?)"', raw_data)
    # The hrefs may carry stray whitespace and a leading slash, so strip
    # them and let urljoin combine each one with base_url.
    url_list = [urlparse.urljoin(base_url, x.strip()) for x in raw_list]
    return url_list


def _get_file_name(url):
    """
    Return the file name extracted from the given URL: its last path
    segment, including any query string.
    """
    parts = url.split('/')
    return parts[-1]


def _download_file(url, filename):
    """
    Given a URL and a filename, save the file locally under the
    destination_directory path.
    """
    if not os.path.exists(destination_directory):
        print 'Directory [%s] does not exist, Creating directory...' % destination_directory
        os.makedirs(destination_directory)
    try:
        print 'Downloading File [%s]' % (filename)
        urllib.urlretrieve(url, os.path.join(destination_directory, filename))
    except Exception:
        print 'Error Downloading File [%s]' % (filename)


def _download_all(main_url):
    """
    Given a page URL, download every linked file into the destination
    directory.
    """
    url_list = _create_url_list(main_url)
    for url in url_list:
        _download_file(url, _get_file_name(url))


def main(argv):
    """
    This is the script's launcher method.
    """
    if len(argv) != 1:
        _usage()
        sys.exit(1)
    _download_all(argv[0])
    print 'Finished Downloading.'


if __name__ == '__main__':
    main(sys.argv[1:])
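A regular expression that matches the exact style attribute is brittle: it misses a link as soon as the attribute order or spacing changes. As a hedged alternative, the sketch below uses the standard library's HTMLParser to collect every anchor whose href contains a marker string, together with its link text, which also shows how many such links a page holds. The names DownloadLinkParser and find_download_links and the default marker "filegetrevision.do" are illustrative assumptions based on the HTML in the question:

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2.7


class DownloadLinkParser(HTMLParser):
    """Collect (href, link text) pairs for anchors whose href contains a marker."""

    def __init__(self, marker):
        HTMLParser.__init__(self)
        self.marker = marker
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the matching anchor currently open, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = (dict(attrs).get("href") or "").strip()
            if self.marker in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_download_links(html, marker="filegetrevision.do"):
    """Return a list of (href, link text) pairs found in the given HTML."""
    parser = DownloadLinkParser(marker)
    parser.feed(html)
    parser.close()
    return parser.links


# Sample markup modelled on the question; the first link should be ignored.
sample = (
    '<a href="/help.html">Help</a>'
    '<a style="display:inline; position:relative;" '
    'href=" /aems/file/filegetrevision.do?fileEntityId=8120070'
    '&cs=LU31NT9us5P9Pvkb1BrtdwaCrEraskiCJcY6E2ucP5s.xyz">'
    ' Screenshot.docx </a>'
)
links = find_download_links(sample)
print(len(links))
print(links)
```

Each href still has to be joined with the site's scheme and host before it can be downloaded, and the link text can serve as the local file name.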

You can change base_url and destination_directory to suit your needs and save the script as download.py. Then run it from the terminal, passing the full page URL (scheme included), like this:

python download.py "http://www.example.com/?page=1"
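For links like the one in the question, the last URL segment is filegetrevision.do?fileEntityId=... rather than a real file name, and it contains characters that some file systems reject. A small sketch, assuming the link text (for example Screenshot.docx) is available to name the file. The helper names safe_file_name and unique_path are hypothetical and not part of the answer's script:

```python
import os
import re


def safe_file_name(link_text, fallback="download"):
    """Turn link text such as ' Screenshot.docx ' into a name safe to use on disk."""
    # Keep only the last path component, in case the text contains separators.
    name = os.path.basename(link_text.strip().replace("\\", "/"))
    # Replace characters that are invalid in file names on common platforms,
    # then drop leading and trailing spaces and dots.
    name = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", name).strip(" .")
    return name or fallback


def unique_path(directory, name):
    """Return a path inside directory that does not collide with an existing file."""
    stem, ext = os.path.splitext(name)
    candidate = os.path.join(directory, name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, "%s_%d%s" % (stem, counter, ext))
        counter += 1
    return candidate


print(safe_file_name(" Screenshot.docx "))
print(safe_file_name("filegetrevision.do?fileEntityId=8120070"))
```

The resulting path can be passed as the second argument to urllib.urlretrieve, so that two links with the same text do not overwrite each other.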

Published: 2023-11-01 20:36:03
Link: https://www.elefans.com/category/jswz/34/1550347.html