Same Python function giving different output

Updated: 2024-10-27 10:32:23

I am writing a scraping script in Python. I first collected the links to the movie pages from which I need to scrape the song lists. Here is movie.txt, the file containing the movie links:

    https://www.lyricsbogie.com/category/movies/a-flat-2010
    https://www.lyricsbogie.com/category/movies/a-night-in-calcutta-1970
    https://www.lyricsbogie.com/category/movies/a-scandall-2016
    https://www.lyricsbogie.com/category/movies/a-strange-love-story-2011
    https://www.lyricsbogie.com/category/movies/a-sublime-love-story-barsaat-2005
    https://www.lyricsbogie.com/category/movies/a-wednesday-2008
    https://www.lyricsbogie.com/category/movies/aa-ab-laut-chalen-1999
    https://www.lyricsbogie.com/category/movies/aa-dekhen-zara-2009
    https://www.lyricsbogie.com/category/movies/aa-gale-lag-jaa-1973
    https://www.lyricsbogie.com/category/movies/aa-gale-lag-jaa-1994
    https://www.lyricsbogie.com/category/movies/aabra-ka-daabra-2004
    https://www.lyricsbogie.com/category/movies/aabroo-1943
    https://www.lyricsbogie.com/category/movies/aabroo-1956
    https://www.lyricsbogie.com/category/movies/aabroo-1968
    https://www.lyricsbogie.com/category/movies/aabshar-1953

Here is my first Python function:

    import requests
    from bs4 import BeautifulSoup as bs

    def get_songs_links_for_movies1():
        url = 'https://www.lyricsbogie.com/category/movies/a-flat-2010'
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = bs(plain_text, "html.parser")
        for link in soup.find_all('h3', class_='entry-title'):
            href = link.a.get('href')
            href = href + "\n"
            print(href)

Output of the above function:

    https://www.lyricsbogie.com/movies/a-flat-2010/pyar-itna-na-kar.html
    https://www.lyricsbogie.com/movies/a-flat-2010/chal-halke-halke.html
    https://www.lyricsbogie.com/movies/a-flat-2010/meetha-sa-ishq.html
    https://www.lyricsbogie.com/movies/a-flat-2010/dil-kashi.html
    https://www.lyricsbogie.com/movies/ae-dil-hai-mushkil-2016/ae-dil-hai-mushkil-title.html
    https://www.lyricsbogie.com/movies/m-s-dhoni-the-untold-story-2016/kaun-tujhe.html
    https://www.lyricsbogie.com/movies/raaz-reboot-2016/raaz-aankhein-teri.html
    https://www.lyricsbogie.com/albums/akira-2016/baadal-2.html
    https://www.lyricsbogie.com/movies/baar-baar-dekho-2016/sau-aasmaan.html
    https://www.lyricsbogie.com/albums/gajanan-2016/gajanan-title.html
    https://www.lyricsbogie.com/movies/days-of-tafree-2016/jeeley-yeh-lamhe.html
    https://www.lyricsbogie.com/tv-shows/coke-studio-pakistan-season-9-2016/ala-baali.html
    https://www.lyricsbogie.com/albums/piya-2016/piya-title.html
    https://www.lyricsbogie.com/albums/sach-te-supna-2016/sach-te-supna-title.html

It successfully fetches the song URLs for the specified link. But when I try to automate the process by reading the URLs one by one from the file movie.txt, the output does not match that of the function above, where I enter each URL by hand. This function also fails to get the song URLs. Here is the function that does not work correctly:

    import requests
    from bs4 import BeautifulSoup as bs

    def get_songs_links_for_movies():
        file = open("movie.txt", "r")
        for url in file:
            source_code = requests.get(url)
            plain_text = source_code.text
            soup = bs(plain_text, "html.parser")
            for link in soup.find_all('h3', class_='entry-title'):
                href = link.a.get('href')
                href = href + "\n"
                print(href)

Output of the above function:

    https://www.lyricsbogie.com/movies/ae-dil-hai-mushkil-2016/ae-dil-hai-mushkil-title.html
    https://www.lyricsbogie.com/movies/m-s-dhoni-the-untold-story-2016/kaun-tujhe.html
    https://www.lyricsbogie.com/movies/raaz-reboot-2016/raaz-aankhein-teri.html
    https://www.lyricsbogie.com/albums/akira-2016/baadal-2.html
    https://www.lyricsbogie.com/movies/baar-baar-dekho-2016/sau-aasmaan.html
    https://www.lyricsbogie.com/albums/gajanan-2016/gajanan-title.html
    https://www.lyricsbogie.com/movies/days-of-tafree-2016/jeeley-yeh-lamhe.html
    https://www.lyricsbogie.com/tv-shows/coke-studio-pakistan-season-9-2016/ala-baali.html
    https://www.lyricsbogie.com/albums/piya-2016/piya-title.html
    https://www.lyricsbogie.com/albums/sach-te-supna-2016/sach-te-supna-title.html
    https://www.lyricsbogie.com/movies/ae-dil-hai-mushkil-2016/ae-dil-hai-mushkil-title.html
    https://www.lyricsbogie.com/movies/m-s-dhoni-the-untold-story-2016/kaun-tujhe.html
    https://www.lyricsbogie.com/movies/raaz-reboot-2016/raaz-aankhein-teri.html
    https://www.lyricsbogie.com/albums/akira-2016/baadal-2.html
    https://www.lyricsbogie.com/movies/baar-baar-dekho-2016/sau-aasmaan.html
    https://www.lyricsbogie.com/albums/gajanan-2016/gajanan-title.html
    https://www.lyricsbogie.com/movies/days-of-tafree-2016/jeeley-yeh-lamhe.html
    https://www.lyricsbogie.com/tv-shows/coke-studio-pakistan-season-9-2016/ala-baali.html
    https://www.lyricsbogie.com/albums/piya-2016/piya-title.html
    https://www.lyricsbogie.com/albums/sach-te-supna-2016/sach-te-supna-title.html
    https://www.lyricsbogie.com/movies/ae-dil-hai-mushkil-2016/ae-dil-hai-mushkil-title.html
    https://www.lyricsbogie.com/movies/m-s-dhoni-the-untold-story-2016/kaun-tujhe.html
    https://www.lyricsbogie.com/movies/raaz-reboot-2016/raaz-aankhein-teri.html
    https://www.lyricsbogie.com/albums/akira-2016/baadal-2.html
    https://www.lyricsbogie.com/movies/baar-baar-dekho-2016/sau-aasmaan.html
    https://www.lyricsbogie.com/albums/gajanan-2016/gajanan-title.html
    https://www.lyricsbogie.com/movies/days-of-tafree-2016/jeeley-yeh-lamhe.html
    https://www.lyricsbogie.com/tv-shows/coke-studio-pakistan-season-9-2016/ala-baali.html
    https://www.lyricsbogie.com/albums/piya-2016/piya-title.html
    https://www.lyricsbogie.com/albums/sach-te-supna-2016/sach-te-supna-title.html

and so on...

Comparing the output of the first function with that of the second, you can clearly see that the second output contains none of the a-flat-2010 song URLs that the first function fetches, and that the second function repeats the same output again and again.

Can anyone help me understand why this is happening?

Best answer

To understand what is happening, you can print the representation of each URL read from the file inside the for loop:

    for url in file:
        print(repr(url))
        ...

Printing this representation (and not just the string) makes it easier to see special characters. In this case, the output was 'https://www.lyricsbogie.com/category/movies/a-flat-2010\n'. As you can see, the URL ends with a line break, so the URL that gets requested is not the intended one.
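A minimal sketch of that check, runnable without the real file: an in-memory io.StringIO object stands in for movie.txt (an assumption made only so the snippet is self-contained), and the names fake_file and lines are illustrative. Iterating over a text file yields each line with its trailing newline still attached, which repr() makes visible.

```python
import io

# Stand-in for open("movie.txt", "r"); the contents mimic two lines of that file.
fake_file = io.StringIO(
    "https://www.lyricsbogie.com/category/movies/a-flat-2010\n"
    "https://www.lyricsbogie.com/category/movies/a-wednesday-2008\n"
)

# Iterating over a text file yields one string per line, newline included.
lines = list(fake_file)

for url in lines:
    # repr() shows the trailing "\n" as an escape sequence instead of
    # silently printing a line break.
    print(repr(url))
```

With a plain print(url) the extra newline would only show up as a blank line, which is easy to overlook.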

For instance, use the rstrip() method to remove the newline character, replacing url with url.rstrip().
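One way to apply this, sketched under a few assumptions: the helper name clean_urls is hypothetical, an io.StringIO object again stands in for movie.txt, and skipping blank lines is an extra precaution that the answer itself does not mention.

```python
import io


def clean_urls(lines):
    """Yield each line without trailing whitespace, skipping empty lines."""
    for line in lines:
        url = line.rstrip()  # removes the trailing "\n" (and any "\r" or spaces)
        if url:              # assumption: blank lines in the file should be ignored
            yield url


# Stand-in for open("movie.txt", "r"), including a stray blank line.
fake_file = io.StringIO(
    "https://www.lyricsbogie.com/category/movies/a-flat-2010\n"
    "\n"
    "https://www.lyricsbogie.com/category/movies/a-wednesday-2008\n"
)

urls = list(clean_urls(fake_file))
for url in urls:
    print(repr(url))
```

In the function from the question, the loop would then read `for url in clean_urls(file):`, so that requests.get(url) receives the URL without the line break.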

Published: 2023-07-26 02:17:00
Article link: https://www.elefans.com/category/jswz/34/1269620.html