Problem description
I have a condition where I want to retrieve text from a specific tag, but it does not seem to be returning true. Any help?
#!/usr/bin/perl
use HTML::TreeBuilder;
use warnings;
use strict;
my $URL = "http://prospectus.ulster.ac.uk/modules/index/index/selCampus/JN/selProgramme/2132/hModuleCode/COM137";
my $tree = HTML::TreeBuilder->new_from_content($URL);
if (my $div = $tree->look_down(_tag => "div ", class => "col col60 moduledetail")) {
printf $div->as_text();
print "test";
open (FILE, '>mytest.txt');
print FILE $div;
close (FILE);
}
print $tree->look_down(_tag => "th", class => "moduleCode")->as_text();
$tree->delete();
The code never enters the if statement, and the print outside the if statement reports an undefined value, but I know it should be returning true because these tags do exist.
<th class="moduleCode">COM137<small>CRN: 33413</small></th>
Thanks
Recommended answer
You are calling HTML::TreeBuilder->new_from_content, yet you are supplying a URL instead of content. You have to get the HTML before you can pass it to HTML::TreeBuilder.
Perhaps the simplest way is to use LWP::Simple, which imports a subroutine called get. This reads the data at the URL and returns it as a string.
The reason your conditional block is never executed is that you have a space in the tag name. You need "div" instead of "div ".
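To see the effect of the stray space in isolation, here is a minimal sketch that parses a small inline HTML string rather than fetching the page. The markup is an assumption modelled on the class names in the question, not the real page content.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup, modelled on the class names used in the question.
my $html = <<'HTML';
<html><body>
<div class="col col60 moduledetail">Module details</div>
<table><tr><th class="moduleCode">COM137<small>CRN: 33413</small></th></tr></table>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Tag names are compared exactly, so "div " (trailing space) matches nothing.
my $with_space    = $tree->look_down(_tag => "div ", class => "col col60 moduledetail");
my $without_space = $tree->look_down(_tag => "div",  class => "col col60 moduledetail");

print 'With "div ": ', (defined $with_space    ? "found" : "not found"), "\n";
print 'With "div":  ', (defined $without_space ? "found" : "not found"), "\n";
```

In scalar context look_down returns the first matching element, or undef when nothing matches, which is why the original if block was skipped.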
Note also the following points:
You shouldn't output a single string by using printf with that string as the format specifier. Any % characters in the string are treated as format directives, so it may generate missing-argument warnings and fail to output the string properly.
You should ideally use lexical file handles and the three-argument form of open. You should also check the status of all open calls and respond accordingly.
Your scalar variable $div is a blessed hash reference, so printing it as it is will output something like HTML::Element=HASH(0xfffffff). You need to call its methods to extract the values you want to display.
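The three points above can be sketched together without any network access. This is only an illustration: the element is built by hand to stand in for the $div returned by look_down, its text is invented so that it contains a % character, and the temporary file name is generated only for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use HTML::Element;

# A hand-built element standing in for the $div found by look_down.
# The text is made up; it contains '%' to show why printf is risky.
my $div = HTML::Element->new('div', class => 'col col60 moduledetail');
$div->push_content('Pass mark: 40% overall');

# 1. Use print, or pass the string to printf as an argument,
#    so that the '%' in it is not read as a format directive.
my $text = $div->as_text;
print $text, "\n";
printf "%s\n", $text;

# 2. Lexical filehandle, three-argument open, and a checked result.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh;
open my $out, '>', $filename or die "Unable to open '$filename': $!";

# 3. Write the element's text, not the reference itself.
print {$out} $text, "\n";
close $out or die "Unable to close '$filename': $!";

# Stringifying the object shows its class and address, not its content.
my $as_string = "$div";
print "Stringified: $as_string\n";
```

Calling as_text (or as_HTML, if the markup is wanted) is what turns the object into something worth printing. The braces in print {$out} are optional here, but they make it clear that $out is the filehandle.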
With these errors corrected, your code looks like this, although I haven't formatted the output as I can't tell what you want.
use strict;
use warnings;
use HTML::TreeBuilder;
use LWP::Simple;

my $url = "http://prospectus.ulster.ac.uk/modules/index/index/selCampus/JN/selProgramme/2132/hModuleCode/COM137";

# Fetch the page; get returns undef if the request fails
my $html = get $url;
defined $html or die "Unable to fetch $url";

my $tree = HTML::TreeBuilder->new_from_content($html);

if (my $div = $tree->look_down(_tag => "div", class => "col col60 moduledetail")) {
    print $div->as_text(), "\n";
    open my $fh, '>', 'mytest.txt' or die "Unable to open output file: $!";
    print $fh $div->as_text, "\n";
}

print $tree->look_down(_tag => "th", class => "moduleCode")->as_text, "\n";