Another IMPORTXML returning empty content

Updated: 2024-10-14 22:15:38

Problem description

When I enter

=IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","//h2")

in my Google Sheet, I get: #N/A Imported content is empty.

But when I enter:

=IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","*")

I get some content, so I can presume that access to the page is not blocked.

And the page undoubtedly contains several h2 tags.

So what is the problem?

Recommended answer

  • You want to know the reason for the following behavior.
    • =IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","//h2") returns #N/A Imported content is empty.
    • =IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","*") returns content.

If my understanding is correct, how about this answer?

When I looked at the HTML of www.ilgiornale.it/autore/franco-battaglia.html, I noticed a problem in it. It is the following line.

    window.jQuery || document.write("<script src='/sites/all/modules/jquery_update/replace/jquery/jquery.min.js'>\x3C/script>")

Here, the closing script tag is written as \x3C/script> instead of </script>. It seems that when IMPORTXML parses this line, it treats the script tag as never closed. I could confirm that when \x3C is changed to <, =IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","//h2") correctly returns the values of the h2 tags.

This appears to be why =IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","//h2") returns #N/A Imported content is empty.
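The effect of the escaped closing tag can be illustrated in plain JavaScript. This is only a sketch: the shortened `sample` line and the helper name `unescapeScriptClose` are assumptions made for illustration. It shows that the raw HTML contains no literal `</script>` for that tag until `\x3C` is replaced with `<`.

```javascript
// Hypothetical helper: replace the literal characters "\x3C" with "<".
function unescapeScriptClose(html) {
  return html.replace(/\\x3C/g, "<");
}

// Shortened sample of the problematic line as it appears in the raw HTML.
// String.raw keeps the backslash as a literal character.
var sample = String.raw`document.write("<script src='/jquery.min.js'>\x3C/script>")`;

var fixed = unescapeScriptClose(sample);

console.log(sample.indexOf("</script>"));       // -1: no literal closing tag
console.log(fixed.indexOf("</script>") !== -1); // true after the replacement
```

The same `replace(/\\x3C/g, "<")` call is what the first workaround applies to the downloaded page.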

As for why =IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","*") returns content: when I used this formula, the returned values did not include the contents of the script tags. From this, I suspected that a script tag was the problem, which is how I found the line above. I could also confirm that when \x3C is changed to <, =IMPORTXML("https://www.ilgiornale.it/autore/franco-battaglia.html","*") returns values that include the contents of the script tags.

To avoid this issue, \x3C has to be changed to <. So how about the following workarounds? Both use Google Apps Script. Please think of them as just two of several possible workarounds.

In the first workaround, the HTML is downloaded from the URL and the problematic escape is fixed. The modified HTML is then saved as a file on Google Drive, the file is shared, and the URL of the file is used to retrieve the values.

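    function myFunction() {
      var url = "https://www.ilgiornale.it/autore/franco-battaglia.html";
      // Download the HTML and turn the escaped "\x3C" back into "<".
      var data = UrlFetchApp.fetch(url).getContentText().replace(/\\x3C/g, "<");
      // Save the fixed HTML to Drive and make it viewable by anyone with the link.
      var file = DriveApp.createFile("htmlData.html", data, MimeType.HTML);
      file.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
      // Direct-download URL to pass to IMPORTXML.
      var endpoint = "https://drive.google.com/uc?id=" + file.getId() + "&export=download";
      Logger.log(endpoint);
    }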

  • When you use this script, first run myFunction() and retrieve the endpoint from the log. As a test, put the endpoint in cell "A1" and put =IMPORTXML(A1,"//h2") in cell "A2". The values can then be retrieved.
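The two strings used in this test can be sketched as small pure functions. The helper names `buildDownloadUrl` and `buildImportXmlFormula` and the placeholder file ID are hypothetical, and the `https://drive.google.com/uc` endpoint shape is an assumption based on the script above, not something this sketch verifies against Drive.

```javascript
// Hypothetical helper: direct-download URL for a Drive file ID
// (endpoint shape assumed from the first workaround's script).
function buildDownloadUrl(fileId) {
  return "https://drive.google.com/uc?id=" + encodeURIComponent(fileId) + "&export=download";
}

// Hypothetical helper: IMPORTXML formula text that reads its URL from a cell.
function buildImportXmlFormula(cellRef, xpath) {
  return "=IMPORTXML(" + cellRef + ',"' + xpath + '")';
}

// FILE_ID_PLACEHOLDER stands in for the value returned by file.getId().
console.log(buildDownloadUrl("FILE_ID_PLACEHOLDER"));  // value for cell A1
console.log(buildImportXmlFormula("A1", "//h2"));      // value for cell A2
```

Keeping the URL in its own cell means the formula in "A2" does not need to change when the file is regenerated.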

In the second workaround, the values of the h2 tags are retrieved directly by parsing the HTML and are written to the active spreadsheet.

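    function myFunction() {
      var url = "https://www.ilgiornale.it/autore/franco-battaglia.html";
      // Pull every <h2>...</h2> block out of the raw HTML.
      var data = UrlFetchApp.fetch(url).getContentText().match(/<h2[\s\S]+?<\/h2>/g);
      // Wrap the blocks in a single root element so XmlService can parse them.
      var xml = XmlService.parse("<temp>" + data.join("") + "</temp>");
      // One row per h2, as required by setValues().
      var h2Values = xml.getRootElement().getChildren("h2").map(function(e) { return [e.getValue()]; });
      // Append the values below the last used row of the active sheet.
      var sheet = SpreadsheetApp.getActiveSheet();
      sheet.getRange(sheet.getLastRow() + 1, 1, h2Values.length, 1).setValues(h2Values);
      Logger.log(h2Values);
    }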

  • When you run the script, the values of the h2 tags are written directly to the active spreadsheet.
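The regex step of this workaround can be tried without network access. The sketch below is not the answer's method: it swaps XmlService.parse for a simple tag-stripping regex so that it runs in any JavaScript environment, and the helper name `extractH2Text` and the `sampleHtml` string are assumptions for illustration. It also guards against match() returning null when the page has no h2 tags.

```javascript
// Hypothetical helper: collect the text of every <h2>...</h2> block in an HTML string.
function extractH2Text(html) {
  // match() returns null when nothing matches, so fall back to an empty array.
  var blocks = html.match(/<h2[\s\S]+?<\/h2>/g) || [];
  return blocks.map(function (block) {
    return block
      .replace(/<[^>]+>/g, "") // drop the h2 tags and any nested tags
      .replace(/\s+/g, " ")    // collapse runs of whitespace
      .trim();
  });
}

// Made-up sample HTML with two h2 elements.
var sampleHtml =
  "<div><h2 class='t'>First <a href='#'>title</a></h2><p>x</p><h2>\n Second </h2></div>";

console.log(extractH2Text(sampleHtml)); // ["First title", "Second"]
console.log(extractH2Text("<p>none</p>").length); // 0
```

Stripping tags with a regex is enough for short headings like these, but a real parser such as XmlService, as used in the script above, is the more robust choice for arbitrary markup.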

References:
  • Class UrlFetchApp
  • Class XmlService

If I misunderstood your question and this is not the direction you wanted, I apologize.

Published: 2023-11-28 14:12:03
Article link: https://www.elefans.com/category/jswz/34/1642731.html