I'm using Google Sheets' IMPORTXML function to retrieve data for each calendar date one year earlier, or for the closest year-ago date for which data are available.
This is a data sample (the complete data source is here):
<entry> <id>data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(6794)</id> <title type="text"></title> <updated>2018-02-06T22:05:38Z</updated> <author> <name /> </author> <link rel="edit" title="DailyTreasuryYieldCurveRateDatum" href="DailyTreasuryYieldCurveRateData(6794)" /> <category term="TreasuryDataWarehouseModel.DailyTreasuryYieldCurveRateDatum" scheme="schemas.microsoft/ado/2007/08/dataservices/scheme" /> <content type="application/xml"> <m:properties> <d:Id m:type="Edm.Int32">6794</d:Id> <d:NEW_DATE m:type="Edm.DateTime">2017-02-24T00:00:00</d:NEW_DATE> <d:BC_1MONTH m:type="Edm.Double">0.4</d:BC_1MONTH> <d:BC_3MONTH m:type="Edm.Double">0.52</d:BC_3MONTH> <d:BC_6MONTH m:type="Edm.Double">0.65</d:BC_6MONTH> <d:BC_1YEAR m:type="Edm.Double">0.8</d:BC_1YEAR> <d:BC_2YEAR m:type="Edm.Double">1.12</d:BC_2YEAR> <d:BC_3YEAR m:type="Edm.Double">1.38</d:BC_3YEAR> <d:BC_5YEAR m:type="Edm.Double">1.8</d:BC_5YEAR> <d:BC_7YEAR m:type="Edm.Double">2.12</d:BC_7YEAR> <d:BC_10YEAR m:type="Edm.Double">2.31</d:BC_10YEAR> <d:BC_20YEAR m:type="Edm.Double">2.69</d:BC_20YEAR> <d:BC_30YEAR m:type="Edm.Double">2.95</d:BC_30YEAR> <d:BC_30YEARDISPLAY m:type="Edm.Double">2.95</d:BC_30YEARDISPLAY> </m:properties> </content> </entry> <entry> <id>data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(6795)</id> <title type="text"></title> <updated>2018-02-06T22:05:38Z</updated> <author> <name /> </author> <link rel="edit" title="DailyTreasuryYieldCurveRateDatum" href="DailyTreasuryYieldCurveRateData(6795)" /> <category term="TreasuryDataWarehouseModel.DailyTreasuryYieldCurveRateDatum" scheme="schemas.microsoft/ado/2007/08/dataservices/scheme" /> <content type="application/xml"> <m:properties> <d:Id m:type="Edm.Int32">6795</d:Id> <d:NEW_DATE m:type="Edm.DateTime">2017-02-27T00:00:00</d:NEW_DATE> <d:BC_1MONTH m:type="Edm.Double">0.44</d:BC_1MONTH> <d:BC_3MONTH m:type="Edm.Double">0.5</d:BC_3MONTH> <d:BC_6MONTH m:type="Edm.Double">0.68</d:BC_6MONTH> <d:BC_1YEAR 
m:type="Edm.Double">0.81</d:BC_1YEAR> <d:BC_2YEAR m:type="Edm.Double">1.2</d:BC_2YEAR> <d:BC_3YEAR m:type="Edm.Double">1.46</d:BC_3YEAR> <d:BC_5YEAR m:type="Edm.Double">1.87</d:BC_5YEAR> <d:BC_7YEAR m:type="Edm.Double">2.18</d:BC_7YEAR> <d:BC_10YEAR m:type="Edm.Double">2.36</d:BC_10YEAR> <d:BC_20YEAR m:type="Edm.Double">2.72</d:BC_20YEAR> <d:BC_30YEAR m:type="Edm.Double">2.98</d:BC_30YEAR> <d:BC_30YEARDISPLAY m:type="Edm.Double">2.98</d:BC_30YEARDISPLAY> </m:properties> </content> </entry> <entry>
This is the XPath query I'm currently using to retrieve data for 2017 Feb 27:
//*[local-name() = 'NEW_DATE'][text() = '2017-02-27T00:00:00']/..
This is the result displayed:
6795 2017-02-27T00:00:00 0.44 0.5 0.68 0.81 1.2 1.46 1.87 2.18 2.36 2.72 2.98 2.98
Is there a way to:
I'm avoiding the IF function in order to make fewer IMPORTXML calls.
Recommended answer
Google Sheets (GS) appears to support only XPath 1.0 (its documentation and product forum pages here and here do not confirm or clarify which version(s) are supported as of this date), so alternatives such as XPath 2.0 if-then-else expressions cannot be used. Instead, the source XML data can be filtered using native GS functions.
Question 1
Per @Tanaike's proposed solution, all child nodes of an element except one can be retrieved using the XPath inequality operator !=, applied below to the children of the parent element of "NEW_DATE" so that the child element "Id" is left out. TRANSPOSE is used to display the result in columnar form. (A1 is the cell containing the source XML URL in the question.)
=TRANSPOSE(IMPORTXML(A1, "//*[local-name() = 'NEW_DATE'][text() = '2017-02-27T00:00:00']/../*[local-name()!='Id']"))
Question 2
To search for the date one year ago and, if data for that date are missing, retrieve the closest forward date, nested GS functions are needed. The inner IMPORTXML call retrieves all "NEW_DATE" values, and SORT reverses them into descending order. MATCH with a search type of -1 then finds the position of the smallest date greater than or equal to the year-ago date, i.e. the closest available forward date. INDEX is applied to the same reverse-sorted list to return that date, which is inserted into the XPath query of the outer IMPORTXML call. The CONCATENATE, TEXT and TODAY functions are used only to put the year-ago date in a format compatible with the XML data. The formula is below.
=TRANSPOSE(IMPORTXML(A1,CONCATENATE("//*[local-name() = 'NEW_DATE'][text() = '",INDEX(SORT(IMPORTXML(A1, "//*[local-name() = 'NEW_DATE']"),1,FALSE),MATCH(CONCATENATE(TEXT(TODAY()-365,"YYYY-MM-DD"),"T00:00:00"),SORT(IMPORTXML(A1, "//*[local-name() = 'NEW_DATE']"),1,FALSE),-1)),"']/../*[local-name() != 'Id']")))
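For checking the logic outside Sheets, here is a rough Python sketch of what the two formulas do together: drop the "Id" child, and fall back to the closest forward date when the requested date is missing. It is only an illustration, not part of the Sheets solution. The trimmed sample XML, its placeholder namespace URIs, and the helper names local_name, rows_by_date and closest_forward are all assumptions made for this example.

```python
import bisect
import xml.etree.ElementTree as ET

# Two trimmed entries modelled on the sample in the question. The namespace
# URIs are placeholders (assumptions), not the feed's real ones.
SAMPLE = """<feed xmlns:d="urn:example:d" xmlns:m="urn:example:m">
  <entry><content><m:properties>
    <d:Id>6794</d:Id>
    <d:NEW_DATE>2017-02-24T00:00:00</d:NEW_DATE>
    <d:BC_1MONTH>0.4</d:BC_1MONTH>
    <d:BC_10YEAR>2.31</d:BC_10YEAR>
  </m:properties></content></entry>
  <entry><content><m:properties>
    <d:Id>6795</d:Id>
    <d:NEW_DATE>2017-02-27T00:00:00</d:NEW_DATE>
    <d:BC_1MONTH>0.44</d:BC_1MONTH>
    <d:BC_10YEAR>2.36</d:BC_10YEAR>
  </m:properties></content></entry>
</feed>"""


def local_name(element):
    """Return the tag without its '{namespace}' prefix, like XPath local-name()."""
    return element.tag.rsplit("}", 1)[-1]


def rows_by_date(xml_text):
    """Map each NEW_DATE string to its sibling values, leaving out Id."""
    rows = {}
    for props in ET.fromstring(xml_text).iter():
        children = list(props)
        dates = [c.text for c in children if local_name(c) == "NEW_DATE"]
        if dates:
            rows[dates[0]] = [c.text for c in children if local_name(c) != "Id"]
    return rows


def closest_forward(rows, target):
    """Return the row for target, or for the earliest later date; None if none."""
    dates = sorted(rows)  # ISO 8601 strings sort chronologically
    i = bisect.bisect_left(dates, target)  # first date >= target
    return rows[dates[i]] if i < len(dates) else None


rows = rows_by_date(SAMPLE)
# 2017-02-25 was a Saturday with no data, so the 2017-02-27 row is selected.
print(closest_forward(rows, "2017-02-25T00:00:00"))
```

Searching an ascending list for the first date not earlier than the target gives the same result as MATCH with -1 on a descending sort: the smallest available date greater than or equal to the requested one.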