Extract specific data from webpage

Basically, this is my code:

int main()
{
    CURL *curl;
    FILE *fp;
    CURLcode res;
    std::string readBuffer;
    curl = curl_easy_init();
    char outfilename[FILENAME_MAX] = "C:\\Users\\admin\\desktop\\test.txt";
    if (curl) {
        fp = fopen(outfilename, "wb");
        curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com");
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "user=123&pass=123");
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);  // option takes a long
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);  // defined elsewhere
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
        res = curl_easy_perform(curl);
        Sleep(1000);
        curl_easy_cleanup(curl);
        fclose(fp);
    }
    return EXIT_SUCCESS;
}

The output is successfully saved in the text file.

My question is how to extract the content between specific tags.

For example, I want only the content between <bla> .............. </bla>.

What's the easiest way? Thank you.

Best answer

In your example, you are dumping the response from the website to a file. libcurl writes the data returned by the page exactly as it was received; it makes no attempt to restructure it.

You can get the data in memory instead by defining the write_data function yourself. It only needs to have the following signature:

size_t write_data(char *ptr, size_t size, size_t nmemb, void *userdata);
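As a minimal sketch of such a callback, the version below appends each chunk to a std::string. It assumes that userdata points at a std::string (for instance the readBuffer variable from the question), which is a choice made here for illustration and not something libcurl requires.

```cpp
#include <cstddef>
#include <string>

// Write callback with the signature shown above.
// Assumption: userdata points at a std::string passed via CURLOPT_WRITEDATA.
size_t write_data(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    const size_t total = size * nmemb;   // number of bytes in this chunk
    std::string *buffer = static_cast<std::string *>(userdata);
    buffer->append(ptr, total);          // the chunk is not null-terminated
    return total;                        // returning a different count tells libcurl the write failed
}

// Registration, shown as comments so the block compiles without libcurl:
//   std::string readBuffer;
//   curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
//   curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
```

The callback may be called several times for one response, so it appends rather than assigns. After curl_easy_perform returns, the string should hold the whole body.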

Once you have the data in memory, you can parse it and restructure it as required. See the example here for using the write_data function.

For XML parsing, you may use this sample code.
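For the simple case in the question, where only the text between one known pair of tags is needed, a plain string search may be enough. The sketch below is not an XML or HTML parser: the helper name extract_between is hypothetical, and it assumes the opening tag has no attributes and is not nested.

```cpp
#include <string>

// Returns the text between the first "<tag>" and the next "</tag>" in html,
// or an empty string if either marker is missing.
// Assumption: tags appear exactly as "<tag>" and "</tag>", with no attributes.
std::string extract_between(const std::string &html, const std::string &tag)
{
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";

    const std::string::size_type start = html.find(open);
    if (start == std::string::npos) {
        return "";                       // opening tag not found
    }
    const std::string::size_type content = start + open.size();
    const std::string::size_type end = html.find(close, content);
    if (end == std::string::npos) {
        return "";                       // closing tag not found
    }
    return html.substr(content, end - content);
}
```

With the response held in a std::string, a call such as extract_between(readBuffer, "bla") would return the text inside the first <bla> element. For anything beyond this (attributes, nesting, malformed markup), a real parser is the safer route.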
