Why might a downloaded file be corrupted?

Last updated: 2024-10-25 12:22:56

Problem description

I have been trying to download a PDF file from the following URL: pdfobject/markup/examples/full-browser-window.html

Josh M suggested the following solution, which works on his computer. However, I cannot get it to work. The code below does save a file to the destination, but the downloaded file is only 984 bytes (it should normally be about 18 KB), so the file is corrupted. I cannot think of any reason why this would happen.

    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLConnection;
    import java.nio.file.Files;
    import java.nio.file.StandardOpenOption;

    public final class FileDownloader {

        private FileDownloader(){}

        public static void main(String args[]) throws IOException{
            download("pdfobject/markup/examples/full-browser-window.html", new File("C:\\Users\\Owner\\Desktop\\temporary\\myFile.pdf"));
            download2("pdfobject/markup/examples/full-browser-window.html", new File("C:\\Users\\Owner\\Desktop\\temporary\\myFile2.pdf"));
        }

        // Variant 1: buffer the whole response in memory, then write it to the file.
        public static void download(final String url, final File destination) throws IOException {
            final URLConnection connection = new URL(url).openConnection();
            connection.setConnectTimeout(60000);
            connection.setReadTimeout(60000);
            connection.addRequestProperty("User-Agent", "Mozilla/5.0");
            final ByteArrayOutputStream baos = new ByteArrayOutputStream();
            final byte[] buffer = new byte[2048];
            int read;
            final InputStream input = connection.getInputStream();
            while((read = input.read(buffer)) > -1)
                baos.write(buffer, 0, read);
            baos.flush();
            Files.write(destination.toPath(), baos.toByteArray(), StandardOpenOption.WRITE);
            input.close();
        }

        // Variant 2: stream the response straight into the file.
        public static void download2(final String url, final File destination) throws IOException {
            final URLConnection connection = new URL(url).openConnection();
            connection.setConnectTimeout(60000);
            connection.setReadTimeout(60000);
            connection.addRequestProperty("User-Agent", "Mozilla/5.0");
            final FileOutputStream output = new FileOutputStream(destination, false);
            final byte[] buffer = new byte[2048];
            int read;
            final InputStream input = connection.getInputStream();
            while((read = input.read(buffer)) > -1)
                output.write(buffer, 0, read);
            output.flush();
            output.close();
            input.close();
        }
    }

Recommended answer

You are downloading a .html URL, and that page only references the PDF as an embedded object. Unlike a browser, Java does not process the markup, so you are saving the HTML, not the PDF. Have a look inside the saved file. For your assistance, here it is:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "www.w3/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="www.w3/1999/xhtml">
    <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      <title>Embedding a PDF using static HTML markup: Full-browser window (100% width/height)</title>
      <style type="text/css">  </style>
    </head>
    <body>
      <object data="/pdf/sample.pdf#toolbar=1&navpanes=0&scrollbar=1&page=1&view=FitH" type="application/pdf" width="100%" height="100%">
        <p>It appears you don't have a PDF plugin for this browser. No biggie... you can <a href="/pdf/sample.pdf">click here to download the PDF file.</a></p>
      </object>
    </body>
    </html>
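As a rough illustration of what this means for the Java code (a sketch, not part of the original answer): the program could check whether the bytes it received actually look like a PDF, and if not, pull the real PDF address out of the `<object data="...">` attribute and resolve it against the page's address. The class name `PdfLinkFinder` and its two methods are hypothetical helpers introduced here for illustration. The sketch assumes the `data` attribute is double-quoted, as in the markup above, and relies on the fact that well-formed PDF files normally begin with `%PDF-`.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, for illustration only. */
final class PdfLinkFinder {

    // Captures the double-quoted data attribute of the first <object> tag,
    // e.g. data="/pdf/sample.pdf#toolbar=1".
    private static final Pattern OBJECT_DATA = Pattern.compile(
            "<object\\b[^>]*?\\bdata\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    private PdfLinkFinder() {}

    /** Returns true if the content starts with the PDF signature "%PDF-". */
    static boolean looksLikePdf(byte[] content) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (content.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (content[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Finds the embedded object's data attribute, drops any #fragment
     * (viewer options such as toolbar=1), and resolves the remaining
     * reference against the address of the page it came from.
     */
    static Optional<URI> extractPdfUri(String html, URI pageUri) {
        Matcher m = OBJECT_DATA.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        String ref = m.group(1);
        int hash = ref.indexOf('#');
        if (hash >= 0) {
            ref = ref.substring(0, hash);
        }
        return Optional.of(pageUri.resolve(ref));
    }
}
```

If `looksLikePdf` returns false for the downloaded bytes, `extractPdfUri` can be given the saved HTML and the page's full address, and the resulting URI passed to the question's `download` method in place of the .html address. Matching HTML with a regular expression is fragile; a real HTML parser would be the sturdier choice for anything beyond a simple page like this one.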