I have been trying to download a PDF file from the following URL: pdfobject/markup/examples/full-browser-window.html
Josh M suggested the following solution, which works on his computer, but I cannot get it to work. The code below does save a file to the destination, but the downloaded file is only 984 bytes (it should be about 18 KB), so the file is corrupted. I cannot think of any reason why this would happen.
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLConnection;
    import java.nio.file.Files;
    import java.nio.file.StandardOpenOption;

    public final class FileDownloader {

        private FileDownloader() {}

        public static void main(String args[]) throws IOException {
            download("pdfobject/markup/examples/full-browser-window.html",
                    new File("C:\\Users\\Owner\\Desktop\\temporary\\myFile.pdf"));
            download2("pdfobject/markup/examples/full-browser-window.html",
                    new File("C:\\Users\\Owner\\Desktop\\temporary\\myFile2.pdf"));
        }

        // Variant 1: buffer the whole response in memory, then write it to disk.
        public static void download(final String url, final File destination) throws IOException {
            final URLConnection connection = new URL(url).openConnection();
            connection.setConnectTimeout(60000);
            connection.setReadTimeout(60000);
            connection.addRequestProperty("User-Agent", "Mozilla/5.0");
            final ByteArrayOutputStream baos = new ByteArrayOutputStream();
            final byte[] buffer = new byte[2048];
            int read;
            final InputStream input = connection.getInputStream();
            while ((read = input.read(buffer)) > -1)
                baos.write(buffer, 0, read);
            baos.flush();
            Files.write(destination.toPath(), baos.toByteArray(), StandardOpenOption.WRITE);
            input.close();
        }

        // Variant 2: stream the response straight into the destination file.
        public static void download2(final String url, final File destination) throws IOException {
            final URLConnection connection = new URL(url).openConnection();
            connection.setConnectTimeout(60000);
            connection.setReadTimeout(60000);
            connection.addRequestProperty("User-Agent", "Mozilla/5.0");
            final FileOutputStream output = new FileOutputStream(destination, false);
            final byte[] buffer = new byte[2048];
            int read;
            final InputStream input = connection.getInputStream();
            while ((read = input.read(buffer)) > -1)
                output.write(buffer, 0, read);
            output.flush();
            output.close();
            input.close();
        }
    }

Recommended answer
You are downloading an .html URL. That page only references the PDF as an embedded object. Unlike a browser, Java does not process the HTML and fetch the embedded object, so you are saving the HTML page, not the PDF. Open the saved file and have a look inside. For your assistance, here it is:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "www.w3/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="www.w3/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Embedding a PDF using static HTML markup: Full-browser window (100% width/height)</title>
    <!-- This example created for PDFObject by Philip Hutchison (www.pipwerks) -->
    <style type="text/css">
    <!--
    html { height: 100%; }
    body { margin: 0; padding: 0; height: 100%; }
    p { padding: 1em; }
    object { display: block; }
    -->
    </style>
    </head>
    <body>
    <object data="/pdf/sample.pdf#toolbar=1&navpanes=0&scrollbar=1&page=1&view=FitH" type="application/pdf" width="100%" height="100%">
    <p>It appears you don't have a PDF plugin for this browser. No biggie... you can <a href="/pdf/sample.pdf">click here to download the PDF file.</a></p>
    </object>
    </body>
    </html>
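Following the answer above, one way to get the real PDF is to read the data attribute of the <object> tag, resolve it against the page URL, and download that instead. The sketch below is only an illustration under some assumptions: the class name EmbeddedPdfResolver and its method names are made up for this example, the regular expression is tuned to markup like the page shown above (a proper HTML parser would be more robust), and InputStream.readAllBytes needs Java 9 or later. The page URL in the question appears without its scheme and host, so a full URL has to be supplied by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
public final class EmbeddedPdfResolver {

    // Captures the data="..." attribute of the first <object> tag.
    private static final Pattern OBJECT_DATA = Pattern.compile(
            "<object\\b[^>]*?\\bdata\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    private EmbeddedPdfResolver() {}

    /** Returns the raw data attribute of the first <object> tag, if there is one. */
    public static Optional<String> findObjectData(String html) {
        Matcher m = OBJECT_DATA.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Drops the viewer fragment (#toolbar=...) and resolves the path against the page URL. */
    public static URI resolvePdfUri(URI pageUri, String dataAttribute) {
        int hash = dataAttribute.indexOf('#');
        String withoutFragment = hash >= 0 ? dataAttribute.substring(0, hash) : dataAttribute;
        return pageUri.resolve(withoutFragment);
    }

    /** PDF files start with the ASCII bytes "%PDF-"; an HTML page does not. */
    public static boolean looksLikePdf(byte[] content) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (content.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (content[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Fetches the HTML page, locates the embedded PDF, and saves the PDF to destination. */
    public static void download(URI pageUri, Path destination) throws IOException {
        URLConnection page = pageUri.toURL().openConnection();
        page.addRequestProperty("User-Agent", "Mozilla/5.0");
        String html;
        try (InputStream in = page.getInputStream()) {
            html = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        String data = findObjectData(html)
                .orElseThrow(() -> new IOException("No <object data=...> found in " + pageUri));

        URLConnection pdf = resolvePdfUri(pageUri, data).toURL().openConnection();
        pdf.addRequestProperty("User-Agent", "Mozilla/5.0");
        try (InputStream in = pdf.getInputStream()) {
            Files.copy(in, destination, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A call would look like EmbeddedPdfResolver.download(URI.create(fullPageUrl), destination.toPath()), where fullPageUrl is the complete address of the page. Running looksLikePdf over the bytes of the 984-byte file from the question should return false, which is a quick way to confirm that HTML was saved rather than a PDF. If the PDF address is already known, it is simpler to pass it directly to the original download method.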