Any examples of loading to BigQuery using a POST request and Java client library?

Updated: 2024-10-23 23:30:48

Does anyone have any examples of creating a new insert job for BigQuery using both:

- the BigQuery Java client library
- a load job created from a POST request, as documented here: https://developers.google.com/bigquery/loading-data-into-bigquery#loaddatapostrequest

Best answer

You need to call the bigquery.jobs().insert(...) method.

I don't know what you have done so far, but you should at least have an authenticated API client, like:

    bigquery = new Bigquery.Builder(HTTP_TRANSPORT, JSON_FACTORY, credentials)
            .setApplicationName("...")
            .build();

This is a simplified version of an insertRows method I wrote using the google-http-client library for Java and the bigquery-api (you should also check that the dataset exists, validate IDs, etc.):

    public Long insertRows(String projectId, String datasetId, String tableId,
                           InputStream schema, AbstractInputStreamContent data) {
        try {
            // Defining table fields
            ObjectMapper mapper = new ObjectMapper();
            List<TableFieldSchema> schemaFields =
                    mapper.readValue(schema, new TypeReference<List<TableFieldSchema>>(){});
            TableSchema tableSchema = new TableSchema().setFields(schemaFields);

            // Table reference
            TableReference tableReference = new TableReference()
                    .setProjectId(projectId)
                    .setDatasetId(datasetId)
                    .setTableId(tableId);

            // Load job configuration
            JobConfigurationLoad loadConfig = new JobConfigurationLoad()
                    .setDestinationTable(tableReference)
                    .setSchema(tableSchema)
                    // Data in JSON format (could be CSV)
                    .setSourceFormat("NEWLINE_DELIMITED_JSON")
                    // Table is created if it does not exist
                    .setCreateDisposition("CREATE_IF_NEEDED")
                    // Append data (do not overwrite existing data)
                    .setWriteDisposition("WRITE_APPEND");
            // If your data are coming from Google Cloud Storage
            //.setSourceUris(...);

            // Load job
            Job loadJob = new Job()
                    .setJobReference(
                            new JobReference()
                                    .setJobId(Joiner.on("-").join("INSERT", projectId, datasetId,
                                            tableId, DateTime.now().toString("dd-MM-yyyy_HH-mm-ss-SSS")))
                                    .setProjectId(projectId))
                    .setConfiguration(new JobConfiguration().setLoad(loadConfig));

            // Job execution
            Job createTableJob = bigquery.jobs().insert(projectId, loadJob, data).execute();
            // If loading data from Google Cloud Storage
            //createTableJob = bigquery.jobs().insert(projectId, loadJob).execute();

            String jobId = createTableJob.getJobReference().getJobId();

            // Wait for job completion
            createTableJob = waitForJob(projectId, createTableJob);
            Long rowCount = createTableJob != null
                    ? createTableJob.getStatistics().getLoad().getOutputRows()
                    : 0L;
            log.info("{} rows inserted in table '{}' (dataset: '{}', project: '{}')",
                    rowCount, tableId, datasetId, projectId);
            return rowCount;
        } catch (IOException e) {
            throw Throwables.propagate(e);
        }
    }
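The method above calls a waitForJob helper that the answer does not show. Its real implementation is unknown; the following is only a minimal sketch of the polling idea, assuming the job state is read as a string such as PENDING, RUNNING or DONE. JobPoller and waitUntilDone are hypothetical names, and the state source is passed in as a Supplier so the sketch runs without the BigQuery library. With the client above, the supplier would presumably wrap something like bigquery.jobs().get(projectId, jobId).execute().getStatus().getState().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

/** Hypothetical helper, not part of the BigQuery client library. */
final class JobPoller {

    /**
     * Polls fetchState until it returns "DONE" or maxAttempts polls have been made.
     * Returns the number of polls performed; throws if the job never reports DONE.
     */
    static int waitUntilDone(Supplier<String> fetchState, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if ("DONE".equals(fetchState.get())) {
                return attempt;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for job", e);
            }
        }
        throw new IllegalStateException("Job not DONE after " + maxAttempts + " polls");
    }

    public static void main(String[] args) {
        // Stand-in for the real API call: a fixed sequence of job states.
        Iterator<String> states = Arrays.asList("PENDING", "RUNNING", "DONE").iterator();
        int polls = waitUntilDone(states::next, 10, 1L);
        System.out.println("Job finished after " + polls + " polls");
    }
}
```

Note that DONE only means the job has stopped running. A real helper should probably also inspect getStatus().getErrorResult() on the returned Job before reading the load statistics.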

I don't know the format of your data, but if you are using files, you can add a function like:

    public Long insertRows(String projectId, String datasetId, String tableId,
                           File schema, File data) {
        try {
            return insertRows(projectId, datasetId, tableId,
                    new FileInputStream(schema),
                    new FileContent(MediaType.OCTET_STREAM.toString(), data));
        } catch (FileNotFoundException e) {
            throw Throwables.propagate(e);
        }
    }
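To try the file-based overload, you need a schema file and a data file. The sketch below shows one plausible shape for both, assuming the schema is a JSON array of field definitions (which is what the List<TableFieldSchema> parsing above suggests) and the data is newline-delimited JSON, one object per line. LoadFileSketch, toNdjson, and the name/age fields are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that prepares input files for a load job. */
final class LoadFileSketch {

    /** Puts each JSON object on its own line, as NEWLINE_DELIMITED_JSON expects. */
    static String toNdjson(List<String> jsonObjects) {
        StringBuilder sb = new StringBuilder();
        for (String obj : jsonObjects) {
            if (obj.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("A row must not contain a raw newline");
            }
            sb.append(obj).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed schema layout: a JSON array of {name, type} field definitions.
        String schemaJson = "[{\"name\":\"name\",\"type\":\"STRING\"},"
                + "{\"name\":\"age\",\"type\":\"INTEGER\"}]";
        String dataNdjson = toNdjson(Arrays.asList(
                "{\"name\":\"Alice\",\"age\":30}",
                "{\"name\":\"Bob\",\"age\":25}"));

        Path schemaFile = Files.createTempFile("schema", ".json");
        Path dataFile = Files.createTempFile("data", ".json");
        Files.write(schemaFile, schemaJson.getBytes(StandardCharsets.UTF_8));
        Files.write(dataFile, dataNdjson.getBytes(StandardCharsets.UTF_8));

        // These could then be passed to the overload above, for example:
        // insertRows(projectId, datasetId, tableId, schemaFile.toFile(), dataFile.toFile());
        System.out.println("Schema file: " + schemaFile);
        System.out.println("Data file: " + dataFile);
    }
}
```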

Published: 2023-07-28 19:04:00
Article link: https://www.elefans.com/category/jswz/34/1308175.html