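A deep-learning model for predicting fluxes
Contents
0. Preparation
1. Feature selection
2. Network architecture
3. Optimization method
4. Results
0. Preparation
Oh, by the way:
the joint-observation data turned out to have a lot of missing values, so I decided to use the COARE data instead.
First, load some packages.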
# Data handling and preprocessing
import csv
import os
import numpy as np
import pandas as pd
from scipy.io import loadmat
# PyTorch
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
# Plotting
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
Then read the data:
data = np.loadtxt('coare.txt')
The data looks roughly like this:
'''
-------------------------------------------------------------------------------
0  Date: YYMMDDHHmmss, YY=year, MM=month, DD=day, HH=hour, mm=minute, ss=sec
1  Us: ship speed (as described above)
2  U: true wind speed at 15-m height
3  Tru: true wind direction rel. to N (meteorological convention)
4  Rel: relative wind direction
5  Hed: the direction the ship's bow is pointing
6  Ts: sea surface temp (no cool skin correction)
7  T: Vaisala air temperature (about 15 m)
8  qs: sea surface specific humidity (g/kg) (no cool skin correction)
9  q: Vaisala air specific humidity (about 15 m)
-------------------------------------------------------------------------------
10 Hsc: covariance sensible heat flux
11 Hsi: inertial sensible heat flux
12 Hsb: bulk sensible heat flux
13 Hlc: covariance latent heat flux
14 Hli: inertial latent heat flux
15 Hlb: bulk latent heat flux
16 Tuc: covariance surface stress (-wu part only)
17 Tui: inertial-dissipation surface stress
18 Tub: bulk surface stress
-------------------------------------------------------------------------------
19 Rs: solar irradiance
20 Rl: longwave irradiance
21 Rain: precipitation (mm/hr)
22 J: ship plume/contamination index (0 implies good conditions)
23 Oph: standard deviation of OPHIR hygrometer clear channel counts (<15 implies reasonably clean optics)
24 Tlt: mean wind vector tilt, degrees (<10 ok covariances)
25 Jm: ship maneuver/contamination index, m/s (<2 implies good conditions)
-------------------------------------------------------------------------------
26 Ct: sonic temperature structure function parameter (K^2/m^.667)
27 Cq: water vapor structure function parameter ((g/m^3)/m^.667)
28 Cu: streamwise velocity structure function parameter ((m/s)^2/m^.667)
29 Cw: vertical velocity structure function parameter ((m/s)^2/m^.667)
30 Hr: sensible heat flux due to precipitation at droplet wet-bulb T
31 To: OPHIR air temperature
32 Qo: OPHIR specific humidity
-------------------------------------------------------------------------------
33 Lat: Latitude
34 Lon: Longitude
-------------------------------------------------------------------------------
These are the variables in each of the 35 columns; the leading number is the column index (starting at 0).
'''
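I will need to refer back to this list later, so I keep it in the code as a docstring.
The next step is to choose which columns to use for the regression,
and to get the data into shape for training.
I need a training set and a test set.
The validation set will be split off from the training set later.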
feature = [2,6,8,9,19,21,33]
feature_for_train = [2,6,8,9,19,21,33,13]
feature_count = np.size(feature)
#feature = list(range(1,10))
#feature.extend([19,20,21,22,23,24,25,33,34])
#feature_for_train = list(range(1,10))
#feature_for_train.extend([19,20,21,22,23,24,25,33,34,13])
data_train1 = data[0:4000,:]
data_test1 = data[4000:4806,:]
hlc = data[:,13]
hlb = data[:,15]
data_train = data_train1[:,feature_for_train]
data_test = data_test1[:,feature]
data_train = pd.DataFrame(data_train)
data_test = pd.DataFrame(data_test)
data_train.to_csv('train.csv',index=False)
data_test.to_csv('test.csv',index=False)
# Save to CSV after all. Start by regressing the latent heat flux.
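The index lists above decide both which columns are kept and in what order. A minimal sketch of that step, using a synthetic 35-column array in place of the COARE file (the values, the 80/20 row split, and the name target_col are assumptions for illustration only):

```python
import numpy as np

# Synthetic stand-in for the 35-column COARE table (values are NOT real data).
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 35))

feature = [2, 6, 8, 9, 19, 21, 33]          # predictor columns
target_col = 13                              # Hlc, covariance latent heat flux
feature_for_train = feature + [target_col]   # target appended as the last column

train = data[:80, feature_for_train]         # predictors + target
test = data[80:, feature]                    # predictors only

# Fancy indexing keeps the order of the index list, so the target is column -1.
print(train.shape, test.shape)
print(np.array_equal(train[:, -1], data[:80, target_col]))
```

Putting the target last is what lets a loader later read it back with data[:, -1] without knowing the original column number.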
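I start with latent heat because I like studying it most; the other two fluxes have already been handled very well by other people, so there is not much left to do there.
I will still model them later on,
it is just that latent heat is the one I prefer.
Next comes debugging the deep-learning code. I have half-finished code from a course I took earlier, so I do not need to type everything from scratch, which makes things easier.
Below I show only the key parts.
I plan to cover:
1. Feature selection
2. Network architecture
3. Optimization method
4. Results
These are the most critical points in deep learning, the real core of it.
1. Feature selection
The first step is choosing the input parameters.
We have to focus on what matters: the biggest mistake is to compute unimportant variables in fine detail while throwing away the important parameters. That is roughly what Landau said.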
feature = [2,6,8,9,19,21,33]
feature_for_train = [2,6,8,9,19,21,33,13]
feature_count = np.size(feature)

class myDataset(Dataset):
    def __init__(self, path, mode='train', target_only=False):
        self.mode = mode
        # Read data into numpy arrays (skip the CSV header row)
        with open(path, 'r') as fp:
            data = list(csv.reader(fp))
            data = np.array(data[1:])[:, :].astype(float)
        if not target_only:
            feats = list(range(feature_count))
        else:
            feats = list(range(feature_count))
            # feats = list(range(40))
            # feats.extend([57,75])
            # TODO
        if mode == 'test':
            # Testing data
            data = data[:, feats]
            self.data = torch.FloatTensor(data)
        else:
            # Training data (train/dev sets); the target is the last column
            target = data[:, -1]
            data = data[:, feats]
            # Splitting training data into train & dev sets
            if mode == 'train':
                indices = [i for i in range(len(data)) if i % 10 != 0]
            elif mode == 'dev':
                indices = [i for i in range(len(data)) if i % 10 == 0]
            # Convert data into PyTorch tensors
            self.data = torch.FloatTensor(data[indices])
            self.target = torch.FloatTensor(target[indices])
        # Normalize features (you may remove this part to see what will happen)
        self.data[:, :] = (self.data[:, :] - self.data[:, :].mean(dim=0, keepdim=True)) / self.data[:, :].std(dim=0, keepdim=True)
        self.dim = self.data.shape[1]
        print('Finished reading the {} set of COARE Dataset ({} samples found, each dim = {})'.format(mode, len(self.data), self.dim))

    def __getitem__(self, index):
        # Returns one sample at a time
        if self.mode in ['train', 'dev']:
            # For training
            return self.data[index], self.target[index]
        else:
            # For testing (no target)
            return self.data[index]

    def __len__(self):
        # Returns the size of the dataset
        return len(self.data)
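A class is written here to handle the data.
In practice, feature selection comes down to the first two lines; the rest hardly ever needs to change.
2. Network architecture
The network architecture is the core of the model. All later tuning only adjusts details within this framework, so the architecture matters a great deal.
Which architecture suits which problem has to be decided case by case, and I cannot give a clear rule for it.
For this problem,
this is what I use for now, although I am still tuning it.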
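The class splits train and dev by row index (every 10th row goes to dev) and normalizes each split with its own mean and standard deviation. A minimal sketch of the same split on synthetic tensors follows; as a variation that is an assumption and not what the class does, it reuses the training statistics for the dev set so both splits share one scale. The tensor sizes and batch size are made up for illustration.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)
x = torch.randn(200, 7) * 5 + 3   # synthetic features, not COARE data
y = torch.randn(200)              # synthetic target

# Same split rule as in the class: every 10th row goes to the dev set.
dev_idx = [i for i in range(len(x)) if i % 10 == 0]
train_idx = [i for i in range(len(x)) if i % 10 != 0]

# Variation (assumption): z-score both splits with the TRAINING mean/std.
mean = x[train_idx].mean(dim=0, keepdim=True)
std = x[train_idx].std(dim=0, keepdim=True)
x_train = (x[train_idx] - mean) / std
x_dev = (x[dev_idx] - mean) / std

train_loader = DataLoader(TensorDataset(x_train, y[train_idx]), batch_size=50, shuffle=True)
dev_loader = DataLoader(TensorDataset(x_dev, y[dev_idx]), batch_size=50, shuffle=False)

xb, yb = next(iter(train_loader))
print(xb.shape, yb.shape)  # torch.Size([50, 7]) torch.Size([50])
```

Whether shared statistics help here is untested; the point is only that the test set would need the same treatment as the data the model was trained on.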
class NeuralNet(nn.Module):
    ''' A simple fully-connected deep neural network '''
    def __init__(self, input_dim):
        super(NeuralNet, self).__init__()
        # Define your neural network here
        # TODO: How to modify this model to achieve better performance?
        # Note: the layers must be stored as an attribute; assigning to `self` would discard them
        self.net = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1)
        )
        # Mean squared error loss
        self.criterion = nn.MSELoss(reduction='mean')

    def forward(self, x):
        ''' Given input of size (batch_size x input_dim), compute output of the network '''
        return self.net(x).squeeze(1)

    def cal_loss(self, pred, target):
        ''' Calculate loss '''
        # TODO: you may implement L2 regularization here
        return self.criterion(pred, target)
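The details still need more study and more reading of the literature.
3. Optimization method
There are many optimization methods; the two that come to mind immediately are SGD and Adam.
I tried both.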
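The TODO in cal_loss mentions L2 regularization. One possible way to fill it in is sketched below, as an untuned illustration rather than the configuration actually used: the class name SmallNet and the value of l2_lambda are hypothetical, and the penalty is applied to all parameters, biases included.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Hypothetical variant of the network, for illustration only."""
    def __init__(self, input_dim, l2_lambda=1e-4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )
        self.criterion = nn.MSELoss(reduction='mean')
        self.l2_lambda = l2_lambda  # assumed value, would need tuning

    def forward(self, x):
        # (batch_size, input_dim) -> (batch_size,)
        return self.net(x).squeeze(1)

    def cal_loss(self, pred, target):
        mse = self.criterion(pred, target)
        # Sum of squared weights and biases over all layers
        l2 = sum((p ** 2).sum() for p in self.parameters())
        return mse + self.l2_lambda * l2

torch.manual_seed(0)
model = SmallNet(input_dim=7)
x, y = torch.randn(16, 7), torch.randn(16)
pred = model(x)
loss = model.cal_loss(pred, y)
print(pred.shape, loss.item())
```

A similar effect can often be had by passing weight_decay to the optimizer instead, which keeps the loss function itself a plain MSE.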
config = {
    'n_epochs': 3000,        # maximum number of epochs
    'batch_size': 500,       # mini-batch size for dataloader
    'optimizer': 'Adam',     # optimization algorithm (optimizer in torch.optim)
    'optim_hparas': {        # hyper-parameters for the optimizer (depends on which optimizer you are using)
        # 'lr': 0.001,       # learning rate of SGD
        # 'momentum': 0.09   # momentum for SGD
    },
    'early_stop': 10000,     # early stopping epochs (the number epochs since your model's last improvement)
}
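My bigger problem right now is model bias,
so I think this optimization setup is good enough for the moment.
4. Results
I then trained the model, got some first results, and kept adjusting; by now there are results worth showing.
The predictions look quite good, and it took me only one day to solve a problem that other scientists worked on for many years. Of course, that is an unfair comparison, because times have changed.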
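How a config of this kind is typically consumed can be sketched as follows. This is an assumption about the training loop, which the post does not show: the optimizer class is looked up by name in torch.optim, and training stops once the dev loss has not improved for 'early_stop' epochs. The data are synthetic, the loop is full-batch for brevity, and the config values are shrunk so the sketch runs quickly.

```python
import torch
import torch.nn as nn

config = {
    'n_epochs': 200,
    'optimizer': 'Adam',
    'optim_hparas': {'lr': 1e-2},   # illustrative value, not the post's setting
    'early_stop': 20,
}

torch.manual_seed(0)
x = torch.randn(256, 7)
y = x @ torch.randn(7) + 0.1 * torch.randn(256)   # synthetic linear target
x_tr, y_tr, x_dev, y_dev = x[:200], y[:200], x[200:], y[200:]

model = nn.Sequential(nn.Linear(7, 64), nn.ReLU(), nn.Linear(64, 1))
criterion = nn.MSELoss()
# Look up the optimizer class by name, e.g. torch.optim.Adam or torch.optim.SGD
optimizer = getattr(torch.optim, config['optimizer'])(
    model.parameters(), **config['optim_hparas'])

best_dev, stale, first_dev = float('inf'), 0, None
for epoch in range(config['n_epochs']):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_tr).squeeze(1), y_tr)
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        dev_loss = criterion(model(x_dev).squeeze(1), y_dev).item()
    if first_dev is None:
        first_dev = dev_loss
    if dev_loss < best_dev:
        best_dev, stale = dev_loss, 0     # improvement: reset the counter
    else:
        stale += 1
    if stale > config['early_stop']:      # no improvement for too long
        break

print(first_dev, best_dev)
```

If the actual loop has this form, an 'early_stop' of 10000 together with 'n_epochs' of 3000 means early stopping can never trigger, so training always runs the full 3000 epochs.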
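"Quite good" can be made concrete with a few scores of prediction against observation. A minimal sketch, assuming bias, RMSE, and correlation are the measures of interest; the helper flux_scores is hypothetical and the numbers below come from synthetic arrays, not from the model or the COARE data:

```python
import numpy as np

def flux_scores(pred, obs):
    """Bias, RMSE and Pearson correlation of predicted vs. observed flux."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    err = pred - obs
    return {
        'bias': err.mean(),
        'rmse': np.sqrt((err ** 2).mean()),
        'corr': np.corrcoef(pred, obs)[0, 1],
    }

# Synthetic example: "observations" plus noise standing in for predictions.
rng = np.random.default_rng(0)
obs = rng.normal(100.0, 30.0, size=500)
pred = obs + rng.normal(0.0, 10.0, size=500)
scores = flux_scores(pred, obs)
print(scores)
```

The same function could score the bulk estimate (column 15, Hlb) against the covariance flux (column 13, Hlc) over the test rows, giving a baseline to compare the network with.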