I'm looking at Keras' example for convolutional neural networks. (See https://github.com/fchollet/keras/blob/master/examples/imdb_cnn.py for example.) However, I cannot figure out what they mean by the "maxlen" parameter. Would it have something to do with padding? It isn't the maximum number of features; they have a max_features parameter for that.
Best answer
The maxlen parameter is the maximum length of a text sample, measured in words.
In the Keras code example you have these settings:
    # set parameters:
    max_features = 5000
    maxlen = 400
    ...
    embedding_dims = 50

This means you have a vocabulary of 5000 words, each word is embedded into a feature vector with 50 dimensions, and each text sample can be up to 400 words long.
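To make the roles of the three settings concrete, here is a minimal pure-Python sketch of what the embedding step does to one sample. It is not the Keras implementation; the names `embedding_table`, `sample` and `embedded` are made up for illustration, and the table is filled with random numbers where a real model would learn the values.

```python
import random

max_features = 5000   # vocabulary size: valid word indices are 0..4999
maxlen = 400          # each sample is a sequence of exactly 400 word indices
embedding_dims = 50   # each word index maps to a 50-dimensional vector

rng = random.Random(0)

# Hypothetical embedding table: one row of 50 floats per vocabulary word.
embedding_table = [
    [rng.uniform(-0.05, 0.05) for _ in range(embedding_dims)]
    for _ in range(max_features)
]

# One (already padded) text sample: maxlen word indices.
sample = [rng.randrange(max_features) for _ in range(maxlen)]

# Embedding lookup: replace every word index by its vector.
embedded = [embedding_table[idx] for idx in sample]

sample_shape = (len(sample),)
embedded_shape = (len(embedded), len(embedded[0]))
print(sample_shape, embedded_shape)
```

So max_features bounds the values inside a sample, maxlen fixes how many values a sample has, and embedding_dims sets the width of the vector each value is turned into.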
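Indirectly, this is also related to padding: text samples shorter than 400 words have to be padded to a length of 400. In the linked example this is done with sequence.pad_sequences(x_train, maxlen=maxlen), which also truncates samples that are longer than maxlen.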
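As a rough illustration of the padding described above, the following sketch pads and truncates lists of word indices by hand. `pad_to_maxlen` is a hypothetical helper, not a Keras function. It assumes zeros are added at the front and overly long samples lose their earliest words, which to my understanding matches the defaults of Keras' pad_sequences.

```python
def pad_to_maxlen(sequences, maxlen, value=0):
    """Return copies of the sequences, each exactly `maxlen` items long.

    Hypothetical helper for illustration only. Shorter sequences get
    `value` prepended; longer ones keep only their last `maxlen` items.
    """
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # Truncate from the front, keeping the last maxlen items.
            seq = seq[len(seq) - maxlen:]
        # Pad at the front until the length is exactly maxlen.
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


# Two toy "reviews" as word indices, with maxlen = 5 instead of 400.
samples = [[12, 7, 305], [4, 9, 18, 33, 2, 51]]
result = pad_to_maxlen(samples, maxlen=5)
print(result)  # [[0, 0, 12, 7, 305], [9, 18, 33, 2, 51]]
```

After this step every sample has the same length, so the whole data set can be stored as one array with maxlen columns.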
For 1D-ConvNets for text classification see also this paper and this blog post:
https://arxiv.org/abs/1408.5882
http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/