Constructing a Keras model

Updated: 2024-10-25 14:22:47

I don't understand what's happening in this code:

    def construct_model(use_imagenet=True):
        # line 1: how do we keep all layers of this model?
        model = keras.applications.InceptionV3(include_top=False,
                                               input_shape=(IMG_SIZE, IMG_SIZE, 3),
                                               weights='imagenet' if use_imagenet else None)
        new_output = keras.layers.GlobalAveragePooling2D()(model.output)
        new_output = keras.layers.Dense(N_CLASSES, activation='softmax')(new_output)
        model = keras.engine.training.Model(model.inputs, new_output)
        return model

Specifically, my confusion is, when we call the last constructor

model = keras.engine.training.Model(model.inputs, new_output)

we specify input layer and output layer, but how does it know we want all the other layers to stay?

In other words, we append new layers to the pre-trained model we load in line 1, ending in the new_output layer, and then in the final constructor (final line) we just create and return a model with the specified input and output layers; but how does it know what other layers we want in between?

Side question 1): What is the difference between keras.engine.training.Model and keras.models.Model?

Side question 2): What exactly happens when we do new_layer = keras.layers.Dense(...)(prev_layer)? Does the () operation return a new layer, and what does it do exactly?

Accepted answer

This model was created using the Keras functional API.

Basically it works like this (perhaps if you go to the "side question 2" below before reading this it may get clearer):

- You have an input tensor (you can see it as "input data" too)
- You create (or reuse) a layer
- You pass the input tensor to the layer (you "call" the layer with an input)
- You get an output tensor

You keep working with these tensors until you have created the entire graph.

But this hasn't created a "model" yet (one you can train and use for other things). All you have is a graph telling which tensors go where.

To create a model, you define its start and end points.


In the example:

- They take an existing model: model = keras.applications.InceptionV3(...)
- They want to expand this model, so they get its output tensor: model.output
- They pass this tensor as the input of a GlobalAveragePooling2D layer
- They get this layer's output tensor as new_output
- They pass this as input to yet another layer: Dense(N_CLASSES, ....)
- And get its output as new_output (this var was replaced as they are not interested in keeping its old value...)

But, as it works with the functional API, we don't have a model yet, only a graph. In order to create a model, we use Model defining the input tensor and the output tensor:

new_model = Model(old_model.inputs, new_output)

Now you have your model. If you assign it to another var, as I did (new_model), the old model will still exist in model. These models share the same layers, so whenever you train one of them, the other gets updated as well.
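As a small-scale sketch of this (assuming a TensorFlow 2.x style `from tensorflow import keras` import, and with a tiny hypothetical stand-in model instead of InceptionV3), you can check that the old and new models really do contain the same layer objects:

```python
from tensorflow import keras

# A tiny stand-in for the pre-trained model (instead of InceptionV3).
inp = keras.Input(shape=(32,))
x = keras.layers.Dense(16, activation='relu')(inp)
x = keras.layers.Dense(8, activation='relu')(x)
old_model = keras.Model(inp, x)

# Extend the graph from the old model's output tensor...
new_output = keras.layers.Dense(3, activation='softmax')(old_model.output)
# ...and define a new model by its input and output tensors.
new_model = keras.Model(old_model.inputs, new_output)

# Every layer of the old model is the very same object inside the new
# model, so training one model updates the weights seen by the other.
assert all(layer in new_model.layers for layer in old_model.layers)
```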


Question: how does it know what other layers we want in between?

When you do:

outputTensor = SomeLayer(...)(inputTensor)

you have a connection between the input and output. (Keras will use the inner TensorFlow mechanism and add these tensors and nodes to the graph.) The output tensor cannot exist without the input. The entire InceptionV3 model is connected from start to end. Its input tensor goes through all the layers to yield an output tensor. There is only one possible way for the data to flow, and the graph is that way.

When you get the output of this model and use it to get further outputs, all your new outputs are connected to this, and thus to the first input of the model.

Probably the attribute _keras_history that is added to the tensors is closely related to how it tracks the graph.

So, doing Model(old_model.inputs, new_output) will naturally follow the only way possible: the graph.

If you try doing this with tensors that are not connected, you will get an error.
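To make the graph-tracking idea concrete, here is a minimal pure-Python sketch (not Keras code, just an illustration in the same spirit as `_keras_history`): each layer call returns a tensor that remembers which layer produced it and which tensor fed that layer, so walking backwards from an output recovers every layer in between.

```python
class Tensor:
    """A toy tensor that records its history, like Keras's _keras_history."""
    def __init__(self, produced_by=None, input_tensor=None):
        self.produced_by = produced_by      # the layer that created this tensor
        self.input_tensor = input_tensor    # the tensor that fed that layer

class Layer:
    def __init__(self, name):
        self.name = name
    def __call__(self, x):
        # "Calling" the layer on a tensor returns a new, connected tensor.
        return Tensor(produced_by=self, input_tensor=x)

def layers_between(output_tensor):
    """Walk the graph backwards from an output tensor, collecting layer names."""
    names = []
    t = output_tensor
    while t.produced_by is not None:
        names.append(t.produced_by.name)
        t = t.input_tensor
    return list(reversed(names))

inp = Tensor()                              # an input tensor with no producer
hidden = Layer("pooling")(inp)
out = Layer("dense")(hidden)
print(layers_between(out))                  # -> ['pooling', 'dense']
```

This is why Model(inputs, outputs) needs nothing else: the chain of recorded connections between the two tensors already names every layer in between.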


Side question 1

Prefer to import from "keras.models". Basically, that module just imports from the other modules:

https://github.com/keras-team/keras/blob/master/keras/models.py

Notice that the file keras/models.py imports Model from keras.engine.training. So, it's the same thing.

Side question 2

It's not new_layer = keras.layers.Dense(...)(prev_layer).

It is output_tensor = keras.layers.Dense(...)(input_tensor).

You're doing two things in the same line:

- Creating a layer, with keras.layers.Dense(...)
- Calling the layer with an input tensor to get an output tensor

If you wanted to use the same layer with different inputs:

    denseLayer = keras.layers.Dense(...)  # creating a layer
    output1 = denseLayer(input1)  # calling a layer with an input and getting an output
    output2 = denseLayer(input2)  # calling the same layer on another input
    output3 = denseLayer(input3)  # again
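A hypothetical runnable version of this (assuming TensorFlow 2.x) shows that reusing one layer object never duplicates its weights; however many times it is called, there is still exactly one kernel and one bias:

```python
from tensorflow import keras

dense_layer = keras.layers.Dense(4)     # one layer object
input1 = keras.Input(shape=(8,))
input2 = keras.Input(shape=(8,))

output1 = dense_layer(input1)           # first call builds the weights
output2 = dense_layer(input2)           # second call reuses the same weights

# One kernel + one bias, no matter how many times the layer was called.
print(len(dense_layer.weights))         # -> 2
```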

Bonus - Creating a functional model that is equal to a sequential model

If you create this sequential model:

    model = Sequential()
    model.add(Layer1(...., input_shape=some_shape))
    model.add(Layer2(...))
    model.add(Layer3(...))

You're doing exactly the same as:

    inputTensor = Input(some_shape)
    outputTensor = Layer1(...)(inputTensor)
    outputTensor = Layer2(...)(outputTensor)
    outputTensor = Layer3(...)(outputTensor)
    model = Model(inputTensor, outputTensor)

What is the difference?

Well, functional API models are totally free to be built any way you want. You can create branches:

    out1 = Layer1(..)(inputTensor)
    out2 = Layer2(..)(inputTensor)

You can join tensors:

    joinedOut = Concatenate()([out1, out2])

With this, you can create anything you want: branches, gates, concatenations, additions, and all kinds of other fancy stuff that you can't do with a sequential model.
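Putting the two snippets above together, a minimal branched model (a hypothetical sketch, assuming TensorFlow 2.x) could look like this; a Sequential model has no way to express the fork and the Concatenate:

```python
from tensorflow import keras

inp = keras.Input(shape=(16,))

# Two branches fed by the same input tensor.
branch1 = keras.layers.Dense(8, activation='relu')(inp)
branch2 = keras.layers.Dense(4, activation='relu')(inp)

# Join the branches back into a single tensor.
joined = keras.layers.Concatenate()([branch1, branch2])
out = keras.layers.Dense(2, activation='softmax')(joined)

model = keras.Model(inp, out)
print(model.output_shape)               # -> (None, 2)
```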

In fact, a Sequential model is also a Model, but created for quick use in models without branches.

Published: 2023-08-07 01:52:00