Understanding Tensorflow LSTM models?

Question

I have some trouble understanding LSTM models in TensorFlow. For simplicity, let us consider the Example program.

I use the tflearn as a wrapper as it does all the initialization and other higher level stuff automatically.

Till line number 42 net = tflearn.input_data([None, 200]) it's pretty clear what happens.

You load a dataset into variables and make it of standard lengths in this case 200. for both the input variables and also the 2 classes present in this case are converted to one hot vectors.

What I would like to know here is how the LSTM takes the input and across how many samples does it predict the output?

What do these parameters indicate: n_words=20000 & net = tflearn.embedding(net, input_dim=20000, output_dim=128)?

My goal is to replicate the activity recognition dataset in the paper.

For example, I would like to input a 4096 vector as input to the lstm and the idea is to take 16 of such vectors and then produce the classification result.

I think the code would look like this but I don't know how the input to the LSTM should be given

from __future__ import division, print_function, absolute_import

import tflearn
from tflearn.data_utils import to_categorical, pad_sequences
from tflearn.datasets import imdb

train, val = something.load_data()
trainX, trainY = train #each X sample is a (16,4096) nd float64 
valX, valY = val #each Y is a one hot vector of 101 classes.

net = tflearn.input_data([None, 16,4096])
net = tflearn.embedding(net, input_dim=4096, output_dim=256)
net = tflearn.lstm(net, 256)
net = tflearn.dropout(net, 0.5)
net = tflearn.lstm(net, 256)
net = tflearn.dropout(net, 0.5)
net = tflearn.fully_connected(net, 101, activation='softmax')
net = tflearn.regression(net, optimizer='adam',
                         loss='categorical_crossentropy')

model = tflearn.DNN(net, clip_gradients=0., tensorboard_verbose=3)
model.fit(trainX, trainY, validation_set=(testX, testY), show_metric=True,
          batch_size=128,n_epoch=2,snapshot_epoch=True)

Emre · Answer 1 · 2017-05-18 05:11:54Z

Arsenal,

We have to reshape the input data to fit in the LSTM's input layer like:

   trainX= np.reshape(trainX, (-1, 16, 4096))
   testX = np.reshape(testX, (-1, 16, 4096))

When one layer's output goes to the following layer's input, the earlier layer must produce a sequence like:

    net = tflearn.lstm(net,128, dropout=0.2, return_seq=True)
    net = tflearn.lstm(net,128)
    net = tflearn.fully_connected(net, num_classes, activation='softmax')

Hope this helps.

asked	2 years ago
viewed	1,028 times
active	today

Stack Exchange Network

Understanding Tensorflow LSTM models?

bumped to the homepage by Community♦ 5 hours ago

1 Answer 1

Your Answer

Not the answer you're looking for? Browse other questions tagged machine-learning python deep-learning tensorflow or ask your own question.

Hot Network Questions

Understanding Tensorflow LSTM models?

bumped to the homepage by Community♦ 5 hours ago

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged machine-learning python deep-learning tensorflow or ask your own question.

Related

Hot Network Questions