TensorFlow - How to train with inputs of variable size?


This question is rather abstract and not necessarily tied to TensorFlow or Keras. Say that you want to train a language model, and you want to use inputs of different sizes for your LSTMs. In particular, I'm following this paper: https://www.researchgate.net/publication/317379370_a_neural_language_model_for_query_auto-completion.

The authors use, among other things, word embeddings and one-hot encodings of characters. Most likely, the dimensions of these two inputs are different. To feed them into a network, I see a few alternatives, but I'm sure I'm missing something and I would like to know how it should be done.

  • Create a 3D tensor of shape (instances, 2, max(embeddings, characters)). That is, pad the smaller input with 0s.
  • Create a 3D tensor of shape (instances, embeddings+characters, 1). That is, concatenate the inputs.

It looks to me like both alternatives are bad for training the model efficiently. So, what's the best way to approach this? I see the authors use an embedding layer for this purpose, but what does that mean technically?


Edit

Here are more details. Let's call these inputs x (character-level input) and e (word-level input). For each character of a sequence (a text), I compute x, e and y, the label.

  • x: the character's one-hot encoding. My character index is of size 38, so this is a vector filled with 37 zeros and one 1.
  • e: a precomputed word embedding of dimension 200. If the character is a space, I fetch the word embedding of the previous word in the sequence; otherwise, I assign the vector for an incomplete word (INC, also of size 200). A real example with the sequence "red car": r>INC, e>INC, d>INC, _>embeddings["red"], c>INC, a>INC, r>INC.
  • y: the label to be predicted, which is the next character, one-hot encoded. This output has the same dimension as x because it uses the same character index. In the example above, for "r", y is the one-hot encoding of "e".
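The per-character construction described above can be sketched in plain Python. This is only an illustration under assumptions: the question gives the size of the character index (38) but not its contents, so the alphabet below is made up, the embedding values are placeholders rather than real pretrained vectors, and falling back to INC for words missing from the embedding table is a choice made here, not something the question states.

```python
import string

# Hypothetical 38-symbol alphabet (26 letters + 10 digits + space + apostrophe).
# The question only states the size of the index, not which symbols it holds.
ALPHABET = string.ascii_lowercase + string.digits + " '"
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
EMB_DIM = 200

# Placeholder vectors; real values would come from precomputed embeddings.
INC = [0.0] * EMB_DIM
embeddings = {"red": [0.1] * EMB_DIM, "car": [0.2] * EMB_DIM}


def one_hot(ch):
    """Return a list of 37 zeros and a single 1 at the character's index."""
    vec = [0] * len(ALPHABET)
    vec[CHAR_INDEX[ch]] = 1
    return vec


def build_examples(text):
    """Return one (x, e, y) triple per character that has a next character."""
    examples = []
    word_start = 0
    for pos in range(len(text) - 1):
        ch = text[pos]
        if ch == " ":
            # A space closes the previous word, so its embedding is looked up.
            previous_word = text[word_start:pos]
            e = embeddings.get(previous_word, INC)  # assumption: unknown -> INC
            word_start = pos + 1
        else:
            e = INC
        examples.append((one_hot(ch), e, one_hot(text[pos + 1])))
    return examples


examples = build_examples("red car")
```

The last character of the text produces no triple in this sketch because it has no next character to serve as the label.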

According to the Keras documentation, the padding idea seems to be the right one. The Embedding layer has a masking parameter (mask_zero) that makes Keras skip the padded values instead of processing them. In theory, you don't lose much performance: if the library is well built, the skipping really does avoid the extra processing.

You just need to take care not to assign the value 0 to any other character, not even spaces or unknown words.

An embedding layer is not only for masking (masking is just an option of the embedding layer).

The embedding layer transforms integer values from a word/character dictionary into actual vectors of a certain shape.

Suppose you have this dictionary:
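To make the padding idea concrete, here is a small plain-Python sketch of what padding with a reserved 0 and the corresponding mask look like. It only illustrates the concept and is not how Keras implements it; the helper name pad_batch and the toy batch are made up for this example.

```python
PAD = 0  # reserved for padding only; no real character or word may use it


def pad_batch(sequences, pad_value=PAD):
    """Right-pad integer sequences to a common length and build a mask."""
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_value] * (max_len - len(seq)) for seq in sequences]
    # True marks a real token, False marks a padded step that should be skipped.
    mask = [[token != pad_value for token in row] for row in padded]
    return padded, mask


batch = [[1, 2, 3, 4], [1, 2, 3, 5, 4], [1, 2]]
padded, mask = pad_batch(batch)
```

This also shows why 0 must stay reserved: if a real token were encoded as 0, its position would be marked False and skipped as if it were padding.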

1: hey
2: ,
3: i'm
4: here
5: not

And you form sentences like

[1,2,3,4,0] -> "hey, i'm here"
[1,2,3,5,4] -> "hey, i'm not here"
[1,2,1,2,1] -> "hey, hey, hey"

The embedding layer will transform each of those integers into a vector of a certain size. This does two good things at the same time:

  • It transforms the words into vectors, because neural networks can only handle vectors or intensities. A list of indices cannot be processed by a neural network directly, since there is no logical relation between the indices and the words.

  • It creates a vector that will be a "meaningful" set of features for each word.

After training, these become "meaningful" vectors. Each element starts to represent a certain feature of the word, although that feature is obscure to humans. It's possible that an embedding will be capable of detecting which words are verbs, nouns, feminine, masculine, etc., with everything encoded in a combination of numeric values (presence/absence/intensity of features).
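Conceptually, the lookup an embedding layer performs can be pictured as indexing into a table with one row per dictionary entry. The sketch below uses plain Python and randomly initialised rows; the table size, the vector size of 4, and the function name embed are assumptions for illustration, and in a real layer the rows would be adjusted during training.

```python
import random

random.seed(0)

VOCAB_SIZE = 6  # indices 0..5, with 0 reserved for padding
EMB_DIM = 4     # illustrative size; real embeddings are usually much larger

# One row per index, randomly initialised. Training would update these rows.
table = [
    [random.uniform(-0.05, 0.05) for _ in range(EMB_DIM)]
    for _ in range(VOCAB_SIZE)
]


def embed(sentence):
    """Replace every integer index with its row from the table."""
    return [table[idx] for idx in sentence]


vectors = embed([1, 2, 3, 5, 4])   # "hey, i'm not here"
repeated = embed([1, 2, 1, 2, 1])  # "hey, hey, hey"
```

The same index always maps to the same row, so every occurrence of "hey" in the second sentence gets an identical vector.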


You may also try the approach in this question, which, instead of using masking, separates the batches by length, so each batch can be trained on its own without any padding: Keras misinterprets training data shape
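A minimal sketch of grouping by length, in plain Python, might look like this. The function name batches_by_length and the toy data are made up for illustration; in Keras, each resulting batch could then be passed to something like train_on_batch, since all sequences within a batch share one length.

```python
from collections import defaultdict


def batches_by_length(sequences, batch_size):
    """Yield batches in which every sequence has the same length."""
    buckets = defaultdict(list)
    for seq in sequences:
        buckets[len(seq)].append(seq)
    for length in sorted(buckets):
        group = buckets[length]
        for start in range(0, len(group), batch_size):
            yield group[start:start + batch_size]


data = [[1, 2, 3, 4], [1, 2, 3, 5, 4], [1, 2, 1, 2, 1], [3, 4], [1, 2, 3, 5]]
batches = list(batches_by_length(data, batch_size=2))
```

The trade-off is that rare lengths produce small batches, so batch sizes become uneven.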

