deeppavlov.models.ranking

Ranking classes.

class deeppavlov.models.ranking.bilstm_siamese_network.BiLSTMSiameseNetwork(*args, **kwargs)

The class implementing a siamese neural network with BiLSTM and max pooling.

The model can be trained with either a binary cross-entropy loss or a triplet loss with random or hard negative sampling.

Parameters

len_vocab – Size of the vocabulary, used to build the embedding layer.
seed – Random seed.
shared_weights – Whether to use shared weights in the model to encode contexts and responses.
embedding_dim – Dimensionality of token (word) embeddings.
reccurent – Type of the RNN cell. Possible values are lstm and bilstm.
hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent is bilstm, the actual dimensionality is twice hidden_dim.
max_pooling – Whether to use max pooling to obtain the context (response) vector representation. If False, the last hidden state of the RNN will be used.
triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss will be used.
margin – A margin parameter for triplet loss. Only required if triplet_loss is set to True.
hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If False, random sampling will be used. Only required if triplet_loss is set to True.
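The interplay of triplet_loss, margin and hard_triplets can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the Euclidean distance, the hinge form of the loss, and the helper names triplet_loss_value and hardest_negatives are assumptions made for illustration only.

```python
import numpy as np


def triplet_loss_value(anchor, positive, negative, margin=0.1):
    """Mean hinge triplet loss over a batch of vectors (assumed formulation)."""
    # Distance from each anchor to its positive and to its negative sample.
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    # A triplet contributes only while the negative is not at least
    # `margin` farther from the anchor than the positive.
    return float(np.mean(np.maximum(0.0, margin + d_pos - d_neg)))


def hardest_negatives(anchor, candidates):
    """For each anchor, pick the closest candidate that is not its own positive.

    Assumes candidates[i] is the positive sample for anchor[i].
    """
    # Pairwise distances, shape (batch, batch).
    dists = np.linalg.norm(anchor[:, None, :] - candidates[None, :, :], axis=2)
    # Exclude each anchor's own positive from the search.
    np.fill_diagonal(dists, np.inf)
    return candidates[np.argmin(dists, axis=1)]


anchors = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
positives = np.array([[0.0, 1.0], [5.0, 1.0], [10.0, 1.0]])
hard = hardest_negatives(anchors, positives)
loss = triplet_loss_value(anchors, positives, hard, margin=5.0)
```

Random sampling would instead draw each negative uniformly from the other samples in the batch; hard sampling tends to produce more non-zero loss terms because the chosen negatives lie close to the anchors.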

class deeppavlov.models.ranking.bilstm_gru_siamese_network.BiLSTMGRUSiameseNetwork(*args, **kwargs)

The class implementing a siamese neural network with BiLSTM, GRU and max pooling.

The GRU is used to take the multi-turn dialogue context into account.

Parameters

len_vocab – Size of the vocabulary, used to build the embedding layer.
seed – Random seed.
shared_weights – Whether to use shared weights in the model to encode contexts and responses.
embedding_dim – Dimensionality of token (word) embeddings.
reccurent – Type of the RNN cell. Possible values are lstm and bilstm.
hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent is bilstm, the actual dimensionality is twice hidden_dim.
max_pooling – Whether to use max pooling to obtain the context (response) vector representation. If False, the last hidden state of the RNN will be used.
triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss will be used.
margin – A margin parameter for triplet loss. Only required if triplet_loss is set to True.
hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If False, random sampling will be used. Only required if triplet_loss is set to True.
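The effect of hidden_dim doubling for bilstm and of the max_pooling switch can be shown on plain arrays. The sketch below is an assumption-laden illustration, not the library's code: encode_bidirectional and sentence_vector are hypothetical helpers, and for simplicity the "last state" is taken at the final timestep for both directions.

```python
import numpy as np


def encode_bidirectional(forward_states, backward_states):
    """Join per-timestep states of both directions.

    Two arrays of shape (timesteps, hidden_dim) become one array of
    shape (timesteps, 2 * hidden_dim).
    """
    return np.concatenate([forward_states, backward_states], axis=1)


def sentence_vector(hidden_states, max_pooling=True):
    """Reduce (timesteps, dim) hidden states to a single vector."""
    if max_pooling:
        # Element-wise maximum over all timesteps.
        return hidden_states.max(axis=0)
    # Simplification: use the state at the final timestep.
    return hidden_states[-1]


# hidden_dim = 2 in each direction, three timesteps.
forward = np.array([[0.1, 0.9], [0.7, 0.2], [0.3, 0.4]])
backward = np.array([[0.5, 0.0], [0.2, 0.8], [0.6, 0.1]])
states = encode_bidirectional(forward, backward)

pooled = sentence_vector(states, max_pooling=True)
last = sentence_vector(states, max_pooling=False)
```

Here states has four columns, which is the doubled dimensionality the hidden_dim description refers to.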

class deeppavlov.models.ranking.keras_siamese_model.KerasSiameseModel(*args, **kwargs)

The class implementing base functionality for siamese neural networks in Keras.

Parameters

learning_rate – Learning rate.
use_matrix – Whether to use a trainable matrix with token (word) embeddings.
emb_matrix – An embeddings matrix to initialize an embeddings layer of a model. Only used if use_matrix is set to True.
max_sequence_length – Maximum length of text sequences in tokens. Longer sequences will be truncated and shorter ones will be padded.
dynamic_batch – Whether to use dynamic batching. If True, the sequence length for a batch will be equal to the length of the longest sequence in that batch, but not higher than max_sequence_length.
attention – Whether any attention mechanism is used in the siamese network.
*args – Other parameters.
**kwargs – Other parameters.
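How max_sequence_length and dynamic_batch interact can be sketched as follows. This is only an illustration under stated assumptions: pad_batch is a hypothetical helper, the padding index 0 is assumed, and sequences are padded and truncated at the end, which may differ from what the library does.

```python
import numpy as np


def pad_batch(sequences, max_sequence_length, dynamic_batch=False, pad_id=0):
    """Pad or truncate token-id sequences to a common length."""
    if dynamic_batch:
        # Longest sequence in this batch, capped by max_sequence_length.
        longest = max(len(seq) for seq in sequences)
        length = min(longest, max_sequence_length)
    else:
        length = max_sequence_length
    batch = np.full((len(sequences), length), pad_id, dtype=np.int64)
    for row, seq in enumerate(sequences):
        trimmed = seq[:length]  # truncate sequences that are too long
        batch[row, :len(trimmed)] = trimmed  # the rest stays padded
    return batch


fixed = pad_batch([[1, 2, 3], [4]], max_sequence_length=5)
dynamic = pad_batch([[1, 2, 3], [4]], max_sequence_length=5, dynamic_batch=True)
capped = pad_batch([[1, 2, 3, 4, 5, 6]], max_sequence_length=4, dynamic_batch=True)
```

With dynamic batching, short batches carry less padding, which reduces wasted computation at the cost of variable input shapes between batches.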

class deeppavlov.models.ranking.mpm_siamese_network.MPMSiameseNetwork(*args, **kwargs)

The class implementing a siamese neural network with bilateral multi-perspective matching.

The network architecture is based on https://arxiv.org/abs/1702.03814.

Parameters

dense_dim – Dimensionality of the dense layer.
perspective_num – Number of perspectives in multi-perspective matching layers.
aggregation_dim – Dimensionality of the hidden state in the second BiLSTM layer.
inpdrop_val – Float between 0 and 1. A dropout value for the linear transformation of the inputs.
recdrop_val – Float between 0 and 1. A dropout value for the linear transformation of the recurrent state.
ldrop_val – A dropout value of the dropout layer before the second BiLSTM layer.
dropout_val – A dropout value of the dropout layer after the second BiLSTM layer.
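The role of perspective_num can be illustrated with the multi-perspective cosine matching function described in the paper linked above: each perspective reweights both vectors element-wise before their cosine similarity is computed. The NumPy sketch below follows that description; multi_perspective_match is a hypothetical helper, and whether the library's layers match it in every detail is not confirmed here.

```python
import numpy as np


def multi_perspective_match(v1, v2, weights, eps=1e-8):
    """Cosine similarity of two vectors under several weightings.

    v1, v2:  vectors of shape (dim,)
    weights: array of shape (perspective_num, dim), one row per perspective
    Returns an array of shape (perspective_num,).
    """
    a = weights * v1  # each row is v1 reweighted by one perspective
    b = weights * v2
    numerator = np.sum(a * b, axis=1)
    denominator = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    # eps guards against division by zero for all-zero rows.
    return numerator / np.maximum(denominator, eps)


perspective_num, dim = 3, 4
weights = np.ones((perspective_num, dim))
same = multi_perspective_match(np.array([1.0, 2.0, 3.0, 4.0]),
                               np.array([1.0, 2.0, 3.0, 4.0]), weights)
orthogonal = multi_perspective_match(np.array([1.0, 0.0, 0.0, 0.0]),
                                     np.array([0.0, 1.0, 0.0, 0.0]), weights)
```

In a trained model the weights are learned, so each perspective emphasizes different dimensions of the hidden states.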

class deeppavlov.models.ranking.siamese_model.SiameseModel(batch_size: int, num_context_turns: int = 1, *args, **kwargs)

The class implementing base functionality for siamese neural networks.

Parameters

batch_size – Batch size.
num_context_turns – Number of context turns in each data sample.
*args – Other parameters.
**kwargs – Other parameters.

load(*args, **kwargs) → None

save(*args, **kwargs) → None

train_on_batch(samples_generator: Iterable[List[numpy.ndarray]], y: List[int]) → float

This method is called by the trainer to make one training step on one batch. The number of samples returned by samples_generator is always equal to batch_size, so the method needs to:

1. accumulate data for all of the inputs of the model;
2. format the inputs of the model properly using the self._make_batch function;
3. run the model with the provided inputs and ground truth labels (y) using the self._train_on_batch function;
4. return the mean loss value on the batch.

Parameters

samples_generator (Iterable[List[np.ndarray]]) – generator that returns lists of numpy arrays containing the words of all sentences, represented as integers. Each sample has shape (number_of_context_turns + 1, max_number_of_words_in_a_sentence).
y (List[int]) – labels, with shape (batch_size,)

Returns

value of mean loss on the batch

Return type

float
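The accumulation step and the final averaging can be sketched outside the class. This is a simplified illustration, not the method's actual body: accumulate_inputs and train_step are hypothetical helpers, train_fn stands in for self._train_on_batch, and the formatting done by self._make_batch is reduced to stacking arrays.

```python
import numpy as np


def accumulate_inputs(samples_generator, num_inputs):
    """Regroup per-sample sentence arrays into one array per model input."""
    buffers = [[] for _ in range(num_inputs)]
    for sample in samples_generator:
        # A sample holds num_context_turns context sentences plus a response.
        for position, sentence in enumerate(sample):
            buffers[position].append(sentence)
    # Each input becomes an array of shape (batch_size, sentence_length).
    return [np.stack(buffer) for buffer in buffers]


def train_step(samples_generator, y, num_context_turns, train_fn):
    """Run one training step and return the mean loss as a float."""
    inputs = accumulate_inputs(samples_generator, num_context_turns + 1)
    losses = train_fn(inputs, np.asarray(y))
    return float(np.mean(losses))


samples = [
    [np.array([1, 2, 0]), np.array([3, 0, 0])],
    [np.array([4, 5, 6]), np.array([7, 8, 0])],
]
inputs = accumulate_inputs(iter(samples), num_inputs=2)

# Dummy training function returning fixed per-sample losses.
mean_loss = train_step(iter(samples), [1, 0], num_context_turns=1,
                       train_fn=lambda batch_inputs, labels: np.array([0.5, 1.5]))
```

The regrouping is needed because the generator yields data sample by sample, while the model expects one batch array per input.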

__call__(samples_generator: Iterable[List[numpy.ndarray]]) → Union[numpy.ndarray, List[str]]

This method is called by the trainer to make one evaluation step on one batch.

Parameters

samples_generator (Iterable[List[np.ndarray]]) – generator that returns lists of numpy arrays containing the words of all sentences, represented as integers. Each sample has shape (number_of_context_turns + 1, max_number_of_words_in_a_sentence).

Returns

predictions for the batch of samples

Return type

np.ndarray

reset() → None

class deeppavlov.models.ranking.siamese_predictor.SiamesePredictor(model: deeppavlov.models.ranking.siamese_model.SiameseModel, batch_size: int, num_context_turns: int = 1, ranking: bool = True, attention: bool = False, responses: Optional[deeppavlov.core.data.simple_vocab.SimpleVocabulary] = None, preproc_func: Optional[Callable] = None, interact_pred_num: int = 3, *args, **kwargs)

The class for ranking or paraphrase identification using the trained siamese network in the interact mode.

Parameters

batch_size – Batch size.
num_context_turns – Number of context turns in each data sample.
ranking – Whether to perform ranking. If False, paraphrase identification will be performed.
attention – Whether any attention mechanism is used in the siamese network. If False, response vectors calculated in advance will be used to obtain the similarity score for the input context; otherwise, the whole siamese architecture will be used to obtain the similarity score for the input context and each particular response. Only used if ranking is set to True.
responses – An instance of SimpleVocabulary with all possible responses to perform ranking. Only used if ranking is set to True.
preproc_func – The __call__ function of the SiamesePreprocessor.
interact_pred_num – The number of the most relevant responses to return. Only used if ranking is set to True.
**kwargs – Other parameters.
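Ranking with response vectors calculated in advance (the attention=False case) and the role of interact_pred_num can be sketched as a nearest-neighbour lookup. This is a hedged illustration only: top_responses is a hypothetical helper, cosine similarity is an assumed scoring function, and plain strings stand in for the entries of the responses vocabulary.

```python
import numpy as np


def top_responses(context_vec, response_vecs, responses, interact_pred_num=3):
    """Return the interact_pred_num responses most similar to the context."""
    # Normalize so that the dot product equals cosine similarity.
    context = context_vec / np.linalg.norm(context_vec)
    normalized = response_vecs / np.linalg.norm(response_vecs, axis=1, keepdims=True)
    scores = normalized @ context
    # Indices of the highest scores, best first.
    best = np.argsort(-scores)[:interact_pred_num]
    return [responses[i] for i in best]


response_vecs = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]])
responses = ["a", "b", "c", "d"]
ranked = top_responses(np.array([1.0, 0.0]), response_vecs, responses,
                       interact_pred_num=2)
```

Because the response vectors do not depend on the context in this mode, they can be computed once and reused for every query; with an attention mechanism each context-response pair has to pass through the whole network.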