deeppavlov.models.ranking¶
Ranking classes.
-
class
deeppavlov.models.ranking.bilstm_siamese_network.BiLSTMSiameseNetwork(*args, **kwargs)[source]¶ The class implementing a siamese neural network with BiLSTM and max pooling.
The network can be trained with a binary cross-entropy loss or with a triplet loss using random or hard negative sampling.
- Parameters
len_vocab – The size of the vocabulary, used to build the embedding layer.
seed – Random seed.
shared_weights – Whether to use shared weights in the model to encode contexts and responses.
embedding_dim – Dimensionality of token (word) embeddings.
reccurent – The type of the RNN cell. Possible values are lstm and bilstm.
hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, hidden_dim should be doubled to get the actual dimensionality.
max_pooling – Whether to use a max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN will be used.
triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss will be used.
margin – The margin parameter for triplet loss. Only required if triplet_loss is set to True.
hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling will be used. Only required if triplet_loss is set to True.
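To make the margin and hard_triplets options more concrete, here is a minimal NumPy sketch of a triplet loss with a margin, together with hard versus random negative selection. It is an illustration under assumptions, not the DeepPavlov implementation: the function names, the Euclidean distance, and the toy vectors are all hypothetical.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)


def pick_negative(positive, candidates, hard, rng):
    """Return the index of a negative sample among the candidates.

    hard=True picks the candidate closest to the positive sample
    (a "hard" negative); hard=False picks one uniformly at random.
    """
    if hard:
        distances = np.linalg.norm(candidates - positive, axis=1)
        return int(np.argmin(distances))
    return int(rng.integers(len(candidates)))


# Toy 2-d vectors (hypothetical data, chosen only for illustration).
rng = np.random.default_rng(0)
anchor = np.array([0.0, 0.0])
positive = np.array([1.0, 0.0])
negatives = np.array([[5.0, 0.0], [1.5, 0.0], [0.0, 4.0]])

hard_idx = pick_negative(positive, negatives, hard=True, rng=rng)
hard_loss = triplet_loss(anchor, positive, negatives[hard_idx], margin=1.0)
easy_loss = triplet_loss(anchor, positive, negatives[0], margin=1.0)
```

In this toy setup the hard negative still yields a non-zero loss, while the distant negative already satisfies the margin and contributes nothing, which is the usual motivation for hard sampling.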
-
class
deeppavlov.models.ranking.bilstm_gru_siamese_network.BiLSTMGRUSiameseNetwork(*args, **kwargs)[source]¶ The class implementing a siamese neural network with BiLSTM, GRU and max pooling.
GRU is used to take into account multi-turn dialogue context.
- Parameters
len_vocab – The size of the vocabulary, used to build the embedding layer.
seed – Random seed.
shared_weights – Whether to use shared weights in the model to encode contexts and responses.
embedding_dim – Dimensionality of token (word) embeddings.
reccurent – The type of the RNN cell. Possible values are lstm and bilstm.
hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent equals bilstm, hidden_dim should be doubled to get the actual dimensionality.
max_pooling – Whether to use a max-pooling operation to get the context (response) vector representation. If False, the last hidden state of the RNN will be used.
triplet_loss – Whether to use a model with triplet loss. If False, a model with cross-entropy loss will be used.
margin – The margin parameter for triplet loss. Only required if triplet_loss is set to True.
hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling will be used. Only required if triplet_loss is set to True.
-
class
deeppavlov.models.ranking.keras_siamese_model.KerasSiameseModel(*args, **kwargs)[source]¶ The class implementing base functionality for siamese neural networks in keras.
- Parameters
learning_rate – Learning rate.
use_matrix – Whether to use a trainable matrix with token (word) embeddings.
emb_matrix – An embeddings matrix to initialize the embeddings layer of a model. Only used if use_matrix is set to True.
max_sequence_length – The maximum length of text sequences in tokens. Longer sequences will be truncated and shorter ones will be padded.
dynamic_batch – Whether to use dynamic batching. If True, the maximum sequence length for a batch will be equal to the length of the longest sequence in this batch, but not higher than max_sequence_length.
attention – Whether any attention mechanism is used in the siamese network.
*args – Other parameters.
**kwargs – Other parameters.
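The interaction between max_sequence_length and dynamic_batch can be sketched as follows. This is a hedged illustration, not the library's code: the helper name pad_batch, the pad value of 0, and the choice to pad and truncate at the end of a sequence are assumptions.

```python
import numpy as np


def pad_batch(sequences, max_sequence_length, dynamic_batch, pad_value=0):
    """Pad or truncate integer token sequences to a common length."""
    if dynamic_batch:
        # Use the longest sequence in this batch, capped by the global limit.
        longest = max(len(seq) for seq in sequences)
        length = min(longest, max_sequence_length)
    else:
        length = max_sequence_length
    batch = np.full((len(sequences), length), pad_value, dtype=np.int64)
    for row, seq in enumerate(sequences):
        trimmed = seq[:length]  # truncate sequences that are too long
        batch[row, :len(trimmed)] = trimmed
    return batch


sequences = [[4, 8, 15], [16, 23], [42]]
fixed = pad_batch(sequences, max_sequence_length=5, dynamic_batch=False)
dynamic = pad_batch(sequences, max_sequence_length=5, dynamic_batch=True)
capped = pad_batch(sequences, max_sequence_length=2, dynamic_batch=True)
```

With dynamic batching the batch width shrinks to the longest sequence actually present, which avoids computing over padding that every row shares.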
-
class
deeppavlov.models.ranking.mpm_siamese_network.MPMSiameseNetwork(*args, **kwargs)[source]¶ The class implementing a siamese neural network with bilateral multi-perspective matching.
The network architecture is based on https://arxiv.org/abs/1702.03814.
- Parameters
dense_dim – Dimensionality of the dense layer.
perspective_num – Number of perspectives in multi-perspective matching layers.
aggregation dim – Dimensionality of the hidden state in the second BiLSTM layer.
inpdrop_val – Float between 0 and 1. A dropout value for the linear transformation of the inputs.
recdrop_val – Float between 0 and 1. A dropout value for the linear transformation of the recurrent state.
ldrop_val – A dropout value of the dropout layer before the second BiLSTM layer.
dropout_val – A dropout value of the dropout layer after the second BiLSTM layer.
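As a rough illustration of what perspective_num controls, the matching operation described in the referenced paper computes one cosine similarity per perspective, where each perspective rescales both vectors element-wise with its own weight row. The NumPy sketch below follows that definition; the function name, the random weights, and the eps guard are assumptions, and it is not taken from the DeepPavlov source.

```python
import numpy as np


def multi_perspective_match(v1, v2, weights, eps=1e-8):
    """Return one cosine similarity per perspective.

    weights has shape (perspective_num, dim); row k rescales both vectors
    element-wise before the cosine similarity is taken.
    """
    w1 = weights * v1  # shape: (perspective_num, dim)
    w2 = weights * v2
    dots = np.sum(w1 * w2, axis=1)
    norms = np.linalg.norm(w1, axis=1) * np.linalg.norm(w2, axis=1)
    return dots / np.maximum(norms, eps)  # eps guards against division by zero


rng = np.random.default_rng(0)
perspective_num, dim = 4, 6
weights = rng.normal(size=(perspective_num, dim))  # hypothetical weights
v = rng.normal(size=dim)

same = multi_perspective_match(v, v, weights)
opposite = multi_perspective_match(v, -v, weights)
```

Each output vector has perspective_num entries; identical inputs match with similarity 1 in every perspective, and opposite inputs with -1.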
-
class
deeppavlov.models.ranking.siamese_model.SiameseModel(batch_size: int, num_context_turns: int = 1, *args, **kwargs)[source]¶ The class implementing base functionality for siamese neural networks.
- Parameters
batch_size – A size of a batch.
num_context_turns – The number of context turns in data samples.
*args – Other parameters.
**kwargs – Other parameters.
-
train_on_batch(samples_generator: Iterable[List[numpy.ndarray]], y: List[int]) → float[source]¶ This method is called by the trainer to make one training step on one batch. The number of samples returned by samples_generator is always equal to batch_size, so the method needs to:
1) accumulate data for all of the inputs of the model;
2) format the inputs of the model properly using the self._make_batch function;
3) run the model with the provided inputs and ground truth labels (y) using the self._train_on_batch function;
4) return the mean loss value on the batch.
- Parameters
samples_generator (Iterable[List[np.ndarray]]) – generator that yields samples. Each sample is a list of numpy arrays holding the words of all sentences represented as integers, with shape (number_of_context_turns + 1, max_number_of_words_in_a_sentence).
y (List[int]) – list of labels, with shape (batch_size, ).
- Returns
value of mean loss on the batch
- Return type
float
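The four steps of train_on_batch can be sketched as below. This is a simplified stand-in written under assumptions, not the method's real body: train_on_batch_sketch and train_step_stub are hypothetical names, np.stack stands in for whatever self._make_batch does, and the stub returns fixed per-sample losses instead of running a model.

```python
import numpy as np


def train_on_batch_sketch(samples, labels, num_inputs, train_step):
    """Mimic the accumulate / format / train / average flow on one batch."""
    # 1) accumulate data for every model input (context turns + response)
    buffers = [[] for _ in range(num_inputs)]
    for sample in samples:
        for i, sentence in enumerate(sample):
            buffers[i].append(sentence)
    # 2) format the inputs: one (batch_size, max_words) array per input
    inputs = [np.stack(buffer) for buffer in buffers]
    # 3) run the (stubbed) training step with inputs and labels
    losses = train_step(inputs, np.asarray(labels))
    # 4) return the mean loss value on the batch
    return float(np.mean(losses))


seen_shapes = []


def train_step_stub(inputs, labels):
    """Hypothetical stand-in for a real training step."""
    seen_shapes.extend(x.shape for x in inputs)
    return np.array([0.2, 0.4])  # fixed per-sample losses, illustration only


# One context turn plus one response per sample, three tokens per sentence.
samples = [
    [np.array([1, 2, 0]), np.array([3, 4, 5])],
    [np.array([6, 0, 0]), np.array([7, 8, 0])],
]
loss = train_on_batch_sketch(samples, [1, 0], num_inputs=2,
                             train_step=train_step_stub)
```

The key point is the transposition in steps 1 and 2: samples arrive one at a time as lists of sentences, while the model expects one array per input covering the whole batch.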
-
__call__(samples_generator: Iterable[List[numpy.ndarray]]) → Union[numpy.ndarray, List[str]][source]¶ This method is called by the trainer to make one evaluation step on one batch.
- Parameters
samples_generator (Iterable[List[np.ndarray]]) – generator that returns list of numpy arrays of words of all sentences represented as integers. Has shape: (number_of_context_turns + 1, max_number_of_words_in_a_sentence)
- Returns
predictions for the batch of samples
- Return type
np.ndarray
-
class
deeppavlov.models.ranking.siamese_predictor.SiamesePredictor(model: deeppavlov.models.ranking.siamese_model.SiameseModel, batch_size: int, num_context_turns: int = 1, ranking: bool = True, attention: bool = False, responses: Optional[deeppavlov.core.data.simple_vocab.SimpleVocabulary] = None, preproc_func: Optional[Callable] = None, interact_pred_num: int = 3, *args, **kwargs)[source]¶ The class for ranking or paraphrase identification using the trained siamese network in the interact mode.
- Parameters
batch_size – A size of a batch.
num_context_turns – The number of context turns in data samples.
ranking – Whether to perform ranking. If it is set to False, paraphrase identification will be performed.
attention – Whether any attention mechanism is used in the siamese network. If False, precomputed vectors of responses will be used to obtain the similarity score for the input context; otherwise, the whole siamese architecture will be used to obtain the similarity score for the input context and each particular response. Only used if ranking is set to True.
responses – An instance of SimpleVocabulary with all possible responses to perform ranking. Only used if ranking is set to True.
preproc_func – The __call__ function of the SiamesePreprocessor.
interact_pred_num – The number of the most relevant responses that will be returned. Only used if ranking is set to True.
**kwargs – Other parameters.
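When attention is False, ranking amounts to scoring the input context vector against response vectors computed in advance and returning the best interact_pred_num of them. The sketch below shows that idea with NumPy; cosine similarity, the function name top_responses, and the toy vectors are assumptions, and the actual scoring used by the trained network may differ.

```python
import numpy as np


def top_responses(context_vec, response_vecs, responses, interact_pred_num):
    """Return the interact_pred_num responses most similar to the context."""
    # Cosine similarity between the context and every cached response vector.
    context_unit = context_vec / np.linalg.norm(context_vec)
    response_units = response_vecs / np.linalg.norm(
        response_vecs, axis=1, keepdims=True)
    scores = response_units @ context_unit
    best = np.argsort(-scores)[:interact_pred_num]  # highest scores first
    return [responses[i] for i in best]


# Hypothetical response texts and their precomputed 2-d vectors.
responses = ["hello", "goodbye", "thanks", "sorry"]
response_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
context_vec = np.array([1.0, 0.0])

ranked = top_responses(context_vec, response_vecs, responses,
                       interact_pred_num=3)
```

Caching the response vectors means only the context has to be encoded at query time, which is why this path is available only when no attention mechanism ties the two encoders together.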