claf.model.sequence_classification package¶
Submodules¶
class claf.model.sequence_classification.mixin.SequenceClassification[source]¶

Bases: object

Sequence Classification Mixin Class.
make_metrics(predictions)[source]¶

Make metrics with the prediction dictionary.

- Args:
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
- Returns:
  - metrics: metric dictionary consisting of
    - ‘macro_f1’: class prediction macro (unweighted mean) F1
    - ‘macro_precision’: class prediction macro (unweighted mean) precision
    - ‘macro_recall’: class prediction macro (unweighted mean) recall
    - ‘accuracy’: class prediction accuracy
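A minimal sketch of the dictionary shapes involved; the sequence ids, class indices, and metric values below are purely illustrative, not taken from CLAF:

    # Prediction dictionary as described above: sequence id -> {"class_idx": ...}.
    predictions = {
        "seq-001": {"class_idx": 2},
        "seq-002": {"class_idx": 0},
    }

    # make_metrics(predictions) is expected to return a flat metric dictionary:
    metrics = {
        "macro_f1": 0.81,         # unweighted mean F1 over classes
        "macro_precision": 0.83,  # unweighted mean precision over classes
        "macro_recall": 0.80,     # unweighted mean recall over classes
        "accuracy": 0.85,         # overall class prediction accuracy
    }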
make_predictions(output_dict)[source]¶

Make predictions with the model’s output_dict.

- Args:
  - output_dict: model’s output dictionary consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
- Returns:
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
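A minimal sketch of the output dictionary that make_predictions consumes; the tensor shapes and values are assumptions for illustration (a batch of two sequences and five classes):

    import torch

    # Illustrative output_dict for a batch of two sequences and five classes.
    output_dict = {
        "sequence_embed": torch.randn(2, 2000),   # embedding vector per sequence
        "logits": torch.randn(2, 5),              # unnormalized class log probabilities
        "class_idx": torch.tensor([2, 0]),        # target class indices
        "data_idx": torch.tensor([0, 1]),         # data indices
        "loss": torch.tensor(1.23),               # scalar loss
    }

    # make_predictions(output_dict) returns a dictionary keyed by sequence id,
    # e.g. {<id>: {"class_idx": <predicted class index>}, ...}.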
predict(output_dict, arguments, helper)[source]¶

Inference by raw_feature.

- Args:
  - output_dict: model’s output dictionary consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
  - arguments: arguments dictionary consisting of user_input
  - helper: dictionary to get the classification result, consisting of
    - class_idx2text: dictionary converting class_idx to class_text
- Returns:
  - output_dict (dict) consisting of
    - logits: representing unnormalized log probabilities of the class
    - class_idx: predicted class idx
    - class_text: predicted class text
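A hypothetical usage sketch of predict; the model instance, the user input, and the class names in class_idx2text are all made up for illustration, and the model/output_dict construction is omitted:

    # `model` is assumed to be a SequenceClassification model instance and
    # `output_dict` the result of a forward pass on the raw feature.
    arguments = {"user_input": "this movie was great"}
    helper = {"class_idx2text": {0: "negative", 1: "positive"}}

    result = model.predict(output_dict, arguments, helper)
    # result is expected to contain:
    #   "logits"     - unnormalized class log probabilities
    #   "class_idx"  - predicted class index
    #   "class_text" - helper["class_idx2text"][class_idx]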
class claf.model.sequence_classification.structured_self_attention.StructuredSelfAttention(token_embedder, num_classes, encoding_rnn_hidden_dim=300, encoding_rnn_num_layer=2, encoding_rnn_dropout=0.0, attention_dim=350, num_attention_heads=30, sequence_embed_dim=2000, dropout=0.5, penalization_coefficient=1.0)[source]¶

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithTokenEmbedder

Implementation of the model presented in A Structured Self-attentive Sentence Embedding (https://arxiv.org/abs/1703.03130).

- Args:
  - token_embedder: used to embed the sequence
  - num_classes: number of classified classes
- Kwargs:
  - encoding_rnn_hidden_dim: hidden dimension of the RNN (unidirectional)
  - encoding_rnn_num_layer: the number of RNN layers
  - encoding_rnn_dropout: RNN dropout probability
  - attention_dim: attention dimension (d_a in the paper)
  - num_attention_heads: number of attention heads (r in the paper)
  - sequence_embed_dim: dimension of the sequence embedding
  - dropout: classification layer dropout
  - penalization_coefficient: penalty coefficient for the Frobenius norm
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below. {“sequence”: [0, 3, 4, 1]}
- Kwargs:
  - labels: label dictionary like below. {“class_idx”: 2, “data_idx”: 0}
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
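A minimal sketch of the forward contract; building the token_embedder and its vocabulary is omitted, and the exact tensor wrapping of the feature dictionary is an assumption here:

    import torch

    # `token_embedder` construction is omitted; index values are illustrative.
    model = StructuredSelfAttention(token_embedder, num_classes=5)

    features = {"sequence": torch.tensor([[0, 3, 4, 1]])}     # one token-id sequence
    labels = {"class_idx": torch.tensor([2]), "data_idx": torch.tensor([0])}

    output_dict = model(features, labels)   # training: output_dict includes "loss"
    output_dict = model(features)           # inference: loss is not computed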
Module contents¶
class claf.model.sequence_classification.BertForSeqCls(token_makers, num_classes, pretrained_model_name=None, dropout=0.2)[source]¶

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithoutTokenEmbedder

Implementation of the Sentence Classification model presented in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805).

- Args:
  - token_makers: used to embed the sequence
  - num_classes: number of classified classes
- Kwargs:
  - pretrained_model_name: the name of a pre-trained model
  - dropout: classification layer dropout
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below.
        {
            “bert_input”: {
                “feature”: [
                    [3, 4, 1, 0, 0, 0, …], …,
                ]
            },
            “token_type”: {
                “feature”: [
                    [0, 0, 0, 0, 0, 0, …], …,
                ],
            }
        }
- Kwargs:
  - labels: label dictionary like below.
        {
            “class_idx”: [2, 1, 0, 4, 5, …],
            “data_idx”: [2, 4, 5, 7, 2, 1, …]
        }
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
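A hedged sketch of the feature and label dictionaries for a batch of two sequences; the padding, token ids, and the exact tensor wrapping are assumptions for illustration, and model construction is omitted:

    import torch

    features = {
        "bert_input": {"feature": torch.tensor([[3, 4, 1, 0, 0, 0],
                                                [5, 8, 2, 9, 1, 0]])},
        "token_type": {"feature": torch.zeros(2, 6, dtype=torch.long)},
    }
    labels = {"class_idx": torch.tensor([2, 1]), "data_idx": torch.tensor([0, 1])}

    # `model` is assumed to be a BertForSeqCls instance.
    output_dict = model(features, labels)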
print_examples(index, inputs, predictions)[source]¶

Print evaluation examples.

- Args:
  - index: data index
  - inputs: mini-batch inputs
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
- Returns:
  - print(Sequence, Sequence Tokens, Target Class, Predicted Class)
-
class
claf.model.sequence_classification.
RobertaForSeqCls
(token_makers, num_classes, pretrained_model_name=None, dropout=0.2)[source]¶ Bases:
claf.model.sequence_classification.mixin.SequenceClassification
,claf.model.base.ModelWithoutTokenEmbedder
Implementation of Sentence Classification model presented in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805)
- Args:
token_embedder: used to embed the sequence num_classes: number of classified classes
- Kwargs:
pretrained_model_name: the name of a pre-trained model dropout: classification layer dropout
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below.
        {
            “bert_input”: {
                “feature”: [
                    [3, 4, 1, 0, 0, 0, …], …,
                ]
            },
        }
- Kwargs:
  - labels: label dictionary like below.
        {
            “class_idx”: [2, 1, 0, 4, 5, …],
            “data_idx”: [2, 4, 5, 7, 2, 1, …]
        }
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
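The contract mirrors BertForSeqCls above, except that no “token_type” feature is passed; again, the tensor wrapping is an assumption and model construction is omitted:

    import torch

    features = {
        "bert_input": {"feature": torch.tensor([[3, 4, 1, 0, 0, 0]])},
    }
    # `model` is assumed to be a RobertaForSeqCls instance; with no labels,
    # no loss is computed (inference/predict mode).
    output_dict = model(features)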
print_examples(index, inputs, predictions)[source]¶

Print evaluation examples.

- Args:
  - index: data index
  - inputs: mini-batch inputs
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
- Returns:
  - print(Sequence, Sequence Tokens, Target Class, Predicted Class)
class claf.model.sequence_classification.StructuredSelfAttention(token_embedder, num_classes, encoding_rnn_hidden_dim=300, encoding_rnn_num_layer=2, encoding_rnn_dropout=0.0, attention_dim=350, num_attention_heads=30, sequence_embed_dim=2000, dropout=0.5, penalization_coefficient=1.0)[source]¶

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithTokenEmbedder

Implementation of the model presented in A Structured Self-attentive Sentence Embedding (https://arxiv.org/abs/1703.03130).

- Args:
  - token_embedder: used to embed the sequence
  - num_classes: number of classified classes
- Kwargs:
  - encoding_rnn_hidden_dim: hidden dimension of the RNN (unidirectional)
  - encoding_rnn_num_layer: the number of RNN layers
  - encoding_rnn_dropout: RNN dropout probability
  - attention_dim: attention dimension (d_a in the paper)
  - num_attention_heads: number of attention heads (r in the paper)
  - sequence_embed_dim: dimension of the sequence embedding
  - dropout: classification layer dropout
  - penalization_coefficient: penalty coefficient for the Frobenius norm
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below. {“sequence”: [0, 3, 4, 1]}
- Kwargs:
  - labels: label dictionary like below. {“class_idx”: 2, “data_idx”: 0}
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized