claf.model.sequence_classification package¶
Submodules¶
class claf.model.sequence_classification.mixin.SequenceClassification[source]¶

Bases: object

Sequence Classification Mixin Class.
make_metrics(predictions)[source]¶

Make metrics with the prediction dictionary.

- Args:
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
- Returns:
  - metrics: metric dictionary consisting of
    - ‘macro_f1’: class prediction macro (unweighted mean) F1
    - ‘macro_precision’: class prediction macro (unweighted mean) precision
    - ‘macro_recall’: class prediction macro (unweighted mean) recall
    - ‘accuracy’: class prediction accuracy
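A minimal sketch of the dictionary shapes involved; the sequence ids, class indices, and metric values below are purely illustrative, not taken from CLAF:

    # Prediction dictionary as described above: sequence id -> {"class_idx": ...}.
    predictions = {
        "seq-001": {"class_idx": 2},
        "seq-002": {"class_idx": 0},
    }

    # make_metrics(predictions) is expected to return a flat metric dictionary:
    metrics = {
        "macro_f1": 0.81,         # unweighted mean F1 over classes
        "macro_precision": 0.83,  # unweighted mean precision over classes
        "macro_recall": 0.80,     # unweighted mean recall over classes
        "accuracy": 0.85,         # overall class prediction accuracy
    }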
make_predictions(output_dict)[source]¶

Make predictions with the model’s output_dict.

- Args:
  - output_dict: model’s output dictionary consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
- Returns:
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
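A minimal sketch of the output dictionary that make_predictions consumes; the tensor shapes and values are assumptions for illustration (a batch of two sequences and five classes):

    import torch

    # Illustrative output_dict for a batch of two sequences and five classes.
    output_dict = {
        "sequence_embed": torch.randn(2, 2000),   # embedding vector per sequence
        "logits": torch.randn(2, 5),              # unnormalized class log probabilities
        "class_idx": torch.tensor([2, 0]),        # target class indices
        "data_idx": torch.tensor([0, 1]),         # data indices
        "loss": torch.tensor(1.23),               # scalar loss
    }

    # make_predictions(output_dict) returns a dictionary keyed by sequence id,
    # e.g. {<id>: {"class_idx": <predicted class index>}, ...}.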
predict(output_dict, arguments, helper)[source]¶

Inference by raw_feature.

- Args:
  - output_dict: model’s output dictionary consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
  - arguments: arguments dictionary consisting of user_input
  - helper: dictionary to get the classification result, consisting of
    - class_idx2text: dictionary converting class_idx to class_text
- Returns:
  - output_dict (dict) consisting of
    - logits: representing unnormalized log probabilities of the class
    - class_idx: predicted class idx
    - class_text: predicted class text
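A hypothetical usage sketch of predict; the model instance, the user input, and the class names in class_idx2text are all made up for illustration, and the model/output_dict construction is omitted:

    # `model` is assumed to be a SequenceClassification model instance and
    # `output_dict` the result of a forward pass on the raw feature.
    arguments = {"user_input": "this movie was great"}
    helper = {"class_idx2text": {0: "negative", 1: "positive"}}

    result = model.predict(output_dict, arguments, helper)
    # result is expected to contain:
    #   "logits"     - unnormalized class log probabilities
    #   "class_idx"  - predicted class index
    #   "class_text" - helper["class_idx2text"][class_idx]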
class claf.model.sequence_classification.structured_self_attention.StructuredSelfAttention(token_embedder, num_classes, encoding_rnn_hidden_dim=300, encoding_rnn_num_layer=2, encoding_rnn_dropout=0.0, attention_dim=350, num_attention_heads=30, sequence_embed_dim=2000, dropout=0.5, penalization_coefficient=1.0)[source]¶

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithTokenEmbedder

Implementation of the model presented in A Structured Self-attentive Sentence Embedding (https://arxiv.org/abs/1703.03130).

- Args:
  - token_embedder: used to embed the sequence
  - num_classes: number of classified classes
- Kwargs:
  - encoding_rnn_hidden_dim: hidden dimension of the RNN (unidirectional)
  - encoding_rnn_num_layer: the number of RNN layers
  - encoding_rnn_dropout: RNN dropout probability
  - attention_dim: attention dimension (d_a in the paper)
  - num_attention_heads: number of attention heads (r in the paper)
  - sequence_embed_dim: dimension of the sequence embedding
  - dropout: classification layer dropout
  - penalization_coefficient: penalty coefficient for the Frobenius norm
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below. {“sequence”: [0, 3, 4, 1]}
- Kwargs:
  - labels: label dictionary like below. {“class_idx”: 2, “data_idx”: 0}
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
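A minimal sketch of the forward contract; building the token_embedder and its vocabulary is omitted, and the exact tensor wrapping of the feature dictionary is an assumption here:

    import torch

    # `token_embedder` construction is omitted; index values are illustrative.
    model = StructuredSelfAttention(token_embedder, num_classes=5)

    features = {"sequence": torch.tensor([[0, 3, 4, 1]])}     # one token-id sequence
    labels = {"class_idx": torch.tensor([2]), "data_idx": torch.tensor([0])}

    output_dict = model(features, labels)   # training: output_dict includes "loss"
    output_dict = model(features)           # inference: loss is not computed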
Module contents¶
class claf.model.sequence_classification.BertForSeqCls(token_makers, num_classes, pretrained_model_name=None, dropout=0.2)[source]¶

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithoutTokenEmbedder

Implementation of the Sentence Classification model presented in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805).

- Args:
  - token_makers: used to embed the sequence
  - num_classes: number of classified classes
- Kwargs:
  - pretrained_model_name: the name of a pre-trained model
  - dropout: classification layer dropout
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below.
        {
            “bert_input”: {
                “feature”: [
                    [3, 4, 1, 0, 0, 0, …], …,
                ]
            },
            “token_type”: {
                “feature”: [
                    [0, 0, 0, 0, 0, 0, …], …,
                ],
            }
        }
- Kwargs:
  - labels: label dictionary like below.
        {
            “class_idx”: [2, 1, 0, 4, 5, …],
            “data_idx”: [2, 4, 5, 7, 2, 1, …]
        }
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
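A hedged sketch of the feature and label dictionaries for a batch of two sequences; the padding, token ids, and the exact tensor wrapping are assumptions for illustration, and model construction is omitted:

    import torch

    features = {
        "bert_input": {"feature": torch.tensor([[3, 4, 1, 0, 0, 0],
                                                [5, 8, 2, 9, 1, 0]])},
        "token_type": {"feature": torch.zeros(2, 6, dtype=torch.long)},
    }
    labels = {"class_idx": torch.tensor([2, 1]), "data_idx": torch.tensor([0, 1])}

    # `model` is assumed to be a BertForSeqCls instance.
    output_dict = model(features, labels)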
print_examples(index, inputs, predictions)[source]¶

Print evaluation examples.

- Args:
  - index: data index
  - inputs: mini-batch inputs
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
- Returns:
  - print(Sequence, Sequence Tokens, Target Class, Predicted Class)
-
class
claf.model.sequence_classification.
RobertaForSeqCls
(token_makers, num_classes, pretrained_model_name=None, dropout=0.2)[source]¶ Bases:
claf.model.sequence_classification.mixin.SequenceClassification
,claf.model.base.ModelWithoutTokenEmbedder
Implementation of Sentence Classification model presented in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805)
- Args:
token_embedder: used to embed the sequence num_classes: number of classified classes
- Kwargs:
pretrained_model_name: the name of a pre-trained model dropout: classification layer dropout
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below.
        {
            “bert_input”: {
                “feature”: [
                    [3, 4, 1, 0, 0, 0, …], …,
                ]
            },
        }
- Kwargs:
  - labels: label dictionary like below.
        {
            “class_idx”: [2, 1, 0, 4, 5, …],
            “data_idx”: [2, 4, 5, 7, 2, 1, …]
        }
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized
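The contract mirrors BertForSeqCls above, except that no “token_type” feature is passed; again, the tensor wrapping is an assumption and model construction is omitted:

    import torch

    features = {
        "bert_input": {"feature": torch.tensor([[3, 4, 1, 0, 0, 0]])},
    }
    # `model` is assumed to be a RobertaForSeqCls instance; with no labels,
    # no loss is computed (inference/predict mode).
    output_dict = model(features)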
print_examples(index, inputs, predictions)[source]¶

Print evaluation examples.

- Args:
  - index: data index
  - inputs: mini-batch inputs
  - predictions: prediction dictionary consisting of
    - key: ‘id’ (sequence id)
    - value: dictionary consisting of class_idx
- Returns:
  - print(Sequence, Sequence Tokens, Target Class, Predicted Class)
class claf.model.sequence_classification.StructuredSelfAttention(token_embedder, num_classes, encoding_rnn_hidden_dim=300, encoding_rnn_num_layer=2, encoding_rnn_dropout=0.0, attention_dim=350, num_attention_heads=30, sequence_embed_dim=2000, dropout=0.5, penalization_coefficient=1.0)[source]¶

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithTokenEmbedder

Implementation of the model presented in A Structured Self-attentive Sentence Embedding (https://arxiv.org/abs/1703.03130).

- Args:
  - token_embedder: used to embed the sequence
  - num_classes: number of classified classes
- Kwargs:
  - encoding_rnn_hidden_dim: hidden dimension of the RNN (unidirectional)
  - encoding_rnn_num_layer: the number of RNN layers
  - encoding_rnn_dropout: RNN dropout probability
  - attention_dim: attention dimension (d_a in the paper)
  - num_attention_heads: number of attention heads (r in the paper)
  - sequence_embed_dim: dimension of the sequence embedding
  - dropout: classification layer dropout
  - penalization_coefficient: penalty coefficient for the Frobenius norm
forward(features, labels=None)[source]¶

- Args:
  - features: feature dictionary like below. {“sequence”: [0, 3, 4, 1]}
- Kwargs:
  - labels: label dictionary like below. {“class_idx”: 2, “data_idx”: 0}
    Loss is not calculated when there is no label (inference/predict mode).
- Returns:
  - output_dict (dict) consisting of
    - sequence_embed: embedding vector of the sequence
    - logits: representing unnormalized log probabilities of the class
    - class_idx: target class idx
    - data_idx: data idx
    - loss: a scalar loss to be optimized