claf.model.sequence_classification package

Submodules

class claf.model.sequence_classification.mixin.SequenceClassification[source]

Bases: object

Sequence Classification Mixin Class

make_metrics(predictions)[source]

Make metrics with prediction dictionary

  • Args:
    predictions: prediction dictionary consisting of
    • key: ‘id’ (sequence id)

    • value: dictionary consisting of
      • class_idx

  • Returns:
    metrics: metric dictionary consisting of
    • ‘macro_f1’: class prediction macro (unweighted mean) F1

    • ‘macro_precision’: class prediction macro (unweighted mean) precision

    • ‘macro_recall’: class prediction macro (unweighted mean) recall

    • ‘accuracy’: class prediction accuracy
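
For intuition, a minimal standalone sketch of what these macro (unweighted mean) metrics measure, using scikit-learn on hypothetical gold and predicted class indices (this is not CLaF's internal implementation):

    from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

    # Hypothetical gold labels and predicted class indices, for illustration only.
    y_true = [0, 2, 1, 1, 0]
    y_pred = [0, 2, 0, 1, 0]

    metrics = {
        "macro_f1": f1_score(y_true, y_pred, average="macro"),
        "macro_precision": precision_score(y_true, y_pred, average="macro"),
        "macro_recall": recall_score(y_true, y_pred, average="macro"),
        "accuracy": accuracy_score(y_true, y_pred),
    }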

make_predictions(output_dict)[source]

Make predictions with model’s output_dict

  • Args:
    output_dict: model’s output dictionary consisting of
    • sequence_embed: embedding vector of the sequence

    • logits: unnormalized log probabilities over the classes

    • class_idx: target class idx

    • data_idx: data idx

    • loss: a scalar loss to be optimized

  • Returns:
    predictions: prediction dictionary consisting of
    • key: ‘id’ (sequence id)

    • value: dictionary consisting of
      • class_idx
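
A minimal sketch of how such a predictions dictionary could be derived from an output_dict (illustrative only; it assumes batched torch tensors with the keys listed above and may differ from CLaF's exact code):

    # Sketch only -- assumes `output_dict` holds batched torch tensors
    # with the keys described above; not CLaF's exact implementation.
    logits = output_dict["logits"]          # (batch_size, num_classes)
    data_indices = output_dict["data_idx"]  # (batch_size,)

    pred_class_idx = logits.argmax(dim=-1)  # predicted class per example

    predictions = {
        data_id.item(): {"class_idx": class_idx.item()}
        for data_id, class_idx in zip(data_indices, pred_class_idx)
    }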

predict(output_dict, arguments, helper)[source]

Inference by raw_feature

  • Args:
    output_dict: model’s output dictionary consisting of
    • sequence_embed: embedding vector of the sequence

    • logits: unnormalized log probabilities over the classes.

    arguments: arguments dictionary consisting of user_input

    helper: dictionary used to get the classification result, consisting of

    • class_idx2text: dictionary converting class_idx to class_text

  • Returns: output dict (dict) consisting of
    • logits: unnormalized log probabilities over the classes

    • class_idx: predicted class idx

    • class_text: predicted class text
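
A sketch of the final lookup step, assuming a single-example output_dict and a hypothetical class_idx2text mapping (the label names below are placeholders, not taken from CLaF):

    # Illustrative only -- assumes `output_dict` comes from the model's forward pass.
    logits = output_dict["logits"]            # (1, num_classes)
    class_idx = logits.argmax(dim=-1).item()

    helper = {"class_idx2text": {0: "negative", 1: "positive"}}  # hypothetical mapping

    output_dict.update({
        "class_idx": class_idx,
        "class_text": helper["class_idx2text"][class_idx],
    })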

print_examples(index, inputs, predictions)[source]

Print evaluation examples

  • Args:

    index: data index

    inputs: mini-batch inputs

    predictions: prediction dictionary consisting of

    • key: ‘id’ (sequence id)

    • value: dictionary consisting of
      • class_idx

  • Returns:

    print(Sequence, Target Class, Predicted Class)

write_predictions(predictions, file_path=None, is_dict=True, pycm_obj=None)[source]

Override write_predictions() in ModelBase to log confusion matrix
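
The pycm_obj argument refers to a pycm ConfusionMatrix. A minimal standalone sketch of building such an object from hypothetical gold and predicted class indices (not tied to CLaF's call site):

    from pycm import ConfusionMatrix

    actual = [0, 1, 2, 1, 0]     # hypothetical gold class indices
    predicted = [0, 2, 2, 1, 0]  # hypothetical predicted class indices

    pycm_obj = ConfusionMatrix(actual_vector=actual, predict_vector=predicted)
    print(pycm_obj)  # prints the confusion matrix with per-class and overall statistics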

class claf.model.sequence_classification.structured_self_attention.StructuredSelfAttention(token_embedder, num_classes, encoding_rnn_hidden_dim=300, encoding_rnn_num_layer=2, encoding_rnn_dropout=0.0, attention_dim=350, num_attention_heads=30, sequence_embed_dim=2000, dropout=0.5, penalization_coefficient=1.0)[source]

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithTokenEmbedder

Implementation of model presented in A Structured Self-attentive Sentence Embedding (https://arxiv.org/abs/1703.03130)

  • Args:

    token_embedder: used to embed the sequence

    num_classes: number of classes to classify

  • Kwargs:

    encoding_rnn_hidden_dim: hidden dimension of the RNN (unidirectional)

    encoding_rnn_num_layer: number of RNN layers

    encoding_rnn_dropout: RNN dropout probability

    attention_dim: attention dimension (d_a in the paper)

    num_attention_heads: number of attention heads (r in the paper)

    sequence_embed_dim: dimension of the sequence embedding

    dropout: classification layer dropout

    penalization_coefficient: penalty coefficient for the Frobenius norm
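
Given the constructor signature above, a hedged instantiation sketch (token_embedder is a placeholder for a CLaF token embedder built elsewhere, and the numbers are arbitrary):

    # Hypothetical instantiation; `token_embedder` is assumed to be constructed
    # elsewhere (e.g. from a CLaF data reader) and is only a placeholder here.
    model = StructuredSelfAttention(
        token_embedder=token_embedder,
        num_classes=3,
        num_attention_heads=30,  # r in the paper
        attention_dim=350,       # d_a in the paper
    )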

forward(features, labels=None)[source]
  • Args:

    features: feature dictionary like below. {“sequence”: [0, 3, 4, 1]}

  • Kwargs:

    label: label dictionary like below. {“class_idx”: 2, “data_idx”: 0}

    Loss is not calculated when no label is given (inference/predict mode).

  • Returns: output_dict (dict) consisting of
    • sequence_embed: embedding vector of the sequence

    • logits: unnormalized log probabilities over the classes.

    • class_idx: target class idx

    • data_idx: data idx

    • loss: a scalar loss to be optimized

penalty(attention)[source]
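
The penalization term in the referenced paper is the squared Frobenius norm ||A A^T - I||_F^2 of the attention matrix A. A minimal sketch of that computation, assuming attention has shape (batch_size, num_attention_heads, seq_len); it is not necessarily identical to CLaF's implementation:

    import torch

    def frobenius_penalty(attention: torch.Tensor) -> torch.Tensor:
        # attention: (batch_size, num_attention_heads, seq_len)
        batch_size, num_heads, _ = attention.size()
        identity = torch.eye(num_heads, device=attention.device).expand(batch_size, num_heads, num_heads)
        diff = torch.bmm(attention, attention.transpose(1, 2)) - identity
        # Squared Frobenius norm per example, averaged over the batch.
        return (diff ** 2).sum(dim=(1, 2)).mean()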

Module contents

class claf.model.sequence_classification.BertForSeqCls(token_makers, num_classes, pretrained_model_name=None, dropout=0.2)[source]

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithoutTokenEmbedder

Implementation of Sentence Classification model presented in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805)

  • Args:

    token_makers: used to embed the sequence

    num_classes: number of classes to classify

  • Kwargs:

    pretrained_model_name: the name of a pre-trained model

    dropout: classification layer dropout
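
Given the constructor signature above, a hedged instantiation sketch (token_makers is a placeholder built elsewhere in CLaF, and the pretrained model name below is a hypothetical example following the common transformers naming convention):

    # Hypothetical instantiation; `token_makers` and the model name are placeholders.
    model = BertForSeqCls(
        token_makers=token_makers,
        num_classes=5,
        pretrained_model_name="bert-base-uncased",
        dropout=0.2,
    )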

forward(features, labels=None)[source]
  • Args:

    features: feature dictionary like below.

    {
        “bert_input”: {
            “feature”: [
                [3, 4, 1, 0, 0, 0, …],
                …,
            ]
        },
        “token_type”: {
            “feature”: [
                [0, 0, 0, 0, 0, 0, …],
                …,
            ],
        }
    }

  • Kwargs:

    label: label dictionary like below.

    {
        “class_idx”: [2, 1, 0, 4, 5, …],
        “data_idx”: [2, 4, 5, 7, 2, 1, …]
    }

    Loss is not calculated when no label is given (inference/predict mode).

  • Returns: output_dict (dict) consisting of
    • sequence_embed: embedding vector of the sequence

    • logits: unnormalized log probabilities over the classes.

    • class_idx: target class idx

    • data_idx: data idx

    • loss: a scalar loss to be optimized
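
Putting the pieces together, a hedged sketch of how forward(), make_predictions(), and make_metrics() could be chained during evaluation (model and batches are placeholders, not CLaF API guarantees):

    # Hypothetical evaluation loop; `model` and `batches` are placeholders.
    all_predictions = {}
    for features, labels in batches:
        output_dict = model(features, labels=labels)
        all_predictions.update(model.make_predictions(output_dict))

    metrics = model.make_metrics(all_predictions)
    print(metrics["macro_f1"], metrics["accuracy"])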

print_examples(index, inputs, predictions)[source]

Print evaluation examples

  • Args:

    index: data index

    inputs: mini-batch inputs

    predictions: prediction dictionary consisting of

    • key: ‘id’ (sequence id)

    • value: dictionary consisting of
      • class_idx

  • Returns:

    print(Sequence, Sequence Tokens, Target Class, Predicted Class)

class claf.model.sequence_classification.RobertaForSeqCls(token_makers, num_classes, pretrained_model_name=None, dropout=0.2)[source]

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithoutTokenEmbedder

Implementation of Sentence Classification model presented in RoBERTa: A Robustly Optimized BERT Pretraining Approach (https://arxiv.org/abs/1907.11692)

  • Args:

    token_makers: used to embed the sequence

    num_classes: number of classes to classify

  • Kwargs:

    pretrained_model_name: the name of a pre-trained model

    dropout: classification layer dropout

forward(features, labels=None)[source]
  • Args:

    features: feature dictionary like below.

    {
        “bert_input”: {
            “feature”: [
                [3, 4, 1, 0, 0, 0, …],
                …,
            ]
        },
    }

  • Kwargs:

    label: label dictionary like below.

    {
        “class_idx”: [2, 1, 0, 4, 5, …],
        “data_idx”: [2, 4, 5, 7, 2, 1, …]
    }

    Loss is not calculated when no label is given (inference/predict mode).

  • Returns: output_dict (dict) consisting of
    • sequence_embed: embedding vector of the sequence

    • logits: unnormalized log probabilities over the classes.

    • class_idx: target class idx

    • data_idx: data idx

    • loss: a scalar loss to be optimized

print_examples(index, inputs, predictions)[source]

Print evaluation examples

  • Args:

    index: data index

    inputs: mini-batch inputs

    predictions: prediction dictionary consisting of

    • key: ‘id’ (sequence id)

    • value: dictionary consisting of
      • class_idx

  • Returns:

    print(Sequence, Sequence Tokens, Target Class, Predicted Class)

class claf.model.sequence_classification.StructuredSelfAttention(token_embedder, num_classes, encoding_rnn_hidden_dim=300, encoding_rnn_num_layer=2, encoding_rnn_dropout=0.0, attention_dim=350, num_attention_heads=30, sequence_embed_dim=2000, dropout=0.5, penalization_coefficient=1.0)[source]

Bases: claf.model.sequence_classification.mixin.SequenceClassification, claf.model.base.ModelWithTokenEmbedder

Implementation of model presented in A Structured Self-attentive Sentence Embedding (https://arxiv.org/abs/1703.03130)

  • Args:

    token_embedder: used to embed the sequence

    num_classes: number of classes to classify

  • Kwargs:

    encoding_rnn_hidden_dim: hidden dimension of the RNN (unidirectional)

    encoding_rnn_num_layer: number of RNN layers

    encoding_rnn_dropout: RNN dropout probability

    attention_dim: attention dimension (d_a in the paper)

    num_attention_heads: number of attention heads (r in the paper)

    sequence_embed_dim: dimension of the sequence embedding

    dropout: classification layer dropout

    penalization_coefficient: penalty coefficient for the Frobenius norm

forward(features, labels=None)[source]
  • Args:

    features: feature dictionary like below. {“sequence”: [0, 3, 4, 1]}

  • Kwargs:

    label: label dictionary like below. {“class_idx”: 2, “data_idx”: 0}

    Loss is not calculated when no label is given (inference/predict mode).

  • Returns: output_dict (dict) consisting of
    • sequence_embed: embedding vector of the sequence

    • logits: unnormalized log probabilities over the classes.

    • class_idx: target class idx

    • data_idx: data idx

    • loss: a scalar loss to be optimized

penalty(attention)[source]