claf.tokens.embedding package

Submodules

class claf.tokens.embedding.base.TokenEmbedding(vocab)[source]

Bases: torch.nn.modules.module.Module

Token Embedding

It can be an embedding matrix, a language model (ELMo), a neural machine translation model (CoVe), or sparse features.

  • Args:

    vocab: Vocab (claf.tokens.vocab)

forward(tokens)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

get_vocab_size()[source]
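
A minimal sketch of how a custom token embedding might subclass TokenEmbedding. The subclass name, the embed_dim default, and the assumption that get_vocab_size() returns the vocabulary size are illustrative, not part of the documented API.

    import torch.nn as nn

    from claf.tokens.embedding.base import TokenEmbedding


    class RandomLookupEmbedding(TokenEmbedding):
        """Hypothetical subclass: a plain trainable lookup table."""

        def __init__(self, vocab, embed_dim=50):
            super().__init__(vocab)
            self.embed_dim = embed_dim
            # get_vocab_size() is provided by the base class
            # (assumed here to return the vocabulary size).
            self.lookup = nn.Embedding(self.get_vocab_size(), embed_dim)

        def forward(self, tokens):
            # embedding look-up: (batch, seq_len) -> (batch, seq_len, embed_dim)
            return self.lookup(tokens)

        def get_output_dim(self):
            return self.embed_dim
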
class claf.tokens.embedding.bert_embedding.BertEmbedding(vocab, pretrained_model_name=None, trainable=False, unit='subword')[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

BERT Embedding (Encoder)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    pretrained_model_name: …
    use_as_embedding: …
    trainable: Finetune or fixed

forward(inputs)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

remove_cls_sep_token(inputs, outputs)[source]
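
A hedged usage sketch for BertEmbedding. The vocab placeholder, the pretrained model name string, and the shape of the subword-id tensor passed to forward() are assumptions; only the constructor signature and methods above come from the API.

    import torch

    from claf.tokens.embedding.bert_embedding import BertEmbedding

    vocab = ...  # a pre-built claf.tokens.vocab.Vocab (construction not shown here)
    bert = BertEmbedding(vocab, pretrained_model_name="bert-base-uncased", trainable=False)

    # Assumption: `inputs` is a LongTensor of subword ids, shape (batch, seq_len).
    inputs = torch.zeros(2, 16, dtype=torch.long)
    contextual = bert(inputs)      # embedding look-up through the BERT encoder
    print(bert.get_output_dim())   # hidden size of the chosen pretrained model
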
class claf.tokens.embedding.char_embedding.CharEmbedding(vocab, dropout=0.2, embed_dim=16, kernel_sizes=[5], num_filter=100, activation='relu')[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Character Embedding (CharCNN) (https://arxiv.org/abs/1509.01626)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    embed_dim: The embedding dimension
    kernel_sizes: The list of kernel sizes (n-grams)
    num_filter: The number of CNN filters
    activation: Activation function (e.g. ReLU)

forward(chars)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

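A hedged usage sketch for CharEmbedding. The vocab placeholder and the (batch, seq_len, word_len) layout of the character-id tensor are assumptions; the keyword arguments mirror the signature above.

    import torch

    from claf.tokens.embedding.char_embedding import CharEmbedding

    vocab = ...  # a character-level claf.tokens.vocab.Vocab built elsewhere
    char_embed = CharEmbedding(vocab, embed_dim=16, kernel_sizes=[5], num_filter=100)

    # Assumption: `chars` holds character ids with shape (batch, seq_len, word_len).
    chars = torch.zeros(2, 10, 12, dtype=torch.long)
    token_features = char_embed(chars)     # CharCNN features per token
    print(char_embed.get_output_dim())     # presumably num_filter * len(kernel_sizes)
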
class claf.tokens.embedding.cove_embedding.CoveEmbedding(vocab, glove_pretrained_path=None, model_pretrained_path=None, dropout=0.2, trainable=False, project_dim=None)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

CoVe Embedding

Learned in Translation: Contextualized Word Vectors (http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.pdf)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    pretrained_path: Pretrained vector path (e.g. GloVe)
    trainable: finetune or fixed
    project_dim: The projection (linear) dimension

forward(words)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

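A hedged construction sketch for CoveEmbedding. Both file paths and the vocab placeholder are illustrative; CoVe requires GloVe vectors plus the pretrained NMT encoder weights referenced by the signature above.

    from claf.tokens.embedding.cove_embedding import CoveEmbedding

    vocab = ...  # a word-level claf.tokens.vocab.Vocab built elsewhere
    cove = CoveEmbedding(
        vocab,
        glove_pretrained_path="path/to/glove.txt",         # placeholder path
        model_pretrained_path="path/to/cove_weights.pth",  # placeholder path
        trainable=False,
        project_dim=None,
    )
    # forward(words) performs the look-up; `words` is assumed to be a LongTensor
    # of word ids with shape (batch, seq_len).
    print(cove.get_output_dim())
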
class claf.tokens.embedding.elmo_embedding.ELMoEmbedding(vocab, options_file='elmo_2x4096_512_2048cnn_2xhighway_options.json', weight_file='elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5', do_layer_norm=False, dropout=0.5, trainable=False, project_dim=None)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

ELMo Embedding (Embeddings from Language Models)

Deep contextualized word representations (https://arxiv.org/abs/1802.05365)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    options_file: ELMo model options (config) file path
    weight_file: ELMo model weight file path
    do_layer_norm: Whether to apply layer normalization (passed to ScalarMix). Default is False.
    dropout: The dropout probability
    trainable: finetune or fixed
    project_dim: The projection (linear) dimension

forward(chars)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

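A hedged construction sketch for ELMoEmbedding. The default options/weight file names from the signature are kept as-is and must exist locally (or be replaced with valid paths); the vocab placeholder is an assumption.

    from claf.tokens.embedding.elmo_embedding import ELMoEmbedding

    vocab = ...  # a claf.tokens.vocab.Vocab built elsewhere
    elmo = ELMoEmbedding(
        vocab,
        options_file="elmo_2x4096_512_2048cnn_2xhighway_options.json",
        weight_file="elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5",
        dropout=0.5,
        trainable=False,
    )
    # forward(chars) takes character-id inputs (as produced by an ELMo-style
    # character indexer) and returns contextualized word representations.
    print(elmo.get_output_dim())
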
class claf.tokens.embedding.frequent_word_embedding.FrequentTuningWordEmbedding(vocab, dropout=0.2, embed_dim=100, padding_idx=None, max_norm=None, norm_type=2, scale_grad_by_freq=False, sparse=False, pretrained_path=None)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Frequent Word Fine-tuning Embedding: fine-tunes the embedding matrix according to ‘threshold_index’

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    embed_dim: The embedding dimension
    padding_idx: If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index
    max_norm: If given, renormalizes the embedding vectors to have a norm less than this value before extracting. Note: this modifies weight in-place.
    norm_type: The p of the p-norm to compute for the max_norm option. Default 2.
    scale_grad_by_freq: If given, scales gradients by the inverse of the frequency of the words in the mini-batch. Default False.
    sparse: If True, the gradient w.r.t. weight will be a sparse tensor. See the Notes under torch.nn.Embedding for details on sparse gradients.
    pretrained_path: Pretrained vector path (e.g. GloVe)
    trainable: finetune or fixed

forward(words, frequent_tuning=False)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

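A hedged usage sketch for FrequentTuningWordEmbedding. The vocab placeholder, the pretrained path, and the interpretation of frequent_tuning are assumptions; the flag presumably switches on the threshold_index-based fine-tuning described above.

    import torch

    from claf.tokens.embedding.frequent_word_embedding import FrequentTuningWordEmbedding

    vocab = ...  # a claf.tokens.vocab.Vocab (with a threshold_index) built elsewhere
    embed = FrequentTuningWordEmbedding(
        vocab,
        embed_dim=100,
        pretrained_path="path/to/glove.txt",  # placeholder path
    )

    words = torch.zeros(2, 7, dtype=torch.long)  # placeholder word ids (batch, seq_len)
    out = embed(words, frequent_tuning=True)     # look-up with frequent-word fine-tuning enabled
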
class claf.tokens.embedding.sparse_feature.OneHotEncoding(index, token_name, classes)[source]

Bases: torch.nn.modules.module.Module

Sparse to one-hot encoding

  • Args:

    vocab: Vocab (claf.tokens.vocab)

forward(inputs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_output_dim()[source]
class claf.tokens.embedding.sparse_feature.SparseFeature(vocab, embed_type, feature_count, params={})[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Sparse Feature

  1. Sparse to Embedding

  2. One Hot Encoding

  • Args:

    vocab: Vocab (claf.tokens.vocab)
    embed_type: The type of embedding [one_hot|embedding]
    feature_count: The number of features

  • Kwargs:

    params: additional parameters for embedding module

forward(inputs)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

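A hedged usage sketch for SparseFeature, which dispatches to either one-hot encoding or a small embedding table depending on embed_type. The vocab placeholder, the feature_count value, and the params contents are assumptions.

    from claf.tokens.embedding.sparse_feature import SparseFeature

    vocab = ...  # a claf.tokens.vocab.Vocab built elsewhere

    # One-hot mode: each sparse feature becomes a fixed one-hot vector.
    one_hot_feature = SparseFeature(vocab, embed_type="one_hot", feature_count=3)

    # Embedding mode: each sparse feature gets a trainable dense vector;
    # `params` is forwarded to the underlying embedding module (key name assumed).
    dense_feature = SparseFeature(
        vocab, embed_type="embedding", feature_count=3, params={"embed_dim": 15}
    )
    print(one_hot_feature.get_output_dim(), dense_feature.get_output_dim())
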
class claf.tokens.embedding.sparse_feature.SparseToEmbedding(index, token_name, classes, dropout=0, embed_dim=15, trainable=True, padding_idx=None, max_norm=None, norm_type=2, scale_grad_by_freq=False, sparse=False)[source]

Bases: torch.nn.modules.module.Module

Sparse to Embedding

  • Args:

    token_name: token_name

  • Kwargs:

    dropout: The dropout probability
    embed_dim: The embedding dimension
    padding_idx: If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index
    max_norm: If given, renormalizes the embedding vectors to have a norm less than this value before extracting. Note: this modifies weight in-place.
    norm_type: The p of the p-norm to compute for the max_norm option. Default 2.
    scale_grad_by_freq: If given, scales gradients by the inverse of the frequency of the words in the mini-batch. Default False.
    sparse: If True, the gradient w.r.t. weight will be a sparse tensor. See the Notes under torch.nn.Embedding for details on sparse gradients.
    pretrained_path: Pretrained vector path (e.g. GloVe)
    trainable: finetune or fixed

forward(inputs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_output_dim()[source]
class claf.tokens.embedding.word_embedding.WordEmbedding(vocab, dropout=0.2, embed_dim=100, padding_idx=None, max_norm=None, norm_type=2, scale_grad_by_freq=False, sparse=False, pretrained_path=None, trainable=True)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Word Embedding (default token embedding)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    embed_dim: The embedding dimension
    padding_idx: If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index
    max_norm: If given, renormalizes the embedding vectors to have a norm less than this value before extracting. Note: this modifies weight in-place.
    norm_type: The p of the p-norm to compute for the max_norm option. Default 2.
    scale_grad_by_freq: If given, scales gradients by the inverse of the frequency of the words in the mini-batch. Default False.
    sparse: If True, the gradient w.r.t. weight will be a sparse tensor. See the Notes under torch.nn.Embedding for details on sparse gradients.
    pretrained_path: Pretrained vector path (e.g. GloVe)
    trainable: finetune or fixed

forward(words)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

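A usage sketch for WordEmbedding, the default token embedding. Only the vocab placeholder and the example tensor shapes are assumptions; the keyword arguments mirror the signature above.

    import torch

    from claf.tokens.embedding.word_embedding import WordEmbedding

    vocab = ...  # a claf.tokens.vocab.Vocab built elsewhere
    word_embed = WordEmbedding(
        vocab,
        embed_dim=100,
        dropout=0.2,
        pretrained_path=None,  # or a path to pretrained vectors such as GloVe
        trainable=True,
    )

    words = torch.zeros(2, 7, dtype=torch.long)   # word ids, shape (batch, seq_len)
    embedded = word_embed(words)                  # embedding look-up
    print(word_embed.get_output_dim())            # 100
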
Module contents

class claf.tokens.embedding.BertEmbedding(vocab, pretrained_model_name=None, trainable=False, unit='subword')[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

BERT Embedding (Encoder)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (https://arxiv.org/abs/1810.04805)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    pretrained_model_name: …
    use_as_embedding: …
    trainable: Finetune or fixed

forward(inputs)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

remove_cls_sep_token(inputs, outputs)[source]
class claf.tokens.embedding.CharEmbedding(vocab, dropout=0.2, embed_dim=16, kernel_sizes=[5], num_filter=100, activation='relu')[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Character Embedding (CharCNN) (https://arxiv.org/abs/1509.01626)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    embed_dim: The embedding dimension
    kernel_sizes: The list of kernel sizes (n-grams)
    num_filter: The number of CNN filters
    activation: Activation function (e.g. ReLU)

forward(chars)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

class claf.tokens.embedding.CoveEmbedding(vocab, glove_pretrained_path=None, model_pretrained_path=None, dropout=0.2, trainable=False, project_dim=None)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

CoVe Embedding

Learned in Translation: Contextualized Word Vectors (http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.pdf)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    pretrained_path: Pretrained vector path (e.g. GloVe)
    trainable: finetune or fixed
    project_dim: The projection (linear) dimension

forward(words)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

class claf.tokens.embedding.ELMoEmbedding(vocab, options_file='elmo_2x4096_512_2048cnn_2xhighway_options.json', weight_file='elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5', do_layer_norm=False, dropout=0.5, trainable=False, project_dim=None)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

ELMo Embedding (Embeddings from Language Models)

Deep contextualized word representations (https://arxiv.org/abs/1802.05365)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    options_file: ELMo model options (config) file path
    weight_file: ELMo model weight file path
    do_layer_norm: Whether to apply layer normalization (passed to ScalarMix). Default is False.
    dropout: The dropout probability
    trainable: finetune or fixed
    project_dim: The projection (linear) dimension

forward(chars)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

class claf.tokens.embedding.FrequentTuningWordEmbedding(vocab, dropout=0.2, embed_dim=100, padding_idx=None, max_norm=None, norm_type=2, scale_grad_by_freq=False, sparse=False, pretrained_path=None)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Frequent Word Fine-tuning Embedding: fine-tunes the embedding matrix according to ‘threshold_index’

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    embed_dim: The embedding dimension
    padding_idx: If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index
    max_norm: If given, renormalizes the embedding vectors to have a norm less than this value before extracting. Note: this modifies weight in-place.
    norm_type: The p of the p-norm to compute for the max_norm option. Default 2.
    scale_grad_by_freq: If given, scales gradients by the inverse of the frequency of the words in the mini-batch. Default False.
    sparse: If True, the gradient w.r.t. weight will be a sparse tensor. See the Notes under torch.nn.Embedding for details on sparse gradients.
    pretrained_path: Pretrained vector path (e.g. GloVe)
    trainable: finetune or fixed

forward(words, frequent_tuning=False)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

class claf.tokens.embedding.SparseFeature(vocab, embed_type, feature_count, params={})[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Sparse Feature

  1. Sparse to Embedding

  2. One Hot Encoding

  • Args:

    vocab: Vocab (claf.tokens.vocab)
    embed_type: The type of embedding [one_hot|embedding]
    feature_count: The number of features

  • Kwargs:

    params: additional parameters for embedding module

forward(inputs)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension

class claf.tokens.embedding.WordEmbedding(vocab, dropout=0.2, embed_dim=100, padding_idx=None, max_norm=None, norm_type=2, scale_grad_by_freq=False, sparse=False, pretrained_path=None, trainable=True)[source]

Bases: claf.tokens.embedding.base.TokenEmbedding

Word Embedding (default token embedding)

  • Args:

    vocab: Vocab (claf.tokens.vocab)

  • Kwargs:

    dropout: The dropout probability
    embed_dim: The embedding dimension
    padding_idx: If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index
    max_norm: If given, renormalizes the embedding vectors to have a norm less than this value before extracting. Note: this modifies weight in-place.
    norm_type: The p of the p-norm to compute for the max_norm option. Default 2.
    scale_grad_by_freq: If given, scales gradients by the inverse of the frequency of the words in the mini-batch. Default False.
    sparse: If True, the gradient w.r.t. weight will be a sparse tensor. See the Notes under torch.nn.Embedding for details on sparse gradients.
    pretrained_path: Pretrained vector path (e.g. GloVe)
    trainable: finetune or fixed

forward(words)[source]

embedding look-up

get_output_dim()[source]

get embedding dimension
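
As the package-level listing above shows, the same classes are re-exported from claf.tokens.embedding, so they can be imported without spelling out the submodule path:

    # Equivalent package-level imports of the classes documented above.
    from claf.tokens.embedding import (
        BertEmbedding,
        CharEmbedding,
        CoveEmbedding,
        ELMoEmbedding,
        FrequentTuningWordEmbedding,
        SparseFeature,
        WordEmbedding,
    )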