ptls.nn.seq_encoder
All classes from ptls.nn.seq_encoder are also available in ptls.nn.
ptls.nn.trx_encoder works with individual transactions.
ptls.nn.seq_encoder takes into account the sequential structure and the links between transactions.
There are 2 types of seq encoders:
- requires embeddings as input
- requires raw features as input
Embeddings as input
We implement ptls-api for torch and huggingface sequential layers:
- ptls.nn.RnnEncoder for torch.nn.GRU
- ptls.nn.TransformerEncoder for torch.nn.TransformerEncoder
- ptls.nn.LongformerEncoder for transformers.LongformerModel
They expect vectorized input, which can be obtained with TrxEncoder.
The output format is controlled by the is_reduce_sequence property. True means that the sequence will be reduced
to one single vector: the last hidden state for RNN and the CLS token output for transformer.
False means that all hidden vectors for all transactions will be returned. Set this property based on your needs.
It can be set during encoder initialisation or changed at runtime.
Simple example:
import torch
from ptls.nn import RnnEncoder
from ptls.data_load.padded_batch import PaddedBatch

x = PaddedBatch(torch.randn(10, 80, 4), torch.randint(40, 80, (10,)))
seq_encoder = RnnEncoder(input_size=4, hidden_size=16)
y = seq_encoder(x)
assert y.payload.size() == (10, 80, 16)
More complicated example:
import torch
from ptls.nn import TrxEncoder, RnnEncoder
from ptls.data_load.padded_batch import PaddedBatch

x = PaddedBatch(
payload={
'mcc_code': torch.randint(1, 10, (3, 8)),
'currency': torch.randint(1, 4, (3, 8)),
'amount': torch.randn(3, 8) * 4 + 5,
},
length=torch.Tensor([2, 8, 5]).long()
)
trx_encoder = TrxEncoder(
embeddings={
'mcc_code': {'in': 10, 'out': 6},
'currency': {'in': 4, 'out': 2},
},
numeric_values={'amount': 'identity'},
)
seq_encoder = RnnEncoder(input_size=trx_encoder.output_size, hidden_size=16)
z = trx_encoder(x)
y = seq_encoder(z)  # embeddings for each transaction
seq_encoder.is_reduce_sequence = True
h = seq_encoder(z) # embeddings for sequences, aggregate all transactions in one embedding
assert y.payload.size() == (3, 8, 16)
assert h.size() == (3, 16)
Usually seq_encoder is used with a preliminary trx_encoder. It's possible to pack them into torch.nn.Sequential.
It's possible to add more layers between trx_encoder and seq_encoder (linear, normalisation, convolutions, ...).
They should work with PaddedBatch. Examples will be presented later. Such layers also work after a seq_encoder
with is_reduce_sequence=False.
Features as input
As you can see, TrxEncoder works with raw features and is compatible with the embedding seq encoders.
We provide composition layers, each of which contains a TrxEncoder and one SeqEncoder implementation.
There are:
- ptls.nn.RnnSeqEncoder with RnnEncoder
- ptls.nn.TransformerSeqEncoder with TransformerEncoder
- ptls.nn.LongformerSeqEncoder with LongformerEncoder
They work like a simple Sequential(trx_encoder, seq_encoder) and support the is_reduce_sequence property.
The main advantage is that you can create such an encoder directly from a config file using hydra instantiate tools.
You can avoid setting seq_encoder.input_size explicitly; it will be taken from trx_encoder. Let's compare.
Sequential-style:
config = """
model:
_target_: torch.nn.Sequential
_args_:
-
_target_: ptls.nn.TrxEncoder
embeddings:
mcc_code:
in: 10
out: 6
currency:
in: 4
out: 2
numeric_values:
amount: identity
-
_target_: ptls.nn.RnnEncoder
input_size: 9 # depends on TrxEncoder output
hidden_size: 24
"""
import hydra.utils
from omegaconf import OmegaConf

model = hydra.utils.instantiate(OmegaConf.create(config))['model']
SeqEncoder-style:
config = """
model:
_target_: ptls.nn.RnnSeqEncoder
trx_encoder:
_target_: ptls.nn.TrxEncoder
embeddings:
mcc_code:
in: 10
out: 6
currency:
in: 4
out: 2
numeric_values:
amount: identity
hidden_size: 24
"""
model = hydra.utils.instantiate(OmegaConf.create(config))['model']
The second config is simpler. Both configs make an identical model. You can check:
x = PaddedBatch(
payload={
'mcc_code': torch.randint(1, 10, (3, 8)),
'currency': torch.randint(1, 4, (3, 8)),
'amount': torch.randn(3, 8) * 4 + 5,
},
length=torch.Tensor([2, 8, 5]).long()
)
y = model(x)
AggFeatureSeqEncoder
ptls.nn.AggFeatureSeqEncoder.
It looks like a seq_encoder: it takes raw features as input and provides a reduced representation at output.
This encoder creates features which are good for a boosting model. This is a strong baseline for many tasks.
AggFeatureSeqEncoder consumes the same input as the other seq_encoders, and it can easily be replaced
by an rnn or transformer seq encoder. It uses the gpu and works fast. It has no learnable parameters.
Possible pipeline:
seq_encoder = AggFeatureSeqEncoder(...)
agg_embeddings = trainer.predict(seq_encoder, dataloader)
catboost_model.fit(agg_embeddings, target)
We plan to split AggFeatureSeqEncoder into components which will be compatible with the other ptls layers.
It will then be possible to choose flexibly between TrxEncoder with AggSeqEncoder and OheEncoder with RnnEncoder.
Classes
See docstrings for classes.
Take trx embedding as input:
- ptls.nn.RnnEncoder
- ptls.nn.TransformerEncoder
- ptls.nn.LongformerEncoder
Take raw features as input:
- ptls.nn.RnnSeqEncoder
- ptls.nn.TransformerSeqEncoder
- ptls.nn.LongformerSeqEncoder
- ptls.nn.AggFeatureSeqEncoder