Transformer Networks

Transformer networks use self-attention mechanisms to handle sequential data without the recurrence found in RNNs. Because every position attends to every other position directly, they can capture long-range dependencies and parallelize training more efficiently.
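
To make the idea concrete, here is a minimal sketch of single-head scaled dot-product self-attention in NumPy. The function name `self_attention` and the randomly initialised projection matrices are illustrative assumptions, not part of any particular library; a real transformer layer would add multiple heads, masking, and learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence.

    x:             (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) projection matrices (random here, for illustration)
    """
    q = x @ w_q                      # queries
    k = x @ w_k                      # keys
    v = x @ w_v                      # values
    d_k = q.shape[-1]
    # Every position attends to every other position in one matrix product,
    # which removes the step-by-step recurrence of an RNN and lets the whole
    # sequence be processed in parallel.
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len) attention logits
    weights = softmax(scores, axis=-1)
    return weights @ v               # (seq_len, d_k) context vectors

# Usage: a toy sequence of 5 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 8, 4
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 4)
```

The attention weights form a (seq_len, seq_len) matrix, so any token can draw information from any other token in a single step, which is why long-range dependencies are easier to capture than with recurrence.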
Author: Benedict Thekkel
