Published On Jan 21, 2024
All you need to know about the transformer architecture: How to structure the inputs, attention (Queries, Keys, Values), positional embeddings, residual connections. Bonus: an overview of the difference between Recurrent Neural Networks (RNNs) and transformers.
Erratum at 9:19: the order of multiplication should be the opposite: x1 (vector) * Wq (matrix) = q1 (vector). Otherwise we do not get the 1x3 dimensionality at the end. Sorry for messing up the animation!
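The corrected order can be checked in a few lines of NumPy. This is a minimal sketch with toy sizes (d_model = 4, d_k = 3 are illustrative, not the video's values): a row vector times the projection matrix gives the 1x3 query, while the reversed order would not even have compatible shapes.

```python
import numpy as np

d_model, d_k = 4, 3                 # toy dimensions for illustration
x1 = np.random.randn(1, d_model)    # token embedding as a row vector
Wq = np.random.randn(d_model, d_k)  # learned query projection

q1 = x1 @ Wq                        # (1 x 4) @ (4 x 3) -> (1 x 3)
print(q1.shape)
```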
Check this out for a super cool transformer visualisation! https://poloclub.github.io/transforme...
AI Coffee Break Merch! https://aicoffeebreak.creator-spring....
Outline:
00:00 Transformers explained
00:47 Text inputs
02:29 Image inputs
03:57 Next word prediction / Classification
06:08 The transformer layer: 1. MLP sublayer
06:47 2. Attention explained
07:57 Attention vs. self-attention
08:35 Queries, Keys, Values
09:19 Order of multiplication should be the opposite: x1 (vector) * Wq (matrix) = q1 (vector).
11:26 Multi-head attention
13:04 Attention scales quadratically
13:53 Positional embeddings
15:11 Residual connections and Normalization Layers
17:09 Masked Language Modelling
17:59 Difference from RNNs
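For readers following along with the attention chapters above, here is a minimal NumPy sketch of scaled dot-product attention over Queries, Keys, and Values (variable names are illustrative, not from the video). The (n x n) score matrix is also where the quadratic scaling mentioned at 13:04 comes from: every query attends to every key.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # (n x n): quadratic in sequence length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                       # weighted sum of value vectors

# Toy example: 5 tokens, head dimension 3
n, d_k = 5, 3
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d_k)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)
```

Multi-head attention (11:26) simply runs several such heads in parallel on separately projected Q, K, V and concatenates the results.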
Thanks to our Patrons who support us in Tiers 2, 3, and 4:
Dres. Trost GbR, Siltax, Vignesh Valliappan, @Mutual_Information, Kshitij
Our old Transformer explained video: • The Transformer neural network archit...
Tokenization explained: • What is tokenization and how does it ...
Word embeddings: • How modern search engines work – Vect...
Replacing Self-Attention: • Replacing Self-attention
Position embeddings: • Position encodings in Transformers ex...
@SerranoAcademy Transformer series: • The Attention Mechanism in Large Lang...
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention Is All You Need." Advances in Neural Information Processing Systems 30 (2017).
──────────────────────────
Optionally, buy us a coffee to help with our Coffee Bean production!
Patreon: /aicoffeebreak
Ko-fi: https://ko-fi.com/aicoffeebreak
──────────────────────────
Links:
AICoffeeBreakQuiz: /aicoffeebreak
Twitter: /aicoffeebreak
Reddit: /aicoffeebreak
YouTube: /aicoffeebreak
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Music šµ : Sunset n Beachz - Ofshane
Video editing: Nils Trost