Transformer

Oct 12, 2025

https://www.bilibili.com/video/BV1rt421476q/?spm_id_from=333.1387.collection.video_card.click&vd_source=bc07d988d4ccb4ab77470cec6bb87b69

Motivation

slow to train

short memory

vanishing gradient

slower to train

short memory

vanishing gradient

Transformer Attention

Task: machine translation

input
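As a rough illustration of the attention step named in this section, here is a minimal sketch of scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, in plain Python. The function names and the toy token vectors are assumptions made for this example and do not come from the linked video. A real model would also apply learned projections, multiple heads, and masking, all of which are omitted here.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of vectors.

    Returns (outputs, weights): one output vector and one row of
    attention weights per query.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        out = [
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs, weights


# Toy "sentence" of three token vectors (hypothetical values).
# Self-attention uses the same vectors as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of all token vectors. Because every position attends to every other position in a single step, there is no recurrence to unroll through the sequence.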

Encoder

to → hidden layer

inference

Decoder

BERT Vs GPT

BERT: Oct 2018

GPT-1: Jun 2018

