See also
[Paper Review] TRPO and PPO
2 min read
🧠 Paper Review: TRPO & PPO
[Paper] Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Less than 1 min read
This is a brief review of “Mamba: Linear-Time Sequence Modeling with Selective State Spaces”. You can see the paper at this paper link.
[SSM] Modeling Sequences with Structured State Spaces - Part I
10 min read
1.1 Deep sequence model Definition 1.1 (Informal). We use sequence model to refer to a parameterized map on sequences $y=f_\theta(x)$ where inputs and o...
[LLM-RL] Lecture 2: Value Functions
2 min read
Overview. This post focuses on Value Functions in reinforcement learning. We begin with a quick recap of the RL setup, then introduce state-value / action-va...