[DevLog] Overview
Less than 1 minute read
Topics to be covered someday:
Monitoring models in production, by Nvidia Tech Blog (link)
[LLM-RL] Lecture 1: MDP, Objective, Value Functions, and Imitation Learning (3 minute read) Overview. This post builds from the MDP framework to the RL objective and value functions, then contrasts pure RL with Imitation Learning (IL), focusing on B...
[Paper] Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning (HICRA) (less than 1 minute read) This is a brief review of “Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning (HICRA)”. You can see the paper at this link.
[Paper] REFRAG: Rethinking RAG‑based Decoding (less than 1 minute read) This is a brief review of “REFRAG: Rethinking RAG‑based Decoding”. You can see the paper at this link.
[Survey] Recent technical reports (less than 1 minute read) This is a collection of recent technical reports from several vendors, including Google DeepMind, x.AI, AllenAI, AI21Labs, Databricks, and HyperCLOVA.