[Survey] Recent technical reports
This is a collection of recent technical reports from several vendors, including Google DeepMind, x.AI, AllenAI, AI21Labs, Databricks, and HyperCLOVA.
Most of the reports include extensive evaluations of diverse foundation LLMs, along with observations such as scaling laws.
Foundation LLM
- HyperCLOVA X Technical Report(arxiv)
- Stable Code Technical Report (arxiv)
- [AI21Labs] Jamba: A Hybrid Transformer-Mamba Language Model(arxiv)
- [DeepMind] Gecko: Versatile Text Embeddings Distilled from Large Language Models(arxiv)
- [X.ai] Grok-1.5(blog)
- [Databricks] DBRX(blog)
- [DeepMind] Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context(arxiv)
Evaluation
- [DeepMind] Evaluating Frontier Models for Dangerous Capabilities(arxiv)
- Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra
- Evaluating LLMs at Detecting Errors in LLM Responses
- [MS] Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning(arxiv)
- [Allen] Evaluating Reward Models for Language Modeling(arxiv)
- [DeepMind] Gemma: Open Models Based on Gemini Research and Technology(arxiv)