admin

Member since Sep 06, 2024

Reinforcement Learning From Human Feedback (RLHF) Fo...

Reinforcement Learning from Human Feedback (RLHF) has turned out to be the ke...

Is the future of AI open or closed? Watch today’s Pr...

By Sayash Kapoor, Rishi Bommasani, Percy Liang, Arvind Narayanan Perhaps the ...

Evaluating LLMs is a minefield

Annotated slides from a recent talk

How Transparent Are Foundation Model Developers?

Introducing the Foundation Model Transparency Index

LLM Training: RLHF and Its Alternatives

I frequently reference a process called Reinforcement Learning with Human Fee...

From Self-Alignment to LongLoRA

Another month, another round of interesting research papers ranging from larg...

LLM Business and Busyness: Recent Company Investment...

Discussing Recent Company Investments and AI Adoption, New Small Openly Avail...

AI and Open Source in 2023

The Highs and Lows: A Year in Review

A Potential Successor to RLHF for Efficient LLM Alig...

From Vision Transformers to innovative large language model finetuning techni...

Practical Tips for Finetuning LLMs Using LoRA (Low-R...

Things I Learned From Hundreds of Experiments

Tackling Hallucinations, Boosting Reasoning Abilitie...

This month, I want to focus on three papers that address three distinct probl...

Ten Noteworthy AI Research Papers of 2023

This year has felt distinctly different. I've been working in, on, and with m...

Understanding and Coding Self-Attention, Multi-Head ...

This article will teach you about self-attention mechanisms used in transform...

Model Merging, Mixtures of Experts, and Towards Smal...

Model Merging, Mixtures of Experts, and Towards Smaller LLMs

Improving LoRA: Implementing Weight-Decomposed Low-R...

Low-rank adaptation (LoRA) is a machine learning technique that modifies a pr...

Research Papers in February 2024: A LoRA Successor, ...

Once again, this has been an exciting month in AI research. This month, I'm c...