Posts

Ten Noteworthy AI Research Papers of 2023

This year has felt distinctly different. I've been working in, on, and with m...

Understanding and Coding Self-Attention, Multi-Head ...

This article will teach you about self-attention mechanisms used in transform...

Model Merging, Mixtures of Experts, and Towards Smal...

Model Merging, Mixtures of Experts, and Towards Smaller LLMs

Improving LoRA: Implementing Weight-Decomposed Low-R...

Low-rank adaptation (LoRA) is a machine learning technique that modifies a pr...

Tips for LLM Pretraining and Evaluating Reward Models

Discussing AI Research Papers in March 2024

Research Papers in February 2024: A LoRA Successor, ...

Once again, this has been an exciting month in AI research. This month, I'm c...

Using and Finetuning Pretrained Transformers

What are the different ways to use and finetune pretrained large language mod...

How Good Are the Latest Open LLMs? And Is DPO Better...

Discussing the Latest Model Releases and AI Research in April 2024

LLM Research Insights: Instruction Masking and New L...

Discussing the Latest Model Releases and AI Research in May 2024

Developing an LLM: Building, Training, Finetuning

A Deep Dive into the Lifecycle of LLM Development

Instruction Pretraining LLMs

The Latest Research in Instruction Finetuning

New LLM Pre-training and Post-training Paradigms

A Look at How Modern LLMs Are Trained

Building LLMs from the Ground Up: A 3-hour Coding Wo...

If your weekend plans include catching up on AI developments and understandin...

GeForce NOW to Bring ‘Dead Rising Deluxe Remaster’ t...

Rise and shine — Capcom’s latest action-adventure game, Dead Rising Deluxe Re...

Extend Viva Connections with pre-built 3rd party Ada...

Viva Connections is the gateway for the employee experience and provides an e...

Announcing the General Availability of Java experien...

Azure Container Apps is a fully managed, serverless container platform that e...