admin

Member since Sep 06, 2024

AI leaderboards are no longer useful. It's time to s...

What spending $2,000 can tell us about evaluating AI agents

Scientists should use AI as a tool, not an oracle

How AI hype leads to flawed research that fuels more hype

AI scaling myths

Scaling will run out. The question is when.

New paper: AI agents that matter

Rethinking AI agent benchmarking and evaluation

AI existential risk probabilities are too unreliable...

How speculation gets laundered through pseudo-quantification

ML/AI Platform Build vs Buy Decision: What Factors t...

An ML/AI platform provides a coherent collection of tools and frameworks to b...

Introducing Redesigned Navigation, Run Groups, Repor...

We’ve been working on these improvements for quite some time, so it’s excitin...

How to Migrate From MLflow to Neptune

MLflow is a framework widely used for its experiment-tracking capabilities, b...

Building LLM Applications With Vector Databases

As a Machine Learning Engineer working with many companies, I repeatedly enco...

Adversarial Machine Learning: Defense Strategies

The growing prevalence of ML models in business-critical applications results...

3 Takes on End-to-End For the MLOps Stack: Was It Wo...

As machine learning (ML) drives innovation across industries, organizations s...

LLM Observability: Fundamentals, Practices, and Tools

Large Language Models (LLMs) have become the driving force behind AI-powered ...

Observability in LLMOps: Different Levels of Scale

Observability is invaluable in LLMOps. Whether we’re talking about pretrainin...

LLM Evaluation For Text Summarization

Text summarization is a prime use case of LLMs (Large Language Models). It ai...

Strategies For Effective Prompt Engineering

When I first delved into machine learning, prompt engineering seemed like a n...

LLM For Structured Data

It is estimated that 80% to 90% of the data worldwide is unstructured. Howeve...