Adversarial Attacks on Large Language Models (LLMs)
Adversarial attacks on large language models (LLMs) manipulate a model's input to deceive it into producing harmful, biased, or otherwise incorrect outputs. These attacks exploit a basic vulnerability: because LLMs generate responses from statistical patterns learned during training, carefully crafted inputs can push them toward behavior their designers did not intend.
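As an illustration, the sketch below shows the basic shape of a prompt-level attack: an adversarially chosen suffix is appended to a request in the hope of steering the model past its intended behavior. The `query_llm` function and the example suffix are hypothetical placeholders for illustration only, not a real API or a working attack.

```python
# Minimal sketch of a prompt-level adversarial attack on an LLM.
# `query_llm` is a hypothetical stand-in for a real model API; the
# suffix below is an illustrative placeholder, not a working exploit.

def query_llm(prompt: str) -> str:
    """Placeholder for an actual LLM call (e.g., an HTTP API or local model)."""
    return f"[model response to: {prompt!r}]"

def adversarial_prompt(request: str, suffix: str) -> str:
    """Append an adversarially chosen suffix to the original request.

    In real attacks, the suffix is searched or optimized so that the
    model becomes more likely to produce the attacker's desired output.
    """
    return f"{request} {suffix}"

if __name__ == "__main__":
    benign = "Summarize this article."
    target = "A request the model would normally refuse or answer differently."
    suffix = "<optimized adversarial tokens would go here>"

    # The attacker sends the perturbed input instead of the original one,
    # exploiting the model's sensitivity to surface patterns in its input.
    print(query_llm(benign))
    print(query_llm(adversarial_prompt(target, suffix)))
```

The key point the sketch captures is that the attacker never touches the model itself; only the input is perturbed, which is why these attacks can be mounted against black-box, API-only deployments.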