A curated 18-month learning roadmap for becoming an AI Speech Engineer — covering foundations, core technologies (ASR, TTS, Speaker Verification, Diarization, Voice Conversion), and the latest Audio Language Models, distilled from 6 years of hands-on experience.
An analysis of why language models hallucinate — hallucinations arise from statistical pressures in training and evaluation procedures that reward guessing over acknowledging uncertainty.
A guide to writing technical content with the Academic theme — highlighting code snippets, rendering math equations, and drawing diagrams from text.
Speaker Diarization answers “Who spoke when?” — covering core concepts, traditional and modern end-to-end approaches, and the latest Sortformer model for speaker segmentation.
Understanding entropy and why it’s a core concept in decision trees, neural networks, and loss functions like cross-entropy.
Exploring LoRA-Whisper, a scalable and efficient approach for multilingual ASR using Low-Rank Adaptation to fine-tune OpenAI’s Whisper model while avoiding catastrophic forgetting across languages.
FlashAttention is a groundbreaking optimization technique for computing attention in Transformer models, drastically improving GPU memory efficiency by restructuring the inner and outer loops of the attention computation.
An overview of adversarial attacks on large language models (LLMs) — how manipulated inputs can deceive models into generating harmful or incorrect outputs, covering key attack types, implications, and defense strategies.
A detailed summary of the GLiNER paper, introducing a lightweight, scalable, and highly effective model for open-type named entity recognition using bidirectional transformers with zero-shot generalization.
A curated collection of bash functions, troubleshooting commands, and performance tweaks that I often use in my daily workflow.