Blog

Technical notes on machine learning, GPU programming, and security.

Notes on Serving LLMs with TensorRT-LLM and Triton — 2026-05-31 · LLM serving / NVIDIA stack
Where tensor-parallel inference hits the NVLink wall — 2026-05-31 · GPU / distributed systems
0 % vs 40 %: making a RAG agent refuse to hallucinate — 2026-05-31 · LLM / RAG
Notes on Federated Learning and Differential Privacy — 2026-05-31 · privacy-preserving ML
Notes on CUDA Tensor Core GEMM (WMMA) — 2026-05-31 · CUDA / GPU kernels
Meta-Reinforcement Learning of Structured Exploration Strategies — 2025-01-11 · machine learning
Introduction to CUDA Programming — 2025-01-11 · GPU / parallel programming
Introductory Penetration Testing Techniques — 2025-01-10 · cybersecurity