Blog
Technical notes on machine learning, GPU programming, and security.
- Notes on Serving LLMs with TensorRT-LLM and Triton — 2026-05-31 · LLM serving / NVIDIA stack
- Where tensor-parallel inference hits the NVLink wall — 2026-05-31 · GPU / distributed systems
- 0% vs 50%: making a RAG agent refuse to hallucinate — 2026-05-31 · LLM / RAG
- Notes on Federated Learning and Differential Privacy — 2026-05-31 · privacy-preserving ML
- Notes on CUDA Tensor Core GEMM (WMMA) — 2026-05-31 · CUDA / GPU kernels
- Meta-Reinforcement Learning of Structured Exploration Strategies — 2025-01-11 · machine learning
- Introduction to CUDA Programming — 2025-01-11 · GPU / parallel programming
- Introductory Penetration Testing Techniques — 2025-01-10 · cybersecurity