Streaming Map-Filter-Reduce in Go

The classic map-filter-reduce pattern processes data one stage at a time. For example, a map-filter-collect over an array of numbers first transforms every number, then keeps only the ones that meet a condition, and finally collects the results. In real-time applications, however, we often need data to be processed as it becomes available. Imagine an LLM agent that summarizes paragraphs from search results one by one. Instead of waiting for every paragraph to finish, it can start streaming summaries to the frontend immediately, giving users faster, progressive feedback. ...

Hybrid Search & Chunk Stitching: Advanced RAG with Custom-GPT Actions in Go

Custom GPT Actions let us expose an API that ChatGPT calls on behalf of users. We build the backend, and ChatGPT handles the frontend: web, mobile, voice, message history, auth. But a production-grade RAG backend needs more than basic vector search. This article covers building a high-performance Custom GPT in Go, along with the interesting parts: hybrid search with Reciprocal Rank Fusion, chunk stitching for contiguous context, query-focused summarization with SLMs, and decoupled embedding strategies for A/B testing. ...