- Ready
Hermes Agent Memory: A Four-Layer L0–L2 Design
Hermes Agent's memory system: system-prompt assembly (L0), persistent memory (L1), external plugins (L1.5), session search (L2), and compression. No vector DB.
Research · updated May 24 · ai, agent, llm, memory, retrieval, search, sqlite, architecture
- Ready
Prompt Caching in Practice: Cache Design and defer_loading
Lessons from Claude Code's prompt caching: the philosophy behind cache design, how OpenAI / Anthropic / Google differ, and the defer_loading stub pattern.
Research · updated May 24 · ai, agent, llm, prompt, performance, software engineering
- Ready
Hermes Agent Memory: Engineering Safety Mechanisms
Six safety designs in Hermes memory_tool.py: injection scan, file lock, reload-under-lock, refuse-on-overflow, atomic write, and substring-match delete.
Research · updated May 4 · ai, agent, security, software engineering, concurrency
- Ready
Faiss vs Chroma: Vector Store Selection Tradeoffs
Notes on how Faiss, Chroma, and adjacent vector stores position themselves for RAG / vector retrieval, plus the tradeoffs across ANN index algorithms.
Research · updated May 20 · ai, llm, rag, retrieval, vector database, faiss, chroma, milvus, qdrant, weaviate, pgvector, ann
- Ready
RAG Retrieval Details and Pipeline Design
Notes on the key retrieval-side details of RAG — Embedding, Reranker, Chunking, Hybrid Search, Query Transformation, and more.
Research · ai, llm, rag, retrieval, embedding, reranker, reference
- Ready
Message Types in the Vercel AI SDK
Notes on how the Vercel AI SDK layers its Message types, its SSE streaming protocol, and practical state-management advice for real-world development.
Reference · updated Apr 10 · ai, llm, frontend, react, typescript, agent, reference
- Ready
Agent Routing and Cost Control in Multi-Agent Systems
Notes on the fundamentals of agent routing, the common ways to implement it, and how to think about cost control in multi-agent systems.
Research · updated Apr 4 · ai, agent, llm, multi-agent, orchestration, routing, workflow
- Ready
Modern Next.js Stack Tradeoffs: ORM, Auth, and State
Working through a Grok conversation to lay out the real tradeoffs and decision criteria for ORM, auth, and state management in a modern Next.js project.
Research · updated Mar 30 · frontend, typescript, react, software engineering, agent, reference
- Ready
Harness Engineering and Codex in Production
A practical write-up on how the OpenAI team built a production-grade project with nearly all of its code written by Codex.
Research · updated Mar 14 · ai, agent, codex, software engineering, workflow
- Ready
Jina Embeddings API: A Deep Dive
Working notes on Jina Embeddings — multilingual retrieval, long context, Late Chunking, and how to choose between v4 and v5.
Research · updated Apr 16 · ai, llm, rag, embedding, reranker, jina, qwen
- Ready
Defense-in-Depth Notes on Prompt Injection
A consolidation of the core ideas — and interview-ready answers — behind OpenAI's, Anthropic's, and common engineering defenses against prompt injection.
Research · updated Mar 14 · ai, agent, prompt, security, llm
📒
Notes
First-hand notes · always growing
A place for raw, first-hand notes: agent learning, engineering trade-offs (embeddings, vector databases, AI SDKs), and sharp interview questions. Not yet polished into posts — but closer to "what I'm thinking right now" than any finished article.