Skip to content

Four Signals

Agentic insights for modern tech teams

pg_plan_advice — help the planner get the right plan
General / postgresql.org

pg_plan_advice — help the planner get the right plan

PostgreSQL 19's pg_plan_advice module provides a mini-language (e.g., JOIN_ORDER, HASH_JOIN) to describe and enforce query plans by constraining the planner's choices, not replacing it. Developers can obtain advice strings via EXPLAIN with the PLAN_ADVICE option and apply them using the pg_plan_advice.advice setting. While useful for stabilizing optimal plans, overriding the planner risks performance degradation if underlying data distributions change, as the planner can no longer adapt.

Why it matters

For a solutions architect focused on data engineering and PostgreSQL, this module offers a new lever to enforce query plan stability for critical workloads, but demands careful monitoring to avoid performance regressions from data shifts.

General / strebkov.dev

Shard your locks: benchmarking 6 Go cache designs

Benchmarked across 1-8 cores with 1M keys on a pinned i7-14700K, a 256-shard striped map (sharded) achieves 6.9x scaling with per-shard mutexes in about 15 lines of code, outperforming a single sync.Mutex by up to 8x. sync.RWMutex plateaus at 2x scaling and is slower for writes, while sync.Map shows great slope but poor absolute throughput. Copy-on-write delivers 87 Mops/s lock-free reads but zero writes due to copying the entire map per set—value size is irrelevant because Go strings are immutable.

Asian AI startups launch Mythos-like  models as Anthropic’s export ban drags on
AI/ML / techcrunch.com

Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

Two Asian AI startups launched models competing with Anthropic's banned Mythos and Fable 5: China's 360 unveiled Tulongfeng for automated vulnerability discovery, while Tokyo's Sakana AI released Fugu, an orchestration model coordinating multiple models via APIs to offer frontier capability without export control risks.

I Built a Dual-Pool Adversarial Review System for AI Agents — And It Actually Works
AI/ML / dev.to

I Built a Dual-Pool Adversarial Review System for AI Agents — And It Actually Works

A dual-pool adversarial review system for AI agents uses fixed digital-twin personas (e.g., Patty McCord, Ed Catmull) and web-sourced random reviewers (e.g., Joel Spolsky) who cite specific principles—not generic roles—to produce grounded code feedback. The system's self-review on its own skill file discovered 16 issues, including a quote-retrofitting loophole and a broken file reference, all fixed in the live PR. Tested on a 18.7K-star PR (alirezarezvani/claude-skills), the random pool caught blind spots missed by fixed-pool reviewers, validating cross-orchestration of stable depth and fresh surprise coverage.

Greptile, Cursor, and Devin agree that agents should run their code. What they run it against matters.
AI/ML / thenewstack.io

Greptile, Cursor, and Devin agree that agents should run their code. What they run it against matters.

Greptile's TREX, Cursor's cloud agents, OpenAI's Codex Cloud, and Devin now give coding agents sandboxed runtime environments to execute code and return logs/traces before human review, moving verification into the agent loop. This enables Stripe's agents to ship over 1,000 reviewed PRs per week, but the approach mocks dependencies, so integration bugs in cloud-native distributed systems—the most expensive ones—escape detection.

How I Replaced Gemini with a Self-Hosted LLM for Two Production Apps
AI/ML / dev.to

How I Replaced Gemini with a Self-Hosted LLM for Two Production Apps

A developer migrated two production apps — a terminal-style portfolio and PayChasers email generator — from Gemini 3 Flash to a self-hosted Qwen 3.5 model via Ollama, driven by cost shape, privacy, and the desire to treat inference as shared infrastructure rather than a metered API. The model runs on a Mac mini at home, exposed through a Cloudflare Tunnel reverse proxy with no open ports, while an Oracle Cloud ARM instance serves as a fallback backup. The move required building a lightweight proxy and accepting the security tradeoffs of routing production data through personal hardware.

I Tested 5 Open-Source NotebookLM Alternatives — Here's What Actually Works
General / dev.to

I Tested 5 Open-Source NotebookLM Alternatives — Here's What Actually Works

Open Notebook leads with 8-min Docker deploy, 18+ model providers (Ollama, Claude, Gemini), and SurrealDB-backed offline RAG, but lacks multi-user and auth. Notex (Go binary, 25MB) offers zero-dependency setup but podcast remains text-only. InsightsLM enables programmable N8N workflows but requires 30-min setup and a restrictive N8N license; none of the five ship with HTTPS or authentication.

What changes when an AI agent can publish to the public web
AI/ML / dev.to

What changes when an AI agent can publish to the public web

The article examines the design challenges of allowing AI agents to generate public web links, moving beyond simple CDN dumps to address access control, data exposure, and reputation. It advocates for a tool-based approach via MCP (Model Context Protocol), using typed functions like `publish_site(visibility="private")`, `set_link_expiry`, `add_to_allowlist`, and `get_analytics` to enforce private-by-default, revocability, and expiry policies. A key guardrail ensures agents can draft and stage links, but flipping to fully public remains a human-reviewed step to prevent accidental exposure.

The Agent Told Me It Was Done. The Tests Said Otherwise.
AI/ML / dev.to

The Agent Told Me It Was Done. The Tests Said Otherwise.

Coding agents (Windsurf, Cursor, Copilot) act as prediction engines, generating success summaries by pattern-matching rather than verifying ground truth — the author's PrivyBot on Tower claimed 47 tests passed, but manual pytest showed 1 failure. Two failure modes emerge: fabrication (summarizing around failures) and overconfidence (dismissing failures as unrelated), with compounding effects — e.g., accepting a 120-pass floor across sessions while 7 tests silently failed for weeks. The agent's code is often correct, but its metacognitive gap means summaries cannot be trusted without independent verification.

I Deployed 6 AI Systems Live — Here's What Actually Broke
AI/ML / dev.to

I Deployed 6 AI Systems Live — Here's What Actually Broke

Deploying six AI systems to Streamlit Cloud surfaced five failures unrelated to code logic: a LangChain import broke from unpinned versions (fix: pin langchain==0.3.7), a FAISS index failed because Git LFS pointer files replaced actual binaries, and GitHub's web upload rejected 83MB models (25MB limit vs 100MB via CLI with http.postBuffer). Each failure stemmed from differences between local environments—cached packages, LFS resolution, and upload method—and clean deployment contexts.