Author: drweb

SQL

Fix Slow, Bloated MSDB: Purge Old History And Add Missing Indexes

After tempdb, msdb is often the most abused system database, growing unchecked until it tanks your backup reporting and job monitoring. I’ve watched MSDB performance degrade across multiple SQL Server instances. It’s not optimized out of the box and doesn’t get much care, so as it balloons to 100GB+, metadata queries crawl, showing up as top offenders in Activity Monitor. Another indicator of MSDB performance problems: missing indexes in MSDB showing at the top of the missing indexes DMV results. In the past, I’d just find and add missing indexes. But MSDB tuning…

Read More

Your chatbot can’t answer questions about your company’s internal docs. Your AI assistant hallucinates facts instead of checking your knowledge base. Sound familiar? RAG (Retrieval Augmented Generation) solves this by connecting your LLM to actual documents, so it answers based on facts, not fiction. In this guide, you’ll build a working RAG system in Python—from basic document search to production patterns with hybrid retrieval and re-ranking. The code uses LangChain and local embeddings, so you can test everything without paying for API keys. What you’ll learn: Why vanilla LLMs struggle with specific knowledge, how RAG fixes hallucination problems, building document…

Read More

In the wake of the massive Shai-Hulud supply chain attack that ripped through npm late last year and compromised more than 700 packages and exposed 25,000 repositories, developers in the JavaScript world embraced a two-part defense strategy. The widely adopted playbook called for disabling lifecycle scripts and using lockfiles. “It became the standard advice everywhere […]

Read More

The “Lost in the Middle” paper (Liu et al., 2023) proved what we already suspected: language models struggle with long contexts. Even Gemini’s 2-million-token context window doesn’t guarantee the model will use information buried in the middle of a prompt. The model’s attention degrades as context grows, leading to hallucinations and missed facts. RAG (Retrieval Augmented Generation) is a pipeline that combines information retrieval with text generation. Instead of stuffing everything into a prompt, you retrieve only the relevant chunks from a knowledge base and feed those to the LLM. This grounds the model’s responses in actual data rather…
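The retrieve-then-generate idea can be sketched in a few lines of standard-library Python. This is only a toy illustration, not the article's implementation: it swaps real embeddings for plain term-count vectors with cosine similarity, stops at building the prompt (no LLM call), and every name in it (`retrieve`, `build_prompt`, `KNOWLEDGE_BASE`) is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def build_prompt(query, chunks, k=2):
    """Assemble a prompt that contains only the retrieved chunks."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base, invented purely for illustration.
KNOWLEDGE_BASE = [
    "Refunds are processed within 14 days of the return request.",
    "The office is closed on public holidays.",
    "Support tickets are answered within one business day.",
]

prompt = build_prompt("How long do refunds take?", KNOWLEDGE_BASE, k=1)
print(prompt)
```

In a real pipeline the term-count vectors would be replaced by embeddings from a model, and the assembled prompt would be sent to the LLM; the shape of the flow (score chunks, keep the top few, build the prompt from them) stays the same.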

Read More

An analysis published today by Opsera, a provider of a DevOps platform, finds that while adoption of artificial intelligence (AI) coding tools has increased developer productivity, they also create more duplicate code, resulting in 15 to 18% more security vulnerabilities per line of code compared to code created by a human developer. Overall, the Opsera […]

Read More

The OpenAI Python SDK changed in November 2023 when version 1.0 dropped. If your code still uses openai.ChatCompletion.create(), you’re running deprecated patterns that broke two years ago. The OpenAI Python SDK is the official library for accessing gpt-4o, gpt-4o-mini, embeddings, vision analysis, and assistants from Python. Version 1.x uses client instances, Pydantic response models, and async/await patterns. The SDK supports gpt-4o (flagship model with 128k context) and gpt-4o-mini (fast and cheap with 128k context). You’ll learn chat completions with streaming, function calling with parallel execution, embeddings for semantic search, vision analysis, assistants with code interpreter, error handling with retries, and…

Read More
SQL

Organizations increasingly want Snowflake and Microsoft Fabric to coexist without duplicating data or fragmenting governance. With Fabric OneLake and open table formats like Iceberg and Delta, there are now multiple ways to make Snowflake data available inside Fabric, each with different tradeoffs around cost, performance, and ownership. This post walks through three practical architectures for using Snowflake data in Fabric OneLake, when each option makes sense, and the key tradeoffs to consider. The most important decision across all three options is which platform is the system of record and primary writer; most tradeoffs flow directly from that choice.

1) Snowflake-managed Iceberg table + OneLake shortcut

In…

Read More