ai


Use this skill when building AI features, integrating LLMs, implementing RAG, working with embeddings, deploying ML models, or doing data science. Activates on mentions of OpenAI, Anthropic, Claude, GPT, LLM, RAG, embeddings, vector database, Pinecone, Qdrant, LangChain, LlamaIndex, DSPy, MLflow, fine-tuning, LoRA, QLoRA, model deployment, ML pipeline, feature engineering, or machine learning.

Updated 2/8/2026


AI/ML Engineering

Build production AI systems with modern patterns and tools.

Quick Reference

The 2026 AI Stack

| Layer | Tool | Purpose |
| --- | --- | --- |
| Prompting | DSPy | Programmatic prompt optimization |
| Orchestration | LangGraph | Stateful multi-agent workflows |
| RAG | LlamaIndex | Document ingestion and retrieval |
| Vectors | Qdrant / Pinecone | Embedding storage and search |
| Evaluation | RAGAS | RAG quality metrics |
| Experiment Tracking | MLflow / W&B | Logging, versioning, comparison |
| Serving | BentoML / vLLM | Model deployment |
| Protocol | MCP | Tool and context integration |

DSPy: Programmatic Prompting

Hand-written prompts are brittle and hard to maintain. DSPy treats prompts as optimizable code:

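import dspy

# Configure a language model first (the model name is only an example)
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

class QA(dspy.Signature):
    """Answer questions with short factoid answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="1-5 words")

# Create module
qa = dspy.Predict(QA)

# Use it
result = qa(question="What is the capital of France?")
print(result.answer)  # e.g. "Paris" (output depends on the model)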

Optimize with real data:

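from dspy.evaluate import answer_exact_match
from dspy.teleprompt import BootstrapFewShot

# train_data: a list of labeled examples, each built like
# dspy.Example(question=..., answer=...).with_inputs("question")
optimizer = BootstrapFewShot(metric=answer_exact_match)
optimized_qa = optimizer.compile(qa, trainset=train_data)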
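
The metric handed to the optimizer is a plain callable that scores a prediction against a labeled example. A minimal sketch of a normalized exact-match metric follows. It assumes the `(example, pred, trace=None)` calling convention used for DSPy metrics and that both objects expose an `answer` attribute, as in the `QA` signature above. The `normalize` helper and the `SimpleNamespace` stand-ins are illustrative, not part of DSPy.

```python
import string
from types import SimpleNamespace


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    stripped = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.split())


def exact_match(example, pred, trace=None) -> bool:
    """Return True when the predicted answer equals the gold answer after normalization."""
    return normalize(example.answer) == normalize(pred.answer)


# Stand-ins for a labeled example and a model prediction (hypothetical data).
gold = SimpleNamespace(question="What is the capital of France?", answer="Paris")
guess = SimpleNamespace(answer=" paris. ")
print(exact_match(gold, guess))  # True
```

Normalizing before comparison keeps the optimizer from rejecting demonstrations that differ from the label only in casing or trailing punctuation.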

RAG Architecture (Production)

Query → Rewrite → Hybrid Retrieval → Rerank → Generate → Cite
         │              │                │
         v              v                v
    Query expansion  Dense + BM25   Cross-encoder
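
The hybrid retrieval step has to merge a dense ranking and a BM25 ranking into one candidate list before reranking. One common way to do that is reciprocal rank fusion (RRF). The sketch below is an assumption about how the merge could be done, not something the pipeline above prescribes. The document IDs are made up, and `k=60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document IDs; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical dense-retriever ranking
bm25_hits = ["doc_b", "doc_d", "doc_a"]   # hypothetical BM25 ranking
print(reciprocal_rank_fusion([dense_hits, bm25_hits]))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF uses only rank positions, so the dense and BM25 scores never need to be put on a common scale. The fused list would then go to the cross-encoder reranker.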

LlamaIndex + LangGraph Pattern:

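from typing import TypedDict

from langgraph.graph import END, START, StateGraph
from llama_index.core import VectorStoreIndex

# Shared state passed between graph nodes
class State(TypedDict, total=False):
    question: str
    context: str
    sources: list
    answer: str

# Data layer (LlamaIndex); docs is a list of already loaded Document objects
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine()

# Control layer (LangGraph): each node returns a partial state update
def retrieve(state: State) -> dict:
    response = query_engine.query(state["question"])
    return {"context": response.response, "sources": response.source_nodes}

# generate_answer is assumed to be defined elsewhere and to return {"answer": ...}
graph = StateGraph(State)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate_answer)
graph.add_edge(START, "retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)
app = graph.compile()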
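
The control layer relies on each node returning a partial update that is merged into a shared state. The sketch below imitates that merge in plain Python, with no LangGraph or LlamaIndex dependency. `run_linear` and the toy `retrieve` and `generate_answer` functions are hypothetical stand-ins. This illustrates the idea and is not LangGraph's actual implementation.

```python
from typing import Callable

State = dict
Node = Callable[[State], dict]


def run_linear(nodes: list[Node], state: State) -> State:
    """Run nodes in order, merging each node's partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


def retrieve(state: State) -> dict:
    # Toy retrieval: look the question up in a tiny in-memory corpus.
    corpus = {"capital of france": "Paris is the capital of France."}
    hits = [text for key, text in corpus.items() if key in state["question"].lower()]
    return {"context": " ".join(hits), "sources": hits}


def generate_answer(state: State) -> dict:
    # Toy generation: echo the retrieved context instead of calling an LLM.
    return {"answer": state["context"] or "No supporting context found."}


final = run_linear([retrieve, generate_answer], {"question": "What is the capital of France?"})
print(final["answer"])  # Paris is the capital of France.
```

Because each node touches only the keys it returns, retrieval and generation can be swapped or tested independently.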

MCP Integration

The Model Context Protocol (MCP) is an open standard for exposing tools and context to LLM applications:

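from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-tools")

@mcp.tool()
async def search_docs(query: str) -> str:
    """Search the knowledge base."""
    # vector_store and format_results are placeholders for your own retrieval code
    results = await vector_store.search(query)
    return format_results(results)

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default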

Embeddings (2026)

| Model | Dimensions | Best For |
| --- | --- | --- |
| text-embedding-3-large | 3072 | General purpose |
| BGE-M3 | 1024 | Multilingual RAG |
| Qwen3-Embedding | Flexible | Custom domains |
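
Dimensionality drives storage cost directly. As a rough worked example, the sketch below estimates the raw size of the stored vectors for the two fixed-dimension models in the table. It assumes float32 values (4 bytes each) and ignores index overhead and metadata. `index_size_gb` is a hypothetical helper.

```python
def index_size_gb(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> float:
    """Raw size of the stored vectors in gigabytes (10**9 bytes), excluding index overhead."""
    return num_vectors * dimensions * bytes_per_value / 1e9


for name, dims in [("text-embedding-3-large", 3072), ("BGE-M3", 1024)]:
    print(f"{name}: {index_size_gb(1_000_000, dims):.2f} GB per million vectors")
# text-embedding-3-large: 12.29 GB per million vectors
# BGE-M3: 4.10 GB per million vectors
```

At the same precision, a 3072-dimension model needs three times the vector storage of a 1024-dimension model for the same corpus.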

Fine-Tuning with LoRA/QLoRA

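from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling numerator (effective scale = lora_alpha / r)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",                # assumes a causal language model
)

# base_model: an already loaded Hugging Face transformers model
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
# QLoRA: load base_model in 4-bit first; training then fits in ~24GB VRAM (e.g. RTX 4090)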
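
To see why this configuration is cheap to train, it helps to count the parameters it adds. Each adapted weight matrix of shape `d_out x d_in` gains two low-rank factors with `r * (d_in + d_out)` parameters in total. The sketch below works through the arithmetic for a hypothetical model shape (32 layers, square 4096 x 4096 projections). Real models differ, especially those with grouped-query attention, where `v_proj` is not square.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical model shape: 32 layers, square 4096 x 4096 q_proj and v_proj.
layers, hidden, r, lora_alpha = 32, 4096, 16, 32
per_matrix = lora_params(hidden, hidden, r)
total = layers * 2 * per_matrix  # two target modules per layer
print(per_matrix, total, lora_alpha / r)  # 131072 8388608 2.0
```

Under these assumptions the adapters add about 8.4 million trainable parameters, and the update is scaled by `lora_alpha / r = 2.0`.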

MLOps Pipeline

import mlflow

# MLflow tracking
mlflow.set_experiment("rag-v2")

with mlflow.start_run():
    mlflow.log_params({"chunk_size": 512, "model": "gpt-4"})
    mlflow.log_metrics({"faithfulness": 0.92, "relevance": 0.88})
    mlflow.log_artifact("prompts/qa.txt")  # the file must already exist locally

Evaluation with RAGAS

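from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_precision

# dataset: an evaluation set of questions, generated answers, and retrieved contexts
# (context_precision also needs reference answers)
results = evaluate(
    dataset,
    metrics=[faithfulness, answer_relevancy, context_precision],
)
print(results)  # e.g. {'faithfulness': 0.92, 'answer_relevancy': 0.88, ...}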

Vector Database Selection

| DB | Best For | Pricing |
| --- | --- | --- |
| Qdrant | Self-hosted, filtering | 1GB free forever |
| Pinecone | Managed, zero-ops | Free tier available |
| Weaviate | Knowledge graphs | 14-day trial |
| Milvus | Billion-scale | Self-hosted |

Agents

  • ai-engineer - LLM integration, RAG, MCP, production AI
  • mlops-engineer - Model deployment, monitoring, pipelines
  • data-scientist - Analysis, modeling, experimentation
  • ml-researcher - Cutting-edge architectures, paper implementation
  • cv-engineer - Computer vision, VLMs, image processing

Install

Requires askill CLI v1.0+

AI Quality Score

95/100 (analyzed 2/12/2026)

A comprehensive and modern AI/ML engineering guide covering the '2026 stack'. It provides actionable code snippets for DSPy, RAG, MCP, and fine-tuning, making it highly valuable for developers building AI features.


Metadata

License: unknown
Version: -
Updated: 2/8/2026
Publisher: NeverSight

Tags

ci-cd, database, llm, observability, prompting