askill
sql-expert

sql-expertSafety 80Repository

Expert system for generating, validating, and optimizing ClickHouse SQL. Use this when the user needs data, queries, or analysis.

206 stars
4.1k downloads
Updated 3/8/2026

Package Files

Loading files...
SKILL.md

🚨 CRITICAL RULE: MANDATORY VALIDATION

You MUST call validate_sql(sql) for every new query you generate. Context Note: Historical validation steps are pruned to save tokens, but this does NOT excuse you from validating new queries in the current turn. Always validate before executing.

1. Schema Discovery & Context

  • Missing Schema: If you do not have the table schema, you MUST use get_tables and explore_schema first.
    • Optimization: If the user already mentioned exact field names, pass them in the columns argument of explore_schema instead of loading the full table schema.
  • Exact Identifier Rule: Treat exact identifier-like tokens from the user question as candidate columns on the first schema lookup. This especially applies to ClickHouse metric names such as ProfileEvent_*, CurrentMetric_*, and flattened event columns on system.* tables.
  • Missing Columns: If you don't see the expected column, retry explore_schema with a narrower columns list based on the user-mentioned identifier or the closest confirmed column names.
  • Schema Fidelity: Only use columns that are confirmed to exist in the table schema from explore_schema. Do not assume standard columns exist if they are not in the tool output.
  • User Context: If the user asks about "my data", use WHERE user = '<clickHouseUser>'.
  • System Tables: For queries on system.* tables (e.g., system.query_log, system.parts, system.merges), defer to the clickhouse-system-queries skill - it contains table-specific patterns, predicates, and resource metrics that this skill does not cover. For system.query_log, do not generate SQL until references/system-query-log.md has been loaded via skill_resource, and do not call search_query_log for chart/time-series requests.

2. Syntax Rules (The Grammar)

  • Tables: ALWAYS use fully qualified names (e.g., database.table).
  • Semicolons: NEVER include a trailing semicolon (;).
  • Enums: Use exact string literals for Enum columns.
  • Safety: ALWAYS use LIMIT for data exploration queries.

3. Optimization Rules (Best Practices)

  • Time filters: Always filter by the partition key (usually event_date or timestamp) first. Use bounded time windows (e.g., last 24h, 7 days) unless the user asks for all history.
  • Primary Keys (CRITICAL): ClickHouse indexes are sparse. You MUST filter on the leading column of the Primary Key if you filter on any secondary column.
    • Bad: WHERE event_time > now() - 1h (If PK is event_date, event_time, this scans everything).
    • Good: WHERE event_date >= toDate(now() - 1h) AND event_time > now() - 1h (Uses index, handles midnight crossover).
  • Approximation: Use uniq() instead of uniqExact() unless precision is explicitly requested.
  • Joins: Put the smaller table on the RIGHT. Use GLOBAL IN only for distributed queries.

4. Execution Workflow

  1. Generate: Create the SQL following the rules above.
  2. Validate (MANDATORY): Call validate_sql(sql).
    • If invalid: Read the error, fix the SQL, and retry (max 3 attempts).
  3. Decide Action:
    • Visualization: IF the user wants a chart, DO NOT execute. Pass the SQL to the visualization skill logic.
    • Data: IF the user wants answers (lists, counts), call execute_sql(sql).
    • Code Only: IF the user asks to "write SQL", just output the code block.

Install

Download ZIP
Requires askill CLI v1.0+β–Ά

AI Quality Score

72/100Analyzed 3/8/2026

Well-structured technical reference skill for ClickHouse SQL with clear validation requirements, optimization rules, and execution workflow. High on actionability and clarity but somewhat narrow in scope (ClickHouse-specific) and missing troubleshooting guidance. Tags are partially mismatched.

80
85
60
70
85

Metadata

Licenseunknown
Version-
Updated3/8/2026
PublisherFrankChen021

Tags

databasegithub-actionsobservability