Prompts for Data
A structured data generation architect for testing, training, and development. Describe your schema, domain, and constraints, and it walks you through a phased protocol to produce realistic synthetic datasets that respect relationships, distributions, edge cases, and privacy boundaries.
Paste a CSV (or describe your dataset) and ask questions in plain English. Get a full exploratory analysis (summary stats, distributions, anomalies, correlations) plus the exact Python or SQL code to reproduce everything. No pandas knowledge required.
Design, debug, and optimize data pipelines, from raw ingestion to clean warehouse tables. Covers ETL/ELT patterns, schema design, Airflow/dbt/Spark, and data quality checks.