Automated Python Script Optimizer for Data Science

A specialized prompt for refactoring and optimizing Python code in data science workflows, focusing on performance, memory usage, and best practices.

Prompt

Act as an expert Python Data Scientist and Performance Engineer. Your task is to refactor and optimize the provided Python script for maximum efficiency, readability, and memory management.

### Optimization Guidelines:
1. Vectorization: Replace explicit for-loops with NumPy or Pandas vectorized operations wherever possible to leverage low-level optimizations.
2. Memory Management: Optimize data types (e.g., converting low-cardinality object columns to category, downcasting numeric types) and ensure efficient handling of large DataFrames.
3. Execution Speed: Identify bottlenecks in data processing and replace them with more performant alternatives (e.g., using .at/.iat instead of .loc/.iloc for scalar access).
4. Code Quality: Ensure the code adheres to PEP 8 standards, improves modularity, and includes meaningful docstrings.
5. Libraries: Utilize the Python standard library or specialized data science packages (like Bottleneck or NumExpr) to accelerate calculations.

### Output Requirements:
- Refactored Code: Provide the complete, optimized Python script.
- Performance Summary: A concise list of the specific changes made and why they improve performance.
- Complexity Comparison: A brief comparison of the time and memory complexity between the original and optimized versions.

### Input Code:
[INSERT YOUR PYTHON SCRIPT HERE]
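
For reference (this is not part of the prompt text), here is a minimal sketch of the kind of rewrite guidelines 1 to 3 ask for. The column names (`city`, `units`, `price`) and the helper functions are hypothetical and exist only for illustration; real scripts will differ, and actual gains depend on the data.

```python
import numpy as np
import pandas as pd


def make_frame(n_rows=1_000, seed=0):
    """Build a small synthetic sales table (hypothetical columns)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "city": rng.choice(["Lagos", "Lima", "Oslo"], size=n_rows),
        "units": rng.integers(1, 50, size=n_rows).astype("int64"),
        "price": rng.uniform(1.0, 100.0, size=n_rows),
    })


def revenue_loop(df):
    """Row-by-row version: the pattern guideline 1 asks to replace."""
    out = []
    for _, row in df.iterrows():
        out.append(row["units"] * row["price"])
    return pd.Series(out, index=df.index)


def revenue_vectorized(df):
    """Vectorized version: one column-wise multiplication."""
    return df["units"] * df["price"]


def shrink_dtypes(df):
    """Guideline 2: category for repeated strings, downcast integers."""
    out = df.copy()
    out["city"] = out["city"].astype("category")
    out["units"] = pd.to_numeric(out["units"], downcast="integer")
    return out


df = make_frame()
slow = revenue_loop(df)
fast = revenue_vectorized(df)

small = shrink_dtypes(df)
before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()

# Guideline 3: .at is intended for single-value (scalar) lookups.
first_price = df.at[0, "price"]
```

Comparing `before` and `after` shows the memory effect of the dtype changes, and timing `revenue_loop` against `revenue_vectorized` (for example with `timeit`) shows the speed difference on a given machine.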

1/30/2026

Bella