.. _ai:

===========================
AI-assisted code generation
===========================

When using LLMs to write Python code for web scraping, these are the most
reasonable approaches to consider:

Plain Python functions or classes
    **Pros:** Simple, dependency-free, and easy for LLMs to produce.

    **Cons:** You must define your own conventions and testing practices;
    integration across teams and tools can be ad-hoc.

    **Use when** you need a quick extractor or the logic is small and
    unlikely to be reused.

:doc:`Scrapy spiders `
    **Pros:** Built-in crawling, request scheduling, retries and many
    utilities.

    **Cons:** Large surface area for AI generation. Spiders mix crawling,
    error handling and extraction, which makes testing extraction in
    isolation difficult.

    **Avoid** generating full spiders with an LLM; prefer generating
    extraction logic separately.

:ref:`web-poet page objects `
    **Pros:** Small, standard contract for extraction with field-level
    decomposition, first-class testing support, and framework integration.

    **Cons:** Requires adopting web-poet idioms and a small framework cost,
    which can be unnecessary for trivial scripts.

    **Use when** you want maintainability, testability, and a predictable
    contract that can be used by tools and teams.

.. note:: :doc:`scrapy-poet ` provides a great way to use web-poet page
    objects within Scrapy spiders, giving you the benefits of both
    approaches.
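As a rough illustration of the first approach above (a plain Python function
that holds only extraction logic), the following sketch uses nothing but the
standard library. The names ``extract_product`` and ``_ProductParser``, the
``name`` and ``price`` keys, and the assumed page layout (a product name in
the first ``<h1>`` and a price in an element with a ``price`` class) are
hypothetical and chosen for illustration only; real projects would more
likely use a selector library.

.. code-block:: python

    from html.parser import HTMLParser


    class _ProductParser(HTMLParser):
        """Capture the first <h1> text and the first "price"-classed text."""

        def __init__(self):
            super().__init__()
            self._target = None  # name of the field being captured, if any
            self.data = {}

        def handle_starttag(self, tag, attrs):
            classes = (dict(attrs).get("class") or "").split()
            if tag == "h1" and "name" not in self.data:
                self._target = "name"
            elif "price" in classes and "price" not in self.data:
                self._target = "price"

        def handle_data(self, data):
            # Store the first non-blank text seen after a matching start tag.
            if self._target and data.strip():
                self.data[self._target] = data.strip()
                self._target = None

        def handle_endtag(self, tag):
            # Stop capturing at any end tag, so empty elements yield nothing.
            self._target = None


    def extract_product(html: str) -> dict:
        """Return the extracted fields for a (hypothetical) product page."""
        parser = _ProductParser()
        parser.feed(html)
        return parser.data

Because the function takes an HTML string and returns a dictionary, it can be
tested against saved pages without any crawling code or network access, which
is the kind of separation between extraction and crawling that the comparison
above recommends.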