Inputs
Page object classes, in their __init__ method, must define input parameters with type hints pointing to input classes.
Those input classes may be:
Other page object classes.
Item classes, when using a framework that can provide item classes.
Any other class that subclasses Injectable, or that is registered or decorated with Injectable.register.
Based on the target URL and parameter type hints, frameworks automatically build the required objects at run time and pass them to the __init__ method of the corresponding page object class.
For example, if a page object class has an __init__ parameter of type HttpResponse and the target URL is https://example.com, your framework would send an HTTP request to https://example.com, download the response, build an HttpResponse object with the response data, and pass it to the __init__ method of the page object class being used.
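The mechanism described above can be illustrated with a toy sketch that uses only the standard library. It is not how web-poet or any particular framework is implemented: FakeHttpResponse, fake_download, PROVIDERS and build_page are hypothetical names made up for this illustration, and no real HTTP request is sent.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class FakeHttpResponse:
    """Hypothetical stand-in for an input class such as HttpResponse."""

    url: str
    body: str


class TitlePage:
    """Page object class: declares its input through an __init__ type hint."""

    def __init__(self, response: FakeHttpResponse):
        self.response = response

    def title(self) -> str:
        # Naive string slicing keeps the sketch free of third-party parsers.
        body = self.response.body
        start = body.index("<title>") + len("<title>")
        end = body.index("</title>")
        return body[start:end]


def fake_download(url: str) -> FakeHttpResponse:
    """Pretend to download url; a real framework would send an HTTP request."""
    return FakeHttpResponse(url=url, body="<html><title>Example Domain</title></html>")


# Hypothetical registry: maps each input class to a function that builds it
# from the target URL.
PROVIDERS = {FakeHttpResponse: fake_download}


def build_page(page_cls, url: str):
    """Read the __init__ type hints, build each input, and instantiate the page."""
    hints = get_type_hints(page_cls.__init__)
    hints.pop("return", None)  # ignore a return annotation, if any
    kwargs = {name: PROVIDERS[input_cls](url) for name, input_cls in hints.items()}
    return page_cls(**kwargs)


page = build_page(TitlePage, "https://example.com")
print(page.title())
```

The page object class never downloads anything itself. It only declares what it needs, which is what lets the same class run under different frameworks.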
Built-in input classes
Warning
Not all frameworks support all web-poet built-in input classes.
The web_poet.page_inputs module defines multiple classes that you can use as inputs for a page object class, including:
HttpResponse, a complete HTTP response, including URL, headers, and body. This is the most common input for a page object class.
HttpClient, to send additional requests.
RequestUrl, the target URL before following redirects. Useful, for example, to skip the target URL download, and instead use HttpClient to send a custom request based on parts of the target URL.
PageParams, to receive data from the crawling code.
Stats, to write key-value data pairs during parsing that you can inspect later, e.g. for debugging purposes.
BrowserResponse, which includes URL, status code and BrowserHtml of a rendered web page.
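As a rough illustration of the RequestUrl use case, the sketch below derives a custom request URL from parts of the target URL using only the standard library. The URL layout (/products/<id>), the /api/products/ endpoint and the api_url_for helper are assumptions invented for this example. In a real page object, the target URL would come from a RequestUrl input and the resulting request would be sent with HttpClient.

```python
from urllib.parse import urlsplit, urlunsplit


def api_url_for(target_url: str) -> str:
    """Build a hypothetical API URL from the last path segment of target_url."""
    parts = urlsplit(target_url)
    # Assumed layout: the product ID is the last segment of the path.
    product_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    # Keep scheme and host, swap in the assumed API path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, f"/api/products/{product_id}", "", ""))


print(api_url_for("https://example.com/products/123?ref=home"))
# → https://example.com/api/products/123
```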
Custom input classes
You may define your own input classes if you are using a framework that supports them.
However, custom input classes may make your page object classes less portable across frameworks.