Changelog
0.24.1 (2026-04-28)
Fixed links in https://web-poet.readthedocs.io/llms.txt.
0.24.0 (2026-04-21)
Backward-incompatible changes:
The tutorial-only web_poet.example module has been removed.
Fixture got some backward-incompatible changes:
The type_name and short_name properties have been removed.
The following methods got a new page_cls parameter: assert_field_correct(), assert_full_item_correct(), assert_no_extra_fields(), assert_no_toitem_exceptions(), assert_toitem_exception(), get_output(), get_page().
Improvements:
Added a built-in framework for simple use cases.
Fixture instances are no longer tied to a single page object class.
While pytest discovery still requires the parent folder to be named after a page object class, a new --page-object option of python -m web_poet.testing allows specifying a different page object class.
Retry now accepts an optional max_retries parameter.
Added DictStatCollector.
0.23.3 (2026-04-07)
Added AI-assisted code generation to the docs.
Made the documentation more LLM-friendly, with markdown versions of every page and llms.txt and llms-full.txt files.
0.23.2 (2026-03-10)
JSON files in test fixtures are now saved using UTF-8 instead of the system encoding.
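The difference between an explicit UTF-8 encoding and the system default can be illustrated with the standard library alone. This is a minimal sketch, not web-poet's own fixture code: the helper names save_fixture_json and load_fixture_json are hypothetical, and ensure_ascii=False is an assumption made here so that non-ASCII characters actually reach the file.

```python
import json
import tempfile
from pathlib import Path


def save_fixture_json(path: Path, data: dict) -> None:
    """Hypothetical helper: write JSON as UTF-8, ignoring the system encoding."""
    # ensure_ascii=False keeps non-ASCII characters unescaped in the output.
    text = json.dumps(data, ensure_ascii=False, indent=2)
    # Passing encoding explicitly avoids falling back to the locale default.
    path.write_text(text, encoding="utf-8")


def load_fixture_json(path: Path) -> dict:
    """Hypothetical helper: read the file back with the same explicit encoding."""
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    fixture_path = Path(tmp) / "output.json"
    save_fixture_json(fixture_path, {"name": "Café Zoë"})
    raw_bytes = fixture_path.read_bytes()
    loaded = load_fixture_json(fixture_path)
```

Without the explicit encoding argument, the bytes written would depend on the platform's locale, so a fixture generated on one machine could fail to load on another.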
0.23.1 (2026-01-27)
@field no longer strips docstrings from decorated methods.
0.23.0 (2026-01-22)
Dropped Python 3.9 support.
Added annotation_encode() (see Input annotations) and annotation_decode().
Implemented type hint improvements.
0.22.0 (2025-12-15)
Tests now put expected and actual values into pytest user properties.
0.21.0 (2025-11-24)
Added BrowserPage page object class to work with BrowserResponse.
Added BrowserResponse.text attribute.
0.20.0 (2025-10-28)
Added support for Python 3.14.
Added support for BrowserResponse, AnyResponse and BrowserHtml dependencies to the testing framework.
Explicitly re-export public names.
0.19.2 (2025-08-22)
Fixed runtime resolving of type annotations for some types.
0.19.1 (2025-08-13)
Improved type annotations.
0.19.0 (2025-06-06)
Removed some deprecated code:
The web_poet.overrides module is removed.
The ItemWebPage, OverrideRule and PageObjectRegistry classes are removed.
The from_override_rules() class method and the get_overrides() and search_overrides() methods of RulesRegistry are removed.
The overrides parameter of handle_urls() is removed.
The RequestUrl and ResponseUrl classes can no longer be imported from web_poet.page_inputs.http.
Tests now support items with RequestUrl and ResponseUrl objects.
Improved the pytest plugin:
Pytest ≥ 7.0.0 is now required.
Tests within a test case can now be run individually.
Tests are now compatible with vscode-python.
Fixed an error of is_injectable() with GenericAlias on Python ≤ 3.10.
0.18.0 (2025-01-30)
Removed support for Python 3.8, added support for Python 3.13.
The minimum required version of url-matcher changed from 0.2.0 to 0.4.0.
type(None) is no longer considered injectable.
0.17.1 (2024-10-11)
web_poet.mixins.SelectableMixin.selector is now created with the base_url value set to self.url if this attribute exists.
Added a mention of the form2request library to the HttpRequest documentation.
CI improvements.
0.17.0 (2024-03-04)
Now requires andi >= 0.5.0.
Package requirements that were unversioned now have minimum versions specified.
Added support for Python 3.12.
Added support for typing.Annotated dependencies to the serialization and testing code.
Documentation improvements.
CI improvements.
0.16.0 (2024-01-23)
Added new AnyResponse, which holds either BrowserResponse or HttpResponse.
Documentation improvements.
0.15.1 (2023-11-21)
HttpRequestHeaders now has a from_bytes_dict class method, like HttpResponseHeaders.
0.15.0 (2023-09-11)
0.14.0 (2023-08-03)
Dropped Python 3.7 support.
Now requires packaging >= 20.0.
Fixed detection of the Returns base class.
Improved docs.
Updated type hints.
Updated CI tools.
0.13.1 (2023-05-30)
Fixed an issue with HttpClient which happens when a response with a non-standard status code is received.
0.13.0 (2023-05-30)
A new dependency BrowserResponse has been added. It contains a browser-rendered page URL, status code and HTML.
The Rules documentation section has been rewritten.
0.12.0 (2023-05-05)
The testing framework now allows defining a custom item adapter.
We have made a backward-incompatible change on test fixture serialization: the type_name field of exceptions has been renamed to import_path.
Fixed built-in Python types, e.g. int, not working as field processors.
0.11.0 (2023-04-24)
JMESPath support is now available: you can use WebPage.jmespath() and HttpResponse.jmespath() to run queries on JSON responses.
The testing framework now supports page objects that raise exceptions from the to_item method.
0.10.0 (2023-04-19)
New class Extractor can be used for easier extraction of nested fields (see Processors for nested fields).
Exceptions raised while getting a response for an additional request are now saved in test fixtures.
Multiple documentation improvements and fixes.
Add a twine check CI check.
0.9.0 (2023-03-30)
Standardized input validation.
Field processors can now also be defined through a nested Processors class, so that field redefinitions in subclasses can inherit them. See Default processors.
Field processors can now opt in to receive the page object whose field is being read.
web_poet.fields.FieldsMixin now keeps fields from all base classes when using multiple inheritance.
Fixed the documentation build.
0.8.1 (2023-03-03)
Fix the error when calling .to_item(), item_from_fields_sync(), or item_from_fields() on page objects defined as slotted attrs classes, while setting skip_nonitem_fields=True.
0.8.0 (2023-02-23)
This release contains many improvements to the web-poet testing framework, as well as some other improvements and bug fixes.
Backward-incompatible changes:
cached_method() no longer caches exceptions for async def methods. This makes the behavior the same for sync and async methods, and also makes it consistent with Python’s stdlib caching (i.e. functools.lru_cache(), functools.cached_property()).
The testing framework now uses the HttpResponse-info.json file name instead of HttpResponse-other.json to store information about HttpResponse instances. To make tests generated with older web-poet work, rename these files on disk.
Testing framework improvements:
Improved test reporting: better diffs and error messages.
By default, the pytest plugin now generates a test per item attribute (see Running tests). There is also an option (--web-poet-test-per-item) to run a test per item instead.
Page objects with the HttpClient dependency are now supported (see Additional requests support).
Page objects with the PageParams dependency are now supported.
Added a new python -m web_poet.testing rerun command (see Test-Driven Development).
Fixed support for nested (indirect) dependencies in page objects. Previously they were not handled properly by the testing framework.
Non-ASCII output is now stored without escaping in the test fixtures, for better readability.
Other changes:
Testing and CI fixes.
Fixed a packaging issue: tests and tests_extra packages were installed, not just web_poet.
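The stdlib behavior that the cached_method() change in this release aligns with can be observed directly with functools.lru_cache(). The sketch below uses only the standard library, not web-poet itself; the flaky function and the calls counter are hypothetical names chosen for illustration.

```python
import functools

# Hypothetical counter used to observe how often the function body runs.
calls = {"count": 0}


@functools.lru_cache(maxsize=None)
def flaky(key: str) -> str:
    """Fail on the first invocation, succeed afterwards."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ValueError("temporary failure")
    return f"value for {key}"


def call_after_failure() -> str:
    # The first call raises; lru_cache does not store the exception.
    try:
        flaky("a")
    except ValueError:
        pass
    # The second call therefore runs the function body again.
    return flaky("a")


result = call_after_failure()
# A third call is served from the cache, so the body does not run again.
cached_result = flaky("a")
```

Because the exception is not cached, a transient failure does not poison later calls with the same arguments, while successful results are still reused.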
0.7.2 (2023-02-01)
Restore the minimum version of itemadapter from 0.7.1 to 0.7.0, and prevent a similar issue from happening again in the future.
0.7.1 (2023-02-01)
Updated the tutorial to cover recent features and focus on best practices. Also, a new module was added, web_poet.example, that allows using page objects while following the tutorial.
Tests for page objects now covers Git LFS and scrapy-poet, and recommends python -m pytest instead of pytest.
Improved the warning message when duplicate ApplyRule objects are found.
HttpResponse-other.json content is now indented for better readability.
Improved test coverage for fields.
0.7.0 (2023-01-18)
Add a framework for creating tests and running them with pytest.
Support implementing fields in mixin classes.
Introduce new methods for web_poet.rules.RulesRegistry:
Improved the performance of web_poet.rules.RulesRegistry.search() where passing a single parameter of either instead_of or to_return results in O(1) look-up time instead of O(N). Additionally, having either instead_of or to_return present in multi-parameter search calls would filter the initial candidate results resulting in a faster search.
Support page object dependency serialization.
Add new dependencies used in testing and serialization code: andi, python-dateutil, and time-machine. Also backports.zoneinfo on non-Windows platforms when the Python version is older than 3.9.
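The O(1) versus O(N) look-up described for RulesRegistry.search() in this release can be pictured with a small index-based registry. This is only a sketch of the general technique under stated assumptions, not web-poet's implementation: the Rule dataclass and TinyRegistry class are hypothetical stand-ins, and page object and item classes are represented as plain strings.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified stand-in for a rule object."""
    use: str
    instead_of: Optional[str] = None
    to_return: Optional[str] = None


class TinyRegistry:
    """Hypothetical registry that pre-indexes rules by two of their fields."""

    def __init__(self, rules: Iterable[Rule]) -> None:
        self._rules: List[Rule] = list(rules)
        self._by_instead_of: Dict[Optional[str], List[Rule]] = defaultdict(list)
        self._by_to_return: Dict[Optional[str], List[Rule]] = defaultdict(list)
        for rule in self._rules:
            self._by_instead_of[rule.instead_of].append(rule)
            self._by_to_return[rule.to_return].append(rule)

    def search(self, **criteria: Optional[str]) -> List[Rule]:
        # An indexed field narrows the candidates with one dict look-up.
        if "instead_of" in criteria:
            candidates = self._by_instead_of.get(criteria["instead_of"], [])
        elif "to_return" in criteria:
            candidates = self._by_to_return.get(criteria["to_return"], [])
        else:
            # No indexed field given: fall back to scanning every rule.
            candidates = self._rules
        # Remaining criteria are checked only against the narrowed list.
        return [
            rule
            for rule in candidates
            if all(getattr(rule, name) == value for name, value in criteria.items())
        ]


registry = TinyRegistry(
    [
        Rule(use="ProductPage", to_return="Product"),
        Rule(use="FastProductPage", instead_of="ProductPage", to_return="Product"),
        Rule(use="ArticlePage", to_return="Article"),
    ]
)
product_rules = registry.search(to_return="Product")
override_rules = registry.search(instead_of="ProductPage", to_return="Product")
article_rules = registry.search(use="ArticlePage")
```

A search on use alone still walks the whole rule list in this sketch, which mirrors the trade-off the changelog describes: indexed parameters are fast, everything else is a linear scan.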
0.6.0 (2022-11-08)
In this release, the @handle_urls decorator gets an overhaul; it’s not
required anymore to pass another Page Object class to
@handle_urls("...", overrides=...).
Also, the @web_poet.field decorator gets support for output processing
functions, via the out argument.
Full list of changes:
Backwards incompatible: PageObjectRegistry is no longer supporting dict-like access.
Official support for Python 3.11.
New @web_poet.field(out=[...]) argument which allows setting output processing functions for web-poet fields.
The web_poet.overrides module is deprecated and replaced with web_poet.rules.
The @handle_urls decorator is now creating ApplyRule instances instead of OverrideRule instances; OverrideRule is deprecated. ApplyRule is similar to OverrideRule, but has the following differences:
ApplyRule accepts a to_return parameter, which should be the data container (item) class that the Page Object returns.
Passing a string to for_patterns would auto-convert it into url_matcher.Patterns.
All arguments are now keyword-only except for for_patterns.
New signature and behavior of handle_urls:
The overrides parameter is made optional and renamed to instead_of.
If defined, the item class declared in a subclass of web_poet.ItemPage is used as the to_return parameter of ApplyRule.
Multiple handle_urls annotations are allowed.
PageObjectRegistry is replaced with RulesRegistry; its API is changed:
backwards incompatible: dict-like API is removed;
backwards incompatible: O(1) lookups using .search(use=PageObject) have become O(N);
search_overrides method is renamed to search;
get_overrides method is renamed to get_rules;
from_override_rules method is deprecated; use RulesRegistry(rules=...) instead.
Typing improvements.
Documentation, test, and warning message improvements.
Deprecations:
The web_poet.overrides module is deprecated. Use web_poet.rules instead.
The overrides parameter from @handle_urls is now deprecated. Use the instead_of parameter instead.
The OverrideRule class is now deprecated. Use ApplyRule instead.
PageObjectRegistry is now deprecated. Use RulesRegistry instead.
The from_override_rules method of PageObjectRegistry is now deprecated. Use RulesRegistry(rules=...) instead.
The PageObjectRegistry.get_overrides method is deprecated. Use PageObjectRegistry.get_rules instead.
The PageObjectRegistry.search_overrides method is deprecated. Use PageObjectRegistry.search instead.
0.5.1 (2022-09-23)
The BOM encoding from the response body is now read before the response headers when deriving the response encoding.
Minor typing improvements.
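The ordering described in this release, where a byte order mark in the body takes precedence over the encoding declared in the headers, can be sketched with the standard library. This is an illustration of the idea only, not web-poet's detection code: the function names pick_encoding and charset_from_content_type and the "utf-8" fallback are assumptions.

```python
import codecs
from email.message import Message
from typing import Optional

# UTF-32 BOMs come first because the UTF-32-LE BOM starts with the UTF-16-LE one.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def charset_from_content_type(value: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    if not value:
        return None
    message = Message()
    message["Content-Type"] = value
    return message.get_content_charset()


def pick_encoding(body: bytes, content_type: Optional[str], default: str = "utf-8") -> str:
    """Hypothetical helper: prefer the body's BOM, then the header, then a default."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    return charset_from_content_type(content_type) or default


header = "text/html; charset=ISO-8859-1"
with_bom = pick_encoding(codecs.BOM_UTF8 + b"<html></html>", header)
without_bom = pick_encoding(b"<html></html>", header)
no_hints = pick_encoding(b"<html></html>", None)
```

Checking the BOM first matters when a server sends a stale or wrong charset header, since the BOM reflects how the bytes were actually written.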
0.5.0 (2022-09-21)
Web-poet now includes a mini-framework for organizing extraction code as Page Object properties:
import attrs
from web_poet import field, ItemPage


@attrs.define
class MyItem:
    foo: str
    bar: list[str]


class MyPage(ItemPage[MyItem]):
    @field
    def foo(self):
        return "..."

    @field
    def bar(self):
        return ["...", "..."]
Backwards incompatible changes:
web_poet.ItemPage is no longer an abstract base class which requires to_item method to be implemented. Instead, it provides a default async def to_item method implementation which uses fields marked as web_poet.field to create an item. This change shouldn’t affect the user code in a backwards incompatible way, but it might affect typing.
Deprecations:
web_poet.ItemWebPage is deprecated. Use web_poet.WebPage instead.
Other changes:
web-poet is declared as PEP 561 package which provides typing information; mypy is going to use it by default.
Documentation, test, typing and CI improvements.
0.4.0 (2022-07-26)
New HttpResponse.urljoin method, which takes the page’s base URL into account.
New HttpRequest.urljoin method.
Standardized web_poet.exceptions.Retry exception, which allows initiating a retry from the Page Object, e.g. based on page content.
Documentation improvements.
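What "taking the page's base URL into account" means for URL joining can be shown with the standard library. The sketch below is not the HttpResponse.urljoin implementation; page_urljoin and _BaseHrefFinder are hypothetical names, and the approach (reading the first base element's href) is an assumption about the general technique.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class _BaseHrefFinder(HTMLParser):
    """Hypothetical helper: remember the href of the first <base> element."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "base" and self.href is None:
            self.href = dict(attrs).get("href")


def page_urljoin(page_url: str, html: str, link: str) -> str:
    """Resolve link against the page's <base href>, or the page URL if absent."""
    finder = _BaseHrefFinder()
    finder.feed(html)
    # A relative <base href> is itself resolved against the page URL.
    base = urljoin(page_url, finder.href) if finder.href else page_url
    return urljoin(base, link)


page_url = "https://example.com/a/b.html"
html_with_base = '<html><head><base href="https://cdn.example.com/assets/"></head></html>'
joined_with_base = page_urljoin(page_url, html_with_base, "img.png")
joined_without_base = page_urljoin(page_url, "<html></html>", "c.html")
```

Joining against the response URL alone would resolve img.png under https://example.com/a/, which is the wrong location whenever the page declares a different base.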
0.3.0 (2022-06-14)
Backwards Incompatible Change:
web_poet.requests.request_backend_var is renamed to web_poet.requests.request_downloader_var.
Documentation and CI improvements.
0.2.0 (2022-06-10)
Backward Incompatible Change:
ResponseData is replaced with HttpResponse. HttpResponse exposes methods useful for web scraping (such as xpath and css selectors, json loading), and handles web page encoding detection. There are also new types like HttpResponseBody and HttpResponseHeaders.
Added support for performing additional requests using web_poet.HttpClient.
Introduced web_poet.BrowserHtml dependency.
Introduced web_poet.PageParams to pass arbitrary information inside a Page Object.
Added web_poet.handle_urls decorator, which allows declaring which websites should be handled by the page objects. Lower-level PageObjectRegistry class is also available.
Removed support for Python 3.6.
Added support for Python 3.10.
0.1.1 (2021-06-02)
base_url and urljoin shortcuts
0.1.0 (2020-07-18)
Documentation
WebPage, ItemPage, ItemWebPage, Injectable and ResponseData are available as top-level imports (e.g. web_poet.ItemPage)
0.0.1 (2020-04-27)
Initial release.