
The previous post in this series considered the mocking capabilities in the unittest package. Now we’ll look at what it offers for patching.
What’s the difference between “mocking” and “patching”? A perfectly reasonable question. The distinction is subtle, and TBH I’m not always 100% clear on the difference myself. These two terms are often used interchangeably. This is the way that I currently think about it:
- mocking — creates an entire fake object that mimics the behaviour of a real object; while
- patching — fakes a specific behaviour on an existing object.
Hopefully the difference will become clear as we work our way through the examples below.
Patchers
The unittest.mock package offers a selection of patchers:
- patch(),
- patch.object(),
- patch.dict() and
- patch.multiple().
Each of these can be used as a decorator or a context manager. We’ll start by importing the patch() function.
from unittest.mock import patch
💡 You only need to import patch(). The other three functions automatically come with it. Effectively patch() is the master function and the other functions are attributes attached to patch().
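A quick way to convince yourself of this is to look the helpers up on patch() directly. This is just a small sketch; the helpers dictionary is a name made up for the illustration.

```python
from unittest.mock import patch

# The other patchers are attributes hanging off patch() itself, so a single
# import is enough to reach all of them.
helpers = {name: getattr(patch, name) for name in ("object", "dict", "multiple")}

for name, helper in helpers.items():
    print(f"patch.{name}: callable={callable(helper)}")
```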
Patching
For the purpose of illustration, suppose that I have a randomiser module (source file) that implements a function, rng(), and a class, Random.
from randomiser import rng, Random
Call the rng() function. It’s not much of an RNG because it always returns the same number. However, the value that it consistently returns is a particularly fine example of a random number.
rng()
0.42
Create an instance of Random and call its random() method. This is also a fiendishly poor implementation of an RNG.
Random().random()
0.42
Patcher Objects
The patch() function is used for patching module-level attributes. You supply it with a string identifying the thing you want to patch. The example below creates a patcher object, patcher, for the rng() function in the randomiser module.
patcher = patch("randomiser.rng", return_value=0.13)
💡 The target is only imported when the patcher object is used, not when it is created. So you don’t need to have already imported the thing that you’re patching.
We specified the required return value as an argument to patch(). As we’ll see below, this is not the only way to do this. You could also assign to the return_value attribute on the mock object that the patcher creates. The approach you take will depend on whether or not the required return value is available at the time that you create the patcher object.
You might imagine that you’d now be able to call the patcher object. Sadly, you’d be wrong: that’s not how it works. However, the reality might be better than you imagined.
First turn the patcher on by starting it.
patcher.start()
Now call the function.
import randomiser
randomiser.rng()
0.13
Bravo! We get the patched result. 🚀 Now turn the patcher off by stopping it.
patcher.stop()
Try the function again.
randomiser.rng()
0.42
Aha, the original (unpatched) behaviour is restored. So, with this patcher object you can use the patched version of the function whenever required.
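The start/stop workflow above also works with the other approach to return values: start() hands back the mock, and you can assign return_value on it afterwards. Here is a self-contained sketch. Since the real randomiser module isn’t shown in this post, the stand-in module built with types.ModuleType is an assumption made purely so that the example runs on its own.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for the randomiser module (an assumption for this sketch).
randomiser = types.ModuleType("randomiser")
randomiser.rng = lambda: 0.42
sys.modules["randomiser"] = randomiser

# No return value is specified when the patcher is created.
patcher = patch("randomiser.rng")

# start() returns the mock that replaces the target.
mock_rng = patcher.start()
mock_rng.return_value = 0.13
patched = randomiser.rng()

# stop() restores the original function.
patcher.stop()
restored = randomiser.rng()

print(patched, restored)
```

This is handy when the value you want returned is only known after the patcher has been created.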
Context Manager
If that feels like a bit too much work, then a context manager may be your jam.
with patch("randomiser.rng", return_value=0.13) as mocked:
    randomiser.rng()
    # Try out one of the assertions attached to the mock object.
    mocked.assert_called_once()
0.13
💡 When used as a context manager or a decorator (see below), patch() hands you a MagicMock object. It’s not necessary to name this object, but doing so will unlock extra functionality, like checking that the patched object was called exactly once. More on these assertions to follow below.
You can also use the previously created patcher object as a context manager.
with patcher:
    randomiser.rng()
It’s possible to patch global variables too. The randomiser module has a global SEED (with a value of 13) and a function rng_seed() that simply returns the value of SEED.
randomiser.rng_seed()
13
We can use patch() to temporarily change the value of SEED.
with patch("randomiser.SEED", 777):
    randomiser.rng_seed()
777
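If you want to try this without the real module, here is a sketch that rebuilds just enough of it. The stand-in module and its rng_seed() are assumptions based on the description above, not the actual implementation.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for the randomiser module (an assumption for this sketch).
randomiser = types.ModuleType("randomiser")
randomiser.SEED = 13


def rng_seed():
    # Look the value up on the module each time so that a patch is visible.
    return randomiser.SEED


randomiser.rng_seed = rng_seed
sys.modules["randomiser"] = randomiser

# Passing a second positional argument replaces the target with that value
# rather than with a MagicMock.
with patch("randomiser.SEED", 777):
    inside = randomiser.rng_seed()

# The original value is back once the context manager exits.
after = randomiser.rng_seed()

print(inside, after)
```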
Decorator
Finally (and most relevant to the topic of testing), you can use patch() as a decorator.
@patch("randomiser.rng", return_value=0.13)
def test_rng(mock_rng):
    assert randomiser.rng() == 0.13
Another way to do the same thing is to assign the return_value inside the decorated function.
@patch("randomiser.rng")
def test_rng(mock_rng):
    mock_rng.return_value = 0.13
    assert randomiser.rng() == 0.13
Patching an Object
The patch.object() function is used to patch attributes on a specific object. The name of this function implies that it’s for patching “objects”. This can be confusing. It can be used for patching either
- a class (because in Python a class is also an object!) or
- an instance of a class.
Both of the following are valid:
# Patch the class method.
patch.object(Random, "random", return_value=0.13)
# The patch will apply to all instances of the class.
r = Random()
# Patch the instance method.
patch.object(r, "random", return_value=0.13)
# The patch will only apply to this specific instance of the class.
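To see the difference in reach between the two, here is a self-contained sketch. The Random class below is a minimal stand-in (an assumption, since the real one lives in the randomiser module) that always returns 0.42.

```python
from unittest.mock import patch


class Random:
    # Minimal stand-in for the class in the randomiser module (an assumption).
    def random(self):
        return 0.42


a, b = Random(), Random()

# Patching the class affects every instance, including existing ones.
with patch.object(Random, "random", return_value=0.13):
    class_level = (a.random(), b.random())

# Patching an instance leaves other instances alone.
with patch.object(a, "random", return_value=0.13):
    instance_level = (a.random(), b.random())

# Both patches have been undone at this point.
after = (a.random(), b.random())

print(class_level, instance_level, after)
```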
Patcher Object
As before we can create a patcher object. But now, rather than patching a function or method in a module, we’ll patch a method on a class.
patcher = patch.object(Random, "random", return_value=0.13)
As we’ll see below, the result can be used as a context manager or decorator. But first we’ll use it manually: start the patcher, call the patched method and then stop the patcher.
patcher.start()
r = Random()
r.random()
0.13
patcher.stop()
Voila! 🚀
Again, the object will revert to its original behaviour after the patcher is stopped.
Context Manager
Using the patcher object as a context manager means you don’t need to manually start and stop it. The object is only patched within the scope of the context manager.
with patch.object(Random, "random", return_value=0.13):
    r.random()
0.13
Decorator
Finally we can use it as a decorator. The object is only patched within the scope of the decorated function.
@patch.object(Random, "random", return_value=0.13)
def test_scraper(mock_random):
    r = Random()
    assert r.random() == 0.13
    mock_random.assert_called_once()
Patching a Dictionary
The patch.dict() function can be used for dictionaries and for other objects that implement a dictionary interface, with all of the associated dunder methods. The original contents are restored when the patch ends. This is the analog to the MagicMock class.
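A minimal sketch of patch.dict() in action. The config dictionary and its keys are invented for the illustration.

```python
from unittest.mock import patch

# Hypothetical settings dictionary (invented for this sketch).
config = {"retries": 3}

# Existing keys are overridden and new keys are added for the duration of the patch.
with patch.dict(config, {"retries": 5, "timeout": 10}):
    inside = dict(config)

# On exit the dictionary is restored to its original contents.
after = dict(config)

print(inside, after)
```

The same pattern is commonly used with os.environ to set environment variables for the duration of a test.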
Multiple Patches
The patch.multiple() function makes it possible to patch more than one item at a time. This can be useful, but TBH if I don’t really need to do it in a single statement, then I’m inclined to simply use multiple calls to one of the other functions.
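For completeness, here is roughly what it looks like. The Settings class and its attributes are hypothetical, made up just to have two things to patch at once.

```python
from unittest.mock import patch


class Settings:
    # Hypothetical container with two attributes to patch (invented for this sketch).
    SEED = 13
    RETRIES = 3


# Each keyword argument names an attribute on the target and its replacement value.
with patch.multiple(Settings, SEED=777, RETRIES=0):
    inside = (Settings.SEED, Settings.RETRIES)

# Both attributes are restored on exit.
after = (Settings.SEED, Settings.RETRIES)

print(inside, after)
```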
Tests with Patching
Let’s apply these patching functions to tests for the BooksScraper and QuotesScraper classes.
Patching a Return Value
Many of the simple examples above used the return_value parameter to specify the return value required on the patched object. Let’s see how this would be used when testing a scraper. We’ll apply it to the BooksScraper class.
First we’ll use a context manager to apply the patch.
from unittest.mock import patch
import json
import pytest
from scraper.books import BooksScraper

BOOKS = json.load(open("books-to-scrape.json"))
HTML = "books-to-scrape.html"

@pytest.fixture
def scraper():
    return BooksScraper()

def test_scraper(scraper):
    with open(HTML, "r") as f:
        html = f.read()
    with patch.object(scraper, "download", return_value=html):
        html = scraper.download()
    books = scraper.parse(html)
    assert books == BOOKS
The context manager creates a scope in which the download() method is called. Within this scope the method is replaced with a patched version that simply returns the HTML read from a file.
Decorators are more commonly used for testing. Now the scope of the patch is the entire test.
import json
from unittest.mock import patch
from scraper.books import BooksScraper

BOOKS = json.load(open("books-to-scrape.json"))
HTML = "books-to-scrape.html"

@patch.object(BooksScraper, "download")
def test_scraper(patched_download):
    with open(HTML, "r") as f:
        html = f.read()
    # Because the required return value is only loaded inside the test function,
    # it's not available to the decorator. Instead we assign it here.
    #
    patched_download.return_value = html
    scraper = BooksScraper()
    html = scraper.download()
    books = scraper.parse(html)
    assert books == BOOKS
    # Use assertions on attributes of the patched object. (Non-essential meta-testing.)
    #
    assert patched_download.called
    assert patched_download.call_count == 1
    # Use assertion methods on the patched object itself. (Non-essential meta-testing.)
    #
    patched_download.assert_called_once()
💡 Since the required return value is only loaded inside the decorated function it’s not possible to specify the return_value argument to patch.object(). Not a problem! This can be done later by setting the return_value attribute on the mock object that is passed to the test.
Patching with Side Effects
The implementation of QuotesScraper is such that we cannot use return_value but must instead use side_effect. The side effect implemented in the mock_download inner function sets the html attribute on the scraper object.
from unittest.mock import patch
import pandas as pd
import pytest
from scraper.quotes import QuotesScraper

QUOTES = pd.read_csv("quotes-to-scrape.csv")
HTML = "quotes-to-scrape.html"

@pytest.fixture
def scraper():
    return QuotesScraper()

@patch.object(QuotesScraper, "download")
def test_scraper(patched_download, scraper):
    def mock_download():
        with open(HTML, "r") as f:
            scraper.html = f.read()

    patched_download.side_effect = mock_download
    scraper.download()
    scraper.parse()
    scraper.normalise()
    data = scraper.transform()
    assert data.equals(QUOTES)
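The same mechanism can be tried out without the scraper package or any files on disk. The sketch below uses a made-up Scraper class (an assumption, not the real QuotesScraper) whose download() would normally hit the network and store the result on the instance.

```python
from unittest.mock import patch


class Scraper:
    # Hypothetical scraper that stores downloaded HTML on itself (invented for this sketch).
    def __init__(self):
        self.html = None

    def download(self):
        raise RuntimeError("network access is not wanted in tests")

    def parse(self):
        return self.html.upper()


scraper = Scraper()


def fake_download():
    # The side effect does the work that the real method would have done.
    scraper.html = "<p>hello</p>"


# side_effect can also be supplied up front as a keyword argument.
with patch.object(Scraper, "download", side_effect=fake_download) as mock_download:
    scraper.download()
    parsed = scraper.parse()
    calls = mock_download.call_count

print(parsed, calls)
```

A return value alone would not help here because parse() reads from the html attribute rather than from whatever download() returns.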
The side_effect attribute can be used to patch a generator too. The version of the mock_download inner function below yields the content of the HTML file.
import json
from unittest.mock import patch
from scraper.books import BooksScraper

BOOKS = json.load(open("books-to-scrape.json"))
HTML = "books-to-scrape.html"

@patch.object(BooksScraper, "download")
def test_scraper(patched_download):
    def mock_download():
        with open(HTML, "r") as f:
            yield f.read()

    patched_download.side_effect = mock_download
    scraper = BooksScraper()
    # Extract HTML from iterator.
    for html in scraper.download():
        pass
    # It will only be called once due to the way that the generator is implemented.
    patched_download.assert_called_once()
    books = scraper.parse(html)
    assert books == BOOKS
Patcher Attribute Assertions
You might have noticed that some tests access the called and call_count attributes on the patched object. These are handy little utilities that can be used to ensure (1) that the patch is indeed called and (2) that the patch is called the expected number of times.
The mock objects created by a patcher expose a number of assertions linked to these attributes that can be useful for meta-testing (checking that the tests are working as expected):
- assert_any_call(),
- assert_called(): checks that the mock was called,
- assert_called_once(): checks that it was called exactly once,
- assert_called_once_with(): checks that it was called exactly once, with the given arguments,
- assert_called_with(),
- assert_has_calls() and
- assert_not_called().
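Here is a short sketch of a few of these assertions on a bare MagicMock, which behaves the same way as the mock that a patcher hands you. The arguments passed to the mock are arbitrary values chosen for the illustration.

```python
from unittest.mock import MagicMock, call

mock = MagicMock(return_value=None)

# Call the mock twice with different (arbitrary) arguments.
mock("a")
mock("b", key=1)

mock.assert_called()                  # Called at least once.
mock.assert_any_call("a")             # Some call used these arguments.
mock.assert_called_with("b", key=1)   # The most recent call used these arguments.
mock.assert_has_calls([call("a"), call("b", key=1)])

# assert_called_once() raises an AssertionError because there were two calls.
try:
    mock.assert_called_once()
    once_failed = False
except AssertionError:
    once_failed = True

print(mock.call_count, once_failed)
```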
Patches are Flexible
In the previous post we mocked the response from https://dummyjson.com/user/1. Let’s do the same thing with patching.
import json
from unittest.mock import patch
import requests
emily = {
    "id": 1,
    "firstName": "Emily",
    "lastName": "Johnson",
    "age": 28,
    "gender": "female"
}

@patch("requests.get")
def test_user_response(mock_get):
    mock_get.return_value.status_code = 200
    mock_get.return_value.text = json.dumps(emily)
    response = requests.get("https://dummyjson.com/user/1")
    assert response.status_code == 200
    assert response.text == json.dumps(emily)
This is not in any way a meaningful test, but it shows how functionality from third party packages can easily be patched.
Conclusion
Patching is a useful alternative to mocking for web scraper tests. Often the two can be used interchangeably. Use one. Or the other. Or both. But, whatever you do, make use of patching and mocking to ensure that your web scrapers are tested quickly, robustly and predictably.
