This repository was archived by the owner on May 31, 2021. It is now read-only.

Commit 44e66bd

Add description for blocking asynchronous client.
1 parent 7110119 commit 44e66bd

webscraper.rst

Lines changed: 60 additions & 1 deletion
@@ -190,7 +190,9 @@ Therefore, we need to convert our strings in to bytestrings.
 Next, we read header and message from the reader, which is a ``StreamReader``
 instance.
 We need to iterate over the reader by using the specific for loop for
-``asyncio``::
+``asyncio``:
+
+.. code-block:: python
 
     async for raw_line in reader:
 
@@ -207,8 +209,65 @@ characters.
 Getting Multiple Pages Asynchronously - Without Time Savings
 ------------------------------------------------------------
 
+This is our first approach to retrieving multiple pages, using our asynchronous
+``get_page()``:
+
+
+.. literalinclude:: examples/async_client_blocking.py
+
+
+The interesting things happen in a few lines in ``get_multiple_pages()``
+(the rest of this function just measures the run time and displays it):
+
 .. literalinclude:: examples/async_client_blocking.py
+    :language: python
+    :start-after: pages = []
+    :end-before: duration
+
+The ``closing`` helper from the standard library module ``contextlib`` makes
+the event loop available within a context and closes the loop when leaving the context:
+
+.. code-block:: python
+
+    with closing(asyncio.get_event_loop()) as loop:
+        <body>
+
+The two lines above are equivalent to these five lines:
+
+.. code-block:: python
+
+    loop = asyncio.get_event_loop()
+    try:
+        <body>
+    finally:
+        loop.close()
+
+We call ``get_page()`` for each page in a loop.
+Here we decide to wrap each call in ``loop.run_until_complete()``:
+
+.. code-block:: python
+
+    for wait in waits:
+        pages.append(loop.run_until_complete(get_page(host, port, wait)))
+
+This means we wait until each page has been retrieved before asking for
+the next.
+Let's run it from the command line to see what happens::
+
+    async_client_blocking.py
+    It took 11.06 seconds for a total waiting time of 11.00.
+    Waited for 1.00 seconds.
+    That's all.
+    Waited for 5.00 seconds.
+    That's all.
+    Waited for 3.00 seconds.
+    That's all.
+    Waited for 2.00 seconds.
+    That's all.
 
+So it still takes about eleven seconds in total.
+We made the code more complex without improving its speed.
+Let's see if we can do better.
 
 Getting Multiple Pages Asynchronously - With Time Savings
 ---------------------------------------------------------
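
The blocking pattern added in the "Without Time Savings" section can be tried without a web server. The following is only a minimal sketch under stated assumptions: ``fake_get_page()`` and ``get_pages_blocking()`` are hypothetical names that do not appear in the commit, ``asyncio.sleep()`` stands in for the real network request made by ``get_page()``, and ``asyncio.new_event_loop()`` replaces ``asyncio.get_event_loop()`` so that the sketch also runs on recent Python versions.

```python
import asyncio
import time
from contextlib import closing


async def fake_get_page(wait):
    """Hypothetical stand-in for get_page(): sleep instead of doing network I/O."""
    await asyncio.sleep(wait)
    return 'Waited for {:.2f} seconds.'.format(wait)


def get_pages_blocking(waits):
    """Run one coroutine at a time, so the individual waits add up."""
    pages = []
    start = time.perf_counter()
    # Assumption: a fresh loop is created here; the commit uses
    # asyncio.get_event_loop() instead.
    with closing(asyncio.new_event_loop()) as loop:
        for wait in waits:
            # run_until_complete() returns only when this coroutine is done,
            # so the next request is not even started before that.
            pages.append(loop.run_until_complete(fake_get_page(wait)))
    duration = time.perf_counter() - start
    return pages, duration


if __name__ == '__main__':
    waits = [0.1, 0.5, 0.3, 0.2]
    pages, duration = get_pages_blocking(waits)
    print('It took {:.2f} seconds for a total waiting time of {:.2f}.'.format(
        duration, sum(waits)))
    print('\n'.join(pages))
```

Because each ``run_until_complete()`` call finishes before the next coroutine is created, the measured duration should be roughly the sum of the waits, which mirrors the eleven seconds reported in the diff.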
