Concurrent futures webscraping




Hello to whoever is reading this!
Thank you for taking the time to look at this.



I am currently trying to develop a fast webscraping function so I can scrape a large list of files.



This is the code I have currently:


import time
import requests
from bs4 import BeautifulSoup
from concurrent.futures import ProcessPoolExecutor, as_completed

def parse(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'lxml')
    return soup.find_all('a')

with ProcessPoolExecutor(max_workers=4) as executor:
    start = time.time()
    futures = [executor.submit(parse, url) for url in URLs]
    results = []
    for result in as_completed(futures):
        results.append(result)
    end = time.time()
    print("Time Taken: {:.6f}s".format(end - start))



This brings back results for websites, e.g. www.google.com.
However, my problem is that I have no idea how to view the data it brings back;
I only get Future objects.



Could someone please explain or show me how to do this?



I appreciate any time you give to help me with this.
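
One possible way to get at the data: as_completed() yields the Future objects themselves, so the return value of parse() has to be pulled out of each one with future.result(). The sketch below is only an illustration under a few assumptions. collect_results and fake_parse are hypothetical names that are not in the question, fake_parse is a stand-in for parse() so the example runs without network access, and a ThreadPoolExecutor is used for the demo. The same .result() call applies to futures from a ProcessPoolExecutor, but there the return value is pickled to cross the process boundary, so returning plain strings (for example the href values) may be safer than returning BeautifulSoup objects.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect_results(executor, fn, urls):
    """Submit fn(url) for every URL and return {url: return value or exception}."""
    # Remember which URL each Future belongs to, because as_completed()
    # yields futures in completion order, not submission order.
    future_to_url = {executor.submit(fn, url): url for url in urls}
    results = {}
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            # .result() returns what fn returned, or re-raises what fn raised.
            results[url] = future.result()
        except Exception as exc:
            # Keep the exception so one failing URL does not lose the rest.
            results[url] = exc
    return results


def fake_parse(url):
    # Hypothetical stand-in for parse() from the question: no network access,
    # it just returns a list of made-up link strings.
    return ["{}/link{}".format(url, i) for i in range(2)]


if __name__ == "__main__":
    urls = ["http://example.com/a", "http://example.com/b"]  # placeholder URLs
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = collect_results(executor, fake_parse, urls)
    for url, links in results.items():
        print(url, links)
```

In the code from the question, the smallest change along these lines would be to append result.result() instead of result inside the loop.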








