Ideone.com

#!/usr/bin/env python3
# requirements: pip3 install requests beautifulsoup4
import requests
try:
    from bs4 import BeautifulSoup
except ImportError:
    print('warning: might be using BS3!')
    from BeautifulSoup import BeautifulSoup
 
def web_crawler(max_page):
    page = 1
    while page <= max_page:  # include max_page itself
        url = 'http://b...content-available-to-author-only...e.com/catalogue/category/books_1/page-' + str(page) + '.html'
        response = requests.get(url)
        # Parse the raw bytes (.content).
        # .text => bs3: AttributeError: 'str' object has no attribute 'text'; must be bs4
        soup = BeautifulSoup(response.content, 'html.parser')
        for link in soup.findAll('h3'):
            print('link', link)
            href = 'http://b...content-available-to-author-only...e.com' + str(link.find('a').get('href'))
            title = link.string
            print('HREF', href)
            print('TITLE', title)
        page += 1
 
web_crawler(2)

Runtime error (0.01s, 27704KB)

stdin

Standard input is empty

stdout

Standard output is empty

stderr

Traceback (most recent call last):
  File "./prog.py", line 3, in <module>
    import requests
ImportError: No module named 'requests'
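
The traceback above shows that this sandbox has no `requests` module, so the script stops at the import before any crawling happens. As a minimal sketch, assuming only the Python 3 standard library is available, the same "find the `<a>` inside each `<h3>`" step could be done with `html.parser` and `urllib`. The names `H3LinkParser`, `extract_h3_links`, and `fetch` are hypothetical helpers, not part of the original paste, and `urljoin` is used here on the assumption that the page's `href` values may be relative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class H3LinkParser(HTMLParser):
    """Collect (href, text) pairs for <a> tags nested inside <h3> tags."""

    def __init__(self):
        super().__init__()
        self.links = []       # finished (href, text) pairs
        self._in_h3 = False   # True while inside an <h3> element
        self._href = None     # href of the <a> currently open, if any
        self._text = []       # text fragments of the <a> currently open

    def handle_starttag(self, tag, attrs):
        if tag == 'h3':
            self._in_h3 = True
        elif tag == 'a' and self._in_h3:
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        # Only keep text that belongs to an open <a href=...> inside <h3>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            self.links.append((self._href, ''.join(self._text).strip()))
            self._href = None
        elif tag == 'h3':
            self._in_h3 = False


def extract_h3_links(html, base_url):
    """Return absolute (url, title) pairs for links found inside <h3> tags."""
    parser = H3LinkParser()
    parser.feed(html)
    parser.close()
    return [(urljoin(base_url, href), text) for href, text in parser.links]


def fetch(url):
    """Download a page with the standard library only (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode('utf-8', errors='replace')
```

For a real page, `extract_h3_links(fetch(url), url)` would play the role of the `soup.findAll('h3')` loop. A sandbox that lacks `requests` may also block outbound network access, in which case `fetch` would fail as well.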

https://ideone.com/fqnu40

language:

Python 3 nbc (python 3.7.3)

visibility:

secret
