CopyPastor

Detecting plagiarism made easy.

Score: 1.8486103415489197; Reported for: String similarity, Exact paragraph match

Possible Plagiarism

Plagiarized on 2018-05-16
by PPL

Original Post

Original - Posted on 2016-12-09
by furas



            
Highlight legend: present in both answers; present only in the new answer; present only in the old answer.

Check the example below. To loop pages with `page=x` you need a `for` loop like this:

```python
import requests
from bs4 import BeautifulSoup

url = 'http://www.housingcare.org/housing-care/results.aspx?ath=1%2c2%2c3%2c6%2c7&stp=1&sm=3&vm=list&rp=10&page='

for page in range(10):
    print('---', page, '---')

    r = requests.get(url + str(page))
    soup = BeautifulSoup(r.content, "html.parser")

    # String substitution for HTML
    for link in soup.find_all("a"):
        print("<a href='%s'>%s</a>" % (link.get("href"), link.text))

    # Fetch and print general data from title class
    general_data = soup.find_all('div', {'class': 'title'})
    for item in general_data:
        print(item.contents[0].text)
        print(item.contents[1].text.replace('.', ''))
        print(item.contents[2].text)
```
Every page can be different, and a better solution needs more information about the page. Sometimes you can get a link to the last page and then use that information instead of `10` in `range(10)`.
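For instance, if the pager exposes a direct link to the last page, its `page=` parameter can serve as the loop's upper bound. A minimal sketch of that parsing step, assuming a hypothetical `href` taken from such a link (the query layout mirrors the URL above):

```python
from urllib.parse import urlparse, parse_qs

# hypothetical href of a "last page" link found in the pager
last_href = 'results.aspx?ath=1%2c2%2c3%2c6%2c7&stp=1&sm=3&vm=list&rp=10&page=42'

# parse the query string and read the `page` parameter
query = parse_qs(urlparse(last_href).query)
last_page = int(query['page'][0])

print(last_page)  # 42

# the loop can then cover every page instead of a hard-coded 10:
# for page in range(last_page + 1):
#     r = requests.get(url + str(page))
```

How to find that link in the first place depends on the page's markup, which is why the answer asks for the real URL.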
Or you can use `while True` to loop and `break` to leave the loop when there is no link to the next page. But first you have to show this page (the URL of the real page) in the question.
EDIT: an example of how to get the link to the next page, so you get all pages, not only 10 pages as in the previous version.

```python
import requests
from bs4 import BeautifulSoup

# link to first page - without `page=`
url = 'http://www.housingcare.org/housing-care/results.aspx?ath=1%2c2%2c3%2c6%2c7&stp=1&sm=3&vm=list&rp=10'

# only for information, not used in url
page = 0

while True:
    print('---', page, '---')

    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")

    # String substitution for HTML
    for link in soup.find_all("a"):
        print("<a href='%s'>%s</a>" % (link.get("href"), link.text))

    # Fetch and print general data from title class
    general_data = soup.find_all('div', {'class': 'title'})
    for item in general_data:
        print(item.contents[0].text)
        print(item.contents[1].text.replace('.', ''))
        print(item.contents[2].text)

    # link to next page
    next_page = soup.find('a', {'class': 'next'})
    if next_page:
        url = next_page.get('href')
        page += 1
    else:
        break  # exit `while True`
```

To loop pages with `page=x` you need a `for` loop like this:

```python
import requests
from bs4 import BeautifulSoup

url = 'http://www.housingcare.org/housing-care/results.aspx?ath=1%2c2%2c3%2c6%2c7&stp=1&sm=3&vm=list&rp=10&page='

for page in range(10):
    print('---', page, '---')

    r = requests.get(url + str(page))
    soup = BeautifulSoup(r.content, "html.parser")

    # String substitution for HTML
    for link in soup.find_all("a"):
        print("<a href='%s'>%s</a>" % (link.get("href"), link.text))

    # Fetch and print general data from title class
    general_data = soup.find_all('div', {'class': 'title'})
    for item in general_data:
        print(item.contents[0].text)
        print(item.contents[1].text.replace('.', ''))
        print(item.contents[2].text)
```

Every page can be different, and a better solution needs more information about the page. Sometimes you can get a link to the last page and then use that information instead of `10` in `range(10)`.
Or you can use `while True` to loop and `break` to leave the loop when there is no link to the next page. But first you have to show this page (the URL of the real page) in the question.
---
**EDIT:** an example of how to get the link to the next page, so you get all pages, not only 10 pages as in the previous version.

```python
import requests
from bs4 import BeautifulSoup

# link to first page - without `page=`
url = 'http://www.housingcare.org/housing-care/results.aspx?ath=1%2c2%2c3%2c6%2c7&stp=1&sm=3&vm=list&rp=10'

# only for information, not used in url
page = 0

while True:
    print('---', page, '---')

    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")

    # String substitution for HTML
    for link in soup.find_all("a"):
        print("<a href='%s'>%s</a>" % (link.get("href"), link.text))

    # Fetch and print general data from title class
    general_data = soup.find_all('div', {'class': 'title'})
    for item in general_data:
        print(item.contents[0].text)
        print(item.contents[1].text.replace('.', ''))
        print(item.contents[2].text)

    # link to next page
    next_page = soup.find('a', {'class': 'next'})
    if next_page:
        url = next_page.get('href')
        page += 1
    else:
        break  # exit `while True`
```
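One caveat with this pattern: the `href` of the next-page link may be relative, in which case it has to be joined with the current URL before the next `requests.get` call. A sketch using the standard library's `urljoin`; the hrefs here are illustrative, not taken from the real page:

```python
from urllib.parse import urljoin

# URL of the page that is currently being scraped
base = 'http://www.housingcare.org/housing-care/results.aspx?rp=10'

# a relative href, as a pager link might return it
relative_href = 'results.aspx?rp=10&page=2'
print(urljoin(base, relative_href))
# http://www.housingcare.org/housing-care/results.aspx?rp=10&page=2

# an absolute href passes through unchanged
absolute_href = 'http://www.housingcare.org/other.aspx'
print(urljoin(base, absolute_href))
# http://www.housingcare.org/other.aspx
```

Using `url = urljoin(url, next_page.get('href'))` in the loop handles both cases.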
