Python - BeautifulSoup scraping - AttributeError: 'NoneType' object has no attribute 'find'


I am practicing scraping with BeautifulSoup. Below are my code and a screenshot of the web page and its elements. I am trying to get the title of each post on reddit.com.

Code:

import urllib2
from bs4 import BeautifulSoup

url = 'https://www.reddit.com/'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page, 'html.parser')
posttitles = soup.find_all("div", {"class", "thing"})
for title in posttitles:
    tclass = title.find("div", {"class", "entry"})
    posttitle = tclass.find("a", {"class", "title"})
    print posttitle
    print "\n\n"

Error:

Traceback (most recent call last):
  File "scrapingtest.py", line 21, in <module>
    posttitle = tclass.find("a", {"class", "title"})
AttributeError: 'NoneType' object has no attribute 'find'

[Screenshot of the Reddit page and its elements]

Reason for the error

Traceback (most recent call last):
  File "scrapingtest.py", line 21, in <module>
    posttitle = tclass.find("a", {"class", "title"})
AttributeError: 'NoneType' object has no attribute 'find'

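You are getting this error because the value of tclass is None: find returns None when no matching element exists, and you cannot call find on None. That is exactly what the error message states.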
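A minimal sketch of guarding against that None, assuming Python 3 and a made-up HTML snippet (SAMPLE_HTML and extract_titles are hypothetical names, not part of the question). The attribute filter is written as a dictionary here, and posts that lack the expected elements are skipped instead of crashing:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second "thing" has no "entry" div.
SAMPLE_HTML = """
<div class="thing"><div class="entry"><a class="title" href="/a">First post</a></div></div>
<div class="thing"><p>no entry here</p></div>
"""


def extract_titles(html):
    """Return the text of each post title, skipping posts with missing elements."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for post in soup.find_all("div", {"class": "thing"}):
        entry = post.find("div", {"class": "entry"})
        if entry is None:  # find() returns None when nothing matches
            continue
        link = entry.find("a", {"class": "title"})
        if link is None:
            continue
        titles.append(link.get_text(strip=True))
    return titles


print(extract_titles(SAMPLE_HTML))
```

With a guard like this, a page that contains no matching elements yields an empty list rather than an AttributeError, which makes the real problem (an unexpected response) easier to spot.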

Debugging

Please print out the value of soup to check the HTML response you actually get. Reddit blocks repeated requests and sends a simple message instead of the usual listings.
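One way to make that check a little more systematic is sketched below, assuming Python 3. The helper name inspect_response and the sample "blocked" page are made up for illustration; the "thing" class comes from the question's code and may not match what Reddit serves today:

```python
from bs4 import BeautifulSoup


def inspect_response(html, preview_chars=300):
    """Print a short summary of the HTML and return (page title, post count)."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.title
    title = title_tag.get_text(strip=True) if title_tag is not None else None
    # Count the containers the scraper expects; zero suggests a different page.
    post_count = len(soup.find_all("div", class_="thing"))
    print("page title:", title)
    print("post containers found:", post_count)
    print(soup.prettify()[:preview_chars])
    return title, post_count


# Made-up example of the kind of short page a blocked request might return.
blocked = "<html><head><title>Too Many Requests</title></head><body>slow down</body></html>"
inspect_response(blocked)
```

If the post count is zero and the title is not the one you see in a browser, the scraper is parsing a different page than the one in the screenshot.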

Possible workaround

Use a proper User-Agent header and other measures to simulate the behaviour of a human browsing Reddit in a browser.
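A sketch of sending a custom User-Agent, assuming Python 3, where urllib.request replaces the question's urllib2. The USER_AGENT string and the helper names are made up, and a header alone may not be enough to avoid being blocked:

```python
import urllib.request

# Hypothetical User-Agent string; substitute one that describes your client.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) scraping-practice/0.1"


def build_request(url, user_agent=USER_AGENT):
    """Create a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Fetch the page and return its body as text (performs a network call)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


request = build_request("https://www.reddit.com/")
# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

The string returned by fetch_html can be passed to BeautifulSoup in place of the page object used in the question.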

You might also want to try doing this with Selenium.

Alternatives

Reddit provides APIs for collecting data and building bots. I have never tried them, so I am not sure what is allowed and what is not. You might look at the APIs to see if they match your needs.
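As a rough illustration of why an API can be easier than scraping HTML, the sketch below pulls titles out of an already-parsed JSON listing. It assumes the listing shape commonly returned by Reddit (a "data" object holding "children", each with its own "data" and "title"); the sample payload and the function name are made up, and access rules and authentication requirements should be checked in Reddit's API documentation:

```python
import json

# Made-up payload following the assumed listing shape.
SAMPLE_LISTING = json.loads("""
{
  "kind": "Listing",
  "data": {
    "children": [
      {"kind": "t3", "data": {"title": "First post"}},
      {"kind": "t3", "data": {"title": "Second post"}}
    ]
  }
}
""")


def titles_from_listing(listing):
    """Return post titles from a parsed listing, ignoring malformed entries."""
    children = listing.get("data", {}).get("children", [])
    return [
        child["data"]["title"]
        for child in children
        if "title" in child.get("data", {})
    ]


print(titles_from_listing(SAMPLE_LISTING))
```

No HTML parsing is involved, so there is no find call that can return None; missing fields are handled by the dictionary lookups.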

