I want to dig deeper into some statistical data, so I wrote this code:
    import urllib2
    import HTMLParser

    response = urllib2.urlopen('http://www.eia.gov/dnav/pet/hist/leafhandler.ashx?n=pet&s=mcestus1&f=m')
    data = response.read()

    class TableParser(HTMLParser.HTMLParser):
        def __init__(self):
            HTMLParser.HTMLParser.__init__(self)
            self.in_td = False

        def handle_starttag(self, tag, attrs):
            if tag == 'td':
                self.in_td = True

        def handle_data(self, data):
            if self.in_td:
                print data

        def handle_endtag(self, tag):
            self.in_td = False

    p = TableParser()
    raw_data = p.feed(data)
    raw_list = []
    for string in raw_data:
        raw_list.append(string)
    print raw_list
Here is some of the output from the script:
    2015 421,472 448,039 474,815 483,379 479,335 469,539 455,470 457,810 460,786 486,700 -
    Release Date: 12/31/2015
    Next Release Date: 1/29/2016
    Traceback (most recent call last):
      File "firstdata.py", line 28, in <module>
        for string in raw_data:
    TypeError: 'NoneType' object is not iterable
First, why doesn't that work? It says I can't iterate over a NoneType object.
Second, how can I put the data into pandas so I can graph month against quantity?
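It looks like the parser is parsing what you need, but you have to get the data out of the parser by some method other than the return value of p.feed(data), which is always going to be None. How about accumulating it in a list property of the parser object: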
    import urllib2
    import HTMLParser

    response = urllib2.urlopen('http://www.eia.gov/dnav/pet/hist/leafhandler.ashx?n=pet&s=mcestus1&f=m')
    data = response.read()

    class TableParser(HTMLParser.HTMLParser):
        def __init__(self):
            HTMLParser.HTMLParser.__init__(self)
            self.in_td = False
            self.raw_data = []  # collects the text of every <td> cell

        def handle_starttag(self, tag, attrs):
            if tag == 'td':
                self.in_td = True

        def handle_data(self, data):
            if self.in_td:
                print data
                self.raw_data.append(data)

        def handle_endtag(self, tag):
            self.in_td = False

    p = TableParser()
    p.feed(data)  # feed() returns None; the results live on the parser
    print p.raw_data
Untested.
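For the second part of the question, here is a rough sketch of the same accumulate-on-the-parser idea, extended to group cells by row and flatten them into (year, month, quantity) records. It is written for Python 3 (html.parser instead of the HTMLParser module), which is a change from the code above. The rows attribute, the to_records helper and the sample markup are all made up for illustration, and it assumes each data row is a year followed by monthly values in calendar order, which I have not checked against the real page.

```python
from html.parser import HTMLParser  # Python 3 home of the HTMLParser class


class TableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.in_td = False
        self.rows = []  # hypothetical attribute: one list of cell strings per <tr>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:          # tolerate a <td> that appears before any <tr>
                self.rows.append([])
            self.rows[-1].append("")   # reserve a slot so empty cells keep their position
            self.in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td:
            self.rows[-1][-1] += data


def to_records(rows):
    """Flatten [year, jan, feb, ...] rows into (year, month, quantity) tuples.

    Assumption: the first cell of a data row is a year and the remaining
    cells are monthly values in order. Non-numeric cells such as '-' are skipped.
    """
    records = []
    for row in rows:
        cells = [cell.strip() for cell in row]
        if not cells or not cells[0].isdigit():
            continue  # not a data row (e.g. the release date line)
        year = int(cells[0])
        for month, cell in enumerate(cells[1:], start=1):
            value = cell.replace(",", "")
            if value.isdigit():
                records.append((year, month, int(value)))
    return records


# Made-up sample markup for illustration, not the real EIA page.
sample_html = (
    "<table>"
    "<tr><td>2015</td><td>421,472</td><td>448,039</td><td>-</td></tr>"
    "<tr><td>Release Date: 12/31/2015</td></tr>"
    "</table>"
)

parser = TableParser()
result = parser.feed(sample_html)  # feed() returns None, so read parser.rows instead
parser.close()
records = to_records(parser.rows)
print(result)
print(records)
```

If that row layout holds, something like pandas.DataFrame(records, columns=["year", "month", "quantity"]) should give you a frame you can plot from, but I have not tried it against the live data.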