Scraping Table With Python/BS4


I'm trying to scrape the "Team Stats" table from http://www.pro-football-reference.com/boxscores/201602070den.htm using BS4 and Python 2.7. However, I'm unable to get anywhere close to it.

    import requests
    from bs4 import BeautifulSoup

    url = 'http://www.pro-football-reference.com/boxscores/201602070den.htm'
    page = requests.get(url)
    soup = BeautifulSoup(page.text, "html5lib")
    # Look for the table by its id and class
    table = soup.findAll('table', {'id': "team_stats", "class": "stats_table"})
    print table

I thought the above code would work, but no luck.

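The problem in this case is that the "Team Stats" table is located inside a comment in the HTML source that you download with requests. Because the parser treats the comment as a single text node, the table never appears as an element in the tree. Locate the comment and reparse it with BeautifulSoup into a "soup" object: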

    import requests
    from bs4 import BeautifulSoup, NavigableString

    url = 'http://www.pro-football-reference.com/boxscores/201602070den.htm'
    page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36'})

    soup = BeautifulSoup(page.content, "html5lib")
    # Find the text node (the comment) that contains the table markup
    comment = soup.find(text=lambda x: isinstance(x, NavigableString) and "team_stats" in x)

    # Parse the comment's text as HTML in its own right
    soup = BeautifulSoup(comment, "html5lib")
    table = soup.find("table", id="team_stats")
    print(table)
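To see why the first attempt finds nothing, here is a minimal offline sketch using only the standard library's html.parser (Python 3 assumed). The SAMPLE_HTML string is a made-up stand-in for the downloaded page, not the site's real markup, and the class and function names are hypothetical. It shows that a first parse sees no table element at all, while parsing the comment text a second time recovers it.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect the raw text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->"
        self.comments.append(data)


class TableIdCollector(HTMLParser):
    """Collect the id attribute of every <table> start tag."""

    def __init__(self):
        super().__init__()
        self.table_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_ids.append(dict(attrs).get("id"))


def table_ids(html):
    """Return the ids of all <table> elements the parser actually sees."""
    parser = TableIdCollector()
    parser.feed(html)
    parser.close()
    return parser.table_ids


def comments_containing(html, needle):
    """Return the text of every comment that contains the given substring."""
    parser = CommentCollector()
    parser.feed(html)
    parser.close()
    return [c for c in parser.comments if needle in c]


# Made-up stand-in for the downloaded page (assumption, not the real markup).
SAMPLE_HTML = """
<div id="all_team_stats">
<!--
<table id="team_stats" class="stats_table">
<tr><th></th><th>DEN</th><th>CAR</th></tr>
<tr><th>First Downs</th><td>11</td><td>21</td></tr>
</table>
-->
</div>
"""

# First pass: the table is hidden inside the comment, so no <table> is seen.
visible = table_ids(SAMPLE_HTML)

# Second pass: treat the comment text as HTML and parse it again.
hidden = comments_containing(SAMPLE_HTML, "team_stats")
recovered = table_ids(hidden[0])

print(visible)
print(recovered)
```

The same two-pass idea is what the BeautifulSoup code above does: find the text node that mentions "team_stats", then build a second soup from it.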

Alternatively, or in addition, you can load the table into, for example, a pandas DataFrame, which is convenient to work with:

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup
    from bs4 import NavigableString

    url = 'http://www.pro-football-reference.com/boxscores/201602070den.htm'
    page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36'})

    soup = BeautifulSoup(page.content, "html5lib")
    # Find the text node (the comment) that contains the table markup
    comment = soup.find(text=lambda x: isinstance(x, NavigableString) and "team_stats" in x)

    # read_html returns a list of DataFrames, one per table found
    # (recent pandas versions prefer the string wrapped in io.StringIO)
    df = pd.read_html(comment)[0]
    print(df)

Prints:

                Unnamed: 0            DEN            CAR
    0          First Downs             11             21
    1         Rush-Yds-TDs        28-90-1       27-118-1
    2    Cmp-Att-Yd-TD-INT  13-23-141-0-1  18-41-265-0-1
    3         Sacked-Yards           5-37           7-68
    4       Net Pass Yards            104            197
    5          Total Yards            194            315
    6         Fumbles-Lost            3-1            4-3
    7            Turnovers              2              4
    8      Penalties-Yards           6-51         12-102
    9     Third Down Conv.           1-14           3-15
    10   Fourth Down Conv.            0-0            0-0
    11  Time of Possession          27:13          32:47
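Several cells in this output pack multiple numbers into one hyphen-separated string (for example 28-90-1 in the rushing row). As a possible follow-up, the sketch below pairs each part of such a value with the matching part of its row label. The helper name split_compound is hypothetical, and the label and value strings are typed in by hand rather than read from the DataFrame. It assumes every part is a non-negative integer, so a negative yardage figure or the colon-separated time of possession would need separate handling.

```python
def split_compound(label, value, sep="-"):
    """Pair each part of a label like 'Rush-Yds-TDs' with the matching
    part of a value like '28-90-1' and convert the parts to integers."""
    names = label.split(sep)
    parts = value.split(sep)
    if len(names) != len(parts):
        raise ValueError(
            "label %r and value %r have different numbers of parts" % (label, value)
        )
    return {name: int(part) for name, part in zip(names, parts)}


# Values copied by hand from the printed table above.
den_rushing = split_compound("Rush-Yds-TDs", "28-90-1")
car_passing = split_compound("Cmp-Att-Yd-TD-INT", "18-41-265-0-1")

print(den_rushing)
print(car_passing)
```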
