python - How to fix "unexpected keyword argument 'useChardet'" in html5lib


I'm using html5lib, and after updating to the latest version I keep getting this error:

Traceback (most recent call last):
  File "/home/travis/build/freelawproject/juriscraper/tests/test_everything.py", line 119, in test_scrape_all_example_files
    site.parse()
  File "/home/travis/build/freelawproject/juriscraper/juriscraper/abstractsite.py", line 95, in parse
    self.html = self._download()
  File "/home/travis/build/freelawproject/juriscraper/juriscraper/abstractsite.py", line 384, in _download
    html_tree = self._make_html_tree(text)
  File "/home/travis/build/freelawproject/juriscraper/juriscraper/opinions/united_states/federal_appellate/ca11_u.py", line 26, in _make_html_tree
    e = html5parser.document_fromstring(text)
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/lxml/html/html5parser.py", line 64, in document_fromstring
    return parser.parse(html, useChardet=guess_charset).getroot()
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/html5lib/html5parser.py", line 235, in parse
    self._parse(stream, False, None, *args, **kwargs)
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/html5lib/html5parser.py", line 85, in _parse
    self.tokenizer = _tokenizer.HTMLTokenizer(stream, parser=self, **kwargs)
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/html5lib/_tokenizer.py", line 36, in __init__
    self.stream = HTMLInputStream(stream, **kwargs)
  File "/home/travis/virtualenv/python2.7.9/lib/python2.7/site-packages/html5lib/_inputstream.py", line 149, in HTMLInputStream
    return HTMLUnicodeInputStream(source, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'useChardet'

The code I'm using is simple:

from lxml.html import html5parser
html5parser.document_fromstring(u'<html></html>')

Any ideas?

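It turns out that if you feed a unicode object to the document_fromstring method, it barfs. It didn't used to; this only started happening when I updated my dependencies. Judging by the traceback, lxml passes useChardet along to html5lib regardless of the input type, and the unicode input stream in the updated html5lib rejects that keyword.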
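To make the failure mode in the traceback concrete, here is a minimal sketch of the same pattern: a factory that forwards every keyword argument to whichever stream class matches the input type. The names UnicodeStream, BinaryStream, open_stream and try_open are hypothetical stand-ins, not html5lib's real code, and the assumption that only the byte-oriented class accepts useChardet is inferred from the error, not taken from the library source.

```python
class UnicodeStream(object):
    """Stand-in for a text stream: already decoded, so no charset option."""

    def __init__(self, source):
        self.source = source


class BinaryStream(object):
    """Stand-in for a byte stream: accepts a charset-guessing option."""

    def __init__(self, source, useChardet=True):
        self.source = source
        self.useChardet = useChardet


def open_stream(source, **kwargs):
    """Pick a stream class by input type and forward all keywords blindly."""
    if isinstance(source, type(u'')):  # unicode on Python 2, str on Python 3
        return UnicodeStream(source, **kwargs)
    return BinaryStream(source, **kwargs)


def try_open(source):
    """Return 'ok' if the stream opens, otherwise the TypeError message."""
    try:
        open_stream(source, useChardet=True)
        return 'ok'
    except TypeError as exc:
        return 'TypeError: %s' % exc
```

With this sketch, try_open(b'<html></html>') succeeds, while try_open(u'<html></html>') reports a TypeError about the unexpected useChardet keyword, which mirrors the shape of the error above.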

Anyway, the fix is easy: encode the string to bytes before parsing.

html5parser.document_fromstring(u'<html></html>'.encode('utf-8'))
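If the markup can arrive as either text or bytes, the same workaround can be wrapped in a small helper. This is only a sketch: as_bytes and parse_html5 are hypothetical names, and UTF-8 is assumed as the encoding, as in the one-liner above.

```python
def as_bytes(markup, encoding='utf-8'):
    """Return markup as a byte string, encoding it only if it is text."""
    text_type = type(u'')  # unicode on Python 2, str on Python 3
    if isinstance(markup, text_type):
        return markup.encode(encoding)
    return markup


def parse_html5(markup):
    """Parse markup with lxml's html5parser, always handing it bytes."""
    # Imported here so that as_bytes() stays usable without lxml installed.
    from lxml.html import html5parser
    return html5parser.document_fromstring(as_bytes(markup))
```

Calling parse_html5(u'<html></html>') then takes the byte-string path that the fix relies on, and input that is already bytes passes through unchanged.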
