Once I decided that I needed XML support for my new plug-in, I started looking for ways to work with it in Python.
The default XML package seems okay, but it has one big issue: poor formatting support. By default you get an unformatted XML string, which is valid but hard to work with. There is a workaround, but it is tedious and causes some issues of its own (for example). So that’s not the way I want to work. There are a bunch of options around the web, but I chose the lxml package, since it looks pretty easy to use, and you can read about its benefits here.
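To show what the formatting complaint looks like in practice, here is a minimal sketch using only the standard library. The element and attribute names are made up for illustration, and the `minidom` round trip is one commonly used workaround, not necessarily the exact one referred to above.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Build a tiny tree (element and attribute names are illustrative only).
root = ET.Element("settings")
ET.SubElement(root, "option", name="units").text = "cm"

# Standard library serialisation: the whole document comes back on one line.
raw = ET.tostring(root).decode("ascii")
print(raw)

# Workaround: re-parse the string with minidom just to get indentation.
pretty_stdlib = minidom.parseString(raw).toprettyxml(indent="  ")
print(pretty_stdlib)
```

The extra parse is the tedious part: the document is serialised, parsed again and serialised a second time only to get line breaks.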
The speed comparison (source):
|Parser||Python 2.7 speed (KB/s)||Python 2.7 success rate||Python 3.2 speed (KB/s)||Python 3.2 success rate|
|Beautiful Soup 3.2 (SGMLParser)||211||100%||–||–|
|html5lib (BS3 treebuilder)||253||99%||–||–|
|Beautiful Soup 4.0 + lxml||255||100%||2140||96%|
|html5lib (lxml treebuilder)||270||99%||–||–|
|Beautiful Soup 4.0 + html5lib||271||98%||–||–|
|Beautiful Soup 4.0 + HTMLParser||299||59%||1705||57%|
|html5lib (simpletree treebuilder)||332||100%||–||–|
But I hit the same problem as many users around the web: it didn’t work for me out of the box. After I installed it from VS2013 through pip, it showed up as importable, but some of the DLLs were missing. I found an unofficial build of this package here. The next step is installation. It took me a bit of time to figure out, so here are step-by-step instructions.
- Download lxml from http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml
- Copy the file to some directory, let’s say C:\temp
- Open VS and, in the Solution Explorer window, right-click (RMB) your Python interpreter. In my case it is called Maya.
- Click Install Python Package
- Type the path and file name into the field and click run as administrator
- Once you click OK, the installation will run.
In case it doesn’t work for you, try installing the “wheel” package the same way first.
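If you prefer to skip the VS dialog, the same steps can be sketched from Python by calling pip through the interpreter that runs the script. This is only a rough equivalent under a few assumptions: the script is run with the same interpreter you want to install into, pip is available for it, and the wheel file name below is a hypothetical placeholder for whatever file you actually downloaded. The helper names are mine, not part of pip.

```python
import subprocess
import sys


def pip_install_command(target):
    """Build the pip command line for the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", target]


def install(target):
    """Run pip; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(target))


if __name__ == "__main__":
    # Hypothetical placeholder: use the exact name of the file you downloaded.
    wheel_file = r"C:\temp\lxml-example.whl"

    # Print the commands first so you can check them before running anything.
    print(" ".join(pip_install_command("wheel")))
    print(" ".join(pip_install_command(wheel_file)))

    # Uncomment to actually install (may need an elevated prompt):
    # install("wheel")
    # install(wheel_file)
```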
After that lxml should be available to work with!
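A quick way to check that the install really worked, and that pretty printing does what I wanted, is a short script like the one below. It is a minimal sketch: the function name and element names are made up, and it assumes that a broken install (for example, missing DLLs) surfaces as an `ImportError`.

```python
def lxml_status():
    """Return a short report on whether lxml can be imported and used."""
    try:
        from lxml import etree
    except ImportError as exc:
        # A missing or broken binary typically ends up here.
        return "lxml is not usable: %s" % exc

    # Element names are illustrative only.
    root = etree.Element("check")
    etree.SubElement(root, "child").text = "ok"

    # pretty_print=True makes lxml emit indented, multi-line XML.
    xml = etree.tostring(root, pretty_print=True).decode("utf-8")
    version = ".".join(str(part) for part in etree.LXML_VERSION)
    return "lxml %s works:\n%s" % (version, xml)


print(lxml_status())
```

If the import fails, the message from the exception usually says which module or DLL could not be loaded.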