A fast, tolerant HTML parser for Ruby, written in C. It can parse broken HTML and is designed to be easy to use for scraping and manipulation. It aims to be very accommodating (like Tanaka Akira's HTree) and to offer a very helpful library (like some JavaScript libraries, such as jQuery and Prototype). The XPath and CSS parser is, in fact, based on John Resig's jQuery.
Part of T2.
URL: https://github.com/hpricot/hpricot
Author: why the lucky stiff
Maintainer: Boudewijn van der Heide <boudewijn [at] delta-utec [dot] com>
License: MIT
Version: 0.8.6
Download: https://github.com/hpricot/hpricot/ hpricot-0.8.6.tar.gz
T2 source: hpricot.cache
T2 source: hpricot.desc
Build time (on reference hardware): 2% (relative to binutils; see note 2)
Installed size (on reference hardware): 2.18 MB, 81 files
Dependencies (build time detected): bash coreutils diffutils gawk grep gzip linux-header make openssl ruby sed sysfiles tar util-linux
Installed files (on reference hardware): n.a.
1) This page was automatically generated from the T2 package source. Corrections, such as dead links, URL changes, or typos, need to be made directly in that source.
2) Compatible with Linux From Scratch's "Standard Build Unit" (SBU).