This project is read-only.

Project Description
An HTML parser that turns badly formatted HTML into XPath-queryable XML. Similar to HTML Tidy and Html Agility Pack; I suppose you can call it "Just Another Html Parser". Written in C# and requires nothing beyond what ships with the .NET Framework.

* Knows to close tags that are left open when the parent tag closes.
* Attempts to relocate elements that are not allowed to be nested within each other.
* Puts quotation marks around attribute values that have none.
* Inherits directly from the XmlDocument class, so all XmlDocument features work.
* Contains an embedded XSLT transform function that accepts nothing more than text!
* Contains an embedded web client that allows it to load documents from a Uri object.
* Parses only known HTML elements and ignores all others.
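
Because the parser derives from XmlDocument, the usual XPath and XSLT calls should apply to its output. This page does not name the project's class or its transform function, so the sketch below uses a plain XmlDocument and the framework's XslCompiledTransform as stand-ins. `XmlDocumentSketch`, `Load`, `Transform`, and `CleanedHtml` are hypothetical names made up for illustration, and the markup is a hand-written example of already well-formed output.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XmlDocumentSketch
{
    // Hand-written, well-formed markup standing in for the parser's output.
    public const string CleanedHtml =
        "<html><body><ul>" +
        "<li><a href=\"/one\">One</a></li>" +
        "<li><a href=\"/two\">Two</a></li>" +
        "</ul></body></html>";

    // A stylesheet supplied as plain text; it emits each href followed by ';'.
    public const string StylesheetText =
        "<xsl:stylesheet version=\"1.0\" " +
        "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
        "<xsl:output method=\"text\"/>" +
        "<xsl:template match=\"/\">" +
        "<xsl:for-each select=\"//a\">" +
        "<xsl:value-of select=\"@href\"/><xsl:text>;</xsl:text>" +
        "</xsl:for-each>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    // Assumption: a plain XmlDocument replaces the project's parser class here.
    public static XmlDocument Load(string markup)
    {
        var doc = new XmlDocument();
        doc.LoadXml(markup);
        return doc;
    }

    // Compiles a stylesheet given as text and returns the transform result as text.
    public static string Transform(XmlDocument doc, string stylesheetText)
    {
        var xslt = new XslCompiledTransform();
        using (var reader = XmlReader.Create(new StringReader(stylesheetText)))
        {
            xslt.Load(reader);
        }
        using (var output = new StringWriter())
        {
            xslt.Transform(doc, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        XmlDocument doc = Load(CleanedHtml);

        // Standard XmlDocument XPath query: every anchor that has an href.
        foreach (XmlNode link in doc.SelectNodes("//a[@href]"))
        {
            Console.WriteLine(link.Attributes["href"].Value + " -> " + link.InnerText);
        }

        Console.WriteLine(Transform(doc, StylesheetText));
    }
}
```

With the real parser, the document would be filled from a badly formatted page rather than from `LoadXml`, and the same `SelectNodes` call would then run against the repaired tree.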

It's everything that you need to start parsing web pages quickly.

It can also be used as a starting point for building web services on top of existing websites.

Last edited Aug 30, 2012 at 4:32 AM by kurtnelle, version 6