Standard web documents today are designed to be presented to humans, and machines cannot interpret the information they contain. The Semantic Web, by contrast, is organized in a structured way so that its content is meaningful to both machines and humans. In this presentation, we propose a framework that processes web documents and produces a machine-readable representation in RDF (Resource Description Framework), used in combination with OWL (Web Ontology Language).
Our proposed framework, which we call RS2 (RDF by Structured Reference to Semantics), takes an HTML document as input and extracts the plain text from it. The natural-language content of the plain text is then parsed to yield the subject, predicate, and object of each sentence. These triples are looked up in the ontology to generate an RDF graph, which is the machine-intelligible semantic equivalent of the original human-readable text.
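To make the pipeline concrete, the following is a minimal sketch of the three stages described above (text extraction, triple extraction, ontology lookup), not the RS2 implementation itself. It uses only the Python standard library. The toy ontology, the `http://example.org/` namespace, the pattern-based sentence matcher, and all function names are assumptions introduced for illustration; a real system would presumably rely on a full natural-language parser and an OWL ontology.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Stage 1: HTML document -> plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical toy ontology: surface term -> URI (stand-in for an OWL ontology).
EX = "http://example.org/"
ONTOLOGY = {
    "alice": EX + "Alice",
    "bob": EX + "Bob",
    "paris": EX + "Paris",
    "knows": EX + "knows",
    "lives in": EX + "livesIn",
}
# Known predicate phrases; a naive stand-in for real syntactic parsing.
PREDICATES = ("lives in", "knows")


def extract_triples(text):
    """Stage 2: plain text -> (subject, predicate, object) per sentence."""
    triples = []
    for sentence in re.split(r"[.!?]+", text):
        sentence = sentence.strip().lower()
        for pred in PREDICATES:
            pattern = r"(.+?)\s+" + re.escape(pred) + r"\s+(.+)"
            match = re.fullmatch(pattern, sentence)
            if match:
                triples.append((match.group(1), pred, match.group(2)))
                break
    return triples


def to_ntriples(triples):
    """Stage 3: look each term up in the ontology and emit N-Triples lines."""
    lines = []
    for s, p, o in triples:
        # Triples with any term missing from the ontology are skipped.
        if s in ONTOLOGY and p in ONTOLOGY and o in ONTOLOGY:
            lines.append(f"<{ONTOLOGY[s]}> <{ONTOLOGY[p]}> <{ONTOLOGY[o]}> .")
    return lines


html = "<html><body><p>Alice knows Bob.</p><p>Bob lives in Paris.</p></body></html>"
result = to_ntriples(extract_triples(html_to_text(html)))
for line in result:
    print(line)
```

The output is serialized as N-Triples because that format needs no RDF library. The same triples could equally be loaded into a graph store or written as RDF/XML.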