Abstract
Background: Web documents present information as natural language text, which machines cannot readily interpret. Searching for specific information in the sea of web documents has become very challenging, because a search returns many irrelevant documents along with the relevant ones. To retrieve relevant information, semantic knowledge can be stored in a domain-specific ontology, which helps in understanding the user’s need.
Methods: In this paper, a framework for extracting and visualising semantic knowledge is designed. The proposed approach is based on the assumption that the semantics of a text can be extracted by building its syntactic structure. The Stanford parser is used to extract this syntactic structure. The corpus text is parsed to obtain morphological structures in a more machine-readable format, which provide a better basis for constructing syntactic-semantic rules manually. The tagged form of each sentence is taken, and a set of rules based on dependency relationships is built manually. Sentence-level analysis is performed to generate concepts and to extract properties and hierarchical relations, using the dependency parse tree as the means of relation extraction.
Results: The extracted concepts and the relations among the various entities constitute a knowledge base in the form of an ontology.
Conclusion: The proposed information extraction model successfully filters the desired information from the vast body of documents on the Internet and creates a semantic structure that represents the data in a standard machine-understandable format, describing entities along with their properties and relationships.
Keywords: Ontology, natural language, relation extraction, semantic knowledge, WWW, crawler.
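Illustrative sketch: The rule-based step summarised under Methods can be pictured with a small example. The code below is only a sketch under stated assumptions, not the rule set built in this work: the three rules, the `hasProperty` and `is-a` labels, the `Dep` type and the function `extract_triples` are all hypothetical, and the dependencies are written by hand in the style of Stanford typed dependencies (`nsubj`, `dobj`, `cop`, `amod`) rather than produced by running the parser.

```python
from collections import namedtuple

# One typed dependency: relation(head, dependent).
# Hypothetical container; a real parser would supply these.
Dep = namedtuple("Dep", ["rel", "head", "dependent"])


def extract_triples(deps):
    """Apply three illustrative rules to the dependencies of one sentence.

    Rule 1: nsubj(v, s) and dobj(v, o)  -> (s, v, o)            relation
    Rule 2: amod(n, a)                   -> (n, hasProperty, a)  property
    Rule 3: nsubj(n, s) and cop(n, _)    -> (s, is-a, n)         hierarchy
    """
    subjects = {d.head: d.dependent for d in deps if d.rel == "nsubj"}
    copular_heads = {d.head for d in deps if d.rel == "cop"}

    triples = []
    for d in deps:
        if d.rel == "dobj" and d.head in subjects:
            triples.append((subjects[d.head], d.head, d.dependent))
        elif d.rel == "amod":
            triples.append((d.head, "hasProperty", d.dependent))
    for head, subj in subjects.items():
        if head in copular_heads:
            triples.append((subj, "is-a", head))
    return triples


# Hand-written dependencies (assumed, not parser output) for:
# "A crawler is a program."
sentence_1 = [
    Dep("det", "crawler", "A"),
    Dep("nsubj", "program", "crawler"),
    Dep("cop", "program", "is"),
    Dep("det", "program", "a"),
]

# "The crawler downloads relevant documents."
sentence_2 = [
    Dep("det", "crawler", "The"),
    Dep("nsubj", "downloads", "crawler"),
    Dep("dobj", "downloads", "documents"),
    Dep("amod", "documents", "relevant"),
]

if __name__ == "__main__":
    for deps in (sentence_1, sentence_2):
        print(extract_triples(deps))
```

In this sketch the first sentence yields a hierarchical relation and the second yields one relation plus one property. In an ontology these would correspond to a subclass link, an object property and a data property respectively.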
Graphical Abstract