Web mining, an application of data mining, is the process of discovering patterns from the web. It can also be described as the process of analyzing data or information gathered from the web.
- Web mining applies conventional data mining methodologies and techniques to recognize or discover patterns in large web data sets.
- Web mining is usually done to understand customer behaviour, evaluate the effectiveness of a product and study feedback about a product.
- It involves traditional data mining tasks such as clustering, classification, association and the analysis of sequential patterns.
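As a rough illustration of the association task mentioned above, the sketch below computes how often pairs of pages appear together in the same visit. The session data and the function name `pair_support` are made up for this example, and real association rule mining (for instance Apriori) adds pruning and confidence measures that are left out here.

```python
from collections import Counter
from itertools import combinations


def pair_support(sessions):
    """Return the fraction of sessions in which each pair of pages co-occurs."""
    counts = Counter()
    for pages in sessions:
        # Use a sorted set so each unordered pair is counted once per session.
        for pair in combinations(sorted(set(pages)), 2):
            counts[pair] += 1
    total = len(sessions)
    return {pair: count / total for pair, count in counts.items()}


# Hypothetical visits: each inner list is the set of pages seen in one session.
sessions = [
    ["home", "laptops", "cart"],
    ["home", "laptops"],
    ["home", "phones", "cart"],
    ["laptops", "cart"],
]

support = pair_support(sessions)
for pair, value in sorted(support.items()):
    print(pair, value)
```

Pairs with high support are candidates for rules such as "visitors who view laptops also tend to open the cart".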
Web mining is further classified into three types:
- Web usage mining.
- Web structure mining.
- Web content mining.
Web Usage Mining
Web usage mining is the process of extracting useful information from server logs. It’s the application of data mining techniques to discover interesting usage patterns from web data in order to serve the needs of web-based applications.
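A minimal sketch of this idea, assuming the server writes its access log in the widely used Common Log Format: the code parses each line with a regular expression and counts successful requests per page. The sample log lines and the helper name `count_page_hits` are invented for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_page_hits(log_lines):
    """Count successful (status 200) requests per requested path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


# Hypothetical log entries, not taken from a real server.
sample_log = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2023:13:56:01 +0000] "GET /products.html HTTP/1.1" 200 1500',
    '10.0.0.1 - - [10/Oct/2023:13:57:12 +0000] "GET /products.html HTTP/1.1" 200 1500',
    '10.0.0.3 - - [10/Oct/2023:13:58:40 +0000] "GET /missing.html HTTP/1.1" 404 0',
]

page_hits = count_page_hits(sample_log)
print(page_hits.most_common())
```

The same parsed fields (host, time, path) could be grouped into sessions to study navigation paths rather than simple page counts.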
Web Structure Mining
Web structure mining is the process of using graph theory to analyze the node and link structure of a website. It is further divided into two types: 1. extracting patterns from the hyperlinks on the web, and 2. mining the structure of the document itself.
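To make the graph view concrete, the sketch below treats pages as nodes and hyperlinks as directed edges, then scores pages with a simplified PageRank iteration. PageRank is used here only as one well-known example of link analysis; the link data, the damping value of 0.85 and the fixed iteration count are assumptions for this illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages of a link graph given as {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: each key links to the pages in its list.
site_links = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home", "products"],
    "contact": ["home"],
}

ranks = pagerank(site_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page that receives the most incoming links ends up with the highest score, which is the intuition behind using link structure to find important documents.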
Web Content Mining
Web content mining is the mining, extraction and integration of useful data, information and knowledge from web page content. Tools such as web crawlers and meta crawlers help users categorize, filter and interpret the documents.
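One simple form of content mining can be sketched as follows: strip the markup from a page, keep the visible text and count how often each word occurs. The sample HTML and the names `TextExtractor` and `term_frequencies` are hypothetical; the sketch uses only Python's standard `html.parser` module and ignores issues such as stop words and non-English text.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def term_frequencies(html):
    """Return a Counter of lowercase words found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return Counter(re.findall(r"[a-z]+", text))


# Hypothetical page used only to exercise the functions above.
sample_html = """
<html><head><title>Camera review</title><style>p { color: red; }</style></head>
<body><p>Great camera. The camera battery is great.</p>
<script>var tracked = true;</script></body></html>
"""

freqs = term_frequencies(sample_html)
print(freqs.most_common(2))
```

Word counts like these are a common starting point for the categorizing and filtering steps described above.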
- Content mining extracts patterns from online information, such as HTML files, images, blogs or posts.
- Web structure mining analyzes the link arrangement of the web to identify relevant documents.
- Web usage mining analyzes the user interactions that are recorded whenever requests for resources are received.
- A full-featured IDE includes syntax highlighting, graphical implementation, real-time result delivery and network monitoring.
- Separate extraction techniques for unstructured, semi-structured, and structured data are deployed.
- Integration options include Java, .NET, ActiveX and C++, and queries can be exposed as Web services.
- Reads and writes common file formats such as HTML, XML, PDF, DOC, CSV, TSV, images, databases, etc.
Real-World Examples
Web mining is used in practice in various areas, including:
- E-commerce sites use it for personalised marketing.
- Government agencies use it to identify threats and fight terrorism.
- Companies use it to maintain better relations with their customers.
- Businesses use it to analyze feedback on a particular product.