Web Mining: Content, Structure, and Usage

Uploaded by

Prafull Sawant

1. Web Content Mining:
Web content mining extracts useful data, information, and knowledge from the content of
web pages. Each web page is treated as an individual document. The analyst can take
advantage of the semi-structured nature of web pages, because HTML carries information
about logical structure as well as layout. The primary task of content mining is data
extraction, in which structured data is extracted from unstructured or semi-structured
web pages. The objective is to make it easier to aggregate data across many websites by
using the extracted structured data. Web content mining can also be used to distinguish
topics on the web. For example, when a user searches a search engine for a specific
topic, the user gets a list of suggestions.
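
The data extraction step described above can be sketched as follows. This is a minimal illustration using only Python's standard `html.parser` module, not a method prescribed by these notes. The sample page, the `ProductTableParser` class, and the `extract_records` helper are hypothetical names invented for the example, and the sketch assumes the page holds one simple table whose first row contains the field names.

```python
from html.parser import HTMLParser


class ProductTableParser(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_records(html):
    """Use the first table row as field names and the rest as records."""
    parser = ProductTableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical page content, made up for this example.
SAMPLE_PAGE = """
<html><body>
<h1>Laptops</h1>
<table>
  <tr><th>name</th><th>price</th></tr>
  <tr><td>Model A</td><td>499</td></tr>
  <tr><td>Model B</td><td>799</td></tr>
</table>
</body></html>
"""

records = extract_records(SAMPLE_PAGE)
for record in records:
    print(record)
```

Records extracted this way from several sites share the same field names, which is what makes aggregation across websites possible. Real pages are rarely this regular, so a practical extractor would need per-site rules or a more tolerant parser.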
2. Web Structure Mining:
Web structure mining discovers the link structure formed by hyperlinks. It is used to
identify how the data is connected, whether through links between web pages or through a
direct link network. In web structure mining, the web is treated as a directed graph
whose vertices are the web pages and whose edges are the hyperlinks between them. The
most important application in this regard is the Google search engine, which ranks its
results primarily with the PageRank algorithm. PageRank regards a page as highly relevant
when it is frequently linked to by other highly relevant pages. Structure and content
mining methods are usually combined. For example, web structure mining can help
organizations analyze the link network between two commercial sites.
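
The idea that a page is relevant when other relevant pages link to it can be sketched as a small power iteration over a directed graph. This is the simplified textbook form of PageRank, not Google's production ranking. The damping factor of 0.85, the iteration count, the `pagerank` function, and the four-page `WEB` graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform score

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: vertices are pages, edges are hyperlinks.
WEB = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
    "blog": ["home"],
}

ranks = pagerank(WEB)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this graph "home" receives links from every other page and ends up with the highest score, while "blog" has no incoming links and keeps only the baseline share.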
3. Web Usage Mining:
Web usage mining extracts useful data, information, and knowledge from weblog records,
and helps recognize user access patterns for web pages. In mining the usage of web
resources, the analyst works with records of the requests made by visitors to a website,
which are often collected as web server logs. While the content and structure of a
collection of web pages follow the intentions of the authors of the pages, the individual
requests show how consumers actually see these pages. Web usage mining may therefore
disclose relationships that the creator of the pages did not intend.
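
A minimal sketch of mining access patterns from web server logs is shown below. It assumes log lines in the Common Log Format, already in chronological order, and it treats each client address as one visitor, which is a simplification. The sample log, the `LOG_LINE` pattern, and the `mine_usage` function are hypothetical and written only for this illustration.

```python
import re
from collections import Counter, defaultdict

# Pattern for one Common Log Format line (assumed format for this example).
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical log records, made up for this example.
SAMPLE_LOG = """\
10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /home HTTP/1.1" 200 2326
10.0.0.2 - - [10/Oct/2023:13:55:40 +0000] "GET /home HTTP/1.1" 200 2326
10.0.0.1 - - [10/Oct/2023:13:56:01 +0000] "GET /products HTTP/1.1" 200 5120
10.0.0.2 - - [10/Oct/2023:13:56:10 +0000] "GET /products HTTP/1.1" 200 5120
10.0.0.1 - - [10/Oct/2023:13:57:12 +0000] "GET /cart HTTP/1.1" 200 1024
10.0.0.2 - - [10/Oct/2023:13:57:30 +0000] "GET /missing HTTP/1.1" 404 312
"""


def mine_usage(log_text):
    """Count page hits and page-to-page transitions per visitor."""
    page_hits = Counter()
    visits = defaultdict(list)  # visitor address -> pages in request order

    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        # Skip malformed lines and requests that did not succeed (non-2xx).
        if not match or not match["status"].startswith("2"):
            continue
        page_hits[match["path"]] += 1
        visits[match["host"]].append(match["path"])

    # A transition is a pair of consecutive pages requested by the same visitor.
    transitions = Counter()
    for paths in visits.values():
        transitions.update(zip(paths, paths[1:]))
    return page_hits, transitions


page_hits, transitions = mine_usage(SAMPLE_LOG)
print(page_hits.most_common())
print(transitions.most_common())
```

Frequent transitions such as "/home" to "/products" are the kind of relationship between pages that shows up in the requests rather than in the page content or link structure.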

Challenges in Web Mining:


The web poses great challenges for resource and knowledge discovery, based on the
following observations:

o The complexity of web pages:

Web pages do not have a unifying structure. They are far more complex than traditional
text documents. The web holds an enormous number of documents, like a huge digital
library, but these documents are not organized in any specific order.

o The web is a dynamic data source:

The data on the internet is updated quickly. Examples include news, weather, shopping,
financial news, and sports.
o Diversity of client networks:

The client network on the web is expanding quickly. These clients have different
interests, backgrounds, and usage purposes. Over a hundred million workstations are
connected to the internet, and the number is still increasing tremendously.

o Relevancy of data:

A specific person is generally interested in only a small portion of the web. The rest of
the web contains data that is unfamiliar or of no interest to that user and may lead to
unwanted results.

o The web is too broad:

The size of the web is tremendous and rapidly increasing. It appears that the web is too
large for data warehousing and data mining.
