A common request is to extract information from one or more websites when the desired data is spread across so many pages that collecting it by hand is out of the question.

What is data extraction?

Extracting data from the web is one branch of the broader field of data extraction.  Web pages in HTML, XML, and similar formats are considered an unstructured data source because their code and styles vary widely and often include exceptions to, and violations of, standard coding practices.  Because of this variety, extracting data from the web is a highly customized process that depends on the specific source of information one is trying to retrieve.  Data extraction itself is defined as taking an unstructured form of data and parsing that information into a structured data set.
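
As a rough illustration of turning markup into a structured data set, here is a minimal sketch using Python's standard html.parser module. The sample HTML, the "product", "name" and "price" class names, and the ProductParser and extract_products names are all made up for this example; a real page would need rules written for its own markup. The second list item deliberately leaves a span unclosed to show the kind of irregularity real pages contain.

```python
from html.parser import HTMLParser

# Hypothetical page fragment (an assumption for illustration only).
# The second item is missing a closing </span> on purpose.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span> <span class="price">$4.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">$12.50</li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects one dict per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.records = []   # the structured output
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "product":
            # A new product starts a new record.
            self.records.append({})
        elif tag == "span" and cls in ("name", "price") and self.records:
            self._field = cls

    def handle_endtag(self, tag):
        # Stop reading a field when its span ends, or when the item ends
        # (the latter covers spans that were never closed).
        if tag in ("span", "li"):
            self._field = None

    def handle_data(self, data):
        if self._field and self.records:
            record = self.records[-1]
            # Text may arrive in several pieces, so append rather than assign.
            record[self._field] = record.get(self._field, "") + data.strip()


def extract_products(html):
    """Parse the HTML text and return a list of {"name", "price"} dicts."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.records


if __name__ == "__main__":
    for row in extract_products(SAMPLE_HTML):
        print(row)
```

The parser is event-based, so it does not fail on the unclosed span; the rules in handle_endtag are what decide where a field stops. That is the customized part of the process: those rules would change for every source.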

Data extraction can parse information from web pages and documents, organizing large amounts of data into whatever format you need.
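
Once the data is in a structured form, writing it out in a particular format is usually the easy part. The sketch below assumes the extracted records are already a list of dictionaries (the sample records and the to_csv and to_json helper names are hypothetical) and uses Python's standard csv and json modules to produce two common formats.

```python
import csv
import io
import json

# Hypothetical records, standing in for whatever an extraction step produced.
records = [
    {"name": "Widget", "price": "$4.99"},
    {"name": "Gadget", "price": "$12.50"},
]


def to_csv(rows, fieldnames):
    """Return the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Return the rows as an indented JSON array."""
    return json.dumps(rows, indent=2)


if __name__ == "__main__":
    print(to_csv(records, ["name", "price"]))
    print(to_json(records))
```

The same list of records could just as well be loaded into a spreadsheet or a database table; the point is only that the output format is a separate choice from the extraction itself.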