The Internet can be considered as a data source (belonging to the vast category of Big Data) that
may be harnessed in substitution for, or in combination with, data collected by means of the traditional
instruments of a statistical survey. In the case of substitution, the aim is to reduce respondent burden; in
the case of integration, the main goal is to increase the accuracy of the estimates.
Web scraping is the process of automatically collecting information from the World Wide Web. It is a field under active development that shares a common goal with the semantic web vision; it relies on a tool that navigates a website, extracts its content, and stores the scraped data in a local database. From a legal point of view, web scraping may be against the terms of use of some websites: courts are prepared to protect the proprietary content of commercial sites from undesirable uses, even though the degree of protection for such content is not clearly settled. The amount of information accessed and copied depends on the degree to which the access is perceived as adversely affecting the site owner's system, and on the types and manner of restrictions on such access. In the following, two different solutions for web scraping are described: the first is already available and has been used specifically for this experiment, while the second is still under development.
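
As a rough illustration of the extract-and-store cycle mentioned above, and not of either of the two solutions discussed in this work, the following minimal Python sketch parses a page and saves the result in a local database. It uses only the standard library. The page content (SAMPLE_HTML), the URL, the table layout, and the names LinkAndTextParser, scrape, and store are all assumptions made for the example; a real scraper would fetch pages over HTTP and follow the extracted links, subject to each site's terms of use.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page content standing in for a page fetched from a website.
SAMPLE_HTML = """
<html><body>
  <h1>Example shop</h1>
  <a href="/products">Products</a>
  <a href="/contact">Contact</a>
  <p>We sell things online.</p>
</body></html>
"""


class LinkAndTextParser(HTMLParser):
    """Collects hyperlinks (for navigation) and visible text (the content)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every anchor that has an href attribute.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keep non-empty text fragments, dropping layout whitespace.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def scrape(html):
    """Return (links, text) extracted from one HTML document."""
    parser = LinkAndTextParser()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


def store(conn, url, links, text):
    """Save the scraped page and its outgoing links in a local database."""
    conn.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, text TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS links (page_url TEXT, href TEXT)")
    conn.execute("INSERT OR REPLACE INTO pages VALUES (?, ?)", (url, text))
    conn.executemany("INSERT INTO links VALUES (?, ?)", [(url, h) for h in links])
    conn.commit()


# An in-memory database keeps the example self-contained; a file path
# would be used instead to persist the scraped data.
conn = sqlite3.connect(":memory:")
links, text = scrape(SAMPLE_HTML)
store(conn, "http://example.org/", links, text)
```

The links table records where the scraper could navigate next, while the pages table holds the extracted text for later processing.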