As the quality and quantity of information available online continue to increase, web scraping has become an essential tool for those who wish to collect, store and analyse web data for research purposes. Our workshop provides an introduction to web scraping using R, an open-source programming language. We will begin by presenting how to identify nodes in a page's HTML source code using CSS selectors, which allows us to isolate the content we want to scrape from a given website. We will then extract textual and numeric data from various web pages, including sequential patterns (i.e. where the data is spread across multiple pages). We will conclude the workshop by presenting how to clean the scraped data and export it to a .csv file.
Basic knowledge of R is recommended but not required.
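
To give a sense of the workflow described above, here is a minimal sketch in R. It assumes the rvest package, which this description does not name and which may differ from the tools used in the workshop itself. The HTML snippets, the class names (item, name, price) and the output file name are hypothetical, and the pages are inlined so the example runs without a network connection.

```r
library(rvest)

# Two hypothetical pages of a paginated listing, inlined so the sketch runs offline.
# With a live site, each element would instead be a page URL built in sequence,
# e.g. with paste0() and a page counter.
pages <- c(
  '<html><body>
     <div class="item"><span class="name">Alpha</span><span class="price">$1,200.50</span></div>
     <div class="item"><span class="name">Beta</span><span class="price">$80</span></div>
   </body></html>',
  '<html><body>
     <div class="item"><span class="name"> Gamma </span><span class="price">$15.25</span></div>
   </body></html>'
)

# Parse one page and pull out the nodes we care about using CSS selectors.
scrape_page <- function(page) {
  doc <- read_html(page)
  items <- html_elements(doc, "div.item")   # one node per listed item
  data.frame(
    name = html_text2(html_element(items, "span.name")),
    price_raw = html_text2(html_element(items, "span.price")),
    stringsAsFactors = FALSE
  )
}

# Apply the same extraction to every page and stack the results.
scraped <- do.call(rbind, lapply(pages, scrape_page))

# Clean: trim stray whitespace and convert price strings to numbers.
scraped$name <- trimws(scraped$name)
scraped$price <- as.numeric(gsub("[^0-9.]", "", scraped$price_raw))

# Export to a .csv file (written to a temporary directory here).
out_file <- file.path(tempdir(), "scraped.csv")
write.csv(scraped, out_file, row.names = FALSE)
```

The same three steps (select nodes, loop over pages, clean and export) apply to a real website, where the selectors would come from inspecting that site's own HTML.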