How to Scrape a Static Website

A really quick tutorial

Prerequisites: Knowledge of React.js will be required for this tutorial.

Let’s say you want to pull data from the frontend of a website because there’s no API available. You inspect the page and see that the data is available in the HTML, so how do you gather that information for use in your app? It’s rather simple: we’ll install two libraries and write fewer than 50 lines of code to scrape a website. To keep this tutorial simple, we’ll use https://pokedex.org/ as our example.

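1. Create a new React app and install the two libraries. request-promise expects the request package to be installed alongside it as a peer dependency.

create-react-app scraping-demo
cd scraping-demo
npm i request request-promise
npm i cheerio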

2. We’re going to start by using request-promise to fetch the HTML from https://pokedex.org/ and log it to the console.

In App.js:
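Here’s a minimal sketch of what that could look like. fetchPageHtml is a made-up helper name, and the optional second argument is only there so you can try the function with a stand-in instead of the real request-promise.

```javascript
// Sketch only: fetch a page's HTML and print it to the console.
// `fetchPageHtml` is a hypothetical helper name, not part of any library.
// `rp` defaults to request-promise; pass your own function to swap it out.
function fetchPageHtml(url, rp = require("request-promise")) {
  return rp(url).then((html) => {
    console.log(html);
    return html; // hand the HTML back so later steps can parse it
  });
}
```

In App.js you could call fetchPageHtml("https://pokedex.org/") from componentDidMount (or a useEffect hook) so the request fires once the component has mounted.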

3. Sometimes a CORS error will block your request. To see one for yourself, try fetching pokemon.com:

rp("https://www.pokemon.com/us/pokedex/")

You should see a CORS error in the browser console.
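If you’d rather handle the failure in code than hunt for it in the console, you can catch the rejected promise. A rough sketch, where tryFetch is a hypothetical helper and the shape of the returned object is just one possible choice:

```javascript
// Sketch only: turn a failed request into a value you can inspect.
// `tryFetch` and the { ok, html, error } result shape are assumptions made
// for this example; `rp` defaults to request-promise.
function tryFetch(url, rp = require("request-promise")) {
  return rp(url)
    .then((html) => ({ ok: true, html }))
    .catch((error) => {
      console.error("Request to " + url + " failed:", error.message);
      return { ok: false, error };
    });
}
```

With this in place, tryFetch("https://www.pokemon.com/us/pokedex/") resolves to an object whose ok flag tells you whether the request went through.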

4. You can get around CORS by routing the request through a proxy such as https://cors-anywhere.herokuapp.com, the public demo server of the cors-anywhere project. Add that URL before your desired fetch URL, like so:

rp("https://cors-anywhere.herokuapp.com/https://www.pokemon.com/us/pokedex/")

Now you should see the HTML from pokemon.com in your console.
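If you end up proxying more than one URL, a tiny helper keeps the prefix in one place. This is only a sketch, and viaCorsProxy is a name made up for this example:

```javascript
// Sketch only: prefix a target URL with a CORS proxy.
// `viaCorsProxy` is a hypothetical helper name.
const CORS_PROXY = "https://cors-anywhere.herokuapp.com/";

function viaCorsProxy(targetUrl, proxy = CORS_PROXY) {
  // The proxy expects the full target URL appended to its own address.
  return proxy + targetUrl;
}
```

You would then write rp(viaCorsProxy("https://www.pokemon.com/us/pokedex/")) instead of spelling out the combined URL by hand.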

5. We don’t need cors-anywhere for rp("https://pokedex.org/"), though, so let’s proceed.

6. Now that we have the HTML, let’s use the cheerio library to pick out exactly the data we want from specific elements. In this example, we’ll grab the names of all the pokemon and then display them in a list.

In App.js:
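A minimal sketch of the extraction step might look like this. The selector is an assumption about pokedex.org’s markup, so inspect the page and adjust it to the elements you actually see. extractNames is a made-up name, and the optional load argument is only there so the function can be tried with a stand-in parser.

```javascript
// Sketch only: collect the text of every element matching a selector.
// ASSUMPTION: "#monsters-list li span" is a guess at pokedex.org's markup.
const NAME_SELECTOR = "#monsters-list li span";

// `extractNames` is a hypothetical helper. `load` defaults to cheerio's
// own load(), which parses the HTML and returns a jQuery-like `$` function.
function extractNames(html, selector = NAME_SELECTOR, load = require("cheerio").load) {
  const $ = load(html);
  const names = [];
  $(selector).each((index, element) => {
    // Wrap the raw element again to read its text content.
    names.push($(element).text().trim());
  });
  return names;
}
```

In App.js you could chain the two steps, for example rp("https://pokedex.org/").then(extractNames), keep the resulting array in component state, and render one list item per name.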

7. You should see a list of all the pokemon names displayed on your screen.

It’s that simple! You scraped those names from the HTML without having to directly access any backend. Now try scraping the examples on http://toscrape.com/ for practice. Enjoy your new abilities!

If you enjoyed that, check out my latest series: Imagine if Codecademy and Clash Royale made a baby

You can also find me on Youtube and Twitch
