How to scrape Yelp Business Reviews?

Yelp is one of the biggest crowd-sourced rating and review websites for local businesses. It is a trusted review website because of the limited amount of spam and ads. With more than 200 million public reviews on its platform, Yelp is a data-rich website for data miners and an alternative to the Google Maps Platform.

In this tutorial, we are going to scrape Yelp Business Reviews using Node JS. At the end, to make things simpler, I will also introduce a Yelp Search API that you can use to scrape data from Yelp easily.


Why Scrape Yelp Reviews?

Yelp has a mighty base of 90 million visitors per month across its website and mobile app, with users and businesses contributing to this platform day-to-day.

Scraping Yelp reviews can help you gather information about your competitors. You can analyze their ratings and reviews to learn where your business stands in the market and which weak points are holding back its expansion.

By monitoring the negative reviews left by your customers, you can find out whether there is a problem with your business and solve it as soon as possible.

Yelp’s very large business directory can also help you generate quality leads. You can collect addresses, phone numbers, and other details by scraping Yelp.

Before we start with this blog, let me explain some requirements.

Let’s start scraping Yelp Business Reviews

Scraping Yelp Business Reviews with Node JS is pretty easy. First, we will scrape the overall rating and the customer reviews of a restaurant, the Hard Rock Cafe in San Francisco.

Here is the list of data that we are going to scrape in this tutorial:

  1. Average Rating of the business
  2. Name of the person
  3. Location of the person
  4. The review by the person

[Image: Reviews of a Restaurant]

The Yelp Reviews scraping can be divided into two parts:

  1. Making the HTTP request on the target URL to extract the raw HTML data.
  2. Parsing the HTML data to extract the required information.

Set-Up

For beginners, to install Node JS on your device, you can watch these videos:

  1. How to install Node JS on Windows?
  2. How to install Node JS on macOS?

Install Libraries

To start scraping Yelp Reviews we need to install some NPM libraries so that we can move forward.

  1. Puppeteer
  2. Cheerio

You can install them directly by running the commands below:

npm i puppeteer
npm i cheerio

So before starting, we must ensure that we have set up our Node JS project and installed both Puppeteer and Cheerio.

Process

Now, let’s start scraping Yelp Business Reviews by making a GET request on the target URL using Puppeteer to get the raw HTML data.

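    const puppeteer = require("puppeteer")

    const yelpReviewScraper = async () => {
        // Launch Chromium with a visible window (headless: false)
        const browser = await puppeteer.launch({
            headless: false,
            args: ["--disable-setuid-sandbox", "--no-sandbox"],
        });
        // Reuse the tab that the browser opens on launch
        const [page] = await browser.pages();

        await page.goto("https://www.yelp.com/biz/hard-rock-cafe-san-francisco-5", {
            waitUntil: "domcontentloaded",
        })
        // Pause 5 seconds so the reviews can render
        // (page.waitForTimeout() is no longer available in recent Puppeteer releases)
        await new Promise((resolve) => setTimeout(resolve, 5000))
        let html = await page.content();

        await browser.close();
        getData(html);
    };

  1. puppeteer.launch() – This will launch the Chromium browser with the options we have set in our code. In our case, we are launching the browser in non-headless mode.
  2. browser.pages() – This will return the pages that are already open. We reuse the first tab instead of opening a new one with browser.newPage().
  3. page.goto() – This will navigate the page to the specified target URL.
  4. setTimeout() wrapped in a Promise – It will pause the script for 5 seconds so the page can finish rendering. Older tutorials use page.waitForTimeout() for this, which recent Puppeteer releases no longer provide.
  5. page.content() – It will return the full HTML of the rendered web page.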
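
A fixed pause either wastes time or is too short on a slow connection. As a rough alternative, the sketch below waits until a given element appears, using Puppeteer's page.waitForSelector(). The helper name fetchRenderedHtml and the 15-second default timeout are assumptions made for this illustration, and the selector you pass in has to match whatever Yelp currently renders.

```javascript
// Sketch: wait for a specific element instead of sleeping for a fixed time.
// `page` is expected to be a Puppeteer Page object.
// fetchRenderedHtml and the 15000 ms default are illustrative choices.
async function fetchRenderedHtml(page, url, selector, timeoutMs = 15000) {
  await page.goto(url, { waitUntil: "domcontentloaded" });
  try {
    // Resolves as soon as the element exists, rejects after timeoutMs.
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    throw new Error(`"${selector}" not found within ${timeoutMs} ms on ${url}`);
  }
  // Full HTML of the rendered page, ready to be parsed.
  return page.content();
}
```

Inside the scraper, the pause and the page.content() call could then be replaced by something like `const html = await fetchRenderedHtml(page, url, ".review__09f24__oHr9V")`, where the class name is the review selector used in the parsing step of this tutorial.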

Now, we have completed the first part of scraping the HTML.
Then, we will parse this raw HTML data using Cheerio.

    const $ = cheerio.load(html)

If you inspect the page with a CSS selector tool such as SelectorGadget, you will find that all these reviews are under the class .review__09f24__oHr9V


We will loop over every element matching this selector and extract the information inside each individual review.

Similarly, we can find the selectors for the other data. For example, the selector for the name is .css-ux5mu6 .css-1m051bw, the selector for the location is .responsive-hidden-small__09f24__qQFtj .css-qgunke, etc.
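
Class names like these look auto-generated, so they may change when Yelp updates its front end. When a selector stops matching, Cheerio's .text() returns an empty string and .attr() returns undefined instead of throwing, so stale selectors are easy to miss. The sketch below is one possible sanity check. It assumes review objects with name, location, review, date, and rating fields, and the helper names and sample data are made up for this illustration.

```javascript
// Fields we expect every scraped review to contain (an assumption for this sketch).
const REQUIRED_FIELDS = ["name", "location", "review", "date", "rating"];

// Returns the required fields that are empty or undefined in one review.
function findMissingFields(review, required = REQUIRED_FIELDS) {
  return required.filter((field) => {
    const value = review[field];
    return value === undefined || value === null || String(value).trim() === "";
  });
}

// Counts how often each field is missing across all scraped reviews.
function summarizeMissing(reviews, required = REQUIRED_FIELDS) {
  const counts = {};
  for (const review of reviews) {
    for (const field of findMissingFields(review, required)) {
      counts[field] = (counts[field] || 0) + 1;
    }
  }
  return counts;
}

// Made-up sample: the second review mimics stale location and rating selectors.
const sample = [
  { name: "A.", location: "Chicago, IL", review: "Good", date: "11/30/2022", rating: "4 star rating" },
  { name: "B.", location: "", review: "Slow", date: "12/01/2022", rating: undefined },
];
console.log(summarizeMissing(sample)); // { location: 1, rating: 1 }
```

If a field is missing in every review, the selector for that field is the first thing to re-check in the browser.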

And this is what our parser looks like:

    const getData = (html) => {
        const $ = cheerio.load(html)

        // Overall rating and total review count of the business
        const avg_rating = $(".five-stars__09f24__mBKym").attr("aria-label");
        const reviews = $(".padding-t0-5__09f24__lDQoQ .css-foyide").text();
        let user_reviews = [];
        // Each matching element is one review
        $(".review__09f24__oHr9V").each((i, el) => {
            user_reviews.push({
                name: $(el).find(".css-ux5mu6 .css-1m051bw").text(),
                location: $(el).find(".responsive-hidden-small__09f24__qQFtj .css-qgunke").text(),
                review: $(el).find(".comment__09f24__gu0rG").text(),
                date: $(el).find(".css-chan6m").text(),
                rating: $(el).find(".five-stars__09f24__mBKym").attr("aria-label"),
                friends: $(el).find("[aria-label='Friends'] span > span").text(),
                reviews: $(el).find("[aria-label='Reviews'] span > span").text(),
                photos: $(el).find("[aria-label='Photos'] span > span").text(),
                thumbnail: $(el).find(".css-1pz4y59").attr("src")
            })
            // Collect photo URLs only when the review has photos attached.
            // A Cheerio selection is always truthy, so we check .length.
            let images = [];
            const photos = $(el).find(".photo-container-small__09f24__obhgq")
            if (photos.length) {
                photos.each((j, photo) => {
                    images[j] = $(photo).find("img").attr("src")
                })
                user_reviews[i].images = images
            }
        })
        console.log(avg_rating)
        console.log(reviews)
        // Print only the first review to keep the output short
        console.log(user_reviews[0])
    }

Here are the results:

    3 star rating
    991 reviews
    {
      name: 'Jacqueline B.',
      location: 'Chicago, IL',
      review: 'We flew into San Francisco on our way to Napa and decided to add to our Hard Rock SHIRT collection since it was too early to check in to the hotel. The area near Alcatraz was cool! The Pier 39 area is right off the water, with shops, rides, sightseeing, etc. The store for Hard Rock was smallish but nice, and the restaurant seated us quickly. It was busy for a Monday at 2:00, service was a tad slow but the waiter was friendly. I got iced tea/lemonade and the salmon salad. Very good, a nice size too.I know this chain can be over priced and a tourist trap, but the rock/music memorabilia on the walls are just so fascinating, no two restaurants are alike! I re-joined the points program and earned a free shot glass with my shirt purchase!',
      date: '11/30/2022',
      rating: '4 star rating',
      friends: '51',
      reviews: '454',
      photos: '873',
      thumbnail: 'https://s3-media0.fl.yelpcdn.com/photo/1zU0fO63-mrEBNFHM5UrRQ/60s.jpg',
      images: [
        'https://s3-media0.fl.yelpcdn.com/bphoto/E5wwDryXExDLMIUw0TYyaQ/180s.jpg',
        'https://s3-media0.fl.yelpcdn.com/bphoto/TvXYc1t1H223B6xTv1uHaw/180s.jpg',
        'https://s3-media0.fl.yelpcdn.com/bphoto/zcHaS-DIidDXQyqvnbTsvw/180s.jpg',
        'https://s3-media0.fl.yelpcdn.com/bphoto/YIBW-vgdHWZyXfPuywm4jg/180s.jpg'
        ]
    }                    
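
Every value in this output is a string, which makes sorting or filtering awkward. The sketch below shows one way to convert a scraped review into typed values and to flag negative reviews. The helper names and the "two stars or fewer" threshold are assumptions for this illustration, and the sample values are copied from the output above.

```javascript
// Extracts the number from a label such as "4 star rating"; null if absent.
function parseStars(label) {
  const match = /(\d+(?:\.\d+)?)/.exec(label || "");
  return match ? Number(match[1]) : null;
}

// Converts a count string such as "454" to an integer; null if not numeric.
function parseCount(text) {
  const n = Number.parseInt(text, 10);
  return Number.isNaN(n) ? null : n;
}

// Converts "MM/DD/YYYY" to "YYYY-MM-DD"; returns null for any other shape.
function toIsoDate(usDate) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec((usDate || "").trim());
  if (!match) return null;
  const [, month, day, year] = match;
  return `${year}-${month.padStart(2, "0")}-${day.padStart(2, "0")}`;
}

// Returns a copy of the review with typed rating, date, and counters.
function normalizeReview(raw) {
  return {
    ...raw,
    rating: parseStars(raw.rating),
    date: toIsoDate(raw.date),
    friends: parseCount(raw.friends),
    reviews: parseCount(raw.reviews),
    photos: parseCount(raw.photos),
  };
}

// Sample values taken from the scraped output (review text omitted).
const raw = { name: "Jacqueline B.", date: "11/30/2022", rating: "4 star rating", friends: "51", reviews: "454", photos: "873" };
const clean = normalizeReview(raw);
console.log(clean.rating, clean.date, clean.friends); // 4 2022-11-30 51

// Hypothetical threshold: treat two stars or fewer as a negative review.
const isNegative = (review) => review.rating !== null && review.rating <= 2;
console.log(isNegative(clean)); // false
```

With normalized data, something like `user_reviews.map(normalizeReview).filter(isNegative)` would give the list of reviews that deserve a closer look.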

Here is the complete code:

    const cheerio = require("cheerio")
    const puppeteer = require("puppeteer")

    // Parse the raw HTML and print the rating, review count, and first review
    const getData = (html) => {
        const $ = cheerio.load(html)

        const avg_rating = $(".five-stars__09f24__mBKym").attr("aria-label");
        const reviews = $(".padding-t0-5__09f24__lDQoQ .css-foyide").text();
        let user_reviews = [];
        $(".review__09f24__oHr9V").each((i, el) => {
            user_reviews.push({
                name: $(el).find(".css-ux5mu6 .css-1m051bw").text(),
                location: $(el).find(".responsive-hidden-small__09f24__qQFtj .css-qgunke").text(),
                review: $(el).find(".comment__09f24__gu0rG").text(),
                date: $(el).find(".css-chan6m").text(),
                rating: $(el).find(".five-stars__09f24__mBKym").attr("aria-label"),
                friends: $(el).find("[aria-label='Friends'] span > span").text(),
                reviews: $(el).find("[aria-label='Reviews'] span > span").text(),
                photos: $(el).find("[aria-label='Photos'] span > span").text(),
                thumbnail: $(el).find(".css-1pz4y59").attr("src")
            })
            // Collect photo URLs only when the review has photos attached
            let images = [];
            const photos = $(el).find(".photo-container-small__09f24__obhgq")
            if (photos.length) {
                photos.each((j, photo) => {
                    images[j] = $(photo).find("img").attr("src")
                })
                user_reviews[i].images = images
            }
        })
        console.log(avg_rating)
        console.log(reviews)
        console.log(user_reviews[0])
    }

    // Load the business page in Chromium and pass the rendered HTML to getData
    const yelpReviewScraper = async () => {
        const browser = await puppeteer.launch({
            headless: false,
            args: ["--disable-setuid-sandbox", "--no-sandbox"],
        });
        const [page] = await browser.pages();

        await page.goto("https://www.yelp.com/biz/hard-rock-cafe-san-francisco-5", {
            waitUntil: "domcontentloaded",
        })
        // Pause 5 seconds so the reviews can render
        await new Promise((resolve) => setTimeout(resolve, 5000))
        let html = await page.content();

        await browser.close();
        getData(html);
    };

    yelpReviewScraper()

And that’s what a basic scraper of Yelp Reviews looks like. Similarly, this same process of scraping and selecting tags can be followed in other programming languages.
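
The scraper above only reads the first page of reviews. A possible way to reach more of them is to generate one URL per page and run the same steps for each. The sketch below is built on assumptions that you should verify in your browser first: that Yelp paginates reviews with a `start` offset in the query string, and that each page shows 10 reviews. The function name and the page cap are also made up for this example.

```javascript
// Assumption: review pages are addressed with ?start=<offset> and each page
// holds `pageSize` reviews. Check both against the live site before relying on them.
function buildReviewPageUrls(bizUrl, totalReviews, pageSize = 10, maxPages = 5) {
  // Cap the number of pages so a large business does not trigger hundreds of requests.
  const pages = Math.min(Math.ceil(totalReviews / pageSize), maxPages);
  const urls = [];
  for (let i = 0; i < pages; i++) {
    const url = new URL(bizUrl);
    // The first page needs no offset.
    if (i > 0) url.searchParams.set("start", String(i * pageSize));
    urls.push(url.href);
  }
  return urls;
}

const urls = buildReviewPageUrls("https://www.yelp.com/biz/hard-rock-cafe-san-francisco-5", 991, 10, 3);
console.log(urls);
```

Each URL could then be passed to page.goto() in turn, ideally with a pause between requests.
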
The second way to scrape Yelp data is by using Serpdog’s Yelp Search API.

With Yelp Search API

Scrape Yelp Search Results with our powerful scraper, which is equipped with a massive pool of 10M+ residential proxies and can bypass the anti-bot mechanisms present on the website to give our customers a smooth scraping service.

Our Yelp Search API allows developers to scrape every inch of information from Yelp Search Results in JSON format, including restaurant addresses and phone numbers, which are useful for lead generation. Our Yelp API supports many parameters to filter the data according to the user’s needs.

Get your API Key and 100 free credits by registering at Serpdog.

    const axios = require('axios');

    axios.get('https://api.serpdog.io/yelp?api_key=APIKEY&find_desc=burger&find_loc=San+Francisco,CA')
        .then(response => {
            console.log(response.data);
        })
        .catch(error => {
            console.log(error);
        });
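
When the search term or location comes from user input, hand-writing the query string is error-prone. The sketch below builds the same request URL with Node's built-in URL class so that spaces and special characters are encoded automatically. The endpoint and parameter names are the ones from the request above. The function name and the SERPDOG_API_KEY environment variable are assumptions for this illustration.

```javascript
// Builds the request URL with properly encoded query parameters.
// Endpoint and parameter names are taken from the request shown above.
function buildYelpSearchUrl(apiKey, description, location) {
  const url = new URL("https://api.serpdog.io/yelp");
  url.searchParams.set("api_key", apiKey);
  url.searchParams.set("find_desc", description);
  url.searchParams.set("find_loc", location);
  return url.href;
}

// SERPDOG_API_KEY is a hypothetical environment variable name; keeping the
// key out of the source code avoids committing it by accident.
const apiKey = process.env.SERPDOG_API_KEY || "APIKEY";
console.log(buildYelpSearchUrl(apiKey, "burger", "San Francisco,CA"));
```

The returned string can be passed directly to axios.get() in place of the hard-coded URL.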

Conclusion:

In this tutorial, we learned to scrape Yelp Business Reviews by making a basic scraper with the help of Node JS. We also saw how Serpdog Yelp Search API could help you with scraping the results.

I hope you enjoyed the tutorial. Feel free to message me if I missed something. Follow me on Twitter. Thanks for reading!

Additional Resources

Want to learn more about web scraping? Not a problem! We have already prepared the list of tutorials so you can kickstart your web scraping journey.

  1. Web Scraping Google With Node JS – A Complete Guide
  2. Web Scraping Google Maps Results
  3. Scrape Google Shopping Results
  4. Scrape Google Maps Reviews
  5. Web Scraping Amazon

Frequently Asked Questions

Q. How to scrape Yelp data for free?

You can use Serpdog’s Yelp Search API to scrape Yelp data for free. Serpdog provides 100 free credits to users on their first sign-up.

Q. Can you scrape Yelp data?

Yes, you can scrape Yelp data with your own scraper, as shown in this tutorial, or with a scraping API available on the market.
