
How to Scrape Google Shopping Results


In a fiercely competitive market, web scraping is a valuable part of any marketing and monitoring strategy for keeping tabs on your competitors. Extracting publicly available data not only gives you competitive leverage but also lets you make informed strategic decisions to expand your foothold in the market.

In this tutorial, we will scrape Google Shopping results using Node.js. We will also explore the benefits of this data and solutions to the problems that can occur while gathering it.

Why Scrape Google Shopping?

Google Shopping, formerly known as Google Product Search or Google Shopping Search, is used for browsing products from different online retailers and sellers for online purchases.

Consumers and retailers benefit from Google Shopping, making it a valuable e-commerce tool. Consumers can compare and select different ranges of products, while it helps retailers by increasing their discoverability on the platform and potentially driving more sales.


Scraping Google Shopping gives you access to the following benefits:

Benefits Of Scraping Google Shopping

Price Monitoring – Scrape Google Shopping to monitor the pricing of a particular product from various sources, and compare them to get the cheapest source, which also saves customers money.

Product Information – Get detailed information about products from Google Shopping, and compare their reviews and features to find the best one.

Product Availability – You can use Google Shopping data to monitor the availability of a set of products instead of manually checking the product from different sources, which consumes your valuable time.

Let’s start scraping Google Shopping:

In this section, we will scrape Google Shopping results. But first, let us set up the requirements for this project.

Web Parsing with CSS selectors

Searching for tags in raw HTML is both difficult and time-consuming. It is easier to use a tool such as SelectorGadget to pick out the right tags and make your web scraping journey smoother.

This gadget helps you build a precise CSS selector for your needs, and its tutorial walks you through choosing the best selectors for your use case.

Install Libraries

To scrape Google Shopping results, we first need to install some npm libraries:

  1. Unirest JS
  2. Cheerio JS

Before starting, make sure your Node.js project is set up and that both packages, Unirest JS and Cheerio JS, are installed.
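Assuming a fresh project, the setup looks like this (package names as published on npm):

```shell
# Initialize the project and install both dependencies
npm init -y
npm i unirest cheerio
```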

Process

Google Shopping Page Components

It is good practice to decide in advance which entities you want to scrape. These are the data points we will cover in this tutorial:

  1. Title
  2. Rating
  3. Reviews
  4. Pricing
  5. Source

Since we have completed the setup, we will now request our target URL using Unirest JS to get the HTML data, and then parse the extracted HTML with the help of Cheerio JS.

const unirest = require("unirest");
const cheerio = require("cheerio");

const getShoppingData = () => {
  try {
    return unirest
      .get("https://www.google.com/search?q=nike+shoes&tbm=shop&gl=us")
      .headers({
        "User-Agent":
          "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36",
      })
      .then((response) => {

Look at the tbm parameter and its value (shop, in this case) in the URL. The value shop tells Google that we are looking for shopping results.
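If you prefer not to hand-write the query string, you can build the same URL programmatically. This sketch uses Node's built-in URLSearchParams; q is the search query, tbm=shop requests shopping results, and gl sets the country:

```javascript
// Build the Google Shopping search URL from its parameters
const params = new URLSearchParams({ q: "nike shoes", tbm: "shop", gl: "us" });
const url = `https://www.google.com/search?${params}`;

console.log(url); // https://www.google.com/search?q=nike+shoes&tbm=shop&gl=us
```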

Open this URL in your browser and inspect the page. You will see that every organic shopping result sits inside a container matching the selector .sh-dgr__gr-auto.

To extract information from each product, we need to run a loop over every product using the Cheerio instance. But first, let's not forget to initialize that instance so it can parse the HTML.

let $ = cheerio.load(response.body);

After initializing the variable, we will begin parsing each element using Cheerio.

        let shopping_results = [];
    
        $(".sh-dgr__gr-auto").each((i,el) => {
            shopping_results.push({

Extracting Product Name

Let’s find the location of the product name by inspecting the page.

Inspecting the element shows that the product name lives in an h3 tag with the class tAxDx.

Then, using the Cheerio instance, we navigate the DOM to target the h3 element.

            title: $(el).find("h3.tAxDx").text(),

Extracting Product Rating and Number of Reviews

Similarly, we can extract the product rating and the number of reviews.

The product rating is contained under the tag span with the class Rsc7Yb. Let’s push this also.

            rating: $(el).find("span.Rsc7Yb").text(),

And then, we will locate the reviews inside the HTML.

So, the number of reviews of a product is inside the span tag with class QIrs8 which is also under the div container with class NzUzee. We can extract the number of reviews using the following code.

                reviews: parseFloat($(el).find(".NzUzee .QIrs8").text()?.split("stars.")[1]?.trim()?.replace(/,/g, "")),
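Both the rating and the review count come from the same accessibility string in that element. Assuming it reads like "4.5 out of 5 stars. 1,234 product reviews" (a format observed at the time of writing, which Google may change), the string-splitting works like this:

```javascript
// Hypothetical accessibility text from the reviews element
const text = "4.5 out of 5 stars. 1,234 product reviews";

// Rating: everything before "out" -> "4.5 "
const rating = parseFloat(text.split("out")[0].trim());

// Reviews: everything after "stars." with thousands separators removed
const reviews = parseFloat(text.split("stars.")[1].trim().replace(/,/g, ""));

console.log(rating, reviews); // 4.5 1234
```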

Extracting Pricing

Pricing is stored inside the span tag with the class a8Pemb contained inside the div tag with the class XrAfOe.

You can extract pricing using the following code.

                price: $(el).find(".XrAfOe .a8Pemb").text(),

Extracting Source

Finally, we will extract the name of the retailer which is selling the product.

The source is present inside the div tag with class IuHnof.

                source: $(el).find(".IuHnof").text().replace(/\.aULzUe\{.*?\}\.aULzUe::after\{.*?\}/ , ''),

The replace() call strips an inline CSS rule that sometimes gets embedded before the seller name, leaving only the source text.
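To see why the regex is needed, here is a sketch with a made-up raw value: the text of the source cell can start with a style rule for the .aULzUe class (the class names and the seller "Walmart" here are illustrative assumptions), and the regex removes exactly that prefix:

```javascript
// Hypothetical raw text of the source cell, with a leaked style rule
const raw = '.aULzUe{color:red}.aULzUe::after{content:""}Walmart';

// Strip the ".aULzUe{...}.aULzUe::after{...}" prefix, keeping the seller name
const source = raw.replace(/\.aULzUe\{.*?\}\.aULzUe::after\{.*?\}/, "");

console.log(source); // "Walmart"
```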

Complete Code

I have also extracted some extra information (shopping ads, links, and delivery details). If you want even more data, you can follow the same method we applied above.

const unirest = require("unirest");
const cheerio = require("cheerio");

const getShoppingData = () => {
  try {
    return unirest
      .get("https://www.google.com/search?q=nike+shoes&tbm=shop&gl=us")
      .headers({
        "User-Agent":
          "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36",
      })
      .then((response) => {
        let $ = cheerio.load(response.body);

        // Shopping ads
        let ads = [];

        $(".sh-np__click-target").each((i, el) => {
          ads.push({
            title: $(el).find(".sh-np__product-title").text(),
            link: "https://google.com" + $(el).attr("href"),
            source: $(el).find(".sh-np__seller-container").text(),
            price: $(el).find(".hn9kf").text(),
            delivery: $(el).find(".U6puSd").text(),
          });
          if ($(el).find(".rz2LD").length) {
            ads[i].extensions = $(el).find(".rz2LD").text();
          }
        });

        // Drop empty fields from the ads
        for (let i = 0; i < ads.length; i++) {
          Object.keys(ads[i]).forEach((key) =>
            ads[i][key] === "" ? delete ads[i][key] : {}
          );
        }

        // Organic shopping results
        let shopping_results = [];

        $(".sh-dgr__gr-auto").each((i, el) => {
          shopping_results.push({
            title: $(el).find("h3.tAxDx").text(),
            link: $(el)
              .find(".zLPF4b .eaGTj a.shntl")
              .attr("href")
              .substring(
                $(el).find("a.shntl").attr("href").indexOf("=") + 1
              ),
            source: $(el)
              .find(".IuHnof")
              .text()
              .replace(/\.aULzUe\{.*?\}\.aULzUe::after\{.*?\}/, ""),
            price: $(el).find(".XrAfOe .a8Pemb").text(),
            rating: $(el).find(".NzUzee .QIrs8").text()
              ? parseFloat(
                  $(el).find(".NzUzee .QIrs8").text()?.split("out")[0]?.trim()
                )
              : "",
            reviews: $(el).find(".NzUzee .QIrs8").text()
              ? parseFloat(
                  $(el)
                    .find(".NzUzee .QIrs8")
                    .text()
                    ?.split("stars.")[1]
                    ?.trim()
                    ?.replace(/,/g, "")
                )
              : "",
            delivery: $(el).find(".vEjMR").text(),
          });
          if ($(el).find(".Ib8pOd").length) {
            shopping_results[i].extensions = $(el).find(".Ib8pOd").text();
          }
        });

        // Drop empty fields from the results
        for (let i = 0; i < shopping_results.length; i++) {
          Object.keys(shopping_results[i]).forEach((key) =>
            shopping_results[i][key] === ""
              ? delete shopping_results[i][key]
              : {}
          );
        }

        console.log(ads);
        console.log(shopping_results);
      });
  } catch (e) {
    console.log(e);
  }
};

getShoppingData();
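One detail worth calling out: the Object.keys cleanup loops in the code above delete empty-string fields, so each final object only carries data that was actually found on the page. A standalone sketch of that pattern:

```javascript
// Mimic a scraped result where some selectors matched nothing
const item = { title: "Nike Air Max", rating: "", reviews: "" };

// Delete every key whose value is an empty string
Object.keys(item).forEach((key) => {
  if (item[key] === "") delete item[key];
});

console.log(item); // { title: "Nike Air Max" }
```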

Run this code in your terminal to print both arrays: the shopping ads and the organic shopping results.

Save the data in a CSV file

Instead of leaving the data in a mess, we should save the extracted information to a CSV file. We will use the npm library objects-to-csv to complete this task.

Let us install it.

npm i objects-to-csv

Then import it into your code.

const ObjectsToCsv = require('objects-to-csv');

Then, we are going to use this library to store the scraped Google Shopping Data in the CSV file.

const csv = new ObjectsToCsv(shopping_results)
csv.toDisk('./shopping_data.csv', { append: true })

Passing append: true to the toDisk() method ensures that fresh data is appended to the existing file instead of overwriting it.

After executing this code in your terminal, you will get a file named shopping_data.csv in your root project folder.


With Google Shopping API

If you don’t want to code and maintain the scraper in the long run, then you can try our Google Shopping API to scrape shopping results.


We also offer 100 free requests on the first sign-up.

After registering successfully on Serpdog, embed your API key in the code below, and you will be able to scrape Google Shopping results without any blockage.

const axios = require("axios");

axios
  .get("https://api.serpdog.io/shopping?api_key=APIKEY&q=shoes&gl=us")
  .then((response) => {
    console.log(response.data);
  })
  .catch((error) => {
    console.log(error);
  });

Conclusion

Google Shopping is one of the most convenient single sources for retrieving product data that would otherwise be scattered across multiple e-commerce sites. It helps you monitor competitors and consumer trends, make data-driven decisions, and support your business in its growth efforts.

I hope this tutorial gave you a clear understanding of why scraping shopping results can be beneficial for businesses.

Feel free to message me anything you need clarification on. Follow me on Twitter. Thanks for reading!

Additional Resources

  1. How to scrape Google Organic Search Results using Node JS?
  2. Scrape Google Images Results
  3. Scrape Google News Results
  4. Scrape Google Maps Reviews

Frequently Asked Questions