In this post, we will learn how to scrape Google Maps reviews.


Requirements:

Web Parsing with CSS selectors

Searching for tags in raw HTML files is not only difficult but also time-consuming. It is better to use the CSS Selector Gadget to pick out the right tags and make your web-scraping journey easier.

This gadget helps you come up with the right CSS selector for your needs. Here is the link to the tutorial, which will teach you how to use it to select the best CSS selectors according to your needs.

User Agents

User-Agent is used to identify the application, operating system, vendor, and version of the requesting client. Sending a browser-like User-Agent header helps your request pass as a visit from a real user rather than a bot.
You can also rotate User Agents, read more about this in this article: How to fake and rotate User Agents using Python 3.
If you want to further safeguard your IP from being blocked by Google, you can try these 10 Tips to avoid getting Blocked while Scraping Google.
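As a minimal sketch of the rotation idea, you can pick a random User-Agent from a pool before each request. The strings below are ordinary example desktop UAs, not a maintained list; keep your own pool up to date:

```javascript
// Rotate User-Agents by picking a random one per request.
// These strings are illustrative desktop UAs, not an exhaustive list.
const userAgents = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.4 Safari/605.1.15",
  "Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0",
];

const randomUserAgent = () =>
  userAgents[Math.floor(Math.random() * userAgents.length)];

console.log(randomUserAgent());
```

You would then pass `randomUserAgent()` into the `.headers()` call of each request instead of a fixed string.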

Install Libraries

To scrape Google Maps reviews, we need to install some npm libraries before we can move forward.

  1. Node JS
  2. Unirest JS
  3. Cheerio JS

So, before starting, make sure you have set up your Node JS project and installed both packages, Unirest JS and Cheerio JS. You can install both packages from the links above.
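Assuming Node JS is already installed, setting up the project and pulling in both packages typically looks like this:

```shell
# Initialize a new Node JS project and install the two scraping libraries
npm init -y
npm i unirest cheerio
```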

Target:


We will scrape the user reviews of the Eiffel Tower.

Process:

There are many ways to scrape reviews from Google Maps. Here are two methods:

Method 1 - Using Google Maps Network URL

Now we have everything in place to prepare our scraper. We will use the npm library Unirest JS to make a GET request to our target URL and fetch the raw HTML data, and then use Cheerio JS to parse the extracted HTML.
We will target this URL:

https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:${data_ID},next_page_token:${next_page_token},sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc
Where,
data_ID - Data ID is a unique ID given to a particular location in Google Maps.
next_page_token - The next_page_token is used to get the next page results.
sort_by - It is used for sorting and filtering results.
The various values of sort_by are:
  1. qualityScore - the most relevant reviews.
  2. newestFirst - the most recent reviews.
  3. ratingHigh - the highest rating reviews.
  4. ratingLow - the lowest rating reviews.
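To see how these parameters fit together, here is a small helper (a sketch of my own, not part of any official API) that fills the URL template above for a given Data ID, token, and sort order:

```javascript
// Build the reviews URL from the template above. dataId, nextPageToken,
// and sortBy map directly onto the URL's feature_id, next_page_token,
// and sort_by parameters.
const buildReviewsUrl = (dataId, nextPageToken = "", sortBy = "qualityScore") =>
  `https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:${dataId},next_page_token:${nextPageToken},sort_by:${sortBy},start_index:,associated_topic:,_fmt:pc`;

// e.g. the newest reviews for the Eiffel Tower:
console.log(buildReviewsUrl("0x47e66e2964e34e2d:0x8ddca9ee380ef7e0", "", "newestFirst"));
```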

Now the question arises: how do we get the Data ID of any place?
Take the Eiffel Tower's Google Maps URL as an example:

https://www.google.com/maps/place/Eiffel+Tower/@48.8583701,2.2922926,17z/data=!4m7!3m6!1s0x47e66e2964e34e2d:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!9m1!1b1
You can see that the part of the URL after !4m7!3m6!1s and before !8m2 is our Data ID.
So, our Data ID in this case is 0x47e66e2964e34e2d:0x8ddca9ee380ef7e0.
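Since the Data ID always sits between !1s and the next ! separator, a small regex sketch can pull it out of any place URL:

```javascript
// Extract the Data ID (two hex values joined by a colon) from a
// Google Maps place URL. Returns null if no Data ID segment is found.
const extractDataId = (mapsUrl) => {
  const match = mapsUrl.match(/!1s(0x[0-9a-f]+:0x[0-9a-f]+)/i);
  return match ? match[1] : null;
};

const url =
  "https://www.google.com/maps/place/Eiffel+Tower/@48.8583701,2.2922926,17z/data=!4m7!3m6!1s0x47e66e2964e34e2d:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!9m1!1b1";
console.log(extractDataId(url)); // → 0x47e66e2964e34e2d:0x8ddca9ee380ef7e0
```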
You can also use Serpdog's Google Maps Data ID API to retrieve the Data ID of any place.
                           
  const axios = require('axios');

  axios.get('https://api.serpdog.io/dataId?api_key=APIKEY&q=Statue Of Liberty&gl=us')
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    console.log(error);
  });
  
  Result:
  {
  "meta": {
    "api_key": "APIKEY",
    "q": "Statue Of Liberty",
    "gl": "us"
  },
  "placeDetails": [
    {
      "Address": " New York, NY 10004"
    },
    {
      "Phone": " (212) 363-3200"
    },
    {
      "dataId": "0x89c25090129c363d:0x40c6a5770d25022b"
    }
  ]
  }
Our target URL should look like this:

https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:0x47e66e2964e34e2d:0x8ddca9ee380ef7e0,next_page_token:,sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc
Copy this URL into your browser and press Enter. A text file will be downloaded. Open this file in your code editor and convert it into an HTML file. With the HTML file open, we will search for the HTML tags of the elements we want in our response.
We will first parse the location information of the place, which contains the location name, address, average rating, and total number of reviews. The tag for our location name is .P5Bobd, the tag for our address is .T6pBCe, the tag for our average rating is span.Aq14fc, and the tag for our total number of reviews is span.z5jxId.
All done with the location information; we will now move on to parsing the Data ID and next_page_token.
Search for the tag .lcorif. Under this tag, we have our tag for the Data ID, .loris, and for the next_page_token, .gws-localreviews__general-reviews-block.
Now, we will search for the tags that contain data about each user and their review.
Search for the tag .gws-localreviews__google-review. This tag contains all the information about a user and their review.
We will now parse the extracted HTML for the user's name, link, thumbnail, number of reviews, rating, review text, and any images posted by the user, which makes our code look like this:
                              
    const unirest = require("unirest");
    const cheerio = require("cheerio");
    
    const getReviewsData = () => {
      return unirest
        .get("https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:0x47e66e2964e34e2d:0x8ddca9ee380ef7e0,next_page_token:,sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc")
        .headers({
          "User-Agent":
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36",
        })
        .then((response) => {
          console.log(response.body)
          let $ = cheerio.load(response.body);
    
          let user = [], location_info, data_id, token;
    
          $(".lcorif").each((i, el) => {
            data_id = $(".loris").attr("data-fid");
            token = $(".gws-localreviews__general-reviews-block").attr(
              "data-next-page-token"
            );
            location_info = {
              title: $(".P5Bobd").text(),
              address: $(".T6pBCe").text(),
              avgRating: $("span.Aq14fc").text(),
              totalReviews: $("span.z5jxId").text(),
            };
          });
    
          $(".gws-localreviews__google-review").each((i, el) => {
            user.push({
              name: $(el).find(".TSUbDb").text(),
              link: $(el).find(".TSUbDb a").attr("href"),
              thumbnail: $(el).find(".lDY1rd").attr("src"),
              numOfreviews: $(el).find(".Msppse").text(),
              rating: $(el).find(".EBe2gf").attr("aria-label"),
              review: $(el).find(".Jtu6Td").text(),
              images: $(el)
                .find(".EDblX .JrO5Xe")
                .toArray()
                .map($)
                .map((d) => d.attr("style").substring(21, d.attr("style").lastIndexOf(")"))),
            });
          });
          console.log("LOCATION INFO: ");
          console.log(location_info);
          console.log("DATA ID:");
          console.log(data_id);
          console.log("TOKEN:");
          console.log(token);
          console.log("USER:");
          console.log(user);
        });
    };
    
    getReviewsData();
                              
                          
You can also check some of my other Google scrapers in my Git Repository: https://github.com/Darshan972/GoogleScrapingBlogs

Result:

These are the results for the first ten reviews. If you want the next ten results, put the token returned by our code into the URL below:

https://www.google.com/async/reviewDialog?hl=en_us&async=feature_id:0x47e66e2964e34e2d:0x8ddca9ee380ef7e0,next_page_token:tokenFromResponse,sort_by:qualityScore,start_index:,associated_topic:,_fmt:pc
In this case, we have our token as CAESBkVnSUlDZw== .
You can find the reviews for every next page using the token from their previous pages.
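That token chaining can be wrapped in a loop. The sketch below assumes a hypothetical fetchPage(token) function that wraps the Unirest + Cheerio logic from above and resolves to { reviews, nextPageToken }; injecting it keeps the pagination logic separate from the network code, so it is easy to test and to rate-limit:

```javascript
// Collect reviews page by page, feeding each response's next_page_token
// back into the following request. `fetchPage` is an assumed wrapper
// around the Method 1 request; it resolves to
// { reviews: [...], nextPageToken: "..." }.
const collectReviews = async (fetchPage, maxPages = 3) => {
  const all = [];
  let token = ""; // first page uses an empty token
  for (let i = 0; i < maxPages; i++) {
    const { reviews, nextPageToken } = await fetchPage(token);
    all.push(...reviews);
    if (!nextPageToken) break; // no further pages
    token = nextPageToken;
  }
  return all;
};
```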

Method 2 - Using Puppeteer Infinite Scrolling

Another method you can use for scraping Google Maps reviews is Puppeteer with infinite scrolling. First, let us open the reviews page of Google Maps in our browser. Here is the URL:

https://www.google.com/maps/place/Eiffel+Tower/@48.8583701,2.2944813,15z/data=!4m7!3m6!1s0x0:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!9m1!1b1
Now we will make the main function, in which we will first navigate to the target URL and extract the average rating and the rating breakdown given by users.
                            
    const puppeteer = require("puppeteer");

    const getMapsData = async () => {
    try {
        let url =
        "https://www.google.com/maps/place/Eiffel+Tower/@48.8583701,2.2944813,15z/data=!4m7!3m6!1s0x0:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!9m1!1b1";
        const browser = await puppeteer.launch({
        args: ["--disable-setuid-sandbox", "--no-sandbox"],
        headless: false
        });
        const page = await browser.newPage();
    
        await page.goto(url, { waitUntil: "domcontentloaded" , timeout: 60000});
        await page.waitForTimeout(3000);
    
        let ratings = await page.evaluate(() => {
        return Array.from(document.querySelectorAll(".PPCwl")).map((el) => {
            return {
            avg_rating: el.querySelector(".fontDisplayLarge")?.textContent.trim(),
            total_reviews: el.querySelector(".fontBodySmall")?.textContent.trim(),
            five_stars: el.querySelector(".ExlQHd tbody tr:nth-child(1)").getAttribute("aria-label").split("stars, ")[1].trim(),
            four_stars: el.querySelector(".ExlQHd tbody tr:nth-child(2)").getAttribute("aria-label").split("stars, ")[1].trim(),
            three_stars: el.querySelector(".ExlQHd tbody tr:nth-child(3)").getAttribute("aria-label").split("stars, ")[1].trim(),
            two_stars: el.querySelector(".ExlQHd tbody tr:nth-child(4)").getAttribute("aria-label").split("stars, ")[1].trim(),
            one_stars: el.querySelector(".ExlQHd tbody tr:nth-child(5)").getAttribute("aria-label").split("stars, ")[1].trim(),
            };
        });
        });
    
        console.log(ratings)
    
        let data =  await scrollPage(page,'.DxyBCb', 10);
    
        console.log(data);
        await browser.close();
    } catch (e) {
        console.log(e);
    }
   };
                            
                        
Step-by-step explanation:
  1. puppeteer.launch() - This method will launch the Chromium browser with the options we have set in our code. In our case, we are launching our browser in non-headless mode.
  2. browser.newPage() - This will open a new page or tab in the browser.
  3. page.goto() - This will navigate the page to the specified target URL.
  4. page.waitForTimeout() - It will cause the page to wait for the specified number of milliseconds before performing further operations.
  5. scrollPage() - At last, we called our infinite scroller to extract the data we need with the page, the tag for the scroller div, and the number of items we want as parameters.
  6. browser.close() - This will close the browser.
After this, we will move on to our infinite scroller function.
                        
    const scrollPage = async (page, scrollContainer, itemTargetCount) => {
        let items = [];
        let previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
        while (itemTargetCount > items.length) {
            items = await extractItems(page);
            await page.evaluate(`document.querySelector("${scrollContainer}").scrollTo(0, document.querySelector("${scrollContainer}").scrollHeight)`);
            // Wait until new reviews load and the container grows taller
            await page.waitForFunction(`document.querySelector("${scrollContainer}").scrollHeight > ${previousHeight}`);
            previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
            await page.waitForTimeout(2000);
        }
        return items;
    }
                        
                    
Step-by-step explanation:
  1. previousHeight - The scroll height of the container before scrolling.
  2. extractItems() - Function to parse the scraped HTML.
  3. In the next step, we scrolled the container to the bottom by setting its scroll position to its full scroll height.
  4. And in the last step, we waited until the container's scroll height grew past previousHeight, which signals that new reviews have loaded.

After this, we will parse the HTML in the extractItems function.
                        
    async function extractItems(page) {
        const reviews = await page.evaluate(() => {
        return Array.from(document.querySelectorAll(".jftiEf")).map((el) => {
        return {
            user: {
            name: el.querySelector(".d4r55")?.textContent.trim(),
            thumbnail: el.querySelector("a.WEBjve img")?.getAttribute("src"),
            localGuide: el.querySelector(".RfnDt span:nth-child(1)")?.style.display === "none" ?  false : true,
            reviews: parseInt(el.querySelector(".RfnDt span:nth-child(2)")?.textContent.replace("·", "")),
            link: el.querySelector("a.WEBjve")?.getAttribute("href"),
            },
            rating: el.querySelector(".kvMYJc")?.getAttribute("aria-label").trim(),
            date: el.querySelector(".rsqaWe")?.textContent,
            review: el.querySelector(".wiI7pd")?.textContent.trim(),
            images: Array.from(el.querySelectorAll(".KtCyie button")).length
            ? Array.from(el.querySelectorAll(".KtCyie button")).map((el) => {
                return {
                thumbnail: getComputedStyle(el).backgroundImage.split('")')[0].replace('url("',""),
                };
            })
            : "",
          };
            });
        });
        return reviews;
        }
                        
                    
Step-by-step explanation:
  1. document.querySelectorAll() - It will return all the elements that match the specified CSS selector. In our case, it is .jftiEf.
  2. getAttribute() - This will return the attribute value of the specified element.
  3. textContent - It returns the text content inside the selected HTML element.
  4. split() - Splits a string into substrings using the specified separator and returns them as an array.
  5. trim() - Removes whitespace from the beginning and end of a string.
  6. replace() - Replaces the specified pattern in the string with the given replacement.
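As a quick illustration of how these helpers combine in practice, here is how the raw aria-label and review-count strings can be turned into clean values (the sample inputs are assumptions based on the output shown further below):

```javascript
// "5 stars" -> 5: trim stray whitespace, split on the space, parse the number.
const parseRating = (ariaLabel) => parseFloat(ariaLabel.trim().split(" ")[0]);

// "· 1,554 reviews" -> 1554: strip the separator dot, the word "reviews",
// surrounding whitespace, and thousands separators, then parse.
const parseReviewCount = (text) =>
  parseInt(text.replace("·", "").replace("reviews", "").trim().replace(/,/g, ""), 10);

console.log(parseRating("5 stars"));            // → 5
console.log(parseReviewCount("· 554 reviews")); // → 554
```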

Here is the full code:
                        
    const puppeteer = require("puppeteer");

    async function extractItems(page) {
        const reviews = await page.evaluate(() => {
        return Array.from(document.querySelectorAll(".jftiEf")).map((el) => {
            return {
            user: {
                name: el.querySelector(".d4r55")?.textContent.trim(),
                thumbnail: el.querySelector("a.WEBjve img")?.getAttribute("src"),
                localGuide: el.querySelector(".RfnDt span:nth-child(1)")?.style.display === "none" ?  false : true,
                reviews: el.querySelector(".RfnDt span:nth-child(2)")?.textContent.replace("·", "").replace("reviews", "").trim(),
                link: el.querySelector("a.WEBjve")?.getAttribute("href"),
            },
            rating: el.querySelector(".kvMYJc")?.getAttribute("aria-label").trim(),
            date: el.querySelector(".rsqaWe")?.textContent,
            review: el.querySelector(".wiI7pd")?.textContent.trim(),
            images: Array.from(el.querySelectorAll(".KtCyie button")).length ? Array.from(el.querySelectorAll(".KtCyie button")).map((el) => {
                return {
                    thumbnail: getComputedStyle(el).backgroundImage.split('")')[0].replace('url("',""),
                };
                })
            : "",
            };
        });
        });
        return reviews;
    }
    
    const scrollPage = async (page, scrollContainer, itemTargetCount) => {
        let items = [];
        let previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
        while (itemTargetCount > items.length) {
        items = await extractItems(page);
        await page.evaluate(`document.querySelector("${scrollContainer}").scrollTo(0, document.querySelector("${scrollContainer}").scrollHeight)`);
        // Wait until new reviews load and the container grows taller
        await page.waitForFunction(`document.querySelector("${scrollContainer}").scrollHeight > ${previousHeight}`);
        previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
        await page.waitForTimeout(2000);
        }
        return items;
    }
    
    const getMapsData = async () => {
        try {
        let url =
            "https://www.google.com/maps/place/Eiffel+Tower/@48.8583701,2.2944813,15z/data=!4m7!3m6!1s0x0:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!9m1!1b1";
        const browser = await puppeteer.launch({
            args: ["--disable-setuid-sandbox", "--no-sandbox"],
            headless: false
        });
        const [page] = await browser.pages();
    
        await page.goto(url, { waitUntil: "domcontentloaded" , timeout: 60000});
        await page.waitForTimeout(3000);
    
        let ratings = await page.evaluate(() => {
            return Array.from(document.querySelectorAll(".PPCwl")).map((el) => {
            return {
                avg_rating: el.querySelector(".fontDisplayLarge")?.textContent.trim(),
                total_reviews: el.querySelector(".fontBodySmall")?.textContent.trim(),
                five_stars: el.querySelector(".ExlQHd tbody tr:nth-child(1)").getAttribute("aria-label").split("stars, ")[1].trim(),
                four_stars: el.querySelector(".ExlQHd tbody tr:nth-child(2)").getAttribute("aria-label").split("stars, ")[1].trim(),
                three_stars: el.querySelector(".ExlQHd tbody tr:nth-child(3)").getAttribute("aria-label").split("stars, ")[1].trim(),
                two_stars: el.querySelector(".ExlQHd tbody tr:nth-child(4)").getAttribute("aria-label").split("stars, ")[1].trim(),
                one_stars: el.querySelector(".ExlQHd tbody tr:nth-child(5)").getAttribute("aria-label").split("stars, ")[1].trim(),
            };
            });
        });
    
        console.log(ratings)
    
        let data =  await scrollPage(page,'.DxyBCb',10);
    
        console.log(data);
        await browser.close();
        } catch (e) {
        console.log(e);
        }
    };
    getMapsData();                            
                        
                    
Our results should look like this 👇🏻:
                        
  [
   {
    avg_rating: '4.6',
    total_reviews: '3,10,611 reviews',
    five_stars: '243,237 reviews',
    four_stars: '42,702 reviews',
    three_stars: '13,474 reviews',
    two_stars: '4,163 reviews',
    one_stars: '7,035 reviews'
   }
  ]
  [
   {
    user: {
      name: 'Wagner Castro',
      thumbnail: 'https://lh3.googleusercontent.com/a-/ACNPEu9wP6T1uyo2ga98cVBzIW0uH6NMyA2vX7KWB26hFeQ=w36-h36-p-c0x00000000-rp-mo-ba6-br100',
      localGuide: true,
      reviews: '554',
      link: 'https://www.google.com/maps/contrib/113391288797697364105/reviews?hl=en-US'
    },
    rating: '5 stars',
    date: '2 months ago',
    review: 'Paris is an incredible experience with innumerable museums, parks, restaurants and  beautiful sites but the Eiffel Tower is one of the most interesting places to visit. …',
    images: [
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object], [Object],
      [Object], [Object]
    ]
  },
  .......
                        
                    
The main disadvantage of this method is that it is quite slow. If you want to scrape tons of results, I recommend not using it, as it can easily crash the browser.

With Google Maps Reviews API:

Serpdog's API offers you 100 free requests on sign-up.
Scraping can sometimes take a lot of time, but ready-made structured JSON data can save you much of it.

                        
    const axios = require('axios');
    
    axios.get('https://api.serpdog.io/reviews?api_key=APIKEY&data_id=0x89c25090129c363d:0x40c6a5770d25022b')
      .then(response => {
        console.log(response.data);
      })
      .catch(error => {
        console.log(error);
    });   
                        
                    

Result:


Conclusion:

In this tutorial, we learned how to scrape Google Maps reviews. Feel free to ask me anything by email, and follow me on Twitter. Thanks for reading!

Additional Resources:

  1. How to scrape Google Organic Search Results using Node JS?
  2. Scrape Google Shopping Results
  3. Scrape Google News Results
  4. Scrape Google Scholar Results
  5. Web Scraping Google Maps
  6. Web Scraping Google With Node JS - A Complete Guide

Frequently Asked Questions

Q. How do I scrape Google Maps Reviews?


You can visit the Google Maps Reviews documentation, where you will find a complete step-by-step guide to scraping Google Maps reviews.

Q. Is it legal to scrape Google Maps?


Scraping publicly available data from Google Maps is generally considered legal in the US, but you should review the applicable laws and terms of service for your own use case. You can also use Serpdog's Google Maps Reviews API to extract data from Google Maps.

Q. How do I scrape Google Maps for free?


You can read this blog on scraping Google Maps, where I have given a step-by-step guide to scraping Google Maps.

Author:

My name is Darshan, and I am the founder of serpdog.io.