
A Complete Guide To Web Scraping Google Maps

Google Maps data is a crucial source of information for software companies, sentiment analysts, and data miners, as it contains valuable details like user ratings and reviews, phone numbers, addresses, images, and other relevant attributes about a particular location.

Furthermore, Google Maps data serves businesses by allowing them to verify their geographical presence, gather insights into their competitors, and optimize their local SEO efforts.

The limitations imposed by the official Google Maps API have discouraged developers from extracting data directly through it, pushing them toward third-party solutions to meet their requirements. Moreover, the official API's much higher pricing makes it an unfeasible option for many.

Remember, you can also design your scraper, giving you complete control over the results.

In this tutorial, we will learn to scrape Google Maps with Node JS using Puppeteer.

Let’s start scraping Google Maps:

Before starting the tutorial, let’s discuss the requirements necessary to complete it.

Web Parsing With CSS Selectors

Searching for tags in raw HTML files is not only difficult but also time-consuming. It is better to use the CSS Selector Gadget to pick the right tags and make your web scraping journey easier.

This gadget can help you come up with the perfect CSS selector for your needs. Here is the link to the tutorial, which will teach you how to use it to select the best CSS selectors.

User Agents

The User-Agent header identifies the application, operating system, vendor, and version of the requesting client. Sending a real browser’s User-Agent helps your request to Google look like a visit from a genuine user rather than a bot.
You can also rotate User Agents; read more about this in this article: How to fake and rotate User Agents using Python 3.

If you want to further safeguard your IP from being blocked by Google, try these 10 Tips to avoid getting Blocked while Scraping Websites.
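As a small illustration of the rotation idea, here is a minimal sketch of picking a random User-Agent per page in a Puppeteer script. The `userAgents` list and `randomUserAgent` helper are example names of our own, and the UA strings are ordinary desktop examples rather than a vetted list:

```javascript
// Sketch: rotate User-Agents by picking one at random for each new page.
// These UA strings are illustrative examples, not a maintained list.
const userAgents = [
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4882.194 Safari/537.36",
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36",
];

const randomUserAgent = () =>
  userAgents[Math.floor(Math.random() * userAgents.length)];

// Usage inside a Puppeteer script:
// await page.setUserAgent(randomUserAgent());
```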

Install Libraries

Before we begin, install this library so we can move forward and prepare our scraper.

  1. Puppeteer JS

Or you can type the below command in your project terminal to install the library:

npm i puppeteer

Process

Before proceeding with the tutorial, let us outline the data points we are going to scrape:

  1. Place Name or Title
  2. Rating and Reviews
  3. Address
  4. Place Description
  5. Timings 
  6. And other relevant data…

Google Maps Results

Copy the below target URL to extract the HTML data:

https://www.google.com/maps/search/coffee/@28.6559457,77.1404218,11z

Coffee is our query. After that come the latitude and longitude. The number before z at the end is the zoom intensity of Google Maps. You can decrease or increase it as you like; its value ranges from 2.92, in which the map is completely zoomed out, to 21, in which the map is completely zoomed in.

Note: Latitudes and longitudes are required to pass in the URL. But the zoom parameter is optional.
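The URL pattern described above can also be assembled programmatically. The `buildMapsUrl` helper below is a hypothetical convenience of our own, not part of any library:

```javascript
// Hypothetical helper that assembles a Google Maps search URL from its parts.
// Latitude and longitude are required; zoom is optional (defaults to 11 here).
const buildMapsUrl = (query, latitude, longitude, zoom = 11) =>
  `https://www.google.com/maps/search/${encodeURIComponent(query)}/@${latitude},${longitude},${zoom}z`;

console.log(buildMapsUrl("coffee", 28.6559457, 77.1404218));
// → https://www.google.com/maps/search/coffee/@28.6559457,77.1404218,11z
```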

We will use the Puppeteer Infinite Scrolling Method to scrape the Google Maps Results. So, let us start preparing our scraper.

First, let us create a main function to launch the browser and navigate to the target URL.

const getMapsData = async () => {
    const browser = await puppeteer.launch({
        headless: false,
        args: ["--disable-setuid-sandbox", "--no-sandbox"],
    });
    const page = await browser.newPage();
    await page.setExtraHTTPHeaders({
        "User-Agent":
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4882.194 Safari/537.36",
    });

    await page.goto("https://www.google.com/maps/search/Starbucks/@26.8484046,75.7215344,12z/data=!3m1!4b1", {
        waitUntil: "domcontentloaded",
        timeout: 60000,
    });

    await page.waitForTimeout(3000);

    let data = await scrollPage(page, ".m6QErb[aria-label]", 2);

    console.log(data);
    await browser.close();
};

Step-by-step explanation:

  1. puppeteer.launch() – This will launch the Chromium browser with the options set in our code. In our case, we are launching our browser in non-headless mode.
  2. browser.newPage() – This will open a new page or tab in the browser.
  3. page.setExtraHTTPHeaders() – It is used to pass HTTP headers with every request the page initiates.
  4. page.goto() – This will navigate the page to the specified target URL.
  5. page.waitForTimeout() – It will make the page wait for 3 seconds before doing further operations. (Note: newer Puppeteer versions have removed this method; a promise-based delay like `new Promise(r => setTimeout(r, 3000))` works instead.)
  6. scrollPage() – At last, we called our infinite scroller to extract the data we needed with the page, the tag for the scroller div, and the number of items we want as parameters.

Now, let us prepare the infinite scroller.

const scrollPage = async (page, scrollContainer, itemTargetCount) => {
    let items = [];
    let previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
    while (itemTargetCount > items.length) {
        items = await extractItems(page);
        await page.evaluate(`document.querySelector("${scrollContainer}").scrollTo(0, document.querySelector("${scrollContainer}").scrollHeight)`);
        await page.waitForFunction(`document.querySelector("${scrollContainer}").scrollHeight > ${previousHeight}`);
        previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
        await page.waitForTimeout(2000);
    }
    return items;
};

Step-by-step explanation:

  1. previousHeight – Stores the current scroll height of the results container.
  2. extractItems() – Function to parse the scraped HTML.
  3. In the next step, we scrolled the container down to its full scroll height.
  4. In the last step, we waited for the container’s height to become larger than the previous height, which indicates that new results have loaded.

Finally, we will discuss the working of our parser.

const extractItems = async (page) => {
    let maps_data = await page.evaluate(() => {
        return Array.from(document.querySelectorAll(".Nv2PK")).map((el) => {
            const link = el.querySelector("a.hfpxzc").getAttribute("href");
            return {
                title: el.querySelector(".qBF1Pd")?.textContent.trim(),
                avg_rating: el.querySelector(".MW4etd")?.textContent.trim(),
                reviews: el.querySelector(".UY7F9")?.textContent.replace("(", "").replace(")", "").trim(),
                address: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(1) > span:last-child")?.textContent.replaceAll("·", "").trim(),
                description: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(2)")?.textContent.replace("·", "").trim(),
                website: el.querySelector("a.lcr4fd")?.getAttribute("href"),
                category: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(1) > span:first-child")?.textContent.replaceAll("·", "").trim(),
                timings: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(3) > span:first-child")?.textContent.replaceAll("·", "").trim(),
                phone_num: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(3) > span:last-child")?.textContent.replaceAll("·", "").trim(),
                extra_services: el.querySelector(".qty3Ue")?.textContent.replaceAll("·", "").replaceAll("  ", " ").trim(),
                latitude: link.split("!8m2!3d")[1].split("!4d")[0],
                longitude: link.split("!4d")[1].split("!16s")[0],
                link,
                dataId: link.split("1s")[1].split("!8m")[0],
            };
        });
    });
    return maps_data;
};

Step-by-step explanation:

  1. document.querySelectorAll() – It will return all the elements that match the specified CSS selector. In our case, it is .Nv2PK.
  2. getAttribute() – This will return the attribute value of the specified element.
  3. textContent – It returns the text content inside the selected HTML element.
  4. split() – Used to split a string into substrings with the help of a specified separator and return them as an array.
  5. trim() – Removes the spaces from the start and end of the string.
  6. replaceAll() – Replaces every occurrence of the specified pattern in the string.
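To see what these string methods do in practice, here is the same `split()` chain applied to a sample place link (taken from the example output later in this post):

```javascript
// Demonstrates how the parser pulls coordinates and the data ID out of a
// place link. The sample link is shortened from the example output below.
const link =
  "https://www.google.com/maps/place/STARBUCKS/data=!4m7!3m6!1s0x396db5631deeb72b:0xb06c14c4ce81e24a!8m2!3d26.8556211!4d75.8058191!16s%2Fg%2F11qg98xgt3";

// Everything between "!8m2!3d" and "!4d" is the latitude.
const latitude = link.split("!8m2!3d")[1].split("!4d")[0];
// Everything between "!4d" and "!16s" is the longitude.
const longitude = link.split("!4d")[1].split("!16s")[0];
// Everything between "1s" and "!8m" is the data ID.
const dataId = link.split("1s")[1].split("!8m")[0];

console.log(latitude, longitude, dataId);
// → 26.8556211 75.8058191 0x396db5631deeb72b:0xb06c14c4ce81e24a
```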

Complete Code:

const puppeteer = require("puppeteer");

const extractItems = async (page) => {
    let maps_data = await page.evaluate(() => {
        return Array.from(document.querySelectorAll(".Nv2PK")).map((el) => {
            const link = el.querySelector("a.hfpxzc").getAttribute("href");
            return {
                title: el.querySelector(".qBF1Pd")?.textContent.trim(),
                avg_rating: el.querySelector(".MW4etd")?.textContent.trim(),
                reviews: el.querySelector(".UY7F9")?.textContent.replace("(", "").replace(")", "").trim(),
                address: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(1) > span:last-child")?.textContent.replaceAll("·", "").trim(),
                description: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(2)")?.textContent.replace("·", "").trim(),
                website: el.querySelector("a.lcr4fd")?.getAttribute("href"),
                category: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(1) > span:first-child")?.textContent.replaceAll("·", "").trim(),
                timings: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(3) > span:first-child")?.textContent.replaceAll("·", "").trim(),
                phone_num: el.querySelector(".W4Efsd:last-child > .W4Efsd:nth-of-type(3) > span:last-child")?.textContent.replaceAll("·", "").trim(),
                extra_services: el.querySelector(".qty3Ue")?.textContent.replaceAll("·", "").replaceAll("  ", " ").trim(),
                latitude: link.split("!8m2!3d")[1].split("!4d")[0],
                longitude: link.split("!4d")[1].split("!16s")[0],
                link,
                dataId: link.split("1s")[1].split("!8m")[0],
            };
        });
    });
    return maps_data;
};

const scrollPage = async (page, scrollContainer, itemTargetCount) => {
    let items = [];
    let previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
    while (itemTargetCount > items.length) {
        items = await extractItems(page);
        await page.evaluate(`document.querySelector("${scrollContainer}").scrollTo(0, document.querySelector("${scrollContainer}").scrollHeight)`);
        await page.waitForFunction(`document.querySelector("${scrollContainer}").scrollHeight > ${previousHeight}`);
        previousHeight = await page.evaluate(`document.querySelector("${scrollContainer}").scrollHeight`);
        await page.waitForTimeout(2000);
    }
    return items;
};

const getMapsData = async () => {
    const browser = await puppeteer.launch({
        headless: false,
        args: ["--disable-setuid-sandbox", "--no-sandbox"],
    });
    const [page] = await browser.pages();
    await page.setExtraHTTPHeaders({
        "User-Agent":
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4882.194 Safari/537.36",
    });

    await page.goto("https://www.google.com/maps/search/Starbucks/@26.8484046,75.7215344,12z/data=!3m1!4b1", {
        waitUntil: "domcontentloaded",
        timeout: 60000,
    });

    await page.waitForTimeout(5000);

    let data = await scrollPage(page, ".m6QErb[aria-label]", 2);

    console.log(data);
    await browser.close();
};

getMapsData();

Our result should look like this 👇🏻:

[
  {
    title: 'STARBUCKS',
    avg_rating: '4.6',
    reviews: '3,643',
    address: 'D-Block Fort Anandam Near Gaurav Tower, next to Mercedes showroom',
    description: 'Iconic Seattle-based coffeehouse chain',
    website: 'http://starbucks.in/',
    category: 'Coffee shop',
    timings: 'Open ⋅ Closes 12 am',
    phone_num: '079765 62949',
    extra_services: 'Dine-inTakeawayNo-contact delivery',
    latitude: '26.8556211',
    longitude: '75.8058191',
    link: 'https://www.google.com/maps/place/STARBUCKS/data=!4m7!3m6!1s0x396db5631deeb72b:0xb06c14c4ce81e24a!8m2!3d26.8556211!4d75.8058191!16s%2Fg%2F11qg98xgt3!19sChIJK7fuHWO1bTkRSuKBzsQUbLA?authuser=0&hl=en&rclk=1',
    dataId: '0x396db5631deeb72b:0xb06c14c4ce81e24a'
  },
  {
    title: 'Starbucks Coffee',
    avg_rating: '4.7',
    reviews: '1,028',
    address: 'Ground Floor, Near Naturals Icecream, A-5, C scheme chomu circle, Sardar Patel Marg',
    description: 'Iconic Seattle-based coffeehouse chain',
    website: 'https://www.starbucks.com/',
    category: 'Coffee shop',
    timings: 'Open ⋅ Closes 11 pm',
    phone_num: '022 6611 3939',
    extra_services: 'Dine-inTakeawayNo-contact delivery',
    latitude: '26.9109303',
    longitude: '75.7953463',
    link: 'https://www.google.com/maps/place/Starbucks+Coffee/data=!4m7!3m6!1s0x396db577dcbbb503:0x6bc55ebbd466bfde!8m2!3d26.9109303!4d75.7953463!16s%2Fg%2F11q95t5shd!19sChIJA7W73He1bTkR3r9m1LtexWs?authuser=0&hl=en&rclk=1',
    dataId: '0x396db577dcbbb503:0x6bc55ebbd466bfde'
  },
 ......

So, this is how you can create a basic scraper to extract data from Google Maps. However, this solution is not scalable and is too time-consuming for large-scale extraction. This problem can be easily solved by using Serpdog’s Google Maps API, which we will discuss shortly.

Advantages of Google Maps Scraping

Web Scraping Google Maps can provide you with various types of benefits: 

Lead Generation – Scraping Google Maps can help you collect emails and phone numbers from a large number of places, which you can use to build a database of leads that can sell at a high price in the market.

Sentiment Analysis – Google Maps data can be used to analyze public sentiment based on the average ratings and reviews left for a business.

Location-Based Services – One can scrape Google Maps data to provide services like finding nearby businesses, restaurants, pubs, and cafes based on the user’s location.
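As a small sketch of such a location-based use case, the following computes the great-circle (haversine) distance between two coordinate pairs, e.g. a user’s location and a scraped place’s latitude/longitude. The `haversineKm` helper is an example name of our own:

```javascript
// Sketch: haversine distance (in km) between two coordinate pairs, useful
// for filtering scraped places by how close they are to a user's location.
const haversineKm = (lat1, lon1, lat2, lon2) => {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const R = 6371; // mean Earth radius in km
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
};

// Distance between the two Starbucks results shown earlier in this post:
console.log(haversineKm(26.8556211, 75.8058191, 26.9109303, 75.7953463).toFixed(2), "km");
```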

Limitations of using official Google Maps API

The official Google Maps API is often not considered a viable option for several reasons:

Access Restrictions — The official Google Maps API caps the number of requests you can make per second at 100. Restrictions like these rule out the extensive data extraction required for scraping purposes.

Limited Amount of Data — The Google Maps API returns structured data, but it is not as flexible or detailed as what other web scrapers on the market provide.

Cost Issues — Due to its usage-based pricing model, the Google Maps API can become prohibitively expensive for large-scale data extraction, making it an unfeasible solution for developers.

Using Serpdog’s Google Maps API to scrape Google Maps

Creating your own dedicated solution or relying solely on the official API may not work in the long term, as various limitations prevent the extensive scraping needed to collect vital data for further analysis.

However, Serpdog’s Google Maps API provides a robust and streamlined solution for businesses struggling to extract Google Maps data at scale. It offers several data points, including contact numbers, customer reviews, and operating hours, letting users enrich their applications or databases with accurate, high-quality information.

Google Maps API

It also provides 100 free credits to users on registration!

Serpdog Dashboard

After registering, you will be redirected to our dashboard, where you can find your API key.

Integrate this API key into the code below, and you will be able to scrape Google Maps at a rapid speed without experiencing any blockage.

const axios = require('axios');

axios.get('https://api.serpdog.io/maps_search?api_key=APIKEY&q=coffee&ll=@40.7455096,-74.0083012,15.1z')
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    console.log(error);
  });

Results: 

"search_results": [
      {
          "title": "Gregorys Coffee",
          "place_id": "ChIJQTNrM69ZwokR3ggxzgeelqQ",
          "data_id": "0x89c259af336b3341:0xa4969e07ce3108de",
          "data_cid": "-6586903648621492002",
          "reviews_link": "https://api.serpdog.io/reviews?api_key=APIKEY&data_id=0x89c259af336b3341:0xa4969e07ce3108de",
          "photos_link": "https://api.serpdog.io/maps_photos?api_key=APIKEY&data_id=0x89c259af336b3341:0xa4969e07ce3108de",
          "posts_link": "https://api.serpdog.io/maps_post?api_key=APIKEY&data_id=0x89c259af336b3341:0xa4969e07ce3108de",
          "gps_coordinates": {
              "latitude": 40.7477283,
              "longitude": -73.9890454
          },
          "provider_id": "/g/11xdfwq9f",
          "rating": 4.1,
          "reviews": 1153,
          "price": "££",
          "type": "Coffee shop",
          "types": [
              "Coffee shop"
          ],
          "address": "874 6th Ave New York, NY 10001 United States",
          "open_state": "Open ⋅ Closes 7 pm",
          "hours": "Open ⋅ Closes 7 pm",
          "operating_hours": {
              "monday": "6:30 am–7 pm",
              "tuesday": "6:30 am–7 pm",
              "wednesday": "6:30 am–7 pm",
              "thursday": "6:30 am–7 pm",
              "friday": "6:30 am–7 pm",
              "saturday": "7 am–7 pm",
              "sunday": "7 am–7 pm"
          },
          "phone": "+1 877-231-7619",
          "description": "House-roasted coffee, snacks & free WiFi Outpost of a chain of sleek coffeehouses offering house-roasted coffee, free WiFi & light bites.",
          "thumbnail": "https://lh5.googleusercontent.com/p/AF1QipNq-8YRdAjiVW7uFMWDzHarqoK2Pr7bxIqI7t8A=w86-h114-k-no"
      },
  ......

Conclusion:

In this tutorial, we learned to scrape Google Maps Results using Node JS. Feel free to message me if I missed something. Follow me on Twitter. Thanks for reading!

Frequently Asked Questions

What is Google Maps scraping?

Google Maps scraping is the process of extracting details like business names, addresses, phone numbers, and reviews from the listings displayed for a given query on Google Maps.

Additional Resources

  1. Scrape Google Maps Reviews
  2. Scrape Google Maps Places Results
  3. Web Scraping With Node JS
  4. Scrape Yelp Reviews