...

Scrape Google Search Results With Go


Go, a procedural programming language, was launched by three Google developers, Robert Griesemer, Rob Pike, and Ken Thompson, as an open-source programming language in 2009. It is a statically typed language with excellent support for concurrency, which makes it a useful tool for web scraping.

Go is designed to be simple to learn, and its built-in concurrency support helps make it a fast and robust language.

Web scraping, data scraping, or data extraction can be defined as the process of extracting a specific piece of data from websites. It can be done manually at a small scale, but the term specifically refers to an automated extraction of data using a scraping bot or a crawler.

In this tutorial, we’ll use Go to scrape Google Search Results. We will also discuss why Go is a good alternative to other languages for this task.

By the end of the article, you will be able to deal with the complex HTML structure of Google Search Results. You can also apply this knowledge to other web scraping tasks.

Why Go for scraping Google?

Go has recently gained great popularity in the web scraping community thanks to the following features:

Concurrency: Go has built-in support for concurrency through goroutines, lightweight threads managed by the Go runtime. This makes it possible to scrape multiple pages at once, increasing the speed and efficiency of the scraper.

Simple Syntax: Go is designed to be easy to learn and read, which keeps simple scraping tasks straightforward and minimizes complexity in the process.

Compiled Language: Go is a compiled language with an efficient garbage collector. This gives it fast performance, which helps reduce the latency of the scraper.
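To make the concurrency point concrete, here is a minimal sketch of fetching several pages at once with goroutines and a sync.WaitGroup. The fetchAll helper, the fake fetcher, and the example.com URLs are all assumptions made for illustration; a real scraper would pass a function that performs an HTTP request.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch once per URL, each call in its own goroutine, and
// returns the results in the same order as the input slice.
// fetch is a stand-in for a real HTTP request (an assumption of this sketch).
func fetchAll(urls []string, fetch func(string) string) []string {
	results := make([]string, len(urls))
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			results[i] = fetch(u) // each goroutine writes only its own slot
		}(i, u)
	}
	wg.Wait() // block until every goroutine has finished
	return results
}

func main() {
	// Hypothetical page URLs; example.com is a placeholder domain.
	urls := []string{
		"https://example.com/page/1",
		"https://example.com/page/2",
		"https://example.com/page/3",
	}
	// A fake fetcher so the sketch runs without network access.
	fake := func(u string) string { return "fetched " + u }

	for _, r := range fetchAll(urls, fake) {
		fmt.Println(r)
	}
}
```

Because every goroutine writes to a different index of the results slice, no mutex is needed here. When scraping a real site, you would normally also cap the number of requests running at the same time.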

Let’s start scraping Google Search Results With Go

In this section, we will focus on preparing a basic script to scrape the initial ten Google search results, including their title, description, link, etc.

Set-Up:

If you have not already installed Go, you can watch these videos for the installation.

  1. How to set up GO on Windows?
  2. How to set up GO on MacOS?

Requirements:

For scraping Google search results with Go, we will install one library:

  1. GoQuery: a library for Go that brings a set of features similar to jQuery and is used for parsing HTML.

You can install this library in your project folder by running the command below.

 go get github.com/PuerkitoBio/goquery

Process:

I assume that you have set up your Go project folder. We will begin by fetching the HTML from the target URL and then parse it using GoQuery to extract the required data.

This is the URL we are going to target: 

https://www.google.com/search?q=go+tutorial&gl=us&hl=en
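If you would rather build this URL from its parts than hard-code it, the standard net/url package can do the escaping. The sketch below assumes that q is the search query, gl the country code, and hl the interface language; buildSearchURL is a hypothetical helper name, not part of any library.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSearchURL assembles a Google search URL from a query, a country
// code (gl) and an interface language (hl). The function name is hypothetical.
func buildSearchURL(query, country, lang string) string {
	params := url.Values{}
	params.Set("q", query)
	params.Set("gl", country)
	params.Set("hl", lang)
	// Encode escapes the values (a space becomes "+") and sorts the
	// parameters by key.
	return "https://www.google.com/search?" + params.Encode()
}

func main() {
	fmt.Println(buildSearchURL("go tutorial", "us", "en"))
	// Prints: https://www.google.com/search?gl=us&hl=en&q=go+tutorial
}
```

The parameters come out in a different order than in the URL above because Encode sorts them by key; the order of query parameters does not change the request.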

So, let us start creating our scraper by importing the libraries we’ll use later.

package main

import (
 "fmt"
 "log"
 "net/http"

 "github.com/PuerkitoBio/goquery"
)

Then, we will define a function to get the data from Google Search results.

func getData() {
 url := "https://www.google.com/search?q=go+tutorial&gl=us&hl=en"
 req, err := http.NewRequest("GET", url, nil)
 if err != nil {
  log.Fatal(err)
 }

 req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36")

After initializing the URL, we create a request object using http.NewRequest(), which takes three parameters: the request method, the target URL, and the request body (nil in our case). If an error occurs while creating the object, log.Fatal() prints the error to the terminal and exits the program.

We also set the User-Agent header so that our scraping bot can mimic an organic user.

 client := &http.Client{}
 res, err := client.Do(req)
 if err != nil {
  log.Fatal(err)
 }
 defer res.Body.Close()

 doc, err := goquery.NewDocumentFromReader(res.Body)
 if err != nil {
  log.Fatal(err)
 }

Step-by-step explanation:

  1. In the first line, we created an HTTP client object and then called client.Do() to send the request in the req object to the server.
  2. Then, we used defer to close the response body once the function returns.
  3. Finally, we called goquery.NewDocumentFromReader() with res.Body as a parameter to parse the HTML into a document we can query.
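The steps above do not check the HTTP status code, so a blocked request would be parsed like a normal page. The sketch below shows one possible way to add that check, along with a client timeout. It is not the article's scraper: fetch and newEchoServer are hypothetical helpers, the User-Agent string is a placeholder, and the request goes to a local test server from net/http/httptest so nothing is sent to Google.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch requests url with the given User-Agent and returns the body.
// Any non-200 status is returned as an error instead of being parsed.
func fetch(client *http.Client, url, userAgent string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("User-Agent", userAgent)

	res, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer res.Body.Close()

	if res.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", res.Status)
	}
	body, err := io.ReadAll(res.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// newEchoServer starts a local test server that echoes the User-Agent it
// received and answers 429 on the /blocked path (both made up for this sketch).
func newEchoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/blocked" {
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		fmt.Fprint(w, r.UserAgent())
	}))
}

func main() {
	srv := newEchoServer()
	defer srv.Close()

	// A timeout keeps the scraper from hanging on a stalled connection.
	client := &http.Client{Timeout: 10 * time.Second}

	body, err := fetch(client, srv.URL, "my-scraper/1.0") // placeholder User-Agent
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("server saw User-Agent:", body)

	_, err = fetch(client, srv.URL+"/blocked", "my-scraper/1.0")
	fmt.Println("blocked request:", err)
}
```

Returning an error from fetch rather than calling log.Fatal() inside it leaves the caller free to retry, skip the page, or stop.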

So, we have completed our scraping part of this program. Let us now move to the parsing part by searching for the required elements from the HTML.

Inspecting Google Search Results

If you inspect the HTML, you will find that every organic result sits inside a div with the class “g”.

So, looping over these div.g elements will help us get the data each one holds.

 c := 0
 doc.Find("div.g").Each(func(i int, result *goquery.Selection) {

That c variable is for displaying the position of the result.

Then, we will find the selectors for the title, link, and description in the HTML.

Finding tags for the Required Elements

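If you look inside the div.g container, you will find that the title is in an h3 element, the link is matched by the selector .yuRUbf > a, and the description by .VwiC3b. The code below takes a simpler route for the link and reads the first a element inside the container.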

This makes our parser look like this:

  title := result.Find("h3").First().Text()
  link, _ := result.Find("a").First().Attr("href")
  snippet := result.Find(".VwiC3b").First().Text()

  fmt.Printf("Title: %s\n", title)
  fmt.Printf("Link: %s\n", link)
  fmt.Printf("Snippet: %s\n", snippet)
  fmt.Printf("Position: %d\n", c+1)
  fmt.Println()

  c++
 })
}

Then, we will call the getData() function to execute our scraper.

func main() {
 getData()
}

Run this code in your terminal. You will get results like this:

Title: Tutorial: Get started with Go
Link: https://go.dev/doc/tutorial/getting-started
Snippet: In this tutorial, you'll get a brief introduction to Go programming. Along the way, you will: Install Go (if you haven't already). Write some simple "Hello, ...
Position: 1

Title: Go Tutorial
Link: https://www.w3schools.com/go/
Snippet: Well organized and easy to understand Web building tutorials with lots of examples of how to use HTML, CSS, JavaScript, SQL, PHP, Python, Bootstrap, ...
Position: 2


Title: Go Tutorial
Link: https://www.tutorialspoint.com/go/index.htm
Snippet: This tutorial is designed for software programmers with a need to understand the Go programming language from scratch. This tutorial will give you enough ...
Position: 4

Congratulations🎉🎉!!! You have successfully created a scraper to extract Google Search Results. 

However, this solution can get your IP blocked if you use it to scrape large amounts of data from Google. Instead, you can use a Google Scraper API, which uses a large pool of residential and data center proxies to bypass the anti-scraping mechanisms implemented by Google.

Using Google Search API to Scrape Search Results

Serpdog provides an easy and streamlined API solution to scrape Google Search Results using its Google SERP API. It manages proxies and CAPTCHAs for a smooth scraping experience, and it returns not only organic results but also many other features found in Google Search Results, including the knowledge graph, People Also Ask (PAA), ads, and much more.

Serpdog: Google Search API

You will also receive 1000 free requests upon signing up.

After registering, you will get an API Key to start using our service. Replace APIKEY in the code below with your key, and you will be able to scrape Google Search Results at high speed.

package main

import (
 "fmt"
 "io"
 "net/http"
)

func main() {
 // Replace APIKEY with your own API key.
 url := "https://api.serpdog.io/search?api_key=APIKEY&q=go+lang+tutorial&gl=us"

 client := &http.Client{}
 req, err := http.NewRequest("GET", url, nil)
 if err != nil {
  fmt.Println(err)
  return
 }

 req.Header.Set("Content-Type", "application/json")

 res, err := client.Do(req)
 if err != nil {
  fmt.Println(err)
  return
 }
 defer res.Body.Close()

 // io.ReadAll replaces ioutil.ReadAll, which is deprecated since Go 1.16.
 body, err := io.ReadAll(res.Body)
 if err != nil {
  fmt.Println(err)
  return
 }

 // Print the raw response body.
 fmt.Println(string(body))
}
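Instead of printing the raw body, you will usually want to decode it. The article does not document the response schema, so the sketch below makes no assumption about it: it assumes only that the body is a JSON object, decodes it into a generic map, and lists the top-level keys. topLevelKeys is a hypothetical helper, and the sample JSON is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"sort"
)

// topLevelKeys decodes a JSON object and returns its top-level keys, sorted.
// It makes no assumption about the API's response schema.
func topLevelKeys(body []byte) ([]string, error) {
	var data map[string]interface{}
	if err := json.Unmarshal(body, &data); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// Made-up JSON for illustration; the real response fields may differ.
	sample := []byte(`{"results": [{"title": "Example"}], "meta": {"query": "go"}}`)

	keys, err := topLevelKeys(sample)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(keys)
}
```

Once you have seen which keys a real response contains, you can replace the generic map with a struct that declares only the fields you need.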

Conclusion:

In a nutshell, Go is a strong choice for web scraping and can handle large-scale scraping projects, including scraping Google, with great efficiency.

This tutorial taught us to scrape Google Search Results using Go. Feel free to message me if you need clarification on anything. Follow me on Twitter. Thanks for reading!
