This tutorial shows how to scrape search engine results with SerpApi's API, using PHP and cURL.

Are you new to cURL? Learn the basics of making requests with cURL in PHP.

In this tutorial, we won't use SerpApi's PHP library; instead, we'll stick to a native implementation with cURL.

Here is the complete documentation of the Google Search API we're going to use: https://serpapi.com/search-api

Basic search tutorial in PHP

Step 1: Prepare cURL

Let's prepare our cURL code. Add the SerpApi endpoint as the URL value.

<?php

// Initialize cURL session
$ch = curl_init();

// Set cURL options
curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/search");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

Step 2: Prepare the parameters

Feel free to adjust these parameters to match the search engine and any other options you need.

// Set the data fields for GET request
$fields = array(
    'api_key' => 'YOUR_API_KEY',
    'engine' => 'google',
    'q' => 'Coffee',
    'location' => 'Austin, Texas, United States',
    'google_domain' => 'google.com',
    'gl' => 'us',
    'hl' => 'en'
);
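Before sending the request, it can help to see what query string an array like this turns into. The sketch below uses only a subset of the fields above and PHP's built-in http_build_query(); nothing in it calls SerpApi.

```php
<?php
// Illustrative subset of the Step 2 fields (no API key needed for this demo).
$fields = array(
    'engine' => 'google',
    'q' => 'Coffee',
    'location' => 'Austin, Texas, United States',
);

// http_build_query() URL-encodes each value and joins the pairs with "&".
$queryString = http_build_query($fields);

echo $queryString . PHP_EOL;
// engine=google&q=Coffee&location=Austin%2C+Texas%2C+United+States
```

Note how the commas and spaces in the location are encoded for you, so there is no need to call urlencode() on individual values.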

Step 3: Execute cURL
Let's execute the code above. Don't forget to close the connection at the end.

// Build the URL query string
$queryString = http_build_query($fields);
curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/search?" . $queryString);

// Execute cURL session
$response = curl_exec($ch);

// Check for errors
if(curl_errno($ch)) {
    echo 'Request Error:' . curl_error($ch);
}

// Close cURL session
curl_close($ch);

// Output the response
echo $response;

That's it. You should see the result when you serve the script, for example with PHP's built-in web server: php -S localhost:8000 (the exact command may depend on your OS).
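The raw response is a JSON string, so in practice you will probably want to decode it rather than echo it. Here is a minimal sketch of that step. The sample JSON is made up and heavily trimmed, and extractTitles() is a hypothetical helper name, not something SerpApi provides.

```php
<?php
// Made-up, heavily trimmed sample standing in for a real API response.
$response = '{"organic_results":[{"position":1,"title":"Coffee - Wikipedia"},'
    . '{"position":2,"title":"Example Coffee Roasters"}]}';

/**
 * Return the titles of all organic results in a JSON response string.
 * Returns an empty array if the JSON is invalid or has no organic results.
 */
function extractTitles(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data)
        || !isset($data['organic_results'])
        || !is_array($data['organic_results'])) {
        return [];
    }

    // Collect the "title" value of every result that has one
    return array_column($data['organic_results'], 'title');
}

foreach (extractTitles($response) as $title) {
    echo $title . PHP_EOL;
}
```

With a real request, you would pass the string returned by curl_exec() instead of the sample.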

Async search implementation in PHP

Sometimes, you need to perform a "bulk search" or search for multiple keywords at once. We can implement this using the async option.

We'll retrieve the results later through SerpApi's Search Archive API. Its basic endpoint looks like this:

https://serpapi.com/searches/$searchID.json?api_key=SECRET_API_KEY
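If you prefer to build this URL in one place, a small helper along these lines could work. buildArchiveUrl() is a hypothetical name for this sketch, and 'abc123' is a made-up search ID.

```php
<?php
/**
 * Build the Search Archive API URL for one search ID.
 * Hypothetical helper for illustration; not part of SerpApi.
 */
function buildArchiveUrl(string $searchId, string $apiKey): string
{
    // rawurlencode() keeps an unexpected character in the ID from breaking the path
    return 'https://serpapi.com/searches/' . rawurlencode($searchId) . '.json?'
        . http_build_query(array('api_key' => $apiKey));
}

echo buildArchiveUrl('abc123', 'YOUR_API_KEY') . PHP_EOL;
// https://serpapi.com/searches/abc123.json?api_key=YOUR_API_KEY
```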

Here is a basic example:
- Add an array ($searchQueue) that stores each search's temporary data, keyed by its search ID
- Set async to true

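// Assumes $ch is an open cURL handle set up as in Step 1 (re-create it if you closed it in Step 3)
$API_KEY = 'YOUR_API_KEY';
$searchQueue = [];
$items = ['green tea', 'black tea', 'oolong tea'];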

// Batch search in async mode
foreach ($items as $item) {
    // Set the data fields for GET request
    $fields = array(
        'api_key' => $API_KEY,
        'engine' => 'google',
        'q' => $item,
        'location' => 'Austin, Texas, United States',
        'google_domain' => 'google.com',
        'gl' => 'us',
        'hl' => 'en',
        'async' => 'true' // pass the string: http_build_query() encodes boolean true as 1
    );

    // Build the URL query string
    $queryString = http_build_query($fields);
    curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/search?" . $queryString);

    // Execute cURL session
    $response = curl_exec($ch);

    // Check for errors and skip this keyword if the request failed
    if(curl_errno($ch)) {
        echo 'Request Error:' . curl_error($ch);
        continue;
    }
    
    // save the search id for later retrieval
    $searchResult = json_decode($response, true);
    $searchQueue[$searchResult['search_metadata']['id']] = $searchResult;
}

In the example above, only the keyword is dynamic. Feel free to adjust the other parameters for your use case.
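The submit step relies on the response containing search_metadata.id. If you want to guard against a failed request or an unexpected payload, a check along these lines is one option. extractSearchId() is a hypothetical helper, and the sample payloads are made up.

```php
<?php
/**
 * Return the search ID from an async submit response, or null if it is missing.
 * Hypothetical helper for this sketch.
 *
 * @param string|false $response Raw return value of curl_exec()
 */
function extractSearchId($response): ?string
{
    if (!is_string($response)) {
        return null; // curl_exec() returned false: the request failed
    }

    $data = json_decode($response, true);
    $id = is_array($data) ? ($data['search_metadata']['id'] ?? null) : null;

    return is_string($id) ? $id : null;
}

// Made-up sample payloads for illustration
var_dump(extractSearchId('{"search_metadata":{"id":"abc123"}}'));
var_dump(extractSearchId(false));
```

Inside the loop, you could then skip any keyword for which the helper returns null instead of adding it to $searchQueue.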

So far, we've only stored the temporary search information in the $searchQueue array. Now, how do we retrieve the results?

Retrieve the async result

The idea is:
- Loop until the queue is empty (since a search can take some time to finish).
- Keep checking whether each search's status has been updated to Success (or Cached).
- If so, remove that ID from the queue (so the loop eventually ends).
- Otherwise, keep checking the search status.

while (!empty($searchQueue)) {
    foreach($searchQueue as $search) {
        $searchID = $search['search_metadata']['id'];
        $fields = array(
            'api_key' => $API_KEY
        );

        // Access archive API endpoint
        $queryString = http_build_query($fields);
        curl_setopt($ch, CURLOPT_URL, "https://serpapi.com/searches/$searchID.json?" . $queryString);
        $response = curl_exec($ch);
        
        $search = json_decode($response, true);
        echo $search['search_metadata']['status'];
        $status = $search['search_metadata']['status'] ?? null;
        if ($status == 'Success' || $status == 'Cached') {
            // Remove the search from the queue based on the search id
            unset($searchQueue[$searchID]);
            echo $searchID . " is done";
            echo "<br>";
            echo "First result title : " . ($search['organic_results'][0]['title'] ?? 'n/a');
            echo "<br>";
        } elseif ($status == 'Error') {
            // Drop failed searches too, so the loop cannot run forever
            unset($searchQueue[$searchID]);
            echo $searchID . " failed";
            echo "<br>";
        } else {
            echo $searchID . " is not done yet";
            echo "<br>";
        }
    }

    // Pause briefly between rounds instead of polling the API in a tight loop
    sleep(1);
}

echo "All searches are done";
// Close cURL session
curl_close($ch);
?>

When a search's status is Success (or Cached), I print the first organic result's title. Feel free to do anything else inside that condition block, since the data is already available there.
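If you would rather not rely on a loop that could run indefinitely, one variation is to cap the number of polling rounds. The sketch below is an assumption-laden illustration, not SerpApi's recommended approach. pollSearches() is a hypothetical helper, and the fetcher is passed in as a callable so the example runs without any network access. The fake fetcher's "Processing" status is just a placeholder for "not finished yet".

```php
<?php
/**
 * Poll until every queued search ID is finished or $maxRounds is reached.
 *
 * $fetchSearch is any callable that takes a search ID and returns the decoded
 * archive response as an array (for example, a wrapper around the cURL code).
 * Returns [finished results keyed by ID, IDs still pending].
 */
function pollSearches(array $searchIds, callable $fetchSearch, int $maxRounds = 30, int $delaySeconds = 2): array
{
    $pending = array_values($searchIds);
    $done = [];

    for ($round = 0; $round < $maxRounds && !empty($pending); $round++) {
        if ($round > 0) {
            sleep($delaySeconds); // wait between rounds, not before the first one
        }

        $stillPending = [];
        foreach ($pending as $id) {
            $search = $fetchSearch($id);
            $status = $search['search_metadata']['status'] ?? '';
            if ($status === 'Success' || $status === 'Cached') {
                $done[$id] = $search;
            } else {
                $stillPending[] = $id;
            }
        }
        $pending = $stillPending;
    }

    return [$done, $pending];
}

// Usage with a fake fetcher that is unfinished on the first call, then "Success".
$calls = [];
$fakeFetch = function (string $id) use (&$calls): array {
    $calls[$id] = ($calls[$id] ?? 0) + 1;
    $status = $calls[$id] > 1 ? 'Success' : 'Processing';
    return ['search_metadata' => ['id' => $id, 'status' => $status]];
};

list($done, $pending) = pollSearches(['a1', 'b2'], $fakeFetch, 5, 0);
echo count($done) . ' done, ' . count($pending) . ' pending' . PHP_EOL;
// 2 done, 0 pending
```

Anything left in the second returned array did not finish within the allowed rounds, so you can decide whether to retry or report it.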

That's it! Feel free to try!