Reverse Engineering Google Streetview

Created Thursday, Apr 13, 2023

Motivation

There are two websites which have always led to hours of unplanned exploration for me: Wikipedia and Google Maps. Wikipedia gave me the historical facts and Maps gave me the context to understand them in terms of geography. Part of the reason Maps is so enticing is that I can explore a place not just from the air but also from the ground, through Streetview. One nagging gripe I had was how slow it was to navigate through photo spheres in an entire region, along with the inability to download those photo spheres for later viewing. From experience with the Mario Maker 2 API I know how to reverse engineer clients and get direct access to their data. Using that experience I aimed to make a custom client for Google Streetview that:

  1. Lets me download photo spheres in their native format (equirectangular images)
  2. Lets me obtain a list of photo spheres near a location
  3. Lets me navigate with arrow keys between all available photo spheres in a region with no animations
  4. Uses fewer resources than Google Maps

Starting Down, Going Up

My strategy for reverse engineering clients is to start at the activation (here, opening a Streetview photo sphere) and work upward, recursively identifying every call I depend on for parameters. The first call I notice is the following:

https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i4!3sCAEIBAgFCAYgAQ!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1sCqW8ZMSZH8m2qtsP4s2swAM!7e81!15i11021!9m2!2d5.120313975838092!3d52.09013020176323!10d45.12133303837374

With truncated output:

)]}'
[[["bl3v0-ol5SonuMF_4ozgxQ",0,1,"8",null,520.83167,["https://streetviewpixels-pa.googleapis.com/v1/thumbnail?panoid\u003dbl3v0-ol5SonuMF_4ozgxQ\u0026cb_client\u003dmaps_sv.tactile.gps\u0026w\u003d203\u0026h\u003d100\u0026yaw\u003d80.1311\u0026pitch\u003d0\u0026thumbfov\u003d100","8",null,[203,100]],null,[[3,5.120198628668783,52.090117872306024],[80.1311,90],[1024,768],75],
...

Followed by a number of calls similar to the below:

https://streetviewpixels-pa.googleapis.com/v1/tile?cb_client=maps_sv.tactile&panoid=bl3v0-ol5SonuMF_4ozgxQ&x=0&y=0&zoom=2&nbt=1&fover=2

Which each returned a small portion of the photo sphere as a JPEG.

I know I opened a photo sphere near latitude 52.09, longitude 5.12 so that informs my first observations:

  • The pb argument encodes most of the data of the call and, as in many other Google projects, probably stands for protobuf (Protocol Buffers)
  • ! indicates the start of an argument
  • Integers follow e, m and i so they likely indicate integers
  • b is only ever followed by 0 and 1 so it is likely a boolean
  • s is often followed by alphabetical characters so is likely a string
  • d likely indicates floating point numbers
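The observations above can be turned into a small tokenizer for the pb string. This is only a sketch: the meaning given to each type letter is the guess from the list above, not a documented format, and the names parse_pb, CONVERTERS and TOKEN are made up for illustration.

```python
import re

# Type letters as guessed in the observations above. These meanings are
# assumptions drawn from a single request, not a documented format.
CONVERTERS = {
    "i": int,                 # integer
    "e": int,                 # integer
    "m": int,                 # integer
    "b": lambda v: v == "1",  # boolean, only ever 0 or 1
    "d": float,               # floating point number
    "s": str,                 # string
}

# One token is: field number, one type letter, then the raw value.
TOKEN = re.compile(r"(\d+)([a-z])(.*)", re.DOTALL)


def parse_pb(pb):
    """Split a pb string into (field number, type letter, value) tuples."""
    fields = []
    for token in pb.split("!"):
        if not token:
            continue  # the string starts with "!", so the first piece is empty
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError("unrecognised token: %r" % token)
        number, kind, raw = match.groups()
        convert = CONVERTERS.get(kind, str)  # unknown letters stay as text
        fields.append((int(number), kind, convert(raw)))
    return fields


for field in parse_pb("!1e3!9m2!2d5.120313975838092!3d52.09013020176323"):
    print(field)
```

Printing the tokens one per line makes it much easier to spot which values change between two captured requests.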

With this info I can already guess some arguments:

https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i4!3sCAEIBAgFCAYgAQ!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1sCqW8ZMSZH8m2qtsP4s2swAM!7e81!15i11021!9m2!2d
[longitude]
!3d
[latitude]
!10d45.12133303837374

Some arguments contain enough information, especially strings, that they deserve additional attention:

https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i4!3s
CAEIBAgFCAYgAQ
!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1s
CqW8ZMSZH8m2qtsP4s2swAM
!7e81!15i11021!9m2!2d
[longitude]
!3d
[latitude]
!10d
45.12133303837374

If I change either string argument the response does not resemble the one above, so I assume these identify my client or authenticate me. Before I continue I need to find where these strings are obtained. I don't find the first string in the response of any other request (I later prove this is a constant I can hardcode in my client), so I move on to the next string. Curiously, this string is contained within the initial HTML in a JavaScript constant:

...
null,null,1,null,["en","us","United States"],null,0,"CqW8ZMSZH8m2qtsP4s2swAM",null,null
...
null,1],null,0],[\"CqW8ZMSZH8m2qtsP4s2swAM\",null,null,null
...

The Google Maps homepage is, predictably, not authenticated so I can download the HTML and extract my key from one of these locations. The second location has adjacent strings that seem too generic so I go with the first. I create a naïve pattern I can assume always holds true:

..."],null,0,"<key>"...

Which becomes the regex "],null,0,"[^"]+". With testing this proves reliable.
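As a sketch of how that pattern behaves, the snippet below runs it against the first HTML fragment quoted above instead of a live page. The capture group and the name extract_client_id are my additions for illustration; the pattern itself is the one just described.

```python
import re

# Same pattern as described above, with a capture group around the key.
# The "]" is escaped only for readability; it is a literal either way.
KEY_PATTERN = re.compile(r'"\],null,0,"([^"]+)"')


def extract_client_id(html):
    """Return the key that follows '"],null,0,"' in the page source."""
    match = KEY_PATTERN.search(html)
    if match is None:
        # Fail loudly: the pattern is naive and the page layout may change.
        raise ValueError("client key not found")
    return match.group(1)


# Fragment copied from the HTML excerpt above.
sample = 'null,null,1,null,["en","us","United States"],null,0,"CqW8ZMSZH8m2qtsP4s2swAM",null,null'
print(extract_client_id(sample))
```

Using a capture group avoids having to slice a fixed number of characters off the match by hand.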

Returning to the photo sphere image, it is quickly apparent that bl3v0-ol5SonuMF_4ozgxQ is shared between the response of photo?authuser and the arguments of tile?cb_client (quick identifiers for endpoints are a necessity while reverse engineering). The response of photo?authuser resembles JSON, so I assume the first 4 characters, )]}', are a guard prefix meant to stop another site from loading the response directly as a script. Stripping those characters reliably yields valid JSON. You know how they say "If it walks like a duck, talks like a duck, it probably is a duck". Whatever bl3v0-ol5SonuMF_4ozgxQ is, it can be obtained from response[0][0][0].
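A minimal sketch of that stripping step, run on a shortened copy of the response shown earlier. The helper name parse_guarded_json is hypothetical, and checking for the prefix before removing it is my own precaution.

```python
import json

GUARD_PREFIX = ")]}'"  # the 4 characters in front of the JSON body


def parse_guarded_json(text):
    """Drop the guard prefix, if present, and parse the rest as JSON."""
    if text.startswith(GUARD_PREFIX):
        text = text[len(GUARD_PREFIX):]
    # json.loads tolerates the newline left behind after the prefix
    return json.loads(text)


# Shortened copy of the photo?authuser response shown earlier.
sample = ')]}\'\n[[["bl3v0-ol5SonuMF_4ozgxQ",0,1,"8"]]]'
document = parse_guarded_json(sample)
print(document[0][0][0])
```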

Our simplified client looks a bit like this now:

import requests
import re
import json
import io

MAPS_PREVIEW_ID = "CAEIBAgFCAYgAQ"
UNKNOWN_PREVIEW_CONSTANT = 45.12133303837374
latitude = 52.09
longitude = 5.12

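# The Maps homepage needs no authentication; the key sits in a JavaScript constant.
# The match starts with the 11 characters "],null,0," and ends with a quote, hence [11:-1].
client_id = re.search(
    '"],null,0,"[^"]+"', requests.get(url="https://www.google.com/maps").text
).group()[11:-1]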

preview_document = json.loads(
    requests.get(
        url="https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i4!3s%s!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1s%s!7e81!15i11021!9m2!2d%f!3d%f!10d%f"
        % (MAPS_PREVIEW_ID, client_id, longitude, latitude, UNKNOWN_PREVIEW_CONSTANT)
    ).text[4:]
)

chunk = io.BytesIO(
    requests.get(
        url="https://streetviewpixels-pa.googleapis.com/v1/tile?cb_client=maps_sv.tactile&panoid=%s&x=0&y=0&zoom=2&nbt=1&fover=2"
        % (preview_document[0][0][0])
    ).content
)

print(chunk)

Stitching a Photo Sphere

Next, I need to extract a usable photo sphere from tile?cb_client. Let's start with the first two arguments, x and y. I iterate through every combination of those two values, starting from 0, ignoring requests that return:

{
  "error": {
    "code": 400,
    "message": "Request contains an invalid argument.",
    "status": "INVALID_ARGUMENT"
  }
}

The following combinations work:

  • x = 0, y = 0
  • x = 0, y = 1
  • x = 1, y = 0
  • x = 1, y = 1
  • x = 2, y = 0
  • x = 2, y = 1
  • x = 3, y = 0
  • x = 3, y = 1

The edges of each image, and the popular usage of x and y as coordinates, suggest the intended way to stitch these images is:

[Figure: demonstration of the stitched tiles]

It is quickly apparent that Google Streetview stores photo spheres as equirectangular images. The black bar at the bottom of the image also makes it clear that photo?authuser alone is not sufficient for creating an equirectangular image: I need to crop the stitched tiles, and an additional call might be necessary to get info like the dimensions of the image.

By analyzing the network traffic again I see a new call:

https://www.google.com/maps/photometa/v1?authuser=0&hl=en&gl=us&pb=!1m4!1smaps_sv.tactile!11m2!2m1!1b1!2m2!1sen!2sus!3m3!1m2!1e2!2sbl3v0-ol5SonuMF_4ozgxQ!4m57!1e1!1e2!1e3!1e4!1e5!1e6!1e8!1e12!2m1!1e1!4m1!1i48!5m1!1e1!5m1!1e2!6m1!1e1!6m1!1e2!9m36!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e1!2b0!3e3!1m3!1e4!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e3

With truncated output:

)]}'
[[],[[[1],[2,"bl3v0-ol5SonuMF_4ozgxQ"],[2,2,[6656,13312],[[[[208,416]],[[416,832]],[[832,1664]],[[1664,3328]],[[3328,6656]],[[6656,13312]]],[512,512]],null,null,null,null,null,"bl3v0-ol5SonuMF_4ozgxQ"],[null,null,[["8 Lijnmarkt","nl"],["Utrecht","en"]]],[[[["© 2023 Google"]]],[[["Google"],"//www.google.com/streetview","//lh5.googleusercontent.com/-B45RREcFXCo/AAAAAAAAAAI/AAAAAAAAAAA/Z6DH8qvKlJ0/s48-p-k-no-mo/p.jpg"]]],[[[1],[[null,null,52.09010892045943,5.120165714557119],[9.318060874938965],[162.0236968994141,79.269287109375,5.573602676391602],null,"NL"],null,[[[[2,"bl3v0-ol5SonuMF_4ozgxQ"],null,[[null,null,52.09010892045943,5.120165714557119],[9.318060874938965],[162.0236968994141,79.269287109375,5.573602676391602]]],
...

The height of the equirectangular image can readily be determined by treating the top of the black bar as the bottom of the image: 832 pixels. The width is harder to find but, by finding the point at which the right side wraps around to the left, I can say it is 1664 pixels. Neither of those numbers is immediately apparent in my output. However, with some experimentation I see 13312 / 1664 = 8 and 6656 / 832 = 8. Curiously, the pair [6656, 13312] defies my expectation that width comes before height. Can I use this relationship to find an equation?

Looking at tile?cb_client again may give me a clue. There is another argument, zoom, which often correlates with dimensions. By varying this argument I get the following dimensions:

  • 1: width = 832, height = 416
  • 2: width = 1664, height = 832
  • 3: width = 3328, height = 1664
  • 4: width = 6656, height = 3328
  • 5: width = 13312, height = 6656

It is clear that each width and height is twice the value at the previous zoom level, and that width is always twice the height (which makes sense, as this is an equirectangular image). The equation I need likely involves powers of 2. Because zoom 5 yields the full dimensions listed in photometa, 6656 high by 13312 wide, and both values halve with each step down in zoom, the (width, height) equation must be (6656 / 2^(4 - zoom), 6656 / 2^(5 - zoom)), or (photometa[1][0][2][2][0] / 2^(4 - zoom), photometa[1][0][2][2][0] / 2^(5 - zoom)). Since every tile downloaded thus far has been 512 × 512, I can also say the number of tiles that need to be downloaded in each direction is (⌈width / 512⌉, ⌈height / 512⌉). The ceiling is necessary because I need to download an entire tile even if only part of it is in use.
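As a quick check of that equation, the sketch below evaluates it for zoom 1 through 5 with the base value 6656 from this photo sphere. The function names are made up for illustration, and whether the same equation holds for photo spheres with a different base size is an assumption I have not verified here.

```python
import math

TILE_SIZE = 512  # every tile seen so far is 512 x 512


def sphere_dimensions(base, zoom):
    """Width and height in pixels; `base` is photometa[1][0][2][2][0]."""
    width = int(base / 2 ** (4 - zoom))
    height = int(base / 2 ** (5 - zoom))
    return width, height


def tile_grid(width, height, tile_size=TILE_SIZE):
    """Tiles needed in each direction; partial tiles count as whole ones."""
    return math.ceil(width / tile_size), math.ceil(height / tile_size)


for zoom in range(1, 6):
    width, height = sphere_dimensions(6656, zoom)
    print(zoom, (width, height), tile_grid(width, height))
```

At zoom 2 this gives a 4 by 2 grid, which matches the eight tiles found by probing x and y earlier.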

We can now update our simplified client:

import requests
from PIL import Image
import re
import json
import io
import math

MAPS_PREVIEW_ID = "CAEIBAgFCAYgAQ"
UNKNOWN_PREVIEW_CONSTANT = 45.12133303837374
latitude = 52.09
longitude = 5.12
zoom = 3

client_id = re.search(
    '"],null,0,"[^"]+"', requests.get(url="https://www.google.com/maps").text
).group()[11:-1]

preview_document = json.loads(
    requests.get(
        url="https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i4!3s%s!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1s%s!7e81!15i11021!9m2!2d%f!3d%f!10d%f"
        % (MAPS_PREVIEW_ID, client_id, longitude, latitude, UNKNOWN_PREVIEW_CONSTANT)
    ).text[4:]
)
sphere_id = preview_document[0][0][0]

photometa_document = json.loads(
    requests.get(
        url="https://www.google.com/maps/photometa/v1?authuser=0&hl=en&gl=us&pb=!1m4!1smaps_sv.tactile!11m2!2m1!1b1!2m2!1sen!2sus!3m3!1m2!1e2!2s%s!4m57!1e1!1e2!1e3!1e4!1e5!1e6!1e8!1e12!2m1!1e1!4m1!1i48!5m1!1e1!5m1!1e2!6m1!1e1!6m1!1e2!9m36!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e1!2b0!3e3!1m3!1e4!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e3"
        % (sphere_id)
    ).text[4:]
)

width = int(photometa_document[1][0][2][2][0] / pow(2, 4 - zoom))
height = int(photometa_document[1][0][2][2][0] / pow(2, 5 - zoom))
tiles_width = math.ceil(width / 512)
tiles_height = math.ceil(height / 512)

image_output = Image.new(mode="RGB", size=(width, height))
for x in range(tiles_width):
    for y in range(tiles_height):
        image_chunk = Image.open(
            io.BytesIO(
                requests.get(
                    url="https://streetviewpixels-pa.googleapis.com/v1/tile?cb_client=maps_sv.tactile&panoid=%s&x=%d&y=%d&zoom=%d&nbt=1&fover=2"
                    % (sphere_id, x, y, zoom)
                ).content
            )
        )
        image_output.paste(image_chunk, (x * 512, y * 512))

image_output.save("photo-sphere.png")

With that we've accomplished goal 1. Can I now get a list of photo spheres near a location?

Getting Multiple

Let's start by returning to the first call I made. Its payload, at 2.6 kilobytes, is much bigger than what I pasted here. By looking at some of the arrays more closely I see:

)]}'
[[["bl3v0-ol5SonuMF_4ozgxQ",0,1,"8",null,520.83167,["https://streetviewpixels-pa.googleapis.com/v1/thumbnail?panoid\u003dbl3v0-ol5SonuMF_4ozgxQ\u0026cb_client\u003dmaps_sv.tactile.gps\u0026w\u003d203\u0026h\u003d100\u0026yaw\u003d80.1311\u0026pitch\u003d0\u0026thumbfov\u003d100","8",null,[203,100]],null,[[3,5.120198628668783,52.090117872306024],[80.1311,90],[1024,768],75],
...],
["AF1QipNXGZRPf9ugNgRu_3GcS6X7w_PYglNxj925hL_v",10,11,"",null,325.98135,["https://lh5.googleusercontent.com/p/AF1QipNXGZRPf9ugNgRu_3GcS6X7w_PYglNxj925hL_v\u003dw203-h100-k-no-pi-0-ya92.25356-ro-0-fo100","Pakhuis Utrecht",[13312,6656],[203,100]],null,[[3,5.120289153851183,52.09020283194869],[168.14113,90],[13312,6656],75],"J6W8ZKrBDp6wqtsP8vOjmAs","0ahUKEwjq8Kf496OAAxUemGoFHfL5CLMQzCcIAygB",null,null,
...],
["AF1QipP0mzSqKcFbsxVxx5_eEXLatUromQAtOuNZPLkg",10,11,"",null,381.57483,["https://lh5.googleusercontent.com/p/AF1QipP0mzSqKcFbsxVxx5_eEXLatUromQAtOuNZPLkg\u003dw203-h100-k-no-pi-0-ya51.365257-ro-0-fo100","Pakhuis Utrecht",[13312,6656],[203,100]],null,[[3,5.120213537951163,52.090175620734485],[126.35281,90],[13312,6656],75],"J6W8ZKrBDp6wqtsP8vOjmAs","0ahUKEwjq8Kf496OAAxUemGoFHfL5CLMQzCcIBCgC",null,null,
...],
["I7WJc7PwpEWnC3qD8MXG5w",0,1,"Oil \u0026 Vinegar Utrecht",null,524.61926,["https://streetviewpixels-pa.googleapis.com/v1/thumbnail?panoid\u003dI7WJc7PwpEWnC3qD8MXG5w\u0026cb_client\u003dmaps_sv.tactile.gps\u0026w\u003d203\u0026h\u003d100\u0026yaw\u003d42.61835\u0026pitch\u003d0\u0026thumbfov\u003d100","Oil \u0026 Vinegar Utrecht",null,[203,100]],null,[[3,5.120228125350842,52.090072875185065],[42.61835,
...]],
...

Maybe there is not one sphere ID at preview_document[0][0][0] but actually multiple sphere IDs at preview_document[0][...][0].

I encounter a problem when I test this assumption: every sphere ID longer than 22 characters fails on the photometa request. For now I can ignore every ID longer than 22 characters but I will return to that problem. Ignoring those IDs fixes the issue and every photo sphere is downloaded.
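A sketch of that filter, applied to a cut-down preview document holding the four IDs shown above. The helper name streetview_ids is hypothetical, and the exact-length check is only a heuristic based on the IDs observed so far.

```python
STREETVIEW_ID_LENGTH = 22  # IDs of this length work with photometa


def streetview_ids(preview_document):
    """Return the IDs from preview_document[0][...][0] that are 22 characters long."""
    return [
        entry[0]
        for entry in preview_document[0]
        if len(entry[0]) == STREETVIEW_ID_LENGTH
    ]


# Cut-down version of the response above: only the ID of each entry is kept.
sample_document = [[
    ["bl3v0-ol5SonuMF_4ozgxQ"],
    ["AF1QipNXGZRPf9ugNgRu_3GcS6X7w_PYglNxj925hL_v"],
    ["AF1QipP0mzSqKcFbsxVxx5_eEXLatUromQAtOuNZPLkg"],
    ["I7WJc7PwpEWnC3qD8MXG5w"],
]]
print(streetview_ids(sample_document))
```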

Can I increase the number of sphere IDs returned? It is time to revisit photo?authuser. By searching for the number 4, the number of photo sphere IDs, in the request I find there is only one place where it is present:

https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i
4
!3s
[Maps preview ID]
!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1s
[client ID]
!7e81!15i11021!9m2!2d
[longitude]
!3d
[latitude]
!10d
45.12133303837374

With testing I find yes, increasing this number increases the number of photo spheres returned (within reason):

https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i
[number of photo spheres]
!3s
[Maps preview ID]
!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1s
[client ID]
!7e81!15i11021!9m2!2d
[longitude]
!3d
[latitude]
!10d
45.12133303837374

There is, however, metadata I haven't collected: the GPS location of each photo sphere. The photometa request contained information about the dimensions of the photo sphere. Could it also contain the location? As a reminder, the output of that call is:

)]}'
[[],[[[1],[2,"bl3v0-ol5SonuMF_4ozgxQ"],[2,2,[6656,13312],[[[[208,416]],[[416,832]],[[832,1664]],[[1664,3328]],[[3328,6656]],[[6656,13312]]],[512,512]],null,null,null,null,null,"bl3v0-ol5SonuMF_4ozgxQ"],[null,null,[["8 Lijnmarkt","nl"],["Utrecht","en"]]],[[[["© 2023 Google"]]],[[["Google"],"//www.google.com/streetview","//lh5.googleusercontent.com/-B45RREcFXCo/AAAAAAAAAAI/AAAAAAAAAAA/Z6DH8qvKlJ0/s48-p-k-no-mo/p.jpg"]]],[[[1],[[null,null,52.09010892045943,5.120165714557119],[9.318060874938965],[162.0236968994141,79.269287109375,5.573602676391602],null,"NL"],null,[[[[2,"bl3v0-ol5SonuMF_4ozgxQ"],null,[[null,null,52.09010892045943,5.120165714557119],[9.318060874938965],[162.0236968994141,79.269287109375,5.573602676391602]]],
...

I requested photo spheres from latitude 52.09 and longitude 5.12 and numbers very close to those appear in the output at, get ready for this, photometa_document[1][0][5][0][1][0][2] and photometa_document[1][0][5][0][1][0][3]. By crosschecking a few photo spheres with these values I can verify it is the location.
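Deep index chains like these are easy to get wrong, so here is a small sketch that applies them to a hand-built skeleton of the photometa response. The skeleton keeps only the nesting needed to reach the coordinates (everything else is dropped or replaced by None), and sphere_location is a hypothetical helper name.

```python
def sphere_location(photometa_document):
    """Return (latitude, longitude) from a parsed photometa response."""
    position = photometa_document[1][0][5][0][1][0]
    return position[2], position[3]


# Skeleton of the response above, reduced to the path that matters.
sample = [
    [],
    [
        [
            [1],
            [2, "bl3v0-ol5SonuMF_4ozgxQ"],
            [2, 2, [6656, 13312]],
            None,
            None,
            [[[1], [[None, None, 52.09010892045943, 5.120165714557119]]]],
        ]
    ],
]

latitude, longitude = sphere_location(sample)
print(latitude, longitude)
```

Naming the intermediate list once also means the path only has to be corrected in one place if the response layout changes.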

Knowing the locations makes a previously mysterious argument clearer: the large floating point number in photo?authuser. By increasing it I tend to get more spread out photo spheres. The new URL is as follows:

https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i
[number of photo spheres]
!3s
[Maps preview ID]
!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1s
[client ID]
!7e81!15i11021!9m2!2d
[longitude]
!3d
[latitude]
!10d
[range]

With both the number and the range of photo spheres under my control, I have accomplished goal 2. I take goal 2 even further later on, but for now our simplified client looks like this:

import requests
from PIL import Image
import re
import json
import io
import math

MAPS_PREVIEW_ID = "CAEIBAgFCAYgAQ"
r = 1000
latitude = 52.09
longitude = 5.12
zoom = 3
num_photo_spheres = 10

client_id = re.search(
    '"],null,0,"[^"]+"', requests.get(url="https://www.google.com/maps").text
).group()[11:-1]

preview_document = json.loads(
    requests.get(
        url="https://www.google.com/maps/preview/photo?authuser=0&hl=en&gl=us&pb=!1e3!5m54!2m2!1i203!2i100!3m3!2i%d!3s%s!5b1!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!8m0!9b0!11m1!4b1!6m3!1s%s!7e81!15i11021!9m2!2d%f!3d%f!10d%f"
        % (num_photo_spheres, MAPS_PREVIEW_ID, client_id, longitude, latitude, r)
    ).text[4:]
)

for panorama in preview_document[0]:
    panorama_id = panorama[0]

    if len(panorama_id) == 22:
        photometa_document = json.loads(
            requests.get(
                url="https://www.google.com/maps/photometa/v1?authuser=0&hl=en&gl=us&pb=!1m4!1smaps_sv.tactile!11m2!2m1!1b1!2m2!1sen!2sus!3m3!1m2!1e2!2s%s!4m57!1e1!1e2!1e3!1e4!1e5!1e6!1e8!1e12!2m1!1e1!4m1!1i48!5m1!1e1!5m1!1e2!6m1!1e1!6m1!1e2!9m36!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e1!2b0!3e3!1m3!1e4!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e3"
                % (panorama_id)
            ).text[4:]
        )

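        # Use separate names so the search coordinates above are not overwritten
        sphere_latitude = photometa_document[1][0][5][0][1][0][2]
        sphere_longitude = photometa_document[1][0][5][0][1][0][3]
        print(
            "Panorama %s Latitude %f Longitude %f"
            % (panorama_id, sphere_latitude, sphere_longitude)
        )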

        width = int(photometa_document[1][0][2][2][0] / pow(2, 4 - zoom))
        height = int(photometa_document[1][0][2][2][0] / pow(2, 5 - zoom))
        tiles_width = math.ceil(width / 512)
        tiles_height = math.ceil(height / 512)

        image_output = Image.new(mode="RGB", size=(width, height))
        for x in range(tiles_width):
            for y in range(tiles_height):
                image_chunk = Image.open(
                    io.BytesIO(
                        requests.get(
                            url="https://streetviewpixels-pa.googleapis.com/v1/tile?cb_client=maps_sv.tactile&panoid=%s&x=%d&y=%d&zoom=%d&nbt=1&fover=2"
                            % (panorama_id, x, y, zoom)
                        ).content
                    )
                )
                image_output.paste(image_chunk, (x * 512, y * 512))

        image_output.save("photo-sphere-%s.png" % (panorama_id))

Creating An Interactive Client

For this next part I need more performance than Python can provide, so I switch to C++.

After porting our existing code I need a way of rendering photo spheres with interactive controls. I go with Skia and OpenGL as I have experience with them in other projects. Skia has a shader language called SkSL, and I think it is a good fit for converting our equirectangular images into rectilinear (gnomonic) projections in real time. The inputs I need are:

  • Resolution of equirectangular image
  • Resolution of viewport
  • Viewing yaw
  • Viewing pitch
  • FOV (function of viewport dimensions and zoom)

The resulting shader looks like this:

uniform shader image;

uniform vec2 u_imageResolution;
uniform vec2 u_viewResolution;
uniform vec2 u_rotation; // yaw, pitch
uniform float u_fovH;
uniform float u_fovV;

const float PI = 3.14159265358979323846264;

vec3 rotateXY(vec3 p, vec2 angle) {
    vec2 c = cos(angle), s = sin(angle);
    p = vec3(p.x, c.x * p.y + s.x * p.z, -s.x * p.y + c.x * p.z);
    return vec3(c.y * p.x + s.y * p.z, p.y, -s.y * p.x + c.y * p.z);
}

float4 main(float2 fragCoord) {
    // Place 0,0 in center from -1 to 1 ndc
    vec2 uv = fragCoord * 2.0 / u_viewResolution - 1.0;

    // Spherical
    vec3 camDir = normalize(vec3(uv * vec2(tan(0.5 * u_fovH), tan(0.5 * u_fovV)), 1.0));

    // Rotate
    vec3 rd = normalize(rotateXY(camDir, u_rotation.yx));

    // Radial azimuth polar
    vec2 texCoord = vec2(atan(rd.z, rd.x) + PI, acos(-rd.y)) / vec2(2.0 * PI, PI);

    // Y is flipped, X is not
    vec2 imageCoord = vec2(texCoord.x, 1 - texCoord.y) * u_imageResolution;

    return image.eval(imageCoord);
}
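To make the shader's math easier to check by hand, here is a rough CPU port of the same steps for a single pixel, written in Python to match the earlier client code. It is a sketch rather than part of the actual client: the function names are mine, angles are assumed to be in radians, and the clamp before acos is an extra guard against rounding that the shader does not have.

```python
import math


def rotate_xy(p, angle):
    """Rotate about X by angle[0], then about Y by angle[1], as rotateXY does."""
    cx, cy = math.cos(angle[0]), math.cos(angle[1])
    sx, sy = math.sin(angle[0]), math.sin(angle[1])
    x, y, z = p
    y, z = cx * y + sx * z, -sx * y + cx * z
    return (cy * x + sy * z, y, -sy * x + cy * z)


def view_to_image(frag, view_res, image_res, yaw, pitch, fov_h, fov_v):
    """Map a viewport pixel to a pixel position in the equirectangular image."""
    # Place 0,0 in the center, from -1 to 1
    u = frag[0] * 2.0 / view_res[0] - 1.0
    v = frag[1] * 2.0 / view_res[1] - 1.0
    # Direction through the pixel, normalized onto the unit sphere
    cam = (u * math.tan(0.5 * fov_h), v * math.tan(0.5 * fov_v), 1.0)
    length = math.sqrt(sum(c * c for c in cam))
    cam = tuple(c / length for c in cam)
    # The shader passes u_rotation.yx, so pitch comes first here
    rx, ry, rz = rotate_xy(cam, (pitch, yaw))
    # Azimuth and polar angle, scaled to the 0..1 range
    tex_x = (math.atan2(rz, rx) + math.pi) / (2.0 * math.pi)
    tex_y = math.acos(max(-1.0, min(1.0, -ry))) / math.pi
    # Y is flipped, X is not
    return (tex_x * image_res[0], (1.0 - tex_y) * image_res[1])


center = view_to_image((400, 300), (800, 600), (1664, 832),
                       0.0, 0.0, math.radians(90), math.radians(60))
print(center)
```

With no rotation, the center of the viewport lands at three quarters of the image width and half its height, which gives a convenient reference point when debugging yaw offsets.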

Photo spheres also encode the photo spheres near them (these are the clickable arrows in Streetview). By integrating that discovery I made a usable interface for navigating photo spheres as quickly as possible, accomplishing goal 3. Writing this client in C++ also accomplishes goal 4 for me:

[Figure: Navigating photo spheres]

You can use the client I created here.

Recursive Streetview Scraping

By using nearby photo spheres I can recursively scrape photo spheres around a location. The steps are as follows:

  1. Call photo?authuser once for a starting list
  2. Sort the list of photo spheres ascending by proximity to a point
  3. Get the photo spheres adjacent to the first in the list. If all of them are duplicates, move on to the next in the list
  4. Add the adjacent photo spheres to the list
  5. Remove all duplicates
  6. Repeat steps 2 through 5 some number of times
  7. Download equirectangular + json or just json of every photo sphere in the list
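Steps 1 through 6 can be sketched as the loop below. The network calls are replaced by two hypothetical callbacks, adjacent_to and location_of, so the example runs on a tiny in-memory graph. The haversine distance and the expanded set (which avoids asking for the same photo sphere's neighbours twice) are my own choices and not necessarily what the real client does.

```python
import math


def distance_m(a, b):
    """Haversine distance in meters between two (latitude, longitude) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


def crawl(start_ids, adjacent_to, location_of, center, rounds):
    """Grow a list of photo sphere IDs outward from a starting list."""
    known = list(dict.fromkeys(start_ids))  # step 1, duplicates dropped
    seen = set(known)
    expanded = set()  # IDs whose neighbours were already requested
    for _ in range(rounds):  # step 6
        # Step 2: closest to the point of interest first
        known.sort(key=lambda pid: distance_m(location_of(pid), center))
        for pid in known:  # step 3
            if pid in expanded:
                continue
            expanded.add(pid)
            fresh = [n for n in dict.fromkeys(adjacent_to(pid)) if n not in seen]
            if fresh:  # steps 4 and 5: add only IDs not seen before
                known.extend(fresh)
                seen.update(fresh)
                break
            # Every neighbour was a duplicate: try the next ID in the list
    return known


# Tiny stand-in for the real data: three photo spheres in a row.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
places = {"a": (0.0, 0.0), "b": (0.0, 0.001), "c": (0.0, 0.002)}
print(crawl(["a"], graph.__getitem__, places.__getitem__, (0.0, 0.0), rounds=2))
```

In a real run, location_of would be backed by a cache of photometa responses so that sorting does not trigger a request per comparison.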

This process results in a map like the below:

[Figure: Map of photo spheres]

Next steps

  • Open source Geoguessr
  • Large scale photogrammetry (early tests here)
  • Webpage to remove need to use native application

Questions?

Use the Contact button to the side or join my Discord.