Skip to main content

Characterize cities with embeddings of Overture Place categories

Β· 3 min read
Maribel Hernandez
Computational Genomics Researcher @ CINVESTAV

TL;DR Maribel Hernandez shows how to create clusters of business categories using Overture Places data.

Maribel Hernandez is a computer scientist and researcher at CINVESTAV, a multidisciplinary academic institution in Mexico. She specializes in graph theory in the field of computational genomics and complex networks. In this blog post, Maribel shows how she characterizes cities based on the distribution of businesses by rolling-up business categories by H3.

As someone who works with urban networks, Maribel's focus is on exploring how cities function and how their design impacts inclusivity. The core question driving this analysis is: Does the city have an even distribution of business services? Or are there shortages such as food deserts or unequal access to health facilities?

info

To follow along, you can clone and run the associated:

Introduction​

Consider these scenarios:

  • How connected are areas dominated by upper-class establishments to the broader city fabric?
  • Are health services distributed with equal access from low-income neighborhoods?
  • Do certain metropolis have better accessibility and service availability compared to others?

Take, for example, a comparison between 3 key cities in Mexico: Mexico City, LeΓ³n, and MΓ©rida. Are neighborhoods in these cities equally served by essential services such as healthcare, food, and transportation? Could some areas be considered food deserts, while others enjoy easy access to all services?

Creating Embeddings of Overture Places​

To answer these questions, I use a method that can scale across cities and ensure consistency. By grouping business categories in Overture Places at the H3 grid level, we can analyze the distribution of services and their accessibility. This approach provides a quantifiable way to compare cities globally, shedding light on the urban inclusivity of different regions. Workflow

The UDF I created follows this workflow:

  1. Load Overture Places records for a given H3
  2. Create business category column
  3. Create an embedding to represent business categories
  4. Run UDF for all H3 spanning an area of interest (polyfill)
  5. Calculate Kmeans clustering based on cosine similarity.
  6. Describe each cluster
File

Preview of UDF on Workbench.

Descriptive analysis​

When analyzing the urban fabric of cities like Mexico City, Leon, and Merida, we can obsere distinct patterns of service distribution that manifest as rings around the city center.

File

Mexico City.

a. The heart of the city​

Especially in Leon and Merida, the central area tends to form a cohesive, dense cluster of services. The core zone houses a variety of businesses including health services, retail, and food outlets, which are generally well-connected and easily accessible. Intuitively, the central cluster can be thought of as the β€œheart” of the city, serving as the primary hub for commerce, social interaction, and access to essential services.

File

Leon.

b. Islands of Services​

Beyond the center, we notice smaller clusters of services scattered across the city, often in the form of β€œislands.” These islands represent pockets of neighborhoods that, while not part of the dense city center, offer an array of unique services. These can serve to highlight the phenomenon of emergent neighborhoods within a greater whole.

File

Merida.

c. Peripheral ring​

The periphery of these cities forms another distinct pattern. These outer regions also tend to cluster together in similar service categories, forming a ring that surrounds the central core. It’s possible services in these peripheral areas are often more limited in scope and may reflect a focus on residential and less commercial needs, or reflect lower-income neighborhoods with scarcer access to key services.

Conclusion​

This ring-like pattern of service distribution suggests a common trend where the core of the city is highly accessible, while the periphery often lacks the same diversity and depth of business services. The islands of services in between can be seen as attempts to bridge this gap, but they are not always as effective in meeting the needs of the population on the periphery.

These spatial patterns offer valuable insights into the inclusivity of urban networks, highlighting areas that may need attention to ensure strategic business placement and guarantee equitable access to essential services.

Future work​

  • Create Network: Create a bipartite network between place categories and H3 indices, using weights as counts for each category.
  • Tie in Demographic Data: Integrate INEGI sociodemographic data at the census block group level to understand how services align with population needs.
  • Access by Neighborhood: Use origin-destination (OD) movement networks to evaluate service accessibility by neighborhood of residence.
  • Segregation Analysis: Identify zones with low visit rates or difficult accessibility. These zones may represent areas of segregation or neglect in the urban network.