zack3d 30f6a76891 feat(db): add tile-region mapping datasets
Add regions.csv and tile_region_mapping.csv to provide region metadata
and tile-to-region relationships. Update server initialization and auth
routes to integrate the new data, and expand .env.example with related
configuration variables.

This makes region information queryable and prepares scripts for
seeding/import.
2025-10-02 23:50:06 -07:00

Region Data Scraping Guide

This guide explains how to scrape region data from wplace.live and import it into your database.

Prerequisites

  • Python 3.7+ with the cloudscraper and pysocks libraries
  • Node.js with tsx and csv-parse packages

Install dependencies:

# Python - required to bypass Cloudflare and use SOCKS5 proxies
pip install cloudscraper pysocks

# Node.js - only needed if csv-parse and tsx are not already installed as dev dependencies
npm install -D csv-parse tsx

Steps

1. Scrape Region Data

Run the Python scraper to fetch region data from wplace.live:

python scripts/scrape_regions.py

This will create two CSV files:

  • regions.csv - Unique regions (id, name, cityId, countryId, etc.)
  • tile_region_mapping.csv - Tile-to-region mappings

How it works:

  • Regions are assigned per tile, not per pixel, so the script samples one pixel per tile
  • This is much faster than the old approach: ~4.2 million requests (2048 × 2048 tiles) instead of 42 million
  • Default: samples every tile (TILE_SAMPLE_STEP = 1)

Configuration options in scrape_regions.py:

  • TILE_SAMPLE_STEP - Sample every Nth tile (1 = all tiles, 2 = every other tile, etc.)
  • TILE_X_MIN/MAX, TILE_Y_MIN/MAX - Canvas bounds to scrape (default: 0-2047)
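
The sampling described above can be sketched as follows. This is a minimal illustration, not the actual contents of scrape_regions.py: the helper names iter_sampled_tiles and count_sampled_tiles are hypothetical, and the sketch assumes that "every Nth tile" means every Nth tile of the grid in row-major order, which is what the time estimates in this guide imply.

```python
from itertools import islice, product

# Names mirror the options described above; values are the documented defaults.
TILE_X_MIN, TILE_X_MAX = 0, 2047
TILE_Y_MIN, TILE_Y_MAX = 0, 2047
TILE_SAMPLE_STEP = 1


def iter_sampled_tiles(x_min, x_max, y_min, y_max, step):
    """Yield (tile_x, tile_y) for every `step`-th tile in row-major order.

    Assumption: the step applies to the flattened grid, not to each axis
    separately, so step=10 visits roughly one tenth of all tiles.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    all_tiles = product(range(y_min, y_max + 1), range(x_min, x_max + 1))
    for tile_y, tile_x in islice(all_tiles, 0, None, step):
        yield tile_x, tile_y


def count_sampled_tiles(x_min, x_max, y_min, y_max, step):
    """Return how many tiles iter_sampled_tiles would yield."""
    total = (x_max - x_min + 1) * (y_max - y_min + 1)
    return -(-total // step)  # ceiling division
```

With the defaults, count_sampled_tiles(TILE_X_MIN, TILE_X_MAX, TILE_Y_MIN, TILE_Y_MAX, TILE_SAMPLE_STEP) gives 4,194,304 tiles, which is the "~4 million requests" figure quoted above.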

⏱️ Estimated time: with default settings, the 0.05s delay between requests alone adds up to approximately:

  • ~4.2 million tiles × 0.05s ≈ 58 hours (time spent waiting on each request comes on top of this)
  • For faster testing: set TILE_SAMPLE_STEP = 10 (~6 hours) or TILE_SAMPLE_STEP = 100 (~35 minutes)
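
The arithmetic behind these estimates can be reproduced with a few lines of Python. This is only a sketch: estimate_delay_hours is a hypothetical helper, it assumes the default 2048 × 2048 tile grid, and it counts the sleep between requests only, so real runs will take longer.

```python
def estimate_delay_hours(tiles_per_axis=2048, delay_seconds=0.05, sample_step=1):
    """Hours spent purely on the delay between requests."""
    total_tiles = tiles_per_axis * tiles_per_axis
    sampled_tiles = -(-total_tiles // sample_step)  # ceiling division
    return sampled_tiles * delay_seconds / 3600


for step in (1, 10, 100):
    hours = estimate_delay_hours(sample_step=step)
    print("TILE_SAMPLE_STEP = %d: about %.1f hours" % (step, hours))
```

For a step of 100 the result is a little under 0.6 hours, which is the ~35 minutes quoted above.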

2. Import into Database

Once you have the CSV files, import them:

npm run import:regions

This will:

  • Import all unique regions into the Region table
  • Analyze tile coverage for each region
  • Display which regions have the most tiles
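
To get a feel for the coverage analysis before touching the database, the tile counts can be computed straight from the CSV. The sketch below is not the logic behind npm run import:regions (that script is not shown here); tiles_per_region is a hypothetical helper, and the region_id column name is taken from the TileRow interface in step 4.

```python
import csv
from collections import Counter


def tiles_per_region(csv_lines):
    """Count how many tiles map to each region_id.

    `csv_lines` is an open text file (or any iterable of lines) whose
    first line is the CSV header.
    """
    reader = csv.DictReader(csv_lines)
    return Counter(int(row["region_id"]) for row in reader)
```

To list the ten largest regions, open tile_region_mapping.csv with open(path, newline="", encoding="utf-8"), pass the file object to tiles_per_region, and call .most_common(10) on the result.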

3. Create TileRegion Lookup Table

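Add this model to your prisma/schema.prisma. Prisma also requires the opposite relation field on the Region model (for example, tileRegions TileRegion[]):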

model TileRegion {
  tileX    Int
  tileY    Int
  regionId Int
  region   Region @relation(fields: [regionId], references: [id])

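  // The compound unique constraint already creates the index used for lookups,
  // so a separate @@index on the same columns would be redundant.
  @@unique([tileX, tileY])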
}

Then run:

npm run db:push

4. Import Tile Mappings

Create a new script scripts/import-tile-mappings.ts:

#!/usr/bin/env tsx
import { PrismaClient } from "@prisma/client";
import { readFileSync } from "fs";
import { parse } from "csv-parse/sync";

const prisma = new PrismaClient();

interface TileRow {
  tile_x: string;
  tile_y: string;
  region_id: string;
  city_id: string;
  region_name: string;
  region_number: string;
  country_id: string;
  flag_id: string;
}

async function main() {
  const csvContent = readFileSync("tile_region_mapping.csv", "utf-8");
  const records = parse(csvContent, {
    columns: true,
    skip_empty_lines: true
  }) as TileRow[];

  console.log(`Importing ${records.length} tile mappings...`);

  // Batch insert for performance
  const batchSize = 1000;
  for (let i = 0; i < records.length; i += batchSize) {
    const batch = records.slice(i, i + batchSize);
    await prisma.tileRegion.createMany({
      data: batch.map(r => ({
        // Always pass the radix so values are parsed as decimal
        tileX: Number.parseInt(r.tile_x, 10),
        tileY: Number.parseInt(r.tile_y, 10),
        regionId: Number.parseInt(r.region_id, 10)
      })),
      skipDuplicates: true
    });
    console.log(`Imported ${Math.min(i + batchSize, records.length)} / ${records.length}`);
  }

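  console.log("✓ Import complete!");
}

// Report failures with a non-zero exit code and always close the connection
main()
  .catch((error) => {
    console.error(error);
    process.exitCode = 1;
  })
  .finally(() => prisma.$disconnect());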

Run it from the directory that contains tile_region_mapping.csv: npx tsx scripts/import-tile-mappings.ts

5. Update Region Lookup Function

Update src/config/regions.ts:

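import { prisma } from "./database.js";

// `Region` below is assumed to be the type this file already declares
// (or imports); adjust the name if your project defines it elsewhere.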

export async function getRegionForCoordinates(
  tileX: number,
  tileY: number,
  _x: number,
  _y: number
): Promise<Region | null> {
  const tileRegion = await prisma.tileRegion.findUnique({
    where: { tileX_tileY: { tileX, tileY } },
    include: { region: true }
  });

  if (!tileRegion) {
    return null;
  }

  return {
    id: tileRegion.region.id,
    cityId: tileRegion.region.cityId,
    name: tileRegion.region.name,
    number: tileRegion.region.number,
    countryId: tileRegion.region.countryId,
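    // TileRegion has no countryId field, so read it through the relation.
    // countryId stands in for flagId until a flagId column is added to Region.
    flagId: tileRegion.region.countryId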
  };
}

Notes

  • The scraper includes a 0.05s delay between requests to be respectful to wplace.live's servers
  • Each tile maps to exactly one region, so a lookup is a single query by (tileX, tileY)
  • With every tile sampled, the TileRegion table will have ~4.2 million rows (one per tile on the 2048 × 2048 canvas)
  • Database lookups are fast thanks to the unique index on (tileX, tileY)
  • The CSV files are plain text and can be inspected/edited before importing
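
Since the CSV files can be inspected before importing, a quick sanity check can catch problems early. The following is a minimal sketch rather than part of the project's scripts: find_mapping_problems is a hypothetical helper, the column names come from the TileRow interface in step 4, and the 0-2047 bounds are the documented defaults.

```python
import csv


def find_mapping_problems(csv_lines, tile_min=0, tile_max=2047):
    """Return (line_number, message) pairs for suspicious rows.

    `csv_lines` is an open text file (or any iterable of lines) whose
    first line is the header. Line numbers assume one row per line.
    """
    seen = set()
    problems = []
    for line_no, row in enumerate(csv.DictReader(csv_lines), start=2):
        try:
            tile = (int(row["tile_x"]), int(row["tile_y"]))
            int(row["region_id"])
        except (KeyError, TypeError, ValueError):
            problems.append((line_no, "missing or non-integer field"))
            continue
        if not all(tile_min <= coord <= tile_max for coord in tile):
            problems.append((line_no, "tile out of bounds"))
        elif tile in seen:
            # Each tile should map to exactly one region
            problems.append((line_no, "duplicate tile"))
        seen.add(tile)
    return problems
```

An empty list means every row has integer fields, lies inside the canvas bounds, and no tile appears twice. The import script would skip duplicates anyway (skipDuplicates: true), but seeing them up front shows whether the scrape produced conflicting mappings.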