GeoDataViewer Team

What Is GeoParquet? Columnar Geospatial Data for Analytics

Learn what GeoParquet is, why it works well for large geospatial datasets, how browser rendering works today, and where deck.gl fits next.

GeoParquet is a Parquet-based format for storing geospatial data in a columnar, analytics-friendly way. It’s often used in data engineering and cloud analytics workflows where performance and interoperability matter for large datasets.

To preview a GeoParquet file quickly, use the GeoParquet viewer: /open-geoparquet-online/.

Why GeoParquet is different

Unlike text-based formats such as GeoJSON or CSV, Parquet is designed for analytics:

  • Columnar storage (faster scans for selected fields)
  • Compression and efficient encoding
  • Friendly for large-scale processing tools

GeoParquet adds conventions to represent geometry and geospatial metadata.

In practice, that means a GeoParquet file is not just “Parquet with a geometry column.” It carries file-level metadata, stored under the “geo” key, that tells software which column stores geometry, how that geometry is encoded, and what coordinate reference information should travel with the dataset.
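As a rough illustration of that metadata, the GeoParquet specification stores a JSON document under the "geo" key of the Parquet file metadata. The sketch below parses such a string and looks up the primary geometry column. The helper name describePrimaryGeometry and the sample values are hypothetical, and reading the "geo" string out of a real file is left to whatever Parquet reader you use.

```typescript
// Subset of the GeoParquet "geo" metadata, limited to the fields used here.
interface GeoColumnMeta {
  encoding: string;          // for example "WKB"
  geometry_types: string[];  // for example ["Polygon", "MultiPolygon"]
  crs?: unknown;             // PROJJSON object; when the key is absent, the spec default is OGC:CRS84
}

interface GeoMeta {
  version: string;
  primary_column: string;
  columns: Record<string, GeoColumnMeta>;
}

interface PrimaryGeometryInfo {
  columnName: string;
  encoding: string;
  hasExplicitCrs: boolean;
}

// Hypothetical helper: parse the raw "geo" string and describe the primary geometry column.
function describePrimaryGeometry(geoText: string): PrimaryGeometryInfo {
  const meta = JSON.parse(geoText) as GeoMeta;
  const column = meta.columns[meta.primary_column];
  if (column === undefined) {
    throw new Error(`primary_column "${meta.primary_column}" is not listed in columns`);
  }
  return {
    columnName: meta.primary_column,
    encoding: column.encoding,
    hasExplicitCrs: column.crs !== undefined && column.crs !== null,
  };
}

// Made-up sample metadata for illustration only.
const sampleGeo = JSON.stringify({
  version: "1.0.0",
  primary_column: "geometry",
  columns: { geometry: { encoding: "WKB", geometry_types: ["Point"] } },
});

const primaryInfo = describePrimaryGeometry(sampleGeo);
console.log(primaryInfo);
```

Reading the column name from metadata, rather than guessing it from row contents, is what lets a viewer handle files whose geometry column is not literally named "geometry".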

Common use cases

  • Large-scale spatial analytics and ETL pipelines
  • Sharing datasets for modern data platforms
  • Efficient storage for repeated querying and filtering

How browser rendering works today

In the current GeoDataViewer stack, GeoParquet parsing is Arrow-first. The browser reads the Parquet structure, uses GeoParquet metadata to find the geometry column, and only then materializes rows for inspection and styling.

That parsing path is different from older “guess the geometry field from row objects” approaches. It is more reliable for modern GeoParquet files and a better fit for large datasets.

After parsing, the current renderer normalizes geometry into plain GeoJSON before sending it to the map. Today that map stack is MapLibre with a GeoJSON source, not a binary rendering path.
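To make the normalization step concrete, here is a minimal sketch of turning one WKB-encoded value into a GeoJSON geometry. It assumes the geometry column uses WKB encoding and handles only 2D points. The function name wkbPointToGeoJson is hypothetical, and this is an illustration rather than GeoDataViewer's actual code.

```typescript
interface PointGeometry {
  type: "Point";
  coordinates: [number, number];
}

// Decodes a 2D WKB Point. Other geometry types are out of scope for this sketch.
function wkbPointToGeoJson(wkb: Uint8Array): PointGeometry {
  const view = new DataView(wkb.buffer, wkb.byteOffset, wkb.byteLength);
  // Byte 0 is the byte-order flag: 1 = little endian, 0 = big endian.
  const littleEndian = view.getUint8(0) === 1;
  // Bytes 1..4 hold the geometry type code; 1 means Point.
  const geometryType = view.getUint32(1, littleEndian);
  if (geometryType !== 1) {
    throw new Error(`expected WKB Point (type 1), got type ${geometryType}`);
  }
  // Two float64 values follow: x at byte 5, y at byte 13.
  const x = view.getFloat64(5, littleEndian);
  const y = view.getFloat64(13, littleEndian);
  return { type: "Point", coordinates: [x, y] };
}

// Build a sample little-endian WKB point (made-up coordinates).
const sampleWkb = new Uint8Array(21);
const wkbWriter = new DataView(sampleWkb.buffer);
wkbWriter.setUint8(0, 1);
wkbWriter.setUint32(1, 1, true);
wkbWriter.setFloat64(5, 13.4, true);
wkbWriter.setFloat64(13, 52.5, true);

const pointGeometry = wkbPointToGeoJson(sampleWkb);
console.log(JSON.stringify(pointGeometry));
```

A real normalizer would also need to cover lines, polygons, multi-geometries, and Z or M coordinates, which is where most of the per-feature cost comes from.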

Advantages of the current MapLibre path

  • Good browser compatibility for interactive viewing
  • Easy integration with attribute tables, filters, and timeline controls
  • Straightforward point, line, and polygon styling in one Studio workspace
  • A stable rendering path for small, medium, and moderately large datasets

Limits of the current MapLibre path

  • Binary or typed-array geometry cannot stay in the render path all the way to the map
  • Very large GeoParquet layers incur higher memory and worker-transfer costs once geometry is normalized to GeoJSON
  • This is less efficient than a dedicated binary rendering pipeline for sustained large-data visualization
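One rough way to see the cost is to compare the same points stored as a flat typed array and as serialized GeoJSON. The coordinates below are made up, and the JSON string length is only a proxy for real memory and transfer cost, not a measurement of the GeoDataViewer pipeline.

```typescript
// 1,000 made-up points stored as a flat [x0, y0, x1, y1, ...] Float64Array.
const pointCount = 1000;
const flatCoords = new Float64Array(pointCount * 2);
for (let i = 0; i < pointCount; i++) {
  flatCoords[2 * i] = -180 + (360 * i) / pointCount;
  flatCoords[2 * i + 1] = -90 + (180 * i) / pointCount;
}

// The same points expanded into plain GeoJSON features.
const plainFeatures: Array<{
  type: "Feature";
  geometry: { type: "Point"; coordinates: number[] };
  properties: Record<string, unknown>;
}> = [];
for (let i = 0; i < pointCount; i++) {
  plainFeatures.push({
    type: "Feature",
    geometry: { type: "Point", coordinates: [flatCoords[2 * i], flatCoords[2 * i + 1]] },
    properties: {},
  });
}

// Typed array: 2 coordinates x 8 bytes per point.
const binaryBytes = flatCoords.byteLength;
// GeoJSON: character count of the serialized collection, used as a rough size proxy.
const jsonChars = JSON.stringify({ type: "FeatureCollection", features: plainFeatures }).length;

console.log({ binaryBytes, jsonChars, ratio: jsonChars / binaryBytes });
```

The per-feature object wrapper and repeated keys account for most of the difference, and that overhead grows linearly with feature count.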

Why this differs from kepler.gl

kepler.gl can stay closer to Arrow and binary geometry because its rendering stack is centered on deck.gl, which is designed to work well with binary layer data.

The current GeoDataViewer stack uses MapLibre GeoJSON sources for rendering. That is why geometry is normalized before it reaches the map. The difference is renderer architecture, not a limitation of GeoParquet itself.

Future deck.gl route

The current ingest path is already moving in the right direction for larger rendering systems:

  • Keep Arrow-first parsing
  • Keep metadata-aware geometry detection
  • Split geometry output by render target in the future

That future split would likely look like this:

  • MapLibre target: normalize geometry to plain GeoJSON for broad browser compatibility
  • deck.gl target: keep more binary geometry in the rendering path for very large datasets
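A minimal sketch of such a split, for point data only. All names (RenderTarget, buildPointPayload) are hypothetical, and the deck.gl branch is loosely modeled on the binary data form with length and attributes that deck.gl documents. Treat the exact shape as an assumption rather than a tested integration.

```typescript
type RenderTarget = "maplibre" | "deckgl";

interface PointFeature {
  type: "Feature";
  geometry: { type: "Point"; coordinates: [number, number] };
  properties: Record<string, unknown>;
}

interface GeoJsonPayload {
  target: "maplibre";
  data: { type: "FeatureCollection"; features: PointFeature[] };
}

interface BinaryPayload {
  target: "deckgl";
  data: { length: number; attributes: { getPosition: { value: Float32Array; size: 2 } } };
}

type RenderPayload = GeoJsonPayload | BinaryPayload;

// positions is a flat [x0, y0, x1, y1, ...] array produced by the ingest step.
function buildPointPayload(target: RenderTarget, positions: Float32Array): RenderPayload {
  const count = positions.length / 2;
  if (target === "deckgl") {
    // Binary path: hand the typed array over without creating per-feature objects.
    const binary: BinaryPayload = {
      target: "deckgl",
      data: { length: count, attributes: { getPosition: { value: positions, size: 2 } } },
    };
    return binary;
  }
  // GeoJSON path: expand every point into a plain feature object.
  const features: PointFeature[] = [];
  for (let i = 0; i < count; i++) {
    features.push({
      type: "Feature",
      geometry: { type: "Point", coordinates: [positions[2 * i], positions[2 * i + 1]] },
      properties: {},
    });
  }
  const geoJson: GeoJsonPayload = {
    target: "maplibre",
    data: { type: "FeatureCollection", features },
  };
  return geoJson;
}

// Two made-up points: (1, 2) and (3, 4).
const flatPositions = new Float32Array([1, 2, 3, 4]);
const deckPayload = buildPointPayload("deckgl", flatPositions);
const mapPayload = buildPointPayload("maplibre", flatPositions);
console.log(deckPayload.target, mapPayload.target);
```

Because both branches start from the same parsed positions, the ingest code stays shared and only the last step depends on the renderer.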

In other words, the current browser viewer favors compatibility and unified workspace behavior first. A later deck.gl path would favor larger-scale rendering performance.

Limitations to know

  • GeoParquet is not as widely supported by classic desktop GIS tools as Shapefile.
  • It is usually a better fit as a “data lake” format than as a format you edit by hand.

How to open GeoParquet online

  1. Go to /open-geoparquet-online/.
  2. Upload your .parquet file.
  3. Inspect geometry and attributes to validate the dataset.

If you are validating a large file, focus first on:

  • whether the geometry column was detected correctly
  • whether the visible layer count matches expectations
  • whether attribute fields and time fields look correct before conversion or export