Extract OSM data with Osmium
- Author: YuChun Tsao
The previous article (import-osm-data-into-postgis-with-osm2pgsql) introduced how to import OSM data into PostGIS with osm2pgsql. In this article, I extract OSM data with Osmium and produce customized osm.pbf files to import into PostGIS.
Environment
- Ubuntu 18.04
- CMake 3.24.1
Installation
Follow the official documentation to install: Osmium Command Line Tool
Download OSM data
Download the data from Geofabrik.
In this article, I will use taiwan-latest.osm.pbf for the following steps.
You can use osmium fileinfo to inspect your osm.pbf file:
osmium fileinfo taiwan-latest.osm.pbf
File:
  Name: taiwan-latest.osm.pbf
  Format: PBF
  Compression: none
  Size: 114332934
Header:
  Bounding boxes:
    (118.1036,20.72799,122.9312,26.60305)
  With history: no
  Options:
    generator=osmium/1.14.0
    osmosis_replication_base_url=http://download.geofabrik.de/asia/taiwan-updates
    osmosis_replication_sequence_number=3577
    osmosis_replication_timestamp=2023-01-14T21:21:25Z
    pbf_dense_nodes=true
    pbf_optional_feature_0=Sort.Type_then_ID
    sorting=Type_then_ID
    timestamp=2023-01-14T21:21:25Z
You can use the -e (--extended) option to show more information about the pbf file.
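For scripting, osmium fileinfo can also print a single value instead of the full report. The following is a minimal sketch, assuming your Osmium version supports the -g (--get) option and the data.count.* variables (osmium fileinfo -G lists the variable names it knows). The helper name pbf_value is made up for this example.

```shell
#!/bin/sh
# pbf_value FILE VARIABLE: print one value from the fileinfo report.
# -e is passed because the data.* variables require reading the whole file.
pbf_value() {
    osmium fileinfo -e -g "$2" "$1"
}

PBF="taiwan-latest.osm.pbf"
# Only run when osmium and the input file are actually available.
if command -v osmium >/dev/null 2>&1 && [ -f "$PBF" ]; then
    echo "nodes: $(pbf_value "$PBF" data.count.nodes)"
    echo "ways:  $(pbf_value "$PBF" data.count.ways)"
fi
```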
Creating geographic extracts
Extract with bounding box
osmium extract \
-s simple \
-b 121.561,25.030,121.568,25.036 \
taiwan-latest.osm.pbf \
-o bounding_box_output.pbf
-s, --strategy
Osmium offers three extract strategies: simple, complete_ways (default), and smart. They differ in how they treat objects that cross the boundary of the region, so the output contains more or fewer OSM objects depending on the strategy.
For more information about the extract strategies, see the Osmium manual.
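To see how the strategies differ on your own data, one option is to run the same extract once per strategy and compare the object counts. This is only a sketch: the compare_strategies helper and the bbox_*.pbf output names are made up for this example, and it assumes osmium fileinfo supports the -g option with the data.count.* variables.

```shell
#!/bin/sh
# compare_strategies INPUT BBOX: run the same bbox extract with each strategy
# and print how many nodes and ways ended up in each result.
compare_strategies() {
    for strategy in simple complete_ways smart; do
        out="bbox_${strategy}.pbf"
        # -O (--overwrite) lets the sketch be re-run without deleting old outputs.
        osmium extract -s "$strategy" -b "$2" "$1" -o "$out" -O || return 1
        nodes=$(osmium fileinfo -e -g data.count.nodes "$out")
        ways=$(osmium fileinfo -e -g data.count.ways "$out")
        echo "$strategy: nodes=$nodes ways=$ways"
    done
}

# Only run when osmium and the input file are actually available.
if command -v osmium >/dev/null 2>&1 && [ -f taiwan-latest.osm.pbf ]; then
    compare_strategies taiwan-latest.osm.pbf 121.561,25.030,121.568,25.036
fi
```

According to the Osmium manual, simple does not complete ways that cross the boundary, complete_ways does, and smart additionally completes multipolygon relations, so the counts would usually grow in that order.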
[Figure: Extract with bounding box]
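The value passed to -b is given in longitude,latitude order: the lower-left corner followed by the upper-right corner. If you start from a center point instead, a small helper can build that string. This is a sketch; bbox_around is a hypothetical helper and not part of Osmium, and the half sizes are plain degrees.

```shell
#!/bin/sh
# bbox_around LON LAT HALF_WIDTH HALF_HEIGHT (all in degrees)
# Prints min_lon,min_lat,max_lon,max_lat, the order used with osmium extract -b.
bbox_around() {
    awk -v lon="$1" -v lat="$2" -v dx="$3" -v dy="$4" 'BEGIN {
        printf "%.3f,%.3f,%.3f,%.3f\n", lon - dx, lat - dy, lon + dx, lat + dy
    }'
}

# The box used in this article is centered on 121.5645,25.033.
bbox_around 121.5645 25.033 0.0035 0.003   # prints 121.561,25.030,121.568,25.036
```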
Extract with OSM boundary
You can find the relation id on OpenStreetMap.
In my case, I typed the keyword taipei in the search bar and found that the relation id of Taipei City is 1293250.
[Figure: Get relation id from openstreetmap]
Extract the boundary using the relation id of Taipei City.
osmium getid -r -t taiwan-latest.osm.pbf r1293250 -o taipei-boundary.osm
Then extract the pbf file of Taipei City, using taipei-boundary.osm as the polygon.
osmium extract -s simple -p taipei-boundary.osm taiwan-latest.osm.pbf -o boundary_output.pbf
[Figure: Extract with OSM boundary]
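The two commands in this section can be chained in a small script so that only the relation id changes between runs. This is a sketch; the extract_by_relation helper and the boundary-r<ID>.osm file naming are made up for this example.

```shell
#!/bin/sh
# extract_by_relation INPUT RELATION_ID OUTPUT
extract_by_relation() {
    boundary="boundary-r$2.osm"
    # Step 1: pull the relation plus everything it references (-r),
    # dropping the tags of the referenced objects (-t).
    osmium getid -r -t "$1" "r$2" -o "$boundary" -O || return 1
    # Step 2: use that boundary as the clipping polygon.
    osmium extract -s simple -p "$boundary" "$1" -o "$3" -O
}

# Only run when osmium and the input file are actually available.
if command -v osmium >/dev/null 2>&1 && [ -f taiwan-latest.osm.pbf ]; then
    extract_by_relation taiwan-latest.osm.pbf 1293250 boundary_output.pbf
fi
```

The -O (--overwrite) flag is added here only so that the script can be re-run; drop it if you would rather have Osmium refuse to overwrite existing files.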
Extract with GeoJSON
Geographic data in GeoJSON format can also be used to extract the data.
I got a circle from geojson.io as my input GeoJSON data.
[Figure: Example GeoJSON]
Extract the data using the circle in GeoJSON format.
osmium extract -s simple -p polygon.geojson taiwan-latest.osm.pbf -o geojson_output.pbf
[Figure: Extract with GeoJSON]
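If you prefer to generate the circle from the command line instead of drawing it, it can be approximated by a regular polygon. The following is a rough sketch and not what geojson.io produces: circle_geojson is a hypothetical helper, the center and radius are example values, and the conversion from meters to degrees uses the common approximation of about 111,320 m per degree of latitude.

```shell
#!/bin/sh
# circle_geojson LON LAT RADIUS_M [SEGMENTS]
# Prints a GeoJSON Feature whose Polygon approximates a circle.
circle_geojson() {
    awk -v lon="$1" -v lat="$2" -v r="$3" -v n="${4:-32}" 'BEGIN {
        pi = atan2(0, -1)
        dlat = r / 111320                    # radius in degrees of latitude
        dlon = dlat / cos(lat * pi / 180)    # longitude degrees are shorter away from the equator
        printf "{\"type\":\"Feature\",\"properties\":{},"
        printf "\"geometry\":{\"type\":\"Polygon\",\"coordinates\":[["
        for (i = 0; i <= n; i++) {           # i == n repeats the first point to close the ring
            a = 2 * pi * (i % n) / n
            printf "%s[%.6f,%.6f]", (i ? "," : ""), lon + dlon * cos(a), lat + dlat * sin(a)
        }
        print "]]}}"
    }'
}

# Example: a circle of roughly 500 m around 121.5645,25.033.
circle_geojson 121.5645 25.033 500 > polygon.geojson
```

The resulting polygon.geojson holds a single Feature with a Polygon geometry, which is the kind of GeoJSON input the -p option of osmium extract is documented to accept.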
Several extracts with config file
The config file is in JSON format. The top level is an object that contains at least an extracts array. It can also contain a directory entry, which names the directory where all the output files will be created.
config.json
{
  "extracts": [
    {
      "output": "bounding_box_output.pbf",
      "output_format": "pbf",
      "description": "extract with bbox",
      "bbox": [121.561, 25.03, 121.568, 25.036]
    },
    {
      "output": "boundary_output.pbf",
      "description": "extract with osm boundary",
      "polygon": {
        "file_name": "taipei-boundary.osm",
        "file_type": "osm"
      }
    },
    {
      "output": "geojson_output.pbf",
      "description": "extract with geojson polygon",
      "polygon": {
        "file_name": "polygon.geojson",
        "file_type": "geojson"
      }
    }
  ],
  "directory": "./output/"
}
The output directory must exist before you run the extract.
Extract data with this config file.
osmium extract -s simple -c config.json taiwan-latest.osm.pbf
After execution, you will find the pbf files in the output directory.
$ tree output
output
├── boundary_output.pbf
├── bounding_box_output.pbf
└── geojson_output.pbf
0 directories, 3 files
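Because the output directory has to exist beforehand, a wrapper script can create it before running the extract. A minimal sketch: run_config_extract is a hypothetical helper, and the directory passed to it has to match the directory entry in config.json, since the script does not read it from the file.

```shell
#!/bin/sh
# run_config_extract CONFIG INPUT OUT_DIR
run_config_extract() {
    # Create the directory named in the config if it is missing.
    mkdir -p "$3" || return 1
    # -O (--overwrite) allows re-running over existing output files.
    osmium extract -s simple -c "$1" "$2" -O
}

# Only run when osmium, the config and the input file are actually available.
if command -v osmium >/dev/null 2>&1 && [ -f config.json ] && [ -f taiwan-latest.osm.pbf ]; then
    run_config_extract config.json taiwan-latest.osm.pbf ./output
fi
```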