Difference between revisions of "Sprint bdfs25"

From OSGeo
 
(69 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''TorchGeo''' an OSGeo Project
+
[[File:yaosgeologo.jpg|thumb]]
 +
 
 +
'''TorchGeo''' an OSGeo [https://www.osgeo.org/projects/torchgeo/ Project]  [https://github.com/torchgeo/torchgeo CODE]  [https://torchgeo.readthedocs.io/ DOCS]
 +
 
 +
datasets [https://torchgeo.readthedocs.io/en/latest/api/datasets.html LINK], samplers, transforms, and pre-trained models for geospatial data
  
 
Additional Topics:
 
Additional Topics:
  
1. '''Copernicus AI4EO Workshop in Bonn'''
+
1. '''Workshops'''
 +
 
 +
Berkeley Climate AI Day [https://qb3.org/recap-symposium-on-ai-and-climate-technology-during-sf-climate-week/ LINK]
 +
 
 +
The '''ml4earth.de''' [https://ml4earth.de/workshop_2025/ workshop] featured a keynote by Prof. Dr. Xiaoxiang Zhu of TUM, Germany, which is also home to the creator of TorchGeo, Adam Stewart, PhD [https://www.asg.ed.tum.de/sipeo/team/dr-adam-j-stewart/ POSTDOC]. Theme: applications of artificial intelligence (AI) in Earth observation, with a focus on machine learning (ML) approaches for remote sensing.
  
The [https://ml4earth.de/workshop_2025/ workshop] featured a keynote by Xiaoxiang Zhu from TUM Germany, also home to the creator of TorchGeo. The discussion likely centered around the applications of artificial intelligence (AI) in Earth observation, with a focus on machine
+
TerraBytes Canada [https://terrabytes-workshop.github.io/ LINK]
learning (ML) approaches for remote sensing.
 
  
https://ml4earth.de/workshop_2025/  https://ceos.org/ard/
+
Geometa Lab [https://www.giswiki.ch/Agenda LINK]
  
http://dataspace.copernicus.eu/    https://eotdl.com
 
  
 
2. '''PANGAEA project'''
 
2. '''PANGAEA project'''
  
This large research initiative aims to provide public data for land cover detection across remote areas such as Africa, Amazonia, and Siberia. The conversation might have touched upon the importance of publicly available datasets for Earth observation research and the potential applications of this data.
+
[[File:Pangaea geofmbenchmark.png|thumb]]
 +
 
 +
a standardized evaluation protocol that covers a diverse set of datasets, tasks, resolutions, sensor modalities, and temporalities, establishing a robust and widely applicable benchmark for Geospatial Foundation Models (GFMs). [https://arxiv.org/abs/2412.04204v2 PDF]
 +
 
 
https://bpa.st/BTQAE  ## code updates
 
https://bpa.st/BTQAE  ## code updates
  
 
https://eotdl.com/blog/pangaea
 
https://eotdl.com/blog/pangaea
  
[PANGAEA benchmark](https://github.com/VMarsocci/pangaea-bench) shows that specialized, not-CNN and not-ViT , machine learning models can perform better than current (trendy) "Foundation Models" for remote sensing data
 
  
 
3. '''Machine learning approaches'''
 
3. '''Machine learning approaches'''
  
A key takeaway from this discussion is that traditional ML methods (e.g., XGBoost, Random Forest) often outperform trendy CNN/ViT models for remote sensing tasks. This highlights the value of specialized ML models over foundation models in this domain.
+
The PANGAEA [https://github.com/VMarsocci/pangaea-bench benchmark] shows that specialized machine learning models that are neither CNNs nor ViTs can perform better than various Foundation Models for remote sensing land use / land cover analysis and other specific classification tasks.
 +
Supervised learning with XGBoost, CatBoost, or Random Forest variants can perform better than ViT variations, CNNs, and related approaches on the six key use cases stated by FAST-EO: weather and climate disaster analysis, methane leak detection, forest above-ground biomass change, soil property estimation, semantic land cover change detection, and monitoring the expansion of mining fields into farmlands.
 +
 
 +
Thoughts on Geospatial Foundation Models [https://arxiv.org/pdf/2405.04285 PDF] Zhu, Stewart, et al 2024
 +
 
 +
this [https://huggingface.co/ibm-esa-geospatial/TerraMind-1.0-base TerraMind] mixed model is experimental,
 +
 
 +
[https://github.com/swiss-territorial-data-lab/proj-vit swiss-territorial-data-lab]  [https://huggingface.co/datasets/heig-vd-geo/M3DRS DATA]  https://stdl.ch/
 +
 
 +
A sparse matrix math tutorial [https://iclr-blogposts.github.io/2025/blog/sparse-autodiff/ LINK]
 +
 
 +
A Geospatial Foundation Model [https://github.com/NASA-IMPACT/Prithvi-EO-2.0 Prithvi-EO-2.0]
 +
 
 +
Model building libraries [https://github.com/opengeos/geoai/tree/main geoAI] by _giswqs_ Qiusheng Wu, UTenn  [https://opengeoai.org/#statement-of-need DOCS]
 +
 
 +
Processing toolkit [https://github.com/nasa-nccs-hpda/pytorch-caney pytorch-caney]  [https://nasa-nccs-hpda.github.io/pytorch-caney/latest/readme.html#objectives DOCS]
 +
 
 +
MOSAIKS CodeCapsule [https://codeocean.com/capsule/6456296/tree/v2 LINK]
  
these (https://huggingface.co/ibm-esa-geospatial/TerraMind-1.0-base) mixed models are wild, though
+
Satlas [https://satlas-pretrain.allen.ai/ LINK] a platform for visualizing and downloading global geospatial data products generated by AI using satellite images. [https://satlas.allen.ai/map DEMO] YNews [https://news.ycombinator.com/item?id=37387556 REVIEW]
  
like this one https://github.com/swiss-territorial-data-lab/proj-vit  [DATA\_LINK](https://huggingface.co/datasets/heig-vd-geo/M3DRS)
+
DeepLearning and OBIA [https://arxiv.org/pdf/2408.01607? PDF] sponsored by State Key Labs [https://en.wikipedia.org/wiki/State_Key_Laboratories LINK]
  
 +
GOOGLE_SATELLITE_EMBEDDING_V1 [https://deepmind.google/discover/blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/ BLOG] [https://medium.com/google-earth/ai-powered-pixels-introducing-googles-satellite-embedding-dataset-31744c1f4650 MED] [https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_SATELLITE_EMBEDDING_V1_ANNUAL LINK] related [https://christopherren.substack.com/p/embedding-fields-forever LINK]
  
 +
@inproceedings{brown2024learned,
 +
  title={Learned embedding fields for multi-source, multi-temporal earth observation imagery},
 +
  author={Brown, Christopher and Kazmierski, Michal and Rucklidge, William and Pasquarella, Valerie and Shelhamer, Evan},
 +
  booktitle={ICLR Workshop on Machine Learning for Remote Sensing (ML4RS)},
 +
  year={2024}
 +
}
  
 
4. '''Data availability'''
 
4. '''Data availability'''
  
The conversation likely emphasized the importance of public datasets for Earth observation research. For instance, projects like TerraMesh, funded by ESA (European Space Agency), provide valuable data sources for researchers and developers.
+
Promotion of public datasets for Earth observation research.  
  
hm - on the US West Coast side - team MSFT, Planet Labs and The Nature Conservancy .. [D2](https://reatlas42216storage.blob.core.windows.net/public/wind_all_2024q2_3_11_2025.gpkg) [LINK](https://www.microsoft.com/en-us/research/wp-content/uploads/2025/03/Global-Renewables-Watch_Caleb-Robinson_2025.pdf)  [DATA0]
+
Global Forest Watch edu [https://glad.umd.edu/projects/global-forest-watch LINK]
  
project 2020 "[TerraMesh](https://openaccess.thecvf.com/content/CVPR2025W/EarthVision/html/Blumenstiel_TerraMesh_A_Planetary_Mosaic_of_Multimodal_Earth_Observation_Data_CVPRW_2025_paper.html) is part of the FAST‑EO project funded by the European Space Agency Φ‑Lab (contract #4000143501/23/I‑DT)."
+
US West Coast -- team MSFT, Planet Labs and The Nature Conservancy .. [https://www.microsoft.com/en-us/research/wp-content/uploads/2025/03/Global-Renewables-Watch_Caleb-Robinson_2025.pdf REF] [https://reatlas42216storage.blob.core.windows.net/public/wind_all_2024q2_3_11_2025.gpkg DATA0]
 +
 
 +
TerraMesh,  ESA (European Space Agency) -- [https://openaccess.thecvf.com/content/CVPR2025W/EarthVision/html/Blumenstiel_TerraMesh_A_Planetary_Mosaic_of_Multimodal_Earth_Observation_Data_CVPRW_2025_paper.html TerraMesh] ,  part of the FAST‑EO project funded by the European Space Agency Φ‑Lab (contract #4000143501/23/I‑DT).
 +
 
 +
The [https://www.dlr.de/en/eoc/research-transfer/projects-missions/fast-eo FAST-EO] [https://www.fast-eo.eu/ project], officially [https://www.fz-juelich.de/en/ias/jsc/news/news-items/news-flashes/2024/fast-eo-launched launched] on February 5, 2024, is an initiative funded by the European Space Agency ([https://eo4society.esa.int/projects/fast-eo/ ESA]) and led by the German Aerospace Center (DLR).
 +
Its primary goal is to advance AI Foundation Models (FMs) for Earth Observation (EO) by exploring large multimodal foundation models through unsupervised and self-supervised learning to address downstream tasks.
 +
The project involves a consortium of partners including Forschungszentrum Jülich, KP Labs, and IBM Research, who are collaborating on six key use cases: weather and climate disaster analysis, methane leak detection, forest above-ground biomass change, soil property estimation, semantic land cover change detection, and monitoring the expansion of mining fields into farmlands. The project is committed to open science, planning to release model weights, configurations, datasets, and source code under a free and permissive Apache-2 license, accessible via platforms like GitHub, HuggingFace, and the SpatioTemporal Asset Catalog (STAC).
 +
 
 +
Other public training [https://x-ytong.github.io/project/Five-Billion-Pixels.html DATA] for China's interior, by Prof. Dr. Xiaoxiang Zhu, who is based in Germany and heads the department that hosts the TorchGeo project founder
 +
 
 +
gdal containers [https://github.com/OSGeo/gdal/pkgs/container/gdal LINK]
 +
 
 +
opendatacube containers [https://github.com/opendatacube/datacube-docker LINK]
 +
 
 +
https://ceos.org/ard/
 +
 
 +
[https://github.com/DLR-MF-DAS/SSL4EO-S12-v1.1 SSL4EO-S12-v1]
 +
 
 +
http://dataspace.copernicus.eu/    https://eotdl.com
 +
 
 +
https://github.com/opendatacube
 +
 
 +
[https://eumap.readthedocs.io/en/latest/ EUMap] client libraries
 +
 
 +
Open Earth Monitor [https://cordis.europa.eu/project/id/101059548/results EU] [https://www.youtube.com/@OpenGeoHubFoundation YouTube] Channel
  
hey - public training [data](https://x-ytong.github.io/project/Five-Billion-Pixels.html) for China interior, by Xiou xiang Zhu .. who is very much in Germany and is the boss of the department with the `torchGeo` guy
 
  
 
5. '''Geospatial data infrastructure'''
 
5. '''Geospatial data infrastructure'''
  
References to geospatial data infrastructure initiatives such as STAC (SpatioTemporal Asset Catalog) and OSGeoLive were likely discussed, highlighting the need for standardized data formats and efficient data access mechanisms.
+
standardized data formats and efficient data access mechanisms such as STAC (SpatioTemporal Asset Catalog) and OSGeoLive  
 +
 
 +
[https://openlandmap.org/ OpenLandMap]
 +
 
 +
 
  
 
'''Key points and takeaways''':
 
'''Key points and takeaways''':
Line 49: Line 108:
 
'''Specialized ML models''': Traditional ML methods can outperform trendy AI approaches in remote sensing tasks. '''Public data availability''': Public datasets are essential for Earth observation research, enabling collaboration and innovation. '''Collaboration opportunities''': The conversation likely touched upon the potential for international collaborations across different countries and regions. '''Geospatial data infrastructure''': Standardized data formats and efficient data access mechanisms are crucial for geospatial research.
 
'''Specialized ML models''': Traditional ML methods can outperform trendy AI approaches in remote sensing tasks. '''Public data availability''': Public datasets are essential for Earth observation research, enabling collaboration and innovation. '''Collaboration opportunities''': The conversation likely touched upon the potential for international collaborations across different countries and regions. '''Geospatial data infrastructure''': Standardized data formats and efficient data access mechanisms are crucial for geospatial research.
  
Preparation for Big Data from Space #osgeo code sprint
+
Preparation for #bids25 Big Data from Space #osgeo + Pangeo code sprint
  
 
The participants' discussion is a preparation activity for an upcoming code sprint, where they will work together to develop innovative solutions using remote sensing and Earth observation datasets. A wiki page summarizing these ideas could be a valuable output from this collaboration.
 
The participants' discussion is a preparation activity for an upcoming code sprint, where they will work together to develop innovative solutions using remote sensing and Earth observation datasets. A wiki page summarizing these ideas could be a valuable output from this collaboration.
 +
 +
A Path for Science‑ and Evidence‑based AI [https://understanding-ai-safety.org/ Policy]
  
 
Overall, the conversation highlights the intersection of geospatial technology, machine learning, and open science initiatives in Earth observation, emphasizing the importance of data availability, specialized ML models, and geospatial data infrastructure for advancing research and innovation in this field.
 
Overall, the conversation highlights the intersection of geospatial technology, machine learning, and open science initiatives in Earth observation, emphasizing the importance of data availability, specialized ML models, and geospatial data infrastructure for advancing research and innovation in this field.

Latest revision as of 14:11, 1 October 2025

Yaosgeologo.jpg

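TorchGeo, an OSGeo Project (https://www.osgeo.org/projects/torchgeo/). CODE: https://github.com/torchgeo/torchgeo DOCS: https://torchgeo.readthedocs.io/

Datasets (https://torchgeo.readthedocs.io/en/latest/api/datasets.html), samplers, transforms, and pre-trained models for geospatial data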

Additional Topics:

1. Workshops

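Berkeley Climate AI Day: https://qb3.org/recap-symposium-on-ai-and-climate-technology-during-sf-climate-week/

The ml4earth.de workshop (https://ml4earth.de/workshop_2025/) featured a keynote by Prof. Dr. Xiaoxiang Zhu of TUM, Germany, which is also home to the creator of TorchGeo, Adam Stewart, PhD, a postdoc there (https://www.asg.ed.tum.de/sipeo/team/dr-adam-j-stewart/). Theme: applications of artificial intelligence (AI) in Earth observation, with a focus on machine learning (ML) approaches for remote sensing.

TerraBytes Canada: https://terrabytes-workshop.github.io/

Geometa Lab: https://www.giswiki.ch/Agenda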


2. PANGAEA project

Pangaea geofmbenchmark.png

PANGAEA is a standardized evaluation protocol that covers a diverse set of datasets, tasks, resolutions, sensor modalities, and temporalities, establishing a robust and widely applicable benchmark for Geospatial Foundation Models (GFMs). Paper: https://arxiv.org/abs/2412.04204v2

https://bpa.st/BTQAE ## code updates

https://eotdl.com/blog/pangaea


3. Machine learning approaches

The PANGAEA benchmark (https://github.com/VMarsocci/pangaea-bench) shows that specialized machine learning models that are neither CNNs nor ViTs can perform better than various Foundation Models for remote sensing land use / land cover analysis and other specific classification tasks.

Supervised learning with XGBoost, CatBoost, or Random Forest variants can perform better than ViT variations, CNNs, and related approaches on the six key use cases stated by FAST-EO: weather and climate disaster analysis, methane leak detection, forest above-ground biomass change, soil property estimation, semantic land cover change detection, and monitoring the expansion of mining fields into farmlands.
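Tree-based learners such as Random Forest or XGBoost consume a flat table of per-pixel features rather than an image tensor. The following is a minimal sketch of that preparation step only, not anything taken from PANGAEA or FAST-EO. It assumes a hypothetical 4-band image (blue, green, red, NIR, in that order) held in a NumPy array; the band order, the synthetic data, and the function name `pixel_table` are all illustrative assumptions.

```python
import numpy as np


def pixel_table(image: np.ndarray, red: int = 2, nir: int = 3) -> np.ndarray:
    """Flatten a (bands, height, width) image into an (n_pixels, bands + 1) table.

    The extra column is NDVI = (NIR - red) / (NIR + red), a common hand-made
    feature for land cover work. Band positions are assumptions of this sketch.
    """
    bands, height, width = image.shape
    # One row per pixel (row-major order), one column per band.
    table = image.reshape(bands, height * width).T.astype(np.float64)
    denom = table[:, nir] + table[:, red]
    # Leave NDVI at 0 where both bands are empty to avoid dividing by zero.
    ndvi = np.divide(
        table[:, nir] - table[:, red],
        denom,
        out=np.zeros_like(denom),
        where=denom != 0,
    )
    return np.column_stack([table, ndvi])


# Synthetic 4-band, 2x3 image standing in for real reflectance data.
rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(4, 2, 3))
features = pixel_table(image)
print(features.shape)  # (6, 5)
```

Each row of `features` can then be paired with a per-pixel label and handed to any tabular classifier.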

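Thoughts on Geospatial Foundation Models (https://arxiv.org/pdf/2405.04285), Zhu, Stewart, et al., 2024

The TerraMind mixed model (https://huggingface.co/ibm-esa-geospatial/TerraMind-1.0-base) is experimental.

swiss-territorial-data-lab (https://github.com/swiss-territorial-data-lab/proj-vit) DATA: https://huggingface.co/datasets/heig-vd-geo/M3DRS https://stdl.ch/

A sparse matrix math tutorial: https://iclr-blogposts.github.io/2025/blog/sparse-autodiff/

A Geospatial Foundation Model: Prithvi-EO-2.0 (https://github.com/NASA-IMPACT/Prithvi-EO-2.0)

Model building libraries: geoAI (https://github.com/opengeos/geoai/tree/main) by Qiusheng Wu (giswqs), UTenn. DOCS: https://opengeoai.org/#statement-of-need

Processing toolkit: pytorch-caney (https://github.com/nasa-nccs-hpda/pytorch-caney). DOCS: https://nasa-nccs-hpda.github.io/pytorch-caney/latest/readme.html#objectives

MOSAIKS CodeCapsule: https://codeocean.com/capsule/6456296/tree/v2

Satlas (https://satlas-pretrain.allen.ai/), a platform for visualizing and downloading global geospatial data products generated by AI using satellite images. DEMO: https://satlas.allen.ai/map Hacker News review: https://news.ycombinator.com/item?id=37387556

Deep learning and OBIA (https://arxiv.org/pdf/2408.01607?), sponsored by State Key Labs (https://en.wikipedia.org/wiki/State_Key_Laboratories)

GOOGLE_SATELLITE_EMBEDDING_V1. BLOG: https://deepmind.google/discover/blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/ Medium: https://medium.com/google-earth/ai-powered-pixels-introducing-googles-satellite-embedding-dataset-31744c1f4650 Dataset: https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_SATELLITE_EMBEDDING_V1_ANNUAL Related: https://christopherren.substack.com/p/embedding-fields-forever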

@inproceedings{brown2024learned,
 title={Learned embedding fields for multi-source, multi-temporal earth observation imagery},
 author={Brown, Christopher and Kazmierski, Michal and Rucklidge, William and Pasquarella, Valerie and Shelhamer, Evan},
 booktitle={ICLR Workshop on Machine Learning for Remote Sensing (ML4RS)},
 year={2024}
}

4. Data availability

Promotion of public datasets for Earth observation research.

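Global Forest Watch edu: https://glad.umd.edu/projects/global-forest-watch

US West Coast: team Microsoft, Planet Labs, and The Nature Conservancy. REF: https://www.microsoft.com/en-us/research/wp-content/uploads/2025/03/Global-Renewables-Watch_Caleb-Robinson_2025.pdf DATA0: https://reatlas42216storage.blob.core.windows.net/public/wind_all_2024q2_3_11_2025.gpkg

TerraMesh (https://openaccess.thecvf.com/content/CVPR2025W/EarthVision/html/Blumenstiel_TerraMesh_A_Planetary_Mosaic_of_Multimodal_Earth_Observation_Data_CVPRW_2025_paper.html), part of the FAST‑EO project funded by the European Space Agency (ESA) Φ‑Lab (contract #4000143501/23/I‑DT).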

The FAST-EO project, officially launched on February 5, 2024, is an initiative funded by the European Space Agency (ESA) and led by the German Aerospace Center (DLR). Its primary goal is to advance AI Foundation Models (FMs) for Earth Observation (EO) by exploring large multimodal foundation models through unsupervised and self-supervised learning to address downstream tasks. The project involves a consortium of partners including Forschungszentrum Jülich, KP Labs, and IBM Research, who are collaborating on six key use cases: weather and climate disaster analysis, methane leak detection, forest above-ground biomass change, soil property estimation, semantic land cover change detection, and monitoring the expansion of mining fields into farmlands. The project is committed to open science, planning to release model weights, configurations, datasets, and source code under a free and permissive Apache-2 license, accessible via platforms like GitHub, HuggingFace, and the SpatioTemporal Asset Catalog (STAC).

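Other public training DATA (https://x-ytong.github.io/project/Five-Billion-Pixels.html) for China's interior, by Prof. Dr. Xiaoxiang Zhu, who is based in Germany and heads the department that hosts the TorchGeo project founder

GDAL containers: https://github.com/OSGeo/gdal/pkgs/container/gdal

opendatacube containers: https://github.com/opendatacube/datacube-docker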

https://ceos.org/ard/

SSL4EO-S12-v1: https://github.com/DLR-MF-DAS/SSL4EO-S12-v1.1

http://dataspace.copernicus.eu/ https://eotdl.com

https://github.com/opendatacube

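EUMap client libraries: https://eumap.readthedocs.io/en/latest/

Open Earth Monitor. EU: https://cordis.europa.eu/project/id/101059548/results YouTube channel: https://www.youtube.com/@OpenGeoHubFoundation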


5. Geospatial data infrastructure

Standardized data formats and efficient data access mechanisms, such as STAC (SpatioTemporal Asset Catalog) and OSGeoLive

OpenLandMap: https://openlandmap.org/
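To make the "efficient data access" point concrete, here is a minimal sketch of the JSON body that a STAC API item search (`POST /search`) accepts. It only builds and prints the request body and makes no network call. The collection id, bounding box, dates, and the helper name `stac_search_body` are hypothetical values chosen for illustration; a real catalog defines its own collection ids.

```python
import json
from datetime import date


def stac_search_body(collection: str, bbox: tuple, start: date, end: date,
                     limit: int = 10) -> dict:
    """Build the JSON body for a STAC API item search (POST /search).

    bbox is (west, south, east, north) in WGS 84 degrees; the datetime field
    is a closed RFC 3339 interval written as "start/end".
    """
    if len(bbox) != 4:
        raise ValueError("bbox must be (west, south, east, north)")
    if start > end:
        raise ValueError("start must not be after end")
    return {
        "collections": [collection],
        "bbox": list(bbox),
        "datetime": f"{start.isoformat()}T00:00:00Z/{end.isoformat()}T23:59:59Z",
        "limit": limit,
    }


# Hypothetical query: one month of scenes over a small box around Bonn.
body = stac_search_body(
    "sentinel-2-l2a",          # assumed collection id, varies by catalog
    (7.0, 50.6, 7.2, 50.8),    # assumed bounding box
    date(2025, 6, 1),
    date(2025, 6, 30),
)
print(json.dumps(body, indent=2))
```

Because every STAC API shares this request shape, the same body can be sent to different catalogs by changing only the endpoint URL and the collection id.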


Key points and takeaways:

Specialized ML models: Traditional ML methods can outperform trendy AI approaches in remote sensing tasks.

Public data availability: Public datasets are essential for Earth observation research, enabling collaboration and innovation.

Collaboration opportunities: The conversation likely touched upon the potential for international collaborations across different countries and regions.

Geospatial data infrastructure: Standardized data formats and efficient data access mechanisms are crucial for geospatial research.

Preparation for #bids25 Big Data from Space #osgeo + Pangeo code sprint

The participants' discussion is a preparation activity for an upcoming code sprint, where they will work together to develop innovative solutions using remote sensing and Earth observation datasets. A wiki page summarizing these ideas could be a valuable output from this collaboration.

A Path for Science‑ and Evidence‑based AI Policy: https://understanding-ai-safety.org/

Overall, the conversation highlights the intersection of geospatial technology, machine learning, and open science initiatives in Earth observation, emphasizing the importance of data availability, specialized ML models, and geospatial data infrastructure for advancing research and innovation in this field.