NASA estimates that its Earth science missions will generate around a quarter million terabytes (roughly 250 petabytes) of data in 2024 alone. That's a ton of satellite data! To help climate scientists and researchers sort through all this raw data efficiently, IBM, Hugging Face, and NASA have teamed up to build an open-source model that will serve as the foundation for new AI tools that track deforestation, predict crop yields, and measure greenhouse gas emissions.
For this project, IBM used its new watsonx.ai to build the base model, training it on a year's worth of NASA's Harmonized Landsat Sentinel-2 (HLS) satellite data. As the name suggests, HLS combines imagery from the NASA/USGS Landsat satellites with imagery from the European Space Agency's two Sentinel-2 satellites, which are designed to capture high-resolution images over land and coastal areas.
As for Hugging Face, it is hosting the model on its open-source AI platform. IBM says that by fine-tuning the model on labeled data for mapping floods and burn scars, the team improved its performance by 15% over the current state of the art while using only half as much labeled data.
"The critical role of open-source tech to speed up discoveries like climate change has never been clearer," said Sriram Raghavan, VP of IBM Research AI. "By combining IBM's work on flexible, reusable AI with NASA's satellite data, and making it available on HuggingFace's leading platform, we can use teamwork to build faster solutions that will improve our planet."
Sources: nasa.gov / engadget.com