Welcome to the Earth Science Information Partners (ESIP) 2018 Summer Meeting! The 2018 theme is Realizing the Socioeconomic Value of Data. The theme is based on one of the goals in the 2015 - 2020 ESIP Strategic Plan, which provides a framework for ESIP’s activities over the next three years.

All Presentations are being added to a Google Folder temporarily and then will be moved to FigShare and linked to the sessions here. 
Tuesday, July 17 • 9:30am - 11:00am
Optimizing Data for the Cloud

Session Description: When data is shared in the cloud, anyone can analyze it without having to download it or store it themselves, which lowers the cost of new product development, reduces the time to scientific discovery, and can accelerate innovation. However, staging large-scale datasets for analysis in the cloud requires consideration of how data should be prepared and organized to allow fast, efficient, and programmatic access from distributed computing systems. This workshop will provide a forum for members of the community to share lessons learned as they explore ways to use the cloud to expand access to data. It seeks to encourage dialog between users interested in leveraging data in the AWS Cloud for research and application development.

Data Optimization for the cloud: Tools and Services (July 17th, 9:30 am – 11:30 am):


Joe Flasher, AWS (10 min)


Dan Pilone, Element84 (10 min)
Title: Interdisciplinary research, heterogeneous data, and the case for Archives of Convenience
Description: Earth science data is measured in petabytes, represents decades of data collection and evolving technology and practices, and provides an unparalleled view of our planet. The pace of change is only accelerating: NASA and other agencies are on their way to making hundreds of petabytes of data available in the cloud, and highly scalable processing and analysis architectures and tools are in active use, with more being developed every day. Each of these brings opportunities for optimization and innovation. This talk demonstrates how to leverage the elastic nature of the cloud, using GOES-16 data to create ephemeral Archives of Convenience that target individual researchers' needs and are optimized for their problems and tool suites, instead of trying to settle on a single "cloud optimized" solution.

Ilya Khamushkin, Intertrust (10 min)
Title: Earth Data for Everyone
Description: At Intertrust, we believe that working with Earth science data should be easy. Too often, file formats, transfer protocols, and cumbersome access interfaces make it difficult for users without domain knowledge to incorporate these data into their workflows. During this session we’ll share our experiences from the past five years building and operating the Planet OS Datahub, our cloud-based data-as-a-service platform.

Marty J. Sullivan, Cornell University (10 min)
Title: The Need for Data Lakes in Climate Science
Description: Climate data is massive. The archive data formats used in the field are difficult to retrieve and analyze, and the data comes from many different sources. Learn how and why Cornell University’s department of Earth & Atmospheric Sciences is moving toward building geospatial data lakes in Amazon S3 and using tools like Amazon Athena.

Sudhir Shrestha, ESRI (10 min)
Title: Scientific Earth Science Data to Cloud Optimized Web Services
Description: Working with earth science data to extract information can be challenging because of its diversity and complexity. In this session, we will demonstrate real-world examples of the successful application of open earth science data in the ArcGIS platform. We will briefly share the workflow for optimized scientific data management (ingesting, managing, analyzing, and sharing) in the cloud and show how you can quickly spin up web applications to share your information products, including analytics, with a larger community. We will share a few use cases, such as NOAA High-Resolution Rapid Refresh (HRRR), Sentinel data, and other webmap applications, that demonstrate how we access large collections of near-real-time data stored on-premises or in the cloud, disseminate them dynamically, process and analyze them on the fly, and serve them to a variety of geospatial applications.

General discussion (10 min)

Breakout groups: focus on tools and services (30 minutes)

(Continue conversation over coffee - 30 minutes)

Speakers & Moderators
Joe Flasher

Open Geospatial Data Lead, Amazon Web Services
Joe Flasher is the Open Geospatial Data Lead at Amazon Web Services helping organizations most effectively make data available for analysis in the cloud. The AWS open data program has democratized access to petabytes of data, including satellite imagery, genomic data, and data used...
Ana Pinheiro Privette

Amazon Sustainability Data Initiative (ASDI) Lead, Amazon
Dr. Ana Pinheiro Privette is a senior program manager with Amazon's Sustainability group and she leads the Amazon Sustainability Data Initiative (ASDI), a Tech-for-Good program that seeks to leverage Amazon’s scale, technology, and infrastructure to help create global innovation...

Tuesday July 17, 2018 9:30am - 11:00am PDT
  Pima, Workshop
  • Subject Jump In, Deep Dive
  • Remote Participation Link https://global.gotomeeting.com/join/752150301
  • Remote Participation Access Code 752-150-301
  • Remote Participation Phone # (646) 749-3129 More phone numbers Australia: +61 2 9087 3604 Austria: +43 7 2081 5427 Belgium: +32 28 93 7018 Canada: +1 (647) 497-9391 Denmark: +45 32 72 03 82 Finland: +358 942 72 1060 France: +33 170 950 594 Germany: +49 692 5736 7317 Ireland: +353 16 572 651 Italy: +39 0 247 92 13 01 Netherlands: +31 202 251 017 New Zealand: +64 9 280 6302 Norway: +47 21 93 37 51 Spain: +34 932 75 2004 Sweden: +46 853 527 827 Switzerland: +41 435 5015 61 United Kingdom: +44 20 3713 5028
  • Tags Cloud Computing, Data Analytics