The Case for Data Decarbonisation – Part Two

This is the second in a series by Snowflake examining the concept of net zero data and how advances in technology can help the world’s largest organisations, especially emissions-intensive ones such as oil and gas companies, reduce the carbon footprint of their data. For part one, see here.

Fully exploiting the flexibility of cloud computing enables organisations to gain meaningful energy and emissions efficiencies. Unfortunately, the realisation of these benefits is often constrained by data platform architectures designed for fixed-capacity environments. We’ve observed this in nearly all of the analytical database services available on the market today.

These design choices were revolutionary at the time they were made, enabling the massively parallel processing (MPP) techniques required to handle the proliferation of ‘Big Data’ data sets. In today’s context, however, those same choices prevent analytics services from using computing resources efficiently. And less efficient use of CPU increases the energy consumption, emissions footprint, and cost of data operations.

How Net Zero Data Works 

The increased server utilisation that comes from running on co-located resources in the public cloud has a positive impact on emissions, but employing a modern, multi-cluster, shared-data architecture built for the cloud provides additional benefits, specifically:

  • Eliminating the need to transform and process large data sets. Some analytics databases native to the large Cloud Service Provider (CSP) suites require raw semi-structured data files to be transformed into a traditional columnar structure before they are ready for analytical workloads. This transformation requires a great deal of compute power. But new methods for partitioning and indexing highly compressed, cost-efficient data formats (such as Parquet, JSON, or XML) now allow organisations to query that data with the full range of traditional enterprise-grade SQL, with the data structure, or schema, determined on read. This entirely eliminates the energy, emissions, and cost associated with transforming the semi-structured, machine-generated data sets that are so common in the energy industry. These data sets include seismic data in subsurface exploration; IoT data in upstream fossil fuel production, refinery processing, and renewable electricity generation; time-series or other market data in energy trading; and transaction data in forecourt and convenience retail.
  • Eliminating the need to store multiple forms and copies of the same data. By eliminating data transformations, we simultaneously eliminate the need to store pre- and post-processed forms of those data sets. Indeed, the concept of pre- and post-processing is replaced entirely by a single, unified data environment across semi-structured and structured data. By creating software-defined, functional views on top of that single data environment, we can additionally eliminate the need to make copies of data for different use cases, such as data engineering, data science, business intelligence reporting, financial and regulatory compliance reporting, or ad hoc analysis. Fewer copies mean less disk space is required to store that data, which reduces the energy, emissions, and cost requirements of data operations.
  • Reducing the CPU capacity required to run a global, enterprise-grade analytics platform. The shared-nothing architecture underpinning some of the CSPs’ native analytics database offerings requires a one-to-one scaling of compute clusters and database instances, where the number of nodes in any single cluster has a fixed upper bound. Delivering the 24×7 speed and performance required by a large, complex global organisation thus requires peak-capacity CPU resources to be available full-time in order to avoid, for example, the latency caused by boot-up or by many concurrent users. Through a shared-data architecture, however, we can achieve a complete logical decoupling of storage and compute, provisioning resources per second of actual CPU usage and eliminating machines idling on stand-by. No machines on stand-by means radically less energy usage, emissions, and cost attributed to data.
  • Compounding the effects of more efficient data centre design and management. Other non-software-related improvements in the design and management of data centres lead to additional large energy and emissions gains. AWS is contracting more and more power from renewable energy producers, including from BP’s solar energy plants [1]. Microsoft is experimenting with sinking data centres into the ocean [2] to reduce the energy associated with cooling servers. In contrast to many on-premises corporate data centres, CSPs run their cloud operations as profit centres, and the profit motive has naturally accelerated the development of structural data centre designs that yield a significant improvement in energy and cost efficiency. Optimisations made at the database architecture level are likely to compound the effects of more efficient data centre designs, although further research in this area is required to understand to what extent this holds true.
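
To make the third point above more concrete, here is a minimal Python sketch comparing the compute that has to run when peak capacity is kept on full-time with the compute that runs when resources follow actual demand. It is not Snowflake's provisioning or billing logic. The function names and the 24-hour demand profile are hypothetical and chosen purely for illustration, and node-hours stand in as a rough proxy for energy use.

```python
from typing import Sequence


def fixed_capacity_node_hours(hourly_demand: Sequence[int]) -> int:
    """Node-hours consumed when peak capacity stays switched on every hour."""
    if not hourly_demand:
        return 0
    return max(hourly_demand) * len(hourly_demand)


def on_demand_node_hours(hourly_demand: Sequence[int]) -> int:
    """Node-hours consumed when only the nodes actually needed are running."""
    return sum(hourly_demand)


# Hypothetical demand profile (nodes needed in each hour of one day).
# These figures are illustrative assumptions, not measurements.
demand = [2] * 8 + [16] * 8 + [4] * 8

fixed = fixed_capacity_node_hours(demand)  # 16 nodes * 24 hours = 384
elastic = on_demand_node_hours(demand)     # 16 + 128 + 32 = 176
idle_share = 1 - elastic / fixed           # share of fixed capacity spent idle

print(f"fixed: {fixed} node-hours, on-demand: {elastic} node-hours")
print(f"idle share under fixed capacity: {idle_share:.0%}")
```

Under these assumed numbers, more than half of the always-on capacity sits idle. The real saving depends entirely on how spiky an organisation's workload is, so the sketch shows the mechanism rather than an expected result.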

In our third and final post covering net zero data, we’ll explore how one of the largest energy companies in the world can leverage better, faster data to decarbonise their operations.
