Announcing Segment Data Lakes, a New Data Architecture for Customer-First Companies

Announcement posted by Segment 09 Sep 2020

Built on the World’s Leading CDP, Segment Data Lakes Provides Businesses with the Foundation Needed for Advanced Analytics, AI and Machine Learning

SAN FRANCISCO, Calif. -- September 8, 2020 -- Segment, the world’s leading customer data platform (CDP), today announced the launch of Data Lakes, a new data architecture product built specifically to help companies create cutting-edge customer experiences with their customer data. Flexible, affordable and easy to use, Segment Data Lakes provides companies with the foundation needed to produce advanced analytics, uncover rich customer insights, and power machine learning and AI initiatives.
 
“We all know that customer data creates a competitive advantage, but what companies sometimes fail to recognise is that the quality of their data architecture will determine just how significant that advantage really is,” said Tido Carriero, Chief Product Development Officer at Segment. “Segment Data Lakes gives data scientists the foundation they need to unlock the full potential of their customer data, so they can build better products and provide world-class customer experiences.” 
 
When Data Warehouses Are Not Enough

Segment Data Lakes builds on the foundation that customer-centric, digitally-driven companies first created through data warehouses. For years, data warehouses have been a critical part of any company’s digital transformation journey, because they provide access to the key data needed to understand the digital customer journey. However, as the amount of customer data being generated continues to grow, and as customer expectations for highly personalised, real-time experiences increase, the limitations of data warehouses have become clear. Though they are valuable within a straightforward data architecture, data warehouses can limit a rapidly growing company’s ability to get the most from its customer data.
 
Data warehouses are not designed with the flexibility that data scientists need to power complex machine learning and AI use cases -- they are often limited to SQL-only tools for analysis. In addition, performance issues and maintenance headaches are common challenges. As a company’s data volumes multiply, the associated cost of storage only adds to the burden of keeping data warehouses running.
 
Uncovering custom insights and fuelling predictive models also requires access to raw customer data, often going back years, as well as detailed granularity at the event level. Data warehouses are not designed to hold this kind of data, leaving data scientists without the historical data sets they need to build and train their models.
 
Although data warehouses are useful for storing recent, structured data, data lakes are the best-in-class solution for storing large quantities of raw information that can then be used to create world-class customer experiences.
 
The Next Generation of Customer Data Architecture

Built specifically for customer data, Segment Data Lakes future-proofs a company’s architecture so its analytics, customer insights, and AI and machine learning initiatives can scale with the needs of the business. Combining the best of data lake architecture and data warehouse architecture, Segment Data Lakes provides companies with a flexible and affordable foundation for advanced customer analytics, data science and machine learning.
 
It is designed to be:
 

  • Clean: Unlike traditional data lakes, Segment Data Lakes is populated with customer data that has already been schematised and optimised, making it clean, accessible and easy for data teams to use.
  • Powerful: Data Lakes helps users get more value from the first-party customer data they collect via Segment, so that it can power predictive analytics and machine learning as well as deeper insights into everything from customer engagement to marketing performance. With access to historical customer data, companies can see accurate year-over-year trends and better understand customer longevity.
  • Affordable: Data Lakes helps companies manage and reduce their storage costs by limiting the amount of data stored in third-party warehouses.
  • Scalable: Data Lakes provides a scalable foundation for a company’s machine learning and AI investments, future-proofing its customer data architecture and preventing vendor lock-in so a company can build a truly best-of-breed tool stack.
  • Efficient: Data Lakes saves engineering teams from having to spend their time building and maintaining an inefficient data architecture, so they can focus instead on building new products and features.
  • Flexible: Data Lakes gives companies the option to decouple compute from storage, and gives data scientists the flexibility to query raw data directly or plug in whichever analytical tools and processing systems best suit their needs, whether that means working in a Jupyter Notebook or loading their data into Databricks or other tools for further analysis.


“Segment Data Lakes has been an absolute game changer for us. In fact, we no longer need a data warehouse, and we can query what we want, when we want, without worrying about costs,” said Casey Kent, Lead Infrastructure Engineer at Rokfin, a Segment customer. “It’s provided us with a foundation that gives us a much deeper understanding of our customers than ever before, and we’ve been able to use these insights to build a truly differentiated experience for our customers. We’ve generated new types of customer insights that we didn’t think were possible.”
 
“Customer data is essential to delivering exceptional products and experiences, but since data lakes are difficult to build and maintain, few businesses have the architecture in place to truly make the most of it,” said Daniel Newman, Principal Analyst at Futurum Research. “As a data lake built specifically for customer data, Segment Data Lakes provides the foundation to unlock complex use cases like machine learning and advanced analytics, so businesses can do more with their data and power deeper personalisation of the customer journey.”
 
Availability
Segment Data Lakes is now available to all Segment customers using Amazon Web Services (AWS). Support for Microsoft Azure and Google Cloud is planned to follow. 
 
For additional details, please visit the blog post about the announcement here: https://segment.com/blog/introducing-segment-data-lakes
 
About Segment:
Segment is the world’s leading customer data platform (CDP). Our platform democratises access to reliable data for all teams and offers a complete toolkit to standardise data collection, unify user records, and route customer data into any system where it’s needed. More than 20,000 companies like Intuit, FOX, Instacart, and Levi’s use Segment to make real-time decisions, accelerate growth, and deliver compelling user experiences. For more information, visit: https://segment.com