Data lake: how Red Hat maintains data quality across multiple Drupal sites

Session Category: Development & Performance
Room: 182
Audience: Intermediate
Time Slot: Sat 2:00pm to 2:45pm (2/18/23)

Data accuracy and consistency are important goals for any organization.

Maintaining data quality across multiple websites and applications (Drupal or otherwise) becomes complex when different teams manage the same data in multiple systems. A shared pool of data is an attractive way to resolve some of these issues and allow for greater transparency and consistency across an organization. But creating a scalable, reliable, and useful system brings its own challenges.

Join us as we explore several ways that Red Hat is using a data lake architecture to share data between different Drupal sites.

We’ll cover:

  • What is a data lake?
  • The benefits, challenges, and considerations of using a data lake.
  • Several ways Red Hat has integrated a data lake architecture with Drupal.
  • Lessons learned along the way.

About the Speakers

April Sides

Senior Software Engineer at Red Hat

Asheville, NC

I am a Senior Software Engineer (Drupal Back-end Developer) at Red Hat. My hobbies are saying "yes" to too many volunteer opportunities and going on "adventures" with my step-granddaughter. I am a philosopher in the void.