Data Engineers: The Hidden Drivers of the Great Data Disruption

Companies’ accumulation of big data was on the rise before COVID-19. When nearly every industry shifted to virtual offerings, opportunities to collect and strategically leverage data and to engineer new data solutions boomed. In the years since, we have seen the marketing and sales chessboard overturned. Data, companies have come to realize, now infuses every solution, every sale, every customer interaction.

Who are the gatekeepers of this power and potential? Data engineers.

Data engineers build the systems that collect, store and analyze companies’ data assets. They construct the worlds that IT leaders and business execs probe to uncover insights and make business decisions. For thirty years or so, there were relatively few changes in the core tenets of data engineering. Those days are gone. Soon, we’ll barely recognize the field.

Data engineering has evolved so quickly that the tools and techniques learned on the first day of university are obsolete by the time many engineers enter the field. Talent wars are underway for the brightest graduates. We are witnessing the dawn of astonishing technologies that can propel organizations to exponential growth. Those that fail to adapt may lose top talent and fall behind.

This is the Great Data Disruption. Data engineers are sitting on a corporate fault line — and the cracks are showing, even if many execs can’t see them yet. The rewards will be great for organizations that pay attention to the tremors and are prepared to adapt.

A Seismic Shift

Data engineering encompasses all the tools and techniques used to turn data into business value. In the past, that included four core skills:

  1. SQL, or structured query language: This is the language used to define and query data in relational database management systems. In a relational database, data is organized into tables that fit together in a highly structured way. Users can query (search) these databases using keys that relate the data elements to one another. For example, tables in a relational healthcare database might show how patient age and location relate to health outcomes like falls.
  2. Data warehouse design: Data warehouses are a type of database that combines historical and current data from a company’s multiple systems. Typically, data warehouses are organized, searchable and relational because they’re meant to help companies analyze patterns in their data over time.
  3. ETL, or “extract, transform and load”: To combine company data from multiple sources in a data warehouse, engineers must extract the data from its original source, transform it into a consistent format so it is interoperable with other data, and load it into the data warehouse.
  4. Reporting and dashboards: Data insights are only valuable if people can understand them. A critical aspect of data engineering is knowing how to create easy-to-use front ends, dashboards and reports to guide business decisions.
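
To make these four skills concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It borrows the healthcare example above; the table names, column names and sample records are hypothetical, and an in-memory SQLite database stands in for a real data warehouse. It extracts rows from two made-up sources, transforms them into a consistent format, loads them into relational tables and runs a report-style query that joins on a key.

```python
import sqlite3

# Hypothetical source extracts; in practice these would come from separate systems.
patients_source = [
    {"patient_id": 1, "age": "82", "city": " columbus "},
    {"patient_id": 2, "age": "45", "city": "Dayton"},
    {"patient_id": 3, "age": "77", "city": "COLUMBUS"},
]
falls_source = [
    {"patient_id": 1, "fall_date": "2023-01-14"},
    {"patient_id": 1, "fall_date": "2023-03-02"},
    {"patient_id": 3, "fall_date": "2023-02-20"},
]


def transform_patient(row):
    """Transform: normalize types and formatting so records from different sources line up."""
    return (row["patient_id"], int(row["age"]), row["city"].strip().title())


def load_warehouse(conn, patients, falls):
    """Load: create the relational tables and insert the transformed rows."""
    conn.execute(
        "CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, age INTEGER, city TEXT)"
    )
    conn.execute(
        "CREATE TABLE fall (patient_id INTEGER REFERENCES patient(patient_id), fall_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?, ?)",
        [transform_patient(p) for p in patients],
    )
    conn.executemany(
        "INSERT INTO fall VALUES (?, ?)",
        [(f["patient_id"], f["fall_date"]) for f in falls],
    )
    conn.commit()


def falls_by_city(conn):
    """Report: join on the patient_id key and count falls for patients aged 65 and over."""
    sql = """
        SELECT p.city, COUNT(f.fall_date) AS falls
        FROM patient AS p
        LEFT JOIN fall AS f ON f.patient_id = p.patient_id
        WHERE p.age >= 65
        GROUP BY p.city
        ORDER BY p.city
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load_warehouse(conn, patients_source, falls_source)
report = falls_by_city(conn)
print(report)  # [('Columbus', 3)]
```

In a real project, each of these steps would likely be handled by a dedicated tool, and the final query would feed a dashboard rather than a print statement.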

Historically, data engineers have focused on one or two of these areas, with many specializing in a single tool. A major industry shift has been the growing expectation that data engineers should be adept in a multitude of tools and techniques. Today, data engineering skills should include, but are not limited to:

  • NoSQL databases, which store collections of nested data structures rather than tables and columns. Nested data structures group varied types of data, such as transactions or master data, inside a single record. Querying this data often requires application code or a database-specific query API rather than traditional SQL.
  • A wide array of additional languages, such as Python, JavaScript and Scala, and the many programming-centric tools that use them.
  • The fundamentals of cloud computing infrastructure.
  • New analytics toolsets like log analytics and streaming analytics.
  • A test-driven development (TDD) approach to data loading and management.
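
Two of the skills above, querying nested data in code and taking a test-driven approach to data handling, can be illustrated together. The following is a minimal sketch rather than a recommended implementation: the order documents, field names and the order_totals function are all hypothetical, and plain Python dictionaries stand in for records in a document store. In a test-first workflow, the test class would be written before the function it exercises.

```python
import unittest

# Hypothetical nested documents, shaped like records in a document store.
orders = [
    {
        "order_id": "A1",
        "customer": {"name": "Ada", "region": "East"},
        "items": [
            {"sku": "X", "qty": 2, "price": 5.0},
            {"sku": "Y", "qty": 1, "price": 12.5},
        ],
    },
    {
        "order_id": "B2",
        "customer": {"name": "Lin", "region": "West"},
        "items": [{"sku": "X", "qty": 4, "price": 5.0}],
    },
]


def order_totals(documents):
    """Walk each nested document in code and return {order_id: order total}."""
    return {
        doc["order_id"]: sum(item["qty"] * item["price"] for item in doc["items"])
        for doc in documents
    }


class OrderTotalsTest(unittest.TestCase):
    """In a test-first workflow, these tests are written before order_totals."""

    def test_totals_sum_every_line_item(self):
        self.assertEqual(order_totals(orders), {"A1": 22.5, "B2": 20.0})

    def test_order_without_items_totals_zero(self):
        empty = [{"order_id": "C3", "customer": {"name": "Sam"}, "items": []}]
        self.assertEqual(order_totals(empty), {"C3": 0})


# Run the tests programmatically so the script does not exit early.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderTotalsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The design choice worth noting is that the expected behavior, including the edge case of an order with no items, is pinned down in tests first, so later changes to the loading logic can be checked automatically.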

The most important skill for data engineers today is the ability to quickly learn and apply new technologies. The highest-compensated engineers will no longer be those lauded for years of experience in a single technology. The greatest rewards will go to those who can assess an array of tech options and apply the best solution(s) to the business problem at hand.

The question is, will IT leadership be able to keep up with the pace of change?

Implications for IT Managers

IT managers have traditionally blanched at the idea of their developers exploring innovative tools and approaches. They have often viewed the pursuit of new technology as a diversion from productivity.

If mid- and upper-level managers hope to remain competitive, however, they will need to shake this old way of thinking. Here are three reasons why:

1. Technology is streamlining innovation.

Guiding a data solution through an innovation lifecycle, such as the phases of product development, once meant years of uncertainty, countless person-hours and massive investments. Today, innovation processes do not have to be so complicated.

What’s changed? With cloud computing, it’s now fast and relatively inexpensive to trial new tools and solution approaches. Many technologies are available ready to use with only a few clicks or come pre-installed in containers. There is now an entire ecosystem of documentation and demo implementations available at little or no cost. Open-source technologies often give direct access to the community of developers who created them.

In the past, engineers often dedicated themselves, either by choice or instruction, to one technology. Moving forward, the fastest way to advance your bottom line will be to have engineers explore and innovate with multiple technologies. It will pay to support engineers’ continuing education, send them to conferences, let them demo products and experiment with system disruptions, and then see how quickly they can stitch together a better business approach.

2. New models mean new opportunities.

Any classically trained data engineer would know the term “Kimball.” This foundational model for data warehouse design was popularized by Ralph Kimball in 1996, with Margy Ross co-authoring later editions of his book. Until recently, the Kimball model was a widely known and often used data modeling technique, but now it feels a lot like a dinosaur.

Why? You can now easily amp up processing power in the cloud, meaning it’s not necessary to optimize data models for compute. With tools like Power BI, Power Query and Analysis Services, analysts can explore data from any source. The data warehouse is no longer the only place to find useful data.

And the solutions keep coming: we’ve seen clients use Azure Machine Learning Studio to realize the potential of AI and machine learning. Robotic process automation (RPA) tools can automate and even improve frustrating or time-sinking tasks. We’ve seen clients build front-end applications to predict and respond to customer demand. These are only some of the solutions that data engineers can build or leverage.

3. IT is now a profit center.

In the past, IT was viewed organizationally and financially as a cost center. Today, IT teams, and specifically data engineers, genuinely create business value.

IT used to be the department that kept the lights on for operations, finance and other departments. Today, IT is operations. It is finance. It is marketing. The solutions IT produces generate customer insights, differentiate companies, streamline processes, reduce spending and more.

The business-saving solutions IT generates cannot exist without the data engineers who conceive them. Data engineers can contribute material financial improvements to your bottom line — if you give them the time, resources and flexibility to do so.

The Risks of Avoiding Change

At Techvedic, we’ve seen around 50 percent of IT leaders embrace the Great Data Disruption. These organizations attract outstanding talent. The value extracted from their data skyrockets as engineers harness it for predictive insights and innovations.

This future is not reserved for the largest and most sophisticated companies alone. Early adopters range from low-to-mid-market firms to those that dominate markets. They span every industry, including logistics, insurance, healthcare and more.

On the other hand, we’ve seen organizations steadfastly hold on to the past. Transitions to cloud computing may not be a priority. Traditional data warehousing is the most IT delivers. AI is considered out of reach. There are many reasons for resisting new technologies — budget, risk, infrastructure, staffing. These concerns are valid, but they should be seen as obstacles to overcome, not final answers.

Companies that fail to evolve risk falling behind on two fronts:

  • Competition for top talent: Emerging data engineers won’t be satisfied with traditional warehouse work. Their horizons are expanding, and many want more. They want to use programming and data analytics skills, now present in most college curricula, in their everyday work. Companies that offer professional growth alongside the newest solutions will attract the newest talent.
  • Competition for market share: Likewise, every market has a host of companies competing for market share. In the past, data engineers delivered internal reports and analyses. Today, they are creating external commercial advantages that help set companies apart.

Embracing Disruption

The bottom line: companies that hold on to traditional approaches will struggle to compete and fail to attract new talent. Leaders can embrace the Great Data Disruption by investing in data engineering and data engineers’ careers.

Build your talent pool by offering attractive recruitment packages. Be purposeful about retention as well: listen to your engineers’ career aspirations, professional development interests and financial needs. Tailor retention packages accordingly.

Lastly, encourage engineers to pursue innovative and disruptive techniques. Give them permission to test new, cloud-based solutions. Collaborate with them, each step of the way, to ensure IT strategies are tied to business strategies and add meaningful value to the organization.
