ELTIMS — The New Data Acronym

ETL doesn’t cover our modern data needs

ELTIMS encapsulates both Data and MLOps
Data is everywhere | Raconteur

The original acronym - ETL

The ETL Process | BMC Blogs

Get the data in, deal with it later! — ELT

The concept of ELT is quite compelling: rather than having an intermediary system that bottlenecks your data transfer, you move the data downstream and let the destination system deal with it. While this sounds like a mere shift of responsibility, you get a couple of benefits:

  • The compute for conducting transformation is closer to the storage, which usually leads to faster performance.
  • The data storage team now owns this process, so you’re likely to have faster iteration and better collaboration.
  • You have more flexibility on what new views you create, since the data is centralized.
  • Snowflake and Databricks provide storage at cloud provider cost (S3, Blob, etc.) while enabling full analytics capabilities.
  • Snowflake and Databricks decouple compute from storage, and several solution providers are getting closer to enabling true serverless capabilities.
  • Snowflake, BigQuery, Redshift, Azure Synapse, and Databricks support JSON and semi-structured data.
  • The major players listed above allow native transformation tasks and jobs to be run within their data storage procedures.
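To make the load-first, transform-later idea concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 as a stand-in for a cloud warehouse, which is an assumption made purely for illustration; the table name `raw_orders`, the view name `customer_spend`, and the sample rows are all hypothetical. The raw data lands untouched, and the transformation then runs as SQL inside the storage engine rather than in an intermediary system.

```python
import sqlite3

# Hypothetical raw order records, kept as strings exactly as a source system might emit them.
RAW_ORDERS = [
    ("o-1", "alice", "19.99", "2021-03-01"),
    ("o-2", "bob", "5.00", "2021-03-01"),
    ("o-3", "alice", "30.01", "2021-03-02"),
]


def load_raw(conn, rows):
    """Extract + Load: land the data as-is, with no transformation on the way in."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id TEXT, customer TEXT, amount TEXT, order_date TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def transform_in_warehouse(conn):
    """Transform: expressed as SQL that runs inside the storage engine, next to the data."""
    conn.execute(
        """
        CREATE VIEW customer_spend AS
        SELECT customer,
               COUNT(*) AS orders,
               ROUND(SUM(CAST(amount AS REAL)), 2) AS total_spend
        FROM raw_orders
        GROUP BY customer
        """
    )


def run_elt(rows=RAW_ORDERS):
    """Run load then transform against an in-memory database and return the view's rows."""
    conn = sqlite3.connect(":memory:")
    load_raw(conn, rows)
    transform_in_warehouse(conn)
    result = conn.execute(
        "SELECT customer, orders, total_spend FROM customer_spend ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_elt():
        print(row)
```

Because the raw table is kept intact, a new view can be added later without re-ingesting anything, which is the flexibility benefit described in the list above.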

Competitive insight: controlling compute and storage

As we’ve mentioned earlier, controlling both compute and storage was the name of the game. Previous leaders like Teradata and IBM have fallen behind due to their slow start in the cloud and their large base of on-prem legacy customers, which has limited their ability to offer competitive elastic solutions.

Building moats — custom workflows and new formats

As the ELT space matures, major players will need to defend their existing accounts. The best way to do this is to offer customers convenient features that make their lives easier but also increase the switching costs.

  • Serverless Azure Data Factory with dozens of integrations
  • Serverless Glue jobs
  • Snowpipe and custom SQL/UDF tasks in Snowflake
  • Jobs API and cheaper clusters introduced into Databricks
Databricks jobs, built in orchestration | Databricks
Open Source Unicorns | https://tracxn.com/

Limited operations, rising expenses

While the past couple of paragraphs have commented positively on Databricks and Snowflake as case studies, they still have some key limitations that have prevented them from winning the space.

Get some insights, do some modeling — ELTIM

At the end of the day, companies invest in data infrastructure to get insights that can advance their business. Customer, industry, and product trends all need to be understood and made actionable, with data being the key ingredient.

Worldwide spending on big data and business analytics (BDA) solutions is forecast to reach $215.7 billion this year, an increase of 10.1% over 2020, according to a new update to the Worldwide Big Data and Analytics Spending Guide from International Data Corporation (IDC).

With the data market growing quickly and new features being released every day, there will be significant churn in the market as companies continue to invest in their data practices. A fully integrated data platform is especially compelling, as it helps users get their products to market faster, reduces engineering cost, and provides better security.

Serving what you made — ELTIMS

Deploying models to prod | Algorithmia

ELTIMS Is The Future, For Now

We’ve gone through the evolution of data products over the past 40 years, and many things have certainly changed. With machine learning currently the leading frontier for companies' digital transformation, achieving a fully functional ELTIMS is the target state for many companies.


ML Architect @ Voiceflow
