
Data Trends & Predictions 2023

The great thing about being a data agency is the diversity of clients we get to work with! We work with clients of all shapes, sizes and sectors, in everything from underwear to elearning, and from dating apps to banking. As such, we have our fingers on the pulse of what is happening in the world of data, which key challenges organisations face, and what might be in store for the next couple of years…

The Great Migration

The number one enquiry we’ve seen this year has come from companies looking to move to new infrastructure. Digital transformation was, of course, greatly accelerated by the pandemic, and now with the advent of AI it is not surprising that companies are reassessing their readiness to adopt new technology, with data infrastructure being key for the long term.

For smaller companies, we’ve seen migration away from “Off-The-Shelf” tools to modern, composable data stacks. “Off-The-Shelf” tools are a great way to get value from your data quickly, but we’ve found that at a certain stage companies naturally outgrow them: an opportunity cost arises because the tools aren’t surfacing detailed, bespoke data insights. Our prediction is that more SMEs will adopt data stacks earlier in their journey as those technologies become more accessible.

For more established companies, let’s talk about messy infrastructure. It’s not at all unusual for architecture to be built up over time, with multiple people coming into a company and adding their own tables, models and so on. The end result is often a company with hundreds of dashboards and hundreds of models, where the data team is focused on firefighting and keeping everything going instead of providing value. Issues arise when business users start to see inconsistencies in the numbers, which damages overall trust in the data. At this stage companies face a big decision: untangle the mess they’ve created, or migrate to new infrastructure.

It can be difficult to create a strong business case for migration. After all, you can often spend months essentially getting to a cleaner version of where you were before. But your company’s data needs are only going to go up, and without clean and scalable infrastructure the long-term costs (and opportunity costs) may be significant. Our prediction here is that data cataloguing and governance tools will become more important, because without strong documentation and processes, data efforts towards AI will struggle.

In terms of infrastructure, we’ve seen a clear shift towards warehouses that bring more flexibility and can accommodate a lakehouse structure (Snowflake, Databricks). Lakehouse architecture uses platforms like Apache Spark and Delta Lake to enable data processing, quality management and transactional capabilities on top of a data lake, allowing you to deal with both structured and raw data. We predict that more warehouses will follow suit with this “All-In-One” offering, and that data governance will become more important as a result. We foresee that many companies will use this raw data capability to hoard data they don’t need, or don’t have the resources to utilise.

The Rise Of Reverse ETL

The second most frequent enquiry we’ve had has been about replacing Customer Data Platforms. CDPs, like Segment and mParticle, act as a centralised solution for marketing teams to gather customer data across channels, making that information easier to access and analyse. The problem with these platforms, clients tell us, is that activating longer-term metrics, such as identifying churn risk, can be challenging inside a CDP. The other problem is that the cost of a CDP increases with the number of events tracked, so as a company grows, the cost can also grow quite substantially.

These cost concerns have been one of the primary motivations for companies looking to move away from these tools and wondering whether their own data stack could act as an alternative. The short answer is yes.

Reverse ETL tools like Census and Hightouch can take your modelled data and pipe it into the end applications that your teams use. This means your teams can dynamically segment users in your CRM, add flags for churn risk and even send LTV signals back into your ad platforms to improve ROI. While we have some clients already deploying Reverse ETL, we feel this solution has gone somewhat under the radar with all the buzz around AI.
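To make the idea concrete, here is a minimal Python sketch of what a Reverse ETL sync does conceptually: it reads rows from a modelled customer table and turns them into the per-customer updates that would be pushed to a CRM. Everything in it is an assumption for illustration only: the `CustomerRow` schema, the 90-day churn threshold and the payload field names are hypothetical, and tools like Census and Hightouch handle this through their own configuration rather than code like this.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CustomerRow:
    """One row of a modelled customer table (hypothetical schema)."""
    customer_id: str
    last_order_date: date
    lifetime_value: float


def churn_risk(row: CustomerRow, today: date, threshold_days: int = 90) -> bool:
    """Flag a customer as at risk if their last order is older than threshold_days."""
    return (today - row.last_order_date).days > threshold_days


def build_crm_payloads(rows, today):
    """Turn warehouse rows into the per-customer updates a sync job would send."""
    return [
        {
            "customer_id": row.customer_id,
            "churn_risk": churn_risk(row, today),
            "ltv": round(row.lifetime_value, 2),
        }
        for row in rows
    ]


# Illustrative rows standing in for the output of a warehouse model.
rows = [
    CustomerRow("c-001", date(2023, 9, 1), 420.50),
    CustomerRow("c-002", date(2023, 3, 15), 89.99),
]
payloads = build_crm_payloads(rows, today=date(2023, 10, 1))
for payload in payloads:
    print(payload)
```

The point of the sketch is that the churn flag and LTV are calculated once, from the warehouse model, and the end applications simply receive the result.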

We predict that more and more companies will begin to see the benefits of reverse ETL tools over CDPs, and this may even push CDP providers to look at easier ways to activate those longer-term metrics.

Fewer Features and More Niche Tools

On the theme of tools, in the past couple of years we have seen data tools attempting to add more and more features. For example, we have visualisation tools that offer some level of data modelling, and a number of Extraction/Loading tools have included some standard transformations for popular data sources (for example, finance reporting from accounting software). The interesting thing is that users have largely rejected these additions!

It seems that the data community values tools that do a specific job, and do it well. While tool providers may hold lofty ambitions of being a ‘single tool that offers everything’, with the pace of technology advancement organisations will fear putting too many eggs in one basket. The number of tool providers today is perhaps overwhelming, and so we predict that there will be a period of consolidation to the point where there are fewer companies, each more focused on a particular solution.

Data Literacy Takes Centre Stage

Every year we run a small survey on data challenges, and every year the number one answer is democratisation. Companies are struggling to become truly data-driven and at the most fundamental level this is down to a barrier of understanding. Data is complex. Start talking about lakehouses and ETL processes to your marketing colleague and see how quickly they turn off! 

It’s often been said that “It’s easier to teach a marketing person SQL, than to teach a data person marketing.” But that is only true if we try to keep data as simple and accessible as possible. We see an immediate need for more courses, workshops and certifications to help bridge that gap. We can also see more non-technical data roles emerging around change management and democratisation: people who understand data and can help translate between the two parties, ensure that projects not only stay on time but also deliver ROI to the business, and spend the time training people on whatever the end data product may be.

Competitive Edge Shifting

Today it is digital natives and scale-ups who have a clear strategic advantage, in that they are able to start fresh with modern data infrastructure and are not held back by monolithic legacy systems and data silos. This is still a major problem for older brands, and in a session at BigData Ldn about half of the room were dealing with this issue. But as we said, the biggest trend we’ve seen this year is companies migrating.

That should sound warning bells if you’re a scale-up. Your competitive advantage will soon be disappearing! We see the next year as a crucial time for these companies to try to stay one step ahead. While demand for data professionals has outstripped supply since 2016, we predict that analytics engineers in particular will be in demand next year.

The AI Hype Train

No 2023 review article would be complete without a mention of AI, and so, last but not least: we have had a lot of clients asking us about the best ways to start incorporating it. The answer in most cases is that there is no easy route. AI adoption is VERY reliant on clean, well-formatted data, and without that bedrock your AI efforts may fall flat.

No doubt we’ve all had some experience in the last year where AI was added in without a strong use-case, was poorly implemented and actually caused issues. Notion seemed to be loading incredibly slowly after they launched their own writing assistant. Did anyone ask for a writing assistant in Notion?

We predict that as more companies look to explore AI features, there will be an emergence of tools like Neptune.ai that can help with the development and implementation of large language models.

AI is scary and exciting, and we can all see that we are only a few years away from something with massive potential, but companies should be wary today. Poor execution and integration of AI can be an expensive mistake, and there is already so much value that companies can get from their data; realising that value should be the first priority.

2023 seems like a transitional year in which most companies have recognised that, in order to facilitate new technologies and a growing data need, they need to adopt a composable data stack, and quickly. While technologies such as AI and Reverse ETL can make activating data easier, there are still big human challenges in democratisation and training.
