Back to the Basics of Everything Data — Technology Expansion, Snowflake, and the Modern Data Cloud
In the second of a two-part series, David and Ajay cover a wide variety of topics on technology expansion, challenges in the modern data cloud, data warehouses, data governance, and more!
“To me, it’s just been an incredible paradigm, and the journey with Snowflake has been really amazing.”
— David Hrncir
A Quick Recap
Hashmap, an NTT DATA Company, recently attended Snowflake Summit 2022. As an Elite Snowflake partner, Hashmap has a great working knowledge of Snowflake, and the capabilities of the Snowflake Data Cloud help us to succeed in our belief that we can do data better, together.
At Snowflake Summit, Hashmap had the perfect opportunity to present a live episode of Hashmap on Tap, a data podcast that covers a wide variety of topics in data science. Hashmap on Tap has also featured several guests from Snowflake, along with discussions of its capabilities, its impact on the Data Cloud, and more!
This Snowflake-centered, exclusive version of Hashmap on Tap was presented by David Hrncir, Regional Technical Expert at Hashmap, and Ajay Bidani, Digital Enablement and Insights Manager at Powell Industries (and a Hashmap client).
In the previous article, David and Ajay explore several topics of data culture including data strategy and how it compares to data culture, why a data-driven mindset matters, and leadership’s critical role in organizational success. If you haven’t checked it out yet, I recommend doing so before continuing on — there are plenty of insights that you will find valuable!
In this follow-up article, David and Ajay will explore technology expansions, challenges in the modern data cloud, a debate on the state of data warehouses, data governance, and more. So without further ado, let’s dive in and see what insights they have to offer.
“I’ve been seeing quite a bit where there is a high degree of data culture, you tend to get a high degree of data governance. When there is a low degree of data culture, the converse is true.”
— David Hrncir
Technology Expansion
Have you heard of the ADKAR methodology? (Yes — another acronym, we know, but they work). If you haven’t heard of the terminology, David says that you may want to subtly send the definition to the leadership team. So what is it and what is it used for?
ADKAR Methodology
- Awareness
- Desire
- Knowledge
- Ability
- Reinforcement
David shares that it’s a methodology used for building data culture within your organization. This is critical when thinking about the modern data cloud and all that it encompasses, such as data integration and the modern data warehouse.
Challenges of the Modern Data Cloud
The data cloud is a “big deal” to Ajay — he says that it has “permanently changed my outlook” on the way data is done. In his experience, two of the biggest challenges are transformation and modernization. When it comes to Snowflake, the ideas of being able to scale, remove friction, or not worry about building infrastructure have been game-changers. Just a few years back, these were roadblocks faced by many organizations that made it difficult to make regular progress, but Ajay offers simple advice: “Continue to look further ahead.”
Snowflake and the Data Cloud
David recalls when we first partnered with Snowflake and shares his first thought when seeing Snowflake’s architecture, “This seems so simple, extremely logical…and makes so much sense.” In his opinion, building for the cloud, particularly when it comes to Snowflake and the Snowflake Data Cloud, has changed the mindset around what we can do right now and how we can do it.
Data Warehouses: In or Out?
“What are we doing differently because it’s the data cloud and not just a data warehouse?” — Ajay Bidani
It’s important to recognize the differences between the data cloud and a data warehouse. Many people mistake data warehousing for the data cloud, but the two are not interchangeable. As Ajay puts it, “data warehouse” is a term to describe where the data will go as part of an essential repository.
David and Ajay both agree that they still use the term “data warehouse” to some degree because it is a vital capability of the data cloud. David uses the term to describe a type of “workload” within an enterprise. For Powell, Ajay says they use the terms “data warehouse” or “enterprise data warehouse” to denote workloads built in the past, whereas “data cloud” is used for present and future workloads.
While the jury is still out on the terminology, David and Ajay’s perspectives help to shed light on technology expansion.
Innovation
What factors help to drive innovation, especially when it comes to the data cloud?
Experimentation is certainly one of them, and Ajay offers some valuable insight into doing it right. He advises that one of the best things is to remain practical about the progression and know how quickly things can move. It’s more important to start off at a slow pace and make sure people have bought into it and understand the necessary processes. Once this mindset is in place, the organization can participate to drive innovation and pick up the pace to achieve the desired outcome.
In David’s opinion, “innovation is the key” when talking about anything in data science, especially when it comes to Snowflake, data warehouses, tools, etc. Innovation is helping to drive new workloads and technologies in order to expand the datalink concept — which is making tsunami-size waves in data science. It’s imperative to understand the technology, business, and leadership in order to make decisions that will serve your goals — if something is not the right fit, move on to find something that is.
One example of this innovation at work is the ELT process. ELT has been a “source of positive things” for Ajay at Powell. Some of the benefits are reducing complexity and having the ability to bring in more data without having to stack processing in numerous places. There are some challenges that come along with it, such as increased volume of data and different types of data. However, in Ajay’s world, these are good challenges to have due to the potential opportunities that await.
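To make the ELT idea a little more concrete, here is a minimal sketch in Python. It is not how Powell or Hashmap implement ELT; it simply assumes a tiny in-memory SQLite database standing in for the warehouse, and the table names (`raw_orders`, `sales_by_region`) and sample rows are made up for illustration. The point is the ordering: the data lands untouched first, and the transformation happens afterwards, inside the database, in one place.

```python
import sqlite3

# Hypothetical source rows: everything arrives as text, exactly as extracted.
RAW_ORDERS = [
    ("1", "south", "120.50"),
    ("2", "north", "80.00"),
    ("3", "south", "19.50"),
]


def load_raw(conn, rows):
    """Extract + Load: land the rows unchanged in a raw table."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, region TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(conn):
    """Transform: cast and aggregate inside the database, after loading."""
    conn.execute(
        """
        CREATE TABLE sales_by_region AS
        SELECT region,
               SUM(CAST(amount AS REAL)) AS total_amount,
               COUNT(*) AS order_count
        FROM raw_orders
        GROUP BY region
        """
    )


def run_elt():
    """Run load then transform, and return the curated rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    load_raw(conn, RAW_ORDERS)
    transform(conn)
    rows = conn.execute(
        "SELECT region, total_amount, order_count "
        "FROM sales_by_region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_elt():
        print(row)
```

Because the raw table is kept as loaded, new transformations can be added later without going back to the source system, which is one way to read Ajay’s point about not stacking processing in numerous places.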
How can I make business decisions with innovation in mind?
Ajay’s advice here is to find the right tooling, the right people, and the right process to get up and running as quickly as possible. However, just because you have great tools and know what’s possible doesn’t mean things will happen fast — the last thing you want is for the business to get caught flat-footed.
In the past, it was common to jump to a data decision, for example, deciding to waterfall a process and run it as a DI (data integration) project. The challenge here is that things have shifted in the market today: data governance is now growing in popularity, primarily because of the influence of time-to-market expectations.
Data Governance
“If you don’t have a data governance policy, odds are you ‘truly’ do not know your data.”
— David Hrncir
So what’s the deal with data governance? Why is this such an important topic in data right now?
To put it simply, David explains that in today’s data space, many data governance “issues” stem from having vast numbers of data sets derived from tunnel vision based on agility. One or more parts of the organization want to expand and create more features (which is great), but this typically leaves other parts of the organization wondering about some of the “-ITYs” such as accountability, security, validity, traceability, quality, discoverability, usability, and observability.
While Ajay states that Powell handles data governance “pretty well”, he also admits that discoverability is their biggest challenge. In his experience, the “quality” and “usability” of data go down when consumers are unable to determine which dataset to use for a particular purpose, even though access is fully granted. Additionally, this leads to “shadow” or “independent” governance models/policies, causing further data confusion.
“A lot of times companies store data just to store data and make it available. ‘We’ve got to capture it — we need it’. But is it [being applied] towards business value through governance?”
— David Hrncir
David says that data governance has been a key topic in most client discussions he has had over the last year or two. He states that if any of the below questions are being asked, you may not have a data problem — you most likely have a data governance problem.
- What dataset should I use?
- Is there sample data for this dataset?
- What are the data sources for this dataset?
- How was this data curated?
- Who is the data owner?
- Who should or should not be seeing this data?
- How often is the data updated/maintained?
- What is the purpose of this dataset?
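One way to picture the questions above is as fields on a catalog entry: if the entry is filled in, most of the questions answer themselves. The sketch below is only an illustration in Python. The `DatasetRecord` class, its field names, and the example values are all hypothetical and do not come from any particular governance tool. The first question, “What dataset should I use?”, is left out of the check because it is answered by searching across entries rather than by a single field.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry; one field per governance question."""
    name: str
    purpose: Optional[str] = None
    owner: Optional[str] = None
    sources: List[str] = field(default_factory=list)
    curation_notes: Optional[str] = None
    refresh_schedule: Optional[str] = None
    allowed_roles: List[str] = field(default_factory=list)
    sample_location: Optional[str] = None


# Maps each field to the question it answers.
QUESTIONS = {
    "purpose": "What is the purpose of this dataset?",
    "owner": "Who is the data owner?",
    "sources": "What are the data sources for this dataset?",
    "curation_notes": "How was this data curated?",
    "refresh_schedule": "How often is the data updated/maintained?",
    "allowed_roles": "Who should or should not be seeing this data?",
    "sample_location": "Is there sample data for this dataset?",
}


def unanswered_questions(record):
    """Return the questions this entry cannot answer yet."""
    return [
        question
        for field_name, question in QUESTIONS.items()
        if not getattr(record, field_name)  # None or empty list
    ]


if __name__ == "__main__":
    entry = DatasetRecord(
        name="sales_by_region",
        purpose="Regional sales reporting",
        owner="sales-team",
    )
    for question in unanswered_questions(entry):
        print("Still open:", question)
```

A check like this could run whenever a dataset is published, so gaps show up before consumers start asking.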
Agility and speed are great for DI processes, but that’s only half the battle. If consumers are unable to find, gain access to, and understand the data that’s available, how effective is an organization in terms of data?
“Metadata management tooling is servicing many challenges involving observability and discoverability.”
— Ajay Bidani
Ajay admits that data governance can be “a little intimidating” but oftentimes, looks can be deceiving and it’s not as scary as it seems. In his experience, it can be easy to overlook the minute details, but it’s imperative to be mindful so these details don’t slip through the cracks.
Staying vigilant about the details and processes of data governance helps build a level of trust between the business and technology teams. Ajay says that “Having more things that you can actually see [in terms of datasets] will make a big difference in trust.” Increasing visibility of these details is crucial in helping individuals understand why data should be trusted, and in turn, why they can trust their own teams and other teams within their organization.
David wraps this up by saying that many companies view data governance initiatives as, “What is required for us to do legally?” While that is one aspect, data governance is truly a key component of your comprehensive data strategy. Not only is governance implemented to avert data breaches and theft, for example, but it should also be implemented to promote the effectiveness of data use, data culture, and innovation.
Did you know? Hashmap has many data governance vendor partners and has recently performed some benchmark studies on data governance vendors. If you’d like to hear more about our findings, please contact us — we’d love to share our insight with you!
Frameworks
“Data mesh has a lot of promise to it.”
— Ajay Bidani
Data Mesh
In the world of data and data hybrids, data governance isn’t the only buzzword on the rise; data mesh is appearing pretty frequently, too. So what exactly is data mesh? According to David, “it’s a mind shift.” Think SOA/MSA++.
Core Principles of Data Mesh:
- Domain-driven architecture
- Data as a Product
- Self-Service Infrastructure as a Platform
- Federated Computational Governance
David gives an example in which the sales team are the data owners of a set of data. They write it, curate it, work with it, etc. (or they lead the projects for the data engineers that do). This subset of data is built for use within the organization: the sales team is not building data products simply because they want them; rather, they want everyone to use them. There is a demand/need.
In order for people to use the sales team’s data products, the organization needs to have a self-service mechanism (platform). This self-service platform ensures that those who need to use data as a product have a quick way to get to it. Finally, the last piece that feeds into this is federated access, which governs who can access the sales team’s data.
With these four basic principles in place, everybody that needs the data should be able to get to it quickly within an organization. David goes on to say that “if they can’t, you’re probably not doing it correctly.” He compares data mesh to the concept of an SOA coupled with DI (data integration): very dependent on development technology, but with technology-agnostic consumption.
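David’s sales-team example can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a data mesh implementation: the `DataProduct` and `Catalog` classes, the role names, and the sample rows are all hypothetical, and the per-product role check is a deliberate simplification of federated computational governance. It shows a domain team publishing a data product, consumers finding it through a self-service catalog, and an access rule deciding who may read it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class DataProduct:
    """A dataset owned and published by a domain team (hypothetical model)."""
    name: str
    domain: str                       # owning domain, e.g. "sales"
    allowed_roles: Set[str]           # simplified access policy
    read: Callable[[], List[dict]]    # the product's consumption interface


class Catalog:
    """Toy self-service platform: publish, discover, and consume products."""

    def __init__(self):
        self._products: Dict[str, DataProduct] = {}

    def publish(self, product):
        self._products[product.name] = product

    def search(self, domain):
        """List the product names owned by a domain."""
        return sorted(
            p.name for p in self._products.values() if p.domain == domain
        )

    def consume(self, name, role):
        """Return the product's rows if the caller's role is permitted."""
        product = self._products[name]
        if role not in product.allowed_roles:
            raise PermissionError(f"role {role!r} may not read {name!r}")
        return product.read()


def sales_pipeline_rows():
    # Made-up rows standing in for data the sales team curates.
    return [{"account": "Acme", "stage": "won", "value": 1000}]


if __name__ == "__main__":
    catalog = Catalog()
    catalog.publish(
        DataProduct(
            name="sales_pipeline",
            domain="sales",
            allowed_roles={"analyst", "marketing"},
            read=sales_pipeline_rows,
        )
    )
    print(catalog.search("sales"))
    print(catalog.consume("sales_pipeline", role="analyst"))
```

The consumer only ever calls `consume`, so the sales team is free to change how the rows are produced, which is roughly what technology-agnostic consumption means here.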
When thinking about domain teams, they each serve a different purpose. For example, the sales and marketing teams are going to be doing their day-to-day operations while concurrently developing data products for the organization. This allows the sales and marketing teams to be data owners and enables them to present what they have to IT [and other departments within the organization] in order to partner and work together on new ideas. At the same time, this takes some of the work off of IT’s plate, and instead of keeping new ideas in one domain team, it encourages the collaboration of ideas across different teams in the organization.
Ajay agrees, adding that “getting IT out of trying to solve everything for everyone is definitely something we need more of.” He admits that the challenges don’t get any easier to solve and they’re never really removed; one challenge is just traded off the list for another one. In a way, it’s sometimes like taking two steps forward and one step back.
He shares that Powell is working towards engaging the business to be a part of solving problems, as opposed to simply relying on technology teams to solve everything. The key to success here, in his opinion, is to find a way to take these steps consistently with people who are interested and invested in making it happen.
Data Mesh vs. Data Fabric
New ideas, such as data mesh, are changing the paradigm of how data projects are done today compared to the last 30 or 40 years, spreading out responsibilities across interdepartmental teams in an organization. In thinking about these ideas, how does the concept of data mesh contrast with data fabric?
David explains that “data fabric is a much simpler concept in terms of development.” He explains that while they are similar, data mesh is about normalizing data products where consumption is through developed APIs while data fabric is about virtualizing data — where it sits. He says that with data fabric, “You create data as a service (DaaS) very similar to data products, but it’s a virtualized DaaS.” He continues to explain this by saying “You’re still creating and promoting data curation throughout the organization, you’re just not developing the consumption APIs.”
David deepens this explanation by saying “data mesh is more organization-oriented with heavy development where data fabric is more technology-oriented allowing the virtualization tool to perform all the heavy lifting.” Data fabric allows easier access and consumption of the data (typically), but it’s going to be less refined (as data products). In his opinion, “data fabrics are going to be a little bit easier to start with and implement versus data meshes.”
Ajay agrees that there are upsides and downsides to both and that data fabric relies more on technology whereas data mesh relies more on development. In his opinion, it comes down to the talent of your people when estimating what it would take to implement a data mesh or data fabric successfully.
“Hybrids of these two concepts are going to slowly become the norm. ‘Hybrids’ because we want fast/simplistic data exchange, but also have the ability to run high-end analytics across possibly massive datasets.”
— David Hrncir
Data Applications
No matter how many data warehouses, data fabrics, data meshes, or hybrids exist, consumption and consumption agility will be in focus. This brings David and Ajay to their next (and final) section on data applications (or data apps).
“You’re changing the paradigm of what IT does versus what the business does/needs — you’re going to have crossover.”
— David Hrncir
“How can I get quick access or quick data?” This seems like an easy question to ask, but the answer is one that clients most likely don’t want to hear. The problem with this, Ajay explains, is that “we don’t quite have the way to give it to them — yet. The ways to do it easier or faster are just not quite there yet.”
These self-serve technologies are being developed and released, and Snowflake is joining in by taking it to the next level, such as with their new Snowsight features. The Snowsight UI now allows you to build and share dashboards, utilize worksheets to access, analyze, and manage your data, monitor activity, administer your account, and do anything and everything ‘data sharing’ via the Snowflake Marketplace. Additionally, the Snowflake Marketplace is no longer just data: providers are now able to publish consumable data apps. This allows consumers to instantly visualize what they need without having to first heavily interpret the data.
“Anything that gets people access to insights faster is definitely what we’re about.
Is it something I’m excited about? Absolutely.”
— Ajay Bidani
David also recently performed a demo with another of Hashmap’s partners — ThoughtSpot. ThoughtSpot is embracing this drive for data apps through the use of self-serve as well as embedded, live analytics (ThoughtSpot Everywhere). The demo involved embedding live, natively drillable visualizations into a React app hosted on a network outside of ThoughtSpot. David shares “these companies are taking this [data apps] to the next level and redefining consumption agility. It’s a new era.”
Wait…That’s a data app?
“Data is truly critical for any organization processes and the goals you’re trying to hit.
I consider it to be a major risk to go at anything without putting data first.”
— Ajay Bidani
Be honest, who did a little shopping in the Amazon app during Prime Day(s)? (No judgment here because I definitely did.)
I hate to break this to you — well, technically you can blame David since I’m simply the messenger — but if you think the Amazon app is simply an ordering portal, it isn’t. It’s also a data app. In the case of Amazon’s app, David explains that, “They’re giving you instant feedback on things you want…and things you might possibly want/need based on your actions and selections.”
In fact, there are quite a few apps being used in everyday life that qualify as data apps — we’re just not thinking about them in those terms. As a frequent blood donor, David recalls when he realized he has a data app for tracking his blood donations. Then, there are grocery apps, coupon apps, company portals, landing pages…you name it. In his opinion, the ability to get access to data quickly via these data apps is what breeds data culture — turning that ‘useless’ data (you had to be there) into ‘useful’ information.
“One of the key messages we are trying to convey is to have ‘leadership’ promote data culture. Effective, data-driven leadership is paramount to an organization’s success. Data apps will also be a factor in that success.”
— David Hrncir
As for Ajay, he thinks that “data apps are probably the thing that will change a lot of what people want from us (data engineers).” In his opinion, change is inevitable when it comes to data apps, and it’s important to skill up and provide that agility when needed. David adds that “it’s going to change the paradigm of how we think about consumption agility going forward. Exciting times are ahead!”
Wrap-Up
I’d like to think that if JFK worked in data today, his famous quote would go something like, “Ask not for what your data can do, but for what you can do with your data!”
David and Ajay covered a number of fascinating perspectives that do pose the question of what you can do to help your organization be successful when it comes to data and data culture. If you want to hear more from David and Ajay, you can read their info below and check out links to listen to their previous appearances on the Hashmap on Tap podcast.
What was your biggest takeaway from this article? Let me know in the comments! Finally, you can also learn more about how Hashmap and Snowflake can help you do data better, together here.
About the Hosts
David Hrncir — Regional Technical Expert at Hashmap, an NTT DATA Company
David is a Data Cloud Expert at Hashmap, an NTT DATA Company, and is an enterprise data technologist and Snowflake enthusiast with an affinity for helping organizations build their data strategy roadmap. He’s been in the application development and data engineering game for over 27 years and has experience in numerous verticals.
Listen to David on these episodes of Hashmap on Tap:
- Episode #70: The Facebook Insights Show Featuring Fivetran, Snowflake, & Looker
- Episode #71: The Updated Snowflake Estimator with Sam and David
- Episode #80: The Cloud Data Platform Benchmarking Show with Chinmayee, David, and George
- Episode #83: The Data Ops Show with Chinmayee and David
- Episode #88: Leaning In On Semi-Structured Data with Chinmayee and David
- Episode #100: Hashmap on Tap 2021 Flight: A Sampling of Our Favorite Podcast Moments
Ajay Bidani — Digital Enablement and Insights Manager at Powell Industries
Ajay is a seasoned data veteran performing technical alignment for Powell working with numerous business functions to provide time-saving automation and optimization with a focus on creating scalable architectures. He has an affinity for driving business value through data stack and business application roadmap development.
Listen to Ajay on these episodes of Hashmap on Tap:
- Episode #29: Leadership Perspectives with Ajay Bidani from Powell Industries
- Episode #68: Manufacturing & Data with Ajay Bidani at Powell Industries
- Episode #97: Data Integration, Fivetran, and HVR at Powell Industries with Ajay Bidani
🚨 Subscribe to Hashmap on Tap on Spotify, Apple Podcasts, & Google Podcasts. 🚨
Ready to Accelerate Your Digital Transformation?
At Hashmap, an NTT DATA Company, we work with our clients to build better, together. We are partnering with companies across a diverse range of industries to solve the toughest data challenges and design and build data products — we can help you shorten time to value!
We offer a range of enablement workshops and assessment services, data modernization and migration services, and consulting service packages for building new data products as part of our service offerings. We would be glad to work through your specific requirements. Connect with us here.
Holly Hilton is a Marketing Associate at Hashmap, an NTT DATA Company. She writes blogs (and other content) for Hashmap and is a co-producer for the Hashmap on Tap Podcast. When she’s not busy crafting content, she can most likely be found with a YA novel or Xbox controller in her hand, ready for the next adventure.