
Memberships

Data Innovators Exchange

Public • 181 • Free

Skool Community

Public • 141.1k • Paid

Data Alchemy

Public • 19.8k • Free

15 contributions to Data Innovators Exchange
The ultimate Data Breach nightmare scenario
In 2017, the third largest consumer credit agency in the world was the victim of one of the largest data breaches on record. In this week's Data Radio Show, Ignition's Julien Redmond talks with Equifax's Bob Sparshatt about what happened, what was learned, and how it shaped the wider public's understanding of what data is held on them.
5
1
New comment 1d ago
1 like • 1d
What an interesting episode. Data breaches are happening so frequently these days, and not every company can recover from one. I still see relatively little focus on data security in most projects, as it seems like an expensive thing to do without any short-term value for the business. There is also a huge lack of security knowledge in many data teams. The topic also suffers from the fact that a breach still rarely happens to you, so you don't feel the risk until it's too late. It's important to understand that everyone can be targeted, and basic things like not giving every developer access to basically everything plus admin rights can already make a big difference. In general, as data platforms are increasingly treated as products (which is great), they also need a security concept and control system, just like a good software product would.
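As a minimal sketch of that least-privilege idea, here is what it can look like as Snowflake-style role grants; the role, database and user names are made up for illustration:

-- Hypothetical example: give developers read access to the dev database only,
-- instead of blanket admin rights across the whole platform.
CREATE ROLE IF NOT EXISTS developer_read;
GRANT USAGE ON DATABASE dev_db TO ROLE developer_read;
GRANT USAGE ON ALL SCHEMAS IN DATABASE dev_db TO ROLE developer_read;
GRANT SELECT ON ALL TABLES IN DATABASE dev_db TO ROLE developer_read;
-- Production objects get no grant at all, so they stay invisible by default.
GRANT ROLE developer_read TO USER some_developer;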
Hi from Soumendu Dutta
Hi All - I have just joined after completing the Data Vault 2.0 Bootcamp.
10
9
New comment 3d ago
1 like • 9d
Welcome Soumendu 🙂
1 like • 9d
@Soumendu Dutta That's great to hear! Thanks for the positive feedback 🙂 If you want to dig deeper into the topic, feel free to check out the classrooms here or the Knowledge Hub on our website, and join the Data Vault Fridays with Michael 🙂 And all the best for the certification exam 😉
Taming the Wild West of Distributed Ownership (Data Mesh)
In principle, Data Mesh offers a brilliant approach to managing data at scale, decentralizing ownership while maintaining centralized governance. However, it requires a lot of change in the organization, and without a clear strategy, Data Mesh can easily lead to anarchy, data silos and many more meetings. I'm very much looking forward to taking the stage with @Marc Winkelmann at Data Dreamland to dive into this topic with our presentation: "Data Mesh Governance: Taming the Wild West of Distributed Ownership". I hope to see many of you in Hanover! Sign up here if you want to join: https://scalefr.ee/d7goui #DataGovernance #DataManagement #DataDreamland
8
3
New comment 8d ago
Relational Stage vs. Data Lake in Data Vault—Where Are the Differences?
Relational stages handle structured data with real-time processing and schema validation, while Data Lakes are built for unstructured data, offering flexibility and scalability for large datasets and analytics. Where do you see the biggest differences in how they’re used in your Data Vault setup?
3
3
New comment 9d ago
3 likes • 10d
I'm generally a fan of a Data Lake if it is well structured, because it gives you a more open architecture that handles semi-structured and unstructured data by default. One big drawback, however, is deleting individual records from a Data Lake, e.g. for privacy reasons. Of course that's still possible, but it is difficult. I would love to hear from you or anyone else here which you would prefer, or maybe a mix of both to handle deletions more easily for some data?
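To make the deletion point concrete, a minimal sketch, assuming the lake is fronted by a table format such as Delta Lake or Apache Iceberg; the table and column names are hypothetical:

-- A GDPR-style record deletion expressed against a lake table format.
-- The statement is short, but the engine has to rewrite the affected data
-- files underneath; on raw files without a table format you would have to
-- rewrite those files yourself.
DELETE FROM lake_db.raw.customer_events
WHERE customer_id = '42';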
My 5 Tips when working with Snowflake
Of course there are dozens of tips available for Snowflake, but let me share the ones that came to mind very quickly:

1) Understand how Snowflake stores the data! It uses micro-partitions, organized in a columnar way. Micro-partitions store statistics such as distinct values and value ranges for each column. Your goal should always be to prune as much as possible from both when querying data. For example: only select the columns you really need, and apply filters on columns whose values mostly do not overlap across many micro-partitions. Also think about re-clustering your data if necessary, or creating your own pattern-based value to cluster your data on (usually only necessary for huge amounts of data in one table).

2) When data is spilled to local storage while querying, that is a good indicator that a bigger warehouse makes sense. I assume here that the query itself is already optimized and we are just dealing with a lot of data and maybe complex logic. But keep in mind: increasing the size of the Snowflake virtual warehouse by one step (e.g. M -> L) doubles the cost for the same runtime (calculated per cluster). So if the query time drops to less than 50%, we achieve a win-win: a faster and cheaper result. If the runtime cannot be reduced by 50% or more, you have to decide whether the quicker response is worth the money you now spend.

3) Snowflake's zero-copy clones allow you to test features and fixes against your production data in a very easy and fast way. They should be part of your deployment pipelines.

4) Insert-only loading reduces the number of versions Snowflake has to create for its micro-partitions. Updates and deletes cause this versioning of already existing micro-partitions, which costs time and additional storage. That also means that Data Vault, with its insert-only approach, fits Snowflake's scalability characteristics!

5) The QUALIFY clause improved code writing a lot. It uses the result of a window function as a filter, which means you don't have to write nested sub-queries with self-joins. (A short SQL sketch of tips 1, 3 and 5 follows at the end of this thread.)
11
2
New comment 13d ago
3 likes • 17d
Amazing tips! Thanks for sharing 🙂
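As a minimal sketch of tips 1, 3 and 5 from the post above: the QUALIFY clause, zero-copy cloning and micro-partition pruning are real Snowflake features, but the table, columns and database names here are made up for illustration:

-- Tip 1: select only the columns you need and filter on a column that is
-- well clustered (e.g. a date), so Snowflake can prune micro-partitions.
-- Tip 5: QUALIFY filters on a window function without a nested sub-query.
SELECT customer_id, order_id, order_date, amount
FROM my_schema.orders
WHERE order_date >= '2024-01-01'
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) = 1;

-- Tip 3: a zero-copy clone to test features and fixes against production data.
CREATE DATABASE my_db_test CLONE my_db;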
Christof Wenzeritt
Level 3 • 18 points to level up
@christof-wenzeritt-9987
CEO at Scalefree

Active 23h ago
Joined Apr 11, 2024