
Memberships

Data Innovators Exchange

Public • 181 • Free

12 contributions to Data Innovators Exchange
Recommendations Needed: Best Resources for Learning Data Governance Strategies
Hi everyone! I'm looking for recommendations on where to learn more about Data Governance, especially from a strategic and requirements perspective, rather than just technical-focused content. Any good courses, training programs, or resources that dive deeper into governance frameworks and best practices? I’d appreciate your insights!
3
6
New comment 2d ago
0 likes • 3d
For all aspects of data management, of which data governance is a part, I would think of www.dama.org
0 likes • 2d
@Lorenz Kindling Not personally, but I know that they are quite active as an organization. I own the DAMA-DMBOK book. It is really a set of principles for data management (like TOGAF for architecture, or ITIL for IT management), so it is quite a boring read :) An AI assistant would probably give you something shorter and equally good to start with. 🤫
Is hashing good enough for anonymising data?
https://www.rnz.co.nz/news/business/527419/inland-revenue-giving-thousands-of-taxpayers-details-to-social-media-platforms-for-ad-campaigns?fbclid=IwY2xjawFLSwJleHRuA2FlbQIxMAABHfiQoZd2lKNuLPKWDo5IrGSrtYtTKNwWBrS0kfJBtccVTTWP9FPKrjY3zg_aem_jp9YxiznNYVeVO5Oo9wqfA
5
7
New comment 2d ago
1 like • 11d
Seems right. What are alternatives?
0 likes • 2d
@Shane Gibson this is an operational solution. I wonder what the technically better solution would be, something like “salted hashing”?
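To make the "salted hashing" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method used in the linked article: the function names, the sample e-mail addresses, and the choice of SHA-256 with an HMAC secret key are all hypothetical. It shows why a plain hash of a low-entropy identifier can be reversed by hashing candidate values, and how a secret key (sometimes called a pepper) blocks that for anyone who does not hold the key.

```python
import hashlib
import hmac
import secrets


def normalize(value: str) -> bytes:
    # Trim and lowercase so the same identifier always hashes the same way.
    return value.strip().lower().encode("utf-8")


def plain_hash(value: str) -> str:
    # Unkeyed SHA-256: anyone can recompute it for a guessed value.
    return hashlib.sha256(normalize(value)).hexdigest()


def keyed_hash(value: str, key: bytes) -> str:
    # HMAC-SHA256 with a secret key: deterministic for the key holder,
    # not recomputable by a party that lacks the key.
    return hmac.new(key, normalize(value), hashlib.sha256).hexdigest()


def guess_from_candidates(target_digest, candidates, hasher):
    # Dictionary attack: hash every candidate and compare with the target.
    for candidate in candidates:
        if hasher(candidate) == target_digest:
            return candidate
    return None


if __name__ == "__main__":
    secret_key = secrets.token_bytes(32)  # hypothetical key, kept private
    email = "jane.doe@example.com"        # hypothetical sample value
    candidates = ["john@example.com", "jane.doe@example.com"]

    # The plain hash is recovered from the candidate list.
    print(guess_from_candidates(plain_hash(email), candidates, plain_hash))

    # The keyed hash is not, because the attacker hashes without the key.
    print(guess_from_candidates(keyed_hash(email, secret_key), candidates, plain_hash))
```

One trade-off to keep in mind: a random salt per record would stop two parties from matching records at all, while a shared secret key only protects the data from those who do not have the key. Neither approach makes the data anonymous in a strict sense; it is pseudonymisation.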
Tools Data Engineers need to know in 2024
There are so many tools out there, but I found this video on YouTube. It did a good job of breaking down the essentials. Here's a quick list of the tools mentioned:
- Basics: SQL, Python, Linux, bash scripting, network understanding
- Technical Basics: Git, SFTP, PGP
- Databases: PostgreSQL, MySQL, MongoDB
- Data Platforms: Snowflake, Databricks, BigQuery, Redshift, Azure Synapse Analytics
- Orchestration, ETL & Data Pipelines: Airflow, Dagster, SSAS, Azure Data Factory
- Cloud: AWS, Azure, GCP
- Others: Docker, Kubernetes, Terraform
If you had to give advice on where to start, what would that be? What are your favored tools?
6
5
New comment 3d ago
3 likes • 5d
I would add to this “data storage formats”: table, json, parquet, delta, iceberg
2 likes • 3d
@Lorenz Kindling Correct. I think that for everyone who works with data, at every level of the data stream, it is important to be at least familiar with basic modeling terminology and concepts.
I got my ticket for the Data Dreamland! You also?
Excited to be presenting alongside @Christof Wenzeritt at Data Dreamland! Main topic is Data Mesh and how to do one. It offers an effective approach for managing large-scale data, but the transition isn't always smooth sailing. Join us when we delve into the challenges of designing ownership structures within a Data Mesh framework. We'll explore how to achieve the optimal balance between technical expertise and organizational design. Let's share our knowledge and navigate the exciting world of Data Mesh together! Get your ticket here: https://scalefr.ee/d7goui
8
2
New comment 8d ago
0 likes • 8d
I own one too! 👍
Data Vault Modeling: Where are you doing it?
At some point in each Data Vault project, the source data is analyzed, and a Data Vault 2.0 model is drafted. I have seen teams draft their Data Vault model in many different places: Excel sheets, drawing tools like Draw.io or Miro.com, dedicated modeling tools, directly inside their automation tools, or on pen and paper. My past experience highlighted the three most important things to consider when deciding where to model:
- Must fit into the tool stack: if your automation tool offers Data Vault modeling, do it in there
- Ease of use: simplicity is important to streamline initial modeling, changes and additions
- Persistent and centralized: all components of the whole model should be stored in a central place, to help other modelers identify what has been done already
Where are you currently designing the Data Vault model? What are your experiences? Let me know!
4
15
New comment 23d ago
0 likes • 23d
@Michael Müller ok!! Seems like the AI models are not ready for the data modeling task yet ;)
2 likes • 23d
@Lorenz Kindling At a high level, it is more oriented toward coding and document analysis. Here you can get some more details: https://zapier.com/blog/claude-vs-chatgpt/
1-10 of 12
Jaroslaw Syrokosz
3
17 points to level up
@jaroslaw-syrokosz-8262
Data Analytics Engineer | Data Modeling Enthusiast ⭐️Certified in: Data Vault 2.0, Snowflake, DBT, Azure, Airflow, Teradata, MicroStrategy

Active 14h ago
Joined Jul 28, 2024
EU