Posts

AWS Partner: Cloud Economics Accreditation

During the last three months, I have been involved in a migration program to AWS, following the Cloud Adoption Framework and MAP program guidance. Cloud Economics is an inevitable topic to consider, especially during the Assess and Modernize phases. Bearing in mind that I'm an Azure "guy", I wanted to check how far my Azure knowledge of Cloud Economics principles differs from AWS. The answer is - it's (almost) the same, but let's take a look at what Cloud Economics actually means. AWS Cloud Economics AWS Cloud Economics relies on two main areas - business value and cloud financial management. Business value considers the components that advance your customer's business. Usually, it's all about money, but not only. There are 5 pillars to consider here. Cost savings are realized by avoiding on-premises infrastructure with its large fixed spend and by reducing the ongoing variable spend through the economies of scale that AWS offers. Staff producti...

Access to Azure blobs from Azure Function by using Managed Identity

How to grant access to blob files on an Azure Storage Account without providing a SAS token or an Access Key? In some cases, it might be helpful, especially when access to the Azure Storage Account is available via Managed Identity. In that case, we don't want to use any secrets provided explicitly. In fact, we still want to use a SAS token behind the scenes, but one generated on-the-fly, with a very short lifetime. Each SAS token has to be built based on an access key. We could use one of the two access keys provided by the Storage Account, but it's not a good option, as we don't want to deal with them. Azure lets us generate a temporary access key, based on our credentials, and then use it to generate a SAS token. In that case, we need to: Create a Storage Account Create an Azure Function App Assign the Storage Blob Delegator (or Storage Blob Data Owner) role to the Managed Identity of the Function App. The action needs to be performed ...
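The flow above can be sketched in Python with the `azure-identity` and `azure-storage-blob` SDKs - a minimal sketch, assuming the Function App's managed identity already holds the Storage Blob Delegator role; the account, container, and blob names are placeholders:

```python
from datetime import datetime, timedelta, timezone


def sas_validity_window(minutes=10):
    """Return (start, expiry) for a short-lived SAS token."""
    start = datetime.now(timezone.utc)
    return start, start + timedelta(minutes=minutes)


def blob_sas_url(account_name, container, blob, minutes=10):
    """Build a read-only user-delegation SAS URL without touching the
    account's access keys. Requires azure-identity and azure-storage-blob."""
    # Lazy imports, so the pure helper above works without the Azure SDKs.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import (
        BlobSasPermissions,
        BlobServiceClient,
        generate_blob_sas,
    )

    start, expiry = sas_validity_window(minutes)
    account_url = f"https://{account_name}.blob.core.windows.net"
    # DefaultAzureCredential picks up the managed identity inside the Function App.
    service = BlobServiceClient(account_url, credential=DefaultAzureCredential())
    # The user delegation key replaces the account access key when signing the SAS.
    delegation_key = service.get_user_delegation_key(start, expiry)
    token = generate_blob_sas(
        account_name=account_name,
        container_name=container,
        blob_name=blob,
        user_delegation_key=delegation_key,
        permission=BlobSasPermissions(read=True),
        start=start,
        expiry=expiry,
    )
    return f"{account_url}/{container}/{blob}?{token}"
```

With a ten-minute window, a leaked URL becomes useless almost immediately, which is the whole point of generating the token on-the-fly.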

Azure Data Scientist Associate

Just last week, I renewed my Azure Data Scientist Associate certification. Before the first re-attempt, I promised myself to spend some time on preparation. Was it worth it? www.credly.com Of course! I love it. Preparing for a certification forces me to take a closer look at aspects that are outside my daily interest. In my current project, only a limited part of Azure ML was in use - Data Drift, which is a small piece of a bigger picture. During my study, I walked through the whole journey once again, starting from data preparation and ending with model deployment in the cloud. This revealed the mystery of why, since last year, I haven't been able to convince my client to go deeper into that platform, despite the fact that the project itself is data-science-driven and already includes all the juicy parts, like model training (PKL) and data/feature preparation. The answer was simple. Following the bible - "And how can they hear about som...

Podman using Ubuntu WSL2

Docker Desktop for Windows is no longer free for commercial usage, or at least not in many cases. If your local development environment is Windows, and you're looking for an alternative, this article is for you. "Docker Desktop remains free for small businesses (fewer than 250 employees AND less than $10 million in annual revenue), personal use, education, and non-commercial open-source projects." Unfortunately, containerization is a foundation of most enterprise-level projects these days, and their size triggers the licence issue. The commercial licence doesn't cost an arm and a leg, yet it still creates unnecessary concern and limitations. Even if you use a WSL2 Linux image, you need the Docker engine, which is a part of Docker Desktop for Windows. Podman as Docker And here is the beautiful moment when Podman comes into play. It's a rootless and daemon-less solution, fully compatible with Docker. If you have ever been concerned about the daemon working in the...
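On an Ubuntu WSL2 distribution, the switch can be as small as the fragment below - a sketch assuming Ubuntu 20.10 or later, where Podman is in the standard repositories; the alias line is a shell config entry for `~/.bashrc`:

```shell
# Install Podman from the Ubuntu repositories (Ubuntu 20.10+).
sudo apt-get update && sudo apt-get install -y podman

# Let existing scripts and muscle memory keep calling "docker".
# Add this line to ~/.bashrc to make it permanent.
alias docker=podman
```

Because Podman's CLI mirrors Docker's, commands like `docker run` or `docker build` work unchanged through the alias, with no daemon running in the background.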

Azure Purview - to own your data

While exploring the problem of migration to the cloud, I've realized that it would be worth shedding some light on the data-owning issue. This is where Azure Purview comes in. I'm not a Chief Data Officer, but if I were one, I would be asking some fundamental questions about my data - what is the source, what did the ETL/ELT process look like, who owns the data ... and, what is the data lineage..., and finally, what does all that data mean. All those questions are very important these days. They might be triggered by regulatory requirements (data lineage, anonymisation), data scientists' implementations (where is the source dataset used to train my model?), internal auditors, or the business itself, which has struggled for years with defining common domain models. While working for one of my clients, I faced the problem of defining the Product Master application, which would be a golden source of domain model definitions across the company. The issue was too big to be ...

Data Drifting Monitor in Azure

In order to capture suspicious data from external sources, we usually define a set of rules that explicitly examine incoming data and validate it against those rules. But what happens if the data still looks good and stays within the defined frames and schemas, yet something is smelly? Classic approach Let's consider the case of a company that tracks real-estate market changes. If the volume of data coming from an external data provider ramps down, or values go out of bounds, then it is easy to capture that breach by introducing validation rules. For example, if the price of a property is higher than $100M or lower than zero, then such input data (like a file) should be rejected or fixed before processing. The business users may not be happy with some delay, but still ... it's better to be safe than sorry. Now let's consider a case when the average price of a property drifts over time. If one week an average price of a property equals 100k$ and the next week...
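The difference between the two checks can be sketched in a few lines of Python - a minimal illustration, where the price bounds and the 10% drift threshold are made-up example values:

```python
def validate_price(price, lower=0.0, upper=100_000_000.0):
    """Classic rule: reject any record outside a fixed, explicit range."""
    return lower < price <= upper


def has_drifted(last_week_prices, this_week_prices, threshold=0.10):
    """Drift check: flag the feed when the weekly mean moves by more than
    `threshold` (here 10%), even though every single record passes validation."""
    prev = sum(last_week_prices) / len(last_week_prices)
    curr = sum(this_week_prices) / len(this_week_prices)
    return abs(curr - prev) / prev > threshold
```

A week of 125k$ averages after a week of 100k$ averages passes every per-record rule, yet `has_drifted` catches the 25% jump - exactly the kind of "smelly but valid" data the classic approach misses.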

Azure Synapse - first contact

Despite the fact that I've spent the last two years working with the Microsoft Azure cloud tech stack, I hadn't had a chance to work with Azure Synapse yet. Finally, I've met this guy. The first impression is very good. Azure Synapse Workspace To create your first Azure Synapse, you need to create an Azure Synapse Workspace. If you are already familiar with Azure, you may know the Azure Log Analytics Workspace for examining the logs and operations of your applications and resources. Here it's the same. The workspace is created within an Azure resource group, building a landing page for its settings, analytic pools, security, and monitoring aspects. The Azure Synapse Workspace will require a storage account to be created/selected under the hood. The storage account will be used to build Azure Data Lake Storage Gen2 for the files you will manage yourself (sample data for processing) or that Azure Synapse manages itself (e.g., PySpark-trained models will be saved here as well). The better place to deal...
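The workspace-plus-storage setup described above can be scripted with the Azure CLI - a sketch only, where the resource group, workspace, storage account, and credential values are all placeholders to replace with your own:

```shell
# Create the ADLS Gen2 storage account the workspace will use under the hood.
az storage account create \
  --name mysynapsestorage \
  --resource-group my-rg \
  --location westeurope \
  --enable-hierarchical-namespace true

# Create the Synapse workspace on top of it.
az synapse workspace create \
  --name my-synapse-ws \
  --resource-group my-rg \
  --location westeurope \
  --storage-account mysynapsestorage \
  --file-system synapsefs \
  --sql-admin-login-user sqladminuser \
  --sql-admin-login-password '<strong-password-here>'
```

The `--enable-hierarchical-namespace` flag is what turns a plain storage account into Data Lake Storage Gen2; without it, Synapse cannot use the account as its primary storage.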