Data ingestion options for Azure Machine Learning workflows
Data ingestion is the process of extracting unstructured data from one or more sources and preparing it for training machine learning models. It is time consuming, especially when it is done manually on large amounts of data from multiple sources. Automating this effort frees up resources and ensures that your models use the most recent and applicable data.
In this article, you learn the pros and cons of the data ingestion options available with Azure Machine Learning.
The following options are available:
- Azure Data Factory pipelines
- Python SDK for Azure Machine Learning
Azure Data Factory
Azure Data Factory provides native support for data source monitoring and for triggers that start data ingestion pipelines.
The following table summarizes the pros and cons of using Azure Data Factory for your data ingestion workflows.
| Pros | Cons |
| --- | --- |
| Specifically built to extract, load, and transform data | Currently offers a limited set of Azure Data Factory pipeline tasks |
| Lets you create data-driven workflows that orchestrate data movement and transformations at scale | Expensive to build and maintain. For more information, see the Azure Data Factory pricing page. |
| Integrated with various Azure tools such as Azure Databricks and Azure Functions | Does not run scripts natively; it relies on separate compute to run them |
| Natively supports data ingestion triggered by the data source | |
| Data preparation and model training are separate processes | |
| Embedded data lineage capability for Azure Data Factory dataflows | |
| Provides a low-code user interface for non-scripting approaches | |
These steps and the following diagram illustrate Azure Data Factory's data ingestion workflow.
1. Pull the data from its sources.
2. Transform the data and store it in an output blob container, which acts as data storage for Azure Machine Learning.
3. With the prepared data stored, the Azure Data Factory pipeline calls a machine learning pipeline that receives the prepared data for model training.
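The workflow steps above amount to a pull, transform-and-store, then trigger sequence. The sketch below mimics that flow using only the Python standard library; it does not call Azure Data Factory or any Azure API. The sample `SOURCES`, the in-memory `blob_container` dict, the path `prepared/data.csv`, and the `train_pipeline` function are all hypothetical stand-ins for the real sources, the output blob container, and the machine learning pipeline.

```python
import csv
import io

# Hypothetical source data, standing in for files pulled from external systems.
SOURCES = {
    "sales_a.csv": "id,amount\n1, 10 \n2,20\n",
    "sales_b.csv": "id,amount\n3,30\n",
}


def pull(sources):
    """Step 1: read every source into a single list of row dicts."""
    rows = []
    for text in sources.values():
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows


def transform_and_store(rows, container, path):
    """Step 2: clean the rows and write them as CSV into the container."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "amount"], lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Normalize types and strip stray whitespace from the raw values.
        writer.writerow({"id": int(row["id"]), "amount": float(row["amount"].strip())})
    container[path] = buffer.getvalue()
    return path


def train_pipeline(container, path):
    """Step 3 target: a placeholder 'training' run that reads the prepared data."""
    reader = csv.DictReader(io.StringIO(container[path]))
    amounts = [float(r["amount"]) for r in reader]
    return {"rows_seen": len(amounts), "mean_amount": sum(amounts) / len(amounts)}


# The dict plays the role of the output blob container (an assumption for this sketch).
blob_container = {}
prepared_path = transform_and_store(pull(SOURCES), blob_container, "prepared/data.csv")
result = train_pipeline(blob_container, prepared_path)
```

The training function receives only the container and the path, which mirrors the separation noted in the table: data preparation finishes and persists its output before training starts.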
Learn how to build a data ingestion pipeline for machine learning with Azure Data Factory.
Python SDK for Azure Machine Learning
You can use the Python SDK to integrate data ingestion tasks into an Azure Machine Learning pipeline step.
The following table summarizes the pros and cons of using the SDK and an ML pipeline step for data ingestion tasks.
| Pros | Cons |
| --- | --- |
| Configure your own Python scripts | Does not natively support triggering on data source changes; requires a Logic App or Azure Functions implementation |
| Data preparation runs as part of every model training run | Requires development skills to script the data ingestion |
| Supports data preparation scripts on various compute targets, including Azure Machine Learning compute | Does not provide a user interface for creating the ingestion mechanism |
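The first drawback in the table means that something outside the pipeline has to notice a change in the source data and start a run. As a rough illustration of that idea only (it is not how Logic Apps or Azure Functions are implemented or configured), the sketch below fingerprints the source content and invokes a callback when the fingerprint changes. The `ChangeTrigger` class and its `on_change` callback are hypothetical names introduced for this example.

```python
import hashlib


class ChangeTrigger:
    """Calls `on_change` when the content fingerprint differs from the last one seen."""

    def __init__(self, on_change):
        self._on_change = on_change
        self._last_digest = None  # no content observed yet

    def check(self, content: bytes) -> bool:
        digest = hashlib.sha256(content).hexdigest()
        if digest == self._last_digest:
            return False  # nothing new, so do not start the pipeline
        # The first observation also counts as a change in this sketch.
        self._last_digest = digest
        self._on_change(content)
        return True


# Hypothetical usage: record the size of each payload that would start a run.
runs = []
trigger = ChangeTrigger(on_change=lambda data: runs.append(len(data)))
fired = [trigger.check(b"v1"), trigger.check(b"v1"), trigger.check(b"v2-longer")]
```

In a real deployment the callback would submit the pipeline run, and the check would be driven by a schedule or a storage event rather than by direct calls.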
In the following figure, the Azure Machine Learning pipeline consists of two steps: data ingestion and model training. The data ingestion step covers tasks that can be done with Python libraries and the Python SDK, such as extracting data from local or web-based sources and applying data transformations like missing value imputation. The training step then uses the prepared data as input to your training script to train your machine learning model.
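The two-step structure described above can be sketched in plain Python. This is a minimal illustration using only the standard library, not the Azure Machine Learning SDK. The function names `ingest_step`, `train_step`, and `run_pipeline`, the sample records, and the one-parameter least-squares "model" are all hypothetical stand-ins chosen to keep the example small.

```python
from statistics import mean

# Hypothetical raw records; None marks a missing value.
RAW_ROWS = [
    {"feature": 1.0, "label": 2.0},
    {"feature": None, "label": 4.0},
    {"feature": 3.0, "label": 6.0},
]


def ingest_step(rows):
    """Data ingestion step: impute missing feature values with the column mean."""
    observed = [r["feature"] for r in rows if r["feature"] is not None]
    fill = mean(observed)
    # Build new dicts so the raw input is left unmodified.
    return [
        {**r, "feature": fill if r["feature"] is None else r["feature"]}
        for r in rows
    ]


def train_step(rows):
    """Training step: fit label = slope * feature through the origin by least squares."""
    num = sum(r["feature"] * r["label"] for r in rows)
    den = sum(r["feature"] ** 2 for r in rows)
    return num / den


def run_pipeline(rows):
    """Run the two steps in order, passing the prepared data to training."""
    prepared = ingest_step(rows)
    return prepared, train_step(prepared)


prepared_rows, slope = run_pipeline(RAW_ROWS)
```

Because ingestion is a step inside the pipeline, the imputation reruns on every training run, which matches the "data preparation as part of every model training run" entry in the table.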