Data Flow of AI

Data Flow is a machine learning pattern that describes the sequence in which data moves through the AI engineering life cycle.

First, data is processed layer by layer, as shown in Fig. 1, to prepare it for storage, training, and other downstream uses.

As it passes through these layers, the data is stored, refined, and prepared for use in machine learning models and applications. From a more functional perspective, the data is then used by different machine learning function groups, as shown below:

Each layer in the above chart is detailed as follows:

Sources

Data sources include:

Company Internal Databases
Company Internal Files
Websites
Public Data
Smartphone Apps
IoT Devices
Commercial Data Aggregators
Point of Sale
Corporate Internal Processes
Social Media
Data Streams

Capture

Capture mechanisms include:

Website Scraping
Website and Smartphone Chat Dialogues
Website and Smartphone Form Submissions
IoT Device Interfaces
Commercial Data Aggregator Feeds
Corporate Internal Process Feeds

Pipeline

Pipeline processes include:

Data Ingestion
Data Temporary Storage
Data Subscription
Data Publication
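
The four pipeline processes above can be illustrated with a minimal in-memory sketch. This is only an assumption about how the steps might fit together, not a description of any particular product: the class `MiniPipeline`, its method names, and the topic name `"pos"` are all hypothetical, and a production pipeline would typically rely on a dedicated message broker rather than Python containers.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List

Record = dict  # hypothetical record type: one captured event


class MiniPipeline:
    """Hypothetical in-memory pipeline: ingest, buffer, subscribe, publish."""

    def __init__(self) -> None:
        # Data Temporary Storage: one FIFO buffer per topic.
        self._buffer: Dict[str, Deque[Record]] = defaultdict(deque)
        # Data Subscription: handlers registered per topic.
        self._subscribers: Dict[str, List[Callable[[Record], None]]] = defaultdict(list)

    def ingest(self, topic: str, record: Record) -> None:
        """Data Ingestion: accept a record and hold it in the buffer."""
        self._buffer[topic].append(record)

    def subscribe(self, topic: str, handler: Callable[[Record], None]) -> None:
        """Data Subscription: register a consumer for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str) -> int:
        """Data Publication: drain the buffer and deliver to every subscriber."""
        delivered = 0
        queue = self._buffer[topic]
        while queue:
            record = queue.popleft()
            for handler in self._subscribers[topic]:
                handler(record)
            delivered += 1
        return delivered


pipeline = MiniPipeline()
received: List[Record] = []
pipeline.subscribe("pos", received.append)        # e.g. a database writer
pipeline.ingest("pos", {"sku": "A1", "qty": 2})   # e.g. a Point of Sale event
pipeline.ingest("pos", {"sku": "B2", "qty": 1})
count = pipeline.publish("pos")
```

Keeping the buffer separate from the subscriber list lets records arrive before any consumer is ready, which is the role temporary storage plays in this layer.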

Databases

Databases include:

Data Lakes
SQL Databases
Document Databases
Graph Databases

ETLs

ETLs include:

Extract Functions: pulling data from selected sources
Transform Functions: normalization, regularization, aggregation
Load Functions: saving data in formats for use in modeling processes
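
The three ETL functions above could be sketched roughly as follows, using only the Python standard library. The sample data `RAW_CSV`, the column names `store` and `amount`, and the function names are hypothetical; min-max scaling is used here as one possible normalization, chosen only for illustration.

```python
import csv
import io
import json
from collections import defaultdict
from typing import Dict, List

# Hypothetical source data standing in for a selected source.
RAW_CSV = """store,amount
north,10
north,30
south,20
"""


def extract(csv_text: str) -> List[dict]:
    """Extract: pull rows from the selected source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows: List[dict]) -> Dict[str, float]:
    """Transform: aggregate amounts per store, then min-max normalize."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["store"]] += float(row["amount"])  # aggregation
    if not totals:
        return {}
    lo, hi = min(totals.values()), max(totals.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all totals match
    return {store: (total - lo) / span for store, total in totals.items()}


def load(features: Dict[str, float]) -> str:
    """Load: serialize features in a format a modeling step can read."""
    return json.dumps(features, sort_keys=True)


features = transform(extract(RAW_CSV))
payload = load(features)
```

In this sketch `load` returns a JSON string rather than writing to a database, so the example stays self-contained; a real load function would write to whichever store the modeling process reads from.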

Models

Model-type category examples include:

Artificial Neural Networks
Decision Trees
Probabilistic Graphical Models
Cluster Analysis
Gaussian Processes
Regression Analysis

Applications

Application examples include:

Medical Diagnosis
Autonomous Vehicles
Chatbot Dialog
Image Recognition
Face Recognition
Product Recommendations
Churn Prediction
Malware Detection
Search Refinement
