Taxonomy of Generative AI
Generative artificial intelligence encompasses a family of computational systems capable of creating new content—text, images, audio, video, or synthetic data—based on patterns learned from training data. This taxonomy provides a structured framework for understanding the key technologies and their potential applications within the agrifood value chain.
1. Foundation Models and Architecture Families
1.1 Foundation Models
Definition: Large-scale models trained on broad, diverse datasets (often spanning several modalities) that serve as the basis for multiple downstream applications.
Key Characteristics:
Pre-trained on massive datasets, with model sizes typically in the billions of parameters
Generalizable across multiple tasks and domains
Usually adapted to specific applications through fine-tuning or prompting
High computational requirements for training and inference
Agrifood Applications: Crop monitoring, supply chain optimisation, predictive analytics for yield forecasting, automated quality control systems.
1.2 Transformer Architecture
Definition: A neural network architecture based on attention mechanisms that processes all elements of a sequence in parallel rather than one step at a time.
Key Components:
Multi-head attention mechanisms
Position encoding
Feed-forward networks
Encoder and decoder stacks (a model may use either one or both)
Significance: Forms the backbone of most modern language models and many multimodal systems.
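To make the attention component concrete, the following is a minimal sketch of single-head scaled dot-product attention in NumPy. It is an illustration only, not a full transformer layer: the sequence length, model width, and the random projection matrices w_q, w_k, and w_v are all hypothetical values chosen for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def scaled_dot_product_attention(q, k, v):
    """Return the attended values and the attention weights.

    q, k, v are 2-D arrays of shape (seq_len, d_k).
    """
    d_k = q.shape[-1]
    # Every position is compared with every other position in one matrix product,
    # which is what allows the sequence to be processed in parallel.
    scores = q @ k.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical toy input: 4 tokens, each an 8-dimensional embedding.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))

# Hypothetical projection matrices; in a trained model these are learned.
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

output, weights = scaled_dot_product_attention(x @ w_q, x @ w_k, x @ w_v)
print(output.shape)          # one output vector per input position
print(weights.sum(axis=-1))  # each row of weights sums to 1
```

Multi-head attention repeats this computation with several independent sets of projections and concatenates the results.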
2. Language-Based Systems
2.1 Large Language Models (LLMs)
Definition: Transformer-based models specifically designed for natural language understanding and generation.
Technical Specifications:
Parameter counts ranging from millions to trillions
Trained on text corpora spanning multiple languages and domains
Capability for few-shot and zero-shot learning
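The difference between zero-shot and few-shot use can be sketched as prompt construction: a zero-shot prompt contains only the task description and the query, while a few-shot prompt also includes a handful of worked examples. The helper build_prompt, the task wording, and the example farm reports below are hypothetical and serve only to illustrate the idea; no particular model or API is assumed.

```python
def build_prompt(task, examples, query):
    """Assemble a plain-text prompt from a task, optional examples, and a query."""
    lines = [task, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical task and examples, invented for illustration.
TASK = "Classify each farm report as 'pest', 'disease', or 'weather'."
EXAMPLES = [
    ("Aphids are covering the underside of the bean leaves.", "pest"),
    ("Grey mould is spreading across the strawberry beds.", "disease"),
]
QUERY = "Hail damaged the apple blossom overnight."

zero_shot = build_prompt(TASK, [], QUERY)       # no examples in the prompt
few_shot = build_prompt(TASK, EXAMPLES, QUERY)  # two examples in the prompt

print(few_shot)
```

In both cases the model's weights are unchanged; only the text supplied at inference time differs.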
Agrifood Applications:
Technical documentation generation
Regulatory compliance reporting
Market analysis and trend identification
Customer service automation
Supply chain communication
2.2 Small Language Models (SLMs)
Definition: Compact language models optimised for specific tasks or deployment constraints.
Characteristics:
Fewer parameters (typically under 10 billion)
Domain-specific training
Lower computational requirements
Faster inference times
Agrifood Advantages: Suitable for edge deployment in agricultural IoT systems, offline processing in remote locations, cost-effective implementation for smaller operations.
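A rough way to see why parameter count matters for edge deployment is to estimate the memory needed just to hold the model weights. The sketch below assumes common numeric formats (32-bit and 16-bit floats, 8-bit and 4-bit integers); these formats and the example model sizes are assumptions for illustration, and the estimate ignores activations, caches, and runtime overhead.

```python
# Assumed storage cost per parameter for some common numeric formats.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params, precision):
    """Approximate memory (in GB, 1 GB = 1e9 bytes) to store the weights only."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9


# Hypothetical model sizes, chosen only to show the scaling.
for label, n_params in [("1B", 1_000_000_000), ("7B", 7_000_000_000)]:
    for precision in BYTES_PER_PARAM:
        size = weight_memory_gb(n_params, precision)
        print(f"{label} parameters at {precision}: about {size:.1f} GB")
```

Under these assumptions, a model with 7 billion parameters stored as 16-bit floats needs about 14 GB for its weights, whereas a model with 1 billion parameters stored as 4-bit integers needs about 0.5 GB, which is closer to what a small embedded device can hold.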
2.3 Multimodal Large Models (MLMs)
Definition: Systems that process and generate content across multiple modalities (text, images, audio, video).
Capabilities:
Image-to-text generation
Text-to-image synthesis
Cross-modal understanding
Integrated reasoning across modalities
Agrifood Applications: Automated crop disease identification through image analysis, quality assessment systems, precision agriculture monitoring, automated reporting systems combining sensor data with visual inspection.
3. Image and Visual Generation Systems
3.1 Generative Adversarial Networks (GANs)
Definition: A framework consisting of two neural networks competing against each other—a generator and a discriminator.
Architecture:
Generator: Creates synthetic data
Discriminator: Distinguishes between real and synthetic data
Training through adversarial loss functions
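The adversarial losses can be illustrated with a small numerical sketch. It assumes the standard binary cross-entropy formulation, with the commonly used non-saturating loss for the generator; other GAN variants use different losses. The discriminator scores below are made-up numbers, not outputs of a trained network.

```python
import numpy as np


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Loss is low when real samples score near 1 and synthetic samples near 0."""
    return -(np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps)))


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: low when synthetic samples score near 1."""
    return -np.mean(np.log(d_fake + eps))


# Hypothetical discriminator outputs (probability that a sample is real).
d_real = np.array([0.90, 0.80, 0.95])        # scores on real images
d_fake_early = np.array([0.10, 0.20, 0.05])  # synthetic images are easy to spot
d_fake_later = np.array([0.50, 0.60, 0.45])  # synthetic images are more convincing

print("generator loss, early:", generator_loss(d_fake_early))
print("generator loss, later:", generator_loss(d_fake_later))
print("discriminator loss, early:", discriminator_loss(d_real, d_fake_early))
print("discriminator loss, later:", discriminator_loss(d_real, d_fake_later))
```

As the synthetic samples become harder to distinguish, the generator loss falls and the discriminator loss rises, which is the competition the two networks are trained on.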
Agrifood Applications:
Synthetic crop image generation for training datasets
Data augmentation for pest identification systems
Simulation of crop growth under different conditions
Quality control system training
3.2 Variational Autoencoders (VAEs)
Definition: Probabilistic models that learn to encode data into a latent space and decode it back to the original space.
Technical Features:
Encoder-decoder architecture
Probabilistic latent space representation
Continuous latent space allows for interpolation
Principled approach to generation through sampling
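Two of the features above, sampling from a probabilistic latent space and interpolating within it, can be sketched without training a network. The example assumes the usual Gaussian formulation, in which an encoder outputs a mean and a log-variance per latent dimension; the encoder outputs used here are hypothetical numbers, and the decoder is omitted.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * epsilon, with epsilon drawn from a standard normal."""
    std = np.exp(0.5 * log_var)
    return mu + std * rng.standard_normal(mu.shape)


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))


def interpolate(z_a, z_b, steps):
    """Evenly spaced points on the straight line between two latent vectors."""
    alphas = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - alphas) * z_a + alphas * z_b


rng = np.random.default_rng(0)

# Hypothetical encoder outputs for two inputs in a 2-dimensional latent space.
mu_a, log_var_a = np.array([1.0, -0.5]), np.array([-1.0, -1.0])
mu_b, log_var_b = np.array([-0.8, 0.7]), np.array([-1.2, -0.9])

z_a = reparameterize(mu_a, log_var_a, rng)
z_b = reparameterize(mu_b, log_var_b, rng)
path = interpolate(z_a, z_b, steps=5)

print(path)  # each row could be passed to a decoder to produce an intermediate sample
print(kl_to_standard_normal(mu_a, log_var_a))
```

Because the latent space is continuous, every row of path is a valid latent point, which is what makes interpolation between two encoded samples possible.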
Agrifood Applications:
Crop yield prediction through genetic variation modelling
Soil composition analysis and synthesis
Quality parameter interpolation
Anomaly detection in agricultural systems
3.3 Diffusion Models
Definition: Generative models that learn to reverse a gradual noise addition process to create high-quality samples.
Process:
Forward process: Gradually adds noise to data
Reverse process: Learns to denoise and generate samples
Iterative refinement approach
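The forward process can be written in closed form, which the following sketch illustrates. It assumes a linear noise schedule of the kind used in many denoising diffusion models; the schedule values, the number of steps, and the toy 8x8 "image" are assumptions made for the example. The reverse process would require a trained network and is not shown.

```python
import numpy as np


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factor for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def add_noise(x0, t, alpha_bar, rng):
    """Jump directly to step t: x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * noise."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar(num_steps=1000)

# Hypothetical toy image with pixel values in [-1, 1].
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))

x_early, _ = add_noise(x0, t=10, alpha_bar=alpha_bar, rng=rng)
x_late, _ = add_noise(x0, t=999, alpha_bar=alpha_bar, rng=rng)

print("signal kept at step 10: ", np.sqrt(alpha_bar[10]))
print("signal kept at step 999:", np.sqrt(alpha_bar[999]))
```

Early steps keep almost all of the original signal, while the final step is close to pure noise. In a common training setup, a network learns to predict the added noise from x_t and t, and generation then runs the denoising steps iteratively from random noise.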
Advantages: High image quality, training that is generally more stable than adversarial training, and a controllable generation process.
Agrifood Applications:
High-resolution satellite imagery enhancement
Synthetic agricultural scene generation
Climate simulation visualisation
Product packaging design
4. Application-Specific Systems
4.1 Chatbots and Conversational AI
Definition: Interactive systems designed for natural language dialogue with users.
Technical Components:
Natural language understanding (NLU)
Dialogue management
Natural language generation (NLG)
Context retention mechanisms
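One simple form of context retention is a sliding window that keeps only the most recent turns of a dialogue and passes them to the language model with each new request. The class name ConversationMemory, the window size, and the sample dialogue are hypothetical; production systems may instead use summarisation or retrieval over longer histories.

```python
from collections import deque


class ConversationMemory:
    """Keeps the most recent dialogue turns; older turns are dropped automatically."""

    def __init__(self, max_turns):
        self._turns = deque(maxlen=max_turns)

    def add(self, role, text):
        self._turns.append((role, text))

    def context(self):
        """Return the retained turns as plain text, oldest first."""
        return "\n".join(f"{role}: {text}" for role, text in self._turns)


# Hypothetical advisory dialogue, invented for illustration.
memory = ConversationMemory(max_turns=4)
memory.add("farmer", "My maize leaves have yellow streaks.")
memory.add("advisor", "How long have the streaks been visible?")
memory.add("farmer", "About a week, mostly on the lower leaves.")
memory.add("advisor", "Has the field been fertilised this season?")
memory.add("farmer", "Not yet.")

# Only the four most recent turns remain; the first message has been dropped.
print(memory.context())
```

The window size trades off how much earlier context the system can draw on against the amount of text sent to the model on each turn.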
Agrifood Implementation:
Farmer advisory systems
Technical support automation
Supply chain coordination
Market information dissemination
4.2 Code Generation Models
Definition: Specialised language models trained to generate, complete, or modify programming code.
Capabilities:
Code completion and suggestion
Bug detection and fixing
Documentation generation
Cross-language translation
Agrifood Applications:
Agricultural software development acceleration
IoT device programming
Data analysis script generation
Legacy system modernisation
Research by: Beatriz Vallina, PhD
Thesis Supervisors: Roberto Cervelló, Prof.PhD & Juan José Lull, PhD
Institution: Doctorate in Agrifood Economics, Universitat Politècnica de València
© 2025. All rights reserved.