Taxonomy of Generative AI

Generative artificial intelligence encompasses a family of computational systems capable of creating new content—text, images, audio, video, or synthetic data—based on patterns learned from training data. This taxonomy provides a structured framework for understanding the key technologies and their potential applications within the agrifood value chain.

1. Foundation Models and Architecture Families

1.1 Foundation Models

Definition: Large-scale models trained on broad, often multimodal datasets that serve as the basis for many downstream applications.

Key Characteristics:

  • Pre-trained on massive datasets, with model sizes typically in the billions of parameters

  • Generalizable across multiple tasks and domains

  • Usually adapted to specific applications through fine-tuning or prompting

  • High computational requirements for training and inference

Agrifood Applications: Crop monitoring, supply chain optimisation, predictive analytics for yield forecasting, automated quality control systems.

1.2 Transformer Architecture

Definition: A neural network architecture based on attention mechanisms that processes all positions of a sequence in parallel rather than one element at a time.

Key Components:

  • Multi-head attention mechanisms

  • Positional encoding

  • Feed-forward networks

  • Encoder-decoder structures (optional)

Significance: Forms the backbone of most modern language models and many multimodal systems.
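
To make the attention component more concrete, the following is a minimal sketch of single-head scaled dot-product attention in plain Python. It is illustrative only: the token vectors are made-up numbers, the function names are hypothetical, and the learned query/key/value projections and multiple heads used in real transformers are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy self-attention: three token embeddings of dimension 2 (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each query is handled independently of the others, which is why the computation over a whole sequence can run in parallel.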

2. Language-Based Systems

2.1 Large Language Models (LLMs)

Definition: Transformer-based models specifically designed for natural language understanding and generation.

Technical Specifications:

  • Parameter counts ranging from millions to trillions

  • Trained on text corpora spanning multiple languages and domains

  • Capability for few-shot and zero-shot learning
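
The difference between zero-shot and few-shot use can be illustrated with a small prompt-building sketch. The task wording, the example pairs, and the `build_prompt` helper are all hypothetical; no particular model or API is assumed, and the resulting string would simply be sent to whichever LLM is in use.

```python
def build_prompt(task, query, examples=()):
    """Assemble a zero-shot (no examples) or few-shot (with examples) prompt."""
    lines = [task, ""]
    for text, label in examples:
        # Each worked example shows the model the expected input/output format.
        lines += [f"Input: {text}", f"Output: {label}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

# Illustrative task and examples (invented for this sketch).
task = "Classify the crop issue described as 'pest', 'disease', or 'nutrient'."
examples = [
    ("Small holes in leaves and visible caterpillars", "pest"),
    ("Yellowing between leaf veins on older leaves", "nutrient"),
]

zero_shot = build_prompt(task, "White powdery coating on leaves")
few_shot = build_prompt(task, "White powdery coating on leaves", examples)
```

In the zero-shot case the model relies only on the task description; in the few-shot case the worked examples are placed in the prompt itself, with no change to the model's weights.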

Agrifood Applications:

  • Technical documentation generation

  • Regulatory compliance reporting

  • Market analysis and trend identification

  • Customer service automation

  • Supply chain communication

2.2 Small Language Models (SLMs)

Definition: Compact language models optimised for specific tasks or deployment constraints.

Characteristics:

  • Fewer parameters (typically under 10 billion)

  • Domain-specific training

  • Lower computational requirements

  • Faster inference times
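
A rough way to see why parameter count matters for deployment is to estimate the memory needed just to hold the model weights. The sketch below assumes a hypothetical 3-billion-parameter model and common numeric precisions; it ignores activations, caches, and runtime overhead, so real requirements will be higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params, precision):
    """Approximate memory for model weights only, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# Hypothetical 3-billion-parameter model at each precision.
for precision in BYTES_PER_PARAM:
    print(precision, weight_memory_gb(3e9, precision))
```

Under these assumptions, reducing precision from 16-bit to 4-bit cuts the weight footprint by a factor of four, which is one reason quantised small models are attractive for edge devices.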

Agrifood Advantages: Suitable for edge deployment in agricultural IoT systems, offline processing in remote locations, cost-effective implementation for smaller operations.

2.3 Multimodal Large Models (MLMs)

Definition: Systems that process and generate content across multiple modalities (text, images, audio, video).

Capabilities:

  • Image-to-text generation

  • Text-to-image synthesis

  • Cross-modal understanding

  • Integrated reasoning across modalities

Agrifood Applications: Automated crop disease identification through image analysis, quality assessment systems, precision agriculture monitoring, automated reporting systems combining sensor data with visual inspection.

3. Image and Visual Generation Systems

3.1 Generative Adversarial Networks (GANs)

Definition: A framework in which two neural networks, a generator and a discriminator, are trained in competition with each other.

Architecture:

  • Generator: Creates synthetic data

  • Discriminator: Distinguishes between real and synthetic data

  • Training through adversarial loss functions
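
The adversarial objective can be sketched with the two loss functions alone. The example below assumes the binary cross-entropy formulation with the commonly used non-saturating generator loss; other formulations exist (for example Wasserstein losses). The discriminator outputs are made-up probabilities, and the networks themselves are omitted.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) towards 1 and D(fake) towards 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: push D(fake) towards 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that a sample is real).
confident_d = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled_d = discriminator_loss(d_real=[0.6, 0.5], d_fake=[0.5, 0.6])
```

The discriminator's loss is low when it separates real from synthetic samples well, while the generator's loss is low when its samples are scored as real, so each network improves at the other's expense.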

Agrifood Applications:

  • Synthetic crop image generation for training datasets

  • Data augmentation for pest identification systems

  • Simulation of crop growth under different conditions

  • Quality control system training

3.2 Variational Autoencoders (VAEs)

Definition: Probabilistic models that learn to encode data into a latent space and decode it back to the original space.

Technical Features:

  • Encoder-decoder architecture

  • Probabilistic latent space representation

  • Continuous latent space allows for interpolation

  • Principled approach to generation through sampling
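
Sampling from the latent space and interpolating within it can be sketched as follows. The means and log-variances stand in for the outputs of a hypothetical trained encoder; the encoder and decoder networks are not shown, and the function names are illustrative.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, sigma^2) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))

def interpolate(z_a, z_b, t):
    """Linear interpolation between two latent vectors, t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

rng = random.Random(0)  # fixed seed for repeatability
# Made-up encoder outputs for two inputs, latent dimension 2.
z_a = reparameterize([0.0, 1.0], [0.0, -2.0], rng)
z_b = reparameterize([2.0, -1.0], [0.0, -2.0], rng)
midpoint = interpolate(z_a, z_b, 0.5)
```

In a full model, `midpoint` would be passed to the decoder to produce a sample lying between the two inputs, and the KL term would be added to the reconstruction loss during training.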

Agrifood Applications:

  • Crop yield prediction through genetic variation modelling

  • Soil composition analysis and synthesis

  • Quality parameter interpolation

  • Anomaly detection in agricultural systems

3.3 Diffusion Models

Definition: Generative models that learn to reverse a gradual noise addition process to create high-quality samples.

Process:

  • Forward process: Gradually adds noise to data

  • Reverse process: Learns to denoise and generate samples

  • Iterative refinement approach
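
The forward (noising) process has a simple closed form, sketched below. The linear noise schedule, its start and end values, the 1000-step count, and the three-element "image" are assumptions chosen for illustration. The reverse process needs a trained denoising network and is not shown.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, product = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Sample x_t from x_0: sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
           for x, n in zip(x0, noise)]
    return x_t, noise

rng = random.Random(0)  # fixed seed for repeatability
alpha_bars = alpha_bar_schedule(1000)
x0 = [0.5, -0.2, 0.8]  # made-up data values
x_early, _ = add_noise(x0, alpha_bars[10], rng)   # still close to x0
x_late, _ = add_noise(x0, alpha_bars[999], rng)   # almost pure noise
```

As the step index grows, the share of the original signal shrinks towards zero; a denoising model is trained to predict the added noise so that generation can run this process in reverse.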

Advantages: High sample quality, generally more stable training than GANs, and a controllable generation process.

Agrifood Applications:

  • High-resolution satellite imagery enhancement

  • Synthetic agricultural scene generation

  • Climate simulation visualisation

  • Product packaging design

4. Application-Specific Systems

4.1 Chatbots and Conversational AI

Definition: Interactive systems designed for natural language dialogue with users.

Technical Components:

  • Natural language understanding (NLU)

  • Dialogue management

  • Natural language generation (NLG)

  • Context retention mechanisms
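
How the four components fit together can be sketched with a toy rule-based pipeline. Production systems typically use learned models for each stage; the keyword lists, crop names, class name, and response templates here are hypothetical placeholders.

```python
class AdvisoryBot:
    """Toy pipeline: NLU -> dialogue management -> NLG, with retained context."""

    INTENT_KEYWORDS = {
        "pest": ["pest", "insect", "aphid"],
        "price": ["price", "market", "sell"],
    }
    KNOWN_CROPS = ("maize", "wheat", "rice")

    def __init__(self):
        self.context = {}  # context retention: remembered across turns

    def understand(self, text):
        """NLU: keyword-based intent detection and a naive crop lookup."""
        words = text.lower().replace("?", "").replace(".", "").split()
        intent = next((name for name, kws in self.INTENT_KEYWORDS.items()
                       if any(k in words for k in kws)), "unknown")
        crop = next((w for w in words if w in self.KNOWN_CROPS), None)
        return intent, crop

    def reply(self, text):
        intent, crop = self.understand(text)
        # Dialogue management: update state, fall back to the remembered crop.
        if crop:
            self.context["crop"] = crop
        crop = self.context.get("crop")
        if intent == "unknown":
            return "Could you tell me more about the problem?"
        if crop is None:
            return f"Which crop is your {intent} question about?"
        # NLG: template-based response.
        return f"Looking up {intent} information for {crop}."

bot = AdvisoryBot()
first = bot.reply("What is the market price for maize?")
second = bot.reply("Any pest alerts?")
```

The second message does not mention a crop, so the reply depends on the crop stored from the first turn, which is the role context retention plays in a multi-turn dialogue.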

Agrifood Implementation:

  • Farmer advisory systems

  • Technical support automation

  • Supply chain coordination

  • Market information dissemination

4.2 Code Generation Models

Definition: Specialised language models trained to generate, complete, or modify programming code.

Capabilities:

  • Code completion and suggestion

  • Bug detection and fixing

  • Documentation generation

  • Cross-language translation

Agrifood Applications:

  • Agricultural software development acceleration

  • IoT device programming

  • Data analysis script generation

  • Legacy system modernisation
