The following changes include all those made between plantiSMASH 1.0 and plantiSMASH 2.0.

Important notes#

This release increases the minimum python version to 3.9

New features#

Major

Expanded BGC Detection Rules

plantiSMASH version 2 introduces updated and additional biosynthetic gene cluster (BGC) detection rules, significantly expanding the framework’s ability to identify specialized metabolic pathways in plant genomes. These improvements:

  • Broaden the range of detectable cluster types.
  • Increase accuracy in gene grouping and cluster boundary prediction.
  • Enable detection of newly characterized biosynthetic gene families.
  • Detection of repeats in BURP domains (cyclopeptide BGCs).
  • Updated ClusterBlast database with 382 NCBI plant genomes.

Substrate Prediction

This version adds substrate specificity prediction for selected enzyme families, providing functional insight into the metabolic products of plant BGCs. Features include:

  • Prediction of likely substrates based on sequence motifs and known biochemistry.
  • Integration of substrate information in cluster output and visualizations.
  • Enhanced interpretability of BGC functions and potential products.

Regulatory Feature Detection

plantiSMASH 2 enhances regulatory analysis by predicting transcription factor binding sites (TFBS) in the promoter regions of BGC-associated genes. This new functionality:

  • Leverages known plant TFBS motif libraries.
  • Identifies putative regulatory elements linked to biosynthetic gene expression.
  • Facilitates hypotheses about coordinated gene regulation within clusters.

Supported Cluster Types (version 2)#

Rule Min. Generic Domains* Special Domains Required¹
alkaloid 3 At least one of: Bet_v_1, Cu_amine_oxid, Str_synth, BBE, Orn_DAP_Arg_deC, Pyridoxal_deC
fatty_acid 3 At least one of: FA_desaturase, FA_desaturase_2, FA_hydroxylase, CER1-like_C
OR Transferase + ECH_2
OR Transferase + AMP-binding
lignan 3 Dirigent (required)
phenolamide 2 At least one of: Transferase, Pyridoxal_deC
OR Orn_DAP_Arg_deC, Orn_Arg_deC_N
plant 4
polyketide 3 At least one of: Chal_sti_synt_C, Chal_sti_synt_N
OR AMP-binding + Thr_dehydrat_C
saccharide 3 At least one of: Glycos_transf_1, Glycos_transf_2, Glycos_transf_28, UDPGT, UDPGT_2, Glyco_hydro_1, Cellulose_synt
sesterterpene 2 Terpene_synth_C + polyprenyl_synt (both required)
strictosidine_like 2 Str_synth + Pyridoxal_deC (both required)
terpene 3 At least one of: Terpene_synth, Terpene_synth_C, Prenyltrans, SQHop_cyclase_C, SQHop_cyclase_N, PRISE
transporter 3 At least one of: LTP_2, ABC2_membrane, ABC_tran, MatE

* Generic domains:
NAD_binding_4, FAE1_CUT1_RppA, HAD_RAM2_N, Orn_DAP_Arg_deC, Pyridoxal_deC, BBE, FA_hydroxylase, CER1-like_C, ECH_2, Oxidored_FMN, 3Beta_HSD, Glyco_hydro_1, ADH_N, ADH_N_2, Abhydrolase_3, Aldo_ket_red, cMT, nMT, oMT, adh_short, Chal_sti_synt_C, Chal_sti_synt_N, COesterase, UDPGT, Glyco_transf_28, Glycos_transf_1, Glycos_transf_2, Lycopene_cycl, NAD_binding_1, p450, SQHop_cyclase_C, SQHop_cyclase_N, Prenyltrans, Terpene_synth_C, Terpene_synth, Transferase, Aminotran_1_2, AMP-binding, DIOX_N, Dirigent, Bet_v_1, Cu_amine_oxid, Str_synth, Trp_syntA, His_biosynth, adh_short_C2, Peptidase_S10, Prenyltransf, Epimerase, 2OG-FeII_Oxy, Aminotran_3, Methyltransf_2, Methyltransf_3, Methyltransf_7, PRISE, Cellulose_synt, Chalcone, ERG4_ERG24, FA_desaturase, FA_desaturase_2, Methyltransf_11, polyprenyl_synt, SE, SQS_PSY, TPMT, UbiA, Lipoxygenase, Lyase_aromatic, HMGL-like, Chalcone_3, Chalcone_2, Acetyltransf_1, UDPGT_2, GMC_oxred_N, GMC_oxred_C, Amino_oxidase, DAHP_synth_1, DAHP_synth_2

¹ The column “Special Domains Required” reflects whether domains are required all together (AND) or at least one is sufficient (OR).

  • When you see “+” → all domains are required together (AND).
  • When you see “At least one of” → any one is sufficient (OR).
  • Multiple alternative rules are indicated by lines with “OR”.