How machine learning can help us conserve orchid seeds

Learn how we are using machine learning to tell us if our orchid seeds in the MSB are still alive, and how this will benefit us in the future.

By Abigail Tucker, Pablo Gómez Barreiro and Dr Alice Hudson

A 20p coin is next to tiny orchid seeds, the coin is far larger

You usually wouldn’t think of artificial intelligence (AI) and botany as things that go together, but what if we told you that this is already the case in our labs here at Kew and Wakehurst?

AI can have its downsides: it has a big impact on the environment because of the enormous amount of power and components required to run it. However, in this project, we are using a form of AI called machine learning to aid in the conservation effort.

In fact, we have multiple ongoing projects that are using AI to help us in our aim to conserve the planet's flora. This read and watch will tell you about one that we are currently running at the Millennium Seed Bank (MSB).

an aerial view of the Millennium Seed Bank buildings at Wakehurst — Wakehurst MSB, Visual Air © RBG Kew

AI or machine learning?

There are subtle differences between AI and machine learning. AI is an umbrella term for systems that are made to mimic human intelligence for tasks like making decisions, problem solving and reasoning. The aim is to complete complex tasks with little human input so they can be done faster and made more accessible. Think of things like self-driving cars and virtual assistants.

Machine learning is a type of AI that is mostly used to find patterns or relationships in existing data. It then uses these patterns to make predictions on new data that it hasn’t seen before. Machine learning is normally used to improve efficiency for a specific task over time as it ‘learns’ from new data fed into it. Facial recognition is an example of a machine learning model.

Creating machine learning models requires a lot of data and fine tuning. This is where the human input comes in: we have to gather and input the data, then train, optimise and deploy the model.

What we’re doing in this project is training our model to outline and label objects in an image. In this case, those objects are seeds.

8 small, violet coloured orchid flowers on a single stem — One of the many orchid species tested and imaged in this project: *Laelia rubescens* © A.McRobb

But why do we need machine learning for this, and how can it help?

Our project

When storing seeds at the MSB, we need to know how many of them are viable. We want to be able to grow new plants from them in the future, so there’s no point in storing dead seeds! Usually, we check how healthy our seeds are by trying to germinate a sample of them. However, for some species, such as orchids, this can be tricky. Orchid seeds are exceptionally small and are very particular about when and how they germinate.

In the wild, orchid seeds form symbiotic relationships with mycorrhizal fungi which help them get the nutrients they need. We don’t have these fungi, so our only option is a special medium containing the correct nutrients. Even once the seeds have all the right conditions, it can take years for anything to happen.

Orchid seeds are spread by the wind, which is why they're so tiny. Varying when each seed germinates protects the population from being wiped out by a single catastrophic event, because seeds are much tougher than young plants.

Unfortunately, these very traits which help orchids survive in the wild also make them some of the most complicated seeds we work with at the MSB.

A 20p piece is used for scale showing tiny orchid seeds — A British 20p piece emphasises the tiny size of the orchid seeds lying next to it © RBG Kew

Because of these challenges, we’ve been looking at alternatives to germination testing. One of these is a chemical stain - Tetrazolium Chloride - that is normally clear in colour but turns red when it comes into contact with living tissues. This allows us to estimate how many viable seeds we have, by counting the number of red-stained seeds in each collection.
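As a minimal sketch of that estimate, the calculation is simply the share of counted seeds that stained red. The function name and the seed counts below are illustrative assumptions, not figures from our collections.

```python
def estimate_viability(stained: int, unstained: int) -> float:
    """Return the percentage of counted seeds that stained red."""
    total = stained + unstained
    if total == 0:
        raise ValueError("no seeds were counted")
    return 100.0 * stained / total


# Hypothetical counts from a single slide (illustrative numbers only).
viability = estimate_viability(stained=82, unstained=18)
print(f"Estimated viability: {viability:.1f}%")  # prints "Estimated viability: 82.0%"
```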

Approximately 100 orchid seeds, most of their embryos stained red by Tetrazolium Chloride. Some embryos are left transparent or grey in colour. — Orchid seeds imaged through a microscopic USB camera; the viable seeds are stained red by Tetrazolium Chloride - Abigail Tucker © RBG Kew

However, this then creates a new problem: their dust-like size and large numbers make them very tricky to count. It could take anywhere from 5 to 20 minutes to count and classify the seeds from a single collection. Multiply this by the number of orchid collections that come through our doors at the seed bank each year, and it becomes countless hours of work looking down a microscope.

And there is yet another challenge: the seeds can’t be simply split into stained and not stained. Some will have a much fainter colour than others and even experts disagree on how dark the red staining has to be to reliably indicate seed viability. This makes interpreting the results very subjective.
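To see why the cut-off matters, here is a small sketch in which each seed is given a made-up "redness" score between 0.0 (unstained) and 1.0 (deep red). The scores, the thresholds and the idea of a single numeric score are all assumptions for illustration; they are not how staining is actually assessed under the microscope.

```python
# Hypothetical redness scores for ten seeds, from 0.0 (unstained) to 1.0 (deep red).
redness = [0.05, 0.10, 0.35, 0.40, 0.45, 0.55, 0.60, 0.80, 0.90, 0.95]


def count_viable(scores, threshold):
    """Count seeds whose redness is at or above the chosen threshold."""
    return sum(1 for s in scores if s >= threshold)


# Three "experts" who each draw the line in a different place.
for threshold in (0.3, 0.5, 0.7):
    viable = count_viable(redness, threshold)
    print(f"threshold {threshold}: {viable} of {len(redness)} seeds counted as viable")
```

With these made-up scores, the same ten seeds give 8, 5 or 3 viable seeds depending on where the line is drawn, which is the kind of disagreement described above.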

A blue gloved hand placing a glass cover-slip over a sample of orchid seeds and a few drops of clear liquid, onto a microscope slide. The hundreds of orchid seeds are barely visible due to their small size. — The preparation of a slide to go under the USB microscope, these seeds have been submerged in stain for over 24 hours and are ready to be imaged and annotated - Pablo Gómez Barreiro © RBG Kew

Can AI mitigate all these problems? We decided to try!

Over the course of six months, we studied the viability of 108 collections and created a dataset of 522 images, containing around 63,000 individual annotations, all classed as viable or not viable by our team of experts. If a model can do the counting and calculating for us, it will free up our experts’ time for other ventures. Not only could it be faster, but our model can also make the process more consistent by removing human error and subjective interpretation of the staining colour.

A collection of colorful orchids growing at Kew Gardens — Orchids at Kew Gardens

The model

When developing an AI tool, the first step is to create a dataset which can be used to train it. In our case, we took images of stained orchid seeds under a USB microscope, then manually outlined and labelled each individual seed in every image according to whether it had turned red or not.

Two images. The left is a small camera mounted on a stand above a microscope slide on an A4 size flat light. The right image is a close up of the camera showing the tip flush against the slide. — Imaging setup: A Dinolite USB microscope camera and a light box- Pablo Gómez Barreiro © RBG Kew

We split our set of images in two, with one part used to train the model and the other to test it. The stages of training and testing are as follows:

The model annotates the images on its own.
It then compares its results to those of a human.
The model adjusts its annotations and repeats the process until there are no more improvements to make.
Finally, it’s tested on the fresh images that it’s never seen before to get a final score on how well it does.
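The four stages above can be sketched in a few lines of code. This is a toy stand-in, not our actual model: the real model outlines seeds in images, whereas here the "model" is a single redness threshold and each seed is reduced to one made-up score from 0 to 99. The data, the starting guess and the step size are all assumptions chosen to keep the example small.

```python
# Hypothetical data: (redness score, human label) pairs standing in for annotated seeds.
# In this toy set the "expert" calls a seed viable when its redness is at least 50.
seeds = [(r, r >= 50) for r in range(100)]

# Split the data in two: every fourth seed is held back for testing.
# (A real split would usually be made at random.)
test = seeds[::4]
train = [s for i, s in enumerate(seeds) if i % 4 != 0]


def accuracy(threshold, data):
    """Fraction of seeds where the model's call matches the human label."""
    return sum((r >= threshold) == label for r, label in data) / len(data)


threshold, step = 10, 5             # deliberately poor starting guess
best = accuracy(threshold, train)   # stages 1 and 2: annotate, then compare with the human
while True:
    candidate = threshold + step    # stage 3: adjust and try again
    score = accuracy(candidate, train)
    if score <= best:               # stop when there is no more improvement
        break
    threshold, best = candidate, score

# Stage 4: score the model on seeds it has never seen.
print(f"learned threshold: {threshold}")                       # prints "learned threshold: 50"
print(f"test accuracy: {accuracy(threshold, test):.2f}")       # prints "test accuracy: 1.00"
```

The perfect test score here only reflects how clean the made-up data is. Real annotations are noisier, which is why the held-back images matter: they show how the model copes with seeds it was not trained on.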

So far, our model is almost as good as a real person. With a few tweaks to the code behind the model and some quality checking of our training data (such as annotating seeds that were missed the first time around), we’re hoping to improve it until it’s as accurate as a human expert, if not better!

Two identical images of seeds. In the left image around 40 seeds are outlined in red and 6 in green. In the right image one of the red outlines is now black and one of the green outlines is now red. — Stained seeds annotated by a human on the left and by our machine learning model on the right. Non-viable seeds are outlined in red, viable are in green. © RBG Kew

Once we’ve perfected this model, we want to make it available online so that it can be used by other seed banks globally, as well as by our teams here at Kew.

What next?

The next step is to ‘ground truth’ the dataset, by checking whether as many seeds germinate from each collection as the AI predicted. Unfortunately, this means that, for now, we still need to germinate our orchid seeds.

If there’s a mismatch, we’ll need to examine whether the level of staining that we tell the AI indicates viability is too conservative, and then re-train the model. It’s still hotly debated exactly how much staining indicates the potential for life in seeds.

However, we also have to remember that there are many other factors which can impact whether a seed germinates, beyond whether it is alive or not. Is the nutrient medium correct? Is the temperature ideal for germination? Is it getting the right amount of light? These are all questions we have to be able to answer, and we hope to be able to do so more reliably in the future.
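A ground-truth check of this kind could be sketched as follows: for each collection, compare the viability predicted from staining with the percentage of seeds that went on to germinate, and flag any large gaps. The collection names, the percentages and the 10-point tolerance are all hypothetical.

```python
# Hypothetical results per collection: viability predicted from staining versus
# the percentage of seeds that actually germinated (illustrative numbers only).
collections = {
    "collection A": {"predicted": 80.0, "germinated": 76.0},
    "collection B": {"predicted": 65.0, "germinated": 40.0},
    "collection C": {"predicted": 30.0, "germinated": 33.0},
}

TOLERANCE = 10.0  # assumed acceptable gap, in percentage points


def find_mismatches(results, tolerance):
    """Return the collections whose prediction and germination differ by more than the tolerance."""
    return [
        name
        for name, r in results.items()
        if abs(r["predicted"] - r["germinated"]) > tolerance
    ]


print(find_mismatches(collections, TOLERANCE))  # prints "['collection B']"
```

A flagged collection is a prompt to look closer rather than proof that the model is wrong, since the growing conditions described above can also hold back seeds that are alive.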

There is also scope to expand our model in future, to work with other types of staining methods such as Fluorescein diacetate (FDA), which fluoresces under UV light. This would show us if orchid seeds with darker seed coats are viable.

A single light purple and white orchid with a rich yellow centre surrounded by delicate hair-like structures. — One of the many species tested in this project: *Dendrobium loddigesii* - Andre Schuiteman © RBG Kew

Conclusion

When AI is used ethically, it can have great benefits in the realm of conservation. By completing time-consuming and repetitive tasks, it allows us to focus on more pressing issues that need the innovation and problem solving that only humans can provide. Our project aims to use AI to help those working in conservation get more accurate, less biased measurements, and to give the team more time for the work that requires skills only people possess.

Acknowledgements

We would like to thank Silo National des Graines Forestieres (SNGF) Madagascar; The Ministry of Agriculture, Lands, Housing & Environment Montserrat; Instituto de Investigação Agrária de Moçambique (IIAM) Mozambique; Departamento de Recursos Naturales y Ambientales, Puerto Rico; and the British Virgin Islands National Parks Trust for allowing us to use seeds they collected for this project.

This project is supported by Bloomberg Philanthropies.