NEYEDC improve and inform environmental decision making, conservation, land management and sustainable development in North and East Yorkshire through the collation, management, analysis and dissemination of biodiversity information.

Species Records and Data Flows

NEYEDC’s work revolves mainly around collating, managing and distributing species records, but occasionally we get the opportunity to collect them ourselves.

Species data flows are complicated, to say the least. They lie at the heart of the work we do at NEYEDC, so getting my head around them has been high on the agenda. Throughout my first few months as an Assistant Ecological Data Officer, I have interacted with a variety of our data donors, collected species records myself, and disseminated records from our database to clients. Participating in every step of the process has allowed me to begin to understand the dynamics at play as a species record passes through the system. In this blog post I will try to explain how species data flows work, as best I understand them now. I will also look at the issues I see with the current system and the difficulties faced when trying to address them.

Some of the complexity comes from there being numerous recording groups operating individually, each with its own agenda. This is partly owing to the Victorian origins of biological recording, with the legacy of natural history societies and recording schemes set up in the 1800s still present in today’s recording groups. For example, the Yorkshire Naturalists’ Union (set up 1861), like most natural history societies, is a self-governed collective that has developed its own way of collecting and storing data. As different groups have evolved over time, they have also developed unique processes and methods of recording, which may be tailored to their local or species-specific needs. A similar process has taken place amongst Local Environmental Records Centres (LERCs), most of which sit within local authorities and run independently of each other, thus operating slightly differently. According to local needs, the types of records received, the preferences of managers, funding, and technical capabilities, each LERC has developed differently, and many even use different database systems to hold their species records. Without intervention, this has led to rather complex and disjointed relationships between the different organisations that collect, maintain, and distribute species data.

Keiron Brown of the Biological Recording Company organises species data flows into this (very helpful) infographic.

Simplification of the pathway splits species data custodians into three main component organisations:

  • Local Environmental Records Centres (LERCs), such as NEYEDC, who collect data on any taxa from a local perspective

  • National Recording Schemes (NRSs) who collect species-specific data

  • National Biodiversity Network (NBN) who attempt to collate data covering both a geographical and taxonomic range.

Each of these organisations is responsible for the verification, storage and maintenance of its species records. In addition, these organisations may or may not share data with each other, and each will receive data from a variety of sources. Before incoming data can go into our database, it must be rejigged into a standardised format whilst maintaining its integrity, i.e. no meaning is added or taken away. This interpreting and prepping of datasets is one of the principal jobs of all LERCs and has been one of my first tasks as Assistant Ecological Data Officer.
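
To make the idea of "rejigging" a dataset a little more concrete, here is a minimal Python sketch of mapping one donor's column names onto a common set of fields. It is only an illustration, not a description of NEYEDC's actual system: the field names, the column mapping and the example row are all made up for the purpose. The point is that columns are renamed and whitespace is trimmed, but the values themselves are not reinterpreted.

```python
from dataclasses import dataclass


# Hypothetical standard fields; a real LERC schema will differ.
@dataclass(frozen=True)
class SpeciesRecord:
    taxon: str
    grid_ref: str
    date: str
    recorder: str
    source: str


# Hypothetical mapping from one donor's column names to the standard fields.
DONOR_COLUMNS = {
    "Species": "taxon",
    "Grid Ref": "grid_ref",
    "Date": "date",
    "Recorded by": "recorder",
}


def standardise(row: dict, column_map: dict, source: str) -> SpeciesRecord:
    """Rename donor columns to standard fields.

    Values are only trimmed of surrounding whitespace, never reinterpreted,
    so no meaning is added or taken away.
    """
    fields = {}
    for donor_col, std_field in column_map.items():
        if donor_col not in row:
            # Fail loudly rather than silently inventing a value.
            raise KeyError(f"missing column {donor_col!r}")
        fields[std_field] = str(row[donor_col]).strip()
    return SpeciesRecord(source=source, **fields)


# A made-up example row, purely for illustration.
row = {
    "Species": "Lathyrus nissolia",
    "Grid Ref": " SE6052 ",
    "Date": "2021-06-14",
    "Recorded by": "A. Botanist",
}
record = standardise(row, DONOR_COLUMNS, source="personal records")
```

Raising an error on a missing column, rather than filling in a default, is a deliberate choice in this sketch: a gap in the incoming data stays visible instead of being papered over.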

Taking it back to NEYEDC, I will now describe some of the data as it arrives with us. The range is immense, from hand-drawn paper records, to GIS layers with species mapped as percentages across polygons, to run-over cats and dogs recorded by National Highways...

Heterogeneous data

The context and origins of each dataset we receive present their own unique issues, and the sum of these contexts is a very heterogeneous spread of data. This creates difficulties when integrating data and carrying out analyses on a larger scale. Here I will illustrate these difficulties using examples of some of the data I have worked on so far.

Personal Records

The first dataset I worked on was a large one of around 4000 species records. This data was collected in a personal capacity by an amateur botanist gathering plant records for their own interest. Within this dataset, the recorder discovered several plant species that were the first records for York, such as Grass vetchling, Lathyrus nissolia, an unusual member of the Fabaceae family with grass-like narrow leaves (pseudo-leaves), usually native to southern England. Another plant in this dataset was Pennyroyal, Mentha pulegium, an endangered plant on the UK Red List. As this recorder is a frequent data donor who is known to us and a highly knowledgeable botanist, we don’t require verification for more common species. However, for more unusual species like the two mentioned here, we do require verification, and these were verified by one of the BSBI (Botanical Society of Britain and Ireland) Vice County recorders for Yorkshire, a regional expert.

Volunteer recorders make up a large portion of our data donors and are an extremely valuable source of expertise. Where recorders record in their own free time, their records will often be of a particular species or taxon that interests them, and from an area that is local to them.

Consultant Records

Consultant records are usually focused on protected or invasive species, as these are the species that impact planning and decision-making and are thus the most commercially significant. Although consultant ecologists will record other notable species that they come across during their surveys, the data we receive is predominantly on bats, birds, amphibians and reptiles. This adds another layer of stratification to the species data landscape.

Historical Records

NEYEDC also hold lots of historical records across a range of formats, including paper records, which we must archive, store, maintain and digitise. One example is this group of bird records collected by Flamborough Ornithological Group in 1970. These records reflect the recording methods used for these species at the time; alongside each record are long pages of text explaining the key identifying features of the bird, as well as annotated drawings.

A beautiful hand-painted record of the Melodious Warbler, a passage migrant rarely seen in the UK. This record, courtesy of Flamborough Ornithological Group, is part of the Yorkshire Naturalists’ Union library.

Although older records are unlikely to influence planning today, they enable researchers to track geographical and temporal changes in species, such as shifting migratory routes. Datasets donated by NRSs are particularly valuable for researchers. However, the availability of data, and in fact the presence of a recording scheme altogether, is often biased towards the more charismatic or easily identifiable species.

 

Data sharing

Another consequence of the messy process of data sharing is that clear agreements and data flows are not established or documented. This means an individual record may make it to the NBN Atlas numerous times, or not at all. For example, if a recorder diligently submits a record to both a national recording scheme and their local LERC, both organisations may pass it on to NBN Atlas, where it will then appear as a duplicate. This is tricky, as there are different motivations for passing the record on to either group: passing it on to an LERC allows it to impact local decision making, whilst passing it on to a national recording scheme allows it to contribute to the national distribution and atlas of that species.
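
As a rough illustration of how such duplicates might be spotted, here is a minimal Python sketch that groups records sharing the same species, location, date and recorder, whichever organisation supplied them. This is an assumption-laden example rather than a description of how NBN Atlas or any LERC actually handles duplicates: the field names, the matching key and the example records are all hypothetical.

```python
from collections import defaultdict


def record_key(rec: dict) -> tuple:
    """Build a comparison key that ignores trivial formatting differences."""
    return (
        rec["taxon"].strip().lower(),
        rec["grid_ref"].replace(" ", "").upper(),
        rec["date"].strip(),
        rec["recorder"].strip().lower(),
    )


def find_duplicates(records: list) -> list:
    """Return groups of records that share the same key (two or more per group)."""
    groups = defaultdict(list)
    for rec in records:
        groups[record_key(rec)].append(rec)
    return [group for group in groups.values() if len(group) > 1]


# Made-up records: the same observation arriving via two different routes,
# plus one unrelated record.
records = [
    {"taxon": "Mentha pulegium", "grid_ref": "SE 6052", "date": "2021-07-01",
     "recorder": "A. Botanist", "supplier": "LERC"},
    {"taxon": "Mentha pulegium", "grid_ref": "SE6052", "date": "2021-07-01",
     "recorder": "A. Botanist", "supplier": "NRS"},
    {"taxon": "Lathyrus nissolia", "grid_ref": "SE6052", "date": "2021-06-14",
     "recorder": "A. Botanist", "supplier": "LERC"},
]
duplicates = find_duplicates(records)
```

In practice, matching is unlikely to be this clean: the same sighting may arrive with a different grid reference precision, a differently spelled recorder name or a synonym for the species name, so an exact-match key like this one would only catch the easy cases.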

This is further complicated by the fact that many national recording schemes and LERCs don’t upload their data onto NBN Atlas at all. Without financial or legal incentive to do so, organisations and individuals are expected to take the time to do this out of their own goodwill. Other pathways of data sharing are recommended but not enforced; for example, the professional membership body, the Chartered Institute of Ecology and Environmental Management (CIEEM), recommends that ecological consultants share the species data they generate. Whilst some consultants do share data with us, far fewer do so than actively request data from us. Sitting down to submit records to us may not be a priority for busy professionals, and disagreements surrounding issues of intellectual property and ownership further muddy the waters.

However, whilst wider sharing of data between different groups, and clearly defined agreements on who should share what, could help to make sure that every record is accounted for, data sharing is not as simple as it may seem.

There must still be some restrictions upon how data is shared, as the income generated by commercial users paying for data searches is required to fund the organisations that create and maintain the databases themselves. Collation, management, validation and verification of records by LERCs is a lengthy process requiring technical expertise, time and resources. If all data held by LERCs had unrestricted access on NBN Atlas, consultants and others who profit from using the data would not be required to contribute financially toward the continuation of the system they benefit from. There is also the issue of safeguarding protected species by keeping their location and distribution out of the public sphere.

Although procedure on data sharing at a national level is lacking, many LERCs, including NEYEDC, have begun to establish formal Data Exchange Agreements with organisations on a local level. In these arrangements data searches will be returned at a reduced cost (or no cost) in exchange for regular donation of data. Whilst this increases the quality of data in our database, it is at the expense of revenue generated from data searches. For this reason, Data Exchange Agreements are only financially viable on a small scale with organisations possessing data particularly beneficial to us.

 

Thoughts for the future

Species data flows are far from perfect. However, the landscape is changing, and as technology evolves, the potential for biological recording does too. Perhaps in the future we can look to drone surveys to cover the mundane and everyday locations that volunteers and professionals don’t bother to survey, identification will be automated by artificial intelligence, and taxonomic experts will be paid to verify records. Future databases could store data of any kind, analytics could account for differing data quality, and strict metadata would ensure each dataset makes its way through the pathway correctly. These possibilities certainly feel a long way in the distance.

But in the short term, changes in species data flows are brewing. Collaboration is already happening on a smaller scale: we have set up cross-boundary agreements with our neighbouring LERCs, we operate as the Yorkshire Environmental Data Network (YEDN) to represent all the LERCs of Yorkshire, and further afield the Association of Local Environmental Records Centres (ALERC) represents all LERCs nationally. Anecdotally, colleagues have noticed an increase in data donations by local consultants, and we hope this will improve further following the arrival of a new direct upload tool that should greatly improve ease of donation. Hopefully these trends of collaboration will continue upwards and, in the absence of a wider strategy or funding stream, common interests will unite organisations in their pursuit of accurate and extensive species data. As new government measures relating to the Environment Act come into place, such as Local Nature Recovery Strategies, the necessity for reliable and extensive ecological data becomes greater than ever.

 

Disclaimer: Whilst I have done my best to summarise data flows within this article, there is lots I have been unable to include, such as verification processes, recording apps and so on. Either way, I hope this has been a useful read for those who may be new to the sector or interested to learn more.