Slow-growing broilers are healthier and express more behavioural indicators of positive welfare

Broiler chicken welfare is under increasing scrutiny due to welfare concerns regarding growth rate and stocking density. This farm-based study explored broiler welfare in four conditions representing commercial systems varying in breed and planned maximum stocking density: (1) Breed A, 30 kg/m2; (2) Breed B, 30 kg/m2; (3) Breed B, 34 kg/m2; (4) Breed C, 34 kg/m2. Breeds A and B were ‘slow-growing’ breeds (< 50 g/day), and Breed C was a widely used ‘fast-growing’ breed. Indicators of negative welfare, behavioural indicators of positive welfare and environmental outcomes were assessed. Clear differences between conditions were detected. Birds in Condition 4 experienced the poorest health (highest mortality and post-mortem inspection rejections, poorest walking ability, most hock burn and pododermatitis) and litter quality. These birds also displayed lower levels of behaviours indicative of positive welfare (enrichment bale occupation, qualitative ‘happy/active’ scores, play, ground-scratching) than birds in Conditions 1–3. These findings provide farm-based evidence that significant welfare improvement can be achieved by utilising slow-growing breeds. There are suggested welfare benefits of a slightly lower planned maximum stocking density for Breed B and further health benefits of the slowest-growing breed, although these interventions do not offer the same magnitude of welfare improvement as moving away from fast-growing broilers.

Commercial broiler chicken welfare is receiving increasing scrutiny from the media and Non-Governmental Organisations (NGOs) concerning welfare issues associated with rapid growth and rearing conditions. European and North American NGOs are targeting food companies, requesting that they meet a number of requirements to "best mitigate […] the most pressing welfare concerns relating to broiler production" 1,2. Two of these requirements are: to "implement a maximum stocking density of 30 kg/m2" (6.0 lbs/ft2 in North America) and to "adopt breeds that demonstrate higher welfare outcomes". These requirements raise questions about the impact of varying combinations of stocking density and breed on the welfare of broilers under commercial conditions. Stocking density is the "total live weight of chickens present in a house at the same time per square metre of usable area" 3. The European Council Broiler Directive (2007/43/EC) sets a maximum stocking density (providing specific requirements are met) of 42 kg/m2, whilst the UK codes of practice 4 do not permit stocking densities over 39 kg/m2. Most UK retailer standards state a maximum stocking density of 38 kg/m2. 'Higher welfare' retailer standards may set somewhat more stringent upper limits (e.g. 34 kg/m2) or even more stringent limits in line with the NGO 'Chicken Commitment' requirements 1,2. Higher stocking densities typically result in greater economic returns for broiler producers due to increased numbers of animals and, therefore, kg of meat produced per house. To avoid exceeding maximum stocking densities, in practice, the number of chicks placed at the start is calculated from the available floor area of the house and the target final weight of the birds (whilst also allowing for some mortality). For the same final target weight, a higher stocking density would equate to more animals within a given area (animal density).
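The placement calculation described above can be sketched as follows. This is a minimal illustration, not the producer's actual planning method: the function names and the 3% mortality allowance are assumptions, and the floor area in the usage lines is simply the product of the house dimensions reported in the methods (18.1 m × 48.7 m). The results are not the study's placement numbers, which are given in Table 1.

```python
import math


def chicks_to_place(floor_area_m2, max_density_kg_m2, target_weight_kg,
                    expected_mortality=0.0):
    """Largest placement that keeps the planned final density within the limit.

    expected_mortality is a fraction of chicks placed (hypothetical allowance).
    """
    if not 0 <= expected_mortality < 1:
        raise ValueError("expected_mortality must be in [0, 1)")
    # Birds that may be present at the target weight without exceeding the limit.
    birds_at_end = math.floor(floor_area_m2 * max_density_kg_m2 / target_weight_kg)
    # Extra chicks can be placed if some are expected to die before processing.
    return math.floor(birds_at_end / (1 - expected_mortality))


def planned_animal_density(max_density_kg_m2, target_weight_kg):
    """Birds per square metre implied by a stocking density and target weight."""
    return max_density_kg_m2 / target_weight_kg


floor_area = 18.1 * 48.7  # m2, from the reported internal house dimensions
for limit in (30, 34):
    placed = chicks_to_place(floor_area, limit, 2.2, expected_mortality=0.03)
    birds_per_m2 = planned_animal_density(limit, 2.2)
    print(f"{limit} kg/m2: place {placed} chicks, {birds_per_m2:.2f} birds/m2 at target weight")
```

For a fixed target weight of 2.2 kg, moving from 30 to 34 kg/m2 raises the planned animal density from roughly 13.6 to 15.5 birds/m2, which is the sense in which a higher stocking density equates to more animals in a given area.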
It has been reported that broilers will actively work to avoid higher stocking densities in certain contexts 5 whilst clustering together in other contexts 6 . Distance travelled or walking bout length decreases with stocking density 6,7 and broilers have been observed to 'jostle' one another 8 and experience more interruptions to resting periods 9 at higher densities. In general, there is a trend for reduced health of broilers at higher stocking densities, including poorer walking ability 6,10-12 and increased footpad dermatitis 11,13,14 .
Global broiler production generally utilises breeds with mean growth rates of > 50 g/day ('fast-growing broilers'). 'Slow-growing broilers' (< 50 g/day) are supplied by traditional breed providers or arms of the major genetics companies. The market for slow-growing breeds is currently a small portion of all broiler production (for the UK this is estimated to be around 11% 15). Interest in slow-growing broilers is driven by diverse region-specific consumer trends, including animal welfare interest (promoted by NGOs), legislation for age at processing, demands from traditional cuisines and retailer initiatives to create premium products (e.g. 'Higher welfare' branded meat products). There are few published direct comparisons of breeds, particularly under commercial production or undertaken within the last 10 years. Given the fast development of broiler genetics, previous research may not be reflective of today's genetics. Pen trials have, however, shown differences in behaviour between breeds. For example, Bokkers and Koene 16 reported that slow-growing broilers perched, walked and ground-scratched more whereas fast-growing broilers sat, ate and drank more. Further, birds growing at > 41 g/day performed a reduced variety of behaviours when compared to breeds growing at 25-40 g/day and < 24 g/day 17. Fast-growing breeds have also been reported to have poorer walking ability 18-20, more foot lesions 16,17,21, higher mortality and culls, and biological indicators of poorer immunity 17.
Assessments of broiler welfare such as those described above have typically focused on negative welfare outcomes. Recently, welfare science has evolved to explore the positive experiences of animals 22, recognising that good welfare, a "good life" 23-25, is not just about negating negative states but also about promoting positive experiences and emotional states. Positive animal welfare and its assessment emphasise resources that are valued by animals 25 as well as positive emotions and the natural behaviours animals are motivated to perform 22.
The aim of this on-farm study was to evaluate the welfare of broilers in four commercially relevant systems with varying combinations of breed (across three breeds selected for different growth rates) and stocking density (planned for 30 vs 34 kg/m 2 at slaughter age). This study is the first to utilise an extensive suite of specific behavioural measures of positive welfare alongside more traditional negative welfare outcomes and environmental outcomes in a large-scale trial. We predicted that negative welfare outcomes would increase, and positive welfare outcomes would decrease, with increased mean growth rate and stocking density, equivalent to increased productivity of the system. Thus, we expected that the condition that would achieve the best welfare would be that with the slowest growing birds and lowest stocking density.

Results
Production information. There was a 14-day difference in production cycle length between Conditions 1 and 4 (Table 1a). This difference in growth rate was already apparent at Production Stage 1, with birds in Condition 4 being 41% heavier than the birds in Condition 1. While final animal densities remained different for the two planned maximum stocking densities, final stocking densities were lower than planned based on a target weight of 2.2 kg.
Negative welfare outcomes. Mortality. Condition 4 resulted in the numerically highest 7d and Total Mortality (Table 1b; Fig. S1, Supplementary Information). Production Cycle 2 of Condition 1 experienced high 7d Mortality. Because it occurred only in one production cycle, this mortality was unlikely to have been related specifically to Condition 1 and so mortality data from this production cycle were excluded in Table 1. When including the Production Cycle 2 mortality figures in the mean score (± SE), Condition 1 had 2.27 ± 1.52% 7d Mortality and 4.00 ± 1.91% Total Mortality.
Processing welfare outcomes. All conditions had a similar percentage of birds Dead on Arrival at the processor but Condition 4 had a greater percentage of Pre-processing Culls (Table 1b). A stepwise increase in Total Post-mortem Inspection Rejections was observed from Condition 1 to Condition 4. Condition 4 had 9.6 times more rejections than Condition 1 as well as a greater variety of reasons for rejection (Fig. 1).
Qualitative behaviour assessment. From principal component (PC) analysis of 48 assessments, two main PCs (PC1, PC2) were identified by visual inspection for the point of deflection in the Scree plot. PC1 and PC2 together explained 54.10% of the variance (39.18 and 14.90%, respectively). PC1 ranged from 'Happy/Active' to 'Flat/Stressed' and PC2 ranged from 'Calm' to 'Flighty/Alert' (Table S3).
For PC2, there was no interaction between Condition and Production Stage (F(6,24) = 0.329, p = 0.915, partial η2 = 0.076), and no main effect of Condition (F(3,12) = 0.231, p = 0.873, partial η2 = 0.055) or Production Stage […]
Table 1. (a) Production information and (b) production-related negative welfare outcomes by Condition (Mean ± SE per production cycle). *n = 4 production cycles for Conditions 2, 3 and 4; n = 3 for Condition 1.
Figure […]. Mean (± SE) percentage of birds with each Gait Score (ranging from 0, walks with ease, to 5, unable to walk) by Condition (n = 100 birds per production cycle 2d before processing, across four production cycles). Different letters indicate differences in Gait Score distribution between conditions as identified by pairwise comparisons using Dunn's 52 procedure (p < 0.0083).
Figure […] Table S2 for behaviour definitions, and Table S4 for significance of differences associated with Disturbance and Production Stage.
[…] p = 0.001) and Any Positive Behaviour (OR (95% CI) 0.13 (0.06-0.26); p < 0.001) were lower in Production Stage 1 than in Production Stage 3. When exploring associations between stocking density and positive behaviours in Breed B (Condition 2 vs 3), there was no difference in the odds of observing any of the positive behaviours in Undisturbed or Disturbed patches (p > 0.05). Furthermore, when comparing the two slow-growing breeds (A vs B) stocked at similar planned stocking densities (Conditions 1 vs 2), no differences in display of positive behaviours were detected in either Undisturbed or Disturbed patches (p > 0.05).
Environmental outcomes. Litter quality. Litter quality was better in Conditions 1-3, with mean (± SE) litter scores of 0.29 ± 0.03, 0.21 ± 0.02 and 0.46 ± 0.10, respectively, compared to 2.19 ± 0.09 in Condition 4 (n = 4 per condition).
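The p < 0.0083 threshold quoted for the pairwise Gait Score comparisons is consistent with a Bonferroni-type adjustment of a 0.05 significance level across the six possible pairs of four conditions. The text here does not state how the threshold was derived, so the 0.05 family-wise level and the function name below are assumptions; the sketch only shows the arithmetic.

```python
from itertools import combinations


def pairwise_threshold(groups, family_alpha=0.05):
    """Return all unordered pairs of groups and the per-comparison alpha.

    family_alpha is an assumed family-wise significance level.
    """
    pairs = list(combinations(groups, 2))
    return pairs, family_alpha / len(pairs)


pairs, alpha_per_pair = pairwise_threshold(["Condition 1", "Condition 2",
                                            "Condition 3", "Condition 4"])
print(len(pairs), "pairwise comparisons")
print(f"per-comparison threshold: {alpha_per_pair:.4f}")
```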

Discussion
This is the first study to utilise a comprehensive suite of measures to specifically identify the differences in negative welfare outcomes and positive behavioural outcomes across four commercial broiler systems of varying planned maximum stocking density and breeds. Clear differences in outcomes can be observed particularly between Condition 4 (standard fast-growing breed, highest planned maximum stocking density) and all other conditions. Condition 4 birds experienced the poorest health as indicated by levels of 7d and Total Mortality, Post-mortem Inspection Rejections at processing, Gait Score, Hock Burn and Pododermatitis. In addition, birds in Condition 4 had poorer positive welfare outcomes (Bales Occupied, qualitative 'Happy/Active' scores, and Play and Exploratory behaviours) than birds in the other conditions, and poorer Litter Quality. In Breed B, the somewhat higher planned stocking density and resulting higher animal density of Condition 3 vs 2 was associated with a higher Avoidance Distance to the observer. Comparing the two slow-growing breeds (A and B) at similar stocking density, Breed A (Condition 1) had better health for some measures compared to Breed B (Condition 2), though no differences in positive behaviour rates were detected.
Condition 1 had the lowest Post-mortem Inspection Rejections and the narrowest range of reasons for rejection. This suggests that Breed A was the most resilient and that resilience was affected by breed and growth rate. This is supported by van der Most et al.'s 26 findings of compromised immune function when selecting for growth rate in poultry.
In addition to having the highest Gait Scores, birds in Condition 4 showed greater avoidance of humans than birds in Conditions 1 and 2, suggesting that they were more fearful or less inquisitive. Previous application of the Avoidance Distance Test as an indicator of welfare in broiler chickens has shown conflicting results. Vasdal et al. 27 found that flocks of fast-growing broilers with higher gait scores moved away from the observer less, possibly because they were less able or motivated to do so. In the current study, we consider that it was a combination of higher gait scores, reduced motivation and increased fearfulness that meant that the birds did not return towards the observer. It should also be considered that birds in Condition 4 were the youngest at the time of assessment, which may have contributed to a more fearful response. Vasdal et al. 27 found no effect of stocking density on human avoidance, and Tuyttens et al. 28 observed higher avoidance distances in flocks kept at lower stocking densities. These findings contrast with our observation that Condition 3 birds avoided the observer more than those in Condition 2 (assessed at the same age). Differences in the animal densities may have contributed to these different results, possibly influenced by unrest resulting from increased disturbances at higher densities 6,8.
Leg problems are a key welfare and production concern. Walking difficulties (lameness) can be associated with both non-infectious and infectious skeletal disorders. Lameness indicates pain 29 that results in reduced mobility 30 and access to resources. Behavioural observations have shown that lame birds spend less time standing, running, preening and dustbathing and more time sitting 31,32 . In the current study, birds in Condition 4 had the highest Gait Scores, consistent with previous work showing poorer gaits in fast compared to slow-growing breeds. For example, Kestin et al. 19 found that slow-growing Hubbard strains had lower mean gait scores than fast-growing genotypes (Ross 308, 508) when reared on the same feeding regime.
Condition 4 birds were found to have the poorest footpad and hock health compared to the other conditions. Broilers that are less active tend to spend a greater proportion of their time sitting in contact with the litter 33 . The higher levels of Hock Burn and Pododermatitis found in Condition 4 birds are consistent with lower activity as indicated by lower Bales Occupied and Play and Exploratory behaviour results. Furthermore, Condition 4 had the poorest Litter Quality, meaning that these inactive birds not only sat for longer but they were also in contact with poorer litter, increasing the likelihood of hock burn 34 and pododermatitis 35 .
To date, there has been limited use of Qualitative Behaviour Assessment (QBA) in broiler flocks. Where QBA has been applied, there have been mixed results 36. We found two components. Condition 4 birds were more likely to be 'Stressed/Flat' and less 'Happy/Active' (Component 1) than birds in the other conditions. Muri et al. 37 also found two components in commercial fast-growing broilers, explaining a higher percentage of the variance (70%) compared to the 54% found here, but few associations with other welfare outcomes. The authors considered that the homogeneity of the broiler system resulted in limited variation to detect associations and that QBA is perhaps more appropriate for larger animals in smaller groups, allowing for easier observation of behavioural expression. In the current study, the demeanour of birds in Condition 4 (fast-growing birds) was sufficiently different to be detected by QBA. Condition 4 birds were more likely to be scored with negatively valenced, low arousal terms ('Stressed/Flat') compared to the other conditions (slow-growing birds).
We assessed the odds of observing several specific behaviours considered indicative of positive welfare. These behaviours were selected based on being easy to recognise from a distance, infrequent enough to be counted and recorded quickly, and expressive of positive qualities such as joyfulness, curiosity, self-care and agency. Condition 4 birds ground-scratched less (representing Exploratory behaviour) when undisturbed, and Played less and performed Any Positive Behaviour less following disturbance, compared to birds in the other three conditions. Foraging involves exploring the environment by scratching and pecking at the substrate 38 followed by consumption of edible discoveries. Chickens are highly motivated to explore for food despite provision of an ad libitum diet.
Foraging is considered a 'behavioural need' of chickens 39, with chickens preferring to obtain some of their diet by working for it 40. Foraging requires a loose friable substrate. Condition 4 had the poorest litter quality, potentially reducing opportunities and motivation for foraging. Play behaviour was stimulated by disturbance, as it was less often observed in Undisturbed patches. However, even when space opened up in Disturbed patches, birds in Condition 4 (Breed C) had lower odds of expressing Play behaviours than birds in the other conditions, including Condition 3 (Breed B) birds kept at a similar planned maximum stocking density. The Condition 4 birds also tended to engage less in vertical wing shaking (representing Comfort behaviour) when undisturbed (p = 0.099) and had no observations of perching on wires (representing Safety behaviour; too rare for statistical assessment). Given that the Condition 4 birds had poorer Gait and Pododermatitis scores, higher 7d and Total Mortality and higher Post-mortem Inspection Rejections than birds in the other conditions, we suggest that poor physical health was a key contributor to their relatively low display of positive behaviours.
In the Undisturbed patches, a reduction in Play, Exploration, Comfort behaviour and Any Positive Behaviour counts was observed from the first to last Production Stages. Lower Play in the final production stage is consistent with Vasdal et al.'s 41 finding of less play at 30 days compared to 16 days, and the observed reduction in Exploratory behaviour with increasing age is consistent with Bizeray et al. 38. For the Disturbed patches, less Play was recorded in Production Stage 1 than when the birds were older. It is possible that these behaviours were not fully captured by our methodology, as the birds moved away from the observer and played in other space that was readily available in Production Stage 1 due to the birds' small size. Play can occur as a rebound following deprivation of an environmental resource (such as space) 42. At Production Stages 2 and 3, the creation of space in the observer's path of movement may have stimulated a rebound effect, resulting in an increase in odds of observing Play behaviours for these Production Stages. Similarly, Baxter et al. 43 reported more observations of play-like behaviours following displacement by an observer at 4 and 5 weeks than at 3 weeks of age, consistent with our findings from Disturbed patches.
In the current study, we assessed both disturbed and undisturbed patches. Whilst a greater variety of behavioural categories was observed in undisturbed patches, the frequency of observation of play behaviours was much lower than in disturbed patches. To gain a clearer picture of the nuanced positive behaviours of broilers, future research should focus on undisturbed patches, ideally automating the recognition of these relatively rare behaviours. Observations of disturbed patches may be beneficial for auditing purposes, to provide a discussion point for auditors and farmers such that the promotion of positive experiences is incorporated into expected standards of commercial broiler welfare.
Wetter litter has been associated with poorer broiler health 8. Conditions 1 and 2 had the best Litter Quality while the highest litter scores were observed in Condition 4 (mean score 2.19, where a score of 2 indicated "leaves imprint of foot and will form a ball if compacted, but ball does not stay together well"). Condition 3's mean score was lower than Condition 4's but more variable than those of Conditions 1 and 2. Poorer quality litter contains more moisture. High stocking densities put pressure on the litter due to increased faeces excreted within a given area. The somewhat higher densities in Conditions 3 and 4 would have resulted in more faeces and litter moisture per unit of floor space 14. In addition, the more rapid growth of the Condition 4 birds would have resulted in more rapid build-up of faeces in the litter, explaining the higher litter scores in that Condition despite having a similar planned maximum stocking density to Condition 3 as well as a shorter production duration. Ammonia in poultry houses has deleterious human, bird and environmental effects 44. However, ammonia concentration at the end of the growth period was similar across conditions and did not exceed recommended levels of < 20 ppm 45 despite the longer production duration of the birds in Conditions 1-3 or higher litter scores in Condition 4. A more detailed study exploring total emissions over the entire production cycle is required to assess the wider environmental impact of higher welfare systems.
Although stocking density is often referred to when considering the welfare of broilers, higher stocking densities are ultimately achieved by placing more chicks in a given area (when producing birds to the same final weight). Final stocking densities are influenced by the growth rate of the birds which in turn is sensitive to a multitude of variables (e.g. pathologies, feed quality, house temperature profiles, management etc.) while final animal densities are less sensitive to these variables as they are only influenced by mortality. This is reflected in the results in this study. While final stocking densities did not vary greatly across conditions, the difference in animal densities did remain consistent with the planned maximum stocking densities to the point of processing. One could hypothesise that it is the animal density that is more likely to impact the welfare of the individual birds due to competition for resources and behavioural disruption, rather than the space available due to the size of the birds. However, for Breed B, we observed no differences between Conditions 2 and 3 (which differed in animal density), in most of the Negative Welfare Outcomes including Gait Score (in agreement with Bailie et al. considering stocking density 11 ). Similarly, a lack of association between animal density and Bales Occupied or positive behaviour observations suggests that opportunities to perform behaviours important for positive welfare were no more compromised at the higher animal density. However, we cannot rule out that the lower animal density of Breed B in Condition 2 was also sufficient to restrict behaviour as the birds grew. For example, Comfort behaviour (dustbathing, as measured by vertical wing shakes) when Undisturbed declined across the Production Stages, whereas in studies where birds were stocked at lower stocking densities, no reduction in dustbathing was observed as birds aged 32,46 . 
The difference in the Avoidance Distance between Conditions 2 and 3 is perhaps reflective of greater fearfulness at the higher animal density. Use of a slightly higher planned maximum stocking density may help to partially mitigate the loss of productivity from moving to a slow-growing breed in commercial systems. At similar planned maximum stocking and animal densities, birds in Condition 1 (Breed A) were healthier than those in Condition 2 (Breed B) as indicated by lower 7-day and Total Mortality (when excluding the production cycle that experienced high early mortality), lower Post-mortem Inspection Rejections, and better Gait Scores. A lack of behavioural differences between these conditions suggests that Breed B's ability to perform specific positive behaviours was not impaired by their slightly faster growth rate.
Scientific Reports | (2020) 10:15151 | https://doi.org/10.1038/s41598-020-72198-x www.nature.com/scientificreports/
Broiler producers manage their flocks to achieve the best possible results regardless of the specific marketing scheme demands. In this study, although breed and planned maximum stocking density varied, the producer was continuously monitoring and adapting the management of each house to optimise outcomes. This varies from a controlled experimental study in which specific factors are varied and all other factors held constant. Both approaches are valuable in providing different information, the former about outcomes under real-life conditions (emphasising external validity) and the latter about the impact of individual factors under a single set of controlled conditions (emphasising internal validity). The latter typically involves different management practices to those found commercially such as use of smaller flocks and pen sizes, as well as differences in ventilation patterns, litter management, feeding and watering practices, labour routines, biosecurity between experimental units and veterinary practices. Here, we used the former approach, showing how commercial marketing scheme demands for specific product attributes, in this case, use of slower-growing breeds and lower planned maximum stocking density, were associated with broiler welfare outcomes on one commercial farm. By limiting the study to one farm, we were able to control potentially influential environmental variables (stockperson, house size and equipment, processor, processing dates, seasonal and regional conditions), allowing for detection of differences between conditions in a smaller number of production cycles than would have been required if collecting data across multiple farms and companies. Despite this, the small replicate number and variation within variables contraindicated statistical tests on the production information; further replicates would overcome this. As in the case of results from a single research facility, caution is needed in applying our on-farm results more generally.

Conclusion
In this study, we compared four conditions simulating four possible marketing schemes for broiler chickens varying in growth rate (varying across three breeds) and planned maximum stocking density (higher vs lower). We predicted that a wide range of negative and positive welfare measures would, overall, indicate the highest welfare in the slowest-growing breed (Breed A) kept at the lower planned maximum stocking density (Condition 1) and the lowest welfare in the fastest-growing strain (Breed C) kept at the higher planned maximum stocking density (Condition 4), with Conditions 2 (slower-growing Breed B at the lower planned maximum stocking density) and 3 (slower-growing Breed B at the higher planned maximum stocking density) having intermediate results.
The differences between Conditions 1 to 3 were subtle. Nevertheless, the overall results suggest that, on average, chickens experienced better welfare in Condition 1 than in Conditions 2 and 3 (based on lower 7d and Total Mortality (when excluding one production cycle), fewer Post-mortem Inspection Rejections with a narrower variety of reasons for rejection, and lower Gait Scores). Differences in welfare between Conditions 2 and 3 were less prominent, with only a difference in Avoidance Distance Test results. The clearest pattern of findings was for higher welfare in Conditions 1 to 3 when compared with Condition 4 (lower 7d and Total Mortality, Post-mortem Inspection Rejections, Gait Score, Hock Burn and Pododermatitis scores, more Bales Occupied, 'better' PC1 Qualitative Behaviour Assessments, higher Play behaviours in Disturbed patches and Exploratory behaviour in Undisturbed patches). These findings, obtained from an observational study under practical commercial conditions, indicate that the most significant overall welfare improvement can be achieved by utilising a slow-growing breed rather than standard fast-growing breeds. There were suggested benefits of utilising a slightly lower planned maximum stocking density and further health benefits in systems utilising the slowest-growing genotype. However, these interventions did not give welfare benefits of the same magnitude as could be realised by moving away from the fast-growing broilers (Condition 4, Breed C).

Material and methods
Conditions. Birds were kept in four conditions varying in breed and planned maximum stocking density representing commercial production systems: Condition 1: Breed A (expected growth rate 45 g/day); planned maximum stocking density 30 kg/m2. Condition 2: Breed B (expected growth rate 49 g/day); planned maximum stocking density 30 kg/m2. Condition 3: Breed B (expected growth rate 49 g/day); planned maximum stocking density 34 kg/m2. Condition 4: Breed C (expected growth rate 63 g/day); planned maximum stocking density 34 kg/m2. Breeds A and B were 'slow-growing' breeds (< 50 g/day) and Breed C was a standard 'fast-growing' (> 50 g/day) breed. Commercial sensitivities preclude identifying breed names. The commercial nature of the study meant that the combinations of breed/planned maximum stocking density available for investigation were limited.
The four conditions were pseudo-randomly allocated across four houses over four production cycles. Full random allocation was not possible as Condition 4 had to be housed in one of two houses due to feed bin requirements. Chicks were placed as hatched and placement was staggered with the intention that birds within a production cycle would reach a targeted final weight of 2.2 kg on the same processing date.
Housing and management. The study was conducted on a single farm over four production cycles between January and October, 2018. The farm was managed by the same stockperson throughout. Four houses were allocated as trial houses, all similar in layout, size and orientation. House internal dimensions were 18.1 m × 48.7 m. For Condition 2, Production Cycle 2, insufficient chicks were delivered to the farm such that the house length was reduced to 42.3 m to maintain planned stocking densities. Each house contained six rows of drinkers and four rows of feeders, with 60 feed pans per row. Feeders and drinkers had wires above, installed to discourage birds from perching on them. All houses had three rows of structural posts, and Perspex windows along both long sides. Artificial light was provided by a row of energy-saving compact fluorescent light bulbs down each side of the house and a row of fluorescent tubes down the middle. Each house had two gas heaters, positioned on the same wall, at either end of the house. Fresh wood-shavings litter covered the concrete floor at the start of each production cycle at a depth of 50-70 mm. Small rectangular straw enrichment bales (960 mm × 430 mm × 360 mm) were provided at a rate of 1.5 bales per 1,000 birds (Fig. 7a). Bales remained present in the house throughout the production cycle. Ten 3 m wooden square baton (51 mm × 51 mm) perches at a height of 20 mm were provided per house. Each condition was managed as per usual commercial practice for that condition. Vaccinations, feed, heating and lighting schedules were implemented according to the breed requirements. Conditions 1-3 received the same diet, whilst Condition 4 received a different diet.
Conditions 1-3 had a temperature profile that started at 33 °C and was reduced by 0.4 °C a day to 18 °C whilst Condition 4 started at 34 °C, and was reduced by 0.5 °C a day to 18 °C. All conditions provided natural light from Day 3. By Day 6, Conditions 1-3 provided 6 h of continuous darkness from 21:00, whilst Condition 4 provided a 4 h continuous darkness period from 21:00 followed by a 2 h dark period starting at 03:00 as specified by the breeding company.
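The two temperature profiles above are linear ramps with a lower limit, and can be sketched as below. The function names are hypothetical, and treating Day 0 as the first day at the starting temperature is an assumption, since the day on which the reduction began is not stated.

```python
import math


def setpoint_c(day, start_c, step_c_per_day, floor_c=18.0):
    """House temperature set point on a given day (Day 0 = start, assumed)."""
    if day < 0:
        raise ValueError("day must be non-negative")
    # Reduce linearly from the starting temperature, never below the floor.
    return max(floor_c, start_c - step_c_per_day * day)


def days_to_floor(start_c, step_c_per_day, floor_c=18.0):
    """First day on which the set point has reached the floor temperature."""
    return math.ceil((start_c - floor_c) / step_c_per_day)


profiles = {
    "Conditions 1-3": (33.0, 0.4),
    "Condition 4": (34.0, 0.5),
}
for name, (start, step) in profiles.items():
    print(name, "day 14:", round(setpoint_c(14, start, step), 1), "C;",
          "reaches 18 C by day", days_to_floor(start, step))
```

Under these assumptions the Condition 4 profile starts warmer but falls faster, reaching 18 °C several days earlier than the profile used for Conditions 1-3.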
Planned maximum stocking density was manipulated by placing different numbers of birds in the house (Table 1). The number of birds placed was chosen to avoid exceeding maximum stocking densities whilst still accounting for expected mortality, as per usual commercial practice.
Data collection. Stockperson records were used to collate production information and mortality data. Processing welfare outcomes were provided by the processing plant at the end of each production cycle. Additional negative welfare outcomes, and environmental outcomes, were assessed by an observer (A.R.) two days before processing. Positive welfare outcomes were assessed by the same observer at three Production Stages during each production cycle (Days 14 and 28, and 3 d before processing) (Table S1). Morphological differences between the breeds meant that it was not possible to blind the observer to treatment.
Production information. The number of birds remaining in the house was recorded daily by the stockperson. Birds were weighed weekly by the stockperson by penning and weighing 50 random birds. Birds remaining, average weight and internal house dimensions were used to calculate Stocking Density.
Negative welfare outcomes. Mortality. Mortality, and its reason, were recorded daily by the stockperson. Reason for mortality was defined as culling due to lameness (Leg Culls), culling because moribund or for other reasons (Other Culls) and found dead (Dead). Total Mortality was the sum of these three categories. Values may be subject to some error because the stockperson may not have found all dead birds or all those that required culling. Values were expressed as percentages of chicks placed per production cycle.
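The Stocking Density and mortality calculations described above can be sketched as follows. This is a minimal illustration, assuming that usable area equals the full internal floor rectangle; the function names and the bird numbers in the example are hypothetical and are not taken from the study's records.

```python
def stocking_density(birds_remaining, mean_weight_kg, length_m, width_m):
    """Live weight per unit floor area (kg/m^2).

    Assumes the usable area is the full internal rectangle of the house.
    """
    return birds_remaining * mean_weight_kg / (length_m * width_m)


def mortality_percentages(chicks_placed, leg_culls, other_culls, found_dead):
    """Each mortality category, and their total, as a percentage of chicks placed."""
    counts = {
        "Leg Culls": leg_culls,
        "Other Culls": other_culls,
        "Dead": found_dead,
    }
    result = {name: 100.0 * n / chicks_placed for name, n in counts.items()}
    result["Total Mortality"] = 100.0 * sum(counts.values()) / chicks_placed
    return result


# Hypothetical example: 12,000 birds averaging 2.2 kg in an 18.1 m x 48.7 m house
example_density = stocking_density(12000, 2.2, 48.7, 18.1)
# Hypothetical example: 10,000 chicks placed, with made-up mortality counts
example_mortality = mortality_percentages(10000, 50, 100, 150)
```

With these made-up inputs the house would sit just under 30 kg/m², and Total Mortality would be 3% of chicks placed.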
Processing welfare outcomes. Birds from all conditions within a production cycle were processed on the same day and within the same shift, ensuring consistency of catching and processing plant staff and assessors. Dead on Arrival was the percentage of birds that died during transit. Birds were counted prior to being hung on the processing line. Pre-processing Culls included birds deemed to be too small (could fall off the processing line) and emergency culls due to ill-health or sickness. All remaining birds were inspected by poultry meat inspectors. All birds with pathologies were recorded (with reason; Fig. 1) as Post-mortem Inspection Rejections. Data were converted to percentages per production cycle.
Additional negative welfare outcomes. Avoidance Distance, Gait Score, Hock Burn and Pododermatitis assessments were undertaken 2d before processing between 09:30 and 14:30. The order of assessing each condition was randomised. Avoidance Distance Tests were conducted first, followed by Gait Scoring, in pre-determined random areas within a theoretical grid of 176 possible areas per house delineated by the posts, and drinker and feeder lines (Fig. 7). In Production Cycles 3 and 4, Hock Burn and Pododermatitis assessments were undertaken in the afternoon of the same day between 14:30 and 16:00, also in pre-determined random areas of the house.
Avoidance distance test. Avoidance Distance Tests assessed reactivity of the birds to a person as defined in Welfare Quality 47 . In each of 21 random areas, the observer approached a group of at least 3 birds, slowly squatted for 10 s and then, while slowly looking around, counted the number of birds within an approximate radius of 1 m (an arm's length). The mean percentage of birds over the 21 trials was then calculated.
Gait score. One hundred birds per condition were scored across 50 random locations. At each location, a 30 cm × 30 cm acetate grid of 5 cm × 5 cm squares numbered 1-36 was held at arm's length, and the two birds closest to a preselected, randomly generated number in the grid were observed to walk for at least 10 steps and scored 48 using the 6-point Bristol gait scoring method 49 . A score of 0 described a bird with no detectable abnormality and a score of 5 described a bird incapable of walking. Birds scored 4 or 5 were carried to the end of the house to be culled by the stockperson. The data were averaged to obtain a mean gait score per 100 birds.
Hock burn and pododermatitis. Roughly 100 birds were penned in four randomly generated locations in each house (Production Cycles 3 and 4 only). All birds penned were assessed. Birds were handled as per routine inspection of hock burn and pododermatitis by the farm manager. Birds were handled according to best practice, with two hands around the body (no inversion), allowing observation of the feet and hocks for scoring 3 . Feet and hocks were scored according to the scales defined in Welfare Quality 47 , whereby 0 equated to no evidence of lesions and 4 to severe lesions. Both legs were inspected and the highest score noted.
Positive welfare outcomes. Bales Occupied counts, Qualitative Behaviour Assessment (QBA) and Positive behaviour observations were undertaken once at each production stage. For Production Stages 1 and 2, each condition was assessed at either 10:30 or 12:00, in a pseudo-randomised order (two assessments at each time across the four production cycles). At Production Stage 3, each condition was pseudo-randomly assigned to be assessed at 09:30, 11:30, 13:30 or 15:30, because all conditions required assessment on the same day (one assessment at each time across the four production cycles). The assessments were conducted as follows. The observer entered the house and slowly walked up to the 5th post of Transect 3 to set up a video camera (data not presented). The observer then walked back to the end of Transect 3, near the door-end of the house. After allowing the birds to settle for 5 min, the observer watched them for 10 min and recorded QBA scores. An instantaneous scan sample of Bales Occupied was then performed. At 20 min after arriving at the end of Transect 3, the observer moved to the starting position for conducting the Positive behaviour observations (Fig. 7).
Bales occupied. The number of occupied bales (having one or more birds on top) was counted and expressed as a percentage of the total number of bales in the house.
Qualitative behaviour assessment. QBA was undertaken using a fixed list methodology 50 . The list of descriptors included 14 expressive qualities: content, flat, active, playful, flighty, stressed, alert, happy, calm, inquisitive, lethargic, comfortable, lively and relaxed. Following the 10-min observation period, the observer scored the entire flock using a paper-based 0-125 mm Visual Analogue Scale, whereby 0 mm indicated that the expressive quality was absent throughout the whole 10 min and 125 mm indicated the maximum possible expression of the quality. Each term was measured and scored as the distance in mm from 0 to the level marked on the scale.
Positive behaviour observations. Transect sampling for positive behaviour was performed using a method modified from Newberry et al. 51 . The feeder and drinker lines were used to delineate 11 transects within each house (Fig. 7a). Transect 11 was on the South-East facing side of each house. Each observation 'patch' was defined by the feeder/drinker lines and the distance between 3 feed pans (1.5 m) (Fig. 7b). Scans were made of 54 Undisturbed (U) patches per house, located two transects across from, and ahead of, the observer's current location, and 57 Disturbed (D) patches through which the observer had slowly walked 75 s previously. For Condition 2, Production Cycle 2, two fewer U and D patches were observed per transect due to the reduced house length (48 U and 51 D patches in total were observed for this house).
To initiate the observation sequence, the observer started a timer and walked slowly to the 3rd feed pan in Transect 2 by 15 s, then waited a further 60 s before walking to the 6th feeder (Step 1) and continuing the cycle of observations (steps 2-5, Fig. 7b). The observer walked slowly up Transect 2 (South-West facing side), observing D patches behind them in Transect 2 and U patches ahead and two transects across in Transect 4. They then walked slowly down Transect 5, observing Transects 5 (D) and 7 (U) and, finally, up Transect 8, observing Transects 8 (D) and 10 (U) (Fig. 7).
At each patch, the observer recorded the number of birds in a patch that performed each of the following mutually-exclusive behaviours: worm-running, play-fighting, wing-flapping, jumping, running, ground-scratching, vertical wing shaking and perching on wires (Table S2), where the behaviour closest to the first in this list was recorded if the bird performed more than one of these behaviours.
Environmental outcomes. After completion of Gait Scoring, Litter quality was scored at 6 random locations across the house using the Welfare Quality 47 classification system. A minimum score of 0 represented "Completely dry and flaky" litter and a maximum score of 4 represented litter that "Sticks to boots once the cap or compacted crust is broken". The mean of the six scores was calculated to provide one score per house per production cycle.
Ammonia was measured after the Avoidance Distance Test, at five standardised locations throughout each house (two front, two back and one in the middle) using pHydrion paper tests. Each strip was wetted with distilled water and held at bird height for 15 s. The change in colour of the paper indicated ammonia concentration against a colour chart ranging from 0 to 100 ppm. The mean of the 5 scores was calculated to provide one ammonia score per house.
Negative welfare outcomes. Mortality and Processing welfare outcomes were collected only once per production cycle per condition and thus only descriptive data are presented in the text. Avoidance Distance, Gait Score, Hock Burn and Pododermatitis all violated the assumptions of a parametric ANOVA, so Kruskal-Wallis H tests were used. Pairwise comparisons between the four conditions were undertaken using Dunn's 52 procedure with a Bonferroni correction for multiple comparisons, and statistical significance was accepted at the p < 0.0083 level.
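The p < 0.0083 threshold is consistent with a Bonferroni correction across all pairwise comparisons of four conditions. The sketch below shows that arithmetic, assuming a family-wise significance level of 0.05; that value is an assumption, since the text reports only the corrected threshold. The names used are illustrative.

```python
from itertools import combinations

# The four study conditions
conditions = [1, 2, 3, 4]

# Every unordered pair of conditions that a post hoc procedure would compare
pairs = list(combinations(conditions, 2))


def bonferroni_threshold(n_groups, alpha=0.05):
    """Per-comparison significance threshold for all pairwise comparisons.

    alpha is the assumed family-wise significance level (0.05 is an
    assumption here, not a value stated in the text).
    """
    n_comparisons = n_groups * (n_groups - 1) // 2
    return alpha / n_comparisons


threshold = bonferroni_threshold(len(conditions))
```

Four conditions give six pairwise comparisons, and 0.05 divided by 6 is approximately 0.0083.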
Avoidance distance test. The mean number of birds within arm's length was compared to a theoretical number of birds within a circle of 1 m radius (approximately the observer's arm's length) to provide the % of birds within arm's length, where the theoretical number of birds = stocking density (in birds/m²) × π/2. The area of the circle (π m²) was divided by two to account for the space taken by the assessor (as per Welfare Quality 47 ). Kruskal-Wallis H test and pairwise comparisons were run to explore differences between conditions.
Positive welfare outcomes. Bales occupied. Bales Occupied was analysed using a mixed ANOVA with Condition as a fixed effect and Production Stage as a repeated measure. Pairwise comparisons utilising a Bonferroni correction were run to explore differences in Bales Occupied between conditions and production stages.
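Returning to the Avoidance Distance calculation described above, the following sketch shows one way the percentage of birds within arm's length could be computed from the 21 counts. It assumes a 1 m radius and uses hypothetical function names and example values; it is an illustration of the stated formula, not the authors' own script.

```python
import math


def expected_birds_within_reach(birds_per_m2, radius_m=1.0):
    """Theoretical number of birds in half of a circle of the given radius.

    The circle area is halved to allow for the space taken by the assessor.
    """
    return birds_per_m2 * math.pi * radius_m ** 2 / 2


def percent_within_reach(counts, birds_per_m2, radius_m=1.0):
    """Mean observed count as a percentage of the theoretical count."""
    mean_count = sum(counts) / len(counts)
    return 100.0 * mean_count / expected_birds_within_reach(birds_per_m2, radius_m)


# Hypothetical example: three trial counts at an animal density of 10 birds/m^2
example_percent = percent_within_reach([3, 5, 4], 10.0)
```

Because the denominator scales with animal density, the same observed count yields a lower percentage in a more densely stocked house.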
Qualitative behaviour assessment. Scores were analysed using principal component analysis (correlation matrix, no rotation). Separate mixed ANOVAs were run on each identified principal component, with Condition as a fixed effect and Production Stage as a repeated measure. Pairwise comparisons utilising a Bonferroni correction were run to explore differences between conditions and production stages.
Positive behaviour observations. Positive behaviours were categorised as Play (including worm-run, playfight, wing-flap, jump and run), Exploratory (ground-scratch), Comfort (vertical wing shake), Safety (perch on wires) and Any Positive Behaviour (sum of all positive behaviours). The Play category comprised spontaneous, energy-demanding, self-handicapping behaviours performed in a non-life-threatening context 53 . The Exploratory category represented behaviour involved in finding or revealing aspects of the physical environment, Comfort represented body maintenance behaviour (here, measured by vertical wing shakes, a major component of dustbathing behaviour) and Safety represented anti-predator behaviour (here, measuring the skill and agility to perch on narrow, non-stationary, elevated structures).
The expected number of birds in each patch was calculated from the total number of birds in the house on the day of assessment, total area of the house and the proportionate patch size, assuming uniform distribution of birds throughout the house.
Associations of Production Stage, Condition and Disturbance with performance of behaviours in each positive behaviour category were analysed using negative binomial regression with log link function. Scan data were analysed in an 'events out of trials' format whereby counts of the number of birds performing behaviours in each behaviour category were summed across all observed Undisturbed and Disturbed patches in the house. By summing the data, any effect of location of patches was accounted for by the similar layout and transect methodology applied to all houses. The offset variable was the log of the sum of the expected number of birds in each observed patch.
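The expected bird numbers and the regression offset described above can be sketched as follows. This is a minimal illustration that assumes uniform bird distribution, as the text does; the function names and the patch area used in the example are hypothetical, because the text gives the patch length (1.5 m) but not its width. The sketch computes only the offset and does not fit the negative binomial model itself.

```python
import math


def expected_birds_in_patch(birds_in_house, house_area_m2, patch_area_m2):
    """Expected birds in one patch, assuming birds are spread uniformly."""
    return birds_in_house * patch_area_m2 / house_area_m2


def log_offset(birds_in_house, house_area_m2, patch_areas_m2):
    """Log of the summed expected bird numbers across all observed patches."""
    total_expected = sum(
        expected_birds_in_patch(birds_in_house, house_area_m2, area)
        for area in patch_areas_m2
    )
    return math.log(total_expected)


# Hypothetical example: 10,000 birds in a 1,000 m^2 house, five 2 m^2 patches
example_expected = expected_birds_in_patch(10000, 1000.0, 2.0)
example_offset = log_offset(10000, 1000.0, [2.0] * 5)
```

Using the log of the expected number of birds as an offset means the model coefficients describe the rate of behaviour per expected bird rather than the raw count.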