How types of premises modulate the typicality effect in category-based induction: diverging evidence from the P2, P3, and LPC effects

Behavioural studies have indicated that semantic typicality influences processing time and accuracy during inductive reasoning (i.e., the typicality effect). The present study examines this effect at an electrophysiological level by manipulating the types of premises and conclusions (i.e., general, typical, or atypical) in a semantic category-based induction task. With regard to behavioural results, higher inductive strength was found for typical conclusions in all premise conditions, whereas a longer response time for atypical conclusions was found only in the general and typical premise conditions. The ERP results showed different response patterns: in the general premise condition, a larger P2, as well as a smaller P3 and LPC (500–600 ms), were elicited by atypical conclusions relative to typical ones; in the typical premise condition, a larger P2 and LPC (600–700 ms) were found for atypical conclusions; in the atypical premise condition, however, only a larger P2 was found for atypical conclusions. This divergent evidence indicates that processing of the typicality effect in general and specific premise conditions might involve different cognitive processes, such as resource allocation and inference violation, and yields new insights into the neural underpinnings of the typicality effect in category-based induction.

(e.g., Crow has property X), and to assess whether or not the information about the conclusion (e.g., Penguin has property X) is plausible by making a "strong" or "not strong" response, a procedure that has been used extensively 17,18.
The key focus here was on the electrophysiological data elicited by the conclusion(s), upon which certain judgments could be made. Based on previous studies, shorter RTs and more "strong" responses should be found for typical conclusions relative to atypical conclusions 1,9. In ERP research, typicality has a pervasive influence on category-based verification and reasoning, which is reflected in early N1 and P2 effects related to perceptual and attention processes, and in later P3 or N400 effects related to semantic processes and categorisation 6,10,11,15. Accordingly, significant differences in the ERP indices should be found between typical and atypical members. For example, the P2 effect should be sensitive to the typicality effect in category-based induction, because typical and atypical conclusions differ in representativeness. Furthermore, research on reasoning has found a larger P3, or LPC, effect for matched items in conditional reasoning 35,36, category-based induction 19, and transitive inference 37. In the present study, a larger P3 or LPC amplitude should be elicited by typical conclusions, because they tend to match the premises to a greater extent 47,48.
More importantly, the types of premises should have a critical regulatory role in the typicality effect when exploring the ERP effects elicited by the conclusions, especially for general premises versus specific (typical and atypical) premises. In the general premise condition, the category in the premise includes, or is identical to, the category in the conclusion (e.g., Bird X; → Bird/sparrow/penguin X?), so if the information in the premise is true, the information in the conclusion is also true. For example, if "all birds have property X" is true, sparrows must have property X, because a sparrow is a type of bird. In the typical/atypical premise condition, however, the information in the conclusion is not necessarily true, but only has a "strong" or "not strong" plausibility. For example, if "all sparrows have property X" is true, all birds/penguins do not necessarily have property X. Furthermore, different response patterns should also be found for typical and atypical premises, because typical premises are more representative than atypical ones, especially when the conclusion is a general category. As a result, although early ERP effects related to perceptual and attention processes for typical and atypical conclusions should be found in all premise conditions (e.g., P2), the late ERP effects related to premise-conclusion matching or categorisation violations should have different response patterns (e.g., P3 or LPC).

Methods
Participants. Fifteen healthy undergraduate students (nine female) between the ages of 19 and 24 were paid for participating in the main experiment. Participants who initially rated the stimuli did not take part in the main procedure. All participants were right-handed with normal or corrected-to-normal vision, and gave their informed written consent before the experiment. All experimental protocols were approved by the University's ethics committee (The Medicine Medical Ethics Committee of Shenzhen University), and the methods were carried out in accordance with the relevant guidelines and regulations.
Stimuli. Before the main experiment, six natural categories (i.e., birds, fishes, insects, vegetables, flowers, and fruits) and four artificial categories (i.e., clothing, musical instruments, furniture, and tools) were selected as stimuli. In a manner similar to previous studies 6,14, forty undergraduates were required to write down as many members of the above categories as possible.
After this, another 40 undergraduates participated in a typicality test, in which they were required to estimate the extent to which these members epitomised the category on a seven-point scale, where 7 indicated the highest typicality. For example, category members such as "sparrow" and "crow" received a typicality rating of "6" or "7" as birds. By contrast, the degree of typicality for category members such as "penguin" or "duck" was lower than that for "sparrow" and "crow", and they might receive a rating of "3" or "4" as birds. After these studies, five typical members and five atypical members were selected from each category; together with the categories themselves, these composed a set of 110 items. An independent t-test suggested that there was a significant difference between typical members (M = 6.79, SD = 0.15) and atypical members (M = 4.65, SD = 0.52), t (58) = 21.66, p < 0.001.
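The reported t statistic can be checked from the summary values alone. The sketch below uses the pooled-variance form of the independent t-test and assumes 30 members per group; that group size is an inference from the reported 58 degrees of freedom and is not stated explicitly in the text. The function name is illustrative.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Means and SDs are taken from the text; n = 30 per group is an assumption.
t_value, df = pooled_t(6.79, 0.15, 30, 4.65, 0.52, 30)
print(f"t({df}) = {t_value:.2f}")  # prints: t(58) = 21.66
```

Under this assumption the computed value agrees with the reported t (58) = 21.66.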
The premise and conclusion of the arguments consisted of one of the 110 selected categories or members (see Supplementary Information for S1 Table, S2 Table and S3 Table), together with a blank property represented by a capital letter from A to Z (e.g., Sparrow A, which meant that sparrows have property A; see Table 1). The single premise category-based induction (e.g., Premise: Crows have property A; Conclusion: Sparrows have property A) was used to induce a memory load 1,19. To reduce the effect of prior knowledge, a blank property (e.g., A) was used rather than any actual property, as clarified further elsewhere 19,49. As mentioned earlier, the types of premises (general, typical, or atypical) and conclusions (general, typical, or atypical) were manipulated, yielding nine main experimental conditions. Moreover, the identical conditions for typical and atypical members, as well as the control condition, were regarded as baseline tasks. Specifically, the premise and conclusion for the main experimental and identical conditions consisted of natural categories with the same blank property. For the control condition, however, the premise and conclusion consisted of natural and artificial categories respectively, and their properties were different.
As shown in Table 1, the general premise condition included three types of argument: (1) General = General (G = G), in which the premise and conclusion consisted of the same category; (2) General-Typical (G-T), in which the premise consisted of a category and the conclusion consisted of a typical member; (3) General-Atypical (G-A), in which the premise consisted of a category and the conclusion consisted of an atypical member. The typical premise condition also included three types of argument: (1) Typical-General (T-G), in which the premise consisted of a typical member and the conclusion consisted of a category; (2) Typical-Typical (T-T), in which both the premise and conclusion consisted of typical members of the same category; (3) Typical-Atypical (T-A), in which the premise consisted of a typical member and the conclusion consisted of an atypical member of the same category. Similarly, the atypical premise condition included three types of argument: (1) Atypical-General (A-G), in which the premise consisted of an atypical member and the conclusion consisted of a category; (2) Atypical-Typical (A-T), in which the premise consisted of an atypical member and the conclusion consisted of a typical member of the same category; (3) Atypical-Atypical (A-A), in which both premise and conclusion consisted of atypical members of the same category. The baseline tasks were as follows: (1) Typical = Typical (T = T), in which the premise and conclusion consisted of the same typical member; (2) Atypical = Atypical (A = A), in which the premise and conclusion consisted of the same atypical member; (3) the control condition, in which the premises were members of natural categories and the conclusions were members of artificial categories.
Procedure. In the main procedure, participants judged whether the argument was "strong" or "not strong" by pressing one of two keys. The order of the stimuli within the task was randomised and counterbalanced. Following previous studies 17,18,50 , the "strong" argument was defined as: Assuming that the information represented by the premises is true, this makes the information represented by the conclusions plausible.
The categories and category members were presented as one to three black Chinese characters, which appeared on a grey background generated by the E-prime software. As shown in Fig. 1, a black fixation ("+") was presented in the centre of the screen for 600 ms at the beginning of each trial, followed by a blank screen for 500 ms. Subsequently, the premise and a novel property, linked by a blank space (e.g., Apple X) to indicate that the category member had that property, were presented for 1000 ms, followed by a blank screen shown for a random duration (800-1000 ms). Next, the conclusion and a novel property ending in "?" (e.g., Fruits X?), again linked by a blank space, appeared and remained until participants made a response. Participants were instructed to respond as rapidly and accurately as possible to the conclusion, and to make a "strong" or "not strong" response by pressing "F" or "J" with the left or right index finger; the key assignment was counterbalanced across subjects. The next trial began after the presentation of a blank screen for 500 ms.
To ensure that each participant understood the instructions, they were asked to repeat the instructions in their own words. Furthermore, to reduce the effect of familiarity, all categories and their members were presented by category at the beginning of the experiment. The procedure was divided into practice and test phases. The test phase consisted of 510 trials under the main experimental conditions (30 trials for G = G, 60 trials for each of the other conditions) and 240 trials for the baseline conditions (60 trials for A = A, 60 trials for T = T, and 120 trials for the control condition). The identical conditions (i.e., A = A, T = T) served as filler trials in which participants had to make "strong" responses, which were compared with the responses to the control condition ("not strong"), to improve the efficacy of the design 51. Furthermore, thirty practice trials were performed to familiarise participants with the procedure: these were selected from unused category members that were not included in the main experiment.
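As a quick consistency check on the design, the nine premise-conclusion pairings and the trial totals given above can be enumerated programmatically. This is only an illustrative sketch; the condition labels are shorthand and not the authors' own coding scheme.

```python
from itertools import product

LEVELS = ["G", "T", "A"]  # general, typical, atypical

# Nine premise-conclusion pairings. "G-G" stands for the identical G = G argument;
# "T-T" and "A-A" pair two different members of the same category.
main_trials = {}
for premise, conclusion in product(LEVELS, repeat=2):
    label = f"{premise}-{conclusion}"
    main_trials[label] = 30 if label == "G-G" else 60

# Baseline tasks: identical typical, identical atypical, and control arguments.
baseline_trials = {"T=T": 60, "A=A": 60, "control": 120}

print(sum(main_trials.values()), sum(baseline_trials.values()))  # prints: 510 240
```

The totals match the 510 main and 240 baseline trials stated in the procedure.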
ERP recordings and data analysis. Brain electrical activity was recorded from 64 tin electrodes mounted on an elastic cap based on the extended 10/20 system (Brain Products, GmbH, Germany; pass band: 0.05-100 Hz, sampling rate: 500 Hz). The ground electrode was on the medial frontal line and the references were on the left and right mastoids 52,53. The vertical and horizontal electro-oculograms (EOGs) were recorded from the left eye infra-orbitally and supra-orbitally, and from the orbital rim of both eyes, respectively. All impedances were kept below 10 kΩ. All the bioelectric signals were analysed off-line using Brain Vision Analyzer 2.0. The signal was passed through a 0.1 to 35 Hz digital band-pass filter for off-line analysis. Ocular correction ICA was used to eliminate artifacts such as blinks and eye movements. Off-line computerised artifact rejection was also used to eliminate trials with mean EOG (eye blinks and ocular movements), artifacts due to bursts of electromyography activity, amplifier clippings, or peak-to-peak deflections exceeding ±80 μV. In a manner similar to previous studies 6,19, we focused on data elicited by the conclusion. As a result, epochs of 1000 ms, time-locked to the conclusion and including a 200 ms pre-stimulus baseline, were extracted from the ongoing EEG, segmented and averaged (Fig. 2). In the present study, the identical conditions (except for G = G), as well as the control condition, were regarded as baseline tasks to ensure an effective "strong/not strong" judgment 17,51. In fact, participants made a very high proportion of "strong" responses for the identical conditions (M = 0.99, SD = 0.02), but a very low proportion for the control condition (M = 0.08, SD = 0.25). However, including these conditions in the same analysis would dilute any possible effect of typicality on induction. Indeed, we compared the identical conditions with one another, and no significant differences in ACC or RTs, nor any ERP effects, were found among them.
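The epoching and rejection steps described above can be sketched as follows. This is a simplified single-channel illustration on synthetic data, not the Brain Vision Analyzer pipeline. It assumes that the 1000 ms epoch spans 200 ms before to 800 ms after conclusion onset, and it applies the ±80 μV criterion as an absolute-amplitude threshold after baseline correction; both are interpretations of the text.

```python
import random

SFREQ = 500                 # sampling rate in Hz (from the text)
PRE_MS, POST_MS = 200, 800  # assumed split of the 1000 ms epoch
REJECT_UV = 80.0            # assumed reading of the criterion: |amplitude| > 80 uV

def ms_to_samples(ms):
    return ms * SFREQ // 1000

def extract_epoch(signal, onset):
    """Cut one epoch around `onset` (a sample index) and subtract the baseline mean."""
    n_pre = ms_to_samples(PRE_MS)
    segment = signal[onset - n_pre : onset + ms_to_samples(POST_MS)]
    baseline = sum(segment[:n_pre]) / n_pre
    return [value - baseline for value in segment]

def is_clean(epoch):
    """Keep an epoch only if no sample exceeds the rejection threshold."""
    return max(abs(value) for value in epoch) <= REJECT_UV

# Synthetic single-channel recording in microvolts, for illustration only.
random.seed(0)
signal = [random.gauss(0, 5) for _ in range(5000)]
signal[3100] += 200         # simulated artifact inside the second epoch
onsets = [1000, 3000]       # hypothetical conclusion onsets (sample indices)

epochs = [extract_epoch(signal, onset) for onset in onsets]
clean = [epoch for epoch in epochs if is_clean(epoch)]
print(len(epochs[0]), len(clean))  # 500 samples per epoch; one epoch survives
```

Averaging the surviving epochs per condition would then give the waveforms from which the component measures are taken.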
The main effect of premise on RT was significant, F (2, 28) = 22.60, p < 0.001, η² = 0.62. Similarly, the main effect of conclusion was significant, F (2, 28) = 56.98, p < 0.001, η² = 0.80. In both cases, RTs were longest in the atypical conditions (Premise: 945 ms; Conclusion: 992 ms), shortest in the general conditions (Premise: 818 ms; Conclusion: 806 ms), and intermediate in the typical conditions (Premise: 892 ms; Conclusion: 862 ms), and all three conditions differed from each other (ps < 0.006). Furthermore, the interaction between premise and conclusion was significant, F (4, 56) = 3.33, p = 0.025, η² = 0.19. Pair-wise comparison indicated that the RT for a general conclusion was shorter than that for typical and atypical conclusions in the general, typical, and atypical premise conditions (ps < 0.006). However, a shorter RT was found for typical conclusions relative to atypical conclusions only when the premise was a general category (p < 0.001) or a typical member (p < 0.05), but not when it was an atypical member (p > 0.90).
ERP results. As mentioned above, four-way repeated measures ANOVAs were performed for the mean peak amplitude, mean amplitude, and peak latency of N1, P2, N2, and P3, as well as the mean amplitude of LPC (see Tables 2 and 3).

N1 (50-150 ms).
As shown in Tables 2 and 3, the main effect of premise on the N1 peak was significant: the N1 elicited by a general premise was larger than that arising from an atypical premise (p = 0.016). The main effects of frontality and laterality on N1 mean amplitude were significant. Pair-wise comparison indicated that the N1 component at frontal, frontal central, and central sites was larger than that at central-parietal and parietal sites (ps < 0.05). The N1 at left and middle sites was larger than that at right sites (ps < 0.01). The interactions between laterality and premise on N1 mean amplitude and peak were significant. Pair-wise comparison indicated that the difference between a general premise and an atypical premise was found at middle and right sites (ps < 0.04), but not at left sites (p > 0.10). Although a three-way interaction among frontality, premise, and conclusion on the N1 amplitude was found, no significant interaction between premise and conclusion was found at sites with different frontality (ps > 0.10). No other significant difference was found.
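The effect sizes reported for the RT analyses can be recovered from the F ratios and their degrees of freedom using the standard identity for partial eta squared, F·df1 / (F·df1 + df2). That the reported η² values are partial eta squared is an assumption; the text does not say which variant was used.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)

# F values and degrees of freedom as reported for the RT ANOVA.
reported = [
    ("premise", 22.60, 2, 28),
    ("conclusion", 56.98, 2, 28),
    ("premise x conclusion", 3.33, 4, 56),
]
for name, f_value, df1, df2 in reported:
    print(f"{name}: {partial_eta_squared(f_value, df1, df2):.2f}")
# prints 0.62, 0.80 and 0.19, matching the reported values
```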
Scientific Reports | 6:37890 | DOI: 10.1038/srep37890

P2 (150-250 ms).
The main effects of conclusion on P2 mean amplitude and peak were significant. Pair-wise comparison found that the P2 peak elicited by general conclusions was smaller than that elicited by typical and atypical conclusions (ps < 0.01), and the mean amplitude for general conclusions was smaller than that for atypical conclusions (p < 0.001). The interaction between premise and conclusion on P2 latency was significant: the P2 latency for general conclusions was shorter than that for atypical conclusions in the atypical premise condition (p = 0.03), but not in the general and typical premise conditions (ps > 0.10).
The main effects of laterality on P2 mean amplitude and peak were significant. The P2 at right sites was larger than that at left sites (p = 0.03). Furthermore, the interaction between frontality and conclusion was significant. In addition to the larger P2 amplitude for atypical members relative to the general category found at all sites (ps < 0.04), the P2 elicited by atypical members was larger than that arising from typical members at frontal, frontal central, and central sites (ps < 0.05), but not at central parietal and parietal sites (ps > 0.33). Similarly, the interaction among frontality, laterality, and conclusion on P2 amplitude was significant. Further analysis indicated that a smaller P2 was elicited by general relative to atypical conclusions at Fz, F3, F4, FCz, FC3, FC4, and Pz (ps < 0.007), whereas such a difference between general and typical conclusions was only found at F3 sites (p = 0.013). Furthermore, the larger P2 elicited by atypical conclusions relative to typical conclusions was only found at F4 and FC4 (ps < 0.05). No other significant difference was found (ps > 0.05).

N2 (250-350 ms).
The main effects of premise, as well as conclusion, on N2 peak and mean amplitude were significant. Pair-wise comparison indicated that the N2 elicited by a typical premise was larger than that elicited by a general premise (ps < 0.05). Similarly, the N2 elicited by typical and atypical conclusions was larger than that elicited by general conclusions (ps < 0.02). The interaction between premise and conclusion on N2 mean amplitude was significant. The N2 elicited by a general conclusion was smaller than that elicited by typical and atypical conclusions only in the general premise condition (ps < 0.01), but not in the typical and atypical premise conditions (ps > 0.20). The interaction among frontality, premise, and conclusion on N2 mean amplitude was significant. Further analysis indicated that the smaller N2 elicited by a general conclusion relative to typical and atypical conclusions was only found at frontal, frontal central, and central sites (ps < 0.01). The three-way interaction among laterality, premise, and conclusion on N2 mean amplitude and peak was significant. Further analysis found that the interaction between premise and conclusion was only found at middle sites (ps < 0.01), but not at left and right sites (ps > 0.10).
The main effects of frontality and laterality on N2 mean amplitude and peak were significant. The N2 component at frontal and frontal central sites was larger than that at central-parietal and parietal sites (ps < 0.05), and the N2 at left and middle sites was larger than that at right sites (ps < 0.001). The interaction between laterality and premise on N2 mean amplitude was significant. Pair-wise comparison indicated that the N2 elicited by a typical premise was larger than that of a general premise at middle (p = 0.03) and right sites (p = 0.01), but not at left sites (p = 0.15). Furthermore, the interactions between laterality and conclusion on N2 mean amplitude and peak were significant. That is, the N2 elicited by general conclusions was smaller than that from typical and atypical conclusions at middle and right sites (ps < 0.03), but not at left sites (ps > 0.24). The interaction among frontality, laterality, and conclusion was significant. Further analysis indicated that the difference between general and typical conclusions was found at Fz, F4, FCz, FC4, Cz, C4, CPz, CP4, Pz, and P4, whereas the difference between general and atypical conclusions was found at Fz, FCz, FC4, Cz, C4, CPz, CP4, Pz, P4, and P3 (ps < 0.05).

P3 (350-450 ms).
The main effects of premise, as well as conclusion, on P3 peak and mean amplitude were significant. Pair-wise comparison indicated that the P3 elicited by a general premise was larger than that from a typical premise (p < 0.05). Similarly, the P3 elicited by general conclusions was larger than that elicited by typical and atypical conclusions (ps < 0.03). The interactions between premise and conclusion on P3 mean amplitude and peak were significant. Whereas the P3 elicited by a general conclusion was larger than that from a typical conclusion in the general and typical premise conditions (ps < 0.05), a larger P3 elicited by general conclusions relative to atypical conclusions was found in the general and atypical premise conditions (ps < 0.05). Furthermore, the smaller P3 amplitude elicited by atypical conclusions relative to general and typical conclusions was only found in the general premise condition (ps < 0.02). The three-way interactions among frontality, premise, and conclusion on P3 amplitude and peak were significant. Further analysis indicated that the smaller P3 elicited by atypical conclusions relative to general (ps < 0.01) and typical conclusions (ps < 0.05) was only found in the general premise condition at frontal, frontal central, central, and central parietal sites. The three-way interaction among laterality, premise, and conclusion on P3 amplitude was significant. Further analysis found that the interaction between premise and conclusion was more significant at middle (p = 0.001) and right sites (p = 0.003) relative to left sites (p = 0.03). No other significant difference was found.
The main effects of frontality and laterality on P3 amplitude and peak were significant. The P3 at central, central-parietal, and parietal sites was larger than that at frontal and frontal central sites (ps < 0.01), and the P3 at right sites was larger than that at left and middle sites (ps < 0.01). The interaction between laterality and premise on P3 amplitude was significant. Although the P3 elicited by a general premise was larger than that arising from a typical premise at all sites (p < 0.05), it was only larger than that arising from an atypical premise at middle sites (p = 0.025). Furthermore, the interactions between laterality and conclusion on P3 amplitude and peak were significant. That is, the P3 elicited by a general conclusion was larger than that from atypical conclusions at all sites (ps < 0.03), whereas it was larger than that arising from typical conclusions only at middle and right sites (ps < 0.05). The three-way interaction among frontality, laterality, and conclusion on P3 amplitude was significant. Further analysis indicated that the difference between general and typical conclusions was found at F4, FC4, C4, CPz, CP4, and P3, whereas the difference between general and atypical conclusions was found at Fz, F4, FCz, FC4, Cz, C4, CPz, CP4, and Pz (ps < 0.05).
Table 3. Four-way repeated-measures ANOVA of mean peak latency and amplitude to assess the effects of typicality on category-based induction.

LPC (500-800 ms).
The main effect of premise on LPC was significant. Pair-wise comparison found that the LPC elicited by a general premise was larger than that arising from typical and atypical premises at both 500-600 ms and 600-700 ms (ps < 0.03), whereas it was only larger than that from an atypical premise at 700-800 ms (p = 0.042). Furthermore, the interaction between premise and conclusion was significant at 500-700 ms, but not at 700-800 ms. At 500-600 ms, the mean amplitude of the LPC elicited by atypical conclusions was smaller than that from typical conclusions only in the general premise condition (p < 0.03), but not in the typical and atypical premise conditions (p > 0.50). At 600-700 ms, however, a larger LPC for atypical conclusions relative to typical conclusions was found only in the typical premise condition. The main effects of frontality and laterality were significant at 500-800 ms. The LPC component at central, central-parietal, and parietal sites was larger than that at frontal and frontal central sites (ps < 0.05). In addition, the LPC at right sites was larger than that at left and middle sites (ps < 0.01). Furthermore, the interactions between laterality and premise, and between laterality and conclusion, were significant at 500-600 ms and 600-700 ms. The larger LPC elicited by a general premise relative to typical and atypical premises was found at middle and right sites (p < 0.05), but not at left sites. Although the absolute differences among conclusions were larger at right sites relative to middle and left sites, they did not reach significance. The interaction between frontality and conclusion was significant at 600-700 ms and 700-800 ms. Further analysis indicated that the LPC elicited by atypical conclusions was larger than that from typical conclusions at parietal sites (p = 0.03). The three-way interaction among frontality, laterality, and conclusion was significant.
Further analysis indicated that the difference between general and atypical conclusions was found at F4 and FC4 (ps < 0.05), whereas the difference between typical and atypical conclusions was only found at P4 (ps < 0.01). No other difference was found.
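The component measures analysed above (mean amplitude, peak amplitude, and peak latency within a fixed window) can be illustrated with a short sketch. The windows are those named in the subsection headings; the waveform is synthetic, the 200 ms pre-stimulus offset is an assumption about how the epoch is laid out, and the function name is hypothetical.

```python
import math

SFREQ = 500   # Hz, from the text
PRE_MS = 200  # assumed pre-stimulus baseline at the start of each epoch

WINDOWS = {   # ms after conclusion onset, from the subsection headings
    "N1": (50, 150), "P2": (150, 250), "N2": (250, 350), "P3": (350, 450),
    "LPC 500-600": (500, 600), "LPC 600-700": (600, 700), "LPC 700-800": (700, 800),
}

def window_measures(erp, start_ms, end_ms, negative=False):
    """Return (mean amplitude, peak amplitude, peak latency in ms) within a window."""
    first = (start_ms + PRE_MS) * SFREQ // 1000
    last = (end_ms + PRE_MS) * SFREQ // 1000
    segment = erp[first:last]
    mean_amp = sum(segment) / len(segment)
    # Negative-going components are scored by their minimum, positive by their maximum.
    peak = min(segment) if negative else max(segment)
    latency = (first + segment.index(peak)) * 1000 / SFREQ - PRE_MS
    return mean_amp, peak, latency

# Synthetic averaged waveform in microvolts: one positive deflection centred at 200 ms.
times = [i * 1000 / SFREQ - PRE_MS for i in range(500)]
erp = [4.0 * math.exp(-((t - 200) / 40) ** 2) for t in times]

for name, (start, end) in WINDOWS.items():
    mean_amp, peak, latency = window_measures(erp, start, end, negative=name.startswith("N"))
    print(f"{name}: mean={mean_amp:.2f} peak={peak:.2f} latency={latency:.0f} ms")
```

In an analysis of this kind, such values would be computed per participant, condition, and electrode before entering the repeated-measures ANOVAs.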
We also analysed the link between behaviour and the evoked components of interest across subjects. The results found that there was no significant correlation between RTs and the mean amplitudes of N1, P2, N2, P3, and LPC for T-T, A-A, and G = G conditions. For the G-T, G-A, T-G, A-G, and A-T conditions, there were significant negative correlations between RTs and the LPC amplitudes at part of the frontal and frontal-central sites. The significant negative correlation between RTs and P2 amplitudes was only found for T-A condition at frontal sites. No other significant correlation was found.
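A brain-behaviour correlation of this kind can be sketched as below. The text does not name the coefficient used, so Pearson's r is an assumption, and the per-subject values are hypothetical numbers chosen only to illustrate a negative relationship.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-subject mean RTs (ms) and LPC amplitudes (microvolts) for one condition.
rts = [780, 820, 860, 900, 940, 980]
lpc = [6.1, 5.6, 5.2, 4.1, 3.9, 3.0]

r = pearson_r(rts, lpc)
print(f"r = {r:.2f}")  # negative: slower responders show smaller LPC amplitudes
```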

Discussion
The main goal of this study was to examine how types of premises modulate the typicality effect in category-based induction by focusing on the electrophysiological data elicited by the conclusions. The behavioural data showed that the processing of arguments with typical conclusions was faster than that of arguments with atypical ones in both the general and typical premise conditions, but not in the atypical premise condition. Furthermore, the "strong" rate for arguments with typical conclusions was higher than that for atypical ones under all premise conditions. According to recent accounts 48,54, typicality is determined by the intercorrelation of semantic features. That is, typical members of a category, relative to atypical ones, possess features that are highly intercorrelated with those of other typical items. For example, sparrow and crow are typical birds because of their typical intercorrelated semantic features (e.g., wings, can fly), while ostrich is an atypical item because of its less intercorrelated features (e.g., cannot fly). In the present study, there were more common attributes between the premises and typical conclusions than between the premises and atypical conclusions, resulting in higher inductive strength for typical items. These results were consistent with previous studies 1,4, which suggested that arguments consisting of typical members can improve the acceptability of induction relative to arguments consisting of atypical members. One might suggest that typical items are more rapidly or easily processed than atypical items, leading to greater ease (and more confidence) when people must make inferences about these items belonging to categories. However, such a view could not explain why no significant difference in RTs was found between typical and atypical conclusions in the atypical premise condition. Furthermore, the "strong" rate for arguments with general conclusions was higher than that for atypical conclusions.
A smaller P2 and N2, as well as a larger P3 and LPC at 500-600 ms, were elicited by general conclusions relative to atypical conclusions. These results are consistent with the research about the inclusion fallacy, in which generalisation for general conclusions is considered stronger than that for atypical conclusions in the specific premise conditions 1,55 . For example, Liang et al. 55 found that the left fronto-temporal and superior medial frontal systems were specifically activated in response to fallacious responses (e.g., the argument such as "robins secrete uric acid crystals, therefore, birds secrete uric acid crystals" was considered more convincing than the argument that "robins secrete uric acid crystals, therefore, ostriches secrete uric acid crystals").
The typicality effect was also present in the ERP data. As shown in Figs 2 and 3, when performing category-based induction in the general premise condition, a larger P2, as well as smaller P3 and LPC effects, were elicited by atypical conclusions relative to typical ones. The results in the specific premise conditions had different response patterns: larger P2 and LPC effects were found for atypical conclusions relative to typical conclusions in the typical premise condition, whereas only a larger P2 was found for atypical conclusions in the atypical premise condition. The divergence of the P2, P3, and LPC effects between the general and specific premise conditions yielded a new insight into the neural underpinnings of the typicality effect in the category-based induction task.
N1 and P2 predict perceptual processing and feature detection. Previous studies found a larger N1 effect for atypical stimuli relative to typical stimuli 6,12,13. In fact, the N1 typicality effect was mainly found in studies using single words or pictures as stimuli 12,13. Although Lei et al. 6 used a category-based deduction task, the conclusions consisted of only one word and a larger N1 was elicited by atypical ones. In the present study, a word was joined to a capital letter by a blank space to represent the argument, and this manipulation might have reduced the difference in the N1 effect. This might be the reason why, although the absolute N1 amplitude elicited by atypical conclusions (−1.85 μV) was larger than that for typical conclusions (−1.35 μV) in general premise conditions, the difference was not significant (ps > 0.23).
Similar to the findings of a previous study 6, we also found a larger P2 effect for atypical conclusions relative to typical ones. As Lei et al. 6 noted, these results indicate that typicality has a significant effect on the verification of an item's category membership. In fact, research has found that the verification of typical members is faster than that of atypical ones 4. As mentioned earlier, the P2 effect is modulated by higher-order perceptual processing 23-25, as well as semantic processing and language 26,27. In the present study, the detection of atypical items might have involved more cognitive resources (e.g., attention processes increased with the organisational semantic demand) and thus elicited a larger P2 amplitude, just like the larger P2 elicited by Chinese characters with low combinability and consistency in orthographic and phonological processing 26,27. This view can also be used to interpret the larger P2 effect for atypical conclusions relative to general conclusions, because the processing of the general category at a basic level was faster than that for atypical members at a subordinate level 56.
The P3 wave is specifically tuned to categorisation. Of greater interest was the observation of a smaller P3 effect for atypical conclusions relative to typical conclusions in the general premise condition. As mentioned earlier, the P3 wave reflects the information-processing cascade related to attentional and memory mechanisms 23,31,32. In the present study, the observed differences in the P3 effect might reflect the attentional resource allocation needed for reasoning from a general category to its typical or atypical members. That is, the smaller P3 effect indicated that the processing of atypical members involved more attentional resources during category-based induction. This view was further supported by the result that a larger P3 was elicited by general conclusions relative to typical and atypical conclusions, because reasoning about general conclusions (especially in the G = G condition) was the simplest argument, which mobilised minimal cognitive resources and elicited the largest P3.
A related explanation is that the difference in P3 amplitude simply reflected the satisfaction of expectations 19,34-37 , with a larger P3 for matched conclusions than for mismatched ones. In the general premise condition, a larger P3 amplitude was elicited by typical conclusions relative to atypical ones, because typical members have more features in common with the prototype induced by the category 57,58 . These results were consistent with studies of category-based verification, which found larger P3 peak amplitudes for typical items relative to atypical ones 11 . However, such differences in P3 amplitude between typical and atypical conclusions were not found in the typical and atypical premise conditions. This may be because expectations about the conclusions did not differ significantly when reasoning from specific premises. That is, different cognitive processes might be involved in reasoning with general and specific premises 1,59 . According to the similarity-based induction model, for example, the strength of an argument is determined by similarity and coverage processes 1 . Specifically, the similarity process computes the extent to which the premise categories are similar to the conclusion category, whereas the coverage process computes the degree of similarity between the premise categories and the members of the lowest-level category that includes both the premise and the conclusion categories. Induction from specific premises involves both similarity and coverage processes, whereas only the coverage process is involved in the general premise condition 1 . In the present study, additional processes (e.g., similarity) were involved in the specific premise conditions, which might have reduced the effect of expectation or increased task difficulty, thereby making the P3 less sensitive to expectation.
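The similarity and coverage processes described above can be illustrated with a minimal sketch. It assumes the commonly described form of the model, in which similarity to a set of premises is the maximum pairwise similarity, coverage is the average of that maximum over the members of the inclusive category, and a weight `alpha` combines the two terms. The function names, the value of `alpha`, and all similarity values are hypothetical and chosen only for illustration; they are not taken from the present study.

```python
from typing import Callable, Sequence

SimFn = Callable[[str, str], float]

def max_similarity(premises: Sequence[str], target: str, sim: SimFn) -> float:
    """Similarity of a target to a premise set: the best-matching premise."""
    return max(sim(p, target) for p in premises)

def coverage(premises: Sequence[str], members: Sequence[str], sim: SimFn) -> float:
    """Average, over members of the inclusive category, of the
    maximum similarity between each member and the premises."""
    return sum(max_similarity(premises, m, sim) for m in members) / len(members)

def argument_strength(premises: Sequence[str], conclusion: str,
                      members: Sequence[str], sim: SimFn,
                      alpha: float = 0.5) -> float:
    """Weighted sum of the similarity term and the coverage term.
    alpha is a free parameter (hypothetical value here)."""
    return (alpha * max_similarity(premises, conclusion, sim)
            + (1 - alpha) * coverage(premises, members, sim))

# Hypothetical pairwise similarities among three members of BIRD.
_PAIRS = {
    frozenset(("robin", "crow")): 0.8,
    frozenset(("robin", "penguin")): 0.3,
    frozenset(("crow", "penguin")): 0.2,
}

def toy_sim(a: str, b: str) -> float:
    """Symmetric lookup; an item is maximally similar to itself."""
    return 1.0 if a == b else _PAIRS[frozenset((a, b))]

birds = ["robin", "crow", "penguin"]
cov = coverage(["crow"], birds, toy_sim)
strength_typical = argument_strength(["crow"], "robin", birds, toy_sim)
strength_atypical = argument_strength(["crow"], "penguin", birds, toy_sim)
```

With these illustrative values, the coverage term is identical for both arguments because they share the same premise and inclusive category, so the difference in strength between the typical and atypical conclusion comes entirely from the similarity term.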
LPC semantic content effects in the inference generation phase
More importantly, the LPC at 500-600 ms elicited by atypical conclusions was smaller than that elicited by general and typical conclusions in the general premise condition, whereas the LPC at 600-700 ms elicited by atypical conclusions was larger than that elicited by typical conclusions in the typical premise condition. Furthermore, no significant difference in the LPC was found in the atypical premise condition. As mentioned earlier, the LPC in reasoning studies reflects mismatched conclusions 19,35,37 , as well as decision accuracy and confidence 46 .
In the general premise condition, one plausible view is that the LPC differences among conclusions reflected the old/new effect, as well as the contextual effect 42,43 . That is, in the G = G condition, the same general category was presented in the premise and the conclusion, and this repetition was characterised by an increase in LPC amplitude. Although the typical member in the G-T condition was not a repetition, participants could encode more information about the typical conclusion from the context at the general category level, which elicited a larger LPC relative to atypical conclusions. Another plausible view is that the difference in the LPC at 500-600 ms between typical and atypical conclusions was simply a continuation of the P3 effect discussed above, which reflected attentional and memory allocation. Indeed, the LPC at 500-600 ms elicited by atypical conclusions was also smaller than that elicited by general conclusions in the general premise condition, which further supports this view.
In the typical premise condition, however, no item was repeated, and the context of typical members could not account for this difference. Similar to the findings of previous studies 35,37 , one plausible interpretation is that the LPC was more likely to reflect a mismatched conclusion. For example, research using deductive reasoning tasks has found that a mismatched conclusion elicited a larger LPC at 330-630 ms than a matched conclusion, which reflected inference violation. In the present study, the mismatch in reasoning from a typical premise to an atypical conclusion was greater than that in reasoning from a typical premise to a typical conclusion, and thus elicited a larger LPC at 600-700 ms. Another plausible view is that the LPC at 600-700 ms reflected the processing manipulation, or decision accuracy and confidence 22,46 . That is, when participants were required to make inferences about different types of conclusion in the typical premise condition, the inductive strength of, or confidence in, atypical conclusions was lower than that for typical conclusions, as these conclusions were not necessarily true and differed in representativeness. Alternatively, to make a similar inductive inference, participants had to encode the atypical conclusion more deeply, which elicited a larger LPC.
The divergence of the LPC effect between typical and atypical conclusions in the general and typical premise conditions might be caused by the reasoning process and its validity. As mentioned earlier, the conclusions in the general premise condition must be true, whereas those in the typical premise condition were not necessarily true. Furthermore, different processes (e.g., similarity and coverage) might be involved in the specific and general premise conditions, resulting in different LPC effects 1 . However, we did not dissociate the general premise from the specific premises by comparing the ERP data directly. One reason was that previous studies have found significant differences in the ERP waves elicited by categories at different hierarchical levels, such as basic-level, superordinate, and subordinate categories 60,61 . For example, ERP amplitudes between 320 and 420 ms reflected the differentiation between basic-level and superordinate categorisations, whereas the dissociation of subordinate and basic-level categorisations was mainly found at 450-550 ms 61 . As a result, it was difficult to determine whether the results were induced by conceptual processing or reflected inference processing, even though we found significant differences between general and specific premises (e.g., in the LPC). Future research into this issue is needed.
More generally, diverging evidence was found for the typicality effect in category-based induction under different premise conditions. That is, although the P2 responded reliably to typicality under all premise conditions, the P3 and the LPC at 500-600 ms were sensitive to typicality only under the general premise condition, and the LPC effect at 600-700 ms was found only under the typical premise condition. These results are consistent with dual-process accounts of reasoning, which posit two distinct cognitive systems underlying reasoning 62,63 . Specifically, system 1 is fast, automatic, and largely unconscious, and competes with the analytic system, whereas system 2 is slower, deliberative, conscious, and rule-based. In the present study, the stable differences in the P2 might be related to system 1, whereas the diverging evidence pertaining to the P3 and LPC might be related to system 2. By manipulating premise-conclusion similarity (or argument length) and the logical validity of arguments, recent studies indicated that a two-process account of reasoning is better suited to interpreting the relationship between inductive and deductive reasoning 17,18 . For example, for a common set of arguments, induction judgments ("strong" or "not strong") were more affected by premise-conclusion similarity, whereas deduction judgments ("valid" or "not valid") were more affected by validity 1,17 . In the present study, typicality might be another qualitative phenomenon that could be used to support the two-process account of reasoning, just like premise-conclusion similarity 17 . However, only induction judgments were included in the present study, and further studies including deduction judgments might help to shed light on this issue. As the spatial resolution of ERPs is rather low, future studies should combine fMRI with ERP recordings to explore this issue further.

Conclusion
The present findings have yielded new insights into the processing of the typicality effect in category-based induction by manipulating the types of premise and conclusion. The larger P2 for atypical conclusions relative to typical conclusions, which was found in all three premise conditions, served as a marker of typicality processing and feature detection. The larger P3 for typical conclusions relative to atypical conclusions, which was found only in the general premise condition, indicated that the P3 amplitude could reflect resource allocation. Furthermore, a larger LPC at 500-600 ms was elicited by typical conclusions relative to atypical conclusions in the general premise condition, which reflected the contextual effect between a category and its members, or simply a continuation of the P3. However, in the typical premise condition, the LPC at 600-700 ms elicited by typical conclusions was smaller than that elicited by atypical conclusions, which is more likely to reflect inference violation or processing manipulation. Overall, these results suggest that the types of premises play a critical regulatory role in the typicality effect in category-based induction, as reflected in the P2, P3, and LPC effects.