Figure 2: The combined ChemDataExtractor and modified Snowball pipeline. | Scientific Data

From: Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction

Article sentences are first parsed with the built-in ChemDataExtractor phrase parsers. Incomplete records that contain properties but no associated compounds are passed to the modified Snowball pipeline as candidates for relationship extraction. Each candidate sentence is split into its elements and vectorised to form a candidate phrase object, which is then compared with the pre-trained extraction patterns. The similarity between the candidate phrase object and these extraction patterns is used to assign a confidence score using Equation 5. If the confidence score is sufficiently high, the relationship within the candidate phrase object is accepted. All complete records are then passed through the ChemDataExtractor interdependency resolution stage and added to the database.
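
The scoring step described in the caption can be illustrated with a minimal Python sketch. It is not the article's implementation: Equation 5 is not reproduced on this page, so the sketch assumes the confidence combination from the original Snowball algorithm (one minus the product of the per-pattern miss probabilities), and it assumes a plain bag-of-words cosine similarity in place of the pipeline's actual phrase vectors. The names `vectorise`, `cosine`, `candidate_confidence`, `accept` and the default `threshold` are all hypothetical.

```python
from collections import Counter
from math import sqrt


def vectorise(tokens):
    """Turn a list of sentence elements into a bag-of-words vector (assumed form)."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def candidate_confidence(candidate, patterns):
    """Combine similarities to all patterns into one confidence score.

    `patterns` is a list of (pattern_vector, pattern_confidence) pairs.
    Assumed Snowball-style rule: 1 - prod(1 - conf_i * similarity_i).
    """
    remaining = 1.0
    for pattern_vector, pattern_confidence in patterns:
        remaining *= 1.0 - pattern_confidence * cosine(candidate, pattern_vector)
    return 1.0 - remaining


def accept(tokens, patterns, threshold=0.7):
    """Accept the candidate relationship if its confidence reaches the threshold.

    The threshold value is illustrative only; the caption does not state one.
    """
    return candidate_confidence(vectorise(tokens), patterns) >= threshold
```

With this combination rule, a candidate that closely matches several moderately reliable patterns can score higher than one that matches a single pattern, while a candidate sharing no elements with any pattern scores zero.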