BatteryText Classifier

Enter a paragraph (an abstract or a full-text passage) to classify.

Text: Computational studies based upon density functional theory (DFT) have been carried out on the LixNi0.5Mn0.5O2 system, a promising cathode material for rechargeable lithium batteries. Electronic structure calculations suggest that the nominal valence state distribution is given by the formula . Possible Ni−Mn cation ordering schemes in the layered structure have been examined including intralayer and interlayer configurations. The results on lithium deintercalation of LixNi0.5Mn0.5O2 indicate that the electrochemical behavior is linked to the oxidation of Ni2+. Our calculated cell voltage range as a function of lithium content (x) is compatible with electrochemical measurements that generally show sloping voltage profiles. The calculated Mn−O bond length shows relative invariance with Li extraction, whereas the Ni−O bond shortens significantly, which accords well with the available structural data.

Source: Chem. Mater. 2003, 15, 22, 4280–4286

Text: A database of battery materials is presented which comprises a total of 292,313 data records, with 214,617 unique chemical-property data relations between 17,354 unique chemicals and up to five material properties: capacity, voltage, conductivity, Coulombic efficiency and energy. 117,403 data are multivariate on a property where it is the dependent variable in part of a data series. The database was auto-generated by mining text from 229,061 academic papers using the chemistry-aware natural language processing toolkit, ChemDataExtractor version 1.5, which was modified for the specific domain of batteries. The collected data can be used as a representative overview of battery material information that is contained within text of scientific papers. Public availability of these data will also enable battery materials design and prediction via data-science methods. To the best of our knowledge, this is the first auto-generated database of battery materials extracted from a relatively large number of scientific papers. We also provide a Graphical User Interface (GUI) to aid the use of this database.

Source: Sci. Data 2020, 7, 1, 1–13

Text: The emergence of “big data” initiatives has led to the need for tools that can automatically extract valuable chemical information from large volumes of unstructured data, such as the scientific literature. Since chemical information can be present in figures, tables, and textual paragraphs, successful information extraction often depends on the ability to interpret all of these domains simultaneously. We present a complete toolkit for the automated extraction of chemical entities and their associated properties, measurements, and relationships from scientific documents that can be used to populate structured chemical databases. Our system provides an extensible, chemistry-aware, natural language processing pipeline for tokenization, part-of-speech tagging, named entity recognition, and phrase parsing. Within this scope, we report improved performance for chemical named entity recognition through the use of unsupervised word clustering based on a massive corpus of chemistry articles. For phrase parsing and information extraction, we present the novel use of multiple rule-based grammars that are tailored for interpreting specific document domains such as textual paragraphs, captions, and tables. We also describe document-level processing to resolve data interdependencies and show that this is particularly necessary for the autogeneration of chemical databases since captions and tables commonly contain chemical identifiers and references that are defined elsewhere in the text. The performance of the toolkit to correctly extract various types of data was evaluated, affording an F-score of 93.4%, 86.8%, and 91.5% for extracting chemical identifiers, spectroscopic attributes, and chemical property attributes, respectively; set against the CHEMDNER chemical name extraction challenge, ChemDataExtractor yields a competitive F-score of 87.8%.

Source: J. Chem. Inf. Model. 2016, 56, 10, 1894–1904