
Workshop on sustainable exploration of chemical spaces with machine learning

02 Jun 2025

Digital Discovery is pleased to support the SusML Workshop 2025!

The rising demand for sustainable machine learning (ML)-assisted solutions to technological and societal challenges has driven significant research and development efforts in materials science and computational chemistry. Despite notable progress, challenges remain in developing Efficient, Accurate, Scalable, and Transferable (EAST) methodologies that minimize energy consumption and data storage while creating robust ML models. The SusML workshop (https://susml.net) aims to bring together renowned scientists and emerging junior researchers pioneering advancements at the intersection of materials science, chemistry, and ML. The workshop will focus on fostering dynamic discussions and generating innovative ideas for developing EAST methodologies—a critical element for sustainable exploration (both directly and inversely) of the chemical space encompassing molecules and materials.

Deadline: Applications and abstract submissions will be accepted until June 15, 2025. See details at https://susml.net/#Application

Venue: Max Planck Institute for the Physics of Complex Systems, Dresden, Germany.

Invited speakers

David Balcells (University of Oslo)
Ganna Gryn’ova (University of Birmingham)
Anatole von Lilienfeld (University of Toronto)
Hanna Türk (École Polytechnique Fédérale de Lausanne)
Anton Bochkarev (Ruhr-Universität Bochum)
Veronika Juraskova (University of Oxford)
Volker Deringer (University of Oxford)
Jacqueline Cole (University of Cambridge)
Johannes Margraf (Universität Bayreuth)
Luca Ghiringhelli (Friedrich-Alexander-Universität)
Rico Friedrich (Technische Universität Dresden)
Janine George (Bundesanstalt für Materialforschung und -prüfung)
Thorben Frank (Technische Universität Berlin)
Adrian Ehrenhofer (Technische Universität Dresden)

Organizers

Leonardo Medrano Sandonas (Technische Universität Dresden)
Mariana Rossi (MPI for the Structure and Dynamics of Matter)
Alexandre Tkatchenko (University of Luxembourg)
Milica Todorović (University of Turku)
Gianaurelio Cuniberti (Technische Universität Dresden)

Contact: susml@tu-dresden.de


Call for papers celebrating the International Year of Quantum Science and Technology 2025

24 Apr 2025

A banner summarising the information in this post.

We are delighted to announce a call for papers celebrating the UNESCO International Year of Quantum Science and Technology 2025. This collection across a selection of our materials, nanoscience, physical chemistry and interdisciplinary journals is now open for submissions.

The submission deadline is 1 October 2025.

For this broad collection of articles celebrating Quantum Science and Technology we encourage contributions on topics including, but not limited to:

New quantum mechanical computational chemistry methods
- Focusing on new methods to provide expanded variability (customization) to programs and algorithms applied to molecular and materials discovery.
Studies on materials and nanostructures which exploit quantum effects
- The engineering and investigation of materials and nanostructures that exploit quantum mechanical (QM) effects. The collection seeks papers that offer mechanistic insights or advance the understanding of quantum effects, rather than routine experimental studies that focus on material or device performance.
Cross-disciplinary studies looking at quantum effects in molecular systems
- Studies that bridge chemistry with adjacent disciplines to understand electronic and fundamental effects such as quantum dot cellular automata.
Applications of quantum computing in chemistry
- The design of new quantum algorithms, and application of existing algorithms, in the calculation, prediction and design of atomic, molecular, and materials properties.

This collection will be hosted across the following journals.

Chemical Science, Chemical Communications, RSC Applied Interfaces and RSC Advances

Digital Discovery and Physical Chemistry Chemical Physics

Materials Horizons, Journal of Materials Chemistry A, Journal of Materials Chemistry B, Journal of Materials Chemistry C and Materials Advances

Nanoscale Horizons, Nanoscale and Nanoscale Advances

We hope you will accept our invitation to contribute to this collection. If you are interested, please contact us at journals@rsc.org and let us know which journal you would like to contribute to. If you have any questions, we would be delighted to send you more information. If you are unsure which journal would be the most suitable for your work or would like to check a topic’s suitability for a journal, we would be happy to help.

Publishing open access with RSC journals unlocks the full potential of your research – bringing increased visibility, wider readership and higher citation potential to your work. As a not-for-profit organisation serving the chemical sciences community we ensure that our article processing charge (APC) remains the most competitive of major publishers. More details can be found here. You can also use our journal finder tool to check if your institution currently has an agreement with the RSC that may entitle you to a discount or fully cover the APC.

Articles will be added to the collection as soon as they are accepted, and promotion of the collection is scheduled for the end of 2025. Please mention the collection name “Quantum Science and Technology” when you submit your manuscript. Please note that all submissions will undergo peer review in the usual manner and must comply with each journal’s usual scope and standards.


Call for papers – Quantum Computing in Chemistry, Material Science and Biotechnology

06 Mar 2025

A slide promoting this open call with photographs of the Guest Editors.

Digital Discovery is delighted to welcome papers for its latest themed collection on Quantum Computing in Chemistry, Material Science and Biotechnology, led by Dr Matthias Degroote (Boehringer Ingelheim Quantum Lab), Prof. Joonho Lee (Harvard University) and Dr Pauline Ollitrault (QC Ware Corp.). If you do not work directly in this field, please feel free to forward this call to any colleagues who might be interested in contributing to this themed collection.

Contributions are welcome in both theory for and applications of quantum computers in chemistry, material science and biotechnology. We would especially like to encourage manuscripts that expand the current area of applicability of quantum computers and introduce innovative ways to discover, characterize and produce new molecules. We will consider near-term and fault-tolerant algorithms as well as improvements over current algorithms and entirely new workflows.

We encourage submissions on topics including, but by no means limited to:

Synergies between classical and quantum computers that leverage the strengths of both.
Use of machine learning and data to bring down the cost of quantum computation.
Tailored algorithms for specific subsets of chemical systems or types of interaction.
Prediction of chemical properties with data that can efficiently be extracted from a quantum computer.

The deadline for submissions is 11 August 2025.

If you would like to contribute to this collection, please let us know by email at digitaldiscovery-rsc@rsc.org, and we will set up a submission link for you to contribute your article.

Promotion of the collection is scheduled for late 2025, with articles published online as soon as they’re accepted. Authors are welcome to submit original research in the form of a Communication or Full Paper. Authors who would like to contribute a Review article should contact the Editorial Office with their proposal. The Editorial Office reserves the right to check suitability of submissions for both the journal and the scope of the collection, and inclusion of accepted articles in the final themed collection is not guaranteed.

You can find out more detailed information about our journal scope and our valued editorial board members on our website. If you have any questions about the journal or the collection, please contact us at the above address.


Introducing “Commit”, a mini article for dynamic reporting of incremental improvements to previous scholarly work

23 Jan 2025

Digital Discovery is pleased to introduce a new article type, “Commit”, a mini article for dynamic reporting of incremental improvements to previous scholarly work. This new type of article allows the community to share changes to work published in Digital Discovery articles, whether this is one’s own work or another’s. We see Commits as citable articles describing the changes made to a project, which could be a full manuscript, or an open hardware or software project published in the journal.

Some examples of Commits could include:

Hardware articles: a device which has the same motivation and use but has an improvement in capabilities or construction.
Software articles: addition of features or improvement of capabilities.
Data: incorporating additional data while keeping the underlying schema the same (for example, new data which has been added since the last article).

Commits are expected to be shorter than a full article, although there is no rigid page limit. We expect that most of the improvements will be present in associated code/data repositories or supporting information associated with the work.

To find out more about preparing, submitting, and citing Commit articles, read our Editorial at DOI: 10.1039/D4DD90053G. We welcome queries or comments by email to the journal’s Editorial Office at digitaldiscovery-rsc@rsc.org.


Large language model expert? Review papers for Digital Discovery

11 Dec 2024

A banner inviting readers to become reviewers for Digital Discovery

With the increasing application of large language models (LLMs) in automation and data analysis, Digital Discovery is looking for experts in LLMs to act as peer reviewers. If you would like to take part, please follow the instructions below. Reviewers who have registered their interest will be entered into a prize draw to win an exclusive Digital Discovery mug in March of 2025!

If you have authored or reviewed for us previously, you can log in to your account at https://mc.manuscriptcentral.com/dd and update the “Research Interests” section of your profile to mention “LLMs”, and/or “large language models”. If you don’t currently have an account you can sign up at https://rsc.li/become-a-reviewer, and then complete your Research Interests once the process is complete.

If LLMs are not one of your areas of expertise, but you would be interested in reviewing other papers for Digital Discovery, please let us know, and update your research interests and keywords as mentioned above. We are also interested in recruiting reviewers to assess authors’ datasets and codes – please see this link for more information.

If you have a colleague who is an expert in LLMs, or who would be interested in reviewing for Digital Discovery in general, please feel free to pass this information to them!


Digital Discovery Webinar: Artificial Intelligence and Data in Drug Discovery and Development

01 Oct 2024

Digital Discovery invites you to this webinar on opportunities, challenges and techniques in the use of AI and data in drug discovery and development.

Featuring Maximilian Jakobs (DeepMirror), Andreas Bender (University of Cambridge) and Nessa Carson (AstraZeneca), this 90-minute seminar will explore key ideas and case studies, challenges in achieving tangible process improvements, and approaches to interfacing AI, data and robotic systems with pharmaceutical R&D.

Register now!

Program

1400 GMT – Welcome
1405 GMT – Introduction to Digital Discovery, Anna Rulka (Executive Editor, Digital Discovery)
1410 GMT – What is AI, and Why Does It Matter?, Maximilian Jakobs (DeepMirror)
1435 GMT – Aspects of Life Science Data and Translation, Andreas Bender (University of Cambridge)
1500 GMT – AI and data in the process development space, Nessa Carson (AstraZeneca)
1525 GMT – Final questions and close

This webinar is free to attend wherever you are, and can be watched either live or on-demand at a time that’s convenient to you. We hope you can join us!


Guest post: The evolving roles of data and citations in journal articles

17 Sep 2024

The evolving roles of data and citations in journal articles

Henry S. Rzepaᵃ

ᵃ Emeritus Professor of Computational Chemistry, Department of Chemistry, Imperial College London.

Background

The last thirty years have seen enormous changes in the so-called scientific journal model, first introduced some 350 years ago as a paper-based medium. The typical journal article in, say, the chemical sciences has evolved during this period to contain a traditional narrative structure: an introduction or background to the topic, the presentation of results and data, conclusions drawn from the data, experimental procedures to enable replication and a bibliographic section where relationships to other work can be cited. Such a serial narrative format has itself come under scrutiny, for example in a recent publishing experiment involving its dissection into eight smaller units of publication, potentially with their own structures and authorship, each of which could stand on its own merits but which can also be assembled to reconstitute an overarching synoptic journal article.¹ The electronic journal era of the last 30 years has also brought with it experiments in how the various constituents of the traditional journal article might be digitally exploited. An example² dating from the start of the e-journal period showed how selected articles in the journal Chemical Communications could be enhanced with “pop-up” interactive molecular models based on 3D coordinate data provided by the authors, thus augmenting the static views provided by conventional figures.

In the present commentary, the focus will be on two other ways of digitally exploiting the medium of the journal, both driven by the extraordinary recent attention given to artificial intelligence or machine learning and questions such as whether the current publishing models need to be prepared for this new era. These are, firstly, how the availability, discovery and properties of data associated with journal articles are being improved and, secondly, citation enhancement. Both are facets of the publication process and turn out to be closely inter-related.

Journals and Data

For much of the history of publishing in fields such as chemistry, the data behind a research article has been integrated into the article in the form of tables of numerical results and/or figures derived from these data, along with graphical schemes illustrating other aspects such as molecular structures and associated reactions and mechanisms. Isolated numerical data could often be simply integrated into the text-based narrative. This became impractical when the tables of numerical data swelled in size, an example being crystallographic information from the 1950s onwards. Procedures for printing this information and then depositing the print copy in a national library or other central resource were introduced, and this became more common for a short period during the 1970s.³ In order to re-use such data, an interested reader would have to re-type the numerical information in order to absorb it into, say, a computer for analysis, and then spend a fair bit of time trying to ensure no errors had been introduced by this process. From the mid 1990s, this paper-based form thankfully started being replaced by “electronic printing” into the PDF format, when it became known as ESI or electronic supporting information, a mechanism that still dominates to this day. Over the last decade however, it has been increasingly recognised⁴ that ESI is not an optimal medium for use in areas such as artificial intelligence and machine learning (abbreviated AI/ML here), for which specifically structured and semantically rich information is essential or at least greatly helpful.⁵

Journals and Citations

It is appropriate at this point to interleave citations into the discussion. These have their own fascinating history! In the 19th and early 20th centuries, citations in an article were often sparse and cryptic, with journal references heavily abbreviated, possibly to save type-setting effort. I cannot resist citing⁶ this article by Niels Bohr dating from 1922 as an extreme example. Probably one of the most influential articles of that century, leading to a Nobel prize no less, it contains no citations either as footnotes or endnotes; instead, individuals contributing to the area are acknowledged throughout the text. Nonetheless, by the second half of the 20th century, most research articles had fully separated citations into a discrete list at the end of the article. Arguably, these lists were often mis-used by inclusion of text-based footnotes extending the discussion of the main body of the article. Individual numbered citations could themselves contain sub-lists of journal references associated by an inferred common theme and of hoped-for relevance to the discussion. Such lists started suffering from the same issues as ESI, in other words an apparent lack of the formal structures and declared semantics so helpful for AI/ML. These will be referred to as unstructured citations for reasons that will shortly become apparent.

Journals and Metadata

It is time to introduce the unifying concept of metadata, this being structured and controlled descriptions of a body of data or of a narrative and including simple components such as authorship, article titles, abstracts, affiliations and provenance and publication dates. These formal structures now allow metadata to be more easily processed and analysed using AI/ML methods and provide infrastructures for obtaining for example metrics relating to research impacts. Whereas the commercial models that many publishers used in the past in the era before open-access would result in access to the digital journal article itself being paywall-protected in some manner, the metadata associated with that article was not so protected and was made readily available for use by anyone. In 2000, the Crossref organisation⁷ was set up by a consortium of publishers, libraries, research institutions and funders to accept, store, curate and disseminate this metadata, and Crossref issued what is known as a persistent identifier (the DOI is a specific example of such a PID) to identify the metadata records.

Initially, Crossref metadata did not include the citations from an article, but from 2004⁸ these were added as a discrete component in the form of structured citations. Initial uptake by publishers was slow, but nowadays it is almost universal.⁹ These structured citations of books and journal articles included conventional information such as the author and journal name and the volume and page numbers, but in time these evolved to also include the article DOI, which allows facile and programmatic access to the metadata record for each citation. At this stage a record is introduced for one specific article¹⁰ and its access point in the form suitable for AI/ML applications:

https://api.crossref.org/works/10.1039/D3DD00246B/transform/application/vnd.crossref.unixsd+xml
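As a minimal sketch of how such an access point might be assembled programmatically, the Python below builds the same URL from a DOI. The function names are illustrative assumptions rather than part of any Crossref client library, and the network call is kept in a separate helper that is not invoked here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Endpoint and media type are taken from the URL quoted in the text above.
CROSSREF_WORKS = "https://api.crossref.org/works"
UNIXSD_XML = "application/vnd.crossref.unixsd+xml"


def crossref_unixsd_url(doi: str) -> str:
    """Return the metadata access point for one article DOI."""
    # Keep the "/" inside the DOI; percent-encode anything unsafe.
    return f"{CROSSREF_WORKS}/{quote(doi, safe='/')}/transform/{UNIXSD_XML}"


def fetch_unixsd(doi: str) -> str:
    """Download the XML metadata record (hypothetical helper, needs network)."""
    with urlopen(crossref_unixsd_url(doi)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(crossref_unixsd_url("10.1039/D3DD00246B"))
```

Calling `crossref_unixsd_url("10.1039/D3DD00246B")` reproduces the address given above; the same function could be applied to the DOI found in any structured citation.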

An example of a structured citation from this record (as of mid 2024) is shown below:

<citation key="…">
<journal_title>J. Chem. Phys.</journal_title>
<author>Scalmani</author>
<first_page>114110</first_page>
</citation>

If you explore the metadata further, you will soon encounter a slightly different form, which is designated an unstructured citation, arising by virtue of inclusion of a component containing free-text comments. This is how all those citation footnotes, comments and other annotations so beloved by some authors are currently included. In this example, the article DOI itself is also noted, thus rendering the unstructured component somewhat redundant, but this is not always the case!

<citation key="…">
<volume_title>ChemRxiv</volume_title>
<author>Braddock</author>
<doi>10.26434/chemrxiv-2023-vcmcl</doi>
<unstructured_citation>For a preprint, see, D. C.Braddock, S.Lee and H. S.Rzepa, SWERN Oxidation. transition structure Theory is OK, ChemRxiv, 2023, preprint, 10.26434/chemrxiv-2023-vcmcl</unstructured_citation>
</citation>
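The distinction between the two forms can be detected mechanically. The sketch below, using only the Python standard library, walks over citation elements and reports whether each carries a DOI and whether it includes a free-text component. The sample XML is a hand-written fragment modelled on the examples above: the wrapping `citation_list` element, the `key` values and the shortened free text are assumptions for illustration, and a real record also declares XML namespaces, which the helper strips.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment modelled on the examples in the text.
# The key values are hypothetical placeholders.
SAMPLE_XML = """
<citation_list>
  <citation key="example-1">
    <journal_title>J. Chem. Phys.</journal_title>
    <author>Scalmani</author>
    <first_page>114110</first_page>
  </citation>
  <citation key="example-2">
    <volume_title>ChemRxiv</volume_title>
    <author>Braddock</author>
    <doi>10.26434/chemrxiv-2023-vcmcl</doi>
    <unstructured_citation>For a preprint, see ...</unstructured_citation>
  </citation>
</citation_list>
"""


def local_name(tag: str) -> str:
    """Drop any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def summarise_citations(xml_text: str) -> list:
    """Return one small dictionary per <citation> element found."""
    root = ET.fromstring(xml_text)
    summaries = []
    for element in root.iter():
        if local_name(element.tag) != "citation":
            continue
        fields = {local_name(c.tag): (c.text or "").strip() for c in element}
        summaries.append({
            "author": fields.get("author"),
            "doi": fields.get("doi"),
            "has_free_text": "unstructured_citation" in fields,
        })
    return summaries


if __name__ == "__main__":
    for row in summarise_citations(SAMPLE_XML):
        print(row)
```

A citation with a free-text component but no DOI is the case where the unstructured text is the only route to the cited object, so flagging those entries is a reasonable first step in any analysis.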

A third variation in the citation format can also be identified.

<citation key="…">
<volume_title>Imperial College Research Data Repository</volume_title>
<author>Braddock</author>
<unstructured_citation>C.Braddock, H. S.Rzepa and S.Lee, Imperial College Research Data Repository, 2023, 10.14469/hpc/13108</unstructured_citation>
</citation>

Here one might infer from the volume title that this is now about data. This is a suitable entry point for the discussion here to rejoin the theme introduced above regarding data and ESI. However, instead of referring to data inside an ancillary PDF file associated with the article, a data DOI is now cited instead. As implied above for article DOIs, this form also has an associated metadata record, being stored, curated and disseminated by DataCite,¹¹ an organisation set up some ten years after Crossref but acting in parallel to allow the citation of data. Unlike data contained in relatively unstructured, or parochially structured, ESI documents, this form of data has associated formal descriptors in the metadata record describing the properties of the data. DataCite also allows access to this record, albeit using a slightly different form to that used by Crossref:

https://data.datacite.org/application/vnd.datacite.datacite+xml/10.14469/hpc/13108

The properties as described by such a metadata record constitute information about how Findable, Accessible, Interoperable and Re-usable the data is – properties that became known by the acronym FAIR¹² around 2016 and are important for the application of AI/ML. Note however that again in the citation example shown above, an unstructured component is also included containing the free-text assertion that the data is held in an institutional research data repository. Formally therefore, data is only implied by this form of citation, but at least the metadata record associated with the provided DOI can be used to confirm this. At this stage it is worth noting that around half of all the citations associated with this specific article¹⁰ are of this type, an unusually high proportion. When an assertion is made in the narrative of this article, it can now be supported with a data citation as appropriate. Such multiple and in-context data citation can be contrasted with the conventional data availability statement nowadays found in most journal articles, introduced around 2017 and which often simply points to the single and largely context-free supporting information document listed on the article landing page.
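Confirming from the metadata record that a cited DOI really refers to data can be sketched as follows. This assumes the JSON form of a record as returned by the DataCite REST API at https://api.datacite.org/dois/, where the general resource type sits under `data.attributes.types.resourceTypeGeneral`. The sample record and its DOI are hand-written placeholders, not a real response, and the download helper is not called here.

```python
import json
from typing import Optional
from urllib.request import urlopen

DATACITE_DOIS = "https://api.datacite.org/dois"

# Hand-written stand-in for an API response; the DOI is hypothetical.
SAMPLE_RECORD = {
    "data": {
        "id": "10.1234/example",
        "attributes": {
            "doi": "10.1234/example",
            "types": {"resourceTypeGeneral": "Dataset"},
        },
    }
}


def resource_type(record: dict) -> Optional[str]:
    """Return the general resource type declared in a DataCite record."""
    attributes = record.get("data", {}).get("attributes", {})
    types = attributes.get("types") or {}
    return types.get("resourceTypeGeneral")


def is_dataset(record: dict) -> bool:
    """True when the record explicitly declares itself to be a dataset."""
    return resource_type(record) == "Dataset"


def fetch_record(doi: str) -> dict:
    """Download the JSON record for a DOI (hypothetical helper, needs network)."""
    with urlopen(f"{DATACITE_DOIS}/{doi}") as response:
        return json.load(response)


if __name__ == "__main__":
    print(resource_type(SAMPLE_RECORD), is_dataset(SAMPLE_RECORD))
```

Records in a repository may declare other types, such as a collection of datasets, so a real check would probably accept more than the single value tested here.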

Very shortly the expectation is¹³ that Crossref will modify the unstructured aspect of data citation by a small extension to their schema in the form shown below, hence adding the ability to formalise the citation of data in an article.

<volume_title>Imperial College Research Data Repository</volume_title>

<author>Braddock</author>

</citation>

Formalisation is also proposed by Crossref of the data availability statement alluded to above. In most current articles in this and other journals it appears in the generic form of a Data availability section, where the authors can list how their data can be obtained in the form of e.g. URLs or DOIs. However, this information does NOT currently appear in the Crossref metadata record unless the authors have also included it as an unstructured citation. The proposal is to add it to the metadata record in the form of

<statement type="data availability">Data Availability Statement … … </statement>

The content of this statement is still unstructured free-text, but at least it is available for parsing and analysis in ways that might be useful.
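One simple kind of parsing that free text permits is the extraction of any DOIs it mentions. The sketch below uses a deliberately loose regular expression; both the pattern and the sample statement are assumptions for illustration, not a complete DOI grammar or a real data availability statement.

```python
import re

# Loose pattern: "10.", a registrant code of 4 to 9 digits, "/", then a
# suffix running up to whitespace, a quote or an angle bracket.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text: str) -> list:
    """Return DOI-like strings found in free text, in order of appearance."""
    # Trailing sentence punctuation is not part of the identifier.
    return [match.rstrip(".,;)") for match in DOI_PATTERN.findall(text)]


if __name__ == "__main__":
    # Hypothetical statement, written only to exercise the function.
    statement = "Data for this article are available at DOI: 10.14469/hpc/13108."
    print(extract_dois(statement))
```

Each identifier recovered this way can then be resolved to its own metadata record, which is where the structured description of the data lives.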

At this stage, the assertion above that the two facets of data and citations are in fact closely associated can be summarised as:

Key information about a journal article is now made freely available via its metadata record, a structured and semantically rich format that allows AI/ML processing.
The relationships the article has with other articles are now also present in the form of structured citations in the Crossref metadata record.
Such structured citations should include persistent identifiers such as DOIs with an indication of the type of the citation, such as to a dataset.
The inclusion of persistent identifiers in turn allows AI/ML access to metadata records describing data referred to in the article.

Primary vs processed data

This section contains discussion of two forms of expressions of data in an article, firstly the conventional Tables/Figures/Schemes as contained in the body of the article and secondly the presence of citations allowing specific access to more complete or at least less lossy primary data. The broad distinction is here made that the former representations might constitute processed and interpreted data, whereas ideally the latter types would constitute the more complete data from which the former are derived, such as that obtained from an instrument or output by a computational procedure. Specific examples illustrate the difference between the two.

A form of processed data could be an NMR or frequency domain spectrum presented in association with a chemical structure representation. The combination of the two can be used to confirm the identity of e.g. the product of a chemical synthesis.
The corresponding primary or raw form would be the time-domain data as produced directly from an NMR instrument, to be converted by e.g. a Fourier Transform operation to a frequency domain presentation that is more readily analysed. The process of converting the primary data to the processed form is of course lossy; some information at least is lost by this conversion.
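The conversion from time-domain to frequency-domain data can be illustrated with a toy calculation. The sketch below generates a synthetic decaying signal standing in for instrument output, applies a plain discrete Fourier transform written with the Python standard library, and locates the peak. All parameter values are arbitrary assumptions, and real NMR processing involves further steps (apodisation, phasing, referencing) that are not shown.

```python
import cmath
import math


def synthetic_fid(freq_hz: float, decay_s: float, n_points: int, dt: float) -> list:
    """Complex time-domain signal: one oscillation with exponential decay."""
    return [
        cmath.exp(2j * math.pi * freq_hz * k * dt) * math.exp(-k * dt / decay_s)
        for k in range(n_points)
    ]


def dft(signal: list) -> list:
    """Direct O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def peak_frequency(signal: list, dt: float) -> float:
    """Frequency in Hz of the largest peak in the magnitude spectrum."""
    # Taking abs() discards the phase of each point, so the time-domain
    # signal cannot be rebuilt from the magnitude spectrum alone.
    magnitudes = [abs(z) for z in dft(signal)]
    j = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return j / (len(signal) * dt)


if __name__ == "__main__":
    fid = synthetic_fid(freq_hz=50.0, decay_s=0.2, n_points=256, dt=1 / 256)
    print(peak_frequency(fid, dt=1 / 256))
```

The magnitude spectrum is easier to read than the raw signal, but anyone wishing to reprocess the measurement differently needs the original time-domain points.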

A second example derives from computational modelling.

A form of processed data could be a two-dimensional representation or figure corresponding to the highest occupied molecular orbital or the HOMO of a molecule of interest.
The corresponding primary data would be a file containing the full wavefunction calculated for the molecule using a specific model for solution of the Schrödinger equation and presented as loss-free data in the form of a formatted checkpoint or rawbinaryarray file¹⁴ resulting from e.g. a Gaussian calculation. These forms would allow not only an alternative three-dimensional representation of the HOMO to be generated, but indeed that of any other desired orbital or other property computable from the wavefunction.

The final example is found in the article cited above¹⁰ and relates to the calculation of kinetic isotope effects.

The processed data derives from application of the Bigeleisen model to kinetic isotope effects for deuterium substitution at a specified temperature and for specified atoms, using computer code specified again by a suitable DOI-based citation. It can be presented as numerical values in a table.
The primary data derives from the final calculation checkpoint files, which as well as containing the wavefunction also contain the second derivative force constant matrix, allowing other isotopic substitutions to be made at any location in the molecule and which can be evaluated at any required temperature.

The purpose of including these examples of forms of data is to show that both can be useful! Processed data, in the form of visualisable figures and tables, are particularly helpful for the type of perception of complex concepts that humans traditionally excel at. Primary data are useful for access to alternative forms of visualisation, for re-use in a context different to that presented in the body of the article or by application of alternative models to those presented by the original authors, such as might be derived by AI/ML methods. The journal experiment noted above² combined these by accessing the primary data (molecular coordinates) and converting this on the fly to a pop-up visual representation for humans (an interactive 3D model). Even at the simplest level, access to primary data might allow replication of the results quoted in the original article. In the article cited¹⁰ such replication was not always possible because of lack of such primary data associated with the original report.¹⁵

Data Discovery

The examples above illustrate how the various components of a scientific article can be prepared for AI/ML analysis by adding predictable structures to both the citations and the data implicit in the article. There is another important benefit of data citation which is next illustrated, that of data discovery. Finding something in a conventional ESI document is largely limited to searching the free text for appropriate string patterns. The scope of such a pattern search does not extend beyond that document. However, metadata records associated with a dataset are automatically aggregated by the metadata registration agency, being either Crossref or DataCite. Both offer rich structured and federated searches of the metadata across all registered entries, not just of a single ESI document. To illustrate this aspect, the data availability statement in the article discussed above¹⁰ has been modified to include both data availability and discovery. An extended version of the example cited there is shown below:¹⁶

https://commons.datacite.org/?query=(media.media_type:chemical/x-gaussian-log+OR+media.media_type:chemical/x-gaussian-checkpoint)+AND+media.media_type:text/plain+AND+(titles.title:*Endo*+OR+descriptions.description:*Endo*+OR+titles.title:*Exo*+OR+descriptions.description:*Exo*)

If this syntax looks rather long and unwieldy, it is because it follows the conventions of an API (application programming interface) such as used by AI/ML applications (the specific API form of the above is https://api.datacite.org/dois/?query= ). It reveals all datasets derived from using the Gaussian quantum chemical application, as restricted by the presence of an additional file containing further information (here the kinetic isotope effects) and by specified title or description keywords, the search being within the global corpus of registered metadata. This extends the scope of the discovery well beyond that of a single ESI document. A way of constraining the search to a particular specified property, namely kinetic isotope effects, would require future community agreement¹⁸ on the vocabulary term and/or scheme to be used for that property. Here a possible such term is invoked by appending +AND+subjects.subjectScheme:*KIE*+AND+subjects.subject:1H/2H to the above search, which constrains the property to KIE and its value to 1H/2H (a hydrogen-deuterium isotope effect).¹⁷ The searches themselves can even be assigned¹⁶,¹⁷ a persistent identifier to facilitate discovery by e.g. AI/ML software. The community is here challenged to enable enrichment of the descriptive and relational publication metadata by agreeing wider vocabularies or search terms, thus enabling data discovery to be made ever more specific and accurate.¹⁸
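Because the query is just a composition of field:value terms joined by +AND+ and +OR+, it can be generated rather than typed. The sketch below rebuilds the search shown above from its ingredients; the helper names are illustrative assumptions, while the field names, media types and endpoints are those quoted in the text.

```python
COMMONS_SEARCH = "https://commons.datacite.org/?query="
API_SEARCH = "https://api.datacite.org/dois/?query="


def any_of(terms) -> str:
    """Join alternatives with +OR+ and wrap them in parentheses."""
    return "(" + "+OR+".join(terms) + ")"


def all_of(terms) -> str:
    """Join required conditions with +AND+."""
    return "+AND+".join(terms)


def build_query(media_any: list, media_required: list, keywords: list) -> str:
    """Compose a search over media types and title/description keywords."""
    media = any_of(f"media.media_type:{m}" for m in media_any)
    required = [f"media.media_type:{m}" for m in media_required]
    text = any_of(
        f"{field}:*{keyword}*"
        for keyword in keywords
        for field in ("titles.title", "descriptions.description")
    )
    return all_of([media, *required, text])


if __name__ == "__main__":
    query = build_query(
        media_any=["chemical/x-gaussian-log", "chemical/x-gaussian-checkpoint"],
        media_required=["text/plain"],
        keywords=["Endo", "Exo"],
    )
    print(COMMONS_SEARCH + query)
    # The property constraint discussed in the text is simply appended.
    print(API_SEARCH + query + "+AND+subjects.subjectScheme:*KIE*+AND+subjects.subject:1H/2H")
```

Generating the string in this way makes it straightforward to vary the keywords or media types, which is how software would be expected to use the interface.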

The future

The examples used to illustrate the concepts described above show how a journal article¹⁰ can be very usefully adapted to make it more AI/ML-friendly, with relatively little extra effort required of its authors. Many more innovations associated with both data and citations can be anticipated, and the 350+ year evolution of scientific publishing will continue apace!

Note added after publication

Sara El-Gebali from DataCite published a blog post on 20th August 2024 entitled “Connecting the Dots with DataCite DOI Metadata”, which usefully expands upon the discussion in this commentary and gives a wider range of metadata types that can be used for discovery. See DOI: 10.5438/k81t-zq43

A citable version of this blog post is available on ChemRxiv, at DOI: 10.26434/chemrxiv-2024-dz2dv

References:

¹ The Octopus publishing project, https://www.octopus.ac/about

² D. James, B. J. Whitaker, C. Hildyard, H. S. Rzepa, O. Casher, J. M. Goodman, D. Riddick and P. Murray-Rust, The case for content integrity in electronic chemistry journals: The CLIC project, New Review of Information Networking, 1995, 1, 61–69, DOI: 10.1080/13614579509516846

³ H. S. Rzepa, The Long and Winding Road towards FAIR Data as an Integral Component of the Computational Modelling and Dissemination of Chemistry, Isr. J. Chem., 2022, 62, e202100034, DOI: 10.1002/ijch.202100034

⁴ J. Downing, P. Murray-Rust, A. P. Tonge, P. Morgan, H. S. Rzepa, F. Cotterill, N. Day and M. J. Harvey, SPECTRa: The Deposition and Validation of Primary Chemistry Research Data in Digital Repositories, J. Chem. Inf. Model., 2008, 48, 1571–1581, DOI: 10.1021/ci7004737

⁵ P. Murray-Rust and H. S. Rzepa, Chemical Markup Language and XML Part I. Basic principles, J. Chem. Inf. Comput. Sci., 1999, 39, 928, DOI: 10.1021/ci990052b

⁶ N. Bohr, Der Bau der Atome und die physikalischen und chemischen Eigenschaften der Elemente, Zeitschrift für Physik, 1922, 9, 1–67, DOI: 10.1007/BF01326955

⁷ The Formation of Crossref: A Short History, https://www.crossref.org/pdfs/CrossRef10Years.pdf

⁸ See Crossref Schema 2.0.5, 2004, https://web.archive.org/web/20040202113642/http://www.crossref.org/02publishers/forward_linking_howto.html

⁹ D. Shotton, Publishing: Open citations, Nature, 2013, 502, 295–297, DOI: 10.1038/502295a

¹⁰ D. C. Braddock, S. Lee and H. S. Rzepa, Modelling kinetic isotope effects for Swern oxidation using DFT-based transition state theory, Digital Discovery, 2024, 3, 1496–1508, DOI: 10.1039/D3DD00246B

¹¹ J. Neumann and J. Brase, DataCite and DOI names for research data, J. Comput.-Aided Mol. Des., 2014, 28, 1035–1041, DOI: 10.1007/s10822-014-9776-5

¹² M. Wilkinson, M. Dumontier, I. Aalbersberg, et al., The FAIR Guiding Principles for scientific data management and stewardship, Sci. Data, 2016, 3, 160018, DOI: 10.1038/sdata.2016.18

¹³ Crossref Metadata updates (for public comment) July 2024, https://docs.google.com/document/d/1VPXhTPMZzfvAPmTOlNp-bZf9cTLkw0dPZFTuDtDIPls/

¹⁴ H. S. Rzepa, Quantum chemistry interoperability (library): another step towards FAIR data, 2022, https://www.ch.imperial.ac.uk/rzepa/blog/?p=24543, DOI: 10.59350/mzs83-g6218

¹⁵ T. Giagou and M. P. Meyer, Mechanism of the Swern Oxidation: Significant Deviations from Transition State Theory, J. Org. Chem., 2010, 75, 8088–8099, DOI: 10.1021/jo101636w

¹⁶ H. S. Rzepa, Example of a discovery search procedure, 2024, DOI: 10.14469/hpc/14510

¹⁷ H. S. Rzepa, Example of a discovery search procedure using a subject-constrained search, 2024, DOI: 10.14469/hpc/14517

¹⁸ This is currently being done for e.g. NMR Spectroscopy; R. M. Hanson, D. Jeannerat, M. Archibald, I. Bruno, S. Chalk, A. N. Davies, R. J. Lancashire, J. Lang and H. S. Rzepa, IUPAC specification for the FAIR management of spectroscopic data in chemistry (IUPAC FAIRSpec) – guiding principles, Pure and Applied Chemistry, 2022, 94, 623–636, DOI: 10.1515/pac-2021-2009


New themed collection in collaboration with Accelerate Conference 2022

20 Aug 2024

Portraits of the three Guest Editors

We’re pleased to announce that a new themed collection from Digital Discovery has now been published online.

Read the collection

This new themed collection represents a collaboration between the editors of Digital Discovery and the Acceleration Consortium, organisers of the Accelerate Conference. The goal of the conference was to explore the power of self-driving labs (SDLs), which combine AI, automation, and advanced computing to accelerate materials and molecular discovery.

This themed collection, Guest Edited by Prof. Keith A. Brown (Boston University, USA), Prof. Fedwa El Mellouhi (Hamad Bin Khalifa University, Qatar), and Prof. Claudiane Ouellet-Plamondon (École de technologie supérieure, Canada), features contributions that cover various aspects of this process, whether specifically presented at the conference or not.

Examples include the realization of new SDLs; fundamental studies of the operation of SDLs; and sustainable, resilient, low-carbon materials and chemical discoveries made using SDLs.

A list of the articles has been provided below. All articles in Digital Discovery are open access and free to read.

We hope you enjoy this new themed collection from Digital Discovery.

A new collection to feature contributors to Accelerate Conference 2023 and Accelerate Conference 2024 is currently in preparation – watch this space for more information!

Editorial

Introduction to “Accelerate Conference 2022”
Keith A. Brown, Fedwa El Mellouhi and Claudiane Ouellet-Plamondon
Digital Discovery, 2024, 3, DOI: 10.1039/D4DD90036G

Perspectives

The laboratory of Babel: highlighting community needs for integrated materials data management
Brenden G. Pelkie and Lilo D. Pozzo
Digital Discovery, 2023, 2, 544–556, DOI: 10.1039/D3DD00022B

What is missing in autonomous discovery: open challenges for the community
Phillip M. Maffettone, Pascal Friederich, Sterling G. Baird, Ben Blaiszik, Keith A. Brown, Stuart I. Campbell, Orion A. Cohen, Rebecca L. Davis, Ian T. Foster, Navid Haghmoradi, Mark Hereld, Howie Joress, Nicole Jung, Ha-Kyung Kwon, Gabriella Pizzuto, Jacob Rintamaki, Casper Steinmann, Luca Torresi and Shijing Sun
Digital Discovery, 2023, 2, 1644–1659, DOI: 10.1039/D3DD00143A

Autonomous cementitious materials formulation platform for critical infrastructure repair
Howie Joress, Rachel Cook, Austin McDannald, Mark Kozdras, Jason Hattrick-Simpers, Aron Newman and Scott Jones
Digital Discovery, 2024, 3, 231–237, DOI: 10.1039/D3DD00211J

Papers

A fully automated platform for photoinitiated RAFT polymerization
Jules Lee, Prajakatta Mulay, Matthew J. Tamasi, Jonathan Yeow, Molly M. Stevens and Adam J. Gormley
Digital Discovery, 2023, 2, 219–233, DOI: 10.1039/D2DD00100D

A high-throughput workflow for the synthesis of CdSe nanocrystals using a sonochemical materials acceleration platform
Maria Politi, Fabio Baum, Kiran Vaddi, Edwin Antonio, Joshua Vasquez, Brittany P. Bishop, Nadya Peek, Vincent C. Holmberg and Lilo D. Pozzo
Digital Discovery, 2023, 2, 1042–1057, DOI: 10.1039/D3DD00033H

Neural networks trained on synthetically generated crystals can extract structural information from ICSD powder X-ray diffractograms
Henrik Schopmans, Patrick Reiser and Pascal Friederich
Digital Discovery, 2023, 2, 1414–1424, DOI: 10.1039/D3DD00071K

Driving school for self-driving labs
Kelsey L. Snapp and Keith A. Brown
Digital Discovery, 2023, 2, 1620–1629, DOI: 10.1039/D3DD00150D

Robotically automated 3D printing and testing of thermoplastic material specimens
Miguel Hernández-del-Valle, Christina Schenk, Lucía Echevarría-Pastrana, Burcu Ozdemir, Enrique Dios-Lázaro, Jorge Ilarraza-Zuazo, De-Yi Wang and Maciej Haranczyk
Digital Discovery, 2023, 2, 1969–1979, DOI: 10.1039/D3DD00141E

Towards a modular architecture for science factories
Rafael Vescovi, Tobias Ginsburg, Kyle Hippe, Doga Ozgulbas, Casey Stone, Abraham Stroka, Rory Butler, Ben Blaiszik, Tom Brettin, Kyle Chard, Mark Hereld, Arvind Ramanathan, Rick Stevens, Aikaterini Vriza, Jie Xu, Qingteng Zhang and Ian Foster
Digital Discovery, 2023, 2, 1980–1998, DOI: 10.1039/D3DD00142C

A human-in-the-loop approach for visual clustering of overlapping materials science data
Satyanarayana Bonakala, Michael Aupetit, Halima Bensmail and Fedwa El-Mellouhi
Digital Discovery, 2024, 3, 502–513, DOI: 10.1039/D3DD00179B


New themed collection with the NeurIPS AI4Mat 2023 workshop

04 Jun 2024

We’re pleased to announce that a new themed collection from Digital Discovery has now been published online.

Read the collection

The AI for Accelerated Materials Design (AI4Mat) workshop at NeurIPS 2023 featured many of the ongoing major research themes in materials design, synthesis, and characterization by bringing together an international interdisciplinary community of researchers and enthusiasts. The AI4Mat 2023 organizing committee and the editors of Digital Discovery have curated a selection of research papers drawn from some of the most exciting and high-quality paper submissions from the workshop. We are pleased to share these papers, and a perspective on the workshop as a whole, in this themed collection.

You can find the line-up of the collection below. All articles in Digital Discovery are open access and free to read.

Editorial

Perspective on AI for Accelerated Materials Design at the AI4Mat-2023 Workshop at NeurIPS 2023
Santiago Miret, N. M. Anoop Krishnan, Benjamin Sanchez-Lengeling, Marta Skreta, Vineeth Venugopal and Jennifer N. Wei
Digital Discovery, 2024, 3, DOI: 10.1039/D4DD90010C

Communications

Discovery of novel reticular materials for carbon dioxide capture using GFlowNets
Flaviu Cipcigan, Jonathan Booth, Rodrigo Neumann Barros Ferreira, Carine Ribeiro dos Santos and Mathias Steiner
Digital Discovery, 2024, 3, 449–455, DOI: 10.1039/D4DD00020J

A message passing neural network for predicting dipole moment dependent core electron excitation spectra
Kiyou Shibata and Teruyasu Mizoguchi
Digital Discovery, 2024, 3, 649–653, DOI: 10.1039/D4DD00021H

Papers

Connectivity optimized nested line graph networks for crystal structures
Robin Ruff, Patrick Reiser, Jan Stühmer and Pascal Friederich
Digital Discovery, 2024, 3, 594–601, DOI: 10.1039/D4DD00018H

Learning conditional policies for crystal design using offline reinforcement learning
Prashant Govindarajan, Santiago Miret, Jarrid Rector-Brooks, Mariano Phielipp, Janarthanan Rajendran and Sarath Chandar
Digital Discovery, 2024, 3, 769–785, DOI: 10.1039/D4DD00024B

EGraFFBench: evaluation of equivariant graph neural network force fields for atomistic simulations
Vaibhav Bihani, Sajid Mannan, Utkarsh Pratiush, Tao Du, Zhimin Chen, Santiago Miret, Matthieu Micoulaut, Morten M. Smedskjaer, Sayan Ranu and N. M. Anoop Krishnan
Digital Discovery, 2024, 3, 759–768, DOI: 10.1039/D4DD00027G

Gotta be SAFE: a new framework for molecular design
Emmanuel Noutahi, Cristian Gabellini, Michael Craig, Jonathan S. C. Lim and Prudencio Tossou
Digital Discovery, 2024, 3, 796–804, DOI: 10.1039/D4DD00019F

Reconstructing the materials tetrahedron: challenges in materials information extraction
Kausik Hira, Mohd Zaki, Dhruvil Sheth, Mausam and N. M. Anoop Krishnan
Digital Discovery, 2024, 3, 1021–1037, DOI: 10.1039/D4DD00032C

Towards equilibrium molecular conformation generation with GFlowNets
Alexandra Volokhova, Michał Koziarski, Alex Hernández-García, Cheng-Hao Liu, Santiago Miret, Pablo Lemos, Luca Thiede, Zichao Yan, Alán Aspuru-Guzik and Yoshua Bengio
Digital Discovery, 2024, 3, 1038–1047, DOI: 10.1039/D4DD00023D

CoDBench: a critical evaluation of data-driven models for continuous dynamical systems
Priyanshu Burark, Karn Tiwari, Meer Mehran Rashid, Prathosh A. P. and N. M. Anoop Krishnan
Digital Discovery, 2024, 3, DOI: 10.1039/D4DD00028E

We hope you enjoy this new themed collection from Digital Discovery.


Research infographic – Robotically automated 3D printing and testing of thermoplastic material specimens

25 Mar 2024

We’re pleased to share this new infographic on research from Haranczyk et al.

An infographic summarising the linked article.

Read the article here:

Robotically automated 3D printing and testing of thermoplastic material specimens

Miguel Hernández-del-Valle, Christina Schenk, Lucía Echevarría-Pastrana, Burcu Ozdemir, Enrique Dios-Lázaro, Jorge Ilarraza-Zuazo, De-Yi Wang and Maciej Haranczyk, Digital Discovery, 2023, 2, 1969–1979

