Spectrum Research, LLC.

NMR-SAMS

UNIX User’s Guide

 

An expert system for computer-assisted structure elucidation of organic and natural product compounds based on multidimensional spectroscopy

NMR-SAMS User’s Guide, Version 2.4

 

This manual describes release 2.4 of the UNIX SGI/SUN version of NMR-SAMS™.

 

Copyright Notice

Copyright © 1996 through 2002, Spectrum Research, LLC.  All rights reserved.

No part of this document may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into any language in any form by any means without the written permission of Spectrum Research, LLC.  

 

All possible care has been taken in the preparation of this document but Spectrum Research accepts no liability for any errors/omissions that may be found.

 

Spectrum Research, LLC. reserves the right to change the information in this document without prior notice.

 

Trademarks

SpecMan™ and NMR-SAMS™ are trademarks of Spectrum Research, LLC.

 

Acknowledgments

NMR-SAMS™ (originally known as CISOC-SES) was developed by Dr. Shengang Yuan, Dr. Chen Peng and Prof. Chongzhi Zheng at the Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, P.R. China, from 1988 to 1994.  It was further improved by Dr. Chen Peng in the group of Dr. Geoffrey Bodenhausen at the National High Magnetic Field Laboratory in 1995-1996.  Portions of NMR-SAMS™ are copyright © 1988 through 1995, Shanghai Institute of Organic Chemistry and Florida State University, and are exclusively licensed to Spectrum Research, LLC.  Title and full ownership rights to the converted/modified NMR-SAMS™ will remain solely with Spectrum Research, LLC, and NMR-SAMS™ is asserted to be Spectrum Research’s proprietary information and trade secret.

 

Credits

If the results (figures and/or data) obtained by NMR-SAMS™ are used for publication purposes, please refer to NMR-SAMS™ in the following manner or in an equivalent form:

"NMR-SAMS™ software, developed by Spectrum Research, LLC, was used to compute the results in this publication."

 

 

 


Table of Contents

Abbreviations And Acronyms

Introduction

1.1 General

1.2 Application Limitations

1.3 System Requirements

1.4 Help Facility

1.5 Typographical Conventions

1.6 A Note on Operating Systems

Getting Started with NMR-SAMS

2.1 Installation of the Program

2.2 Spectrum Research Licensing

2.3 Starting NMR-SAMS

2.4 The Basics

2.5 Description of the Main Menus

2.6 The NMR-SAMS Toolbar

Understanding NMR-SAMS

3.1 Overview

3.2 General Procedure of Structure Elucidation with NMR-SAMS

3.3 What Spectral Data Does NMR-SAMS Use?

3.4 Use of 2D NMR Connectivities: Bond Constraints

3.5 Use of Chemical Shifts And Peak Multiplicities

3.6 Structure Generation

3.7 User Intervention

3.8 Control Parameters

Working Data Set

4.1 Overview

4.2 Open An Existing Working Data Set

4.3 Opening A New Working Data Set

4.4 Input Molecular Formula

4.5 Save A Working Data Set

4.6 Save A Working Data Set as Different Name

4.7 Exiting NMR-SAMS

Input of NMR Spectral Data

5.1 Overview

5.2 Conversion of SpecMan 1H Peak List

5.3 Conversion of SpecMan 13C Peak List

5.4 Conversion of SpecMan DQF-COSY Peaks Table

5.5 Conversion of SpecMan HMQC/HETCOR Peaks Table

5.6 Conversion of SpecMan HMBC/COLOC Peaks Table

5.7 Conversion of SpecMan NOESY Peaks Table

5.8 Conversion of SpecMan INADEQUATE Data

5.9 Manual Peak Picking

Spectral Interpretation

6.1 Overview

6.2 Interpretation of MF, 1H, 13C and HMQC Data as Building Blocks

6.2.1.  Interpretation of Molecular Formula

6.2.2.  Interpretation of 1D 1H Data

6.2.3.  Interpretation of 1D 13C Data

6.2.4.  Interpretation of HMQC/HETCOR Connectivities

6.2.5.  Generation of Building Blocks

6.3 User-Defined Building Blocks

6.4 Interpretation of 2D Spectral Data as Bond Constraints

6.4.1.  Interpretation of COSY Connectivities

6.4.2.  Interpretation of HMBC/COLOC Connectivities

6.4.3.  Interpretation of NOESY Connectivities

6.4.4.  Interpretation of INADEQUATE Connectivities

6.4.5.  Transformation of Bond Constraints

6.4.6.  Setting up Atom-Atom Connection Matrix (ACMX)

2D Structure Generation

7.1 Overview

7.2 User-Defined Bond Constraints

7.2.1.  Interactive Structure Generation

7.3 User-Defined Atom Environment Constraints

7.4 Structure Generation

Resonance Assignment

8.1 Overview

8.2 Input of the Target Structure

8.2.1.  Building a Target Structure in NMR-SAMS

8.2.2.  Importing a Target Structure

8.2.3.  Setting up the Assignment Matrix

8.3 User-Defined Resonance Assignment

8.4 Resonance Assignment

Quick Enumeration/Elucidation

9.1 Overview

9.2 MF-Based Structure Generation of Virtual Compounds

9.3 Quick Structure Elucidation

Graphical Display of Results

10.1 Overview

10.2 Display of Structural Building Blocks

10.3 Display of Target Structure

10.4 Display of Generated Structures/Assignments

10.5 Status Window

10.6 Display Options

10.7 Editing the Display of Generated Structures

Exporting Results

11.1 Overview

11.2 Exporting NMR Spectral Data

11.3 Exporting Resonance Assignment

11.4 Exporting Candidate or Target Structures

Appendix I: NMR Data File

1D Spectral Data

2D Spectral Data

Appendix II: Master Data File

Appendix III: CCSS-13C Chemical Shift Range Correlation Table

Appendix IV: Control Parameters

Parameters for Spectral Interpretation

Parameters for Setting up ACMX

Parameters for Structure Generation

References

Index


Abbreviations And Acronyms

δ13C                         13C chemical shift.

δ1H                          1H chemical shift.

1D                           One-dimensional.

2D                           Two-dimensional.

ACMX                   Atom-atom Connection MatriX, which summarizes the bond-formation probabilities between the constituent atoms of an unknown.

BB                           Structural Building Blocks for structure generation, e.g.,  CH3-, CH2-, and -OH.

BC                           Bond Constraint derived from 2D NMR spectral data, which defines the number of intervening bonds between the correlated spins.

CCSS                      Carbon-Centered Single-spherical Substructure.

COLOC                  COrrelation via Long-range Coupling, a kind of 2D spectrum that provides 2-to-3-bond 13C-1H connectivities.   

COSY                     COrrelated SpectroscopY, a kind of 2D spectrum that provides 1H-1H through-bond connectivities.

CPU                        Central Processing Unit.

DEPT                      Distortionless Enhancement by Polarization Transfer, a kind of 1D spectrum that provides information concerning the number of attached protons on each carbon atom.

EC                           Environment Constraint, a user-specified limitation on the types of neighboring atoms attached to a central atom.

HETCOR                HETeronuclear Correlation, also called C-H COSY, a kind of 2D spectrum that provides one-bond 13C-1H connectivity information.  

HMBC                    Heteronuclear Multi-Bond Connectivity, a kind of 2D spectrum that provides 2-to-3-bond 13C-1H connectivity information.

HMQC                   Heteronuclear Multiple Quantum Coherence, a kind of 2D spectrum that provides one-bond 13C-1H connectivity information.

INADEQUATE     Incredible Natural Abundance Double Quantum Transfer Experiment, a kind of 2D spectrum that provides one-bond 13C-13C connectivity information.

MDF                       The Master Data File produced while using NMR-SAMS for structure elucidation. This file stores the intermediate and final results produced during the execution of NMR-SAMS.

MF                          Molecular formula or empirical formula of a molecule, which is usually derived from mass spectral data.

NMR                      Nuclear Magnetic Resonance

NOESY                   Nuclear Overhauser enhancement and Exchange SpectroscopY, a kind of 2D spectrum that provides 1H-1H through-space connectivity information.

NSBC                     Number of “Sub-bond constraint(s)”, or pair(s) of relevant atoms, that must satisfy a bond constraint in the generated structure.

PSE                         Partial Structure Elucidation.  Structure elucidation based on the information available from a portion of the spectral data, which is usually the well-resolved part.


Chapter 1

Introduction

1.1 General

NMR-SAMS (NMR Spectral Assignment Made Simple) is an expert system for computer-assisted structure elucidation of unknown organic or natural product compounds.  It is based on multidimensional spectroscopy (e.g., MS, NMR, IR and UV), in which each technique provides complementary information about a compound.  In particular, NMR-SAMS uses the information obtained from routine 1D and 2D NMR spectroscopy.  Together with SpecMan, it serves as a chemist’s workbench for de novo structure elucidation of small molecules such as organic compounds, natural products, peptides, and other small biomolecules.  NMR-SAMS is also used for automated resonance assignment of known compounds.

 

The basic strategy of structure elucidation using NMR-SAMS is illustrated in Fig. 1.1.  When dealing with an unknown compound, the molecular formula (MF) must first be determined by mass spectrometry or another approach.  Next, the 1D and 2D NMR chemical shifts, multiplicities, J-couplings and intensities are extracted with SpecMan from the processed 1D and 2D spectra (transformed through conventional FFT or non-FFT techniques).  The resulting peak lists are imported into NMR-SAMS and interpreted as structural building blocks and bond constraints based on one-bond, two-bond and other long-range connectivities.  Finally, the building blocks, NMR-derived bond constraints, and other user-defined bond constraints are used to generate plausible candidate structures with resonance assignments.  If the structure is already known, the user can specify the proposed structure and let NMR-SAMS complete the resonance assignments directly.

 

Figure 1.1. Data flow diagram of NMR-SAMS representing the different phases of spectral interpretation, structure generation and resonance assignment.  Gray boxes represent optional input data.  PSE means partial structure elucidation based on incomplete spectral data.  A bond constraint is represented as n intervening bonds, (B)n, between the correlated atoms.

NMR-SAMS has the following main features:

 

·         Input of peak tables with chemical shifts, multiplicities, J-couplings and intensities from a variety of 1D and 2D NMR experiments.

·         Automated interpretation, bookkeeping, and crosschecking of spectral data with respect to the molecular formula.

·         Novel representation of 2D NMR correlation information based on the concept of a chromatic graph.

·         Structure determination and identification of unknown compounds based on complete utilization of 2D NMR correlation information and complementary spectral information from MS, UV and IR spectral data.

·         Partial structure elucidation of compounds based on incomplete spectral data.

·         Graphical tools for interactive building and editing of molecular fragments, and for defining bond constraints and atom environment constraints.  Graphical tools to display and browse through candidate structures and sub-structures.  Graphical interaction between structures and bond constraints.

·         Structure elucidation that is independent of background information, which minimizes the potential human bias introduced into the structure elucidation process.

·         Fast structure generation of complex molecules when sufficient constraints are available.

·         Fast resonance assignment and structure verification of large complex molecules based on proposed structures.

·         Automated resonance assignment based on assigned resonances of compounds.

·         Flexible format for report generation of the results of spectral and structural analysis.

1.2 Application Limitations

The current version of NMR-SAMS can only handle molecules that have fewer than 128 non-hydrogen atoms.  The total number of free bonds (unsatisfied valences) of the structural building blocks before structure generation, which determines the complexity of the structure generation problem, must not exceed 220.  The total number of free bonds is equal to the sum of the valences of the heavy atoms, less the number of protons and less twice the number of known bonds.  The maximum number of peaks is limited to 200 in a 1D spectrum and to 1000 in a 2D spectrum.  The maximum number of bond constraints is limited to 1000.
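
The free-bond bookkeeping described above can be sketched in a few lines of Python.  This is only an illustration and not part of NMR-SAMS: the function names, the constant names, and the small valence table are assumptions made for the example, and only the numeric limits are taken from this section.

```python
# Illustrative sketch only; names and the valence table are assumptions,
# not NMR-SAMS internals. The numeric limits are those quoted in Section 1.2.
MAX_HEAVY_ATOMS = 127   # "fewer than 128 non-hydrogen atoms"
MAX_FREE_BONDS = 220

# Assumed common valences for the elements named in this guide.
VALENCE = {"C": 4, "N": 3, "O": 2}


def free_bonds(heavy_atoms, n_protons, n_known_bonds):
    """Sum of heavy-atom valences, less protons, less twice the known bonds.

    heavy_atoms maps an element symbol to the number of such atoms.
    """
    total_valence = sum(VALENCE[symbol] * count
                        for symbol, count in heavy_atoms.items())
    return total_valence - n_protons - 2 * n_known_bonds


def within_limits(heavy_atoms, n_protons, n_known_bonds):
    """Check the heavy-atom and free-bond limits stated in the text."""
    n_heavy = sum(heavy_atoms.values())
    n_free = free_bonds(heavy_atoms, n_protons, n_known_bonds)
    return n_heavy <= MAX_HEAVY_ATOMS and n_free <= MAX_FREE_BONDS


# Ethanol, C2H6O, with no bonds fixed yet: 4*2 + 2 - 6 = 4 free bonds,
# matching the building blocks CH3- (1), -CH2- (2) and -OH (1).
print(free_bonds({"C": 2, "O": 1}, n_protons=6, n_known_bonds=0))
```

Fixing one bond in advance, for example by supplying CH3-CH2- as a known substructure, removes two free bonds, which is one way to see why known substructures reduce the size of the structure generation problem.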

 

Most of the previously proposed CASE (computer-assisted structure elucidation) systems use either a chemical shift-substructure correlation database or a more concise chemical shift-substructure correlation model, and rely to a large extent on the knowledge of a human expert.  Such systems have been limited to very simple and small molecules.  NMR-SAMS has demonstrated the impact of using 2D NMR correlation information on improving the efficiency of CASE systems when dealing with real-world complex molecules.  For efficient structure elucidation of unknown compounds, NMR-SAMS requires the molecular formula together with 1D 1H, 13C, DEPT (or APT), and 2D DQF-COSY, HMQC (or HETCOR), HMBC (or COLOC, FLOCK), and INADEQUATE spectral data.  The molecular formula may or may not be known accurately from MS or other methods; if it is unknown, NMR-SAMS estimates it from the number of observed carbon and proton peaks along with any available heteroatom information.  It is not mandatory to have all of these experimental NMR data sets available, because NMR-SAMS can also solve structure elucidation problems with different possible combinations of experimental data (for details refer to Section 3.3).  Structure elucidation based on 1D 13C chemical shifts alone is only possible for very simple molecules, and is not practical for complex molecules.  NMR-SAMS cannot elucidate unknown structures based solely on 1D 1H chemical shifts.

 

Although most spectra used by NMR-SAMS, e.g., 1D 1H, 2D DQF-COSY and HMBC, are allowed to have peak degeneracy, the 1D 13C spectrum and HMQC (or HETCOR) must be completely resolved for complete structure elucidation.  If severe overlap prevents resolving all of the 13C peaks, NMR-SAMS will use only the well-resolved spectral data to generate the plausible substructures.  This is called partial structure elucidation (PSE).  Some limitations on PSE are described in Section 7.1.

 

In the current version, NMR-SAMS does not consider molecular symmetry, so partial structure elucidation is performed for a molecule with global symmetry.  For a molecule with local symmetry where the 13C signals corresponding to symmetric carbons can be identified, complete structure elucidation by NMR-SAMS is possible.

 

Most of the steps in NMR-SAMS, such as interpretation of 1D and 2D data into bond constraints and generation of the building block sets, are usually performed very quickly.  Structure generation, on the other hand, is more time-consuming because of its combinatorial nature.  The efficiency of structure generation (which is a function of the computation time, the quality of the structures generated, and the number of structures generated) depends on the size of the molecule and the quality and quantity of the spectral data.  When the unknown molecule is large (e.g., with more than 40 heavy atoms) and the correlation information derived from the spectral data is not sufficient, structure generation could take a very long time to finish.  In such cases, the user is advised to input as many known substructures as possible to accelerate the structure generation process.  In addition, the user can take advantage of some of the other NMR-SAMS tools, such as resonance assignment for verification of proposed structures, and flexible graphics tools for interactive building of structures.

 

Although the spectral interpretation routines of NMR-SAMS are general-purpose, the structure generator of NMR-SAMS cannot deal with molecules containing ionic atoms, tautomeric or coordinate bonds.  It recognizes only single, double and triple bonds.  Aromatic bonds are represented as alternating single and double bonds.  Sometimes this might cause redundancy in the structure generation of aromatic compounds.

 

In the current version of NMR-SAMS, if the structure is already known, then target structure based resonance assignment is possible, provided the NMR data set is complete.

 

Although NMR-SAMS can recognize all chemical elements, the current substructure/δ13C knowledge base (see Appendix III) contains only the substructures consisting of commonly occurring elements, i.e., C, H, O, and N.  The user can customize this knowledge base.  The user will be informed about the undefined substructures when other elements exist in the molecule, and this could reduce the efficiency of structure generation.

 

NMR-SAMS can be viewed as an expert assistant helping spectroscopists and chemists to solve structure elucidation problems, and is by no means expected to replace the human expert.  NMR-SAMS is designed for flexible human intervention, and efficiently uses the additional user knowledge and judgment to control and enhance the structure elucidation process.  

1.3 System Requirements

The IRIX version of NMR-SAMS runs on SGI systems with R4000 or higher processors, the IRIX 6.x or higher operating system, at least 128 MB of RAM, and 8-bit graphics.  R8000 or higher processors and 128 MB or more of RAM are recommended.

 

The Solaris version of NMR-SAMS runs on Sun systems running Solaris 2.x (SunOS 5.x) with SPARC processors and at least 128 MB of RAM and 8-bit graphics.  X/Motif 1.2.3 libraries are required.  These are usually supplied with the SUN Common Desktop Environment (CDE).

 

The Microsoft Windows version of NMR-SAMS runs on Pentium or higher processors (or 100% compatibles) with at least 32 MB of RAM running Windows 95/98/2000, or Windows NT 4.0 or later and a VGA or better monitor.  A Pentium II or higher processor with 64 MB or more RAM is recommended. 

 

NMR-SAMS requires from 2 MB to 55 MB of hard disk space, depending on the sample data that are installed.  The sample data with original spectra require 40 MB of hard disk space.  The swap space (i.e., virtual memory) required is proportional to the complexity of the data being analyzed.

1.4 Help Facility

NMR-SAMS provides online help information for many of its dialog boxes.  Clicking the Help button in a dialog box displays the relevant help message.

1.5 Typographical Conventions

Unless otherwise noted in the text, the User’s Guide of NMR-SAMS uses the typographical conventions described below:

·         A command to select is represented in bold typeface by the menu name, the option, and the pull-right option (if any). For example, the command:

Display/Display Options/Chemical Shifts        

means: first click the Display menu on the menu bar, then click Display Options in the opened menu, and then click Chemical Shifts in the pull-right options.

·         Transcript of a computer file or display is printed in Courier New letters with the keywords shown in bold, and the annotations (if any) in italic Times letters. (Such annotations do not appear in the file or display itself).

ATOM~~ATOM:

For each correlation, listed are the IDs of the correlated atom pair, the range of intervening bonds, and the bond type (0: meaningless or unknown)

(1-23: 1~1 2)
(6-22: 1~1 3)

     .

     .

     .

·         Filenames and parameters are printed in Courier New letters. For example:

Files phasefile and procpar are used for peak picking with SpecMan. 

Parameter GEN_FLAG controls the search criteria of the structure generation.

·         Terms introduced for the first time are presented in boldface type.

·         Words in italic represent variables. For example:

There are n intervening bonds between the correlated atoms.
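
The ATOM~~ATOM transcript used as an example in the conventions above has a regular layout, so records of that form can be read mechanically.  The following Python sketch parses such records.  It is not taken from NMR-SAMS: the class name, the field names, and the reading of each record as (atom ID - atom ID: minimum~maximum intervening bonds, bond type) are assumptions based only on the annotation shown with that example.

```python
import re
from typing import List, NamedTuple


class Correlation(NamedTuple):
    """One ATOM~~ATOM record; field names are hypothetical."""
    atom_a: int      # ID of the first correlated atom
    atom_b: int      # ID of the second correlated atom
    min_bonds: int   # lower end of the range of intervening bonds
    max_bonds: int   # upper end of the range of intervening bonds
    bond_type: int   # 0 means meaningless or unknown


# Matches records such as "(1-23: 1~1 2)".
_RECORD = re.compile(
    r"\(\s*(\d+)\s*-\s*(\d+)\s*:\s*(\d+)\s*~\s*(\d+)\s+(\d+)\s*\)"
)


def parse_correlations(text: str) -> List[Correlation]:
    """Return every record of the assumed form found in text, in order."""
    return [Correlation(*(int(field) for field in match.groups()))
            for match in _RECORD.finditer(text)]


sample = "(1-23: 1~1 2)\n(6-22: 1~1 3)"
for record in parse_correlations(sample):
    print(record)
```

Lines that do not match the assumed pattern, such as the ATOM~~ATOM keyword itself, are simply skipped by this sketch.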

1.6 A Note on Operating Systems

Spectrum Research has attempted to make its products as similar as possible across the various operating systems.  However, there are some unavoidable differences that cannot be worked around.  As the highest priority, data files have been kept consistent between UNIX and MS Windows machines.

 

It is recommended that the user refer to the online help provided by individual PC vendors for more information on the basics of Operating Systems.  NMR-SAMS follows the interface of the Operating System that it is running on, and therefore, it is important to become acquainted with the Operating System before attempting to learn NMR-SAMS.  See Section 2.4 for information on the basics of the NMR-SAMS Interface.

 


Chapter 2

Getting Started with NMR-SAMS

2.1 Installation of the Program

For instructions on NMR-SAMS installation, please refer to the Release Notes or the 'nmrsamsSGI.readme' and 'nmrsamsSUN.readme' files.

2.2 Spectrum Research Licensing

NMR-SAMS is copy protected by the Spectrum Research Licensing System.  This licensing system allows NMR-SAMS to run only on the computer for which it was sold.  To obtain a valid license file (license.dat) in order to activate NMR-SAMS, please provide Spectrum Research with the System ID of your SGI/SUN workstation.  To retrieve the System ID from an SGI computer, type '/etc/sysinfo' without any options at the UNIX prompt; the words 'System ID' will be followed by several characters.  The System ID is the first eight characters (for example 69 08 fa b3).  To retrieve the System ID from a SUN computer, type 'hostid' at the UNIX prompt.  This program is usually located in the '/usr/bin' directory, so the user may need to type the full path (e.g. '/usr/bin/hostid') if the program is not located in the search path.  The System ID is the first eight characters (for example 6908fab3).  Once you have received the trial license file, place a copy of it in the …/Spectrum2001/NMR-SAMS directory.
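
The two examples above show the same System ID written with and without spaces.  As a convenience when reporting the ID, a small helper can reduce either form to the first eight characters.  The Python sketch below is hypothetical and is not part of the licensing system: the function name is an assumption, and it expects to be given only the characters of the ID itself (the text printed by 'hostid', or the characters that follow the 'System ID' label), not the label.

```python
import re


def normalize_system_id(raw: str) -> str:
    """Return the first eight hexadecimal characters of raw, in lower case.

    raw must contain only the ID characters, optionally separated by
    spaces, for example "69 08 fa b3" or "6908fab3".
    """
    hex_chars = re.findall(r"[0-9a-fA-F]", raw)
    if len(hex_chars) < 8:
        raise ValueError("expected at least eight hexadecimal characters")
    return "".join(hex_chars[:8]).lower()


# Both spellings of the example ID reduce to the same eight characters.
print(normalize_system_id("69 08 fa b3"))
print(normalize_system_id("6908fab3"))
```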

 

When the trial licensing time is nearing expiration, NMR-SAMS will display a dialog box indicating the number of days remaining for the license.  Please contact Spectrum Research for a renewal at that time.

2.3 Starting NMR-SAMS

To launch the NMR-SAMS program, change to the …/Spectrum2001/NMR-SAMS directory (by default, NMR-SAMS is installed into a 'Spectrum2001' directory in the user's home directory) and then type 'nmrsams24' at the UNIX prompt.  The program starts with a Main Graphics Window that has a menu bar and a status bar.

 

By default, a Status Window is also opened, which displays text messages to indicate the current status of the structure elucidation, and also prompts the user with the “what to do next” steps.  The main graphics window is shown below:

 

When NMR-SAMS is started, it reads the following three files from the directory where the user launched NMR-SAMS:

 

nmrsams.ini: defines some of the initial settings of the program, such as window sizes, background colors, atom colors, bond colors, etc.  If this file is not found, default settings will be used.

periodic_tab.def: defines some properties of the chemical elements.  If this file is not found or if it is not properly read, NMR-SAMS will not be able to recognize any element symbols and perform the related functions. 

chemical_shifts.def: defines the knowledge base of  13C chemical shift dispersion ranges for some common carbon-centered single spherical substructures (CCSS) (see Appendix III).  If this file is not found or it is not correctly read, the structure generation will not be possible (see Section 3.5).
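
Because the three files are read from the directory in which NMR-SAMS is launched, it can be useful to check for them before starting the program.  The following Python sketch does this.  It is not shipped with NMR-SAMS: the function and dictionary names are assumptions, and the consequences listed are paraphrased from the descriptions above.

```python
from pathlib import Path

# File names are those listed in this section; the messages paraphrase
# the consequences described there. The names below are hypothetical.
STARTUP_FILES = {
    "nmrsams.ini": "default settings will be used",
    "periodic_tab.def": "element symbols will not be recognized",
    "chemical_shifts.def": "structure generation will not be possible",
}


def missing_startup_files(launch_dir="."):
    """Map each startup file absent from launch_dir to its consequence."""
    directory = Path(launch_dir)
    return {name: effect
            for name, effect in STARTUP_FILES.items()
            if not (directory / name).is_file()}


# Report any missing files for the current working directory.
for name, effect in missing_startup_files().items():
    print(f"{name} not found: {effect}")
```

This check only tests for the presence of each file; it cannot tell whether NMR-SAMS will be able to read the contents correctly.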

2.4 The Basics

If the user is new to X/Motif, please read this section before using NMR-SAMS, to become better acquainted with the NMR-SAMS interface. 

 

When NMR-SAMS is first started, a window appears with "NMR-SAMS, version 2.4, (C) Spectrum Research, LLC." at the top.  The area where this text appears is referred to as the "Title Bar".  The user can press the left mouse button while the arrow pointer (which is called the "Cursor") is on the title bar and then move the mouse to move the window.  Release the mouse button to stop moving the window.  That combination of events (pressing a mouse button, moving the mouse, and then releasing) is known as "Dragging".  Position the mouse pointer so that it is over the word "File", located immediately below the title bar.  Now press and then immediately release the left mouse button.  This procedure (pressing a mouse button and then releasing without moving the mouse) is known as "Clicking".  The item that was clicked on is part of the "Menu Bar".  The menu bar consists of several "Menus" ("File", "Edit", "Display", "Analysis", and "Help").  When the File menu is clicked on, a "Pulldown" appears.  This pulldown consists of "Menu Items" ("Open...", "New...", etc.).  If the user clicks on one of these menu items, the corresponding action is performed.  Menu items are the primary way that the user conveys instructions to NMR-SAMS.

 

Some items on menus are not menu items, however.  The line that appears above the "Quit" menu item is known as a "Separator".  Its purpose is solely to make the menu easier to read.  Click on the "File" menu and notice that the "Create NMR Data File" menu item has a right-pointing triangle after its text.  This type of menu item is known as a "Pullright".  Click the mouse on the "Create NMR Data File" menu item and another group of menu items will appear to the right of it.  The pullright feature is used to group related menu items together, reducing the size of the main pulldowns.  Click on the "Display" menu and the menu item "Status Window", which is known as a "Toggle", will appear.  Toggles have two states: "Off" (also known as "Deselected" or "Deactivated") and "On" (also known as "Selected" or "Activated").  If the status window is on, turn off the "Status Window" toggle by clicking on it and the status window will disappear.  Click on the "Display" menu and turn on the "Status Window" toggle by clicking on it again, and the status window will pop up again.

 

Position the mouse cursor over the frame that surrounds the entire NMR-SAMS window.  Drag the mouse to change the size of the NMR-SAMS window.  All sides of the NMR-SAMS window can be moved to size the window.  The field below the NMR-SAMS Toolbar is known as the "Main Graphics Window".  This is where information about chemical structures is displayed.  At the bottom of the Main Graphics Window is the "Status Bar", and this status bar prints out information about what is going on in NMR-SAMS.  It will notify the user if the user has asked NMR-SAMS to perform a function that it is not prepared to do, in addition to giving the user hints about using NMR-SAMS. 

 

Click on the "Open..." menu item from the "File" menu, and a window will appear with the title "Open".  This type of window is known as a dialog box.  While a dialog box is displayed, the user must interact with it before continuing with other areas of NMR-SAMS.  Dialog boxes also have a "Help" button that, when clicked, brings up online help about the dialog box.  The dialog box that is currently displayed is referred to as the "File Browse Dialog", and it is used to specify a file.  The user can move to a certain directory by using the "Directory" combo box to find the proper parent directory, and can descend the directory structure by double clicking on a directory name in the list (a "Double Click" is two clicks in rapid succession).  After the user has changed to the appropriate directory, a list of "Files" with the extension ".mdf" will appear.  Click on one of the filenames to select it and then click the "OK" button at the bottom of the dialog box to accept the selected file.  Click the "Cancel" button to close the dialog box without performing an action.

 

When multiple candidate structures are generated, the first structure will be displayed along with a window titled Structure Browser.  This window is known as a "Palette".  Palettes are similar to dialog boxes; however, the user is able to interact with them and with the main NMR-SAMS window at the same time.  The "Structure Browser" palette is used to control the display of the candidate structures.  In the "Structure Browser" palette there is a "Slider", and the user can drag the slider bar to the left or right to lower or raise its value, which determines the sequential number of the structure to be displayed.  Some palettes also have text fields where the user can enter numbers or text.

 

The user should now have enough information to start exploring NMR-SAMS.  Note that NMR-SAMS grays out menu items that are not available during specific stages of the structure elucidation process.  For example, if the user has not prepared the NMR data file, the menu item Analysis/Interpret NMR Data will remain grayed out until the data has been prepared. 

2.5 Description of the Main Menus

The menu bar appears at the top of the main graphics window and contains the names of the five NMR-SAMS menus.  All tasks in NMR-SAMS can be performed by selecting from these five menus.  The five menus are described briefly on the following pages and in greater detail in the other chapters of this book.

 

The File menu:         The File menu lists options related primarily to reading data into and out of NMR-SAMS, as displayed below:

The Edit menu:         The Edit menu lists options related to editing of the working data set files and the generated structures, as displayed below:

The Display menu:   The Display menu lists options related to the graphical display of intermediate and final results of NMR-SAMS, as displayed below:

The Analysis menu: The Analysis menu lists the options related to structure elucidation, as displayed below:

The Help menu:        The Help menu lists the options related to the online help of NMR-SAMS, as displayed below:

 

2.6 The NMR-SAMS Toolbar

The NMR-SAMS toolbar contains icons (pictures) that represent commonly used menu items.  Clicking on one of the icons performs the same action as selecting the corresponding menu item.

 

 

The following menu items have associated toolbar icons:

 

File/New

File/Open

File/Save

Display/Building Blocks & Fixed Bonds

Display/Target Structure

Display/Generated Structures or Assignments

Display/Status Window

Display/Display Options/Balls

Display/Display Options/Carbon Symbols

Display/Display Options/Numbers

Display/Display Options/Chemical Shifts

Display/Display Options/Protons

Display/Display Options/Molecular Formula

Display/Display Options/Connection Table

Display/Display Options/Refine

Help/Contents


Chapter 3

Understanding NMR-SAMS

3.1 Overview

This chapter introduces the basic procedure of structure elucidation, with a brief description of the concepts and principles of NMR-SAMS, and concludes with a high-level discussion of the typical flow of activity through NMR-SAMS. 

3.2 General Procedure of Structure Elucidation with NMR-SAMS

The process of structure elucidation of an unknown compound through NMR spectroscopy consists of the following steps:  

 

1.        Determination of the molecular formula (MF) by MS.  Determination of some functional groups in the unknown compound through IR and UV spectroscopy.  MF is optional to NMR-SAMS.

2.        Data acquisition of 1D and 2D NMR spectra.  See Section 3.3 for the spectral data used by NMR-SAMS.

3.        Extraction of peak tables with chemical shifts, intensities, J-couplings, and multiplicities.  Peak picking of 1D and 2D NMR spectral data is performed with SpecMan using automatic and semi-automatic procedures (see SpecMan’s User Guide).  The peak tables are then converted to the NMR-SAMS representation of connectivity information (see Chapter 5).

4.        Set up of the parameters to control the spectral interpretation and structure generation.  In most cases, the default values of these parameters can be used  (see Appendix IV).

5.        Interpretation of molecular formula (if known), along with 1H, 13C, and HMQC spectral data to obtain the structural building blocks.  If the MF is unknown, the user can interactively add heteroatoms into the building block sets (see Chapter 6).

6.        Interpretation of additional 2D NMR spectral data to obtain the bond constraints (see Chapter 6).

7.        Generation of candidate structures that are consistent with the experimental data for unknown compounds (see Chapter 7), or verification of the proposed structure and completion of 1H and 13C resonance assignments (see Chapter 8) for known compounds.  Interactive structure generation and resonance assignment is also possible (see Section 7.2.1).

8.        Exportation of the results of structure generation and resonance assignments (see Chapter 11).

 

Structure elucidation is usually an interactive approach, so this process may need to be repeated several times until the user obtains satisfactory results.  NMR-SAMS assists the user in identifying and correcting the inconsistencies in the input data.  When sufficient input data is not available, NMR-SAMS generates only partial structures with resonance assignments.   NMR-SAMS also warns the user about some common pitfalls that could lead to incomplete or incorrect structure generation, and provides clues for further refinement.

 

3.3 What Spectral Data Does NMR-SAMS Use?

The possible combinations of 1D and 2D spectral data used by NMR-SAMS for structure elucidation are listed in Table 3.1.  The fifth combination (routine 1D and 2D spectra along with complementary information from other spectral data such as MS, UV, and IR) is the recommended choice for structure elucidation of real-world complex molecules.  Other spectral sources such as MS, IR, and UV are not directly interpreted by NMR-SAMS, but they can be conveniently used as user-defined bond/environment constraints.

Table 3.1. Possible combinations of 1D and 2D NMR spectral data used by NMR-SAMS a

No.   1D                  2D                                       Comments

1     None                None                                     Pure isomer enumeration from MF.

2     13C (and DEPT b)    None                                     Very low efficiency except for simple molecules.

3     13C, DEPT b         INADEQUATE                               Very high efficiency, if data available.

4     13C, DEPT b, 1H     DQF-COSY c, HMQC d                       Low efficiency except for H-rich molecules.

5     13C, DEPT b, 1H     DQF-COSY c, HMQC d, HMBC e (NOESY f)     Most practical way for de novo structure elucidation of complex molecules.

6 g   1H                  DQF-COSY c, HMQC d, HMBC e (NOESY f)     Practical when the amount of sample does not allow for carbon-detecting experiments.

 

a TOCSY is not used directly by NMR-SAMS, but can be used by SpecMan to assist the peak picking of DQF-COSY.

b INEPT, or APT can also be used.

c Various types of COSY experiments can be used, as long as they provide geminal and vicinal H-H through-bond connectivity.

d HSQC, HETCOR, or other types of spectra can also be used, as long as they provide one-bond C-H connectivity.

e COLOC, FLOCK, or other types of spectra can also be used, as long as they provide long-range C-H connectivity.

f NOESY or ROESY is optional.

g HMBC and HMQC must be clean enough to allow extraction of 13C chemical shifts and multiplicity information.  13C chemical shifts can be automatically extracted from HMBC using SpecMan.  13C multiplicities must be identified manually from the HMQC spectrum.

3.4 Use of 2D NMR Connectivities: Bond Constraints

NMR-SAMS uses mainly 2D NMR-derived through-bond spin-spin connectivity information for structure elucidation, because it is reliable and provides comprehensive structural information for de novo structure elucidation.

 

In NMR-SAMS, the coordinates of 2D cross peaks are first converted into connectivities between the relevant 1D peaks, and then interpreted as bond constraints on the relevant atoms.  A bond constraint (BC) is a requirement of a certain number (or a range) of intervening chemical bonds between correlated spins.  For an asymmetric molecule, such spin-spin BC’s are directly used as atom-atom bond constraints.  In addition to its efficient utilization of BC’s involving ambiguous bond separation (e.g., 2 or 3 bonds between two HMBC-correlated spins), NMR-SAMS also copes with BC’s concerning ambiguous atoms.  Such ambiguity typically arises from peak degeneracy or low digital resolution.

 

 

In NMR-SAMS, a BC is represented in the following general format:

 

(Atom_y ... - Atom_x ... : minBond ~ maxBond; BondType; minNSBC ~ maxNSBC)Source

where

Atom_y ... is the correlated atom(s) along the Y dimension (13C domain for an HMQC spectrum). It could be more than one atom in the case of ambiguity.

Atom_x ... is the correlated atom(s) along the X dimension (1H domain for an HMQC spectrum).  It could be more than one atom in the case of ambiguity.

minBond and maxBond are the minimum and maximum bond separations between the relevant atoms.

BondType is the type of the intervening bond between the atoms.  Valid choices are: 0, 1, 2, or 3 for unknown, single, double, and triple, respectively.

minNSBC and maxNSBC are the minimum and maximum numbers of relevant atom pair(s) that must satisfy this BC in the generated structure. 

Source encodes the connectivity (or other source) from which the BC was derived.  A connectivity is represented by its spectral type and its ID number. The following codes are used to represent the different spectral types:

“C” for COSY, “Q” for HMQC (or HETCOR), “B” for HMBC (or COLOC), “N” for NOESY, “I” for INADEQUATE.

Note: The ID of a connectivity is different from, though related to, the peak ID(s) in the SpecMan peak tables.  For more details see Fig. 6.4 in Chapter 6.

The following codes are used to represent other kinds of source:

“S” for a pseudo BC added by the program, “U” for a user-defined BC, and “G” for a previously generated bond (when using a generated substructure as the starting point for the next structure generation cycle).

 

For example, an HMBC-derived bond constraint is represented as:

(10 - 17 18: 2 ~ 3; 0; 1 ~ 2)B10

In the above example, the first set of numbers “10 - 17 18: ” denotes the atoms that are correlated.  In this case, since the chemical shifts of H-17 and H-18 are very close, it is difficult to resolve which one of them is really correlated to C-10.  Therefore, both of the protons are retained to represent the possibilities that there could be a correlation between either C-10 and H-17, or C-10 and H-18, or both.  The next set of numbers “2~3” represents that there could be two or three intervening bonds between the correlated C-H pair(s).  The next number “0” represents the bond type of the intervening bonds, and in this case, they are treated as unknown.  The next set of numbers “1~2” represents that either one or both pairs of the atoms involved in the bond constraint must satisfy this bond constraint in the computed structure (i.e., C-10 and H-17, or C-10 and H-18, or both pairs).  Finally, the character string “B10” means that this bond constraint was derived from the HMBC connectivity #10.  From the comment of this connectivity, the ID of the actual cross peak (in the SpecMan peaks table) can be found in the .nmr file. (See Fig. 6.4 in Chapter 6).
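The general format above can be read mechanically.  The following Python sketch is not part of NMR-SAMS; it only illustrates how the fields of a BC string map onto the description given above.  It assumes that atoms are written as integer IDs separated by spaces, as in the example, and the names BondConstraint, parse_bc, and SOURCE_CODES are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Source codes as listed in the text; the dictionary itself is only illustrative.
SOURCE_CODES = {
    "C": "COSY", "Q": "HMQC", "B": "HMBC", "N": "NOESY", "I": "INADEQUATE",
    "S": "pseudo BC", "U": "user-defined BC", "G": "generated bond",
}


@dataclass
class BondConstraint:
    atoms_y: List[int]        # correlated atom(s) along the Y dimension
    atoms_x: List[int]        # correlated atom(s) along the X dimension
    min_bond: int
    max_bond: int
    bond_type: int            # 0 unknown, 1 single, 2 double, 3 triple
    min_nsbc: int
    max_nsbc: int
    source_code: str          # one of the keys of SOURCE_CODES
    source_id: Optional[int]  # connectivity ID, if one is given


# Assumed layout: (y-atoms - x-atoms : min ~ max; type; min ~ max)Source
BC_PATTERN = re.compile(
    r"\(\s*([\d\s]+?)\s*-\s*([\d\s]+?)\s*:"
    r"\s*(\d+)\s*~\s*(\d+)\s*;"
    r"\s*([0-3])\s*;"
    r"\s*(\d+)\s*~\s*(\d+)\s*\)"
    r"\s*([A-Z])(\d*)"
)


def parse_bc(text: str) -> BondConstraint:
    """Split one BC string into its fields; raise ValueError if it does not fit."""
    m = BC_PATTERN.fullmatch(text.strip())
    if m is None or m.group(8) not in SOURCE_CODES:
        raise ValueError(f"not a recognizable bond constraint: {text!r}")
    return BondConstraint(
        atoms_y=[int(a) for a in m.group(1).split()],
        atoms_x=[int(a) for a in m.group(2).split()],
        min_bond=int(m.group(3)),
        max_bond=int(m.group(4)),
        bond_type=int(m.group(5)),
        min_nsbc=int(m.group(6)),
        max_nsbc=int(m.group(7)),
        source_code=m.group(8),
        source_id=int(m.group(9)) if m.group(9) else None,
    )


bc = parse_bc("(10 - 17 18: 2 ~ 3; 0; 1 ~ 2)B10")
```

For the HMBC example, the parsed record holds C-10 as the only Y atom and the two candidate protons 17 and 18 as X atoms, which makes the ambiguity explicit.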

 

By default, NMR-SAMS treats unambiguous BC’s (those with exactly two correlated atoms, one-bond separation, and minNSBC = maxNSBC = 1, which means the BC must be satisfied in a generated structure) as fixed bonds.  The rest, which have either ambiguous bond separation, or ambiguous numbers of correlated atoms, or both, are treated as ambiguous BC’s.  The ambiguous BC’s are used as the major constraints for structure generation.  During structure generation, NMR-SAMS computes the number of violations of BC’s for the current substructure/structure.  If the actual number of violations of a substructure/structure is less than the upper limit of the allowed number of violations, then the substructure/structure is retained; otherwise it is rejected.  The BC’s are also used by some advanced heuristic methods for acceleration of the structure generation process (see Section 7.4).
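The distinction between fixed bonds and ambiguous BC’s, and the counting of violations, can be sketched as follows.  This is an illustration under stated assumptions, not the NMR-SAMS implementation: the class and function names are hypothetical, the bond separations of a candidate structure are assumed to be supplied as a simple lookup table, and the retention test follows the wording of the text literally (strictly fewer violations than the limit).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BondConstraint:
    atoms_y: tuple
    atoms_x: tuple
    min_bond: int
    max_bond: int
    min_nsbc: int
    max_nsbc: int


def is_fixed_bond(bc):
    """Unambiguous BC: two atoms, one-bond separation, minNSBC = maxNSBC = 1."""
    return (len(bc.atoms_y) == 1 and len(bc.atoms_x) == 1
            and bc.min_bond == bc.max_bond == 1
            and bc.min_nsbc == bc.max_nsbc == 1)


def is_violated(bc, separation):
    """separation maps (atom_y, atom_x) to the bond count in a candidate structure.

    Pairs missing from the table are treated as not satisfying the BC
    (an assumption made for this sketch).
    """
    satisfied = 0
    for y in bc.atoms_y:
        for x in bc.atoms_x:
            n = separation.get((y, x))
            if n is not None and bc.min_bond <= n <= bc.max_bond:
                satisfied += 1
    return not (bc.min_nsbc <= satisfied <= bc.max_nsbc)


def count_violations(bcs, separation):
    """Number of BC's that the candidate structure fails to satisfy."""
    return sum(1 for bc in bcs if is_violated(bc, separation))


def is_retained(n_violations, upper_limit):
    # Literal reading of the text; the exact comparison used by the
    # program is not confirmed here.
    return n_violations < upper_limit


hmqc = BondConstraint((5,), (5,), 1, 1, 1, 1)        # treated as a fixed bond
hmbc = BondConstraint((10,), (17, 18), 2, 3, 1, 2)   # ambiguous BC
```

With the ambiguous HMBC constraint above, a candidate in which C-10 and H-17 are three bonds apart satisfies the BC even if C-10 and H-18 are five bonds apart, because minNSBC requires only one satisfying pair.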

3.5 Use of Chemical Shifts And Peak Multiplicities

NMR-SAMS uses chemical shifts as the labels of heavy atoms, so that 2D NMR-derived correlation information can be used as bond constraints on specific atoms.  This is also the reason why a generated structure always has unequivocal 1H and 13C resonance assignments.

 

13C chemical shifts are also used to evaluate the intermediate structures/substructures produced during the structure generation process.  A knowledge base consisting of a correlation table of substructure and 13C chemical shift (δ) range is used for predicting 13C chemical shift ranges.  Each of the substructures consists of the central carbon atom (which is being considered), its attached bonds, and the first layer of its neighboring atoms (the outward bonds of these atoms are not considered).  This is referred to as a carbon-centered single-spherical substructure (CCSS).  Currently, this table consists of the 13C chemical shift ranges of around 93 CCSSs composed of C, N, O, and other common elements that have been adapted from the literature.  The correlation table is stored as an ASCII file, chemical_shifts.def (see Appendix III), with the code for each CCSS and its expected minimum and maximum 13C chemical shift.  This file can be customized by the user, and is read when NMR-SAMS is started.

 

During structure generation, whenever a carbon atom has a complete CCSS (i.e., its immediate neighbors are known), its expected chemical shift range is derived from the knowledge base and compared with the observed 13C chemical shift of the central carbon.  If the observed shift falls within this range, the substructure is accepted; otherwise it is discarded.  If the CCSS is not defined in the knowledge base table, the test is assumed to have been passed, and the undefined CCSSs are reported after the structure generation has been completed.  As the CCSSs cover only very limited structural features, their chemical shift ranges are very broad.  Thus in NMR-SAMS, 13C chemical shifts act as a much looser constraint on the structure generation than the 2D NMR connectivities.  Hence it is very important to include as much correlation information as possible for efficient structure generation.  Sometimes the correct structure could be overlooked if the molecule has carbons that show unusual chemical shifts.  In such cases, it is recommended that the user broaden the predicted chemical shift ranges by specifying an extra tolerance (for details, refer to the description of the parameter ADD_C13_RNG in Appendix IV).
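The 13C shift test described above amounts to a range lookup with an optional tolerance.  The sketch below is only an illustration: the CCSS codes and shift ranges in it are invented placeholders (the real entries are in chemical_shifts.def, whose format is described in Appendix III), the function name is hypothetical, and applying the extra tolerance symmetrically to both ends of the range is an assumption about how ADD_C13_RNG behaves.

```python
def passes_c13_test(observed_shift, ccss_code, shift_table, extra_tolerance=0.0):
    """Return (passed, undefined) for one carbon with a complete CCSS.

    shift_table maps a CCSS code to (min_shift, max_shift) in ppm.
    An undefined CCSS passes the test and is flagged so that it can be
    reported after structure generation.
    """
    shift_range = shift_table.get(ccss_code)
    if shift_range is None:
        return True, True
    low, high = shift_range
    # Assumption: the extra tolerance widens the range on both sides.
    passed = (low - extra_tolerance) <= observed_shift <= (high + extra_tolerance)
    return passed, False


# Placeholder table; codes and ranges are made up for this example.
example_table = {"CCSS_A": (10.0, 60.0), "CCSS_B": (100.0, 160.0)}

result_in_range = passes_c13_test(35.0, "CCSS_A", example_table)
result_out_of_range = passes_c13_test(62.0, "CCSS_A", example_table)
result_with_tolerance = passes_c13_test(62.0, "CCSS_A", example_table, extra_tolerance=5.0)
result_undefined = passes_c13_test(35.0, "CCSS_Z", example_table)
```

The third call shows why broadening the ranges can rescue a correct structure whose carbon resonates slightly outside the tabulated range.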

 

13C peak multiplicities play an important role in determining the number of attached protons of heavy atoms (i.e., the building blocks).  So it is recommended to use DEPT (or INEPT, APT) spectra to obtain complete 13C multiplicity information.

 

In the current version, 1H chemical shifts are not used to evaluate substructures.  1H peak multiplicities are used to limit the neighboring atoms of the concerned atom. (For details refer to the description about H1MULT_FLAG in Appendix IV.)

3.6 Structure Generation 

During structure generation NMR-SAMS searches all possible ways to assemble the structural building blocks into complete structures.  Within some allowance for the violation of constraints, the generated structures are consistent with all of the available spectral data and chemical constraints. 

 

The efficiency of structure generation is judged by the computation time, the quality of the structures generated, and the number of structures generated.  Because it is a combinatorial problem, structure generation is usually the most time-consuming step.  “Combinatorial explosion” was the major bottleneck in early attempts at automated structure elucidation.  NMR-SAMS provides novel heuristic search algorithms that reorder the solution space based on bond constraints, and search only the most probable portion of this space for candidate structures.  These methods exponentially reduce the CPU time for structure generation and hence make it practical for complex molecules.  Moreover, the user has full control over the use of these methods to perform optimized structure generation.  For example, by modifying a few parameters, the user can extend the search space to a more complete search, or simply turn off the heuristic search methods to perform an exhaustive search.  On the other hand, the user can limit the search space for faster structure generation (see Section 7.4 and Appendix IV about the parameters GEN_FLAG, SAT_BC_RATE and N_FBX_STEP).

 

For relatively small molecules (e.g. < 30 heavy atoms) with reasonably clean and sufficient spectral data, this process is usually completed in seconds or minutes.  In most cases the correct structure is generated either uniquely or along with a few alternatives.  For more complex problems (bigger molecules and insufficient spectral constraints), structure generation can be completed in a reasonable computation time if adequate user-defined constraints are included.   

 

The candidate structures generated by NMR-SAMS include complete structures and optionally, substructures.  A complete structure is defined as one having no unsatisfied free bonds.  In the case of partial structure elucidation (see Section 7.1 for details), the chemically incomplete structure obtained is still referred to as a complete structure, because all of the free bonds are satisfied either by real bonds or dummy bonds.  During structure generation, the program enables the user to save the largest intermediate substructures.  The substructures are useful when the generation of complete structures is not possible due to errors in spectral data or other reasons, and they provide clues and hints for improving the input spectral data and completing the structure elucidation successfully.

3.7 User Intervention 

NMR-SAMS was developed to streamline and automate the structure elucidation process with minimal user intervention.  However, when the unknown molecule is large (e.g., more than 40 non-hydrogen atoms), or when insufficient connectivity information is available, user intervention is necessary to improve the efficiency of structure generation.  Currently the user can interact with the structure elucidation procedure in the following ways:

 

1.        Modify the control parameters for NMR interpretation and structure generation.  For example, the user can decide whether or not to use the “negative information” of DQF-COSY based on the spectral quality, and the user can also limit ring sizes to either 5 or 6-membered rings in the generated structure and discard structures containing other ring sizes.

2.        Modify the intermediate results in the MDF by using Edit/Master Data File.

3.        Supply structural building blocks by using Analysis/Edit Building Blocks if the MF is unknown.

4.        Supply known structural information as user-defined bond constraints. This is very important especially for heteroatoms that are either not observed or have sparse connectivity information in 2D NMR experiments.  Also, different spectral data, such as IR and UV, normally provide positive evidence of some known functional groups.  Using Analysis/User-defined Bond Constraints, the user can add as many known bonds as possible between the constituent atoms (see Section 7.2).  Using this feature, the user can also manually assemble the building blocks as a complete structure, or use a selected substructure (which was previously generated) as the starting point for the next structure generation.

5.        Supply known structural information as atom environment constraints (EC).  An EC defines the number of occurrences of a certain type of atom(s) as the immediate neighbor(s) of an atom under consideration (see Section 7.3).

6.        Propose a possible structure for the unknown and perform resonance assignment.  This way the user can verify user-proposed structures and complete the structure elucidation.

7.        Modify the results of resonance assignment of a target structure using Analysis/User-Defined Assignment.

3.8 Control Parameters

The parameter file (.par file) stores the parameters for controlling spectral interpretation, for setting up ACMX, and for structure generation.  All of the parameters can be modified by selecting Edit/Parameters/NMR Interpretation, Edit/Parameters/Set up ACMX, or Edit/Parameters/2D Structure Generation.  Default values are assigned to the parameters according to the nmrsams.ini file when a new working data set is opened.  The default values can be customized by editing the nmrsams.ini and nmrsamspersonal.ini files.  In most cases, the default parameters should be a good starting point for structure elucidation.  In the following chapters, a parameter is referred to by its name (e.g., GEN_FLAG); the corresponding titles in the dialog boxes and details about the usage of the parameters are described in Appendix IV.


Chapter 4

Working Data Set

4.1 Overview

This chapter describes the operations related to the data files used by NMR-SAMS.  During each session of structure elucidation, NMR-SAMS works with a working data set, which consists of five text files with the same root name but different extensions, together with a lock file.  For example, if the root name is Q-2-test, then the working data set consists of the following files:

 

·         A master data file (MDF), Q-2-test.mdf, where all of the intermediate and final results are stored.  The user can view and edit this file by using Edit/Master Data File (See Appendix II).

·         A parameter file, Q-2-test.par, where the control parameters used for the data interpretation and structure generation are stored. The user can access the parameters by using the commands in the pull-right menu of Edit/Parameters (see Appendix IV).

·         An NMR data file, Q-2-test.nmr, where the NMR data converted from the SpecMan peaks table are stored.  The user can view and edit this file by using Edit/NMR Data File (see Appendix I).

·         A log file, Q-2-test.log, where most of the information, warning, and error messages produced during the analysis are stored.  The user can view the log file by using Edit/Log File.

·         A structure file, Q-2-test.str, where the atom-atom connection table of the generated structures and their resonance assignments are stored.  The user can display the structures by using Display/Generated Structures (see Chapter 10).

·         A lock file, Q-2-test.lock, which is used to prevent two users from opening the same data set simultaneously.
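For scripts that archive or copy a working data set, the file names listed above can be derived from the root name.  The following Python sketch is only a convenience illustration; the function name and the dictionary of descriptions are hypothetical and are not part of NMR-SAMS.

```python
from pathlib import Path

# Extensions of a working data set, as listed above.
WORKING_SET_EXTENSIONS = {
    "mdf": "master data file",
    "par": "parameter file",
    "nmr": "NMR data file",
    "log": "log file",
    "str": "structure file",
    "lock": "lock file",
}


def working_data_set_files(root_name, directory="."):
    """Map each extension to the expected path of the corresponding file."""
    base = Path(directory)
    # The root name may contain hyphens or dots, so the extension is
    # appended explicitly instead of using Path.with_suffix().
    return {ext: base / f"{root_name}.{ext}" for ext in WORKING_SET_EXTENSIONS}


files = working_data_set_files("Q-2-test")
missing = [path.name for ext, path in files.items()
           if ext != "lock" and not path.exists()]
```

The list named missing then holds the data files that are not present in the chosen directory; the lock file is skipped because it is expected only while the data set is open.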

4.2 Open An Existing Working Data Set

Command: File/Open.

Description:  This procedure is used to open an existing working data set.  An existing working data set stores the data and results of the last session of structure elucidation with NMR-SAMS.  Opening an existing working data set allows the user to continue from where the dataset had last been saved.  After selecting File/Open, a file browser is displayed, listing the master data files in the current directory.  If necessary, the user can switch to the desired directory, and then click the desired master data file name.  The selected file name appears in the Open MDF field.  Next click OK, and the working data set is then opened for use.

 

 

After a working data file has been opened, the following message will appear:

 

 

The message prompts the user to confirm removal of old log messages from the previous session.  To remove the old log messages, select ‘Yes’; to retain them, select ‘No’.

 

The status window displays the current state of structure elucidation.  It lists the NMR data files that are being used.  It also lists the steps that have been completed, and provides tips to the user as to what steps need to be done next.  The structural results, such as building blocks or candidate structures, are displayed in the main graphics window (see Chapter 10).

 

Note:  If another working data set is opened before the current modified working data set has been saved, NMR-SAMS will prompt the user to save the changes.

 

If the user wants to discard the changes that have been made to the current working data set without exiting the program, re-open the dataset and click ‘Yes’ to the following message:

 

 

Then it is possible to start from the point at which the working data set was last saved.  Note that if a data set that is locked by another user is selected, the following warning message will appear:

 

 

Click 'Yes' to open the data file anyway, or click 'No' to cancel.  Note that if 'Yes' is selected, two users may end up modifying the same data set, and problems may arise.

4.3 Opening A New Working Data Set

Command: File/New.

Description: This procedure is used to create a new working data set. When dealing with a new structure problem, the user must open a new working data set.  The user can open a totally new working data set, or open one starting from an existing NMR data file that has already been prepared.

 

To open a totally new working data set, choose File/New. In the displayed file browser, make sure to select the file type as 'Completely New Dataset (*.mdf).'  Switch to the desired directory if necessary, and type a root name for the new working data set.  The extension *.mdf will be automatically added.

 

 

After clicking 'Open', NMR-SAMS creates the *.mdf, *.par, *.nmr, *.log and *.str files.  All files, except for the parameter file (*.par), will be empty.

 

Next, NMR-SAMS prompts the user to input the molecular formula (MF) of the sample as shown below:

 

 

Input the molecular formula into the dialog box (see Section 4.4 for more information about inputting the molecular formula).

 

To open a new working data set starting with an existing NMR file, select the file type as 'Existing NMR File (*.nmr)' in the file browser.  Switch to the desired directory if necessary, and click the desired .nmr file.  Next, click 'OK' and a new working set is created with the selected .nmr file. 

 

Note: If the user selects the filename of an existing data set, NMR-SAMS will warn the user about existing files with the same root name, as shown below:

 

 

 

Click 'Yes' and the program will overwrite the existing files (except the .nmr file if starting from an existing NMR data file).

 

If the user wants to use the existing .nmr file, but doesn't want to overwrite the existing files, click 'No' to cancel this dialog box.  Then, make a copy of the .nmr file with a new root name and reopen the newly named .nmr file. 

4.4 Input Molecular Formula

Command: File/Input Molecular Formula.

Description:  This procedure is used to define the molecular formula of the sample.  Normally this command is used when the user wants to change the MF, since NMR-SAMS always prompts the user to enter the MF when a new working data set is first opened (see Section 4.3), as shown below:

 

 

Note that each element symbol must be typed with the first letter in upper case and the second one, if any, in lower case.  The user can specify the valence of an atom in parentheses following the element symbol (e.g., C10H12N(V)N2S(VI)O8).  If the valence is not specified, the most common chemical valence is adopted for any element with multiple valences (e.g., valences of 3 and 2 are adopted for N and S, respectively).  The user can also change the valences later by selecting Analysis/User-Defined Building Blocks.

 

If the exact MF is unknown, enter the closest possible formula or type 'UNKNOWN'.  In any case, the user can modify the elemental composition of the molecule by using Analysis/User-defined Building Blocks later (see Section 6.3).

 

Once a molecular formula has been entered, it is interpreted and a dialog box appears displaying the standardized MF, the molecular weight, and the double bond equivalence (DBE), as shown below:

 

 

Two records are written into the MDF. The first record starts with the keyword “MF:” and contains the standardized MF:

MF: C30H48O3

The second record starts with the keyword “ATOMS:”.  Following this are the molecular weight and the degree of unsaturation (or double bond equivalence) on the same line.  The second line is a brief description of the entries in each of the remaining lines.  Each remaining line describes one constituent heavy atom and consists of the ID, the element symbol, the chemical valence, the minimum and maximum numbers of attached protons, the minimum and maximum numbers of attached double bonds, and the minimum and maximum numbers of attached triple bonds, respectively.  The constituent heavy atoms are listed with carbon first, and the remaining elements in alphabetical order of their element symbols.

 

ATOMS:  (MW = 456.7074, DBE = 7.0)                    

#Atom; Element; Valence; Min. & max. attached H; Min. & max. double bonds; Min. & max. triple bonds

# 1.  C 4   0 3   0 2  0 1

# 2.  C 4   0 3   0 2  0 1

# 3.  C 4   0 3   0 2  0 1

      .

      .

      .

#30.   C 4   0 3   0 2  0 1

#31.   O 2   0 1   0 1  0 0

#32.   O 2   0 1   0 1  0 0

#33.   O 2   0 1   0 1  0 0
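The MW and DBE values in the ATOMS record above can be reproduced with a short calculation.  The Python sketch below is not the NMR-SAMS interpreter: it handles only C, H, N, O, and S, the default valences follow the text (3 for N, 2 for S), the atomic masses are rounded standard values chosen for this example, and the function names are hypothetical.  The DBE is computed from the general valence formula, DBE = 1 + Σ n_i (v_i - 2) / 2, where n_i and v_i are the count and valence of each element.

```python
import re

# Default (most common) valences; an uncommon valence is given as N(V), S(VI), etc.
DEFAULT_VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2, "S": 2}
# Rounded atomic masses (an assumption; the values used by the program
# are not stated in this guide).
ATOMIC_MASS = {"C": 12.011, "H": 1.0079, "N": 14.0067, "O": 15.9994, "S": 32.066}
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6, "VII": 7}

# Element symbol, optional Roman-numeral valence in parentheses, optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(?:\((I{1,3}|IV|VI{0,2})\))?(\d*)")


def parse_formula(mf):
    """Return a list of (element, valence, count) tuples for a formula string."""
    atoms, pos = [], 0
    for m in TOKEN.finditer(mf):
        if m.start() != pos:          # unparsed characters in between
            break
        element, roman, count = m.groups()
        if element not in DEFAULT_VALENCE:
            raise ValueError(f"element not covered by this sketch: {element}")
        valence = ROMAN[roman] if roman else DEFAULT_VALENCE[element]
        atoms.append((element, valence, int(count) if count else 1))
        pos = m.end()
    if pos != len(mf) or not atoms:
        raise ValueError(f"cannot interpret formula: {mf!r}")
    return atoms


def molecular_weight(atoms):
    return sum(count * ATOMIC_MASS[element] for element, _, count in atoms)


def double_bond_equivalence(atoms):
    return 1 + sum(count * (valence - 2) for _, valence, count in atoms) / 2


atoms = parse_formula("C30H48O3")
mw = molecular_weight(atoms)
dbe = double_bond_equivalence(atoms)
```

For C30H48O3 this gives a DBE of 7.0, in agreement with the record above.  For a nitro compound entered as C6H5N(V)O2, the pentavalent nitrogen raises the DBE to 6.0, which accounts for the two N=O double bonds.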

 

Note: When an atom has multiple valences, the most common valence is adopted by default.  For example, the valence 3 is always adopted for N.  However, the user can specify an uncommon valence while inputting the MF.  If there is a -NO2 group in the molecule, input the MF containing an “N(V)” (e.g., C6H5N(V)O2).  Modifying the valence manually in the .mdf file is not recommended, because whenever Analysis/Building Blocks is selected, the MF will be re-interpreted and the previous changes will be overwritten.

4.5 Save A Working Data Set

Command: File/Save.

Description:  This command allows NMR-SAMS to update the working data set with the current state of structure elucidation.  The user will be prompted to save changes before exiting the program or opening another working data set.

4.6 Save A Working Data Set Under A Different Name

Command: File/Save As.

Description:  This command allows NMR-SAMS to save the current state of structure elucidation in a working data set with a different root name.  After selecting File/Save As, the following file browser is displayed.  Switch to the desired directory (if necessary), type the new root name, and then click OK.

 

4.7 Exiting NMR-SAMS

Command: File/Exit.

Description:  This command allows the user to exit NMR-SAMS.  If changes have been made to any of the three data files (*.nmr, *.mdf, or *.par), and those changes have not been saved, NMR-SAMS will prompt the user to save them before exiting the program:

 

 

If 'Yes' is clicked, the changes will be saved before exiting the program.  If 'No' is clicked, the changes will be discarded and the program will exit.  If 'Cancel' is selected, the Exit command is ignored and the program keeps running.


Chapter 5

Input of NMR Spectral Data

5.1 Overview

It is important to generate a clean and reliable set of peak lists from different NMR experiments before using them in NMR-SAMS.  SpecMan provides several advanced and intelligent peak-picking tools to perform fast and reliable peak picking.  For details regarding peak picking, refer to the SpecMan User's Guide.  Since SpecMan can independently perform peak picking and peaks table conversion, the user can either perform both steps in SpecMan, or perform peak picking in SpecMan and then peaks table conversion in NMR-SAMS.  Either way, the ability to perform consistency checking during the conversion process will help the user to find potential errors in the peak picking results. 

 

This chapter describes how to prepare 1D and 2D NMR spectral data as input for NMR-SAMS (for details about the NMR Data File format, see Appendix I).  It is assumed that the peak picking has already been performed in SpecMan.  The peak tables from SpecMan are then converted into the NMR-SAMS format by selecting from the following pull-right options of 'Create NMR Data File' from the File menu as shown below:

5.2 Conversion of SpecMan 1H Peak List

Command: File/Create NMR Data File/H1.

Description: In this procedure, the SpecMan 1H peaks table is converted into NMR-SAMS format.  First, the following dialog box is displayed, which prompts the user to enter the filename of the 1H peaks table from SpecMan.

 

Click 'Browse' to locate the peaks table file, and then click OK.  An information dialog box displays the number of 1H peaks that have been converted:

 

 

In the current version of SpecMan, all 1H peak multiplicities are marked as unknown (u), by default.  Therefore, NMR-SAMS will prompt the user to supply the 1H multiplicity for the peaks (referring to their splitting patterns). As shown in Fig. 5.1, if the multiplicities of all or some of the 1H peaks are known, select Edit/NMR Data File to open the NMR data file and replace the unknown multiplicity (represented as “u”) by one of the following symbols recognizable to NMR-SAMS:

 

s: singlet, d: doublet, t: triplet, q: quartet, m: other multiplet.  If the multiplicity is unknown, leave it as unknown (u).

 

NMR-SAMS uses 1H multiplicity information to eliminate inappropriate bonds while setting up ACMX. For additional details, refer to the usage of parameter H1_MULT_FLAG (in Appendix IV).

 

 

 

 

 

 

Figure 5.1. Running NMR-SAMS and SpecMan side-by-side provides a convenient way to verify and edit the 1D peaks converted from SpecMan peaks table. Left (NMR-SAMS): select Edit/NMR Data File to open the .nmr file.  Right (SpecMan): Open the 1D spectrum and load the 1D peaks table. From the comment field of a converted peak, the ID (#32) of the original peak is found. By clicking the corresponding entry in the peaks table, the 1D peak (#32, shown in cyan) is highlighted in the spectrum so that the user can see and recognize the multiplicity of this peak before modifying the .nmr file.

Possible Errors: Generally NMR-SAMS crosschecks the converted 1H peak list against the MF (if known) and alerts the user of any potential conflicts.  The following situations will be reported when there is a conflict:

·         If the multiplicity information is unknown for more than three fourths of the peaks, a warning message prompts the user to supply this information if possible.

·         If the number of 1H peaks exceeds the constituent protons, an error message prompts the user to correct either the peak picking result or the MF.
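The two checks listed above can be expressed compactly.  The following Python sketch only mirrors the conditions as they are worded in this section; the function name and the message texts are hypothetical, and the actual messages issued by NMR-SAMS may differ.

```python
def check_h1_peaks(multiplicities, n_protons=None):
    """Return a list of (severity, message) tuples for a converted 1H peak list.

    multiplicities: one symbol per peak (s, d, t, q, m, or u for unknown).
    n_protons: number of constituent protons from the MF, or None if the MF
    is unknown, in which case the peak-count check is skipped.
    """
    messages = []
    n_peaks = len(multiplicities)
    n_unknown = sum(1 for mult in multiplicities if mult == "u")

    # More than three fourths of the peaks have unknown multiplicity.
    if n_peaks > 0 and 4 * n_unknown > 3 * n_peaks:
        messages.append(("warning", "multiplicity is unknown for most 1H peaks"))

    # More 1H peaks than protons in the molecular formula.
    if n_protons is not None and n_peaks > n_protons:
        messages.append(("error", "number of 1H peaks exceeds the number of protons"))

    return messages


report = check_h1_peaks(["s", "s", "u", "u", "u", "u", "u", "u"], n_protons=48)
```

The integer comparison 4 * n_unknown > 3 * n_peaks avoids rounding problems when testing the three-fourths threshold.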

 

Results:  After conversion, the .nmr file is updated with information regarding proton peaks starting with the keyword “H1:”.  Following is a transcript of the converted 1H peaks:

H1: C:\Spectrum2001\Data\NMR-SAMS\Q-2-test/h1.pks

 #1. 4.930 s   ;1

 #2. 4.755 s   ;2

 #3. 3.509 u   ;3

       .

       .

       .

 #32. 0.818 s   ;32

 #33. 0.811 u   ;33

The first line beginning with the keyword “H1:” indicates the start of 1H peak list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in the rest of the lines represent the following attributes of each 1H peak:

·         Peak ID, a serial number that uniquely identifies this peak.

·         Chemical shift of the peak in ppm values.

·         Multiplicity, designated as s (singlet), d (doublet), t (triplet), q (quartet), m (other multiplet) or u (unknown).  By default it is assigned as unknown. 

·         Comments, which are optional. The number in the comment field corresponds to the ID of the 1H peak in the SpecMan peaks table.

One or more spaces are used as a delimiter for all items except comments that are separated by a semicolon (;).   Items marked as optional can be omitted unless an item following them is included.  In such a case, the user must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a semicolon (;).  The peak list intensities and comments of the 1H peak list are not currently used by NMR-SAMS.
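For users who post-process the .nmr file with their own scripts, the peak line layout described above can be read with a short Python routine.  This is a minimal sketch based only on the transcript shown in this section; the function name parse_peak_line and the returned dictionary keys are hypothetical, and treating a missing multiplicity as “u” is an assumption drawn from the stated default.

```python
import re

# One peak line, e.g. " #32. 0.818 s   ;32".
# Assumed layout: #<id>. <shift> [<multiplicity>] [;<comment>]
PEAK_LINE = re.compile(
    r"^\s*#\s*(\d+)\.\s+(-?\d+(?:\.\d+)?)(?:\s+([sdtqmu]))?\s*(?:;(.*))?$"
)

def parse_peak_line(line):
    """Split one 1D peak line into ID, chemical shift, multiplicity, comment."""
    m = PEAK_LINE.match(line)
    if m is None:
        raise ValueError(f"not a peak line: {line!r}")
    peak_id, shift, mult, comment = m.groups()
    return {
        "id": int(peak_id),
        "shift_ppm": float(shift),
        "multiplicity": mult or "u",        # assumption: missing means unknown
        "comment": (comment or "").strip(),  # text after the semicolon
    }
```

The same sketch also reads the 13C peak lines of Section 5.3, which follow the same layout.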

Note: Whenever the user repeats a 1H peaks table conversion or modifies a converted peak list (using Edit/NMR Data File), the dependent 2D spectral data must also be reconverted.  For example, if a 1H peak is added to the converted 1H peak list, the user must reconvert the COSY, HMQC, HMBC, and NOESY data (if they had already been converted). Otherwise the added 1H peak will not be reflected in the 2D data.

5.3 Conversion of SpecMan 13C Peak List

Command: File/Create NMR Data File/C13 and DEPT.

Description: In this procedure the SpecMan 13C and DEPT/APT peak tables are converted into a peak list of 13C chemical shifts and multiplicities.  NMR-SAMS requires 13C multiplicity information for reliable structure elucidation, and in order to get the complete 13C multiplicity information, the user needs 13C, DEPT-90/APT-90 and DEPT-135/APT-135 experimental data.  However, NMR-SAMS provides a flexible way to derive the 13C multiplicity information from any combination of available experiments as described below:  

 

1.        13C Only. In the dialog box that appears, select ‘None’ for Peak Multiplicity Experiments and then click ‘Browse’ to find and select the SpecMan-created 13C Peaks Table, as shown below:

 

After clicking ‘OK’ NMR-SAMS updates the .nmr file with a list of 13C chemical shifts having unknown multiplicities as shown in the Results section below.  If the multiplicities of some peaks are known, the user can manually edit the .nmr file to supply this information.

2.        13C and DEPT.  In the dialog box that appears, click ‘Browse’ to enter the SpecMan-created 13C Peaks Table.  Then select ‘DEPT’ for Peak Multiplicity Experiments, and enter the peaks table filenames for DEPT-45 (optional), DEPT-90, and DEPT-135 experiments.  As mentioned before, all of the DEPT experiments are optional, so turn off the corresponding toggle if certain DEPT data has not been obtained.  Note that ignoring some DEPT experiments (except for DEPT-45) could leave some peaks with unknown multiplicities.

 

 

Also enter a matching tolerance (in ppm) to match the 13C and DEPT peaks.  Upon clicking ‘OK’, NMR-SAMS will update the .nmr file with a list of 13C chemical shifts and derived multiplicities as shown in the Results section below. 

 

3.        13C and APT.  In the dialog box that appears, click ‘Browse’ to enter the SpecMan-created 13C Peaks Table.  Select ‘APT’ for Peak Multiplicity Experiments and then enter the peaks table filenames for APT-45, APT-90, and APT-135 experiments. As mentioned before, all of the APT experiments are optional, so turn off the corresponding toggle if certain APT data has not been obtained.  Note that ignoring some APT experiments (except for APT-45) could leave some peaks with unknown multiplicities.

 

 

Also enter a matching tolerance to match the 13C and APT peaks.  Upon clicking ‘OK’, NMR-SAMS will update the .nmr file with a list of 13C chemical shifts and derived multiplicities as shown in the Results section below. 

 

Possible Errors: During the conversion NMR-SAMS crosschecks the 13C peak list with the MF, and alerts the user of potential inconsistencies.  In such cases, the following general messages will be reported:

 

·         If there are more 13C peaks than the constituent carbon atoms, an error message will prompt the user to remove peak artifacts or correct the MF.

·         If there are fewer 13C peaks than the constituent carbon atoms, a warning message will prompt the user to resolve 13C peak overlap.  Define the overlapping peaks as individual peaks with slightly different chemical shifts by choosing Edit/NMR Data File and editing the NMR data file (it is usually possible to resolve such ambiguities by looking at the peak intensity and the HMQC spectrum, or by acquiring the spectrum at different conditions).  If the user is unable to resolve overlapping peaks (for example, in the case of a symmetric molecule, or due to severe overlap in a spectrum), then partial structure elucidation will be performed (see Section 7.1). 

·         If the multiplicity of one or more 13C peaks is unknown, a warning message will prompt the user to supply this information, if possible.  Lack of this information may result in multiple building block sets (see Section 6.2).

·         The number of carbon-attached protons (n_CH ) is calculated based on the 13C multiplicities.  If n_CH is greater than the number of constituent protons, an error message will prompt the user to correct either the multiplicity information or the MF.

·         When the number of 13C peaks is equal to that of the carbon atoms and all 13C multiplicities are known, the maximum number of heteroatom-attached protons (max_XH ) is calculated based on the valence of the constituent heteroatoms.  If (n_CH + max_XH) is smaller than the number of constituent protons, an error message will prompt the user to correct either the multiplicity information or the MF.
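The crosschecks listed above can be summarized in a short Python sketch.  It is illustrative only and is not the NMR-SAMS implementation: the function name crosscheck_c13 and the message strings are hypothetical, max_XH is supplied by the caller rather than derived from heteroatom valences, and counting a peak of unknown multiplicity as zero attached protons when computing n_CH is an assumption.

```python
# Protons attached to a carbon for each 13C multiplicity code
# (s = C, d = CH, t = CH2, q = CH3).
ATTACHED_H = {"s": 0, "d": 1, "t": 2, "q": 3}

def crosscheck_c13(multiplicities, n_carbons, n_protons, max_xh=0):
    """Return (severity, message) tuples mirroring the checks listed above.

    multiplicities: one code per 13C peak ('s', 'd', 't', 'q' or 'u').
    n_carbons, n_protons: atom counts taken from the MF.
    max_xh: maximum number of heteroatom-attached protons (caller-supplied).
    """
    issues = []
    n_peaks = len(multiplicities)
    if n_peaks > n_carbons:
        issues.append(("error", "more 13C peaks than carbon atoms"))
    elif n_peaks < n_carbons:
        issues.append(("warning", "fewer 13C peaks than carbon atoms"))
    unknown = multiplicities.count("u")
    if unknown:
        issues.append(("warning", f"{unknown} peak(s) with unknown multiplicity"))
    # n_CH: carbon-attached protons; unknown peaks contribute 0 (assumption).
    n_ch = sum(ATTACHED_H.get(m, 0) for m in multiplicities)
    if n_ch > n_protons:
        issues.append(("error", "n_CH exceeds the number of protons"))
    # Only meaningful when the peak list is complete and fully assigned.
    if n_peaks == n_carbons and not unknown and n_ch + max_xh < n_protons:
        issues.append(("error", "n_CH + max_XH is smaller than the number of protons"))
    return issues
```

For a hypothetical C2H4 input with two peaks both marked “q”, n_CH is 6, which exceeds the four protons of the MF, so the sketch reports the n_CH error.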

 

Results: After conversion, the .nmr file is updated with information regarding the 13C peaks, starting with the keyword “C13:”.  The following is a transcript of a converted 13C peak list (note that if DEPT or APT data are not used, the multiplicity will be unknown, “u”, for all peaks):

 

C13: C:\Spectrum2001\Data\NMR-SAMS\Q-2-test\c13.pks

 #1. 178.822 s ;1

 #2. 151.323 s ;2

 #3. 109.931 t ;3

       .

       .

       .

 #28. 16.340 q ;28

 #29. 14.929 q ;29

The first line beginning with the keyword “C13:” indicates the start of the 13C  peak list.  Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of the 13C peak:

·         Peak ID, a serial number that uniquely identifies this peak.

·         Chemical shift of the peak in ppm values.

·         Multiplicity, designated as s (singlet, C), d (doublet, CH), t (triplet, CH2), q (quartet, CH3), or u (unknown).

·         Comments, which are optional. The number in the comment field corresponds to the ID of the 13C peak in the SpecMan peaks table.

One or more spaces are used as a delimiter for all items except comments that are separated by a semicolon (;).   Items marked as optional can be omitted unless an item following them is included.  In such a case, the user must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a semicolon (;).  The peak list intensities and comments of the 13C peak list are not currently used by NMR-SAMS.

Note: Whenever the user repeats a 13C peaks table conversion or modifies a converted peak list (using Edit/NMR Data File), the dependent 2D spectral data must also be reconverted.  For example, if a 13C peak is added to the converted 13C peak list, the user must reconvert the HMQC, HMBC, and INADEQUATE data (if they had already been converted). Otherwise the added 13C peak will not be reflected in the 2D data.
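The reconversion rules stated in the notes of Sections 5.2 and 5.3 can be kept track of with a small lookup table.  The Python sketch below is only a bookkeeping aid under the dependencies named in those notes; the names DEPENDENTS and to_reconvert are hypothetical and are not part of NMR-SAMS.

```python
# 2D data sets that depend on each 1D peak list, as named in the notes
# of Sections 5.2 (1H) and 5.3 (13C).
DEPENDENTS = {
    "H1": ["COSY", "HMQC", "HMBC", "NOESY"],
    "C13": ["HMQC", "HMBC", "INADEQUATE"],
}

def to_reconvert(changed, converted):
    """List the already-converted 2D data sets that must be reconverted.

    changed: 1D peak lists that were reconverted or edited ("H1", "C13").
    converted: set of 2D data sets already present in the .nmr file.
    """
    needed = []
    for source in changed:
        for experiment in DEPENDENTS[source]:
            if experiment in converted and experiment not in needed:
                needed.append(experiment)
    return needed
```

For example, after editing the 13C peak list of a data set that already contains COSY, HMQC, and HMBC data, the sketch returns HMQC and HMBC.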

As shown in Fig. 5.1, NMR-SAMS and SpecMan can be used side-by-side to verify the peak picking results of peaks mentioned in warning or error dialog boxes.

5.4 Conversion of SpecMan DQF-COSY Peaks Table

Command: File/Create NMR Data File/COSY.

Description:  In this procedure NMR-SAMS converts the DQF-COSY cross peak coordinates into connectivities between 1D 1H peaks.  As illustrated in Fig. 5.2, the coordinates of the peak center (shown as a cross) are matched to the 1D chemical shifts (shown as dotted lines).  The 1D peaks that match the peak center within the tolerances (±D2 and ±D1 in F2 and F1 dimensions, respectively) are taken as the correlated 1D peaks.  If more than one 1D peak (such as 1H peaks a and b in Fig. 5.2) matches the cross peak center in a certain dimension, then all are treated as possible correlated 1D peaks in that dimension. Such connectivity is called an ambiguous connectivity and NMR-SAMS will internally consider all possible correlations for an ambiguous connectivity (for more details about ambiguous connectivity, see the example in Section 3.4).

Figure 5.2. Conversion of COSY cross peak coordinates into a correlation between the 1D 1H peaks.  The cross (+) denotes the cross peak center.  The dotted lines denote the chemical shifts of the three 1D 1H peaks, a, b, and c, respectively.  D1 and D2 are the matching tolerances along F1 and F2, respectively.  All three peaks, which match the cross peak center within the tolerances, are taken as correlated 1D peaks.
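The matching step can be illustrated with a minimal Python sketch.  It is not the NMR-SAMS code: the function names match_1d_peaks and convert_cross_peak are hypothetical, and the peak IDs and shifts in the usage example are invented for illustration.  The default tolerance of 0.005 ppm and the limit of six matching peaks per dimension are the values given later in this section.

```python
MAX_MATCHES = 6  # a cross peak matching more than six 1D peaks is discarded

def match_1d_peaks(coord, shifts, tol):
    """Return the IDs of all 1D peaks within +/- tol (ppm) of coord.

    shifts: dict mapping 1D peak ID -> chemical shift in ppm.
    """
    return sorted(pid for pid, s in shifts.items() if abs(s - coord) <= tol)

def convert_cross_peak(f2, f1, shifts, tol_f2=0.005, tol_f1=0.005):
    """Convert one cross peak center (f2, f1) into correlated 1D peak IDs.

    Returns (ids_f2, ids_f1), or None if the cross peak is discarded.
    More than one ID on a side means the connectivity is ambiguous.
    """
    ids_f2 = match_1d_peaks(f2, shifts, tol_f2)
    ids_f1 = match_1d_peaks(f1, shifts, tol_f1)
    for ids in (ids_f2, ids_f1):
        if not ids or len(ids) > MAX_MATCHES:
            return None  # no match, or too many matches, in this dimension
    return ids_f2, ids_f1
```

With hypothetical shifts {1: 4.930, 2: 4.755, 7: 2.235, 8: 2.232}, a cross peak centered at F2 = 2.234 ppm and F1 = 4.931 ppm matches peaks 7 and 8 along F2 and peak 1 along F1, giving an ambiguous connectivity.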

Upon selecting File/Create NMR Data File/COSY, NMR-SAMS opens a dialog box that prompts the user to enter the filename of the COSY peaks table.  The user is also prompted to input matching tolerances along the X (i.e. F2) and Y (i.e. F1) dimensions.

 

 

The default value for the matching tolerance is 0.005 ppm for both dimensions.  It is important to select an appropriate tolerance, since too large a tolerance could introduce undesired ambiguity, and too small a tolerance could cause some real peaks to be discarded.  To choose a suitable tolerance, the following factors must be considered:

 

·         Accuracy of the peak picking.  The grid-intelligence-based peak picking of SpecMan provides a very convenient way to verify the accuracy of peak picking by comparing the expected locations of the cross peaks with the picked peaks (see SpecMan's User’s Guide). If a peak list was carefully verified with this method, it is acceptable to start with a small tolerance.

·         Alignment between 1D 1H and the COSY spectra.  SpecMan provides convenient tools to correct frequency offset between the 1D and 2D spectra.  Sometimes different experimental conditions introduce small chemical shift differences between 1D and 2D resonances.  To further correct the differences due to sample conditions, the user can utilize the grid-intelligence-based peak picking method of SpecMan.  If these corrections have been applied, it is acceptable to start with a small tolerance.

 

Possible Errors: During the peak table conversion, depending on the situation, NMR-SAMS may prompt the following error/ warning messages:

·         If the X or Y coordinate of a cross peak does not match any 1D 1H peak within the matching tolerance, the cross peak will be discarded.  When this message appears, the user should verify this peak and check if it is an artifact.  If it is not an artifact, then either its center has not been picked accurately, or the tolerance was too small.  Click 'Cancel' to stop the conversion process and try refining the peak picking results or repeating the conversion with a larger matching tolerance.

·         If the X or Y coordinate of a cross peak matches more than one 1D 1H peak within the matching tolerance, then an ambiguous correlation is obtained.  The user can either click 'Cancel' to stop the process and then try a smaller tolerance to reduce ambiguities, or the user can click 'OK to All' to let the conversion finish and then select Edit/NMR Data File to manually remove the undesired ambiguities in the .nmr file.  Note that although NMR-SAMS can use ambiguous correlation information, too many ambiguous correlations will undermine the efficiency of the subsequent structure generation.   

·         If the X or Y coordinate of a cross peak matches more than six 1D 1H peaks within the matching tolerance, the peak will be discarded.  In such a case, the user can either click 'Yes' (or 'Yes to All') to go on without that peak, or click 'No' to define a reduced matching tolerance and repeat the process.  The user can also click 'Cancel' to stop the process, merge the very close 1D 1H peaks into a degenerate peak in the SpecMan 1H peaks table (see Section 5.2), and then reconvert the DQF-COSY peaks table.

 

Tips: As shown in Fig. 5.3, NMR-SAMS and SpecMan can be utilized side-by-side to verify the original peak picking results of peaks mentioned in warning or error dialog boxes.  This is also useful when the user edits the .nmr file using Edit/NMR Data File.

 

 

Figure 5.3. Running NMR-SAMS and SpecMan side-by-side provides a convenient way to verify and edit the 2D peaks during peaks table conversion.  Left (NMR-SAMS): a dialog box indicates that cross peak #33 is discarded by NMR-SAMS.  Right (SpecMan): Open the DQF-COSY spectrum and load the 2D peaks table. By clicking the corresponding entry in the peaks table, cross peak #33 is highlighted in the spectrum.  This peak was discarded because it is located too far away from the grid center.  If necessary, correct this peak by moving it closer to the grid intersection and then save the refined peaks table and repeat the peaks table conversion.  This method can also be used when editing the .nmr file to remove undesired ambiguities and to mark long-range coupled peaks.

For COSY and other homonuclear spectra, NMR-SAMS discards the diagonal peaks and merges symmetric peaks.  This is not done when ambiguous correlation is involved.  For example, the following connectivities are retained:

(10 - 10 11) 3   0.00   0.60

(8 - 9 10)   3   0.00   0.60

(8 - 9)      3   0.00   0.60

 

The first connectivity may arise from either a diagonal peak or a near-diagonal peak. The latter two, converted from two symmetric peaks, do not have exactly the same correlated 1H peaks so they are not merged.
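The behavior described above can be sketched in Python as follows.  This is an interpretation of the example, not the NMR-SAMS algorithm: the function name merge_symmetric is hypothetical, and the rules that only an unambiguous self-correlation is discarded and that two connectivities are merged only when their sets of correlated peaks are identical are assumptions inferred from the three retained connectivities.  The reliability values 0.60 (single peak) and 0.84 (two symmetric peaks) are those assigned by NMR-SAMS to converted COSY connectivities.

```python
def merge_symmetric(connectivities, single=0.60, merged=0.84):
    """Drop diagonal peaks and merge symmetric ones.

    connectivities: list of (ids_a, ids_b) pairs of 1D peak ID lists.
    Returns a list of (ids_a, ids_b, reliability) tuples.
    """
    counts = {}  # insertion-ordered: normalized pair -> number of cross peaks
    for ids_a, ids_b in connectivities:
        a, b = tuple(sorted(ids_a)), tuple(sorted(ids_b))
        if a == b and len(a) == 1:
            continue  # unambiguous diagonal peak: discard
        # Order-independent key so that (8 - 9) and (9 - 8) coincide.
        key = tuple(sorted((a, b)))
        counts[key] = counts.get(key, 0) + 1
    return [(k[0], k[1], merged if n > 1 else single) for k, n in counts.items()]
```

Applied to the three connectivities of the example, the sketch retains all of them with a reliability of 0.60.  It may be noted that 0.84 equals 1 − (1 − 0.60)², the value obtained if two symmetric peaks are treated as independent observations; this guide does not state how the value is derived.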

 

For converted COSY connectivity, the intensity level is assigned the value 3 (i.e., strong).  The J-coupling constant is assigned 0.0 (i.e., unknown).  The reliability of the peak is assigned 0.60 if it has been converted from a single peak, or 0.84 if it has been converted from two symmetric ones.  Since the intensity level of a COSY peak is related to its structural interpretation, NMR-SAMS always prompts the user to mark the connectivities that may be due to long-range couplings after the conversion is finished, as shown in the dialog box below:

 

 

Peaks showing very low intensity or involving sp2-C could be long-range coupled.  When some peaks are suspected to be due to long-range coupling, select Edit/NMR Data File to edit the .nmr file.  Modify the intensity levels of such connectivities from “3” (i.e., strong) to “1” (i.e., weak), and save the changes.  As described in Fig. 5.3, the user can edit the .nmr file while looking at the original COSY cross peaks.

 

Note: A short-range coupling COSY connectivity is normally interpreted as 2 or 3 intervening bonds between the correlated protons.  If a long-range coupling is mistakenly interpreted as a short-range one, NMR-SAMS will not generate the correct structure.  A COSY connectivity marked as long-range coupling is usually interpreted as 3-5 intervening bonds between the correlated protons, which also covers the possibility of vicinal coupling.  It is safe to treat a short-range coupling peak as long-range coupling, but it may decrease the efficiency of structure generation.  The program always automatically detects geminal coupling (for details, see Section 6.4).

 

Results: After the conversion, the .nmr file is updated with information regarding the converted COSY connectivities starting with the keyword “COSY:”.  The following is a transcript of a converted COSY connectivity list:

 

COSY:

 #1. (1 - 2)         1      0.0    ;1+4

 #2. (1 - 12)        1      0.0    ;2+31

 #3. (2 - 12)        1      0.0    ;3+32

 #4. (3 - 7 8)       3      0.0    ;6+18

 #5. (3 - 13)        3      0.0    ;7+33

 #6. (3 - 18)        3      0.0    ;5+49

.

.

.

The first line beginning with the keyword “COSY:” indicates the start of the COSY connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of connectivity:

·         Connectivity ID, a serial number that uniquely identifies this connectivity.

·         ID's of the correlated 1D 1H peaks, shown in parenthesis.  For ambiguous correlations, the ID's of all possible 1D 1H peaks are included. 

·         Peak intensity level, classified as four types: strong (3), medium (2), weak (1), and unknown (0).  The default value is 3.  For a short-range coupled DQF-COSY connectivity, the intensity level should be either 3 or 2; for a long-range one, it should be 1.  If an intensity level of zero (0) is used, NMR-SAMS will expect an actual J-coupling value in the J-coupling field.

·         J-coupling constant; 0.0 is assigned by default, representing unknown. This item is optional if the peak intensity level is greater than 0.

·         Comments, which are optional and have a maximum length of 80 characters. The numbers in the comment field correspond to the ID's of the corresponding peaks in the SpecMan peaks table. For merged peaks these numbers are shown with a + sign.  Comments are ignored by NMR-SAMS.

One or more spaces are used as a delimiter for all items except comments that are separated by a semicolon (;).  Items marked as optional can be omitted unless an item following them is included.  In such a case, the user must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a semicolon (;).
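As an illustration of the layout described above, the following Python sketch reads one COSY connectivity line.  It is based only on the transcript in this section and is not part of NMR-SAMS; the function name parse_connectivity and the dictionary keys are hypothetical, and filling in the defaults (intensity 3, J-coupling 0.0) for omitted items follows the defaults stated above.

```python
import re

# One connectivity line, e.g. " #4. (3 - 7 8)       3      0.0    ;6+18".
CONN_LINE = re.compile(
    r"^\s*#\s*(\d+)\.\s*\(([^)]*)\)\s*([^;]*?)\s*(?:;(.*))?$"
)

def parse_connectivity(line):
    """Split a COSY connectivity line into its documented items."""
    m = CONN_LINE.match(line)
    if m is None:
        raise ValueError(f"not a connectivity line: {line!r}")
    conn_id, pair, fields, comment = m.groups()
    left, right = pair.split("-")       # the two sides of the correlation
    values = fields.split()             # intensity level, then J-coupling
    return {
        "id": int(conn_id),
        "peaks_a": [int(x) for x in left.split()],
        "peaks_b": [int(x) for x in right.split()],  # >1 ID means ambiguous
        "intensity": int(values[0]) if values else 3,
        "j_coupling": float(values[1]) if len(values) > 1 else 0.0,
        "comment": (comment or "").strip(),
    }
```

For connectivity #4 of the transcript, the sketch returns peak 3 on one side, peaks 7 and 8 on the other, intensity level 3, J-coupling 0.0, and the comment “6+18”.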

Note: The conversion of the COSY peaks table is dependent on the converted 1H peak list.  If the 1H peaks table is reconverted, or if the user modifies the converted 1H peak list, the COSY peaks table must be reconverted.

5.5 Conversion of SpecMan HMQC/HETCOR Peaks Table

Command: File/Create NMR Data File/HMQC (or HETCOR).

Description:  In this procedure NMR-SAMS converts the HMQC or HETCOR cross peak coordinates into connectivities between 1D 13C and 1H peaks.  In principle, the conversion process is very similar to the COSY conversion process described in Section 5.4. 

Although the process is similar, the user needs to be aware of the fact that correlated 13C peaks are always placed ahead of correlated 1H peaks in a converted connectivity, and this applies to both HMQC and HETCOR.

 

Unlike the other 2D spectral data, ambiguity is not allowed for HMQC connectivity.  NMR-SAMS will first search each 13C peak against an HMQC peak by matching 13C coordinates within the specified tolerance and then the HMQC peak that has been identified by the previous step is searched against all 1H peaks by matching its chemical shift within the specified tolerance.  The 1H peak with the best match is taken as the correlated 1H peak.  This process is repeated until each HMQC connectivity has exactly one correlated 13C-1H pair.
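The best-match rule can be sketched in Python as follows.  This is a simplified illustration, not the NMR-SAMS procedure: the function names closest_within and assign_hmqc are hypothetical, the sketch loops over HMQC peaks rather than over 13C peaks, and choosing the closest 13C peak when several fall within the tolerance is an assumption.  The tolerances are passed in by the caller.

```python
def closest_within(coord, shifts, tol):
    """Return the ID of the 1D peak closest to coord within +/- tol, or None.

    shifts: dict mapping 1D peak ID -> chemical shift in ppm.
    """
    best = None
    for pid, s in shifts.items():
        d = abs(s - coord)
        if d <= tol and (best is None or d < best[0]):
            best = (d, pid)
    return None if best is None else best[1]

def assign_hmqc(hmqc_peaks, c13_shifts, h1_shifts, tol_c, tol_h):
    """Assign exactly one (13C ID, 1H ID) pair to each matched HMQC peak.

    hmqc_peaks: list of (13C coordinate, 1H coordinate) cross peak centers.
    Cross peaks without a match in either dimension are skipped.
    """
    pairs = []
    for c_coord, h_coord in hmqc_peaks:
        c_id = closest_within(c_coord, c13_shifts, tol_c)
        h_id = closest_within(h_coord, h1_shifts, tol_h)
        if c_id is not None and h_id is not None:
            pairs.append((c_id, h_id))  # 13C is placed ahead of 1H
    return sorted(pairs)
```

Because only the closest peak is kept in each dimension, every converted connectivity contains a single 13C-1H pair, in contrast to the COSY conversion, where all peaks within the tolerance are retained.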

 

Possible Errors: After the conversion, the resulting HMQC peak list is crosschecked against the 13C multiplicity information. NMR-SAMS may prompt the following error/warning messages:

·         If the number of correlated HMQC peaks of a certain 13C peak is fewer than expected (1 for CH and CH3, 2 for CH2), it warns the user to check for missing HMQC peaks, or the 1H integral to verify if a CH2 shows degenerate 1H peaks.

·         If the number of correlated HMQC peaks of a certain 13C peak is more than expected (1 for CH and CH3, 2 for CH2), it prompts the user to check for possible errors due to degenerate 13C peaks, wrong assignment, or artifacts.
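The two checks above amount to counting the HMQC connectivities of each 13C peak and comparing the count with the number expected from its multiplicity.  The Python sketch below illustrates this; the function name check_hmqc_counts is hypothetical, and skipping peaks with multiplicity “s” or “u” is an assumption, since no expected count is given for them above.

```python
from collections import Counter

# Expected number of HMQC peaks per 13C multiplicity:
# 1 for CH (d) and CH3 (q), 2 for CH2 (t).
EXPECTED_HMQC = {"d": 1, "t": 2, "q": 1}

def check_hmqc_counts(c13_mult, hmqc_pairs):
    """Compare HMQC connectivity counts with the 13C multiplicities.

    c13_mult: dict mapping 13C peak ID -> multiplicity code.
    hmqc_pairs: list of (13C peak ID, 1H peak ID) connectivities.
    Returns a list of (13C peak ID, "fewer" or "more") tuples.
    """
    counts = Counter(c_id for c_id, _ in hmqc_pairs)
    issues = []
    for c_id, mult in sorted(c13_mult.items()):
        expected = EXPECTED_HMQC.get(mult)
        if expected is None:
            continue  # 's' and 'u' are not checked in this sketch
        found = counts.get(c_id, 0)
        if found < expected:
            issues.append((c_id, "fewer"))  # missing peak or degenerate CH2
        elif found > expected:
            issues.append((c_id, "more"))   # degenerate 13C, misassignment, artifact
    return issues
```

For example, a CH2 carbon with a single HMQC connectivity is reported as “fewer”, which may simply mean that its two protons are degenerate.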

 

NMR-SAMS automatically discriminates HMQC from HETCOR.  The peak intensities are not used by NMR-SAMS.

 

Results: Upon conversion, the .nmr file is updated with information regarding the converted HMQC connectivities starting with the keyword “HMQC:”. The following is a transcript of a converted HMQC connectivity list:

 

HMQC:

 #1. (3 - 1) ;2

 #2. (3 - 2) ;1

 #3. (4 - 4) ;3

 #4. (6 - 33)        ;4

       .

       .

       .

The first line beginning with the keyword “HMQC:” indicates the start of the HMQC connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of connectivity:

·         Connectivity ID, a serial number that uniquely identifies this connectivity.

·         ID's of the correlated 1D 13C and 1H peaks, shown in parenthesis, which define the correlated 13C and 1H peaks respectively.

·         Comments, which are optional and have a maximum length of 80 characters. The numbers in the comment field correspond to the ID of the corresponding peak in the SpecMan peaks table.

One or more spaces are used as a delimiter for all items except comments that are separated by a semicolon (;).  Items marked as optional can be omitted unless an item following them is included.  In such a case, the user must include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a semicolon (;).

Note: The conversion of the HMQC/HETCOR peaks table is dependent on the converted 1H and 13C peak lists.  If the 1H/13C peaks table is reconverted, or if the user modifies the converted 1H/13C peak list, the HMQC/HETCOR peaks table must be reconverted.

5.6 Conversion of SpecMan HMBC/COLOC Peaks Table

Command: File/Create NMR Data File/HMBC (or COLOC).

Description:  In this procedure, NMR-SAMS converts the HMBC or COLOC cross peak coordinates into connectivities between 1D 13C and 1H peaks.  In principle, the conversion process is very similar to the COSY conversion process described in Section 5.4. 

Although the process is similar, the user needs to be aware of the fact that the correlated 13C peaks are always placed ahead of correlated 1H peaks in a converted connectivity, and this applies to both HMBC and COLOC. 

 

NMR-SAMS automatically discriminates HMBC from COLOC, and by default assigns a strong intensity level of '3' to each peak.  The peak intensity levels are useful if the user wants to interpret some weak peaks as connectivities longer than 3 bonds (see Section 6.4.2).

 

Results: Upon conversion, the .nmr file is updated with information regarding the converted HMBC connectivities starting with the keyword “HMBC:”. The following is a transcript of a converted HMBC connectivity list:

 

HMBC:

 #1.   (1 - 6)       3      ;3

 #2.   (1 - 7 8)     3      ;4

 #3.   (1 - 13)      3      ;5

          .

          .

          .

 #128. (29 - 10)     3      ;133

 #129. (29 - 24)     3      ;131

The first line beginning with the keyword “HMBC:” indicates the start of the HMBC connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of connectivity:

·         Connectivity ID, a serial number that uniquely identifies this connectivity.

·         ID's of the correlated 1D 13C and 1H peaks, shown in parenthesis. For ambiguous correlations the ID's of all possible 1D 13C & 1H peaks are included.

·         Peak intensity level, classified as four types: strong (3), medium (2), weak (1), and unknown (0).  This is optional and the default value is 3.

·         Comments, which are optional and have a maximum length of 80 characters. The numbers in the comment field correspond to the ID of the corresponding peak in the SpecMan peaks table.

One or more spaces are used as a delimiter for all items except comments that are separated by a semicolon (;).  Items marked as optional can be omitted unless an item following them is included.  In such a case, please include default values for ignored items even if they don’t get used.  Comments can always be included as long as they follow a semicolon (;).

Note: The conversion of the HMBC/COLOC peaks table is dependent on the converted 1H and 13C peak lists.  If the 1H/13C peaks table is reconverted, or if the user modifies the converted 1H/13C peak list, the HMBC/COLOC peaks table must be reconverted.

5.7 Conversion of SpecMan NOESY Peaks Table

Command: File/Create NMR Data File/NOESY (or ROESY).

Description:  In this procedure, NMR-SAMS converts the NOESY (or ROESY) cross peak coordinates into connectivities between 1D 1H peaks in exactly the same way as described for COSY in Section 5.4. Strong intensity level (represented as “3”) and the actual peak intensity (from SpecMan peaks table) are assigned to the corresponding entries of each peak.  NMR-SAMS uses NOESY information in a very limited fashion so normally the user does not need to take care of the peak intensity for 2D structure determination (see parameters IDEAL_COSY and NOESY_DIST in Appendix IV).

5.8 Conversion of SpecMan INADEQUATE Data

Command: File/Create NMR Data File/INADEQUATE.

Description:  In this procedure, NMR-SAMS converts the 2D INADEQUATE cross peak coordinates into connectivities between 1D 13C peaks.  In the following dialog box, the user is prompted to define a matching tolerance.  This tolerance will be used to match chemical shifts of 13C peaks and the F2 coordinates of the INADEQUATE peaks.  This tolerance is also used to match the F1 coordinates to search for coupled INADEQUATE peaks.  Similar to the conversion process of DQF-COSY (Section 5.4), ambiguous connectivities will be considered. 

 

 

Results: Upon conversion, the .nmr file is updated with information regarding the converted INADEQUATE connectivities starting with the keyword “INAD:”. The following is a transcript of a converted INADEQUATE connectivity list:

 

INAD: 

#1. (1 - 3)          ;1+2

#2. (2 - 4 5)        ;3+4

       .

       .

      .

The first line beginning with the keyword “INAD:” indicates the start of the INADEQUATE connectivity list. Following the keyword and a blank space, comments may be added up to 80 characters in length. The entries in each of the rest of the lines represent the following attributes of connectivity:

 

·         Connectivity ID, a serial number that uniquely identifies this connectivity.

·         ID's of the correlated 1D 13C peaks, shown in parenthesis.  For ambiguous correlations, the ID's of all possible 1D 13C peaks are included.

·         Comments, which are optional and have a maximum length of 80 characters. The numbers in the comment field correspond to the ID’s of the corresponding INADEQUATE peaks in the SpecMan peaks table.

5.9 Manual Peak Picking 

To manually prepare the NMR data file required by NMR-SAMS (in case SpecMan has not been utilized to perform peak picking), begin by numbering the 1D 1H and 13C peaks, preferably from downfield to upfield (see Fig. 5.4).  Then, HMQC can be used to group multiplets and resolve overlapping peaks in the 1H spectrum.  If two (or more) 1H peaks overlap completely, treat them as one degenerate peak.  The 1D 13C peaks must be resolved (i.e., no peak degeneracy is allowed), so if necessary, split a degenerate 13C peak into two peaks with slightly different chemical shifts.  In a case where parts of the spectra cannot be resolved due to multiple atoms with very similar chemical environments (e.g. multiple phenyl groups or a long methylene chain), the unresolved 13C (and 1H as well) peaks can be discarded.  NMR-SAMS will then perform partial structure elucidation (PSE) based on the incomplete spectral data.

Figure 5.4. Schematic illustration of the manual preparation of NMR data from original spectral plots for input into NMR-SAMS.  The 1D 1H and 13C peaks are numbered and 2D cross peaks are picked as pairs of correlated 1D peaks.  Two COSY peaks, #2 and #3, suspected to be due to long-range coupling, are marked weak with an intensity level of 1.  HMBC peak #2, suspected to be an artifact, is marked with a reliability of 0.4.  The grid lines in the 2D spectra illustrate the intra- and inter-spectral alignments of the 1D resonances.  For clarity, only COSY and HMBC are shown.  See Section 5.4 for details about the format.

Picking of the 2D cross peaks is based on the numbered 1D peaks, and the 2D cross peaks are located and assigned to their corresponding 1D peaks in each dimension.  A cross peak that cannot be resolved can be assigned to more than two 1D peaks.  If it is hard to tell whether a cross peak is real or an artifact or noise, use a probability smaller than 0.5 to designate it as an unreliable peak.  For a COSY peak, the interpretation is dependent on its intensity level (i.e., J-coupling constant), so a potential long-range coupling must be marked with a “weak” intensity level (represented as 1).  Finally, the picked peaks can be listed in the text file format described in Appendix I.


Chapter 6

Spectral Interpretation

6.1 Overview

This chapter describes the steps involved in the interpretation of the molecular formula (MF), the 1D and 2D NMR spectral data, and the bond constraints derived from NMR data.  First, the possible set(s) of structural building blocks are determined from MF, 1H, 13C and HMQC spectral data, and then the remaining 2D spectral data are interpreted as bond constraints between the building blocks.  In the same step, the various bond constraints are integrated as a homogeneous set of bond constraints, and an atom-atom connection matrix (ACMX) is set up to summarize the possibilities of bond formation between the building blocks.

 

The schematics of deriving bond constraints from different 2D NMR spectral data are illustrated in Fig. 6.1. The general definition of a bond constraint (BC) has been provided in Section 3.4.

Figure 6.1.  Derivation of bond constraints from conventional 2D NMR experiments.  INADEQUATE connectivity is interpreted as a C-C bond constraint (BC) of one bond, COSY connectivity as an H-H BC of 2 to 5 bonds, HMQC connectivity as a C-H BC of one bond, and HMBC connectivity as a C-H BC of 2 or 3 bonds.  The various BC's are transformed into a unified set of C-C BC's based on the HMQC connectivities.

The spectral interpretation-related steps correspond to the first three options in the Analysis menu, as shown below:

6.2 Interpretation of MF, 1H, 13C and HMQC Data as Building Blocks

Command: Analysis/Building Blocks.

Description:  This procedure interprets the MF, 1H, 13C, and HMQC data, and generates all possible sets of building blocks for structure generation.  The user is prompted to enter the MF when a new working data set is opened.  To enter a different MF, select File/Input Molecular Formula.  In addition, the MF can also be listed as unknown (see Section 4.4 for details).

 

1H, 13C, and HMQC data are read from the .nmr file.  If the MF is unknown, the user needs at least 13C spectral data.  If the MF is known, and the user does not have NMR data, then isomer enumeration can be performed. 

 

Parameters: None.

 

Results:  The results of interpretation of MF, 1H, 13C, and HMQC data are written into the .mdf file. The first set of generated building blocks will be displayed on the screen.  The results of this procedure will be described in detail in the next few sections.

6.2.1.      Interpretation of Molecular Formula

See Section 4.4 for a description of the interpretation of the MF.

6.2.2.    Interpretation of 1D 1H Data

The 1H peak list in the NMR data file is interpreted and written into the MDF as a record starting with the keyword “1DH1:”.  Then the number of 1H peaks and the minimum and maximum number of heteroatom-attached protons are listed.  The latter are currently not used so the minimum and maximum are always set as 0 - 0.  The second line is a brief description of the entries in the rest of the lines.  Each of the subsequent lines includes the peak ID, the chemical shift, the minimum and maximum number of corresponding protons, and the multiplicity of the 1H peak.  The minimum and maximum numbers of the corresponding protons are not currently used, so they are listed as zero.  Following is a transcript of such a record:

 

1DH1: num.peaks = 33, num.hete.Hs = 0-0

#Peak. Chem.shift (min. protons ~ Max. protons multiplicity)

# 1. 4.930(0~0 1)

# 2. 4.755(0~0 1)

# 3. 3.509(0~0 0)

# 4. 3.435(0~0 0)

# 5. 2.725(0~0 0)

# 6. 2.611(0~0 0)

# 7. 2.235(0~0 0)

      .

       .
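For users who want to inspect the “1DH1:” record outside NMR-SAMS, the following is a minimal Python sketch of a reader for the peak lines shown in the transcript above.  The names H1Peak and parse_1dh1 are hypothetical and are not part of NMR-SAMS; the line layout is assumed to be exactly the one in the transcript.

```python
import re
from typing import NamedTuple

class H1Peak(NamedTuple):
    peak_id: int
    shift_ppm: float
    min_protons: int
    max_protons: int
    multiplicity: int

# Matches a peak line such as "# 1. 4.930(0~0 1)".
# The header and comment lines of the record do not match and are skipped.
_H1_LINE = re.compile(r"#\s*(\d+)\.\s*([-\d.]+)\((\d+)~(\d+)\s+(\d+)\)")

def parse_1dh1(text):
    """Return the 1H peaks listed in the text of a 1DH1 record."""
    peaks = []
    for line in text.splitlines():
        m = _H1_LINE.match(line.strip())
        if m:
            pid, shift, lo, hi, mult = m.groups()
            peaks.append(H1Peak(int(pid), float(shift), int(lo), int(hi), int(mult)))
    return peaks
```

Because the minimum and maximum proton counts are currently unused, a reader of this kind would normally rely only on the peak ID, the chemical shift, and the multiplicity.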

6.2.3.  Interpretation of 1D 13C Data

The 13C peak list in the NMR data file is interpreted and written into the MDF as a record starting with the keyword “1DC13:”.  The number of 13C peaks follows, and the second line is a brief description of the entries in the rest of the lines.  Each of the subsequent lines includes the peak ID, the chemical shift, and the minimum and maximum number of attached protons of a 13C peak.  If the multiplicity of a peak is unknown, a range of attached protons (i.e., 0 to 3) will be assigned to the carbon.

 

Another record, starting with the keyword “SYMMETRY:” describes the molecular symmetry of the unknown molecule.  Currently this entry is either listed as “No” when the number of 13C peaks equals that of carbon atoms, or as “PSE” for partial structure elucidation.  Following is a transcript of such records:

 

1DC13: num.peaks = 21

#Peak, Chem.shift, (Rng.of att.H, i.e., mult.-1)

# 1. 196.06(0~0)

# 2. 145.56(0~0)

# 3. 144.65(0~0)

# 4. 140.75(1~1)

# 5. 123.40(0~0)

# 6. 121.57(0~0)

# 7. 56.28(1~1)

# 8. 53.85(0~0)

       .

       .

       .

 

SYMMETRY: No               
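The two records above can be read and cross-checked with a short script.  The following Python sketch is illustrative only: parse_1dc13 and symmetry_flag are hypothetical names, and the rule for the “SYMMETRY:” entry is simply the one stated in the text (“No” when the number of 13C peaks equals the number of carbon atoms, “PSE” otherwise).

```python
import re

# Matches a peak line such as "# 4. 140.75(1~1)".
_C13_LINE = re.compile(r"#\s*(\d+)\.\s*([-\d.]+)\((\d+)~(\d+)\)")

def parse_1dc13(text):
    """Return {peak_id: (shift_ppm, min_H, max_H)} from a 1DC13 record."""
    peaks = {}
    for line in text.splitlines():
        m = _C13_LINE.match(line.strip())
        if m:
            pid, shift, lo, hi = m.groups()
            peaks[int(pid)] = (float(shift), int(lo), int(hi))
    return peaks

def symmetry_flag(num_c13_peaks, num_carbons_in_mf):
    """'No' when every carbon has its own 13C peak, otherwise 'PSE'."""
    return "No" if num_c13_peaks == num_carbons_in_mf else "PSE"
```

A peak whose range is 0~3 in the parsed result corresponds to a carbon of unknown multiplicity.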

6.2.4.  Interpretation of HMQC/HETCOR Connectivities

Each HMQC/HETCOR connectivity in the NMR data file is interpreted as a C-H BC according to the following rules:  

 

1.        All connectivities are interpreted as a C-H BC of exactly one bond. 

2.        If a 1H peak is found to have no HMQC peak, the user will be prompted to supply the type of heteroatom attached to it, as shown below in the dialog box.  The program then automatically assigns a heteroatom to the proton and adds an X-H BC (X is the heteroatom) to the list of HMQC-derived C-H bond constraints.  The program first lists all of the 1H peaks without HMQC connectivities, together with the recommended assignment of heteroatoms.  For example: 

 

 

 

To accept the recommended H-X assignments, click ‘Yes’.  To edit them, click ‘No’, and the user will be prompted to assign a heteroatom to each of the 1H peaks, as shown below:

 

 

The current heteroatoms with attached 1H peaks are numbered and listed in the dialog box.  This is useful when the user wants to attach more than one 1H peak to the same heteroatom.  In such a case, type a heteroatom followed by its number in the list so that the current 1H peak will be attached to it.  If the user is unsure as to which kind of heteroatom should be connected to the 1H peak, leave the text field empty or type ‘unknown’, and NMR-SAMS will not attach this proton to any heteroatom.  In such a case, any connectivity information relevant to this proton will be ignored during the subsequent analysis.

 

The results of interpretation of HMQC connectivities are written into the MDF as a record starting with the keyword “HMQC:”. Following the keyword is a comment, denoting the sequence of the correlated atoms in each bond constraint.  Each of the rest of the lines is a C-H bond constraint.  Following is a transcript of the record:

 

HMQC: (Node sequence: C-13, H-1)

(3 - 1: 1 ~ 1; 0)Q1

(3 - 2: 1 ~ 1; 0)Q2

(4 - 4: 1 ~ 1; 0)Q3

(6 - 33: 1 ~ 1; 0)Q4

      

6.2.5.  Generation of Building Blocks

If the MF is known, this procedure allocates the constituent protons to the heavy atoms based on the 13C multiplicities and chemical valences of the heavy atoms.  The generated building block sets must comply with the 13C multiplicities and the number of 1H peaks attached to the heteroatoms.  Each heavy atom, with its attached protons and unsatisfied valence, is called a building block.  The unsatisfied valence is represented as free bonds.

 

If the MF is unknown, carbon building blocks are derived directly from the 13C peaks, with a certain or uncertain number of attached protons depending on the 13C multiplicity that is known or unknown.  If some 1H peaks are attached to a heteroatom, heteroatom building blocks are also derived.   The user can use the Analysis/User-Defined Building Blocks function to edit the building blocks.  The free bonds of different building blocks can be connected to form bonds, as illustrated in Fig. 6.2:

 

Figure 6.2.  Examples of structural building blocks and bond formation between them.

The resulting building blocks are written in the MDF as a record starting with the keyword “FRAG_SET:”.  The following is a transcript of such a record:

 

FRAG_SET:

#1:    C  C  CH2 CH1 C  CH1 CH1 CH1 CH1 C 

       C  C  CH2 CH1 CH2 C  CH2 CH2 CH2 CH2

       CH3 CH2 CH2 CH2 CH3 CH2 CH3 CH3 CH3 CH3

       O  OH1 OH1
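A quick consistency check on a “FRAG_SET:” record is to tally its building blocks and compare the totals with the MF.  The Python sketch below does this for the transcript above; tally_frag_set is a hypothetical helper, and the token layout (element symbol followed by an optional “H” and a proton count) is assumed from the transcript.

```python
import re
from collections import Counter

# Token such as "C", "CH2" or "OH1": element symbol plus optional proton count.
_BLOCK = re.compile(r"([A-Z][a-z]?)(?:H(\d+))?$")

def tally_frag_set(record_body):
    """Return (Counter of heavy atoms, total number of attached protons)."""
    heavy, protons = Counter(), 0
    for token in record_body.split():
        m = _BLOCK.match(token)
        if m is None:
            # Tokens that do not fit the pattern (e.g. 'CH?') are skipped here.
            continue
        heavy[m.group(1)] += 1
        protons += int(m.group(2) or 0)
    return heavy, protons

body = """
C  C  CH2 CH1 C  CH1 CH1 CH1 CH1 C
C  C  CH2 CH1 CH2 C  CH2 CH2 CH2 CH2
CH3 CH2 CH2 CH2 CH3 CH2 CH3 CH3 CH3 CH3
O  OH1 OH1
"""
heavy, protons = tally_frag_set(body)
print(dict(heavy), protons)
```

For the transcript above, the tally gives 30 carbon blocks, 3 oxygen blocks, and 48 attached protons, which should agree with the MF entered for the data set.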

 

After the building blocks have been generated, the first set of building blocks will be displayed.  If there are multiple building block sets, a Building Block Browser will be displayed that allows the user to browse through each building block set by moving the slider, as shown below: 

 

 

Multiple sets of building blocks are generated when either some or all of the 13C multiplicities are unknown, or there are different kinds of heteroatoms with attached protons.  NMR-SAMS can use multiple sets of building blocks for structure generation, but it only uses the first one for target structure-based resonance assignment.  So wherever possible the user is advised to delete the undesired building block sets.

 

To remove the building block set that is displayed, click ‘Delete’ from the Building Block Browser.  To select the displayed building block set as the only one for structure generation, click ‘Select’ from the Building Block Browser and the rest of the building block sets will be removed.

 

Note: In the case of a 13C peak with unknown multiplicity, NMR-SAMS will try to enumerate all possible numbers of attached protons for its corresponding building block, as long as MF and 13C spectral data have been provided.  If the MF is unknown, or if there are fewer 13C peaks than carbon atoms, NMR-SAMS will generate a building block with unknown number of attached protons, such as ‘CH?’.  Such a building block will be ignored during the subsequent structure generation process.

 

Possible Errors:

·         If no valid building block set is generated, the user will have to check the MF, the 13C multiplicities, and the valence of the atoms.

·         The maximum number of building block sets is set to 500, and if this number is exceeded, then the remaining ones will be ignored.  In such a case, use 13C multiplicities to constrain the generation of building blocks.
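The effect of unknown 13C multiplicities on the number of building block sets can be illustrated with a small enumeration.  The Python sketch below is not the NMR-SAMS algorithm; it assumes, for illustration, that each carbon has a range of attached protons (as in the “1DC13:” record) and that the carbon-bound protons must add up to a known total taken from the MF.  The function name and the way the 500-set limit is applied are hypothetical.

```python
from itertools import islice, product

MAX_SETS = 500  # limit on building block sets quoted in the text

def enumerate_proton_counts(h_ranges, total_h, max_sets=MAX_SETS):
    """List proton-count assignments, one count per carbon.

    h_ranges : list of (min_H, max_H), one pair per 13C peak
    total_h  : number of protons that must be carried by these carbons
    Assignments beyond max_sets are dropped, mirroring the stated limit.
    """
    choices = [range(lo, hi + 1) for lo, hi in h_ranges]
    valid = (c for c in product(*choices) if sum(c) == total_h)
    return list(islice(valid, max_sets))

# Two carbons of unknown multiplicity (0~3) and one known CH (1~1), 4 protons in total.
sets = enumerate_proton_counts([(0, 3), (1, 1), (0, 3)], 4)
print(len(sets))
```

Each carbon with a 0~3 range multiplies the number of candidates, which is why supplying 13C multiplicities keeps the number of sets well below the limit.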

6.3 User-Defined Building Blocks

Command: Analysis/User-Defined Building Blocks.

Description: Regardless of whether the MF is known or unknown, the user can add, delete, or modify the building blocks.

 

To add a building block, select ‘Add’ and then type the element symbol after ‘Element.’  Select ‘Ignored Atom’ to ignore the atom during structure generation (see Section 7.1 for more details regarding ignored atoms).  Select the correct number of attached protons in the text box after ‘Proton Count’.  If the number of protons is unknown, select Unknown.  The default valence for the element will be listed in the text box after ‘Valence’, although the user can select a different valence.  If “C” is selected after ‘Element’, the user can check the box next to ‘Assigned C-13 Shift’ and then type in an appropriate 13C chemical shift.  When the ‘Proton Count’ is larger than zero, the user can check the box next to ‘Assigned H-1 Shifts’, and then type in one or two 1H chemical shifts for the protons.  When entering multiple proton shifts, use a blank space as a delimiter.  Then, click at an empty place in the main graphics window, and a building block with the defined attributes will be added.

 

The user can also copy the attributes from an existing building block by clicking on that building block while keeping the Ctrl key pressed.

 

Note:  There are some limitations to the use of the ‘Add’ option for building blocks.  The newly added carbon building blocks will be ignored (i.e., not used for bond formation during the structure generation process), and any building blocks that have an unknown number of attached protons will also be ignored.  In addition, the chemical shifts of the added building blocks will not be evaluated during the subsequent analysis although they are always displayed.

 

 

To modify a building block, check ‘Modify’ from the palette and then copy the attributes from that building block by clicking on it while pressing the Ctrl key.  Next, change the corresponding attributes in the palette and then click on the building block again (without pressing the Ctrl key) and the building block will be modified accordingly.

 

Tip:  To modify a non-ignored building block so that it is ignored (or vice versa), set the option ‘Ignored Atom’ as required and then click on that building block.  The first time the option is clicked, it will only toggle the ‘Ignored Atom’ state if the required value is different from the current state of the building block.  If the user wants to change other attributes, click it again and all other attributes will be modified according to those specified.

 

Note: the user can only modify the ‘Ignored Atom’ and ‘Proton Count’ attributes for a carbon building block derived from 13C data.

 

To delete a building block, check ‘Delete’ from the palette and then click on the building block to delete.

 

 

Note: the user cannot delete a carbon building block that was derived from a 13C peak. 

 

Results:  The modified building blocks are written in the MDF as a record starting with the keyword “FRAG_SET:”.  The original record is overwritten. 

6.4 Interpretation of 2D Spectral Data as Bond Constraints

Command: Analysis/Bond Constraints.

Description:  This procedure interprets the COSY, HMBC, NOESY, and INADEQUATE spectral data in the .nmr data file to define bond constraints.  Then the various bond constraints are unified and an atom-atom connection matrix is set up for subsequent structure generation or resonance assignment.

 

Parameters: The relevant parameters for interpreting the 2D spectral data are accessed by selecting Edit/Parameters/NMR Interpretation and the following dialog box will appear: 

 

 

For explanation of the parameters, see Parameters for Spectral Interpretation in Appendix IV.

 

The relevant parameters for setting up the ACMX can be accessed by choosing Edit/Parameters/Setting up ACMX and the following dialog box will be displayed:

 

 

For more explanation of the parameters, see Parameters for Setting Up ACMX in Appendix IV.

 

Results: The results of this procedure will be described for each type of spectral data in the next few sections. 

6.4.1.  Interpretation of COSY Connectivities

The results of COSY interpretation are written into the MDF as a record starting with the keyword “COSY:”, and can be edited by choosing Edit/Master Data File.  Each COSY connectivity in the NMR data file is first classified as being due to either potential long-range coupling or short-range coupling.  Based on that, a H-H BC is assigned to it.  The rules for this step are described below:

 

1.        When the intensity level is weak (represented as “1”), it is treated as due to potential long-range coupling.

2.        When the intensity level is medium (2), strong (3) or blank, it is treated as due to short-range coupling.

3.        When the intensity level is unknown (0), the J-coupling constant is used to classify short-range and long-range couplings.  If the J-coupling constant is also unknown (represented as 0.0), then an error message will be displayed and the interpretation is aborted.  If the J-coupling constant is defined as J Hz, it is compared with the parameter COSY_J_CATEG (3.0, by default).  All connectivities that have J ≤ COSY_J_CATEG are treated as due to potential long-range coupling, and the rest are treated as due to short-range coupling.

4.        When a connectivity is classified as being due to short-range coupling and has a correlated singlet 1H peak, then NMR-SAMS will prompt the user to confirm if it is due to long-range coupling.  If the user clicks ‘Yes’, then it will be classified as a long-range coupling, and if the user clicks ‘No’, then it will remain a short-range coupling.

5.        The user can perform a check of possible long-range couplings based on 1H chemical shifts.  To do this, select Edit/Parameters/NMR Data Interpretation, and then add an appropriate value (e.g. 4.5) after Minimum H-1 Shift for Checking Long-Range H-H Coupling.  This option is turned off, by default (i.e., value set as 0).

6.        By default, all connectivities due to short-range couplings are interpreted as H-H BC’s with 2 to 3 intervening bonds.  By default, all connectivities due to long-range couplings are interpreted as H-H BC’s with 3 to 5 intervening bonds.  The number of intervening bonds is controlled by the COSY_BC parameter.

7.        The bond types of the intervening bonds are always set as unknown (0), and the number of sub-bond constraints (NSBC) that must satisfy a BC, minNSBC and maxNSBC, are determined as follows:

minNSBC = 1 if  P ≥ RELIAB_PEAK_PROB, or

minNSBC = 0 if  P < RELIAB_PEAK_PROB, and

maxNSBC  = n1 × n2 ,

where P is the reliability of the connectivity, and n1 and n2 are the number of correlated 1D peaks in each dimension, respectively.  The default value of the parameter, RELIAB_PEAK_PROB, is set as 0.50.  For example, the following connectivity is due to an “unreliable” DQF-COSY peak since the reliability is 0.4: 

#8 (2 - 5 6) 3 0.00 0.4 ;unreliable, may be an artifact

So this connectivity is interpreted as the following H-H BC:

(2 - 5 6: 2 ~ 3; 0; 0 ~ 2)C8

which means that this BC is flexible enough to be considered as satisfied if none, one, or both of the proton pairs (i.e. H2-H5 and H2-H6) have a bond separation of two or three bonds in the generated structure.  

8.        If two 1H peaks are very close and no COSY peak is observed between them, the user is alerted to check if a near-diagonal peak has been neglected between them.  If the user is not sure about this, then the program will allow a "pseudo bond constraint" to be added for this proton pair.  The tolerance for checking near-diagonal COSY peaks is controlled by a parameter called COSY_DIAG_RESO, and its default value is 0.02ppm.  The user can change this by selecting Edit/Parameters/NMR Interpretation.  The pseudo BC is used to prevent two atoms from being forbidden to connect while setting up the ACMX.
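Rules 1 to 3, 6, and 7 above can be sketched in a few lines of Python.  This is an illustration under stated assumptions, not the NMR-SAMS implementation: the function names are hypothetical, a blank intensity field is represented as None, the default bond ranges of rule 6 are hard-coded, and the interactive checks of rules 4, 5, and 8 are left out.

```python
COSY_J_CATEG = 3.0       # Hz, default quoted in rule 3
RELIAB_PEAK_PROB = 0.50  # default quoted in rule 7
SHORT_RANGE = (2, 3)     # default bonds for short-range couplings (rule 6)
LONG_RANGE = (3, 5)      # default bonds for long-range couplings (rule 6)

def classify_cosy(intensity, j_hz):
    """Return 'long' or 'short' following rules 1 to 3."""
    if intensity == 1:                   # weak
        return "long"
    if intensity is None or intensity in (2, 3):   # blank, medium, strong
        return "short"
    if intensity == 0:                   # unknown intensity: fall back on J
        if j_hz == 0.0:
            raise ValueError("intensity and J-coupling are both unknown")
        return "long" if j_hz <= COSY_J_CATEG else "short"
    raise ValueError(f"unexpected intensity level: {intensity!r}")

def cosy_to_bc(peak_id, f1_peaks, f2_peaks, intensity, j_hz, reliability):
    """Format one COSY connectivity as an H-H bond constraint line."""
    kind = classify_cosy(intensity, j_hz)
    lo, hi = LONG_RANGE if kind == "long" else SHORT_RANGE
    min_nsbc = 1 if reliability >= RELIAB_PEAK_PROB else 0
    max_nsbc = len(f1_peaks) * len(f2_peaks)
    left = " ".join(str(p) for p in f1_peaks)
    right = " ".join(str(p) for p in f2_peaks)
    # Bond type is always written as unknown (0), as stated in rule 7.
    return f"({left} - {right}: {lo} ~ {hi}; 0; {min_nsbc} ~ {max_nsbc})C{peak_id}"

# Connectivity #8 from rule 7: strong peak, J unknown, reliability 0.4.
print(cosy_to_bc(8, [2], [5, 6], 3, 0.0, 0.4))
```

With the values of connectivity #8, the sketch reproduces the bond constraint quoted in rule 7.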

 

The results of COSY interpretation are written into the MDF as a record starting with the keyword “COSY:”.  Following the keyword is a comment, denoting the parameters used for the interpretation, and each line thereafter is a H-H bond constraint.  The following is a transcript of the record:

 

COSY: (COSY_BC = 3 5 2 3; COSY_DIAG_RESO = 0.020)

(1 - 2: 3 ~ 5; 0; 1 ~ 1)C1

(1 - 12: 3 ~ 5; 0; 1 ~ 1)C2

(2 - 12: 3 ~ 5; 0; 1 ~ 1)C3

(3 - 7 8: 2 ~ 3; 0; 1 ~ 2)C4

       .

       .
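The bond constraint lines in this and the following records share one layout, so a single reader can handle the COSY, HMBC, NOESY, INADEQUATE, and C13~~C13 records.  The Python sketch below is a hypothetical helper (BondConstraint and parse_bc are not NMR-SAMS names) and assumes the five-field layout shown in the transcripts; the shorter HMQC lines, which carry no NSBC range, are not covered.

```python
import re
from typing import NamedTuple

class BondConstraint(NamedTuple):
    nodes_a: tuple   # peak IDs on the left of the dash
    nodes_b: tuple   # peak IDs on the right of the dash
    min_bond: int
    max_bond: int
    bond_type: int   # 0 = unknown
    min_nsbc: int
    max_nsbc: int
    source: str      # trailing code, e.g. "C4"

_BC = re.compile(
    r"\(\s*([\d\s]+?)\s*-\s*([\d\s]+?)\s*:\s*(\d+)\s*~\s*(\d+)\s*;"
    r"\s*(\d+)\s*;\s*(\d+)\s*~\s*(\d+)\s*\)(\S*)"
)

def parse_bc(line):
    """Parse a line such as '(3 - 7 8: 2 ~ 3; 0; 1 ~ 2)C4'."""
    m = _BC.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a bond constraint line: {line!r}")
    a, b, lo, hi, btype, nlo, nhi, src = m.groups()
    return BondConstraint(
        tuple(int(x) for x in a.split()),
        tuple(int(x) for x in b.split()),
        int(lo), int(hi), int(btype), int(nlo), int(nhi), src,
    )

print(parse_bc("(3 - 7 8: 2 ~ 3; 0; 1 ~ 2)C4"))
```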

6.4.2.  Interpretation of HMBC/COLOC Connectivities

Each HMBC/COLOC connectivity list in the NMR data file is interpreted as a C-H BC according to the following rules:  

 

1.        Each connectivity is interpreted as a C-H BC of a certain range of intervening bonds based on the intensity level of the peak and the relevant parameters.  

2.        The bond types of the intervening bonds are always set as unknown (0), and the number of sub-bond constraints (NSBC) that must satisfy a BC, minNSBC and maxNSBC, are determined as follows:

minNSBC = 1 if  P ≥ RELIAB_PEAK_PROB, or

minNSBC = 0 if  P < RELIAB_PEAK_PROB, and

maxNSBC  = n1 × n2 ,

where P is the reliability of the connectivity, and n1 and n2 are the number of correlated 1D peaks in each dimension, respectively.  The default value of the parameter RELIAB_PEAK_PROB is set as 0.50.  For example, the following connectivity is due to an “unreliable” HMBC peak because its reliability is 0.4:

#3 (10 - 8) 3 0.00 0.4; very weak, may be an artifact

So this connectivity is interpreted as the following C-H BC:

(10 - 8: 2 ~ 3; 0; 0 ~ 1)B3

The last two numbers, 0 and 1, mean that bond separation between C10 and H8, can either satisfy or violate this BC in the generated structure. 

 

The results of interpretation of HMBC connectivities are written into the MDF as a record starting with the keyword “HMBC:”.  Following the keyword is a comment, denoting the parameters used for interpretation and sequence of the correlated atoms in each bond constraint.  Each line thereafter is a C-H bond constraint.  The following is a transcript of the record:

 

HMBC: (HMBC_BC = 2 3, Node sequence: C-13, H-1)

(1 - 6: 2 ~ 3; 0; 1 ~ 1)B1

(1 - 7 8: 2 ~ 3; 0; 1 ~ 2)B2

(1 - 13: 2 ~ 3; 0; 1 ~ 1)B3

(1 - 15: 2 ~ 3; 0; 1 ~ 1)B4

       .

       .

       .

6.4.3.  Interpretation of NOESY Connectivities

A NOESY connectivity in the NMR data file is always interpreted as a H-H BC of 2 to 6 bonds.  NOESY is useful to NMR-SAMS only when the user opts to use the negative information of COSY together with NOESY.  For example, if there is neither a COSY nor a NOESY peak observed between the protons attached to two carbon atoms, then this pair is forbidden to connect (see the usage of parameter IDEAL_COSY in Appendix IV).  In the current version of NMR-SAMS, through-space NOESY correlations are not used as bond constraints during structure elucidation.

 

The results of interpretation of NOESY connectivities are written into the MDF as a record starting with the keyword “NOESY:”.  Following the keyword is a comment, denoting the parameters used for interpretation and sequence of the correlated atoms in each bond constraint.  Each of the rest of the lines is a H-H bond constraint.  The following is a transcript of the record:

 

NOESY: (NOESY_BC = 2 6 0, Node sequence: H-1, H-1)

(1 - 2: 2 ~ 6; 0; 1 ~ 1)N1

(1 - 3: 2 ~ 6; 0; 1 ~ 1)N2

(1 - 12: 2 ~ 6; 0; 1 ~ 1)N3

(2 - 12: 2 ~ 6; 0; 1 ~ 1)N4

(3 - 7 8: 2 ~ 6; 0; 1 ~ 2)N5

       .

       .

       .

6.4.4.  Interpretation of INADEQUATE Connectivities

Each INADEQUATE connectivity in the NMR data file is interpreted as a C-C BC according to the following rules:  

 

1.        Each connectivity is interpreted as a C-C BC of one intervening bond, by default.  The number of intervening bonds is controlled by the first two values of the parameter INAD_BC. 

2.        The bond type is controlled by the third value of the parameter, INAD_BC, and by default is defined as unspecified (i.e., unknown).  This can be changed to single, double, or triple.  For example, if an INADEQUATE experiment is optimized to manifest only single C-C bonds, the user can set the third value of INAD_BC as 1, so that all of the connectivities are interpreted as C-C single bonds.  This will improve the efficiency of the structure generation process since NMR-SAMS will not consider the other possibilities of these bonds.

3.        The number of sub-bond constraints (NSBC) that must satisfy a BC, minNSBC and  maxNSBC, are determined as follows:

minNSBC = 1 if  P ≥ RELIAB_PEAK_PROB, or

minNSBC = 0 if  P < RELIAB_PEAK_PROB, and

maxNSBC  = n1 × n2 ,

where P is the reliability of the connectivity, and n1 and n2 are the number of correlated 1D peaks in each dimension, respectively.  The default value of the parameter RELIAB_PEAK_PROB is set as 0.50.  For example, the following connectivity is due to an “unreliable” INADEQUATE peak since its reliability is set as 0.4:

#18 (9 10 - 28) 3 0.0 0.4 ;C9 and C10 too close to resolve

This connectivity is interpreted as the following C-C BC:

(9 10 - 28: 1 ~ 1; 0; 0 ~ 2)I18

which means that this BC is flexible enough to be considered as satisfied, if either none, one, or both of carbon pairs (i.e. C9-C28 and C10-C28) have a bond separation of one bond in the generated structure.  
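The role of minNSBC and maxNSBC can be made concrete with a small check.  The Python sketch below reflects one reading of the text: a BC counts as satisfied when the number of atom pairs whose bond separation falls inside the allowed range lies between minNSBC and maxNSBC.  The function name is hypothetical, and the bond separations are supplied by the caller rather than computed from a structure.

```python
def bc_satisfied(separations, min_bond, max_bond, min_nsbc, max_nsbc):
    """Check one bond constraint against a candidate structure.

    separations : bond separation for every atom pair covered by the BC,
                  e.g. [d(C9, C28), d(C10, C28)] for the example above
    """
    hits = sum(1 for d in separations if min_bond <= d <= max_bond)
    return min_nsbc <= hits <= max_nsbc

# Unreliable peak (0 ~ 2): satisfied even if neither pair is directly bonded.
print(bc_satisfied([3, 4], 1, 1, 0, 2))
# Reliable peak (1 ~ 2): at least one pair must be directly bonded.
print(bc_satisfied([3, 4], 1, 1, 1, 2))
```

Under this reading, lowering the reliability of a doubtful peak below RELIAB_PEAK_PROB keeps the connectivity in the record without letting it reject candidate structures.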

 

The results are written into the MDF as a record starting with the keyword “INADEQUATE:”.  Following the keyword is a comment, denoting the parameters used for interpretation, and each line thereafter is a C-C bond constraint. Following is a transcript of the record:

 

INADEQUATE: (INAD_BC = 1 1 0)

(2 - 1: 1 ~ 1; 0; 1 ~ 1)I1

(4 - 3: 1 ~ 1; 0; 1 ~ 1)I2

(5 - 4: 1 ~ 1; 0; 1 ~ 1)I3

(6 - 5: 1 ~ 1; 0; 1 ~ 1)I4

       .

       .

       .

6.4.5.  Transformation of Bond Constraints

After interpreting the various 2D spectral data as bond constraints, this procedure transforms the various kinds of BC’s into a homogenous set of C-C (or heteroatoms) BC’s based on the HMQC-derived C-H BCs. The following rules are observed:

 

1.        An INADEQUATE-derived C-C BC remains unchanged.

2.        The correlated 1H peaks in a DQF-COSY-derived H-H BC are replaced by their correlated 13C peaks in HMQC, and the bond separation is reduced by 2.

3.        The correlated 1H peak(s) in an HMBC-derived C-H BC are replaced by their correlated 13C peaks in HMQC, and the bond separation is reduced by 1.

4.        The correlated 1H peaks in a NOESY-derived H-H BC are replaced by their correlated 13C peaks in HMQC, and the bond separation is reduced by 2.

5.        If a degenerate 1H peak has multiple correlated 13C peaks, pseudo C-C BC’s are added between these 13C peaks.  The pseudo BC is used to prevent the two atoms from being forbidden to connect while setting up the ACMX.

Note: A degenerate 1H peak has multiple correlated 13C peaks in HMQC unless they arise from geminal protons.  If a certain BC involves such a 1H peak, all correlated 13C peaks are included in the resulting C-C BC, so additional ambiguity is introduced to the resulting C-C BC.  In such a case, NMR-SAMS can use such ambiguous BC’s for structure generation.

6.        The sources of the relevant BC’s are included as comments in the resulting C-C BC so that the user can keep track of the various connectivities from which a C-C BC is derived.
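Rules 1 to 4 above can be sketched as a simple substitution followed by a shift of the bond range.  The Python code below is an illustration only: the names are hypothetical, the HMQC map (1H peak ID to list of 13C peak IDs) is invented for the example, and clamping the range at one bond is an assumption made here because two distinct carbons cannot be closer than one bond.  Pseudo BC's (rule 5) and source comments (rule 6) are not modeled.

```python
# Bonds removed when a proton is replaced by its carbon (rules 1 to 4).
REDUCTION = {"INADEQUATE": 0, "COSY": 2, "HMBC": 1, "NOESY": 2}

def h_to_c(h_peaks, hmqc):
    """Replace 1H peak IDs by the 13C peak IDs correlated to them in HMQC."""
    carbons = set()
    for h in h_peaks:
        carbons.update(hmqc[h])
    return sorted(carbons)

def to_cc_bc(experiment, side_a, side_b, min_bond, max_bond, hmqc):
    """Return (carbons_a, carbons_b, min_bond, max_bond) of the C-C BC."""
    shift = REDUCTION[experiment]
    if experiment in ("COSY", "NOESY"):      # both sides are protons
        side_a = h_to_c(side_a, hmqc)
    if experiment != "INADEQUATE":           # second node is a proton
        side_b = h_to_c(side_b, hmqc)
    lo = max(1, min_bond - shift)            # assumption: never below one bond
    hi = max(1, max_bond - shift)
    return side_a, side_b, lo, hi

# Hypothetical HMQC map: H3 on C9, H7 on C15, H8 on C19.
hmqc = {3: [9], 7: [15], 8: [19]}
print(to_cc_bc("COSY", [3], [7, 8], 2, 3, hmqc))
```

A degenerate 1H peak simply appears in the map with more than one carbon, so all of its carbons end up in the resulting C-C BC, which is the ambiguity described in the note above.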

 

Fig. 6.3 illustrates the transformation of an ambiguous COSY BC into C-C BC.  The ambiguity arises from the overlapping peaks of H8 and H9.

Figure 6.3 Illustration of the transformation of a DQF-COSY-derived H-H BC into a C-C BC, based on the relevant HMQC connectivities.  The two protons in the circle cannot be resolved in the DQF-COSY spectrum, thus introducing ambiguity in the resulting C-C BC.  For more details about the format of the bond constraints, please refer to Section 3.4.

All resultant C-C BC’s will be crosschecked for mutual consistency.  If two BC's have the same relevant nodes, they are merged according to the following rules:

 

·         If all entries are identical except for their source, their sources are merged.

·         If the ranges of bond separation, minBond and maxBond, are different, and an intersection is possible, then the intersection of the two ranges is adopted.  Otherwise, NMR-SAMS will prompt the user to supply a valid minBond and maxBond.  For example, if one BC requires a bond separation of 1 to 3 bonds, and the other, 1 to 1 bond, then the intersection, 1 to 1 bond (i.e., exactly one bond), is adopted for the merged BC.  On the other hand, if one BC requires a bond separation of 2 to 3 bonds, and the other, 1 to 1 bond, then the following message (as shown below) will prompt the user to enter the proper bond separation because no intersection is possible between the two BC’s.

 

In this example, type “1 1” if it is known to be a vicinal coupling, or “1 3” if it is not known.

·         Similar to bond separation, if the ranges of NSBC, minNSBC and maxNSBC, are different, the intersection of the two ranges is adopted whenever an intersection is possible.  Otherwise, the user will be prompted with a similar message as above to supply a valid range for minNSBC and maxNSBC.

·         If the bond types are different, then NMR-SAMS adopts the higher bond order (the order of priority is triple, double, single and unknown). 
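The merging rules above amount to intersecting two ranges and keeping the higher bond order.  The following Python sketch illustrates this under stated assumptions: the dictionary layout and function names are hypothetical, bond types are assumed to be coded 0 (unknown), 1 (single), 2 (double), and 3 (triple), and a None result stands for the case in which NMR-SAMS would prompt the user.

```python
def intersect(r1, r2):
    """Intersection of two inclusive (min, max) ranges, or None if empty."""
    lo, hi = max(r1[0], r2[0]), min(r1[1], r2[1])
    return (lo, hi) if lo <= hi else None

def merge_bcs(bc1, bc2):
    """Merge two BC's that involve the same nodes.

    Each BC is a dict with keys 'bond' (minBond, maxBond),
    'nsbc' (minNSBC, maxNSBC), 'bond_type' and 'source'.
    """
    bond = intersect(bc1["bond"], bc2["bond"])
    nsbc = intersect(bc1["nsbc"], bc2["nsbc"])
    if bond is None or nsbc is None:
        return None  # no intersection: the user has to supply a valid range
    return {
        "bond": bond,
        "nsbc": nsbc,
        # Priority is triple > double > single > unknown, i.e. the larger code.
        "bond_type": max(bc1["bond_type"], bc2["bond_type"]),
        "source": bc1["source"] + bc2["source"],
    }

a = {"bond": (1, 3), "nsbc": (1, 1), "bond_type": 0, "source": "C2Q1Q27"}
b = {"bond": (1, 1), "nsbc": (1, 1), "bond_type": 1, "source": "I4"}
print(merge_bcs(a, b))
```

The two examples given in the text correspond to merge_bcs returning a range of (1, 1) for 1 to 3 bonds against 1 to 1 bond, and None for 2 to 3 bonds against 1 to 1 bond.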

 

Note:   Most of the BC’s can be combined with other BC’s (e.g., a COSY BC with an HMBC one) except for NOESY BC’s, which are treated differently.  NOESY BC’s can be combined only with other NOESY BC’s concerning the same protons attached to 13C signals.

 

Results: The results are written into the MDF as a record starting with the keyword “C13~~C13:” and following the keyword are some comments that are internally used by the program (Note: the user must not change these comments).  Every line thereafter represents a C-C bond constraint and for more details regarding the format of bond constraints, see Section 3.4.  The following is a transcript of the record:

 

C13~~C13: COSY-Y, NOESY-Y, HMBC-Y, INAD-N (Node sequence: C-13, C-13)

(3 - 25: 1 ~ 2; 0; 1 ~ 1)C2Q1Q27C3Q2Q27B13Q27B114Q1B115Q2

(9 - 15 19: 1 ~ 1; 0; 1 ~ 2)C4Q7Q11Q17B46Q11Q17

(9 - 8: 1 ~ 1; 0; 1 ~ 1)C5Q7Q6B39Q7B48Q6

       .

       .

       .

 

Tips: Running NMR-SAMS and SpecMan side-by-side provides a convenient way to inspect the original cross peaks when a bond constraint is mentioned in a dialog box, or when the user is editing the bond constraints in the MDF.  Fig. 6.4 illustrates how to keep track of the cross peaks from which a bond constraint is derived.

 

Figure 6.4 Schematic depicting how to keep track of the cross peaks from which a bond constraint (BC) has been derived.  Run NMR-SAMS and SpecMan side-by-side and from the comment field of the BC being verified, find the code of connectivities from which the BC was derived (“C3+66”, “Q18”, and “Q28” in this example).  This means that this BC was derived from COSY peaks #3 and #66, and HMQC peaks #18 and #28. With SpecMan, load the COSY peaks table and then click the ID’s of one of these cross peaks.  Upon clicking the ID’s, SpecMan will display the cross peaks in the 2D spectral window.
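The comment field of a C-C BC can also be decoded with a short script before looking up the cross peaks in SpecMan.  The Python sketch below assumes the letter codes seen in the transcripts of this chapter (C for COSY, Q for HMQC, B for HMBC, N for NOESY, I for INADEQUATE) and the “+” notation of Fig. 6.4 for several peaks of the same type; decode_sources is a hypothetical helper.

```python
import re

EXPERIMENT = {"C": "COSY", "Q": "HMQC", "B": "HMBC", "N": "NOESY", "I": "INADEQUATE"}

# One letter followed by one or more peak numbers joined by '+', e.g. "C3+66".
_CODE = re.compile(r"([CQBNI])(\d+(?:\+\d+)*)")

def decode_sources(comment):
    """Return a list of (experiment, [peak IDs]) in the order they appear."""
    decoded = []
    for letter, ids in _CODE.findall(comment):
        decoded.append((EXPERIMENT[letter], [int(i) for i in ids.split("+")]))
    return decoded

print(decode_sources("C3+66Q18Q28"))
```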

6.4.6.  Setting up Atom-Atom Connection Matrix (ACMX)

Once the user selects Analysis/Bond Constraints, Analysis/User-Defined Bond Constraints, or Analysis/User-Defined Environment Constraints, NMR-SAMS tries to generate an ACMX for each building block set based on the available building blocks, bond constraints, and environment constraints.  NMR-SAMS uses an atom-atom connection matrix (ACMX, also known as a free bond connection matrix) to represent the bonding possibilities between the constituent heavy atoms of the unknown molecule.  By default, the unambiguous bond constraints (which define one bond between exactly two atoms) are treated as fixed bonds, and the rest are used as constraints during the subsequent structure generation.

If there is only one set of building blocks, NMR-SAMS will automatically form some common functional groups based on 13C chemical shifts and elemental composition while setting up the ACMX.  These functional groups include >C=O, -COO-, -COOH, -COON<, -COONH-, -NO2, -OSO3Hn (n = 0 or 1), and -OPO3Hn (n = 0, 1, or 2).  Sometimes these automatically added functional groups are not reliable so the user is advised to check and modify them if necessary (see Section 7.2).

 

Results: For each building block set, a record starting with the keyword “ACMX: #x:” (where x is the sequential number of the ACMX) is written in the MDF.  The following is a transcript of such a record:

 

ACMX: #1:

(HETCON_FLAG = 0, CCBOND_FLAG = 1 1 1,  BC_WEIGHT = 48,

IDEAL_COSY = 1, H1MULT_FLAG = 1, MAX_GEN_ANBC = 3, FIX_BOND_FLAG = 1)

# 1. 6 0  0  1  1    1  0 2 1  0 1 0    3 31 31 32      0

# 2. 6 0  0  2  2    4  0 2 0  0 1 0    0               0

# 3. 6 2  0  3  3    2  0 2 0  0 1 0    0               0

       .

       .

       .

 

After setting up an ACMX, the first building block set is displayed along with the fixed bonds, if any.  If there are multiple ACMX's, a Building Block Browser will be displayed, and this browser enables browsing through the building block sets.  By default, atoms with satisfied valences are displayed in gray, and atoms with free bonds are displayed in blue and marked by an asterisk (*).  Bonds of unspecified type are displayed as dashed lines.  To highlight an atom so that it cannot be connected to a specific atom, select Display/Display Options/Show Disconnectivities and then click on the two atoms, and to display the Connection Table, select Display/Display Options/Connection Table.  The Connection Table lists building blocks, their associated chemical shifts, and the current bond constraints and environment constraints (see Chapter 10).  The ACMX's are not displayed but can be viewed in the MDF by selecting Edit/Master Data File.

 

Possible Errors: Depending on the situation, the following potential error messages appear during the set up of the ACMX:

 

·         Too many fixed bonds for a certain atom.  This means that either a long-range coupled COSY peak was mistakenly interpreted as a vicinal one, or the valence of this atom was set wrong.  In the former case, mark the long-range COSY connectivities in the .nmr file (see Section 6.4.1) and reselect Analysis/Bond Constraints.  In the latter case, modify the valence of the atom according to Section 4.4.

·         Too many double bonds for a certain atom.  The minimum and maximum number of attached double bonds of each atom are determined during the interpretation of the MF (see Section 4.4).  If this happens, modify the corresponding entries and repeat the step.

·         Too many triple bonds for a certain atom.  The minimum and maximum number of attached triple bonds of each atom are determined during the interpretation of the MF (see Section 4.4).  If this happens, modify the corresponding entries and repeat the step.

·         Too many free bonds.  The number of free bonds, n_free_bond, can be calculated as follows:

n_free_bond = Σvalence - ΣH - 2 × Σfixed_bond

where Σvalence, ΣH, and Σfixed_bond are the sums of valences of the heavy atoms, the constituent protons, and the fixed bonds (double and triple bonds multiplied by 2 and 3), respectively.  n_free_bond is one of the major factors that determines the complexity of the structure generation problem.  The current upper limit of the free bonds is 220.  If n_free_bond overflows, the user can manually add some known bonds to a record starting from the keyword “ATOM~~ATOM:” in the MDF to reduce the number of free bonds (see Section 7.2).
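The free bond count can be estimated before running the analysis.  The Python sketch below evaluates the formula above; the function name and argument layout are hypothetical, and each fixed bond is passed as its bond order (1, 2, or 3) so that double and triple bonds are weighted as described.  The example uses 30 carbons, 3 oxygens, and 48 protons with the usual valences of 4 and 2.

```python
MAX_FREE_BONDS = 220  # upper limit quoted in the text

def n_free_bond(valences, n_protons, fixed_bond_orders):
    """Sum of valences, minus protons, minus 2 x sum of fixed bond orders."""
    return sum(valences) - n_protons - 2 * sum(fixed_bond_orders)

valences = [4] * 30 + [2] * 3          # 30 carbons, 3 oxygens
print(n_free_bond(valences, 48, []))   # no fixed bonds yet
print(n_free_bond(valences, 48, [2]))  # one fixed double bond, e.g. >C=O
```

Each fixed bond of order k removes 2k free bonds, k from each of the two atoms it joins, which is why adding known bonds is an effective way to stay under the limit.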

 


Chapter 7

2D Structure Generation

7.1 Overview

This chapter describes the 2D structure generation of NMR-SAMS.  The structure generation of NMR-SAMS starts from the ACMX described in the previous chapter.  Prior to structure generation, the user can add known bonds, edit fixed bonds derived by the program, add environment constraints, and check the parameters for structure generation.  Then, the structure generator of NMR-SAMS will assemble the building blocks into complete structures that are compatible with all available spectral and chemical constraints. 

 

The structure generation is based on heteroatoms and the carbon atoms labeled by 13C chemical shifts. Depending on the number of observed 13C peaks, the user can either perform complete structure elucidation or partial structure elucidation.  In some cases, such as those with a symmetric molecule or when the 13C spectrum shows severe overlap, partial structure elucidation is performed based on the limited carbon atoms labeled by the well-resolved 13C chemical shifts, as well as the constituent heteroatoms.  The remaining carbon atoms, called ignored atoms, are excluded during structure generation.  The resulting structure is usually a partial structure, with some dummy bonds that are supposedly linked to the ignored moieties.  Fig. 7.1 shows an example of partial structure elucidation:

Figure 7.1.  Illustration of the partial structure elucidation of paclitaxel using NMR-SAMS.  Both the 1H and 13C resonances of the three phenyl groups are difficult to resolve and are thus ignored.  Using only the well-resolved portions of the 1D and 2D spectra, NMR-SAMS generates the core structure, with three dummy bonds (represented as the bold arrows), linked to the ignored phenyl groups.

Note that compared to complete structure elucidation, partial structure elucidation has the following limitations:

·         An ignored moiety is assumed to be linked to the core structure by a single bond (i.e., a dummy bond is of single bond type).  Only one dummy bond is automatically added on each atom.  In the case where an ignored atom is connected to the core structure by a multiple bond, the user is advised to add the remaining dummy bonds as user-defined bond constraints prior to structure generation (see Section 7.2).

·         The user must provide the number (or a range) of the dummy bonds to be fixed in a generated structure, before performing the structure generation (see Section 7.4).

·         For efficient structure elucidation, as many user-defined bond/environment constraints as possible need to be supplied to reduce the search space, and thereby speed up the convergence of structure generation (see Sections 7.2 and 7.3).

 

The structure generation-related steps corresponding to the second group of options in the Analysis menu are shown below:

7.2 User-Defined Bond Constraints

Command: Analysis/User-Defined Bond Constraints.

Description: This procedure is used to define known structural fragments as user-defined bond constraints between the building blocks.

 

 

 

 

 

 

 

 

 

Figure 7.1.  Screen snapshot showing the process of defining user-defined bond constraints. The building blocks and fixed bonds are displayed in the main graphics window and the User-Defined Bond Constraints palette provides tools to add or remove bonds between the building blocks.

Upon selecting Analysis/User-Defined Bond Constraints, a User-Defined Bond Constraints palette will be displayed (Fig. 7.2).  To add a bond, select 'Add' from the User-Defined Bond Constraints palette and then select the type of bond to add (Single, Double, Triple, or Unknown).  If the type of bond is not certain, select 'Unknown.'  Then, click on two building blocks to add a bond between them (if a building block is clicked by mistake, click on it again to de-select it).  The new bond will be checked against available constraint information and if any inconsistency is detected, the bond will be rejected.

 

To delete a bond, select 'Delete' from the User-Defined Bond Constraints palette and then click on two atoms to delete the bond between them.  When an NMR-derived bond is deleted, NMR-SAMS will ask whether the bond should be prevented from being added again, as shown in the following dialog box:

 

 

To forbid the bond from being added in the future, select 'Yes' and NMR-SAMS will add a pseudo bond constraint to prevent the atoms from being connected.  This pseudo bond constraint can be removed by selecting 'Delete' and then clicking on the two atoms.  To allow the bond to be added in the future, select 'No.' 

 

To modify the bond type between two atoms, select 'Add' and the desired bond type and then click on the two atoms of interest.  NMR-SAMS will ask the user to confirm the modification of the bond constraint, as shown in the following dialog box.  

 

 

Click 'Yes' to modify and the bond will be modified with the attributes of the previous bond that had been entered.

 

For Partial Structure Elucidation, a dummy bond can be added to an atom that is known to be connected to a certain ignored moiety.  To do this, select 'Add' and check the 'Dummy Bond' box in the User-Defined Bond Constraints palette, click on the desired atom and the dummy bond will be displayed as a tilde (~).  To delete a dummy bond, select 'Delete', check the 'Dummy Bond' box, click on the desired atom and the tilde (~) will disappear.  Note that a dummy bond is of single bond type and if the user adds two dummy bonds to the same atom, they could be two single bonds, or one double bond.

 

Once finished with the addition or removal of bonds, select 'OK' and NMR-SAMS will crosscheck all of the bond constraints, including the user-defined and NMR-derived bond constraints.  Then the ACMX will be regenerated (see Section 6.4.6). 

 

Results: The user-defined bond constraints are saved as a record starting with the keyword “ATOM~~ATOM:” in the MDF.   The previous ACMX will be overwritten by the updated one.  Each updated ACMX is saved as a record starting with the keyword “ACMX: #x:” where x is the sequential number of the ACMX.  If a complete structure has been obtained, it is saved in a record starting with the keyword “RESULTS:”.  The following is a transcript of such a record of user-defined bond constraints:

 

ATOM~~ATOM:

(9 - 8: 1 ~ 1; 0)G

(14 - 8: 1 ~ 1; 0)G

(19 - 9: 1 ~ 1; 0)G

      

Limitation:  Currently the interactive input of user-defined bond constraints is limited to bonds between two assigned atoms.  A general bond constraint that might have ambiguous atoms or bond separation must be manually appended to the bond constraints under the keyword “ACMX:”.  If there are multiple ACMX’s, the user will have to manually append each individual ACMX.

7.2.1.  Interactive Structure Generation

In addition to being able to add known fragments, the user can also interactively complete the structure generation process starting from either a building block set or from a previously generated substructure.

 

To start from a building block set, display the building blocks by selecting Display/Building Blocks and Fixed Bonds and then select the appropriate building block set if there are multiple sets.  To start from a substructure, display the previously generated substructure by selecting Display/Generated Structures, and then select an appropriate substructure as a starting point.  Then, select Analysis/User-Defined Bond Constraints to add/delete/modify bonds between the building blocks until a complete structure is obtained.

 

This is analogous to manually assembling a structure on paper, but it has the advantage of checking the consistency with the spectral and chemical constraints on the fly.  Moreover, the interaction between the displayed atoms and the bond constraints (Fig. 7.2) helps the user to identify potential bonds to add to selected atoms. 

 

Figure 7.2.  Illustration of interaction between the building blocks and the Connection Table.  By clicking an atom in the main graphics window, its associated chemical shifts and relevant bond constraints are highlighted in the Connection Table.  Alternatively, by clicking an entry in the Connection Table, the relevant atom(s) in the main graphics window will be highlighted.

Note:  While adding user-defined bond constraints, the user needs to double-click an atom to highlight its relevant bond constraints in the Connection Table; otherwise, a bond will be added between this atom and the next atom that is clicked.

 

Once a complete structure has been obtained, NMR-SAMS congratulates the user with the following message:

 

Click 'OK' to this message and then click 'OK' in the User-Defined Bond Constraints palette.  NMR-SAMS will prompt the user to save the completed structure.

7.3 User-Defined Atom Environment Constraints

Command: Analysis/Atom Environment Constraints. 

Description:  This procedure is used to define the known structural information as atom environment constraints (EC).  An EC defines the number of times that a certain type of atom (with specific/non-specific bond type) is the immediate neighbor of a specific atom (focus atom).  For an EC, the user does not need to know the numbering of the neighboring atom.  For example, since it is difficult to distinguish the two different situations illustrated in Fig. 7.3, the user is not able to enter user-defined bond constraints, but can instead enter the information as two EC's requiring that both C-1 and C-2 have exactly one oxygen as a neighbor.

Figure 7.3.  A situation where it is difficult to predict if C-1 and C-2 are connected to the same oxygen atom (a) or to different oxygen atoms (b).  This can be defined as two environment constraints: (1 - O: 1 ~ 1; 1) and (2 - O: 1 ~ 1; 1). 

In the MDF, an EC is represented as a line in the following format:

 

(focusAtom - neighborElement: minOccurrence ~ maxOccurrence; bondType)

where

focusAtom is the ID of the focus atom,

neighborElement is the element symbol of the neighboring atom(s) under consideration,

minOccurrence and maxOccurrence are the minimum and maximum occurrences of the neighboring atom under consideration, and

bondType is the type of bond between the focus atom and the neighboring atom under consideration. bondType can be 0 for unspecified, 1 for single, 2 for double or 3 for triple.  If the bond is unspecified, it will be treated as all types of bonds.
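For readers who post-process the MDF outside NMR-SAMS, the EC line format above can be read with a short script.  The following Python sketch is an illustration only and is not part of NMR-SAMS; the names EnvConstraint and parse_ec are hypothetical, and the pattern assumes the spacing seen in the transcripts of this manual.

```python
import re
from typing import NamedTuple


class EnvConstraint(NamedTuple):
    focus_atom: int        # ID of the focus atom
    neighbor_element: str  # element symbol of the neighboring atom(s)
    min_occurrence: int
    max_occurrence: int
    bond_type: int         # 0 = unspecified, 1 = single, 2 = double, 3 = triple


# Matches "(focusAtom - neighborElement: min ~ max; bondType)".
_EC_PATTERN = re.compile(
    r"\(\s*(\d+)\s*-\s*([A-Z][a-z]?)\s*:\s*(\d+)\s*~\s*(\d+)\s*;\s*([0-3])\s*\)"
)


def parse_ec(line: str) -> EnvConstraint:
    """Parse one EC line; raise ValueError if it does not fit the format."""
    match = _EC_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an EC line: {line!r}")
    focus, element, lo, hi, bond = match.groups()
    ec = EnvConstraint(int(focus), element, int(lo), int(hi), int(bond))
    if ec.min_occurrence > ec.max_occurrence:
        raise ValueError(f"minimum exceeds maximum in: {line!r}")
    return ec


ec = parse_ec("(4 - O: 1 ~ 1; 0)")
print(ec.focus_atom, ec.neighbor_element, ec.bond_type)
```

The sketch rejects a line whose minimum exceeds its maximum, which is a check added here for convenience and not a documented behavior of the program.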

 

Relevant Operations:

The following 'Edit Atom Environment Constraints' dialog box appears after selecting Atom Environment Constraints from the Analysis menu. The current EC's will be displayed in the dialog box.

 

 

To add an EC, type the ID of the focus atom as 'Focus Atom ID' and then type the element symbol of the neighboring atom under consideration as the 'Neighboring Element'.  Select the 'Bond Type' (unspecified covers all types of bonds) and then type the 'Minimum' and 'Maximum' occurrences of such neighboring atoms.  Finally, click 'Add' and the newly defined EC will be listed in the Atom Environment Constraints table.

 

To modify an EC, click on an EC from the list and the corresponding entries will be updated accordingly.  Then, type the new values for 'Range of Occurrence', and click 'Add' to update the EC.  Note that if the user changes the 'Focus Atom ID', 'Neighboring Element', or 'Bond Type', then a new EC will be added.

 

To delete an EC, click on an EC from the list and then click 'Delete' and it will be removed from the table.

 

After completing the EC alterations, click 'OK' and NMR-SAMS will set up the ACMX again with the updated Environment Constraint information.   

 

Examples:

(1 - N: 0 ~ 1; 0)        requires atom #1 to be linked to no more than one N atom.

(2 - N: 1 ~ 1; 3)    requires atom #2 to be linked to exactly one N with a triple bond. Nitrogen atoms with other types of bonds are not limited by this EC.

(3 - C: 1 ~ 1; 2)    requires atom #3 to be linked to exactly one C with a double bond. Carbon atoms with other types of bonds are not limited by this EC.
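The meaning of the three examples above can be sketched in Python as a neighbor count on a candidate structure.  This is a minimal illustration under stated assumptions, not the program's implementation: the structure representation (a dictionary of element symbols and a list of bonds) and the function names are hypothetical, and the fragment used is invented for the example.

```python
def count_matching_neighbors(elements, bonds, focus, neighbor_element, bond_type):
    """Count neighbors of `focus` that have the given element symbol.

    elements: {atom_id: element symbol}
    bonds:    list of (atom_a, atom_b, bond_order) tuples
    bond_type 0 (unspecified) matches every bond order.
    """
    count = 0
    for atom_a, atom_b, order in bonds:
        if focus == atom_a:
            other = atom_b
        elif focus == atom_b:
            other = atom_a
        else:
            continue
        if elements[other] == neighbor_element and bond_type in (0, order):
            count += 1
    return count


def satisfies_ec(elements, bonds, ec):
    """ec is (focusAtom, neighborElement, minOccurrence, maxOccurrence, bondType)."""
    focus, neighbor_element, lo, hi, bond_type = ec
    n = count_matching_neighbors(elements, bonds, focus, neighbor_element, bond_type)
    return lo <= n <= hi


# Hypothetical fragment: N(4)-C(1)-C(2)#N(5), with C(3) not yet bonded.
elements = {1: "C", 2: "C", 3: "C", 4: "N", 5: "N"}
bonds = [(1, 4, 1), (1, 2, 1), (2, 5, 3)]

print(satisfies_ec(elements, bonds, (1, "N", 0, 1, 0)))  # atom 1 has one N neighbor
print(satisfies_ec(elements, bonds, (2, "N", 1, 1, 3)))  # atom 2 has one triple-bonded N
print(satisfies_ec(elements, bonds, (3, "C", 1, 1, 2)))  # atom 3 has no C=C bond yet
```

Because only neighbors with the requested bond type are counted when bondType is nonzero, neighbors bonded in other ways do not affect the result, which mirrors the remark in the second and third examples.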

Results: The user-defined environment constraints are saved as a record starting with the keyword “ENVIRONMENT:” in the MDF.  Any previous ACMX's will be overwritten by updated ones.  Each updated ACMX is saved as a record starting with the keyword “ACMX: #x:” where x is the sequential number of the ACMX.  The following is a transcript of a record of environment constraints:

 

ENVIRONMENT:

(4 - O: 1 ~ 1; 0)

       .

       .

       .

 

Note: NMR-SAMS does not crosscheck the EC's for consistency with the current structural state and the bond constraints, so it may accept an EC that could potentially conflict with the current structure state or bond constraints.  In addition, NMR-SAMS will not crosscheck the EC's for mutual consistency.  The user is urged to use EC's with caution, since a wrong EC could result in missing a correct structure due to the fact that EC's cannot be violated in the generated structure.  Also, please note that for partial structure elucidation, the user is not permitted to add an EC on an ignored atom.

7.4 Structure Generation

Command: Analysis/Generate 2D Structures.

Description: In this step, NMR-SAMS searches all possible ways to assemble structural building blocks into complete structures.  The resulting structures or substructures should be compatible with all available spectral and chemical constraints, as long as the number of violated constraints is within the user-defined limits.

 

When there are multiple ACMX's, structure generation will be performed using each one of them, one at a time, and the resulting structures will be saved in a structure file (.str).  By changing the control parameters, the user can opt to save intermediate substructures along with complete structures and to limit the maximum number of structures.

 

For Partial Structure Elucidation Only: During partial structure elucidation (PSE), the structure generator will try to generate the largest substructure consistent with available data.  In some instances, a dummy bond is used to satisfy a free bond by assuming that it is connected to one of the ignored atoms.  Upon selecting Analysis/Generate 2D Structures, the user is prompted to define a range of dummy bonds to be fixed in the generated structure.  Note that this does not include the dummy bonds that the user has added as user-defined bond constraints (See Section 7.2).  For example, there are three phenyl groups in the paclitaxel molecule and these groups are not included in the structure elucidation process, as shown below.  Hence, the user needs to type "3 3" in order to add exactly three dummy bonds to the generated structure (see Fig. 7.1).  When the number of dummy bonds is unknown, the user can type a range (e.g. “0 3”), and many more candidate structures will be generated. 

 

 

During structure generation, the following 'Structure Generation in Process' dialog box is displayed showing the initial state and the current results of the structure generation process:

 

Figure 7.4. The Structure Generation in Process dialog box of NMR-SAMS.  The first line indicates the current ACMX being used (if there are multiple ACMX's, all of them will be used), and listed under 'Initial Problem State' are the MF, the number of free bonds, and the unsatisfied bond constraints.  These values define the complexity of the structure generation problem.  The larger the number of constituent atoms and free bonds, and the smaller the number of bond constraints, the more complex the structure generation process will be.  Listed under 'Results' are the current number of generated structures, the number of chemically unique structures (in parentheses), the number of retained substructures, and the elapsed computation time in minutes.

The dialog box is updated at a frequency based on the parameter DISP_CMPLT_DELAY (the default value is 0.1 minute).

 

Depending upon the complexity of the problem, the computational time required for structure generation can range from seconds to hours.  To abort the structure generation at any time, click the 'Stop' button. 

 

Relevant Parameters:

As described in Section 3.6, structure generation is a complex and time-consuming problem, so NMR-SAMS provides the user with a set of parameters to control the speed and completeness of structure generation.  Initially, it is suggested that the user utilize the default values of these parameters that have been optimized for heuristic search and have been proven to be effective for many structure elucidation problems.  However, depending upon individual results, the user can try different parameter value combinations to accelerate the structure generation process or to make the structure generation more exhaustive.

 

Commonly used parameters relevant to structure generation can be modified by selecting Edit/Parameters/2D Structure Generation and the dialog box illustrated in Fig. 7.5 will appear (with default parameters shown).  The user can select or modify the parameters listed in the dialog box, or restore the default values by selecting 'Default'.  Click 'OK' to apply any changes, or click 'Cancel' to discard them so that the parameters revert to their original settings. 

 

Figure 7.5. The Edit Parameters for Structure Generation dialog box, with the parameters relevant to structure generation checked.  See Appendix IV for usage of these parameters.

Tips:  After completing structure generation, the parameters used for the calculations are written into the log file.  The log file can be viewed by selecting Edit/Log File.

 

Results:  The results of structure generation are complete structures and substructures, with assignment of 1H and 13C chemical shifts.  A redundant structure usually implies an alternative assignment of 13C chemical shifts.  In the MDF, the results of structure generation are summarized in a record starting with the keyword “RESULTS:”.  The following is a transcript of such a record:

RESULTS: 

For ACMX #1, 12 structures were generated and 12 of them are chemically unique.

  38 substructures were retained.

 Actually Used SAT_BC_RATE = 1.000.

 

SUMMARY:  50 (sub)structures saved in file

C:\Spectrum2001\Data\NMR-SAMS\Taxol\Paclitaxel-test.str.

N_STR = 12, N_PRO_STR = 12, N_SS = 38, N_PRO_SS = 219274, MIN_GEN_BOND = 31

Time for Structure generation: 1447.90/1502s.

 

where N_STR is the number of chemically unique complete structures generated, N_PRO_STR is the total number of complete structures generated, N_SS is the number of retained largest substructures, N_PRO_SS is the total number of generated substructures, and MIN_GEN_BOND is the minimum number of generated bonds in the retained substructures. The CPU and elapsed times for structure generation are reported in seconds.
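When many runs need to be compared, the counters described above can be collected from the “RESULTS:” record with a small script.  The following Python sketch is illustrative only; the function names are hypothetical, the two input strings are copied from the transcript above, and the assumption that the first time is the CPU time and the second the elapsed time follows the order in which the text names them.

```python
import re


def parse_counters(line):
    """Return the 'NAME = integer' pairs found in a line as a dict."""
    return {name: int(value)
            for name, value in re.findall(r"([A-Z][A-Z_]*)\s*=\s*(\d+)", line)}


def parse_times(line):
    """Return (cpu_seconds, elapsed_seconds) from the 'Time for ...' line."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*/\s*(\d+(?:\.\d+)?)\s*s", line)
    if match is None:
        raise ValueError(f"no timing information in: {line!r}")
    return float(match.group(1)), float(match.group(2))


counters = parse_counters(
    "N_STR = 12, N_PRO_STR = 12, N_SS = 38, N_PRO_SS = 219274, MIN_GEN_BOND = 31"
)
cpu, elapsed = parse_times("Time for Structure generation: 1447.90/1502s.")

# Complete structures that duplicate an already counted unique structure.
redundant = counters["N_PRO_STR"] - counters["N_STR"]
print(counters["N_SS"], redundant, cpu, elapsed)
```

For the transcript above the difference N_PRO_STR - N_STR is zero, which agrees with the statement in the record that all 12 generated structures are chemically unique.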

 

The generated structures/substructures are stored in a structure file (.str) as connection tables.  For the graphical display of the structures/substructures, see Chapter 10.

Possible Errors:

If structure generation has finished without the generation of any structures, or if the candidate structures do not look correct, then review the following and repeat the structure generation process:

·         If the program exceeds the upper limit of allowed structures, some potential structures could be lost.  Increase MAX_REC_STR and repeat structure generation.  See parameter MAX_REC_STR in Appendix IV. 

·         Check the peak picking results and look for errors related to long-range coupled DQF-COSY peaks, 1H multiplicity, and the usage of negative information of DQF-COSY data.  Such errors could cause the failure of structure generation (see the usage of parameters FIX_BOND_FLAG, H1_MULT_FLAG and IDEAL_COSY in Appendix IV).

·         Increase the Maximum Limit for Bond Constraint Violation to allow some of the bond constraints to be violated during structure generation.  See parameter MAX_ERR_BC in Appendix IV.

·         Increase the Additional Tolerance for using 13C chemical shifts.  See parameter ADD_C13_RNG in Appendix IV.

·         Allow the program to search a larger solution space by decreasing the Ending Value of Rate of Bond Constraint Satisfaction, and/or increasing the value of Average Number of Possibilities to Search for Each C-C Bond.  See parameters SAT_BC_RATE and N_FBX_STEP in Appendix IV.  The user can also select different Search Criteria for Structure Generation.  The search space will be automatically increased if Basic or Exhaustive is selected as the Search Criteria for Structure Generation.

·         Check the information in the log file to find potential problems.  Note that in verbose mode the program will display more instructions and warning messages.  To use verbose mode for the entire analysis procedure, select Edit/Parameters/NMR Interpretation and turn on Verbose Mode from the 'Edit Parameters for NMR Data Interpretation' dialog box and then repeat the spectral interpretation and structure generation process.

·         For partial structure elucidation, make sure to define the correct number of dummy bonds (or the correct range for dummy bonds) to be fixed in a generated structure.  Pay close attention to the limitations of partial structure elucidation described in Section 7.1.

 

If the structure generation process appears to be endless, and no complete structure is generated prior to interrupting the process with the 'Stop' button, review the following and repeat the structure generation process:

 

·         Make sure the default parameters for structure generation are used.  To set the default values, select Edit/Parameters/2D Structure Generation and then click 'Default' from the dialog box.

·         If the molecule is large (e.g. > 40 heavy atoms), input as many known fragments as possible by selecting Analysis/User-Defined Bond Constraints.  It is especially important to input bond constraints concerning heteroatoms to improve the efficiency of structure generation.  Environment constraints can also be added by selecting Analysis/Atom Environment Constraints.

·         Limit the size of rings to whatever is appropriate (e.g. 5 and 6-membered rings).  See parameters MIN_RING_SIZE and MAX_RING_SIZE in Appendix IV.

·         Use an intermediate structure as the starting point for structure generation.  To do this, make sure to choose to record the intermediate substructures (see parameter REC_SS_FLAG in Appendix IV), interrupt the structure generation process after about 10 minutes, and then save the retained substructures.  Select the substructure closest to the one that seems to have converged in the right direction, and then select Analysis/User-Defined Bond Constraints to modify it.  Finally, repeat structure generation starting from this substructure. 

·         Propose a target structure, and let NMR-SAMS do the resonance assignment.  Resonance assignment is usually much faster than structure generation, and if full assignment is not achieved, increase MAX_ERR_BC by 1 or 2 to allow more bond constraints to be violated.  If full assignment is still not achieved, check the largest partial assignment to compare it with the proposed structure to identify any inconsistencies between the proposed structure and the spectral data.  See Chapter 8 for more detail.

 

Since structure generation is a combinatorial problem, it is normal to see long computation times for complex molecules, especially when the spectral constraints are not sufficient to converge the structure generation process rapidly.  If the above suggestions do not help, try to interactively build the structure  (see Section 7.2.1), since NMR-SAMS will check each bond for consistency with spectral data during the interactive building of a molecule.  If an inconsistency is found, the error message will help the user to trace the potential error in the peak picking results, in the structure, or in the parameter settings.

 


Chapter 8

Resonance Assignment

8.1 Overview

This chapter describes target structure-based resonance assignment using NMR-SAMS.  As described in Chapter 7, each structure generated during structure generation has its own 13C and 1H assignments.  If the user has a priori knowledge about the structure and can provide some proposed structures, it is worthwhile to skip the structure generation step and try NMR-SAMS' resonance assignment option for verification of user-proposed structures.

 

Unlike other methods of resonance assignment that are based on predicting 13C or 1H chemical shifts from large spectral databases, NMR-SAMS uses mainly 2D NMR-derived connectivity information for resonance assignment.  During the assignment process, NMR-SAMS first predicts a coarse 13C chemical shift range for each carbon atom (see Section 3.5) and then obtains tentative assignments.  Next, the 2D NMR-derived connectivity information is used to improve these tentative assignments to the final resonance assignments.   In this manner, the final assignments of NMR-SAMS are much more reliable than assignments based solely on predicted chemical shifts.

 

Resonance assignment is much faster than de novo structure generation, even when bond constraints can be violated.  If the structure generation process is going very slowly, the user is urged to propose a couple of candidate structures (based on a priori knowledge of the system) to complete the resonance assignment.  By doing so, NMR-SAMS will assist the user in identifying possible inconsistencies between the structure and the spectral data.  After correcting the errors in the spectral data, repeat the structure generation process to generate all possible structures consistent with the spectral data.  This should help to prevent the omission of any potential structures. 

 

The resonance assignment-related steps correspond to the following group of options on the Analysis menu:

8.2 Input of the Target Structure

Command: Analysis/Input Target Structure.

Description:  In this step the user can input a proposed structure as the target structure for resonance assignment.  If the proposed structure has been built with third-party software, first save it in .mol, .mdl or .sdf format, and then select Analysis/Input Target Structure/Import MDL to import the structure.  A structure can also be built by selecting Analysis/Input Target Structure/Build Molecule.  As soon as the target structure has been imported or drawn, NMR-SAMS will automatically set up an assignment matrix to be used for the subsequent resonance assignment. 

8.2.1.  Building a Target Structure in NMR-SAMS

To input the target structure interactively, select Analysis/Input Target Structure/Build Molecule, and this will bring up the following molecular builder palette for interactive sketching of the molecule.

 

 

The user may need to click 'Clear' to remove a pre-existing structure, or the user can proceed with building the structure starting with the currently displayed structure.  To sketch a target structure, select 'Add', 'Atom', and 'Continuous Mode'.  Leave the 'Element' as 'C', and 'Ambiguous' unchecked (this option is reserved for defining a substructure, and is currently not used).  Next, click in the main graphics window and an atom will be drawn at that location.  Subsequent clicking will add additional atoms connected via bonds (single, by default).  To temporarily turn off the addition of bonds between the atoms, click the right mouse button and bonds will not be displayed between any future added atoms.  To add separate atoms, uncheck 'Continuous Mode'.  

 

To modify a pre-existing structure, select 'Modify' and 'Atom' and then type the desired element symbol after 'Element'.  The user can also move the slider to change the default valence of the atom.  Next, click on the atom to change and the atom will appear with its modified attributes.  To modify a bond, select 'Modify' and 'Bond' and then select the desired bond type.  If the connectivity or attached protons of an atom are uncertain or unknown, the user can select 'Ambiguous'.  Then, click on the two associated atoms of the bond to modify and the bond will be modified. 

 

To delete an atom, select 'Delete' and 'Atom' and then click on the atom to remove.  To delete a bond, select 'Delete' and 'Bond' and then click on the two associated atoms of the bond to delete. 

 

The molecular formula, which gives the elemental composition of the molecule, is displayed in the upper left corner of the main graphics window.  To display or hide the molecular formula, select and reselect Display/Display Options/Molecular Formula.  After building the target structure, click 'OK' from the molecular builder to accept the structure.  

 

The elemental composition of the target structure must be identical to that of the unknown structure.  When a target structure built with the molecule builder is accepted, the following dialog box will prompt the user to save the target structure: 

 

 

Click 'OK' and then select File/Export/Structures and the following dialog box will appear:

 

 

The target structure can be exported into a .mol, .mdl or .sdf file named xxx000.*, where xxx is the root name of the working data set.  

 

Results: Results are saved in the MDF file following the keyword “TSS:”.  Following the keyword, the number of heavy atoms in the target structure is listed.  The second, third and fourth lines are annotations.  In the remaining lines, the following are specified for a heavy atom in the target structure:  ID, element symbol, valence number, number of free valences, connectivity (i.e., the number of neighboring heavy atoms, the neighboring atoms and bonds) and predicted 13C chemical shift ranges.  For more details about the prediction of 13C chemical shift, see Section 3.5.

 

TSS: n_atom = 33

------------------------------ Connection table ---------------------

 #At. Symb. Val. Ambi?  Conn. Neighbors and bonds     Pred. C13 range

----------------------------------------------------------------------

 # 1.   C    4   0    3  (32:1) (31:2) ( 5:1)            151.0 - 187.0

 # 2.   C    4   0    3  ( 9:1) ( 3:2) (25:1)            100.0 - 170.0

 # 3.   C    4   0    1  ( 2:2)                           80.0 - 159.0

 # 4.   C    4   0    3  (22:1) (12:1) (33:1)             42.0 - 109.0

 # 5.   C    4   0    4  (18:1) ( 8:1) (15:1) ( 1:1)      27.0 - 100.0

 # 6.   C    4   0    3  (26:1) (12:1) (16:1)             18.0 -  75.0

 # 7.   C    4   0    3  (11:1) (24:1) (16:1)             18.0 -  75.0

 # 8.   C    4   0    3  ( 9:1) (14:1) ( 5:1)             18.0 -  75.0    

      

Note:  In addition to element composition, the valences of the atoms in the target structure must be identical to those of the atoms in the unknown structure (refer to Section 4.4 for a description of the valences of the atoms in the unknown structure).
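The atom lines of the “TSS:” record shown above can be read programmatically.  The following Python sketch is an illustration under stated assumptions and is not part of NMR-SAMS: the function name parse_tss_line is hypothetical, the field layout is inferred only from the transcript above, and lines without a predicted 13C range are assumed to be possible and yield None for that field.

```python
import re

# "# 1.   C    4   0    3" : atom ID, symbol, valence, Ambi? flag, Conn. count
_HEAD = re.compile(r"^\s*#\s*(\d+)\.\s+([A-Z][a-z]?)\s+(\d+)\s+(\d+)\s+(\d+)")
# "(32:1)" or "( 5:1)" : neighbor atom ID and bond type
_NEIGHBOR = re.compile(r"\(\s*(\d+)\s*:\s*(\d+)\s*\)")
# "151.0 - 187.0" at the end of the line : predicted 13C range in ppm
_RANGE = re.compile(r"(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*$")


def parse_tss_line(line):
    """Parse one atom line of the connection table into a dict."""
    head = _HEAD.match(line)
    if head is None:
        raise ValueError(f"not a TSS atom line: {line!r}")
    atom_id, symbol, valence, ambi, n_conn = head.groups()
    neighbors = [(int(a), int(b)) for a, b in _NEIGHBOR.findall(line)]
    if len(neighbors) != int(n_conn):
        raise ValueError("Conn. column disagrees with the neighbor list")
    rng = _RANGE.search(line)
    return {
        "id": int(atom_id),
        "symbol": symbol,
        "valence": int(valence),
        "ambi": int(ambi),
        "neighbors": neighbors,  # list of (neighbor ID, bond type)
        "c13_range": (float(rng.group(1)), float(rng.group(2))) if rng else None,
    }


atom = parse_tss_line(
    " # 1.   C    4   0    3  (32:1) (31:2) ( 5:1)            151.0 - 187.0"
)
print(atom["neighbors"], atom["c13_range"])
```

The cross-check between the Conn. column and the number of parsed neighbors is a convenience added in this sketch to catch truncated or hand-edited lines.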

8.2.2.  Importing a Target Structure

To input a target structure (.mol, .mdl, or .sdf), select Analysis/Input Target Structure/Import Structure File.  A file browser will appear allowing the user to select the target structure file and then the target structure will be displayed in the main graphics window.  If there have been any previous assignment results saved in the target structure file, then the following 'Analog-Based Assignment' dialog box will be displayed:

 

 

When the box next to 'Analog-Based Assignment' is checked, the previously assigned 13C chemical shifts will be compared with the current 13C chemical shifts.  Using the matching tolerance (default value: 3.0 ppm), NMR-SAMS compares the carbon chemical shifts of the assigned analog molecule with the corresponding 13C chemical shifts of the current molecule to complete the first level of assignments for carbon atoms.   Note that only the 13C chemical shift and multiplicity are considered.  1H chemical shifts and 2D connectivities are not considered.  So this function must be used with caution.  The user can further edit the tentative assignments using the Analysis/User-Defined Resonance Assignment option.
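The comparison described above can be pictured as a tolerance match on 13C chemical shift and multiplicity.  The following Python sketch only illustrates that idea and should not be read as the algorithm used by NMR-SAMS: the function name and data layout are hypothetical, multiplicity is represented here as the number of attached protons, the tolerance is assumed to be inclusive, and the shift values are invented.  How the program chooses among several candidates is not described in this manual, so the sketch simply lists them.

```python
def analog_candidates(analog, peaks, tolerance=3.0):
    """List, for each analog atom, the current peaks that could match it.

    analog: {atom_id: (shift_ppm, multiplicity)} from the assigned analog
    peaks:  list of (shift_ppm, multiplicity) for the current molecule
    Returns {atom_id: [indices into peaks]}.
    """
    candidates = {}
    for atom_id, (a_shift, a_mult) in analog.items():
        candidates[atom_id] = [
            index
            for index, (p_shift, p_mult) in enumerate(peaks)
            if p_mult == a_mult and abs(p_shift - a_shift) <= tolerance
        ]
    return candidates


# Invented values for illustration only.
analog = {1: (170.2, 0), 2: (75.1, 1), 3: (21.0, 3)}
peaks = [(171.5, 0), (73.0, 1), (26.4, 3), (76.9, 1)]

result = analog_candidates(analog, peaks)
print(result)
```

In this invented example atom 2 has two candidate peaks and atom 3 has none, which shows why tentative assignments obtained this way may need to be reviewed with the Analysis/User-Defined Resonance Assignment option.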

8.2.3.  Setting up the Assignment Matrix

After a target structure has either been imported or built, NMR-SAMS sets up a matrix that summarizes a preliminary assignment of the building blocks to the constituent heavy atoms in the target structure.

 

A building block (see Section 6.2) is assigned to a constituent atom in the target structure when the element type, valence number, attached protons, and δ13C (for carbons) of the building block match those of the constituent atom.  If an atom in the target structure does not have any matching building block, then NMR-SAMS points out that complete assignment will not be possible for this target structure.

 

Any initial assignments (manually entered by selecting Analysis/User-Defined Assignments) will be set as fixed assignments in the matrix.

 

Relevant Parameter: ADD_C13_RNG is used to increase the predicted 13C chemical shift ranges (see Appendix IV).  This parameter is useful when some odd 13C chemical shifts are expected for the proposed structure. 
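The matching rule described in this section can be sketched as follows.  This Python fragment is a minimal illustration under stated assumptions, not the program's code: the dictionary layout and function names are hypothetical, the atoms and shifts are invented, and the add_c13_rng argument is assumed to widen the predicted range symmetrically, which is only one plausible reading of ADD_C13_RNG.

```python
def build_assignment_matrix(target_atoms, blocks, add_c13_rng=0.0):
    """Return a[i][j] = 1 if target atom i can be assigned to building block j."""
    matrix = []
    for atom in target_atoms:
        row = []
        for block in blocks:
            ok = (atom["element"] == block["element"]
                  and atom["valence"] == block["valence"]
                  and atom["n_h"] == block["n_h"])
            if ok and atom["element"] == "C":
                lo, hi = atom["c13_range"]
                ok = lo - add_c13_rng <= block["c13"] <= hi + add_c13_rng
            row.append(1 if ok else 0)
        matrix.append(row)
    return matrix


def unassignable_atoms(matrix):
    """Indices of target atoms that match no building block at all."""
    return [i for i, row in enumerate(matrix) if not any(row)]


# Invented data for illustration only.
target_atoms = [
    {"element": "C", "valence": 4, "n_h": 0, "c13_range": (151.0, 187.0)},
    {"element": "C", "valence": 4, "n_h": 3, "c13_range": (18.0, 75.0)},
    {"element": "O", "valence": 2, "n_h": 1, "c13_range": None},
    {"element": "N", "valence": 3, "n_h": 2, "c13_range": None},
]
blocks = [
    {"element": "C", "valence": 4, "n_h": 0, "c13": 170.3},
    {"element": "C", "valence": 4, "n_h": 3, "c13": 20.9},
    {"element": "O", "valence": 2, "n_h": 1, "c13": None},
    {"element": "C", "valence": 4, "n_h": 0, "c13": 190.0},
]

matrix = build_assignment_matrix(target_atoms, blocks)
wider = build_assignment_matrix(target_atoms, blocks, add_c13_rng=5.0)
print(matrix, unassignable_atoms(matrix))
```

In the invented data the block at 190.0 ppm falls outside the predicted range of the first atom until the range is widened by 5 ppm, and the nitrogen atom has no matching block, so a complete assignment would not be possible.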

 

Results: The results are saved in a record starting with the keyword “AEMX:” in the MDF File.  The number of heavy atoms in the target structure, as well as the number of heavy atoms in the unknown molecule, is listed after the keyword.  The rest of the lines list the elements in the assignment matrix.  An element a[i, j] is 1 if the constituent atom i (in the target structure) can be assigned to building block j.  Otherwise it is 0.

 

AEMX: 23 23

# 1. 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 2. 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 3. 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 4. 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 5. 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

# 6. 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

       .

       .

       .
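For users who post-process the MDF with their own scripts, the following Python sketch reads an "AEMX:" record laid out as in the sample above.  It is not part of NMR-SAMS; the function names are hypothetical, and it assumes that the first number after the keyword is the target-structure atom count, that the rows are contiguous, and that the record ends with a blank line.

```python
import re
from typing import List, Tuple

# A matrix row looks like "# 2. 0 1 1 0 ..." (row number, then 0/1 elements).
ROW = re.compile(r"^#\s*(\d+)\.\s*(.*)$")


def parse_aemx(text: str) -> Tuple[int, int, List[List[int]]]:
    """Return (target atom count, unknown atom count, matrix rows)."""
    lines = iter(text.splitlines())
    for line in lines:
        if line.startswith("AEMX:"):
            n_target, n_unknown = (int(tok) for tok in line[len("AEMX:"):].split())
            break
    else:
        raise ValueError("no AEMX record found")

    matrix: List[List[int]] = []
    for line in lines:
        if not line.strip():  # a blank line closes the record
            break
        match = ROW.match(line.strip())
        if match:
            matrix.append([int(tok) for tok in match.group(2).split()])
    return n_target, n_unknown, matrix


def unassignable_atoms(matrix: List[List[int]]) -> List[int]:
    """1-based indices of target atoms whose row contains no 1."""
    return [i + 1 for i, row in enumerate(matrix) if not any(row)]
```

Any atom reported by unassignable_atoms has no candidate building block, so a complete assignment cannot be expected for that target structure.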

8.3 User-Defined Resonance Assignment

Command: Analysis/User-Defined Assignments.

Description:  After importing a target structure or performing automated resonance assignment, the user can further edit the current assignments by selecting Analysis/User-Defined Assignments.  The following 'User-Defined Resonance Assignment' palette, listing the current assignments of the 13C and 1H chemical shifts, will be displayed:

 

 

To assign a chemical shift to a carbon atom, select 'Add', click on an unassigned peak (an entry in which the Assigned Atom # is listed as ‘none’), and then click on the carbon atom to be assigned.  To remove the assignment of a chemical shift, check 'Delete', and then click on a peak in the palette or click on the corresponding atom in the main graphics window.  The peak will be removed from the atom.

 

When finished, click 'OK' and the current assignments will be updated so that a new assignment matrix is set up and the user can then perform automatic assignment with the modified assignments.  Note: when an assignment is added, the program will only check the 13C multiplicity and 13C chemical shifts; the 2D NMR connectivities will not be verified.

8.4 Resonance Assignment

Command: Analysis/Assign Spectra.

Description:  In this step, NMR-SAMS assigns the building blocks to the constituent heavy atoms in the target structure.  The assignment process is actually a structure generation process based on the additional constraints from the assignment matrix.  The results can either be complete or partial assignments.

When complete assignment is not possible, NMR-SAMS attempts to generate the largest number of partial assignments.  By comparing the partial assignments with the target structure, it may be possible to identify inconsistencies between the target structure and the spectral data.

The resonance assignment process starts from a selected atom.  If a complete assignment is obtained from the first starting atom, NMR-SAMS stops its search through the remaining possible mappings.  Otherwise, NMR-SAMS continues to loop through different starting atoms, repeating the assignment process in order to generate the largest number of partial assignments.  The 'Structure Generation in Progress' dialog box displays the number of starting atoms that have been tried so far, as shown below:

 

Since it can take a considerable amount of time to loop through every atom in the structure as a starting atom, the 'Stop' button can be selected to abort the process after a few starting atoms have been tried. 

 

Relevant Parameters:  As the resonance assignment process is very similar to structure generation, most of the parameters relevant to structure generation (see the Parameters for Structure Generation section in Appendix IV) are also relevant to resonance assignment.  The following parameters are exceptions, since the heuristic methods for structure generation are not used during resonance assignment:

GEN_FLAG,

SAT_BC_RATE, and

N_FBX_STEP.

 

Results: The resulting structures/substructures, representing complete/partial assignments are saved in the structure file (.str).  The number of partial assignments, along with some additional information is summarized in a record starting with the keyword “RESULTS:” in the MDF file.  The results of resonance assignment will be displayed as assigned chemical shifts on the target structure.  Since the resonance assignment is actually a structure generation process, the user could also display the target structure with assignment by selecting Display/Generated Structures.  See Section 10.6 for more details regarding the display options.

 

When partial assignment occurs, the unassigned atoms of the proposed target structure usually conflict with the spectral data (Fig. 8.1).  This provides the user clues on how to improve the proposed structure or how to correct any errors in the spectral data.  Once modifications have been made to the proposed structure or to the spectral data, the resonance assignment process can be repeated.

Figure 8.1. Verification of a proposed structure by resonance assignment.  The displayed partial assignment indicates that NMR-SAMS cannot assign peaks beyond C6 and C16, and the red box highlights the incorrect portion of the target structure where C27 should have been connected to C16 instead of to C13.  By comparing the partial assignment with the proposed structure, it is easy to identify the parts of the proposed structure that need to be revised.

Possible Errors:

If complete assignment is not obtained for the proposed target structure, it usually implies that the proposed structure is incompatible with the spectral data, and in such cases, NMR-SAMS will provide the following suggestions:

·         The stored partial assignments are the largest possible assignments.  The user can view these partial assignments using the 'Structure Browser' shown below:

 

 

By studying the partial assignments, the user can probably determine the inconsistencies between the target structure and the spectral data.  Repeat the assignment process after fixing the inconsistencies.

·         Look under the suggestions for structure generation (see Section 7.4 for details).  Note that resonance assignment shares most of the parameters used for structure generation, except GEN_FLAG, SAT_BC_RATE and N_FBX_STEP.  After adjusting the parameters, repeat the assignment process.


Chapter 9

Quick Enumeration/Elucidation

9.1 Overview

NMR-SAMS provides several fundamental tools for structure generation of a library of virtual compounds based on MF and known structural fragment information when NMR data is not available.   

9.2 MF-Based Structure Generation of Virtual Compounds

To generate a library of virtual compounds based on a known MF and additional known structural fragment information, perform the following steps:

 

1.        Click File/New to open a new working data set.  Type the MF into the 'Input Molecular Formula' dialog box.

2.        Select Edit/Parameters/Setting up ACMX.  Select 'Do Not Use' for the use of COSY negative information, unselect 'Use H-1 Multiplicity Information to Eliminate Inappropriate Bonds', unselect 'Extract Unambiguous 1-Bond Constraints as Fixed Bonds' and select 'Enabled for All' for the 'Bond Formation between Heteroatoms'.  Then, select 'OK'.

3.        Select Edit/Parameters/2D Structure Generation.  Select 'Exhaustive' for 'Search Criteria for Structure Generation', unselect 'Exclude Structures with Chemically Unstable Moieties', select '0' (for Unlimited) for the 'Maximum Candidate Structures to Store' and unselect 'Store Partially Completed Structures'.  Then, select 'OK'.  Note: to retain partial structures, select a large number for the 'Maximum Candidate Structures to Store' and select 'Store Partially Completed Structures'.

4.        Select Analysis/Building Blocks to generate all of the possible structural building blocks.   Note that the maximum number of building block sets is 500.  At this point, if additional fragment information is not available, skip to Step 8. 

5.        Select Analysis/Bond Constraints to set up the atom-atom connection matrices based on each of the building block sets.  Click 'OK' to the dialog box that appears noting the absence of NMR data.  

6.        Select Analysis/User-Defined Bond Constraints to add any molecular fragments, if known.  If not, the user can proceed to the next step. 

7.        Select Analysis/Atom Environment Constraints to add any environment constraints, if known.  If not, the user can proceed to the next step.

8.        Select Analysis/Quick Enumeration/Elucidation to generate structures.

 

The structures generated by this method can be exported as *.sdf, *.mdl or *.mol files and then used in conjunction with large databases of available compounds (for example, Available Chemicals Directory supplied by MDL and Available Compounds from ChemNavigator, etc.) to identify new lead molecules.

9.3 Quick Structure Elucidation

To perform the structure elucidation process (using default control parameters) in a streamlined manner without user intervention, perform the following steps:

 

1.        Select File/New to open a new working dataset (if the user has already created a *.nmr data file in SpecMan using available spectral data, then the user can select 'Start with an existing NMR data file' for the file type).  Next, type the MF into the 'Input Molecular Formula' dialog box. 

2.        If the NMR data file has not already been created, select File/Create NMR Data File to convert the SpecMan peaks tables into the NMR-SAMS NMR data file.

3.        Select Analysis/Quick Enumeration/Elucidation.  This will perform all of the steps related to spectral analysis and structure generation (see Chapters 6 and 7) using the default (or user-modified) parameters.  Note that in this mode, the user cannot input any user-defined bond constraints or environment constraints.  Also, the non-verbose mode is automatically selected so that the majority of the information and warning dialog boxes will not appear.  Note that the user will still be able to access these messages (listed in the log file) by selecting Edit/Log File.


Chapter 10

Graphical Display of Results

10.1 Overview

This chapter describes the operations related to the display of the intermediate and final results of NMR-SAMS.  The structure-related intermediate and final results of NMR-SAMS are displayed in the main graphics window, and the intermediate results, if any, are automatically displayed after every step in the following order of priority:

1.        Candidate structures/substructures (results of structure generation or resonance assignment)

2.        Target structure for resonance assignment

3.        Building blocks. 

 

If none of these results exist, the main graphics window will remain blank.  To display any of these results, or to change the display features, select from the following options in the Display menu:

 

10.2 Display of Structural Building Blocks

Command: Display/Building Blocks & Fixed Bonds.

Description:  Building blocks are displayed in the main graphics window, and if there are multiple sets of building blocks, a 'Building Block Browser' will be displayed.  This browser permits the user to browse through and select building block sets for display.  Each individual building block is displayed as a heavy atom with attached protons, if any.  A star '*' denotes that an atom has an unsatisfied valence (free bonds).  By default, an atom with free bonds is displayed in blue and an atom without any free bonds is displayed in gray.  For partial structure elucidation, ignored atoms are displayed in red.  The user can customize these colors by modifying the appropriate color entries in the nmrsams.ini file (see Section 2.3).  A fixed bond is displayed as a solid line or as a dashed line in the case where a bond has an unspecified bond type  (i.e., single, double, or triple).  For partial structure elucidation, a dummy bond is displayed as a tilde ('~').

 

The user can click on and drag an atom to move it, or click on and drag a bond to move a fragment (it is not possible to move a fragment while adding user-defined bond constraints or while building a molecule). 

 

The user can also modify the displayed features of a structure/substructure by selecting Display/Display Options.  For example, the user can select to display the associated 13C and 1H chemical shifts of the building blocks by selecting Display/Display Options/Chemical Shifts.  In addition, the user can select Display/Display Options/Connection Table to display/hide the Connection Table that lists the building blocks, bond constraints, and atom environment constraints, if any.  See Section 10.6 for complete descriptions of the display options.  Once any modifications have been made, select Display/Display Options/Refine to refine the display.

 

The interaction between the building blocks and the connection table: If the Connection Table is not displayed, select Display/Display Options/Connection Table to display it.  When an atom in a building block is clicked on, relevant entries, such as atom connectivity, assigned chemical shifts, bond constraints, and environment constraints, will be highlighted in the Connection Table.  Similarly, when an entry is clicked on in the Connection Table, relevant atom(s) will be highlighted in the building blocks.

 

Note: When multiple building block sets are present, the user can select the 'Delete' or the 'Select' button from the 'Building Blocks Browser' palette to delete or select the currently displayed building block set (see Section 6.2).  The user cannot delete or select a particular building block set after multiple ACMX's have been set up based on those multiple building block sets (see Section 6.4.6).  To select one particular building block set, the user will have to regenerate the building block sets by selecting Analysis/Building Blocks.

10.3 Display of Target Structure

Command: Display/Target Structure & Assignments.

Description:  This option displays a target structure for resonance assignment (see Section 8.2) in the main graphics window.  If spectral assignment has been previously performed, chemical shifts will be displayed on the atoms.  If there are multiple assignments, then an 'Assignment Browser' will appear that allows the user to browse through all possible assignments.  See Section 10.6 for complete descriptions of the display options. 

 

Note:  The numbering of the atoms in the structure represents the order in which the atoms were added when the molecule was built.  This is different from a generated structure where an atom ID usually corresponds to the ID of its assigned 13C peaks.

10.4 Display of Generated Structures/Assignments

Command: Display/Generated Structures.

Description:  This feature displays generated structures/substructures in the main graphics window.  If there is more than one structure/substructure, a 'Structure Browser' will be displayed so that the user can browse through all of the entries.  In a substructure, a star '*' denotes that an atom has an unsatisfied valence (free bonds).  By default, an atom with free bonds is displayed in blue and an atom without any free bonds is displayed in gray.  For partial structure elucidation, ignored atoms are displayed in red.  The user can customize these colors by modifying the appropriate color entries in the nmrsams.ini file (see Section 2.3).  A fixed bond is displayed as a solid line or as a dashed line in the case where a bond has an unspecified bond type (i.e., single, double, or triple).  For partial structure elucidation, a dummy bond is displayed as a tilde ('~').

   

By default, the results of target structure-based resonance assignment will be displayed as a target structure with its assigned chemical shifts (see Section 10.3).  However, the user can select to display the results as generated candidate structure results.  In this way a complete/partial assignment is displayed as a complete structure/substructure.

 

Interaction between structure and connection table: If the Connection Table is not displayed, select Display/Display Options/Connection Table to display it.  When an atom in the structure is clicked on, relevant entries, such as atom connectivity, assigned chemical shifts, bond constraints, and environment constraints, will be highlighted in the Connection Table.  Similarly, when an entry is clicked on in the Connection Table, relevant atom(s) will be highlighted in the structure.  This is a convenient tool for verifying the structure and its constraints.

10.5 Status Window

Command: Display/Status Window.

Description:  The status window displays text messages that indicate the current status of the structure elucidation process, and they also prompt the user as to what steps could be performed next.  Note that these suggestions are only for general guidance and the user can proceed with or repeat any steps, as necessary.  Select Display/Status Window to display and hide the status window.  The user can also close the status window by selecting its 'Close' button.

10.6 Display Options

Command: Display/Display Options.

Description: The following toggle options are available from the pull-right menu of Display/Display Options (they are also available as icons on the Tool Bar; see Section 2.6):

 

 

Balls: This option displays circles that represent atoms.  By default, normal atoms are gray, ambiguous atoms are blue, and ignored atoms are red.

Element Symbols: This option displays element symbols for the atoms.  By default, the symbols are yellow. 

Numbers: This option displays the atom numbers that are referred to in the Connection Table.  By default, the numbers are green.

Chemical Shifts: This option displays chemical shifts for all carbon atoms with assigned 13C chemical shifts.  If the protons are displayed, the 1H chemical shifts will be displayed in parentheses after the 13C chemical shifts.  Chemical shifts are displayed in the same color as the atom numbers.

Molecular Formula: This option displays the molecular formula of the current structure or substructure.  Note that for partial structure elucidation, the displayed molecular formula might contain fewer protons than the actual number of protons since the attached protons of the ignored atoms will not be displayed.

Molecular Weight: This option displays the molecular weight of the current structure or substructure.  Note that for partial structure elucidation, the displayed molecular weight might be low, because the attached protons of the ignored atoms are not displayed and the molecular formula may therefore contain fewer protons than the actual number.

Show Disconnectivity: This option displays the atoms that cannot be connected to the currently highlighted atom.  This option is effective only after bond constraints have been generated and the connectivities between the atoms are known.

Protons: This option displays attached protons, if any.  The protons attached to each atom are displayed as "H" or "H#" where '#' is the number of protons attached to that atom.  Protons are displayed in the same color as the Element Symbols.

Connection Table: This option displays the Connection Table that lists atom connectivity, bond constraints, and atom environment constraints.

Refine: This option moves the current molecule's atoms, attempting to place them in the way deemed most appropriate for the molecule by optimizing its internal geometry.

 

Tips: Default colors (and other display options) can be customized by modifying the initialization files (nmrsams.ini, nmrsamspersonal.ini, nmrsamsblack.ini and nmrsamswhite.ini). 

10.7 Editing the Display of Generated Structures

Command: Edit/Generated Structures.

Description: This option is used to edit the generated structures/substructures by adding, modifying or deleting atoms/bonds.  This option is especially useful for partial structure elucidation, where the user can manually link ignored atoms to dummy bonds to complete the full structure.

 

To edit a structure, first display the desired candidate structure by selecting Display/Generated Structures.  Then, select Edit/Generated Structures and the following 'Molecular Editor' palette will appear:

 

The directions for using the 'Molecular Editor' palette to add/remove/modify atoms and bonds are mostly the same as those described in Section 8.2; the minor differences are as follows:

 

·         Normally, if the correct MF has been used, the user will not need to add or remove atoms.  To add/remove/modify bonds, it is suggested (and more convenient) that the user uncheck 'Continuous Mode'. 

·         For partial structure elucidation, the number of attached protons of each ignored atom (displayed in red) is treated as unknown.  When bonds are added between ignored atoms, the atoms are treated as ambiguous ones (displayed in blue) with an unknown number of attached protons.  To display a specific number of attached protons for such an atom, choose 'Modify', 'Atom', the desired 'Element' and 'Valence'.  Uncheck 'Ambiguous' and then click on an atom with an unknown number of attached protons.  The number of attached protons for that atom will be calculated and displayed based on its valence and connectivity.

·         Upon completing the modification, click the 'OK' button.  Then select File/Export/Structures to export the current modified structure into a structure file (*.mdl, *.mol, *.sdf).  See Section 11.2 for more details.  To continue editing the structure, reselect Edit/Generated Structures.

 

Note:  Any changes made to the displayed structure can only be saved in a *.mdl, *.mol or *.sdf file, and not in the *.str file.  The changes will not be retained once the user displays another structure, etc. 


Chapter 11

Exporting Results

11.1 Overview

This chapter describes the options related to report generation using the results of NMR-SAMS, including NMR spectral data, resonance assignment, and candidate structures.  Such files can be readily reformatted for presentation and publication.  The pull-right options of File/Export are shown below:

 

11.2 Exporting NMR Spectral Data

Command: File/Export/Chemical Shift Correlation.

Description:  NMR-SAMS enables the user to create NMR peak lists (in the form of chemical shift correlations).  To create a chemical shift correlation table, select File/Export/Chemical Shift Correlation, and the correlation of the chemical shifts will be written into an xxx.spc file (where xxx is the root name of the current working data set).  The *.spc file can be opened in Notepad or MS Word.

 

Sample Output:

***** NMR Spectral Data of Q-2-demo, Created by NMR-SAMS V2.0 *****

 

---------------------------------------------------------------------------------------

#H-1  Shift  Multi. Integral          COSY                      NOESY

---------------------------------------------------------------------------------------

  1.    4.930    s    5.3e-002   4.755(w) 1.778(w)           4.755(s) 3.509(s) 1.778(s)

  2.    4.755    s    5.2e-002   1.778(w)                    1.778(s)

  3.    3.509    u    1.7e-002   2.235* 1.752 1.513          2.235*(s)

       .

       .

       .

 

---------------------------------------------------------------------------------------

#C-13 Shift  Multi. Integral   HMQC                      HMBC

---------------------------------------------------------------------------------------

  1.  178.822   s    2.7e-002                     2.611 2.235* 1.752 1.565 1.544*

  2.  151.323   s    3.3e-002                     4.930 4.755 3.509 2.232 1.778 1.513

  3.  109.931   t    2.4e-002    4.930  4.755     3.509 1.778

       .

       .

       .

Note:

Long-range coupled COSY peaks are marked by '(w)'.  NOESY peaks are marked by '(s)' for strong, '(m)' for medium and '(w)' for weak.

For a H-H or C-H correlation that involves ambiguous correlated 1D peaks such as the following:

a1  a2 ... an - b1  b2 ... bm,

which means the following combinations:

a1 - b1, a1 - b2 …, and an - bm,

these combinations are represented as n lines, ai - b1 (1 ≤ i ≤ n), and b1 is marked by a “*” to represent b1, b2 … or bm:

a1 - b1* 

a2 - b1*

.

.

.

an - b1*.
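The notation above can be illustrated with a short Python sketch.  The function names are hypothetical, and the rule that the star is written only when more than one b peak is possible is an assumption consistent with the sample output, not a documented behavior.

```python
from itertools import product
from typing import List, Sequence, Tuple


def expand_combinations(a_peaks: Sequence[str],
                        b_peaks: Sequence[str]) -> List[Tuple[str, str]]:
    """All n*m pairs (ai, bj) implied by an ambiguous correlation."""
    return list(product(a_peaks, b_peaks))


def export_lines(a_peaks: Sequence[str], b_peaks: Sequence[str]) -> List[str]:
    """Compact form: one line per ai, with b1 starred to stand for b1 ... bm."""
    star = "*" if len(b_peaks) > 1 else ""
    return [f"{a} - {b_peaks[0]}{star}" for a in a_peaks]
```

For instance, a 1H peak at 3.509 ppm correlated with two unresolved peaks at 2.235 and 2.232 ppm expands to two possible pairs but is written as the single line "3.509 - 2.235*".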

11.3 Exporting Resonance Assignment

Command:  File/Export/Assignment.

Description: This option allows the user to export the resonance assignment of a candidate structure.  First display the desired candidate structure/substructure and then select File/Export/Assignment.  The resonance assignments of the displayed candidate structure or substructure will be written into the text file xxx00n.rst, where xxx is the root name of the current working data set and n is the sequential number of the substructure.  This file contains the 13C and 1H assignments of all atoms in the molecule, and if NOESY peaks are available, the assignment of the NOESY peaks along with distance constraints and the actual bond separation between the relevant protons will also be included.  This information enables resolution of ambiguous NOE peaks and identification of through-space NOE connectivities.

 

Relevant Parameters:

NOESY_DIST: (default values: 1.90 5.00   1.90 3.00   1.90 2.50) The six values of this parameter are used to define the minimum and maximum distance bounds between the correlated protons, when the NOESY peak intensity level is weak, medium, or strong respectively.
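As an illustration of how the six values pair up, the following Python sketch splits a NOESY_DIST value into (minimum, maximum) bounds per intensity level.  The function name is hypothetical, and the units of the bounds are not stated here, so none are assumed.

```python
from typing import Dict, Tuple


def parse_noesy_dist(
        value: str = "1.90 5.00   1.90 3.00   1.90 2.50",
) -> Dict[str, Tuple[float, float]]:
    """Map each NOESY intensity level to its (min, max) distance bounds.

    The six numbers are read in the documented order: weak, medium, strong.
    """
    numbers = [float(tok) for tok in value.split()]
    if len(numbers) != 6:
        raise ValueError("NOESY_DIST expects exactly six values")
    levels = ("weak", "medium", "strong")
    return {level: (numbers[2 * i], numbers[2 * i + 1])
            for i, level in enumerate(levels)}
```

With the default values, a strong NOESY peak is bounded by 1.9 and 2.5, which agrees with the distance constraints listed for the strong ('s') peaks in the sample output of this section.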

 

Sample Output:

***** Resonance Assignment by NMR-SAMS V2.0*****

 

STRUCTURE #1(Unique #1, generated from ACMX #1 at CPU: 0.21s.)

 

--------------- Assignments of C-13 and H-1 resonances: --------------

 #Atom           Assigned C-13            Assigned H-1

----------------------------------------------------------------------

  C-1            178.822( 1)        

  C-2            151.323( 2)        

  C-3            109.931( 3)            4.755( 2)   4.930( 1)

  C-4             78.147( 4)            3.435( 4)

  C-5             56.647( 5)        

  C-6             55.956( 6)            0.811(33)

       .

       .

       .

-------------------------Assignment of NOE Cross Peaks--------------------------------

 #NOE #H1(ppm,#C13) - #H1(ppm,#C13)  Intensity   Distance constraint  Bond separation

--------------------------------------------------------------------------------------

 1    1(4.930, 3)  -   2(4.755, 3)    0.000 s       1.9  -  2.5         2

 2    1(4.930, 3)  -   3(3.509, 9)    0.000 s       1.9  -  2.5         4

 3    1(4.930, 3)  -  12(1.778,25)    0.000 s       1.9  -  2.5         4

 4    2(4.755, 3)  -  12(1.778,25)    0.000 s       1.9  -  2.5         4

 5*   3(3.509, 9)  -   7(2.235,15)    0.000 s       1.9  -  2.5         4

 5*   3(3.509, 9)  -   8(2.232,19)    0.000 s       1.9  -  2.5         3

      .

      .

      .

Note:

If a NOESY peak involves ambiguous correlation of 1H peaks, all of the relevant proton pairs will be listed and will be marked by a '*'.  Such ambiguity can only be resolved by using molecular modeling methods.

11.4 Exporting Candidate or Target Structures

Command:  File/Export/Structures.

Description: NMR-SAMS enables the user to export 2D structures for use in third party molecular drawing programs.  To export a structure, select File/Export/Structures, and the following 'Structure Export' dialog box will appear:

 

 

The user can select to export the current structure, all structures, or a specified number of structures into a *.mol, *.mdl or *.sdf file.  If a target structure is displayed with chemical shift assignments, the resonance assignments will also be listed at the end of the *.mol, *.mdl or *.sdf file.

 


Appendix I: NMR Data File

NMR-SAMS accepts NMR spectral data in the form of an ASCII file, with a novel flexible format designed to cope with practical problems that commonly exist in real-world spectral data.  The data file can either be prepared by converting the peak tables generated by SpecMan with the conversion procedures described in Sections 5.2-5.8, or by manually entering spectral information from third party vendors.

1D Spectral Data

In the spectral data file, the 1D peaks are listed first.  The keywords “H1:” and “C13:” are used to designate the start of the entries of 1H and 13C spectral data, respectively.  Following the keywords, each line specifies the data of a peak, and the section ends with an empty line.  The following is a transcript of a sample 1H peak list converted from a SpecMan peaks table:

 

H1: /usr/people/peng/NMR-SAMS/ndat/Q-2-test/h1.pks

 #1. 4.930 s 5.331e-02 ;1

 #2. 4.755 s 5.185e-02 ;2

 #3. 3.509 u 1.656e-02 ;3

    .

    .

    .

 

The first line beginning with the keyword “H1:” indicates the start of the 1H peak list.   After “H1:” and a blank space, comments, up to 80 characters in length, can be added.  The entries in the rest of the lines represent the peak ID, chemical shift (in ppm), multiplicity, intensity (optional), and comments (optional) for each 1H peak, respectively.  If the peaks have been converted from a SpecMan peaks table, then the comment will contain the ID of the original peak in the SpecMan peaks table.  Thus the two ID's on a single line can be different.  One or more spaces are used as a delimiter for all items except comments that are separated by a semicolon ';'. 

 

Peak ID's are frequently used in other places to refer to these 1D peaks.  The multiplicity of a 1D peak is represented as u, s, d, t, q, or m for unknown, singlet, doublet, triplet, quartet, or other general multiplets, respectively.  Detailed description of the use of 1D peak multiplicity information can be found in the usage of parameter H1_MULT_FLAG (Appendix IV).  The peak intensity and comments are optional.  The comments are useful to keep track of the original peaks in the 1D spectrum.
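When preparing or checking a data file by hand, it can help to see the 1D peak line format expressed as code.  The following Python sketch parses one peak line as described above; the class and function names are hypothetical, and the sketch is not the parser used by NMR-SAMS.

```python
from dataclasses import dataclass
from typing import Optional

# u, s, d, t, q, m: unknown, singlet, doublet, triplet, quartet, multiplet.
VALID_MULTIPLICITIES = set("usdtqm")


@dataclass
class Peak1D:
    peak_id: int
    shift: float                      # chemical shift in ppm
    multiplicity: str
    intensity: Optional[float] = None
    comment: str = ""


def parse_peak_line(line: str) -> Peak1D:
    """Parse a line such as ' #1. 4.930 s 5.331e-02 ;1'."""
    body, _, comment = line.partition(";")   # comments follow a semicolon
    tokens = body.split()                    # other items are space-delimited
    if len(tokens) < 3:
        raise ValueError("expected at least ID, shift and multiplicity")
    multiplicity = tokens[2]
    if multiplicity not in VALID_MULTIPLICITIES:
        raise ValueError(f"unknown multiplicity: {multiplicity}")
    return Peak1D(
        peak_id=int(tokens[0].lstrip("#").rstrip(".")),
        shift=float(tokens[1]),
        multiplicity=multiplicity,
        intensity=float(tokens[3]) if len(tokens) > 3 else None,
        comment=comment.strip(),
    )
```

Applied to the first line of the sample list, the sketch yields peak 1 at 4.930 ppm, a singlet, with the original SpecMan peak ID kept in the comment field.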

2D Spectral Data

Instead of cross peak coordinates, NMR-SAMS uses the 2D NMR data in the form of correlations between 1D peaks, referred to as connectivities in this manual.  In the data file, the keywords “COSY”, “HMQC”, “HMBC”, “NOESY” and “INAD” are used to designate the start of the entries of DQF-COSY, HMQC/HETCOR, HMBC/COLOC, NOESY, and 2D INADEQUATE connectivities, respectively.  Following the keyword, each line specifies a connectivity, and the section ends with a blank line.  In the line of a connectivity, one or more spaces are used as a delimiter for all items except the comments that are separated by a semicolon ';'.  The following is a transcript of a sample DQF-COSY connectivity list:

 

COSY: DQF-COSY data of Q-2

 #1. (1 - 2)   1  0.0  0.84 ;1+4

 #2. (1 - 12)  1  0.0  0.84 ;2+31

 #3. (2 - 12)  1  0.0  0.84 ;3+32

 #4. (3 - 7 8) 3  0.0  0.84 ;6+18

 #5. (3 - 13)  3  0.0  0.84 ;7+33

 #6. (3 - 18)  3  0.0  0.84 ;5+49

    .

    .

    .

 

The first line beginning with the keyword “COSY” indicates the start of the COSY connectivity list.  After the keyword and a blank space, comments, up to 80 characters in length, can be added.  The entries in the rest of the lines represent, for each COSY connectivity, the connectivity ID, the ID's of the correlated 1D 1H peaks shown in parentheses, the peak intensity level (classified as four types: strong (3), medium (2), weak (1), and unknown (0)), the J-coupling constant (optional, 0.0 for unknown), the reliability (optional, refers to the probability of the peak being considered as a real peak), and comments (optional, maximum size of 80 characters), respectively.

 

Again, the ID of a connectivity will be used in other places to refer to this connectivity (such as in the bond constraints; see Section 3.4).  For each of the connectivities converted from a SpecMan peaks table, the comment contains the ID(s) of the cross peak(s) from which the connectivity is derived.  This offers a way to keep track of the cross peak(s) from which a connectivity is derived (see Fig. 6.4 in Chapter 6).  For ambiguous connectivities, the ID’s of all possible 1D peaks are listed as correlated nodes.  The intensity level is used only for DQF-COSY (and NOESY if it is used), and the intensity level of a short-range coupling DQF-COSY peak must be assigned 3 (strong) or 2 (medium), while that of a possible long-range coupling peak must be assigned 1 (weak).  The J-coupling constant entry is used only for DQF-COSY.  See Sections 5.4 and 6.4.1 for details regarding usage of this information.

 

Items marked as optional can be omitted unless an item following them is included.  In such a case, the user must include default values for ignored items even if they will not be used.  Comments can always be included as long as they follow a semicolon (;).  The following example displays some valid representations of connectivities:

#2 (1 - 2)                                         A strong peak between spin 1 and 2

#3 (1 - 10 11) ;10 and 11 too close to resolve     An ambiguous peak

#4 (8 - 10) 1 0.0 0.4                              An unreliable weak peak

Note: The same keywords and formats are used for both 13C-detected and 1H-detected 2D heteronuclear spectra (e.g. HMQC and HETCOR).  For example, the keyword “HMQC:” is also used for HETCOR data and the associated 13C peaks will always appear before the 1H peaks in the representation of a connectivity.  Refer to the examples shown in Sections 5.5 and 5.6.
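
The connectivity line format described in this appendix can be read with a short script.  The following Python sketch is only an illustration and is not part of NMR-SAMS: the names Connectivity and parse_connectivity are hypothetical, and the choice of returning None for an omitted reliability is an assumption, since the manual does not state a default for that field.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# "#ID" (optionally followed by "."), the node list in parentheses,
# optional numeric fields, then an optional ";comment".
_LINE = re.compile(r"^\s*#(\d+)\.?\s*\(([^)]*)\)([^;]*)(?:;(.*))?$")


@dataclass
class Connectivity:
    conn_id: int
    first_nodes: List[int]               # peak IDs before the hyphen
    second_nodes: List[int]              # peak IDs after the hyphen (several if ambiguous)
    intensity: int = 0                   # 3 strong, 2 medium, 1 weak, 0 unknown
    j_coupling: float = 0.0              # 0.0 means unknown
    reliability: Optional[float] = None  # assumption: None when the field is omitted
    comment: str = ""


def parse_connectivity(line: str) -> Connectivity:
    """Parse one connectivity line such as '#4 (8 - 10) 1 0.0 0.4'."""
    match = _LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"not a connectivity line: {line!r}")
    conn_id, nodes, fields, comment = match.groups()
    first, hyphen, second = nodes.partition("-")
    if not hyphen:
        raise ValueError(f"missing '-' between correlated peaks: {line!r}")
    conn = Connectivity(
        conn_id=int(conn_id),
        first_nodes=[int(tok) for tok in first.split()],
        second_nodes=[int(tok) for tok in second.split()],
        comment=(comment or "").strip(),
    )
    values = fields.split()
    if len(values) > 3:
        raise ValueError(f"too many numeric fields: {line!r}")
    # Optional items appear in a fixed order, so they are filled positionally.
    if len(values) >= 1:
        conn.intensity = int(values[0])
    if len(values) >= 2:
        conn.j_coupling = float(values[1])
    if len(values) >= 3:
        conn.reliability = float(values[2])
    return conn
```

For example, parse_connectivity("#3 (1 - 10 11) ;10 and 11 too close to resolve") yields an ambiguous connectivity whose second node list holds both 10 and 11.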


Appendix II: Master Data File

The master data file (MDF) stores all of the intermediate and final results of NMR-SAMS, except for the connection tables of the candidate structures (saved in the structure file).  The results are saved as records, each starting with a keyword, such as “ATOMS:”, and ending with a blank line.  The intermediate results of one analysis step are used as the input for the next dependent step.  NMR-SAMS saves only one copy of each record in the MDF, so if a certain analysis step is repeated, the relevant records, as well as those produced by the dependent steps, if any, will be overwritten.  For example, the command Analysis/Bond Constraints uses the results of the Analysis/Building Blocks command.  If the user repeats the latter step after completing Analysis/Bond Constraints, a message will warn that the previous results, as well as the dependent ones, will be overwritten.

 

 

The MDF can be viewed and edited by selecting Edit/Master Data File.  By default, a Notepad editor is used.  The user can also control the flow of the structure elucidation process by changing some of the intermediate results.  Note that the keywords and the formats must not be modified; otherwise, the program will not be able to find the record or read the data properly.  Moreover, once the user has modified a certain record, the dependent analysis steps (if performed previously) must be repeated to utilize the modified data.  Table A3.1 lists the data records that are produced in each of the analysis steps, with the steps (or commands) arranged in the general order in which they are used for structure elucidation.  Table A3.1 also indicates whether or not each data record can be modified.

Table A3.1. Data Records in the Master Data File of NMR-SAMS

| Command | Keyword | Content of the record | Modify? |
|---|---|---|---|
| File/Input Molecular Formula | MF: | The molecular formula of the unknown. | No |
| | ATOMS: | Elemental composition of the unknown and some properties of the atoms. | No |
| File/Create NMR Data File/H1 | 1DH1: | Results of analysis of 1D 1H NMR spectrum. | No |
| File/Create NMR Data File/13C and DEPT | 1DC13: | Results of analysis of 1D 13C NMR spectrum. | No |
| | SYMMETRY: | Whether or not the unknown is symmetric, or whether partial structure elucidation is to be pursued. | No |
| File/Create NMR Data File/HMQC (or HETCOR) | HMQC: | The C-H BC’s derived from HMQC correlations. | No |
| Analysis/Building Blocks | FRAG_SET: | The building blocks for structure generation. | Yes |
| File/Create NMR Data File/COSY | COSY: | The H-H BC’s derived from COSY correlations. | No |
| File/Create NMR Data File/HMBC (or COLOC) | HMBC: | The C-H BC’s derived from HMBC correlations. | No |
| File/Create NMR Data File/INADEQUATE | INADEQUATE: | The C-C BC’s derived from INADEQUATE correlations. | No |
| Analysis/Bond Constraints | C13~~C13: | The unified set of C-C BC’s. | Yes |
| | ACMX:#x: (x = 1, 2, ...) | Atom-atom connection matrix (matrices). | Yes |
| Analysis/User-Defined Bond Constraints | ATOM~~ATOM: | User-supplied BC’s. | No |
| Analysis/Atom Environment Constraints | ENVIRONMENT: | User-supplied environment constraints. | No |
| Analysis/2D Structure Generation or Analysis/Assign Spectra | RESULTS: | Summary of the results of structure generation. | No |
| | UNRECOG_CCSS: | The undefined substructures (CCSS) encountered during the structure generation. | No |
| Analysis/Input Target Structure | TSS: | Connection table of the target structure for resonance assignment. | No |
| Analysis/Assign Spectra | AEMX: | The assignment matrix for resonance assignment. | Yes |
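
Because each MDF record starts with a keyword and ends with a blank line, a file can be split into records with a few lines of code.  The Python sketch below is a hypothetical helper and not part of NMR-SAMS.  It assumes that the keyword is the first whitespace-delimited token of the first line of a record; the sample text used with it is invented for illustration and does not reproduce the real contents of any record.

```python
from typing import Dict, List, Optional


def split_mdf_records(text: str) -> Dict[str, List[str]]:
    """Group the lines of an MDF into records keyed by their keyword.

    Assumption: a record begins at the first non-blank line after a blank
    line, and its keyword is the first token on that line (e.g. "ATOMS:").
    """
    records: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        if not line.strip():
            current = None          # a blank line closes the current record
            continue
        if current is None:
            current = line.split()[0]
            records[current] = []
        records[current].append(line)
    return records


# Invented sample; the real layout inside each record is not shown here.
sample = "MF: C10H16O\n\nATOMS:\nC 10\nH 16\nO 1\n\n"
records = split_mdf_records(sample)
```

Reading the file this way leaves the keywords and formats untouched, which matters because the program cannot find a record whose keyword has been altered.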

 


Appendix III: CCSS-13C Chemical Shift Range Correlation Table

The chemical_shifts.def file serves as NMR-SAMS’ knowledge base for 13C chemical shift prediction (see Section 3.5) by storing the 13C NMR chemical shift dispersion ranges of some common carbon-centered single-spherical substructures (CCSS).  Several rare CCSS’s (whose chemical shift ranges cannot be found in the references) are assigned a range of -99 to -999, which, in effect, prohibits the formation of such CCSS’s in structure generation.  The user can modify and expand this knowledge base by adding entries using the appropriate format.

 

Format: All lines starting with an exclamation mark (!) in the first column are taken as comments.  A CCSS is coded as a focus (always C) followed by neighboring atoms and bonds (single: default; double: =; triple: #).  Aromatic bonds are decomposed into alternating single and double bonds.  The order of the neighboring atoms is of no consequence.  The lower and upper limits of the 13C chemical shift dispersion of the focus carbon follow the code of each CCSS.

 

References:

1. Pretsch, Ernö et al., Tabellen zur Strukturaufklärung organischer Verbindungen mit spektroskopischen Methoden, 2nd ed., Berlin, Springer-Verlag, 1981

2. Bremser, W., Chemical Shift Ranges in Carbon-13 NMR Spectroscopy, Weinheim, Verlag Chemie, 1982

 


C(=S)(N)(N)      165   185
C(C)             0     32
C(S)             6     20
C(C)(C)          10    70
C(S)(C)          16    60
C(C)(C)(C)       18    75
C(S)(C)(C)       22    73
C(N)             27    46
C(C)(C)(C)(C)    23    100
C(N)(C)          35    90
C(S)(C)(C)(C)    35    90
C(Cl)(C)         37    56
C(N)(C)(C)       40    90
C(O)             49    62
C(N)(C)(C)(C)    50    99
C(O)(C)          46    109
C(O)(C)(C)       42    109
C(O)(C)(C)(C)    52    110
C(Cl)(C)(C)(C)   65    110
C(O)(O)(C)(C)    86    120
C(O)(O)(C)       86    110
C(O)(O)          86    110
C(O)(O)(O)       107   118
C(O)(O)(O)(C)    77    125
C(O)(N)(C)(C)    71    114
C(O)(N)          60    89
C(N)(O)(C)       70    111
C(N)(N)(C)       41    99
C(=C)            80    159
C(=C)(C)         80    160
C(=C)(Cl)(C)     90    160
C(=C)(O)(O)      141   176
C(=C)(O)(C)      90    161
C(=C)(N)(C)      90    160
C(=C)(C)(C)      100   170
C(=C)(N)         120   170
C(=C)(N)(C)      120   170
C(=C)(O)         115   189
C(=C)(=C)        118   220
C(=C)(N)(N)      121   180
C(=C)(O)(N)      -99   -999
C(=O)(O)(C)      151   187
C(=O)(N)(C)      158   185
C(=O)(=N)        120   131
C(=O)(Cl)(C)     158   180
C(=O)(C)         185   204
C(=O)(C)(C)      164   226
C(=O)(C)         197   204
C(=O)(N)(N)      150   163
C(=O)(O)         158   167
C(=O)(N)         160   183
C(=O)(O)(O)      150   160
C(=O)(=C)        200   206
C(=S)(O)(C)      188   211
C(=S)(N)(C)      188   211
C(=S)(C)(C)      219   240
C(=S)(N)(N)      165   185
C(=N)(=S)        120   140
C(=N)(O)         151   156
C(=N)(C)(C)      144   170
C(=N)(C)         144   170
C(=N)            127   156
C(=N)(=C)        -99   -999
C(=N)(O)(C)      149   195
C(#C)            20    100
C(#C)(C)         20    100
C(#C)(O)         88    89
C(#C)(N)         79    84
C(#C)(S)         71    72
C(#C)(P)         71    107
C(#N)(S)         110   120
C(#N)(C)         115   125
C(#N)(O)         107   110
C(N)(N)(C)(C)    56    107
C(S)(N)(C)(C)    85    100
C(=S)(=C)        230   270
C(=N)(S)(C)      155   170
C(=C)(S)(N)      125   182
C(F)(F)(F)(C)    104   129
C(F)(F)(F)(N)    116   122
C(F)(F)(F)(O)    118   121
C(F)(F)(C)(C)    88    135
C(O)(N)(N)(C)    83    121
C(O)(O)(N)(C)    102   134
C(F)(F)(O)(C)    114   120
C(O)(O)(N)       101   131
C(O)(N)(N)       105   106
C(O)(O)(O)(O)    115   136
C(Cl)(N)(C)(C)   73    97
C(Cl)(O)(C)(C)   72    107
C(=C)(Cl)(C)     87    167
C(Cl)(C)(C)      45    92
C(Cl)(N)(C)      62    93
C(Cl)(O)(C)      74    97
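
The entry format described in this appendix (comment lines starting with “!”, a CCSS code, then the lower and upper limits) can be read as sketched below in Python.  This is an illustration under stated assumptions and not the parser used by NMR-SAMS: the function names are hypothetical, the neighbors are sorted so that their order is of no consequence, and the test for a prohibited CCSS simply checks whether the lower limit exceeds the upper limit, as it does for the range -99 to -999.

```python
import re
from typing import Optional, Tuple

# Focus carbon, one or more parenthesized neighbors, then two numbers.
# Whitespace between the code and the lower limit is optional.
_ENTRY = re.compile(
    r"^\s*C((?:\([^()]*\))+)\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*$"
)
_NEIGHBOR = re.compile(r"\(([=#]?)([A-Za-z]+)\)")
_BOND_ORDER = {"": 1, "=": 2, "#": 3}  # single is the default

Neighbor = Tuple[int, str]  # (bond order, element symbol)


def parse_ccss_line(line: str) -> Optional[Tuple[Tuple[Neighbor, ...], float, float]]:
    """Return (sorted neighbors, lower, upper), or None for comments/blank lines."""
    if line.startswith("!") or not line.strip():
        return None
    match = _ENTRY.match(line)
    if match is None:
        raise ValueError(f"unrecognized CCSS entry: {line!r}")
    neighbors = tuple(
        sorted((_BOND_ORDER[bond], elem) for bond, elem in _NEIGHBOR.findall(match.group(1)))
    )
    return neighbors, float(match.group(2)), float(match.group(3))


def is_prohibited(lower: float, upper: float) -> bool:
    """An empty range (lower > upper) can never contain an observed shift."""
    return lower > upper
```

Sorting the neighbors gives C(N)(O)(C) and C(O)(N)(C) the same key, so a lookup does not depend on the order in which the neighbors were typed.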


 


Appendix IV: Control Parameters

The control parameters of NMR-SAMS can be accessed by selecting the pull-right options of Edit/Parameters.  To open the ‘Edit Parameters for NMR Interpretation’ dialog box, select Edit/Parameters/NMR Interpretation.  To open the ‘Edit Parameters for Setting up ACMX’ dialog box, select Edit/Parameters/Set up ACMX.  To open the ‘Edit Parameters for 2D Structure Generation’ dialog box, select Edit/Parameters/2D Structure Generation.  Editing the parameter file (*.par) directly is not recommended.

 

This appendix explains the usage of the control parameters of NMR-SAMS, and the parameters are grouped as follows:

 

1.        Parameters for spectral interpretation.  The names and titles of these parameters are listed in Table A.4.1, and the actual operations related to spectral interpretation are described in Section 6.4.

2.        Parameters for setting up ACMX.  The names and titles of these parameters are listed in Table A.4.2, and the actual operations related to setting up an ACMX are described in Section 6.4.

3.        Parameters for structure generation.  The names and titles of these parameters are listed in Table A.4.3, and the actual operations related to structure generation are described in Section 7.4.

 

Usage of these parameters is described in the following sections, and in each section, the parameters are arranged in the order that they appear in each dialog box.  The parameters are identified by their titles in each dialog box, in addition to the parameter names listed in the *.par file.

 

The default value for each parameter is listed in each parameter dialog box, and whenever a new working data set is opened, default values are assigned to all parameters.  The user can also assign default values to any of the groups of parameters by clicking the ‘Default’ button from each parameter dialog box.

Table A.4.1 Parameters for Spectral Interpretation

| Parameter Name | Title in Edit Parameters for Spectral Interpretation Dialog Box |
|---|---|
| COSY_J_CATEG | J(HH) Cutoff for Long-range Coupling COSY Peaks |
| COSY_BC[4] | H-H Bond Separation for a Long-range COSY-type Peak, Minimum and Maximum; H-H Bond Separation for a Short-range COSY-type Peak, Minimum and Maximum |
| COSY_DIAG_RESO | Tolerance for Near-diagonal COSY Peak Checking |
| MIN_MB_H1 | Minimum 1H Chemical Shift for Checking Long-range H-H Coupling |
| HMBC_BC[6] | C-H Bond Separation for a Weak HMBC-Type Peak, Minimum and Maximum; C-H Bond Separation for a Medium HMBC-Type Peak, Minimum and Maximum; C-H Bond Separation for a Strong HMBC-Type Peak, Minimum and Maximum |
| INAD_BC[3] | C-C Bond Separation for an INADEQUATE Peak, Minimum and Maximum; Type of INADEQUATE-derived C-C Bond |
| RELIAB_PEAK_PROB | Minimum Probability for All Reliable Cross Peaks |
| NOESY_DIST[6] | H-H Distance for a Weak NOESY-type Peak, Minimum and Maximum; H-H Distance for a Medium NOESY-type Peak, Minimum and Maximum; H-H Distance for a Strong NOESY-type Peak, Minimum and Maximum |
| PRO_LEVEL | Verbose Mode |

 

Table A.4.2 Parameters for Setting up ACMX

| Parameter Name | Title in Edit Parameters for Setting up ACMX Dialog Box |
|---|---|
| IDEAL_COSY | Use of COSY Negative Information |
| H1_MULT_FLAG | Use of 1H Multiplicities to Suppress Inappropriate Bonds |
| FIX_BOND_FLAG | Extract Unambiguous 1-Bond Constraints as Fixed Bonds |
| HETCON_FLAG | Bond Formation Between Heteroatoms |
| CCBOND_FLAG | Allowed Carbon-Carbon Bond Types |

Table A.4.3 Parameters for Structure Generation

| Parameter Name | Title in Edit Parameters for Structure Generation Dialog Box |
|---|---|
| GEN_FLAG | Search Criteria for Structure Generation |
| SAT_BC_RATE[3] | Rate of Bond Constraint Satisfaction, Starting, Ending and Step Values |
| N_FBX_STEP | Average Number of Possibilities for Each C-C Bond Formation |
| MAX_ERR_BC | Maximum Limit for Bond Constraint Violation |
| MIN_RING_SIZE | Ring Size, Minimum |
| MAX_RING_SIZE | Ring Size, Maximum |
| ADD_C13_RNG | Addition Tolerance for Using C-13 Chemical Shifts |
| MIN_MB_C13 | Minimum C-13 Shift for Multi-Bond Carbon |
| BAD_SS_FLAG | Exclude Structures with Chemically Unstable Moieties |
| MAX_REC_STR | Maximum Candidate Structures to Store |
| REC_SS_FLAG | Store the Partially Completed Structures |
| DISP_CMPLT_DELAY | Interval for Updating Structure Generation Dialog Box |

 

Parameters for Spectral Interpretation

J(HH) Cutoff for Long-range Coupling COSY Peaks:

COSY_J_CATEG:     3.0  

NMR-SAMS uses the value of this parameter to classify the intensity level of DQF-COSY peaks automatically from the J-coupling constant when the intensity level of an individual COSY peak is unknown (i.e., equal to 0).  When a COSY peak has a J-coupling constant less than or equal to the value of this parameter, it is classified as a long-range coupled (or weak) peak.  Otherwise, it is classified as a short-range coupled (or strong) peak.
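
The classification rule can be written as a small function.  The Python sketch below is only an illustration of the rule as stated above, not the code used by NMR-SAMS.  The function name is hypothetical, and leaving a peak unclassified when its J-coupling constant is also unknown (0.0) is an assumption, because the manual does not describe that case.

```python
COSY_J_CATEG = 3.0  # default cutoff quoted in this appendix


def classify_cosy_intensity(intensity: int, j_coupling: float,
                            cutoff: float = COSY_J_CATEG) -> int:
    """Return the intensity level to use for a DQF-COSY peak.

    Levels: 3 strong, 2 medium, 1 weak, 0 unknown.
    """
    if intensity != 0:
        return intensity        # a level supplied with the peak is kept
    if j_coupling <= 0.0:
        return 0                # assumption: J unknown as well, leave unclassified
    # J <= cutoff: long-range coupled (weak); otherwise short-range (strong).
    return 1 if j_coupling <= cutoff else 3
```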

 

H-H Bond Separation for a Long-range COSY-type Peak, Minimum and Maximum:

H-H Bond Separation for a Short-range COSY-type Peak, Minimum and Maximum:

COSY_BC:    4 5  2 3

NMR-SAMS uses the values of this parameter when interpreting COSY peaks as bond constraints.  When a peak is classified as long-range coupled, the number of intervening bonds in the structure must be greater than or equal to COSY_BC[1] and less than or equal to COSY_BC[2].  When a peak is classified as short-range coupled, the number of intervening bonds must be greater than or equal to COSY_BC[3] and less than or equal to COSY_BC[4].
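
As a worked illustration of the default values 4 5 2 3, the Python sketch below checks whether a given H-H bond separation satisfies a COSY-derived bond constraint.  The function names are hypothetical.  The sketch treats a weak peak (level 1) as long-range coupled and a medium or strong peak (level 2 or 3) as short-range coupled, following the intensity convention given for DQF-COSY data in Appendix I.  Note that the manual numbers the values from 1, while Python indexes from 0.

```python
from typing import Sequence, Tuple

COSY_BC = (4, 5, 2, 3)  # long-range min, max; short-range min, max (defaults)


def cosy_bond_range(intensity: int, cosy_bc: Sequence[int] = COSY_BC) -> Tuple[int, int]:
    """Return the (minimum, maximum) number of intervening bonds for a peak."""
    if intensity == 1:                      # weak: long-range coupled
        return cosy_bc[0], cosy_bc[1]       # COSY_BC[1], COSY_BC[2] in the manual
    if intensity in (2, 3):                 # medium or strong: short-range coupled
        return cosy_bc[2], cosy_bc[3]       # COSY_BC[3], COSY_BC[4] in the manual
    raise ValueError("intensity must be classified (1, 2 or 3) first")


def satisfies_cosy_constraint(n_bonds: int, intensity: int,
                              cosy_bc: Sequence[int] = COSY_BC) -> bool:
    low, high = cosy_bond_range(intensity, cosy_bc)
    return low <= n_bonds <= high
```

With the defaults, a vicinal H-C-C-H pair (three intervening bonds) satisfies a strong peak but not a weak one.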

 

Tolerance for Near-diagonal COSY Peak Checking:

COSY_DIAG_RESO:         0.02

NMR-SAMS uses the value of this parameter to distinguish the COSY diagonal peaks from the near-diagonal COSY cross peaks.  When a near-diagonal COSY peak is not observed and the 1H chemical shift difference between two protons is less than or equal to this value, the user will be notified.  The user can then allow NMR-SAMS to add a pseudo bond constraint for this peak, which prevents the loss of a correct structure when real peaks are omitted.

 

Minimum 1H Chemical Shift for Checking Long-range H-H Coupling:

MIN_MB_H1:  0

The value of this parameter is used for checking the presence of long-range coupled COSY peaks, provided the value is greater than 0.  When a COSY peak is interpreted as a geminal or vicinal coupling, and one of the protons has a 1H chemical shift greater than this value, the user is warned of the potential long-range H-H coupling.  A long-range coupling that is not correctly identified could lead to the loss of a correct structure.  Therefore, the user is advised to extend the number of intervening bonds in the bond constraint to cover a long-range coupling.  The default value of this parameter is 0, so by default the program does not check for the presence of long-range coupled COSY peaks.

 

C-H Bond Separation for a Weak HMBC-Type Peak, Minimum and Maximum:

C-H Bond Separation for a Medium HMBC-Type Peak, Minimum and Maximum:

C-H Bond Separation for a Strong HMBC-Type Peak, Minimum and Maximum:

HMBC_BC:    2 5   2 3   2 3  

The values of this parameter are used for interpreting the HMBC peaks, and NMR-SAMS uses these values to determine the number of intervening bonds that each peak represents.

 

A weak HMBC peak must correspond to a carbon-to-proton separation greater than or equal to HMBC_BC[1] and less than or equal to HMBC_BC[2].  A medium HMBC peak must correspond to a separation greater than or equal to HMBC_BC[3] and less than or equal to HMBC_BC[4].  A strong HMBC peak must correspond to a separation greater than or equal to HMBC_BC[5] and less than or equal to HMBC_BC[6].
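
The six values of HMBC_BC form three (minimum, maximum) pairs ordered weak, medium, strong.  The Python sketch below shows one way to look up the pair for a given intensity level.  The function name is hypothetical, and rejecting an unknown intensity (level 0) is an assumption, since the manual does not say how such HMBC peaks are treated.

```python
from typing import Sequence, Tuple

HMBC_BC = (2, 5, 2, 3, 2, 3)  # weak min, max; medium min, max; strong min, max


def hmbc_bond_range(intensity: int, hmbc_bc: Sequence[int] = HMBC_BC) -> Tuple[int, int]:
    """Return the allowed C-H bond separation range for an HMBC peak.

    intensity: 1 weak, 2 medium, 3 strong.
    """
    if intensity not in (1, 2, 3):
        raise ValueError("assumption: only levels 1-3 are handled in this sketch")
    start = 2 * (intensity - 1)   # 0-based offset of the pair for this level
    return hmbc_bc[start], hmbc_bc[start + 1]
```

With the defaults, only a weak peak may span four or five bonds; medium and strong peaks are limited to two or three.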

 

C-C Bond Separation for an INADEQUATE Peak, Minimum and Maximum:

Type of INADEQUATE-derived C-C Bond (the last value):

INAD_BC:    1 1 0

The values of this parameter are used for interpreting an INADEQUATE peak, and NMR-SAMS uses the first two values to determine the number of intervening bonds that each peak represents.  Each INADEQUATE peak must result in a carbon to carbon separation within the specified range (i.e. greater than or equal to INAD_BC[1] and less than or equal to INAD_BC[2]).  The last value is used to determine the type of bond that the peak represents.  For the "Unspecified" type (i.e., INAD_BC[3] = 0), NMR-SAMS allows the bonds to be of any type (single, double or triple).  For the other types, NMR-SAMS forces the bonds to be of a specified type.

 

Minimum Probability for All Reliable Cross Peaks:

RELIAB_PEAK_PROB: 0.50

The value of this parameter is used for interpreting COSY, HMBC, and INADEQUATE peaks as bond constraints, and it is used as the minimum probability for a reliable peak.  A peak with a probability greater than or equal to this value is taken as a reliable one, otherwise it is considered unreliable.

 

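H-H Distance for a Weak NOESY-type Peak, Minimum and Maximum:

H-H Distance for a Medium NOESY-type Peak, Minimum and Maximum:

H-H Distance for a Strong NOESY-type Peak, Minimum and Maximum:

NOESY_DIST:       1.90 5.00   1.90 3.00   1.90 2.50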

The values of this parameter are used for exporting the NOESY peaks.  When exporting resonance assignment results, NMR-SAMS will use these values to calibrate the proton-proton bounds (H-H geometric distance) from the NOE intensity levels.  These values represent the minimum and maximum H-H distances in Angstroms for weak, medium, and strong NOE peaks, respectively.

Verbose Mode:

PRO_LEVEL: 0

When verbose mode is on (i.e., PRO_LEVEL = 0), NMR-SAMS will display numerous informational and warning messages to the user.  This is useful for users who are just beginning to use NMR-SAMS, or for users who want to be notified of any unusual conditions.  When verbose mode is off (i.e., PRO_LEVEL = 1), NMR-SAMS will notify the user only when an error occurs, or when user input is required.  This mode is useful for advanced users of NMR-SAMS.  Note that this parameter does not affect the messages stored in the log file.

Parameters for Setting up ACMX

Use of COSY Negative Information: 

IDEAL_COSY: 1

When the first button, Treat as Ideal Spectrum, is selected (this corresponds to IDEAL_COSY = 1, the default setting), NMR-SAMS will treat COSY as an ideal spectrum, namely, two proton-bearing carbon atoms will be forbidden to connect if no COSY peak is observed between them.  Although this is usually true and will reduce the time taken to generate structures, it could also lead to the loss of a correct structure if some 3J(H,H) couplings are not observed, for reasons such as H-H configuration or chemical environment.

When the second button, Use with NOESY Data, is selected (this corresponds to IDEAL_COSY = 2), NMR-SAMS will use the negative information in conjunction with NOESY data.  In that case, two proton-bearing carbon atoms will be forbidden to connect if neither COSY nor NOESY peaks are observed between them.  This may be safer than the previous choice, and is recommended when NOESY data is available.

 

When the last button, Do Not Use, is selected (this corresponds to IDEAL_COSY = 0), negative information will not be used.  In that case, all proton-bearing atoms will be allowed to connect even if no COSY peaks are observed between them.  Though that is a safe option, it could significantly reduce the efficiency of structure generation.
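
The three settings of IDEAL_COSY can be summarized as a single decision.  The Python sketch below is a hypothetical illustration and not the code used by NMR-SAMS: the function name and its boolean arguments are assumptions, and it applies only to a pair of proton-bearing carbon atoms.

```python
def connection_forbidden(cosy_observed: bool, noesy_observed: bool,
                         ideal_cosy: int = 1) -> bool:
    """Decide whether two proton-bearing carbons are forbidden to connect.

    ideal_cosy: 1 = Treat as Ideal Spectrum (default),
                2 = Use with NOESY Data,
                0 = Do Not Use.
    """
    if ideal_cosy == 1:
        return not cosy_observed                       # missing COSY peak forbids the bond
    if ideal_cosy == 2:
        return not cosy_observed and not noesy_observed  # both must be missing
    if ideal_cosy == 0:
        return False                                   # negative information is ignored
    raise ValueError("IDEAL_COSY must be 0, 1 or 2")
```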

 

Use of 1H Multiplicities to Suppress Inappropriate Bonds:

H1_MULT_FLAG:             1

When this option is selected (this corresponds to H1_MULT_FLAG = 1, the default setting), the following rules will be used to exclude some carbon atoms from bonding during the structure generation:

1.        Only CHx-CHy (x > 0,  y ≥ 0) is considered;

2.        CH3 with a multiplicity M = 1(s), 2(d), 3(t), or 4(q) is forbidden to bond to CHy if y ≠ M - 1;

3.        CH3 with other multiplicities M > 4 is forbidden to bond to CHy if y = 0;

4.        CH with a multiplicity M = 1 is forbidden to bond to CHy if y = 2 or 3.
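
The four rules above can be expressed as one function.  The Python sketch below is an illustration only: the function name and argument names are hypothetical, and treating an unknown multiplicity as None (so that no rule applies) is an assumption made to mirror the “u” multiplicity mentioned in this appendix.

```python
from typing import Optional


def bond_forbidden_by_multiplicity(x: int, multiplicity: Optional[int], y: int) -> bool:
    """Apply rules 1-4 to a candidate CHx-CHy bond.

    x: number of hydrogens on the carbon whose 1H multiplicity is given.
    multiplicity: M (1 = s, 2 = d, 3 = t, 4 = q, ...), or None if unknown.
    y: number of hydrogens on the partner carbon.
    """
    if multiplicity is None or x <= 0 or y < 0:
        return False                      # rule 1: only CHx-CHy with x > 0, y >= 0
    if x == 3:
        if 1 <= multiplicity <= 4:
            return y != multiplicity - 1  # rule 2
        if multiplicity > 4:
            return y == 0                 # rule 3
    if x == 1 and multiplicity == 1:
        return y in (2, 3)                # rule 4
    return False
```

For example, a methyl doublet (x = 3, M = 2) may bond to a CH carbon but is forbidden to bond to a CH2 carbon.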

 

When this option is not selected (this corresponds to H1_MULT_FLAG = 0), 1H multiplicity information will not be used.  In such a case, the structure generation process will take longer and produce more candidate structures.  1H multiplicities must be used carefully in order not to lose the correct structure.  If the user finds that the multiplicity of a certain 1H peak is not reliable, or that it does not fit these rules, then input it as an unknown multiplicity (represented as “u”, see Section 5.2), so that its multiplicity information will not be used.  If the user does not want to use 1H multiplicities, the user can turn off this option so that all 1H multiplicities will be ignored by NMR-SAMS. 

Extract Unambiguous 1-Bond Constraints as Fixed Bonds:

FIX_BOND_FLAG: 1

This flag determines whether or not NMR-SAMS will use NMR-derived unambiguous bond constraints (e.g. those from well-resolved COSY peaks) as fixed bonds prior to structure generation.  Once a fixed bond is defined, it cannot be broken although its bond type can be changed in the subsequent structure generation.  While this enhances the efficiency of structure generation, the correct structure may be lost if one of the fixed bonds is incorrect (e.g., a long-range coupled DQF-COSY peak was mistakenly interpreted as a vicinal coupling).  The default is to use unambiguous bond constraints as fixed bonds (this corresponds to FIX_BOND_FLAG = 1).

 

If the user chooses not to use unambiguous bond constraints as fixed bonds (corresponding to FIX_BOND_FLAG = 0), all NMR-derived bond constraints will be used during structure generation.  In that case, the bond constraints can be violated when MAX_ERR_BC > 0, but this can significantly reduce the efficiency of structure generation.

 

Note that NMR-SAMS always treats user-supplied unambiguous bond constraints as fixed bonds.

Bond Formation Between Heteroatoms:

HETCON_FLAG:  0

When the first option, Disabled for All, is selected (corresponding to HETCON_FLAG = 0, the default setting), bonds will be forbidden to be formed between all heteroatoms during structure generation.  When the second option, Disabled for Same Type, is selected (corresponding to HETCON_FLAG = 1), bonds will be forbidden to be formed between the same type of heteroatoms.  When the third option, Enabled for All, is selected (corresponding to HETCON_FLAG = 2), there will be no limitation on the bond formation between heteroatoms.

 

Since the default setting is ‘Disabled for All’, the user must be cautious when functional groups, such as -NO2 or -O-O- exist in the molecule.  However, if the user defines such groups as user-defined bond constraints (see Section 7.2), then the user can still select the ‘Disabled for All’ option to enhance the efficiency of structure generation. 
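
The effect of the three HETCON_FLAG settings on a candidate bond can be sketched as follows in Python.  This is a hypothetical helper, not part of NMR-SAMS.  It assumes that any element other than carbon and hydrogen counts as a heteroatom, and it does not model the user-defined bond constraints that can override the setting.

```python
def hetero_bond_allowed(elem_a: str, elem_b: str, hetcon_flag: int = 0) -> bool:
    """Decide whether a bond between two atoms passes the HETCON_FLAG test.

    hetcon_flag: 0 = Disabled for All (default),
                 1 = Disabled for Same Type,
                 2 = Enabled for All.
    """
    def is_hetero(elem: str) -> bool:
        return elem not in ("C", "H")   # assumption: everything else is a heteroatom

    if not (is_hetero(elem_a) and is_hetero(elem_b)):
        return True                     # the flag concerns heteroatom pairs only
    if hetcon_flag == 0:
        return False
    if hetcon_flag == 1:
        return elem_a != elem_b
    if hetcon_flag == 2:
        return True
    raise ValueError("HETCON_FLAG must be 0, 1 or 2")
```

Under the default setting, the N-O bonds of a nitro group are rejected, which is why such groups need special care.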

Allowed Carbon-Carbon Bond Types:

CCBOND_FLAG: 1 1 1

If one or more of the three types of C-C bonds is not checked (corresponding to CCBOND_FLAG[i] = 0, where i = 1, 2 or 3 for single, double, and triple bond, respectively), then the corresponding type of C-C bond will not be formed during the subsequent structure generation.  These options are used only when it is known that specific types of C-C bonds do not need to be formed during structure generation.  This is useful when it is known that a certain type of C-C bond does not exist in the molecule, or that the certain bond type has already been extracted as a fixed bond.  For example, when generally all single C-C bond correlation information is provided (as in the case of INADEQUATE data), the user can set the fixed bonds flag to ‘On’ (FIX_BOND_FLAG = 1) and let NMR-SAMS extract all C-C single bonds.  Then the user can force NMR-SAMS to generate only C-C multi-bonds during the structure generation process by turning off ‘Single’, and checking ‘Double’ and ‘Triple’ for this parameter.  By default, they are all checked (i.e., CCBOND_FLAG[i] = 1, where i = 1, 2 and 3).

Parameters for Structure Generation

Search Criteria for Structure Generation: 

GEN_FLAG: 1

When Advanced (i.e., GEN_FLAG = 1, the default value) is selected, the advanced heuristic search method will be used to accelerate the structure generation process.  This heuristic search method takes advantage of the bond constraints and 13C chemical shifts and reorders the solution space so that only the most probable portion of the solution space is searched for candidate structures.  This search criterion usually leads to the fastest structure generation process with the most reliable results when sufficient spectral data is used.

 

When Basic (i.e., GEN_FLAG = 2) is selected, more relaxed parameters (SAT_BC_RATE and N_FBX_STEP) will be used for the penalty function so that a wider portion of solution space is searched during structure generation.

 

When Exhaustive (i.e., GEN_FLAG = 0) is selected, the brute-force exhaustive search method will be used for structure generation.  This is usually a very slow process, so it is useful only when the molecule is very small or when the heuristic methods mentioned above fail to give the correct structure.  This option is recommended for exhaustive isomer enumeration based solely on the MF.

 

Note: SAT_BC_RATE and N_FBX_STEP are two important parameters that control the completeness of the search when the GEN_FLAG is set as 1 or 2.  By modifying their values, the user can set a reasonable balance between speed and completeness of structure generation.

Rate of Bond Constraint Satisfaction, Starting, Ending, and Step Values: 

SAT_BC_RATE:  1.2  0.6  0.1

This parameter is one of the most important parameters related to heuristic structure generation, and is used only when GEN_FLAG = 1 or 2.  The three values of this parameter determine the use of a penalty function for evaluating the substructures based on the “rate of BC-satisfaction”, K:

·         SAT_BC_RATE[1] is the required starting value of K, Ks.  The default value is 1.2; 

·         SAT_BC_RATE[2] is the required ending value of K, Ke.  The default value is 0.6;

·         SAT_BC_RATE[3] is the step value, ΔK, for automatic adjustment of K. The default value is 0.1.

 

For the first run of structure generation, K = Ks.  If a complete structure is unobtainable (which usually means that K is too big) and K > Ke, the structure generation process will be automatically repeated using K = K - ΔK.  Such iteration ends when at least one structure is generated or when K ≤ Ke or K ≤ 0.  If ΔK = 0, or Ks ≤ Ke, the structure generation will not be repeated, namely, only one structure generation will be performed with K = Ks.
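
The automatic adjustment of K described above can be sketched as a loop.  The Python code below is only an illustration of the stated stopping rules, not the implementation in NMR-SAMS.  The names k_schedule and generate_with_relaxation are hypothetical, and run_generation stands for a caller-supplied function that returns the list of complete structures found for a given K.  Rounding is used only to avoid floating-point drift when the step is subtracted repeatedly.

```python
from typing import Callable, Iterator, List, Tuple


def k_schedule(ks: float = 1.2, ke: float = 0.6, dk: float = 0.1) -> Iterator[float]:
    """Yield the K values that would be tried if no run produced a structure."""
    k = ks
    yield k                          # the first run always uses K = Ks
    if dk <= 0 or ks <= ke:
        return                       # no repetition: a single run with K = Ks
    while k > ke and k > 0:
        k = round(k - dk, 10)        # K = K - dK
        yield k


def generate_with_relaxation(run_generation: Callable[[float], List],
                             ks: float = 1.2, ke: float = 0.6,
                             dk: float = 0.1) -> Tuple[float, List]:
    """Repeat structure generation with a decreasing K until one run succeeds."""
    k = ks
    structures: List = []
    for k in k_schedule(ks, ke, dk):
        structures = run_generation(k)
        if structures:               # at least one complete structure: stop
            break
    return k, structures
```

With the default values, the schedule runs from 1.2 down to 0.6 in steps of 0.1, so at most seven runs are performed.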

 

Appropriate usage of SAT_BC_RATE limits the search scope of the structure generation to only the most probable portion, hence it speeds up this process without losing the correct structure.  A bigger value of K makes the search less complete and the computation time shorter, and vice versa.  If Ks = 0.0, the penalty function is ignored, so substructures will not be evaluated based on the rate of BC-satisfaction.  This is the most exhaustive search, but can be very slow.

 

Tip: The variation of the K values can be seen in the log file (*.log).  Also, the K value that was used during the last iteration of structure generation can be found under the keyword “RESULTS:” in the MDF.  For more details regarding evaluation of substructures based on the rate of BC-satisfaction, please refer to References 1-3.

Average Number of Possibilities for Each C-C Bond Formation: 

N_FBX_STEP: 3.0

Analogous to SAT_BC_RATE, this is another important parameter related to heuristic structure generation, and is used only when GEN_FLAG = 1 or 2.  The value of this parameter defines the average number of free bonds to be tried while forming a bond on a certain atom.  This limits the search scope of structure generation to the most plausible portion of the solution space, and the default value is 3.0.  A bigger value of N_FBX_STEP makes the search more complete and the computational time longer, and vice versa.  If N_FBX_STEP = 0.0, then all free bonds will be tried.  This is the most exhaustive search, but it can also be very slow.

 

Note: In contrast to SAT_BC_RATE, N_FBX_STEP is not automatically adjusted based on the results.  For details regarding the scope of a search based on N_FBX_STEP, please refer to Reference 3.

Maximum Limit for Bond Constraint Violation: 

MAX_ERR_BC: 1

Sometimes it is necessary to allow a few BC’s to be violated during structure generation.  For example, 4-bond C-H correlations are occasionally observed in HMBC.  Because all HMBC-derived BC’s are interpreted as 2- or 3-bond separations by default (see parameter HMBC_BC), the correct structure can then be generated only when a certain number of BC’s are allowed to be violated.  This is a trade-off: allowing some BC’s to be violated reduces the efficiency of structure generation significantly, since more incorrect substructures need to be considered.

Minimum Ring Size:

MIN_RING_SIZE:  0

This parameter defines the minimum ring size for the rings in a generated structure.  When a value smaller than 3 is defined, there will be no limit on the ring sizes (the default value is 0, i.e., no limitation).

Maximum Ring Size: 

MAX_RING_SIZE: 0

This parameter defines the maximum ring size for the rings in a generated structure.  When a value smaller than 3 is defined, there will be no limit on the ring sizes (the default value is 0, i.e., no limitation).

If the structure generation is very slow, the user can try to limit the maximum ring size, e.g., by setting MAX_RING_SIZE as 6, if appropriate.

Addition Tolerance for Using C-13 Chemical Shifts: 

ADD_C13_RNG: 0.0

This parameter (set as 0.0, by default) is the tolerable violation of 13C chemical shifts, which is used for evaluating substructures based on 13C chemical shifts.  When ADD_C13_RNG = t, the predicted δ13C range of a carbon is δ1 to δ2, and the observed δ13C is δ*, the substructure containing this CCSS is regarded as bad if δ* < δ1 - t or δ* > δ2 + t.  This parameter is useful when several unusual 13C chemical shifts are expected in the molecule (see Section 3.5).
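
The tolerance test can be written directly from the inequality above.  The Python sketch below uses a hypothetical function name, and the numerical values in the usage note are taken from the C(=O)(O)(C) entry of Appendix III (151 to 187) purely as an example.

```python
def shift_is_acceptable(observed: float, lower: float, upper: float,
                        add_c13_rng: float = 0.0) -> bool:
    """Return False when the substructure is regarded as bad.

    The substructure is bad if observed < lower - t or observed > upper + t,
    where t is ADD_C13_RNG.
    """
    t = add_c13_rng
    return lower - t <= observed <= upper + t
```

For instance, an observed shift of 187.5 falls outside the range 151 to 187 when the tolerance is 0.0, but is accepted when ADD_C13_RNG is set to 1.0.  A CCSS assigned the range -99 to -999 is rejected for any observed shift, because its lower limit exceeds its upper limit.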

 

This parameter is also used for setting up an assignment matrix during resonance assignment (see Section 8.2).

Minimum C-13 Shift for Multi-Bond Carbon: 

MIN_MB_C13:  60

The value of this parameter determines the lowest possible 13C chemical shift for an sp2 or sp carbon.  Any multi-bonds will be forbidden to attach to a carbon atom with δ13C < MIN_MB_C13.  The default value is 60, but 100 can be used when no triple bonds are expected in the unknown structure.

Exclude Structures with Chemically Unstable Moieties: 

BAD_SS_FLAG: 1

When this option is checked (i.e., BAD_SS_FLAG = 1, the default value), simple chemically unstable structural moieties will be rejected during structure generation.  Such structural moieties include:

1.        =C=,  i.e., multiple double bonds on a carbon atom,

2.        =X1=X2=,  where X1 and X2 are any heteroatoms, and =C=,

3.        C(X1)(X2)(X3), where X1, X2 and X3 are any heteroatoms connected to the same carbon atom, except -CF3 group,

4.        Three-membered ring, except an epoxide ring without attached double bonds on each carbon atom.

 

When this option is unchecked (i.e. BAD_SS_FLAG = 0), such moieties will not be excluded.

Maximum Candidate Structures to Store:

MAX_REC_STR: 50

This parameter defines the maximum number of generated structures, and the default value of MAX_REC_STR is 50.  When the number of candidate structures reaches MAX_REC_STR, the structure generation process will be terminated.  Note that MAX_REC_STR does not include redundant structures.  For example, if a structure is generated twice (with alternative 13C assignments), it will be counted as one structure when checking for MAX_REC_STR.  So, if redundant structures are generated, the number of candidate structures will be more than MAX_REC_STR.  If the user chooses to record intermediate substructures (see REC_SS_FLAG), the number of retained substructures, N_SS, is determined as follows:

N_SS = MAX_REC_STR - N_unique_str,

where N_unique_str is the number of chemically unique complete structures.

 

When MAX_REC_STR = 0, an unlimited number of candidate structures will be generated.  In such a case, substructures cannot be recorded (see REC_SS_FLAG).

Store the Largest Substructures in Addition to Complete Structures:

REC_SS_FLAG: 1

When this option is checked (i.e., REC_SS_FLAG = 1, the default value), the intermediate substructures generated during the structure generation process will be recorded.  Such intermediate substructures are useful when complete structures are not generated (due to errors in spectral data or use of inappropriate parameters). 

 

Since the number of substructures can potentially be very large, they can be stored only when the user defines an upper limit for the total number of structures (i.e., MAX_REC_STR > 0).  The number of retained substructures, N_SS, is determined as follows:

N_SS = MAX_REC_STR - N_unique_str,

where N_unique_str is the number of chemically unique complete structures.  When the number of generated substructures exceeds N_SS, only the largest ones will be retained.
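The retention rule can be sketched in Python as follows.  This is an illustrative sketch only; the function name is hypothetical, and the size of a substructure is assumed here to be its number of atoms.

```python
def retain_substructures(substructures, max_rec_str, n_unique_str):
    """Keep the N_SS largest substructures, N_SS = MAX_REC_STR - N_unique_str.

    substructures: list of atom-index lists (hypothetical representation).
    """
    if max_rec_str <= 0:
        # With no upper limit, substructures cannot be recorded at all.
        return []
    n_ss = max(max_rec_str - n_unique_str, 0)
    # Largest first; "largest" is taken to mean the most atoms.
    return sorted(substructures, key=len, reverse=True)[:n_ss]


subs = [[1, 2], [1, 2, 3, 4], [1, 2, 3], [5]]
print(retain_substructures(subs, max_rec_str=5, n_unique_str=3))
```

With MAX_REC_STR = 5 and three unique complete structures, N_SS = 2, so only the four-atom and three-atom substructures are retained.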

 

Once the structure generation has been completed (or stopped by the user), NMR-SAMS will prompt the user to save the substructures in the structure file, along with the completed structures.  If the user clicks ‘Yes’, then the substructures will be saved and can be displayed along with the completed structures.  If the user clicks ‘No’, then all substructures will be discarded.

Interval for Updating Structure Generation Dialog Box: 

DISP_CMPLT_DELAY: 0.10

This parameter defines the interval (in minutes) for updating the ‘Structure Generation in Process’ dialog box during structure generation or the ‘Resonance Assignment in Process’ dialog box during resonance assignment.  The default value of 0.10 corresponds to an update every 6 seconds.
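A timed update of this kind can be sketched in Python as follows.  The class and its method are hypothetical and serve only to show how an interval given in minutes could control how often a progress display is refreshed; they are not taken from NMR-SAMS.

```python
import time


class ProgressThrottle:
    """Report when a progress display is due for an update (hypothetical sketch)."""

    def __init__(self, delay_minutes=0.10, clock=time.monotonic):
        self.interval_s = delay_minutes * 60.0  # DISP_CMPLT_DELAY is given in minutes
        self._clock = clock
        self._last = None

    def due(self):
        """Return True on the first call and whenever the interval has elapsed."""
        now = self._clock()
        if self._last is None or now - self._last >= self.interval_s:
            self._last = now
            return True
        return False


throttle = ProgressThrottle(delay_minutes=0.10)
if throttle.due():
    print("update dialog box")  # the first call is always due
```

A long-running loop would call due() on each iteration and refresh the dialog box only when it returns True, so a larger interval means fewer updates.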



References

1.      Jaspars, Marcel, "Computer Assisted Structure Elucidation of Natural Products Using Two-Dimensional NMR Spectroscopy," Nat. Prod. Rep., 1999, 16, 241-248.

2.      Dorman, Doug, "A 'Non-Classical' CASE Program," NMR Newsletter, 1998, 481, 11-12.

3.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Yongzheng Hui, "Efficient Application of 2D NMR Correlation Information in Computer-Assisted Structure Elucidation of Complex Natural Products," J. Chem. Inf. Comput. Sci., 1994, 34, 805-813.

4.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Yongzheng Hui, Houming Wu, Kan Ma, "Application of Expert System NMR-SAMS to the Structure Elucidation of Complex Natural Products," J. Chem. Inf. Comput. Sci., 1994, 34, 814-819.

5.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Zhengshuang Shi, Houming Wu, "Toward Practical Computer-Assisted Structure Elucidation for Complex Natural Products: Efficient Use of Ambiguous 2D NMR Correlation Information," J. Chem. Inf. Comput. Sci., 1995, 35, 539-546.

6.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Lingran Chen, "From Spectra to Structure by Computer: Dreams and Reality," Computers and Applied Chemistry, Computer Chemistry Monograph Series 4, Beijing: Science Press, 1995: 26-33.

7.      Chen Peng, Shengang Yuan, Chongzhi Zheng, Yongzheng Hui, "Graph-theory-based Computer Representation of Two-Dimensional NMR Correlation Information for Automated Analysis," Computers and Applied Chemistry, Computer Chemistry Monograph Series 4, Beijing: Science Press, 1995: 34-38.

8.      Chen Peng, Geoffrey Bodenhausen, Shengxiang Qiu, Harry H. S. Fong, Norman R. Farnsworth, Shengang Yuan, Chongzhi Zheng, "Computer-Assisted Structure Elucidation: Application of CISOC-SES to the Resonance Assignment and Structure Generation of Betulinic Acid," Magnetic Resonance in Chemistry, 1998, 36, 267-278.


 

Index


13C spectrum                                                           27

1H spectrum                                                            24

ACMX                                                      i, 55, 56, 89

ADD_C13_RNG                                 15, 67, 72, 99

Analysis/2D Structure Generation                      77

Analysis/Assign Spectra                                      73

Analysis/Atom Environment Constraints 62, 67, 77

Analysis/Generate 2D Structures                        64

Analysis/Generate Building Blocks              77, 80

Analysis/Input Target Structure                         69

Analysis/Input Target Structure/Build Molecule 69

Analysis/Input Target Structure/Import MDL  69

Analysis/Molecular Formula                               42

Analysis/NMR Data                                       46, 48

Analysis/Quick Enumeration or Elucidation      78

Analysis/User-Defined Bond Constraints 58, 61, 67, 77

APT                                                                         27

assignment matrix                                                   72

atom-atom connection matrix                               55

Atom-atom Connection MatriX                              i

BAD_SS_FLAG                                                   100

bond constraint                                                  1, 13

ambiguous                                                    14, 52

cross-check of                                                   53

format of                                                              14

merge of                                                              53

pseudo                                                   50, 52, 60

unambiguous                                                     55

user-defined                                                       58

violation of                                                          99

building block                                               i, 44, 89

display of                                                            79

candidate structure                                                  1

display of                                                      79, 80

editing                                                                  82

export of                                                              86

maximum of                                                       100

CASE                                                                          3

CCBOND_FLAG                                                    97

CCSS                                                     i, 7, 15, 91, 99

chemical shift                                                     1, 15

prediction of                                                       91

chemical valence                                       22, 44, 71

chemical_shifts.def                                     7, 15, 91

chromatic graph                                                        2

COLOC                                                          i, 36, 50

combinatorial explosion                                        15

computer assisted structure elucidation              3

connectivity                 1, 13, 30, 35, 36, 38, 49, 87

format of                                                              88

ID of                                                                    14

COSY                     i, 13, 16, 30, 49, 50, 52, 95, 96, 97

COSY_BC                                                   49, 50, 95

COSY_DIAG_RESO                                        50, 95

COSY_J_CATEG                                             49, 94

data acquisition                                                      12

DEPT                                                        i, 13, 15, 27

diagonal peak                                                         33

DISP_CMPLT_DELAY                                 65, 101

Display/Building Blocks & Fixed Bonds     61, 79

Display/Display Option/Chemical Shifts            81

Display/Display Options/Balls                            81

Display/Display Options/Chemical Shifts          81

Display/Display Options/Connection Table 80, 81, 82

Display/Display Options/Element Symbols       81

Display/Display Options/Molecular Formula 70, 82

Display/Display Options/Numbers                     81

Display/Display Options/Protons                       82

Display/Display Options/Refine                    80, 82

Display/Generated Structures or Assignments 61, 80

Display/Status Window                                        81

Display/Target Structure                                      80

distance constraint                                                85

double bond equivalence                                     22

dummy bond                       16, 57, 60, 64, 67, 79, 80

EC                                                                                i

Edit/Generated Structures                                     82

Edit/Log File                                               49, 66, 78

Edit/Master Data File                                            49

Edit/NMR Data File                                  27, 30, 32

Edit/Parameters/2D Structure Generation     77, 93

Edit/Parameters/NMR Interpretation                  48

Edit/Parameters/Parameter File                             93

environment constraint                               i, 16, 62

format of                                                             62

input of                                                                62

File/Create NMR Data File                                    78

File/Create NMR Data File/C13 and DEPT        27

File/Create NMR Data File/COSY                       30

File/Create NMR Data File/H1                             24

File/Create NMR Data File/HMQC (or HETCOR) 35

File/Create NMR Data File/NOESY (or ROESY) 38

File/Exit                                                                    23

File/Molecular Formula                                         21

File/New                                                      20, 77, 78

File/Open                                                                18

File/Save                                                                 23

File/Save As                                                           23

FIX_BOND_FLAG                               55, 67, 97, 98

fixed bond                                  14, 55, 56, 79, 80, 97

focus atom                                                              62

free bond                                                                 44

number of                                                           56

GEN_FLAG                                       5, 17, 74, 98, 99

graphical display                                                    79

H1_MULT_FLAG                               25, 67, 87, 96

H1MULT_FLAG                                                    55

HETCON_FLAG                                                     97

HETCOR                                                   i, 13, 35, 43

HMBC                        i, 13, 36, 38, 50, 51, 52, 95, 96

HMBC_BC                                          50, 51, 95, 99

HMQC                               i, 13, 29, 35, 39, 43, 44, 52

IDEAL_COSY                              38, 51, 55, 67, 96

ignored atom                                                          57

INAD_BC                                                   51, 52, 95

INADEQUATE                                  i, 13, 51, 96, 97

intensity level                              34, 37, 38, 40, 49

IR                                                                        12, 13

J-coupling                                                                  1

J-coupling constant                                 33, 40, 49

knowledge base                                                        4

licensing                                                                    6

log file                                                                      18

long-range coupling                          34, 40, 49, 67

through-π-electron                                            49

main graphics window                                           79

mass spectroscopy                                                  1

master data file                                                        89

keyword of                                                          89

modification of                                                   89

record of                                                              89

MAX_ERR_BC                                          67, 97, 99

MAX_REC_STR                                            67, 100

MAX_RING_SIZE                                           67, 99

maxNSBC                                      14, 49, 50, 51, 54

MDF                                              i, 16, 18, 52, 56, 89

MDL file                                                                   69

MF                                                       i, 13, 21, 26, 77

MIN_MB_C13                                                        99

MIN_MB_H1                                                          95

MIN_RING_SIZE                                             67, 99

minNSBC                                             14, 50, 51, 54

molecular formula                                         i, 12, 21

molecular symmetry                                           3, 43

MS                                                                      12, 13

multiplicity                                                   1, 15, 25

13C                                                                        44

1H                                                                          67

N_FBX_STEP                                             74, 98, 99

near-diagonal peak                                         33, 50

negative information                                16, 51, 67

NMR data file                                                   18, 87

1D data                                                                87

2D data                                                                87

keywords of                                                        87

nmrsams.ini                                                       79, 80

NOESY                                          i, 13, 38, 51, 52, 85

NOESY_BC                                                             51

NOESY_DIST                                             38, 85, 96

NSBC                                                   i, 49, 50, 51, 54

Operating Systems                                                   5

paclitaxel                                                      58, 64, 66

parameters                                            16, 17, 20, 74

for resonance assignment                          72, 74

for setting up ACMX                                        96

for spectral interpretation                           48, 94

for structure generation                              65, 98

parameter file                                                      18

summary of                                                         93

partial structure elucidation i, 2, 3, 16, 29, 39, 43, 57, 67

Partial Structure Elucidation                           60, 64

peak ID                                                                    14

peak intensity                                                     1, 38

peak picking                                                     12, 24

export of                                                              84

manual                                                                 39

peak table conversion                                           24

periodic_tab.def                                                       7

PRO_LEVEL                                                            96

PSE                                                                  i, 39, 43

rate of BC-satisfaction                                           98

REC_SS_FLAG                                               67, 100

reference                                                                102

RELIAB_PEAK_PROB                     49, 50, 51, 96

reliability                                                           33, 50

report generation                                                    84

resonance assignment       1, 3, 4, 15, 16, 67, 69, 73

display of                                                            81

export of                                                              85

ring size

maximum of                                                         99

minimum of                                                          99

ROESY                                                                     38

root name                                                                18

SAT_BC_RATE                                   66, 74, 98, 99

short-range coupling                                      34, 49

SpecMan                                          1, 12, 24, 31, 87

spectral interpretation                                           48

13C                                                                        43

1H                                                                         42

COSY                                                                   49

HMQC                                                          43, 44

INADEQUATE                                                  51

NOESY                                                         51, 52

spectral source                                                       14

status window                                                    6, 81

structure file                                                           18

structure generation                15, 57, 64, 65, 67, 97

complexity of                                                      56

efficiency of                               3, 15, 32, 34, 51

heuristic                                                         98, 99

interactive                                                     61, 68

symmetric peaks                                                    33

target structure                                                       67

display of                                                      79, 80

user intervention                                                   16

UV                                                                       12, 13

verbose mode                                                           96

working data set                                              17, 18