Usage-Driven Database Design - From Logical Data Modeling through Physical Schema Definition

Usage-Driven Database Design - From Logical Data Modeling through Physical Schema Definition

 

 

 

von: George Tillmann

Apress, 2017

ISBN: 9781484227220

Sprache: Englisch

379 Seiten, Download: 7945 KB

 
Format:  PDF, auch als Online-Lesen

geeignet für: Apple iPad, Android Tablet PC's Online-Lesen PC, MAC, Laptop
Typ: B (paralleler Zugriff)

 

eBook anfordern

Mehr zum Inhalt

Usage-Driven Database Design - From Logical Data Modeling through Physical Schema Definition



  Contents at a Glance 5  
  Contents 7  
  About the Author 18  
  Preface 19  
  Part I: Introduction 23  
     Chapter 1: Introduction to Usage-Driven Database Design 24  
        Database Design Principle 1: Separation Principle 26  
        Database Design Principle 2: Distinction Principle 27  
           The Difference Between Separation and Distinction 29  
        Database Design Principle 3: Convergence Principle 29  
           The Separation, Distinction, and Convergence Principles 30  
        Database Design Principle 4: Minimal Regression Principle 30  
        Usage-Driven Database Design 30  
           Logical Data Modeling 31  
           Physical Schema Definition 32  
           The Terminology Trap 32  
        Notes 33  
  Part II: Logical Data Modeling 34  
     Chapter 2: The E-R Approach 35  
        A Little Data Modeling History 37  
        Some Important Definitions 38  
        Logical Data Modeling Objects 39  
           Entities 39  
           Type-Instance Distinction 40  
           Relationships 40  
           Attributes 41  
        Notes 42  
     Chapter 3: More About the E-R Approach 43  
        More About Relationships 43  
           Membership Class 43  
              Cardinality 44  
              Modality 45  
           Degree 47  
              Binary Relationship 47  
              N-ary Relationships 48  
              Unary or Recursive Relationships 48  
           Relationship Constraints 49  
              Inclusion 49  
              Exclusion 50  
              Conjunction 50  
              Simple Conjunction 50  
              Conditional Conjunction 51  
              Recursive Modality Constraints 52  
        More About Entities 55  
           Attributive Entity 55  
           Associative Entities 56  
           Supertype and Subtype Entities (Generalization and Specialization) 56  
        More About Attributes 58  
           Attribute Domain 58  
           Attribute Source: Primitive and Derived 59  
           Attribute Descriptor and Unique Identifier 59  
           Compound or Concatenated Unique Identifiers 60  
           Attribute Complexity: Simple and Group 60  
           Attribute Valuation: Single Value and Multivalue 61  
           Attribute Complexity and Valuation 61  
     Chapter 4: Building the Logical Data Model 64  
        The Interview Process 65  
           Gather Information and Review 66  
              1. Identify the Users Who Are Authorities or Experts on the Subject 66  
              2. Meet and Interview the Experts and Identify the Subject (Application) Entities 66  
                 Preparation 66  
                 The First Interview 67  
              3. Identify Relationships Between the Entities 67  
              4. Identify the Properties or Attributes of the Entities and Relationships 67  
           Analyze Information 67  
           Construct Model 68  
           Repeat as Necessary 68  
        Making Sense of the Interview 68  
           Modeling Rules 70  
        Verifying What You Have Heard 72  
           Immediate Interview Feedback 72  
           Formal Walk-Throughs 72  
        Increasing E-R Diagram Comprehension 74  
           Subject Areas 74  
           Entity Fragments 75  
           Neighborhood Diagrams 76  
           Relationship Bridges and Stubs 77  
        Some Model Building Best Practices 78  
           Getting Started 78  
           Don’t Lose Control of the Project to Users 79  
           Don’t Lose Control of the Project to Technical Staff 79  
           Don’t Become Dependent on Tools or Techniques 80  
           Don’t Get Bogged Down in Endless Analysis 80  
           The Players…and the Rules of Engagement 81  
        Deliverables 82  
        Examples of Deliverables 83  
           Sample Data Dictionary, Data Object Definitions 84  
        Notes 86  
     Chapter 5: LDM Best Practices 87  
        Abbreviations 88  
        Almost Unique Identifiers 89  
        Clarity 90  
        Compound Unique Identifiers 90  
        Conceptual Integrity 91  
        Conjunctive Relationships 93  
        Duplicate Super-Subtypes “Type” Data 93  
           Exclusive and Nonexclusive Generalization 94  
           Required and Nonrequired Participation 96  
        Exclusive Relationships 96  
        Group Attributes 97  
        Level of Abstraction 97  
        Many-to-Many Relationships 98  
        N-ary Relationships 100  
           N-ary Relationships and Membership Class 101  
        Naming Objects 104  
           Multiple Names 104  
           Naming Conventions 105  
           Name Uniqueness 105  
           Naming Convention Goals 106  
        Null Attributes 109  
           There Be Blanks in Them Thar Nulls 109  
        Optional Relationships (Optional-Optional Relationships) 110  
        Subject Areas 110  
        Supertypes and Subtypes 111  
        Unique Identifiers 113  
        Note 115  
     Chapter 6: LDM Pitfalls 116  
        Circular Relationships 116  
        Data Values 117  
        Data Value–Differentiated Entities and Attributes 118  
        Derived Data 119  
           Three Poor Arguments Against Modeling Derived Data 119  
           Derived Data as Process 120  
           Derived Data and Physical Database Design 121  
        Discrete Attributes 122  
        Embedded Attributes 123  
           Uniqueness 123  
           Group Attributes 123  
           The Problem with Embedded Attributes 123  
           The Solution 124  
           The Moral of the Story 124  
        Entity Fragmentation 124  
        Foreign Keys 125  
        Junction Entities 126  
        Normalization 126  
        Presentation Data 127  
        Primary Keys 127  
        Process Data 128  
        Repeating Groups 129  
           Multivalue Attribute 129  
           Group Attribute 130  
        Single-Attribute Entities 130  
           Code 131  
           Multivalue Attribute (Repeating Group) 131  
           Associative Entities 131  
        Substitution Data 132  
        Substitution Tables 132  
        Transient Data 132  
           Location-Dependent Data 133  
     Chapter 7: LDM Perils to Watch For 135  
        Associatives Related to Other Associatives 135  
        Diagrammable Objects 136  
        Disassociated Entity Clusters (“Islands”) 137  
        Duplicate Unique Identifiers 138  
           One Entity, Two or More Identifiers 138  
           One Identifier, Two or More Entities 138  
        Multiple Relationships 139  
           One Relationship, Multiple Views 139  
           Multiple Different but Similar Relationships 139  
        One-of-a-Kind (OOAK) Entities 140  
        One-to-One Relationships 140  
        Rare Entity Relationships 141  
        Recursive Modality Constraints 142  
           Updating the Constraints 143  
        Spiderwebs 144  
        Too Many Blanks or Nulls 145  
        Too Many Recursives 146  
        Next U3D Phase: Physical Schema Definition 147  
        Notes 147  
  Part III: Physical Schema Definition 148  
     Chapter 8: Introduction to Physical Database Design 149  
        A Short Incondite History of Automated Information Management (or, a Sequential Look at Random Access) 150  
           Information Management Era 1: Sequential Processing 150  
           Information Management Era 2: The First Random Access DBMS 151  
              A Small Digression: A Couple of Words About Database Access 156  
              Hashing 156  
              Inverted Indices 158  
              Database Pages 159  
              B-Trees 160  
              Bitmaps 162  
              Associative Arrays 163  
           Information Management Era 3: Inverted File Systems 164  
           Information Management Era 4: The Age of Relational 164  
              Problems with Relational 166  
                 First—Performance Issues 166  
                 Second—Not So Simple Simplicity 166  
                    Data Types 166  
                    Procedural Code 166  
                    Groups 167  
                 Third—Communication and Language 167  
                 Fourth—Relational: Theory or DBMS? 169  
                 Fifth—Where Are You Relational Model? 170  
              Just Because It Has Failings Doesn’t Mean It’s a Failure 170  
           Information Management Era 5: Object Technology 171  
              Object-Oriented Programming Led to Object-Oriented Analysis and Design, Which Eventually Led to the Object-Oriented Database Management Systems (OODBMSs) 171  
              A Small Digression (Again): The ACID Test 172  
           Information Management Era 6: NoSQL 173  
              Key-Value 173  
              Graph 174  
              Document Management 174  
              Multimodal 174  
              Is That an ACID or a BASE? 174  
        And the Winner Is… 176  
        What’s to Come 177  
        References 177  
        Notes 177  
     Chapter 9: Introduction to Physical Schema Definition 179  
        Usage-Driven Database Design: Physical Schema Definition 181  
        Step 1: Transformation 182  
           Task 1.1: Translation 183  
           Task 1.2: Expansion 184  
        Step 2: Utilization 185  
           Task 2.1: Usage Analysis 185  
           Task 2.2: Path Rationalization 187  
        Step 3: Formalization 188  
           Task 3.1: Environment Designation 189  
           Task 3.2: Constraint Compliance 191  
        Step 4: Customization 193  
           Task 4.1: Resource Analysis 194  
           Task 4.2: Performance Enhancement 196  
        Summary 197  
     Chapter 10: Transformation: Creating the Physical Data Model 199  
        Task 1.1: Translation 200  
           Activity 1.1.1: Transform LDM Objects to PDM Objects 200  
              Entities to Record Types 201  
              Relationships to Linkages 204  
                 Linkage Membership Class 204  
                 Linkage Degree 205  
                 Linkage Constraints 205  
              Attributes to Data Items 206  
                 Data Item Domain 207  
                 Data Item Source: Primitive and Derived 207  
                    Primitive Data Item: Unique Identifiers and Descriptors 207  
                 Data Item Complexity: Simple and Group 207  
                 Data Item Valuation: Single Value and Multivalue 208  
                 Other Data Item Information 208  
           Activity 1.1.2: Diagram the Objects 208  
        Task 1.2: Expansion 209  
           Activity 1.2.1: Assign Keys 209  
           Activity 1.2.2: Normalize the Model 211  
              Adjustments Needed for Normalization: Keys—Foreign and Domestic? 213  
              Zero Normal Form 214  
              First Normal Form 214  
              Before Getting to Second Normal Form, a Slight Digression 215  
              Second Normal Form 215  
              Third Normal Form 216  
              Post-Normalization—Retreat of Sally Forth? 217  
              Issues with Normalization 217  
        Tranformation Notes 219  
        Deliverables 220  
        Examples of Deliverables 220  
     Chapter 11: Utilization: Merging Data and Process 224  
        Task 2.1: Usage Analysis 225  
           Process Modeling 225  
           Logical Process Modeling 225  
              Natural-Language Logical Process Modeling Techniques 226  
                 Plain English 226  
                 Oy Vey, There Has Got to Be a Better English Translation 226  
                 Structured English 226  
              Graphical Logical Process Modeling Techniques 228  
           Physical Process Modeling 230  
              Natural-Language Physical Process Modeling Techniques 230  
                 Plain English 230  
                 Structured English 230  
                 Pseudocode 230  
              Graphic Physical Process Modeling Techniques 231  
                 Flow Charts 231  
                 Structure Charts 232  
           Activity 2.1.1: Create Usage Scenarios 233  
              Clearing the Decks for Action 234  
           Putting a Usage Scenario Together 237  
           An Example 237  
           Activity 2.1.2: Map Usage Scenarios to the PDM 239  
        Task 2.2: Path Rationalization 240  
           Activity 2.2.1: Reduce to Simplest Paths 240  
           Activity 2.2.2: Simplify Model 241  
        Utilization Notes 242  
        Deliverables 242  
        Example of Deliverables 243  
        Further Reading 243  
           Structured English 243  
           Data Flow Diagramming 244  
           Flow Charts 244  
           Pseudocode 244  
           Structure Charts 244  
     Chapter 12: Formalization: Creating a Schema 245  
        Task 3.1: Environment Designation 246  
           Hierarchical Systems 249  
           Network Systems 249  
           Relational Systems 249  
           Object-Oriented 250  
           NoSQL 250  
           DBMS Product and Version Selection 250  
        Task 3.2: Constraint Compliance 251  
           Pseudocode…Again 251  
           Activity 3.2.1: Map Rationalized Physical Data Model to the Data Architecture 253  
              Record Types 253  
                 Proper 253  
                 Associative 253  
                 Attributive 254  
                 S-Type 255  
              Links 255  
                 Membership Class: Cardinality 255  
                    One-to-One 255  
                    One-to-Many 256  
                    Many-to-Many 256  
                 Membership Class: Modality 256  
                    Mandatory 257  
                    Optional 257  
                 Degree 257  
                    Unary 257  
                    Binary 258  
                    N-ary 258  
                 Constraints 259  
                    Inclusion 259  
                    Exclusion 259  
                    Conjunction 259  
           Data Items 260  
              Domains 260  
                 Source: Primitive and Derived 260  
                    Primitive Data Items 260  
                    Derived Data Items 261  
                 Complexity: Simple and Group 261  
                 Valuation: Single Value and Multivalue 262  
           Activity 3.2.2: Create a DBMS Product/Version-Specific Functional Physical Database Design 263  
              Subschema Creation 265  
        Formalization Notes 267  
        Deliverables 267  
        Example of Deliverables 268  
     Chapter 13: Customization: Enhancing Performance 269  
        Task 4.1: Resource Analysis 270  
           The Trade-Off Triangle 271  
        Task 4.2: Performance Enhancements 274  
           Activity 4.2.1: Customize Hardware 275  
              A Few Words About Secondary Storage 275  
                 Add Disk 277  
                 Faster Disk 277  
                 Main Memory 277  
           Activity 4.2.2: Customize Software 278  
              Indices (B-Tree, Hash, Bitmap) 278  
              Clustering 278  
              Example Using Indices and Clusters 281  
                 Question 1: Should Order Be Indexed? 283  
                 Question 2: Should Product Be Indexed? 283  
                 Question 3: Should Line Item Be Indexed? 284  
                 Question 4: Should Line Item Be Clustered? 284  
                 Question 5: Should Line Item Be Clustered Around Order or Product? 284  
                 Alternative 1: Line Item Clustered Around Order 284  
                 Alternative Two: Line Item Clustered Around Product 285  
              Partitioning 286  
              Derived and Duplicate Data 287  
              Denormalization 288  
              Get Rid of ACID 288  
              Big Data, Big Problems, Big Solution 289  
              To Plunge or Not to Plunge 290  
              NoSQL 291  
              Modeling Big Data U3D Style 292  
        Customization Notes 294  
        Deliverables 294  
        Examples of Deliverables 295  
     Chapter 14: The Data Warehouse 298  
        The Data Warehouse 299  
           Data Warehouse Architecture 301  
        Using U3D to Develop a Data Warehouse 302  
           Step 1: Transformation 302  
           Step 2: Utilization 303  
              The Time Dimension 305  
           Step 3: Formalization 306  
           Step 4: Customization 306  
              Streamlining 306  
              Duplication 306  
              Denormalization 306  
              Indices and Hashing 307  
              Bitmaps 307  
              Bulk Loading 308  
              Clustering 309  
              Partitioning 309  
              Distributed Processing 309  
              All Together Now… 310  
              Oops… 310  
        Customization Notes 311  
        Note 311  
     Chapter 15: The Big Data Decision Support System 312  
        Structured, Unstructured, and Semistructured Data—Another Small Digression 313  
        DSS and Big Data 316  
        Using U3D to Develop a Big Data Decision Support System 317  
           Step 1: Transformation 318  
           Step 2: Utililization 318  
           Step 3: Formalization 319  
           Step 4: Customization 321  
              A Little About Hadoop 321  
              Putting It All Together 324  
        Deliverables 324  
  Part IV: Where from Here? 326  
     Chapter 16: A Look Ahead 327  
        We Need to Ask the Awkward Questions 328  
        Tools Need to Take Usage into Account 330  
        The One and Only DBMS 330  
        Better Training 334  
        Notes 336  
  Part V: Appendixes 337  
     Appendix A: Glossary 338  
     Appendix B: Logical Data Modeling Definitions 357  
        Entity 357  
        Relationship 358  
        Attribute 358  
        Domain 359  
     Appendix C: Physical Schema Definition Object Definitions 360  
        Record 360  
        Linkage 361  
        Data Item 362  
        Domain 362  
        Cluster 363  
        Partition 363  
        Index 364  
     Appendix D: Formulas Used in This Book 365  
     Appendix E: List of U3D Deliverables 367  
  Index 370  

Kategorien

Service

Info/Kontakt

  Info
Hier gelangen Sie wieder zum Online-Auftritt Ihrer Bibliothek