Usage-Driven Database Design - From Logical Data Modeling through Physical Schema Definition

von: George Tillmann

Apress, 2017

ISBN: 9781484227220

Sprache: Englisch

379 Seiten, Download: 7945 KB

Format: PDF, auch als Online-Lesen

geeignet für:

Typ: B (paralleler Zugriff)

eBook anfordern

▸

Mehr zum Inhalt

Usage-Driven Database Design - From Logical Data Modeling through Physical Schema Definition

	Contents at a Glance	5
	Contents	7
	About the Author	18
	Preface	19
	Part I: Introduction	23
	Chapter 1: Introduction to Usage-Driven Database Design	24
	Database Design Principle 1: Separation Principle	26
	Database Design Principle 2: Distinction Principle	27
	The Difference Between Separation and Distinction	29
	Database Design Principle 3: Convergence Principle	29
	The Separation, Distinction, and Convergence Principles	30
	Database Design Principle 4: Minimal Regression Principle	30
	Usage-Driven Database Design	30
	Logical Data Modeling	31
	Physical Schema Definition	32
	The Terminology Trap	32
	Notes	33
	Part II: Logical Data Modeling	34
	Chapter 2: The E-R Approach	35
	A Little Data Modeling History	37
	Some Important Definitions	38
	Logical Data Modeling Objects	39
	Entities	39
	Type-Instance Distinction	40
	Relationships	40
	Attributes	41
	Notes	42
	Chapter 3: More About the E-R Approach	43
	More About Relationships	43
	Membership Class	43
	Cardinality	44
	Modality	45
	Degree	47
	Binary Relationship	47
	N-ary Relationships	48
	Unary or Recursive Relationships	48
	Relationship Constraints	49
	Inclusion	49
	Exclusion	50
	Conjunction	50
	Simple Conjunction	50
	Conditional Conjunction	51
	Recursive Modality Constraints	52
	More About Entities	55
	Attributive Entity	55
	Associative Entities	56
	Supertype and Subtype Entities (Generalization and Specialization)	56
	More About Attributes	58
	Attribute Domain	58
	Attribute Source: Primitive and Derived	59
	Attribute Descriptor and Unique Identifier	59
	Compound or Concatenated Unique Identifiers	60
	Attribute Complexity: Simple and Group	60
	Attribute Valuation: Single Value and Multivalue	61
	Attribute Complexity and Valuation	61
	Chapter 4: Building the Logical Data Model	64
	The Interview Process	65
	Gather Information and Review	66
	1. Identify the Users Who Are Authorities or Experts on the Subject	66
	2. Meet and Interview the Experts and Identify the Subject (Application) Entities	66
	Preparation	66
	The First Interview	67
	3. Identify Relationships Between the Entities	67
	4. Identify the Properties or Attributes of the Entities and Relationships	67
	Analyze Information	67
	Construct Model	68
	Repeat as Necessary	68
	Making Sense of the Interview	68
	Modeling Rules	70
	Verifying What You Have Heard	72
	Immediate Interview Feedback	72
	Formal Walk-Throughs	72
	Increasing E-R Diagram Comprehension	74
	Subject Areas	74
	Entity Fragments	75
	Neighborhood Diagrams	76
	Relationship Bridges and Stubs	77
	Some Model Building Best Practices	78
	Getting Started	78
	Don’t Lose Control of the Project to Users	79
	Don’t Lose Control of the Project to Technical Staff	79
	Don’t Become Dependent on Tools or Techniques	80
	Don’t Get Bogged Down in Endless Analysis	80
	The Players…and the Rules of Engagement	81
	Deliverables	82
	Examples of Deliverables	83
	Sample Data Dictionary, Data Object Definitions	84
	Notes	86
	Chapter 5: LDM Best Practices	87
	Abbreviations	88
	Almost Unique Identifiers	89
	Clarity	90
	Compound Unique Identifiers	90
	Conceptual Integrity	91
	Conjunctive Relationships	93
	Duplicate Super-Subtypes “Type” Data	93
	Exclusive and Nonexclusive Generalization	94
	Required and Nonrequired Participation	96
	Exclusive Relationships	96
	Group Attributes	97
	Level of Abstraction	97
	Many-to-Many Relationships	98
	N-ary Relationships	100
	N-ary Relationships and Membership Class	101
	Naming Objects	104
	Multiple Names	104
	Naming Conventions	105
	Name Uniqueness	105
	Naming Convention Goals	106
	Null Attributes	109
	There Be Blanks in Them Thar Nulls	109
	Optional Relationships (Optional-Optional Relationships)	110
	Subject Areas	110
	Supertypes and Subtypes	111
	Unique Identifiers	113
	Note	115
	Chapter 6: LDM Pitfalls	116
	Circular Relationships	116
	Data Values	117
	Data Value–Differentiated Entities and Attributes	118
	Derived Data	119
	Three Poor Arguments Against Modeling Derived Data	119
	Derived Data as Process	120
	Derived Data and Physical Database Design	121
	Discrete Attributes	122
	Embedded Attributes	123
	Uniqueness	123
	Group Attributes	123
	The Problem with Embedded Attributes	123
	The Solution	124
	The Moral of the Story	124
	Entity Fragmentation	124
	Foreign Keys	125
	Junction Entities	126
	Normalization	126
	Presentation Data	127
	Primary Keys	127
	Process Data	128
	Repeating Groups	129
	Multivalue Attribute	129
	Group Attribute	130
	Single-Attribute Entities	130
	Code	131
	Multivalue Attribute (Repeating Group)	131
	Associative Entities	131
	Substitution Data	132
	Substitution Tables	132
	Transient Data	132
	Location-Dependent Data	133
	Chapter 7: LDM Perils to Watch For	135
	Associatives Related to Other Associatives	135
	Diagrammable Objects	136
	Disassociated Entity Clusters (“Islands”)	137
	Duplicate Unique Identifiers	138
	One Entity, Two or More Identifiers	138
	One Identifier, Two or More Entities	138
	Multiple Relationships	139
	One Relationship, Multiple Views	139
	Multiple Different but Similar Relationships	139
	One-of-a-Kind (OOAK) Entities	140
	One-to-One Relationships	140
	Rare Entity Relationships	141
	Recursive Modality Constraints	142
	Updating the Constraints	143
	Spiderwebs	144
	Too Many Blanks or Nulls	145
	Too Many Recursives	146
	Next U3D Phase: Physical Schema Definition	147
	Notes	147
	Part III: Physical Schema Definition	148
	Chapter 8: Introduction to Physical Database Design	149
	A Short Incondite History of Automated Information Management (or, a Sequential Look at Random Access)	150
	Information Management Era 1: Sequential Processing	150
	Information Management Era 2: The First Random Access DBMS	151
	A Small Digression: A Couple of Words About Database Access	156
	Hashing	156
	Inverted Indices	158
	Database Pages	159
	B-Trees	160
	Bitmaps	162
	Associative Arrays	163
	Information Management Era 3: Inverted File Systems	164
	Information Management Era 4: The Age of Relational	164
	Problems with Relational	166
	First—Performance Issues	166
	Second—Not So Simple Simplicity	166
	Data Types	166
	Procedural Code	166
	Groups	167
	Third—Communication and Language	167
	Fourth—Relational: Theory or DBMS?	169
	Fifth—Where Are You Relational Model?	170
	Just Because It Has Failings Doesn’t Mean It’s a Failure	170
	Information Management Era 5: Object Technology	171
	Object-Oriented Programming Led to Object-Oriented Analysis and Design, Which Eventually Led to the Object-Oriented Database Management Systems (OODBMSs)	171
	A Small Digression (Again): The ACID Test	172
	Information Management Era 6: NoSQL	173
	Key-Value	173
	Graph	174
	Document Management	174
	Multimodal	174
	Is That an ACID or a BASE?	174
	And the Winner Is…	176
	What’s to Come	177
	References	177
	Notes	177
	Chapter 9: Introduction to Physical Schema Definition	179
	Usage-Driven Database Design: Physical Schema Definition	181
	Step 1: Transformation	182
	Task 1.1: Translation	183
	Task 1.2: Expansion	184
	Step 2: Utilization	185
	Task 2.1: Usage Analysis	185
	Task 2.2: Path Rationalization	187
	Step 3: Formalization	188
	Task 3.1: Environment Designation	189
	Task 3.2: Constraint Compliance	191
	Step 4: Customization	193
	Task 4.1: Resource Analysis	194
	Task 4.2: Performance Enhancement	196
	Summary	197
	Chapter 10: Transformation: Creating the Physical Data Model	199
	Task 1.1: Translation	200
	Activity 1.1.1: Transform LDM Objects to PDM Objects	200
	Entities to Record Types	201
	Relationships to Linkages	204
	Linkage Membership Class	204
	Linkage Degree	205
	Linkage Constraints	205
	Attributes to Data Items	206
	Data Item Domain	207
	Data Item Source: Primitive and Derived	207
	Primitive Data Item: Unique Identifiers and Descriptors	207
	Data Item Complexity: Simple and Group	207
	Data Item Valuation: Single Value and Multivalue	208
	Other Data Item Information	208
	Activity 1.1.2: Diagram the Objects	208
	Task 1.2: Expansion	209
	Activity 1.2.1: Assign Keys	209
	Activity 1.2.2: Normalize the Model	211
	Adjustments Needed for Normalization: Keys—Foreign and Domestic?	213
	Zero Normal Form	214
	First Normal Form	214
	Before Getting to Second Normal Form, a Slight Digression	215
	Second Normal Form	215
	Third Normal Form	216
	Post-Normalization—Retreat of Sally Forth?	217
	Issues with Normalization	217
	Tranformation Notes	219
	Deliverables	220
	Examples of Deliverables	220
	Chapter 11: Utilization: Merging Data and Process	224
	Task 2.1: Usage Analysis	225
	Process Modeling	225
	Logical Process Modeling	225
	Natural-Language Logical Process Modeling Techniques	226
	Plain English	226
	Oy Vey, There Has Got to Be a Better English Translation	226
	Structured English	226
	Graphical Logical Process Modeling Techniques	228
	Physical Process Modeling	230
	Natural-Language Physical Process Modeling Techniques	230
	Plain English	230
	Structured English	230
	Pseudocode	230
	Graphic Physical Process Modeling Techniques	231
	Flow Charts	231
	Structure Charts	232
	Activity 2.1.1: Create Usage Scenarios	233
	Clearing the Decks for Action	234
	Putting a Usage Scenario Together	237
	An Example	237
	Activity 2.1.2: Map Usage Scenarios to the PDM	239
	Task 2.2: Path Rationalization	240
	Activity 2.2.1: Reduce to Simplest Paths	240
	Activity 2.2.2: Simplify Model	241
	Utilization Notes	242
	Deliverables	242
	Example of Deliverables	243
	Further Reading	243
	Structured English	243
	Data Flow Diagramming	244
	Flow Charts	244
	Pseudocode	244
	Structure Charts	244
	Chapter 12: Formalization: Creating a Schema	245
	Task 3.1: Environment Designation	246
	Hierarchical Systems	249
	Network Systems	249
	Relational Systems	249
	Object-Oriented	250
	NoSQL	250
	DBMS Product and Version Selection	250
	Task 3.2: Constraint Compliance	251
	Pseudocode…Again	251
	Activity 3.2.1: Map Rationalized Physical Data Model to the Data Architecture	253
	Record Types	253
	Proper	253
	Associative	253
	Attributive	254
	S-Type	255
	Links	255
	Membership Class: Cardinality	255
	One-to-One	255
	One-to-Many	256
	Many-to-Many	256
	Membership Class: Modality	256
	Mandatory	257
	Optional	257
	Degree	257
	Unary	257
	Binary	258
	N-ary	258
	Constraints	259
	Inclusion	259
	Exclusion	259
	Conjunction	259
	Data Items	260
	Domains	260
	Source: Primitive and Derived	260
	Primitive Data Items	260
	Derived Data Items	261
	Complexity: Simple and Group	261
	Valuation: Single Value and Multivalue	262
	Activity 3.2.2: Create a DBMS Product/Version-Specific Functional Physical Database Design	263
	Subschema Creation	265
	Formalization Notes	267
	Deliverables	267
	Example of Deliverables	268
	Chapter 13: Customization: Enhancing Performance	269
	Task 4.1: Resource Analysis	270
	The Trade-Off Triangle	271
	Task 4.2: Performance Enhancements	274
	Activity 4.2.1: Customize Hardware	275
	A Few Words About Secondary Storage	275
	Add Disk	277
	Faster Disk	277
	Main Memory	277
	Activity 4.2.2: Customize Software	278
	Indices (B-Tree, Hash, Bitmap)	278
	Clustering	278
	Example Using Indices and Clusters	281
	Question 1: Should Order Be Indexed?	283
	Question 2: Should Product Be Indexed?	283
	Question 3: Should Line Item Be Indexed?	284
	Question 4: Should Line Item Be Clustered?	284
	Question 5: Should Line Item Be Clustered Around Order or Product?	284
	Alternative 1: Line Item Clustered Around Order	284
	Alternative Two: Line Item Clustered Around Product	285
	Partitioning	286
	Derived and Duplicate Data	287
	Denormalization	288
	Get Rid of ACID	288
	Big Data, Big Problems, Big Solution	289
	To Plunge or Not to Plunge	290
	NoSQL	291
	Modeling Big Data U3D Style	292
	Customization Notes	294
	Deliverables	294
	Examples of Deliverables	295
	Chapter 14: The Data Warehouse	298
	The Data Warehouse	299
	Data Warehouse Architecture	301
	Using U3D to Develop a Data Warehouse	302
	Step 1: Transformation	302
	Step 2: Utilization	303
	The Time Dimension	305
	Step 3: Formalization	306
	Step 4: Customization	306
	Streamlining	306
	Duplication	306
	Denormalization	306
	Indices and Hashing	307
	Bitmaps	307
	Bulk Loading	308
	Clustering	309
	Partitioning	309
	Distributed Processing	309
	All Together Now…	310
	Oops…	310
	Customization Notes	311
	Note	311
	Chapter 15: The Big Data Decision Support System	312
	Structured, Unstructured, and Semistructured Data—Another Small Digression	313
	DSS and Big Data	316
	Using U3D to Develop a Big Data Decision Support System	317
	Step 1: Transformation	318
	Step 2: Utililization	318
	Step 3: Formalization	319
	Step 4: Customization	321
	A Little About Hadoop	321
	Putting It All Together	324
	Deliverables	324
	Part IV: Where from Here?	326
	Chapter 16: A Look Ahead	327
	We Need to Ask the Awkward Questions	328
	Tools Need to Take Usage into Account	330
	The One and Only DBMS	330
	Better Training	334
	Notes	336
	Part V: Appendixes	337
	Appendix A: Glossary	338
	Appendix B: Logical Data Modeling Definitions	357
	Entity	357
	Relationship	358
	Attribute	358
	Domain	359
	Appendix C: Physical Schema Definition Object Definitions	360
	Record	360
	Linkage	361
	Data Item	362
	Domain	362
	Cluster	363
	Partition	363
	Index	364
	Appendix D: Formulas Used in This Book	365
	Appendix E: List of U3D Deliverables	367
	Index	370

Usage-Driven Database Design - From Logical Data Modeling through Physical Schema Definition

von: George Tillmann

Usage-Driven Database Design - From Logical Data Modeling through Physical Schema Definition

Kategorien

eBooks