• Non ci sono risultati.

Relational Databases Databases

N/A
N/A
Protected

Academic year: 2021

Condividi "Relational Databases Databases"

Copied!
64
0
0

Testo completo

(1)

An An Introduction to Relational Introduction to Relational DatabasesDatabases

An Introduction to

Relational Databases An An Introduction to Introduction to

Relational

Relational Databases Databases

–– ChapterChapter 1 –1 –

Foundations of Information Systems (WS 2008/09) Foundations

Foundations of Information Systems (WS 2008/09)of Information Systems (WS 2008/09)

(2)

Strategy

Strategy in this chapterin this chapter

Microsoft Access for Beginners Microsoft Access Microsoft Access

for Beginners for Beginners

Basic concepts of relational

databases Basic

Basic conceptsconcepts of relational of relational

databases databases Strategy

Strategy in thisin this chapter:chapter:

PracticePractice

Application orientedApplication oriented

Hands-Hands-on on experienceexperience

TheoryTheory

PrinciplesPrinciples and ideasand ideas

Scientific basisScientific basis

BothBoth sidessidesareare important:important: LearningLearningLearning without applying is rather useless !withoutwithout applyingapplyingis rather uselessis rather useless !!

(3)

Relational

Relational databasesdatabases: General remarks: General remarks

At presentAt present, , thethe DB-DB-marketmarket is completely dominated by systems is completely dominated by systems supporting the supporting the relational

relational modelmodel of dataof data..

Leading (commercialLeading (commercial) manufacturers) manufacturers of relational DBof relational DB--productsproducts::

The notionThe notion"relational" is"relational" is motivatedmotivated bybythethemathematicalmathematicalconceptconcept of a relationof a relation. . Relations in

Relations in mathematics are sets of mathematics are sets of tuplestuples. .

Relational databasesRelational databases areare collectionscollectionsof of one or moreone or more relations.relations.

In practiceIn practice, relations , relations can becan bevisualized asvisualized as tables,tables, the rowsthe rows of of which are individual which are individual records

records of of datadatawith thewith the samesame(homogeneous(homogeneous) ) fieldfield structure.structure.

In scienceIn science, relational , relational databases have a databases have a broad range broad range of of theoretical foundationstheoretical foundations.. Oracle

Oracle SybaseSybase

Microsoft (Access, SQL Server)

Microsoft (Access, SQL Server) Postgres Postgres (Freeware)(Freeware) IBM (DB2,

IBM (DB2, Informix)Informix) MySQLMySQL (Freeware) (Freeware)

(4)

Relational

Relational databasesdatabases: History: History

Edgar F.

Edgar F. CoddCodd

TheTheideaideato organize datato organize data in tablesin tables isis quite quite old and

old and pretty obviouspretty obvious. .

TheTheidea to idea to investigate this representationinvestigate this representation of data by means of data by means of of the theory of relationsthe theory of relations is due

is due to to one one man, man, who proposed this viewwho proposed this view at the at the end of end of the the 1960s: 1960s:

Edgar F.

Edgar F. CoddCodd

In 1970, he publishedIn 1970, he published his seminalhis seminal paper paper

"A Relational Model of

"A Relational Model of Data for Large Data for Large Shared Data

Shared Data Banks",Banks",

in whichin which he fixedhe fixed all foundationsall foundations of relationalof relational databases

databases withwith amazingamazingprecisionprecision and and clarityclarity. .

Codd died in Codd died in earlyearly 2003.2003.

For this pioneering work Codd received the Turing Award in 1982, the „Nobel price of informatics".

For this pioneering work Codd receivedFor this pioneering work Codd received the Turing theTuring AwardAwardin 1982, thein 1982, the „Nobel Nobel price

price of informaticsof informatics".".

(5)

Literature Literature

A strong recommendation for your own studiesA strong recommendation for your own studies::

Ramon A. Mata-Toledo, Pauline K. Cushman

“Fundamentals of Relational Databases”

(Schaum's Outlines)

McGraw-Hill Professional ISBN 978-0071361880 249 pp.

€ 15,99 (amazon.de) Ramon A. Mata

Ramon A. Mata-Toledo-Toledo, , Pauline K. CushmanPauline K. Cushman

““Fundamentals of Relational Databases”Fundamentals of Relational Databases”

(Schaum's(Schaum's Outlines) Outlines)

McGMcGraw-raw-Hill Professional Hill Professional ISBN

ISBN 978-978-0071361880 0071361880 249 pp.

249 pp.

15,99 (amazon15,99 (amazon.de).de)

There are many

There are many good and good and expensive academic textbooks expensive academic textbooks on (relational) on (relational) databasesdatabases.. This one is cheap

This one is cheap and and not really not really a higha high--profile book, profile book, but it fits perfectly with ourbut it fits perfectly with our lecture

lecture, , is is upup--toto--date, date, very readable and very readable and covers exactly what you needcovers exactly what you need.. Everybody should have

Everybody should have his/her his/her own copy!own copy!

(6)

A A briefbriefprimer in dataprimer in data representationrepresentationformatsformatsand dataand data manipulationmanipulationparadigmsparadigms

BeforeBefore „divingdiving into“into relational databasesrelational databases proper, weproper, we will brieflywill briefly investigateinvestigatevariousvarious competitive

competitive formatsformats of representingof representing and manipulatingand manipulating datadata arrangedarrangedin tabularin tabular form:form:

plainplain text filestext files

formattedformatted text filestext files

spreadsheetsspreadsheets

relational databasesrelational databases

In eachIn each representationrepresentationformat, format, thethe datadataareare storedstored in filesin files. Such . Such filesfiles maymaywell bewell be con-con- sidered

sidered as databasesas databases however, however, therethere areare different degreesdifferent degrees of „of „databasedatabase--nessness““!!

EachEach formatformat comescomesalongalong withwith a speciala special software system (orsoftware system (or programprogram) that) that controlscontrols anyany kindkind of accessof access to and manipulationto and manipulation of of thethe respectiverespective „databasedatabase““..

DataData manipulationmanipulation in thisin this contextcontextmeansmeans searchingsearching forforspecialspecial datadata in thein the filefile and/orand/or changing

changing (adding(adding, , deletingdeleting, , modifyingmodifying) ) datadata..

EachEach of theseof these pairspairs of representationof representation formatformat+ manipulation+ manipulation system cansystem can bebe viewedviewed asas a particulara particular variantvariant of theof the equationequation DBS = DBMS + DB. DBS = DBMS + DB.

(7)

Reminder:

Reminder: databasedatabase (management(management) systems) systems

DBDB DBMSDBMS

Database System Database System Database System

. . . .

users

users and and application

application programmesprogrammes DBMS: Database Management SystemDBMS:

databases databases

(application(application--independent independent servicesservices forfor managing

managing data)data)

(8)

A A littlelittle„casecase study“study“: : TheThe chemicalchemicalelementselements„databasedatabase“

Representing

Representing information/information/datadata aboutaboutthethe 116 chemical116 chemical elementselementsin different in different formats:formats:

as textfile as textfile ((separatedseparatedbyby tabstabs))

as textfile as textfile ((separatedseparatedbyby semicolonssemicolons))

as Word as Word filefile

as as Excel Excel filefile

as Access as Access (relational) database(relational) database

(9)

TheThe simplestsimplest formatformatforfor representingrepresenting thethe chemicalchemicalelementselements datadataisis thethe text file format..

Text filesText files (extension(extension .txt under.txt under Windows) Windows) areareconceptuallyconceptually just longjust long stringsstrings of printable of printable symbols

symbols (such as digits(such as digits, characters, characters, or, or eveneven blanksblanks) arranged) arranged in linesin lines. .

ByBy hand, spaceshand, spaces in a text filein a text file cancan bebearrangedarranged in such a wayin such a way thatthat e.g. a tabular structuree.g. a (rows(rows--columnscolumns))appearsappears..

Thus, Thus, text files appearappear asas simple

simple databases..

A text editorA cancan bebe usedused forfor performingperformingsimplesimple manipulations

manipulations of theof the filefile contents

contents, such , such as patternas pattern matching

matching and substringand substring replacement

replacement: : TheTheeditoreditor

Elements

Elements databasedatabase: .txt : .txt filefile + text editor+ text editor (1)(1)

(10)

DataData in text filesin text files cancan bebe arrangedarranged in anyin any form form convenientconvenientforfor humanshumans readingreading thatthatfile.file.

TheThe text editortext editor isis unableunableto „to „seesee“ thethe particularparticular structuringstructuring conventionconvention (e.g. columns(e.g. columns).).

Computer programsComputer programs usingusing thethe

„text DBtext DB“ don‘don‘t t needneedvisualvisual support

support either, either, theythey just needjust need somesome meansmeans of separating of separating individual

individual partspartsof theof the data.data.

In thisIn this text filetext file versionversion of of thethe elements

elements data, data, thethe line structure isis retainedretained (one(one elementelementper line),per line), butbutcolumnscolumnsarearejust just separatedseparated byby a semicolon delimiter.a .

WeWe will deal laterwill deal later withwithdifferent different popular

popular structuringstructuringconventionsconventions (not(not mademade forforpeople, butpeople, but forfor pro-pro-

grams

grams) such ) such asas XML and RDF.XML and RDF.

Elements

Elements databasedatabase: .txt : .txt filefile + text editor+ text editor (2)(2)

(11)

MS Office MS Office offersoffersa mucha much moremore powerfulpowerfultext text editor, editor, calledcalled MS Word, , supportingsupporting e.g.e.g.

a „a „truetrue“ tabulartabularformatformatforfor datadata(visualizing(visualizing rowsrowsand columnsand columns automatically). Textautomatically). Text files

files managedmanaged byby thetheWord software areWord software are identifiableidentifiable byby theirtheir extensionextension.doc..

In In additionaddition to theto the normal

normal functionsfunctions ofof a „a „plainplain“ text editortext editor, , Word

Word offersoffers severalseveral

„luxuryluxury“ variantsvariants of text

of text editingediting..

WordWord‘s s abilityability to searchto search and changeand change datadata isis notnot moremore powerfulpowerfulthanthan thatthat of „of „normalnormaleditorseditors..

BeingBeing ableable to recognize to recognize tables

tables, , howeverhowever, , makes makes Word

Word moremore convenient.convenient.

Elements

Elements databasedatabase: .doc : .doc filefile + MS Word editor+ MS Word editor

(12)

Word files are just „illusions“ created by the MS Word system. IfIf openedopened withwitha normala normal text

text editoreditor, , theythey turn out to beturn out to be specialspecial text filestext files containingcontaininga lot of a lot of crypticcrypticspecialspecial symbolssymbols generated

generated (and used(and used) by) by thethe Word software.Word software.

TheThe„Word DBMSWord DBMS“ interprets

interprets thesethesespecialspecial symbols

symbols in order to, e.g.,in order to, e.g., generate

generate thethe table formattable format notnotvisiblevisible in „in „plainplain“ text

text filesfiles..

In additionIn addition, Word , Word insertsinserts plenty

plenty of otherof other internalinternal codecode whichwhichisis neededneeded in order to

in order to bebe ableable toto offer

offer thethe extra functionsextra functions notnotpresentpresent in a textin a text editor

editor..

Elements

Elements databasedatabase: .doc : .doc filefile + text editor+ text editor

(13)

ContainedContained in eachin each Office packageOffice package on Microsoft PCs, thereon Microsoft PCs, there isisan an eveneven moremore powerfulpowerful tooltool forformanagingmanaging data, a sodata, a so--calledcalled spreadsheetspreadsheet programprogramcalledcalledMS Excel supporting supporting files

files withwithextensionextension .xls..

IfIf processedprocessed viavia Excel,Excel, eveneven moremore detailsdetails ofof tabular

tabular structurestructurebecomebecome visible

visible and canand can bebe mani-mani- pulated

pulated..

For searchingFor searching and changingand changing,, thethe Excel system doesExcel system does notnot exceed

exceed thethe functionalityfunctionalityofof Word.

Word.

Excel, howeverExcel, however, , isis speciaspecia-- lized

lized in statistical evalua-in tions of numericalof numerical datadata..

Elements

Elements databasedatabase: .: .xlsxls filefile+ MS Excel system+ MS Excel system

(14)

BehindBehind thethe surface, surface, howeverhowever, , therethere isis againagain a speciallya specially formattedformatted text filetext file format, format, containing

containing plentyplenty of of internalinternalcontrolcontrol symbolssymbols interpretedinterpretedbybythethe Excel software.Excel software.

IfIf openedopened withwitha normal text a normal text editor, editor, thisthis „hiddenhidden“ informationinformationbecomesbecomes visible. visible. ItIt isis inin principle

principle notnotdifferent from different from handhand-made-made separatorsseparatorslikelike semicolons

semicolons mixedmixed withwith thethe „properproperdatadata partsparts as seen

as seen before.before.

WhatWhat mattersmatters isisthethe specialspecial software

software managingmanaging thesethese

„enhancedenhanced text filestext files“, , in in thisthis casecase thethe ExcelExcel system.

system.

EnhancedEnhanced system functiosystem functio-- nality

nality requiresrequiresan an enhancedenhanced representation format for representation format for data. data.

Elements

Elements databasedatabase: .: .xlsxls filefile+ text editor+ text editor

(15)

EachEach MS Windows computerMS Windows computer withwithMS Office software MS Office software offersoffersan an evenevenmoremore powerfulpowerfultooltool forfor managingmanaging data, data, calledcalled MS Access thisthis isis thethefirstfirstsystem computersystem computer scientistsscientists wouldwould callcall a „a „realreal“ DBMS. Files „DBMS. Files „understoodunderstood“ byby Access haveAccess have thethe extensionextension.mdb..

Access supportsAccess supports a tabulara tabular viewview of of data, data, tootoo, like, like Word and Excel, Word and Excel, butbut offers

offers a mucha much, , muchmuch moremore powerfulpowerfulsetsetofof techniques

techniques forfor searchingsearching and changingand changing data.data.

Access will beAccess will be consideredconsidered in morein more detail in detail in thethe

remainder

remainder of of thisthis section

section, , dedicateddedicated to to so-so-calledcalled relational databases..

Elements

Elements databasedatabase: .: .mdbmdb filefile+ MS Access DBMS+ MS Access DBMS

(16)

OpeningOpening an mdban mdb..filefile (alias an Access database(alias an Access database) ) withwith a text editora text editor revealsreveals thethe „truetrue nature“nature of of thetherepresentationrepresentation againagain: : HeavylyHeavyly formattedformatted text filetext file formatformatwithwith excessiveexcessiveuseuseof of

internal

internal codingcodinginterpretableinterpretable forfor thethe MS Access DBMS onlyMS Access DBMS only..

NeverthelessNevertheless, searching, searching e.g.

e.g. forfor certaincertain symbolssymbols oror stringsstringswithinwithin thisthis filefile withwith a text editora text editor returns

returns thethe same same results

results as searchingas searching in thein the human-human-friendlyfriendly tabular

tabular text filetext file fromfrom thethe beginning.beginning.

TrickyTricky internalinternal format-format- tingting plus intelligentplus intelligent interpreting

interpreting softwaresoftware isis ableable to generateto generate powerful illusions powerful illusions about databases about databases!!

Elements

Elements databasedatabase: .: .mdbmdb filefile+ text editor+ text editor

(17)

Relational

Relational databasedatabase proper vs. dedicatedproper vs. dedicated filefileformatsformats

In theIn the remainderremainder of of thisthischapterchapter, we, we will usewill use thethe last of theselast of these representationrepresentationformatsformats only:only:

In additionIn addition, we, we will forgetwill forget aboutabouttext text editors, Word and Excel, and editors, Word and Excel, and exploreexplore thethe power ofpower of Access

Access, a , a truetrue relational DBMS.relational DBMS.

In theIn the companioncompanion lecturelecturebyby Prof. Hofmann-Prof. Hofmann-ApitiusApitius youyouwill getwill get to knowto know a widea wide spectrumspectrum of additional

of additional datadata representationrepresentation formats, formats, manymany of of themthemdevelopeddevelopedparticularlyparticularly forfor lifelife science

science applicationsapplications („(„dedicateddedicated formats“formats“).).

All of All of thesethese formatsformatsareare ultimatelyultimately basedbasedon text fileson text files as underlyingas underlying „realreal“ format. Specialformat. Special structuring

structuring informationinformationisis alwaysalways interleaved with „interleaved with „plainplain“ datadata asas was shownwas shown forfor thethe general

general--purposepurposeformatsformats .doc, ..doc, .xlsxls, and ., and .mdbmdb discusseddiscussed beforebefore..

In additionIn addition, most, most of theseof these dedicateddedicated formatsformatscomescomeswithwith itsits ownown „gatekeepergatekeeper“ software,software, comparable

comparable to Word, Excel orto Word, Excel or Access in thatAccess in that ititinterpretsinterprets thethe specialspecialformatformatin a in a particu-particu- larlar, system-, system-specificspecific way. way.

Relational Databases Relational

Relational DatabasesDatabases

(18)

Our example

Our example DBMS: Microsoft AccessDBMS: Microsoft Access

AccessAccessisis a DBMS fora DBMS for relational databasesrelational databases (data(data organizedorganized in form of tablesin form of tables), ), developeddeveloped and distributedand distributed sincesince 1992 by1992 by Microsoft.Microsoft.

"Access-"Access-Homepage" at Microsoft:Homepage" at Microsoft:

http://

http://www.www.microsoftmicrosoft.com.com/office/office//accessaccess//defaultdefault..aspasp

recentrecentversionversionin MS Office packagesin MS Office packages: Access 2000: Access 2000

Access Access isis veryvery wellwell--suitedsuited forfor smallsmall to to mediummediumDB DB applicatonsapplicatons in singlein single--useruser mode.mode.

useful internetuseful internet tutorialstutorialson Access on Access

Michael BrydonMichael Brydon‘‘s s tutorialtutorial at Simon Fraser at Simon Fraser University, University, CanadaCanada http://

http://mis.mis.busbus..sfusfu..caca//tutorialstutorials//MSAccessMSAccess//tutorialstutorials..htmlhtml

Maggie StraplandMaggie Strapland‘‘s Access s Access pagespages at University of Bristol, UKat University of Bristol, UK http://

http://www.www.brisbris.ac.ac..ukuk//isis//servicesservices/software//software/packagespackages//accessaccess//

Jakob Jakob LindenmeyerLindenmeyers Access s Access tutorialtutorialat ETH Zat ETH Züürich, Schweizrich, Schweiz http://

http://www.inf.www.inf.ethz.ethz.chch/personal//personal/lindenmelindenme//publicationspublications//accessaccess//AccessTutorialAccessTutorial..htmlhtml

In additionIn addition, , therethere areare many, many, manymany booksbooks on howon how to useto use Access, mostAccess, most of of themthemnotnot really

really thatthat helpful, helpful, becausebecause therethere isis poorpoorstructurestructure and tooand too manymany detailsdetails..

Throughout the course, we will use a small, but handy DBMS available on most PCs.

Throughout the course

Throughout the course, , we we will will use use a a smallsmall,, but handy

but handy DBMS DBMS availableavailable on most on most PCs.PCs.

(19)

A A first example database: European first example database: European geographygeography

Our first „Our first „real real worldworld“ database exampledatabase example is about geography

is about geography: Facts about coun: Facts about coun-- tries

tries, cities, cities etc. in Europe !etc. in Europe !

A wealth A wealth of of geo data can be accessedgeo data can be accessed freely

freely on on the the web in web in the „the „World World FactFact Book“Book of of the CIA:the CIA:

http://

http://www.www.ciacia..govgov//ciacia/publications/publications// factbook

factbook//indexindex..htmlhtml

You will find a You will find a database called database called

„europeeurope.mdb.mdb“

on the lecture homepage for your ownon the lecture homepage for your own

experimentsexperiments. . This This will will continuously continuously grow

grow you are invited to you are invited to helphelp!!

At this moment At this moment in in our lectureour lecture, , the geothe geo database serves as

database serves as a a first first „„appetizerappetizer“ toto (relational)

(relational) database managementdatabase management..

(20)

A relational

A relational database aboutdatabase aboutEuropean geographyEuropean geography

Europe.

Europe.mdb mdb isis a small a small database for introductory database for introductory purposes

purposes. . Just

Just now it contains now it contains twotwo tables

tables, , one one on on countriescountries in Europe,

in Europe, the other the other on on cities

cities. . Today we

Today we will just will just learnlearn the most basic ideas about the most basic ideas about relational

relational databasesdatabases some some of you of you will will be already famibe already fami-- liar with this

liar with this..

And we And we will will discuss thediscuss the important question why important question why a simple text a simple text filefile is notis not sufficient for keeping data sufficient for keeping data..

(21)

Tables

Tables in Access: some basic terminologyin Access: some basic terminology

Italy I Rome 301230 57460274 2001 . . .

Italy

Italy I Rome I Rome 301230 57460274 2001301230 57460274 2001 . . .

. . .

country code capital area population yea country code capital area population yearr

field names field names

countries countries

table

table namename

text

text char text integer integer integerchar text integer integer integer

field data type field data type record record

Relational

Relational tables are gridstables are grids, , the fields the fields of which are consistingof which are consisting of columns of columns and and rowsrows. . There is

There is a specific terminology for a specific terminology for such such tables in Access.tables in Access.

(22)

Terminological confusion possible Terminological confusion possible !!

relation table datasheet

tuple row record

attribute column field name

domain data type field data type

relation

relation tabletable datasheetdatasheet tuple

tuple rowrow recordrecord

attribute

attribute columncolumn field namefield name domain

domain data typedata type field data typefield data type theory

theory SQLSQL Access Access Unfortunately

Unfortunately, , the basic conceptsthe basic concepts of the of the relational relational model model arearedenoteddenoted byby different termsdifferent terms depending

depending on theon the context. context. There are There are synonymoussynonymous, , but but different different terminologiesterminologies in databasein database theory

theory, , the standard the standard DB DB language language SQL and MS Access:SQL and MS Access:

Be Be warned warned of of this this Babylonic confusionBabylonic confusionof of terms terms we urgently recommend that youwe urgently recommend that you always

always stick to a stick to a single single system of notions system of notions in a in a consistent mannerconsistent manner. . It doesnIt doesn‘‘t matter t matter whichwhich system

system you use you use – but never mix but never mix them them upup !!

(23)

Access table:

Access table: Datasheet view Datasheet view

Access table Access table

„countriescountries":":

Datasheet view Datasheet view

(24)

Access table: Design Access table: Design viewview

Access table Access table

„countriescountries":":

Design

Design viewview

field size field size format format input mask input mask caption caption default value default value validation rule validation rule validation validation texttext Required Required

Allow zero length Allow zero length Indexed

Indexed

Unicode compression Unicode compression

field name field data type

field name field data type description description

Riferimenti

Documenti correlati

In the present study, we explored whether the expertise in using a wheelchair acquired by SCI people influences their abilities to anticipate actions. To address this issue, we

period from 1986 to 2011, in the Paraná River Hydrographical Basin, Brazil, for 32 meteorological stations using 11 climate indices created by the ETCCDI (Expert Team, ET, on

For reactions containing 32% reagent volume, the best genetic profile was generated using 27 PCR cycles and a single disc.. For reactions containing 20% of the recommended

Despite the same energy density is employed in CW and PW strategies, with PW emission the melt pool is observed to be smaller in area, shorter, and less intense

In questa prima parte verrà analizzata la relazione tra arte e politica nella Cina di Mao e sarà evidenziata la trasformazione e l’evoluzione delle arti figurative in questo

Although validation and career guidance broadly share a similar aim (both support an individual’s pursuit of career development and education), validation has a

Compare in terms of predictive performance measures (i. Prediction Error and Brier score) semi-parametric propor- tional hazard models with respect to non parametric tech- niques

Seguendo queste due strategie, sono stati preparati, in collaborazione con gruppi di ricerca esperti nel settore dei materiali e dei biomateriali, bifenili idrossilati funzionalizzati