Web sites need models and schemes

Dipartimento di informatica e automazione Università di Roma Tre

WEB SITES NEED MODELS AND SCHEMES

Paolo Atzeni

atzeni@dia.uniroma3.it

http://www.dia.uniroma3.it/~atzeni

Outline

• Databases and information systems over the Web: a great opportunity
• The design of Web-based information systems (WBIS)
• Models for the design of WBIS
• Conceptual and logical design of WBIS
• Conclusions

Paolo Atzeni ER98 3

▶ Databases and information systems over the Web: a great opportunity
• The design of Web-based information systems (WBIS)
• Models for the design of WBIS
• Conceptual and logical design of WBIS
• Conclusions

Database and information systems technology: present and future

• the technology of relational databases is now mature and reliable: a de facto standard for business applications
• the current challenge is integration:
  – integration of technologies
  – integration of distributed, autonomous, heterogeneous systems

The need for cross-fertilization

• database technology was developed within the domain of business applications; other domains have other requirements (also very different from each other)
• database technology and “X” technology can be complementary, with potential mutual benefit
• basic problems have been solved; specific areas may have specific problems, for which the general solutions need not be satisfactory

Integration of information systems

• Various motivations:
  – interaction of independently developed components
  – cooperation of previously separate business processes
  – cooperation (or merger) of companies
• Typical requirements:
  – distribution
  – heterogeneity
  – autonomy

A topical issue

• a request from our “users”: “computing facilities should become similar to standard utilities (gas, phone, power, etc.)”
• our usual reply: “computing services have application-specific features for which standard services would only be a limited solution (as is the case for the other utilities)”
• however: what would a standard offer of services be?

The great opportunity

• Internet (and Intranets and Extranets) and the World-Wide-Web offer a great opportunity
• a simplified stack of layers:
  – cooperation (of applications)
  – interoperability (ftp, telnet, mail, http, ...)
  – connectivity
• standardization climbs stacks (functionalities get standardized and go down: think of database systems!)

The Web: a great opportunity

• the diffusion of the Web is ...
• the Web (with its browsers) is becoming a standard interface for the final user
  – the protocol is very simple and public
  – the interface is uniform
  – the content is very rich (in breadth and depth)
• it is becoming a standard interface for accessing many services, with information systems and databases of every type

Evolution of the Web

• Publishing information
• Interactive services
• Cooperative work
• Integration (of sources, services, etc.)
• Embedded systems
• Extranets

Web and DBs: a contradiction?

• databases are well structured and organized
• how much structure and organization is there in the Web?
• it depends, both on the source and on the user
• there are different degrees of granularity and structure for our data
• we need to be able to make conversions (from DB to HT and vice versa)

we need the best of databases and hypertexts!

Database perspectives on the Web

• What is the equivalent of a database (a “source of data”)?
  – a page independently of the others
  – the whole Web
  – a site

Database approaches

• bottom-up: accessing information from Web sources
• top-down: designing and maintaining Web sites
• global: integrating existing sites and offering the information through new ones

Integration on the Web

• the Web is a simple and powerful integration tool; it allows the natural implementation of a (data-centred) cooperative approach
• various approaches:
  – coarse integration: pages of hypertextual links
  – fine-grain integration: unified interfaces for accessing different (usually similar) information systems available on the Web

Problems

• Databases can be queried in a flexible way; hypertexts are easy to access, but cannot be “queried”
• Web sites are often difficult to explore, use and monitor
• Web sites are difficult to design and maintain

Database approaches

• bottom-up: accessing information from Web sources, and integrating them
• top-down: designing and maintaining Web sites
• global: integrating existing sites and offering the information through new ones

Web-based information systems: a database point of view

• Data-Intensive Web Sites:
  – large amount of data
  – significance of the hypertext structure

• Databases and information systems over the Web: a great opportunity
▶ The design of Web-based information systems (WBIS)
• Models for the design of WBIS
• Conceptual and logical design of WBIS
• Conclusions

Problems with many Web-sites (design)

• information is often poorly organized and difficult to access
• it is not even clear which pieces of information are available
• the access structure is haphazard and many dangling references occur
• the style of presentation is heterogeneous

Problems with many Web-sites (maintenance)

• difficulties in updating the content
• difficulties in changing the initially defined structure
• difficulties in changing the presentation details

Web-based information systems

• What we have:
  – DBMSs for the management of data
  – various tools for the generation of Web pages
• What we advocate:
  – a systematic approach to Web site design: models, steps, guidelines
  – tools to support the development process

Hypertext data-independence

• Data: “what information is offered through the site and what are the conceptual details and the logical organization”
• Hypertext: “how data is arranged in pages and what navigation links correlate them”
• Presentation: “the appearance of each piece of information in pages”
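
The three levels can be pictured with a minimal Python sketch in which each level lives in its own value or function, so that one can change without touching the others. Everything here is a hypothetical illustration: the sample rows, the file names `prof_0.html`, and the functions `professor_list_page` and `render` are assumptions, not part of the Araneus tools.

```python
from html import escape

# Data level: what information is offered (hypothetical sample content).
professors = [
    {"name": "Ada Rossi", "position": "Full professor"},
    {"name": "Luca Bianchi", "position": "Associate professor"},
]

# Hypertext level: how data is arranged in pages and which links connect them.
def professor_list_page(rows):
    """Arrange the data into one page holding a list of (anchor, URL) links."""
    return {
        "title": "Professors",
        "links": [(r["name"], f"prof_{i}.html") for i, r in enumerate(rows)],
    }

# Presentation level: the appearance of each piece of information.
def render(page):
    """Turn a hypertext-level page into HTML; only this function knows about tags."""
    items = "".join(
        f'<li><a href="{escape(url)}">{escape(anchor)}</a></li>'
        for anchor, url in page["links"]
    )
    return f"<h1>{escape(page['title'])}</h1><ul>{items}</ul>"

html_text = render(professor_list_page(professors))
```

Changing the layout means editing only `render`; changing the navigation means editing only `professor_list_page`; changing the content means editing only `professors`.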

Design Issues

• Data: choosing the content
• Hypertext: choosing navigation paths
• Presentation: defining layout and graphics

Maintenance Issues

• Data: changing the content
• Hypertext: changing navigation paths
• Presentation: changing layout and graphics

• Databases and information systems over the Web: a great opportunity
• The design of Web-based information systems (WBIS)
▶ Models for the design of WBIS
• Conceptual and logical design of WBIS
• Conclusions

Components and Models

• data: ER and Relational
• hypertext: (none)
• presentation: HTML

What is missing is a model for hypertexts!

Models for hypertexts

• in data-intensive Web sites (and often in general) there are (many) pages with a similar (or even the same) structure
• thirty or forty years ago people realized that applications often contain records with the same structure; files with a rather fixed structure were invented for this purpose
• the notion of scheme of the database was later introduced as an overall description of the content of a database

A Web page

A page-scheme: ProfessorPage

ProfessorPage
  Name
  Position
  Address
  EMail
  ResearchList (list of: Area, ToResP)

ADM (Araneus Data Model): a logical model for Web hypertexts

• page-schemes
• “unique” pages
• simple attributes
  – text, images, ...
  – link (anchor, URL)
• complex attributes: lists (possibly nested)
• heterogeneous union
• form (as virtual list over form fields and link to
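
As a rough illustration of these constructs, the following Python sketch encodes page-schemes, simple attributes, links and nested lists as dataclasses. It is only one possible encoding, not the ADM definition: the class names (`Text`, `Link`, `ListAttr`, `PageScheme`), the helper `link_targets`, and the target name `ResearchPage` are assumptions introduced for the example; heterogeneous union and forms are left out.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Text:
    name: str

@dataclass
class Link:
    name: str
    target: str  # name of the page-scheme the link points to

@dataclass
class ListAttr:
    name: str
    items: List["Attribute"] = field(default_factory=list)  # lists may nest

Attribute = Union[Text, Link, ListAttr]

@dataclass
class PageScheme:
    name: str
    attributes: List[Attribute]
    unique: bool = False  # True for a "unique" page (a single instance)

def link_targets(attrs):
    """Collect the targets of all links, descending into nested lists."""
    out = []
    for a in attrs:
        if isinstance(a, Link):
            out.append(a.target)
        elif isinstance(a, ListAttr):
            out.extend(link_targets(a.items))
    return out

# "ResearchPage" is an assumed target name; the slides only name the link ToResP.
professor_page = PageScheme("ProfessorPage", [
    Text("Name"), Text("Position"), Text("Address"), Text("EMail"),
    ListAttr("ResearchList", [Text("Area"), Link("ToResP", "ResearchPage")]),
])
professor_list_page = PageScheme("ProfessorListPage", [
    ListAttr("ProfessorList", [Text("Name"), Link("ToProfP", "ProfessorPage")]),
], unique=True)
```

With such a representation, `link_targets` gives the outgoing edges of each page-scheme, which is enough to draw the scheme as a graph or to check for links to page-schemes that do not exist.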

A Web page (containing a list of links)

A “unique” page-scheme: ProfessorListPage

ProfessorListPage
  ProfessorList (list of: Name, ToProfP)

An ADM Scheme

ProfessorListPage
  ProfessorList (list of: Name, ToProfP)

ProfessorPage
  Name, Position, Address, EMail
  ResearchList (list of: Area, ToResP)

Heterogeneous Union and Forms

Heterogeneous Union and Forms in ADM

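[Figure: ADM scheme with the page-schemes ProfessorListPage (ProfessorList: Name, ToProfP), ProfessorPage (Name, Position, Address, EMail, ResearchList: Area, ToResP) and SearchProfPage (form: Name, Submit), together with a heterogeneous union (U)]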

Data Models

• ER: Database Conceptual Scheme (entities, relationships)
• ADM: Hypertext Logical Scheme (page-schemes, links)

There is a lot of ‘distance’ between the two!

A simple ER scheme

An ADM scheme

Data Models

• ER: Database Conceptual Scheme (entities, relationships)
• NCM: Hypertext Conceptual Scheme (macroentities, directed relationships, aggregations)
• ADM: Hypertext Logical Scheme (page-schemes, links)

NCM fills the gap between the two

Navigation Conceptual Model (NCM)

Hypertext Conceptual Features

• Which concepts should be the hypertext nodes
• Which should be the navigation paths between nodes
• How nodes should be aggregated to build the hierarchical access structure

NCM Constructs

• Macroentity
• Directed Relationship
• Aggregation

NCM: Macroentities and directed relationships

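[Figure: NCM scheme with the macroentities Professor, Course and Student, the directed relationships Teacher and Tutorship (cardinalities 1:1 and 1:N), and attributes including Name, Room, Email, Description and Lesson (Day, Hour, Room)]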

NCM: aggregation nodes

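[Figure: NCM scheme with the macroentities Professor, Course, Student and Seminar, the directed relationships Teacher and Tutorship (cardinalities 1:1 and 1:N), and the aggregation nodes Department, People and Activities]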

An NCM scheme

• Databases and information systems over the Web: a great opportunity
• The design of Web-based information systems (WBIS)
• Models for the design of WBIS
▶ Conceptual and logical design of WBIS
• Conclusions

The Araneus Methodology

[Figure: phases of the methodology: Requirements analysis; Database conceptual design; Database logical design; Hypertext conceptual design; Hypertext logical design; Presentation design; Page generation / Site generation]

design from scratch

[Figure: the Araneus methodology diagram for this case]

design from an existing database (with an ER scheme)

[Figure: the Araneus methodology diagram for this case]

design from an existing database (without an ER scheme)

[Figure: the Araneus methodology diagram for this case, where Database conceptual design is carried out by reverse engineering]

Hypertext conceptual design: from ER to NCM

[Figure: the Araneus methodology diagram]

Hypertext Conceptual Design

ER scheme → NCM scheme

• step 1: choose and describe macroentities: design “views” over the input ER scheme
• step 2: choose navigation paths
• step 3: shape the hypertext access structure on the basis of (“bottom-up”) conceptual aggregation

Hypertext Conceptual Design

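ER scheme → NCM scheme

• step 1: choose and describe macroentities: design “views” over the input ER scheme
  + this usually corresponds to “de-normalizing” the input ER scheme

[Figure: in the ER scheme, the entities Course (Name, Description) and Lesson (Day, Hour) are connected by a relationship; in the NCM scheme, a single macroentity Course (Name, Description) contains Lesson (Day, Hour) as a multivalued (1:N) attribute]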
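
The de-normalization in step 1 can be sketched in Python as a nested “view” over normalized rows. The sample rows and the function name `course_macroentities` are hypothetical; the sketch only illustrates the idea of folding the Lesson entity into the Course macroentity.

```python
from collections import defaultdict

# ER level (normalized): two entities and a 1:N relationship (hypothetical rows).
courses = [{"name": "Databases", "description": "Relational model and SQL"}]
lessons = [
    {"course": "Databases", "day": "Mon", "hour": "10:00"},
    {"course": "Databases", "day": "Wed", "hour": "14:00"},
]

def course_macroentities(courses, lessons):
    """Build a nested view: each course carries its own list of lessons."""
    by_course = defaultdict(list)
    for lesson in lessons:
        by_course[lesson["course"]].append(
            {"day": lesson["day"], "hour": lesson["hour"]}
        )
    # Courses without lessons get an empty list.
    return [{**c, "lessons": by_course[c["name"]]} for c in courses]

view = course_macroentities(courses, lessons)
```

Each element of `view` now holds everything a single hypertext node about a course would need, which is the sense in which a macroentity is a view over the ER scheme.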

Hypertext Conceptual Design

ER scheme → NCM scheme

• step 2: choose navigation paths
  + it may introduce redundancies

[Figure: the ER scheme and the corresponding NCM scheme over Professor, Research-Group and Paper, with 1:1 and 1:N cardinalities]

Hypertext Conceptual Design

ER scheme → NCM scheme

• step 3: shape the hypertext access structure
  + it is based on “bottom-up” conceptual aggregations

[Figure: two NCM schemes over Professor, Research-Group and Seminar (cardinalities 1:1 and 1:N); the second one adds the aggregation Research Activities]

The Input ER scheme

The resulting NCM scheme

Hypertext logical design: from NCM to ADM

[Figure: the Araneus methodology diagram]

Hypertext Logical Design

NCM scheme → ADM scheme

• step 1: map each macroentity into either
  – a page-scheme, or
  – a list inside a page-scheme
• step 2: map each directed relationship into a (list of) link attribute(s)
• step 3: map each aggregation into a unique page-scheme with link attributes to the target page-schemes
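
The three mapping steps can be sketched over plain Python dictionaries. This is a simplified illustration under stated assumptions, not the Araneus algorithm: the function `ncm_to_adm`, the naming conventions (`...Page`, `To...`, `...List`), and the sample scheme (including the direction and cardinality of Teacher) are hypothetical, and step 1 always chooses the page-scheme alternative.

```python
def ncm_to_adm(macroentities, relationships, aggregations):
    """Apply the three mapping steps to a tiny NCM scheme held in plain dicts."""
    pages = {}
    # Step 1: each macroentity becomes a page-scheme (the "list inside a
    # page-scheme" alternative is not covered by this sketch).
    for name, attrs in macroentities.items():
        pages[name + "Page"] = {"unique": False, "attributes": list(attrs)}
    # Step 2: each directed relationship becomes a link attribute, or a list
    # of links when the relationship is 1:N.
    for rel, (source, target, card) in relationships.items():
        link = {"link": "To" + target, "target": target + "Page"}
        attr = {"list": rel + "List", "items": [link]} if card == "1:N" else link
        pages[source + "Page"]["attributes"].append(attr)
    # Step 3: each aggregation becomes a unique page-scheme with link
    # attributes to the target page-schemes.
    for name, members in aggregations.items():
        pages[name + "Page"] = {
            "unique": True,
            "attributes": [{"link": "To" + m, "target": m + "Page"} for m in members],
        }
    return pages

scheme = ncm_to_adm(
    macroentities={"Professor": ["Name", "Email"], "Course": ["Name", "Description"]},
    relationships={"Teacher": ("Professor", "Course", "1:N")},  # assumed direction
    aggregations={"Department": ["Professor", "Course"]},
)
```

In practice the choice in step 1 between a page-scheme and a list inside a page-scheme is a design decision, so a real tool would take it as input rather than fix it as this sketch does.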

Hypertext Logical Design: Step 1 (example)

Hypertext Logical Design: Step 2 (example)

Hypertext Logical Design: Step 3 (example)

Resulting ADM Scheme

Maintenance

• The schemes help designers to maintain the hypertext structure
• Maintenance activities correspond to applying scheme transformations:
  – introduce multilevel lists
  – introduce forms
  – split pages
  – ...
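
As an illustration of one such transformation, the sketch below implements a simple “split pages” step on a scheme stored as a dictionary from page-scheme names to attribute lists. The function `split_page`, the tuple encoding of links, and the name `ResearchListPage` are assumptions made for the example only.

```python
def split_page(scheme, page, moved, new_page):
    """Move the attributes in `moved` from `page` to `new_page` and link the two.

    Returns a new scheme; the input scheme is left unchanged.
    """
    old = scheme[page]
    result = dict(scheme)
    # The original page keeps the remaining attributes plus a link to the new page.
    result[page] = [a for a in old if a not in moved] + [
        ("link", "To" + new_page, new_page)
    ]
    result[new_page] = [a for a in old if a in moved]
    return result

scheme = {"ProfessorPage": ["Name", "Position", "Address", "EMail", "ResearchList"]}
new_scheme = split_page(scheme, "ProfessorPage", {"ResearchList"}, "ResearchListPage")
```

Because the transformation works on the scheme rather than on individual HTML files, the same change applies uniformly to every page that is an instance of the page-scheme.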

Maintenance: example

• Databases and information systems over the Web: a great opportunity
• The design of Web-based information systems (WBIS)
• Models for the design of WBIS
• Conceptual and logical design of WBIS
▶ Conclusions

Conclusions

• Models and schemes
  – are essential in the design and documentation of Web sites
  – can help in the generation of Web sites
  – can also be useful to support querying, extraction, and integration

DBLP Site at Trier

http://dblp.uni-trier.de

DBLP Site at Trier: ADM Scheme

Queries over Web Sites: Wrappers

• The need for wrappers
• Pages are often logically homogeneous

[Figure: a wrapper maps HTML pages from the Internet to ADM objects, e.g. Name : TEXT with value E.F. Codd]

Queries over Web Sites: Reverse Engineering

• Building a database representation of a site is a reverse engineering process
• First step: deriving the logical structure of data in the site
• Second step: wrapping pages in order to map physical HTML sources to database objects
• Both processes should be automated
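
The second step, wrapping, can be illustrated with a minimal hand-written wrapper built on Python's standard `html.parser` module. It is a sketch under strong assumptions: the page layout (name in `<h1>`, links in `<a>` tags), the class `ProfessorWrapper`, the function `wrap`, and the sample page are all hypothetical, and real pages would need a far more robust wrapper.

```python
from html.parser import HTMLParser

class ProfessorWrapper(HTMLParser):
    """Extract Name (text of <h1>) and a list of (anchor, URL) links from a page."""

    def __init__(self):
        super().__init__()
        self.record = {"Name": "", "Links": []}
        self._in_h1 = False
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
        elif tag == "a":
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False
        elif tag == "a":
            self._href = None

    def handle_data(self, data):
        if self._in_h1:
            self.record["Name"] += data.strip()
        elif self._href is not None:
            self.record["Links"].append((data.strip(), self._href))

def wrap(html_text):
    """Map a physical HTML source to a database-like object (a dict)."""
    parser = ProfessorWrapper()
    parser.feed(html_text)
    parser.close()
    return parser.record

# Hypothetical page; the layout is an assumption of this sketch.
page = '<h1>E.F. Codd</h1><ul><li><a href="rel.html">Relational model</a></li></ul>'
record = wrap(page)
```

Since pages that are instances of the same page-scheme share their structure, one wrapper of this kind can in principle serve all of them.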

Queries over Web Sites: Query Interfaces: Ulixes

Example of an SQL-like query: "All papers by Codd in the VLDB Conference"

DEFINE TABLE VLDBPapersByCodd(Title, Year)
AS    AuthorSearchPage.NameForm.Submit -> AuthorPage.WorkList
IN    DBLPScheme
USING AuthorPage.WorkList.Title, AuthorPage.WorkList.Year
WHERE AuthorSearchPage.NameForm.Name = 'E. F. Codd'
AND   AuthorPage.WorkList.Reference LIKE '%VLDB%'
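
To make the meaning of the query above more concrete, here is a rough Python analogue that follows the form link, then selects and projects over the work list. It does not reproduce how Ulixes evaluates queries: the function `vldb_papers_by`, the in-memory `author_pages` dictionary, and its rows are made-up placeholders, not real bibliographic data.

```python
# Hypothetical wrapped AuthorPage objects, keyed by the value typed in the form.
# The rows are placeholders invented for this sketch.
author_pages = {
    "E. F. Codd": {"WorkList": [
        {"Title": "Placeholder title 1", "Year": 1975, "Reference": "VLDB 1975"},
        {"Title": "Placeholder title 2", "Year": 1970, "Reference": "CACM 13(6)"},
    ]},
}

def vldb_papers_by(name, submit_form):
    """Follow the form link, then select and project over the WorkList."""
    # AuthorSearchPage.NameForm.Submit -> AuthorPage
    author_page = submit_form(name)
    return [
        (work["Title"], work["Year"])      # USING ... Title, Year
        for work in author_page["WorkList"]
        if "VLDB" in work["Reference"]     # Reference LIKE '%VLDB%'
    ]

rows = vldb_papers_by("E. F. Codd", lambda n: author_pages[n])
```

The navigation step is passed in as a function so that the same selection logic could run against live pages fetched and wrapped on demand.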

Integration of Web Sites

• Data-Centered Cooperative Applications on the Web:
  – Extraction of data from existing sites
  – Correlation
  – Generation of new sites
• Dealing with Heterogeneities:
  – Schematic heterogeneities
  – Semantic heterogeneities

Integration of Web Sites in Araneus

Integration of Web Sites: The Integrated Web Museum

• Integrates data coming from several Virtual Museums on the Web (Uffizi, Louvre and Capodimonte)
• Data are re-organized:
  – Uffizi: paintings organized by rooms
  – Louvre, Capodimonte: works organized by collections
  – Integrated Museum: data organized by author
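
The re-organization by author can be sketched as a regrouping of records extracted from the individual sites. The records, field names and the function `integrate_by_author` are hypothetical; the sketch ignores the schematic and semantic heterogeneities that a real integration would have to resolve (for example, different spellings of an author's name).

```python
from collections import defaultdict

# Hypothetical records, as wrappers might deliver them from each site.
uffizi = [{"room": "Room 10", "title": "Work A", "author": "Author X"}]
louvre = [
    {"collection": "Paintings", "title": "Work B", "author": "Author X"},
    {"collection": "Paintings", "title": "Work C", "author": "Author Y"},
]

def integrate_by_author(**sources):
    """Merge works from all sources and regroup them by author."""
    by_author = defaultdict(list)
    for museum, works in sources.items():
        for work in works:
            by_author[work["author"]].append(
                {"title": work["title"], "museum": museum}
            )
    return dict(by_author)

catalogue = integrate_by_author(Uffizi=uffizi, Louvre=louvre)
```

The source-specific organization (rooms, collections) is dropped here; the integrated site keeps only what is common to all sources plus the museum of origin.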

Bibliography

• Will be available soon (together with the presentation) on my Web site: http://www.dia.uniroma3.it/~atzeni

Acknowledgements

• The Araneus project at Roma Tre:
  – Gianni Mecca
  – Paolo Merialdo
  – Alessandro Masci
  – Valter Crescenzi
  – Giuseppe Sindoni
  – Marco Magnante
