• Non ci sono risultati.

Dati semistrutturati in XML

N/A
N/A
Protected

Academic year: 2021

Condividi "Dati semistrutturati in XML"

Copied!
58
0
0

Testo completo

(1)

Dati semistrutturati in XML

Massimo Franceschet

m.franceschet@unich.it www.sci.unich.it/francesc

Universit`a “Gabriele D’Annunzio” di Chieti e Pescara

Dati semistrutturati in XML – p.1/30

(2)

What XML is

XML is eXtensible Markup Language.

? XML is a formal language. This means it is defined by a set of formal rules (a grammar) that say exactly how to compose an XML document.

? XML is markup. Data is included in XML document as strings of text and is surrounded by text markup that describes the data.

? XML is extensible. The language allows an extensible set of markup tags that can be adapted to meet many different needs.

An XML document is well-formed if it satisfies the XML grammar.

(3)

What XML is

XML is eXtensible Markup Language.

? XML is a formal language. This means it is defined by a set of formal rules (a grammar) that say exactly how to compose an XML document.

? XML is markup. Data is included in XML document as strings of text and is surrounded by text markup that describes the data.

? XML is extensible. The language allows an extensible set of markup tags that can be adapted to meet many different needs.

An XML document is well-formed if it satisfies the XML grammar.

Dati semistrutturati in XML – p.2/30

(4)

What XML is

XML is eXtensible Markup Language.

? XML is a formal language. This means it is defined by a set of formal rules (a grammar) that say exactly how to compose an XML document.

? XML is markup. Data is included in XML document as strings of text and is surrounded by text markup that describes the data.

? XML is extensible. The language allows an extensible set of markup tags that can be adapted to meet many different needs.

An XML document is well-formed if it satisfies the XML grammar.

(5)

What XML is

XML is eXtensible Markup Language.

? XML is a formal language. This means it is defined by a set of formal rules (a grammar) that say exactly how to compose an XML document.

? XML is markup. Data is included in XML document as strings of text and is surrounded by text markup that describes the data.

? XML is extensible. The language allows an extensible set of markup tags that can be adapted to meet many different needs.

An XML document is well-formed if it satisfies the XML grammar.

Dati semistrutturati in XML – p.2/30

(6)

What XML is

? The markup permitted in a particular XML application can be documented in a schema.

? The most broadly supported schema language is Document Type Definition (DTD).

? An XML document is said valid if it matches the schema.

(7)

What XML is

? The markup permitted in a particular XML application can be documented in a schema.

? The most broadly supported schema language is Document Type Definition (DTD).

? An XML document is said valid if it matches the schema.

Dati semistrutturati in XML – p.3/30

(8)

What XML is

? The markup permitted in a particular XML application can be documented in a schema.

? The most broadly supported schema language is Document Type Definition (DTD).

? An XML document is said valid if it matches the schema.

(9)

XML for people

Common scenarios in which XML can be used by people include:

? Writing a book using DocBook. DocBook is

nonproprietary, portable, modular, and easy to use with any text editor and you may format the final version

according to your needs.

? Write a web page in XHTML. XHTML has a

well-defined syntax, you can work with any XML tool

and web search engines eventually will understand your document and properly index it.

Dati semistrutturati in XML – p.4/30

(10)

XML for people

Common scenarios in which XML can be used by people include:

? Writing a book using DocBook. DocBook is

nonproprietary, portable, modular, and easy to use with any text editor and you may format the final version

according to your needs.

? Write a web page in XHTML. XHTML has a

well-defined syntax, you can work with any XML tool

and web search engines eventually will understand your document and properly index it.

(11)

XML for machines

Common scenarios in which XML can be used by machines include:

? Data exchange. Information comes in different sources (relations, objects, documents, ...) and it needs to be exchanged between these sources. XML acts as the common dataspeak.

? Semistructured databases. These data has no regular schema and does not naturally fit into relational

databases. XML has been proposed as the data model for semistructured data.

Dati semistrutturati in XML – p.5/30

(12)

XML for machines

Common scenarios in which XML can be used by machines include:

? Data exchange. Information comes in different sources (relations, objects, documents, ...) and it needs to be exchanged between these sources. XML acts as the common dataspeak.

? Semistructured databases. These data has no regular schema and does not naturally fit into relational

databases. XML has been proposed as the data model for semistructured data.

(13)

What XML is not

? XML is not a presentation language like HTML.

XML defines the structure of the document and the

semantics (meaning) of the data, but it doesn’t tell how the data should look.

? XML is not a programming language like Java.

An XML document by itself simply is. It does not do anything.

? XML is not a network transport protocol like HTTP.

XML won’t send data across the network.

? XML is not a database management system like Oracle.

XML does not store and retrieve data.

Dati semistrutturati in XML – p.6/30

(14)

What XML is not

? XML is not a presentation language like HTML.

XML defines the structure of the document and the

semantics (meaning) of the data, but it doesn’t tell how the data should look.

? XML is not a programming language like Java.

An XML document by itself simply is. It does not do anything.

? XML is not a network transport protocol like HTTP.

XML won’t send data across the network.

? XML is not a database management system like Oracle.

XML does not store and retrieve data.

(15)

What XML is not

? XML is not a presentation language like HTML.

XML defines the structure of the document and the

semantics (meaning) of the data, but it doesn’t tell how the data should look.

? XML is not a programming language like Java.

An XML document by itself simply is. It does not do anything.

? XML is not a network transport protocol like HTTP.

XML won’t send data across the network.

? XML is not a database management system like Oracle.

XML does not store and retrieve data.

Dati semistrutturati in XML – p.6/30

(16)

What XML is not

? XML is not a presentation language like HTML.

XML defines the structure of the document and the

semantics (meaning) of the data, but it doesn’t tell how the data should look.

? XML is not a programming language like Java.

An XML document by itself simply is. It does not do anything.

? XML is not a network transport protocol like HTTP.

XML won’t send data across the network.

? XML is not a database management system like Oracle.

XML does not store and retrieve data.

(17)

Example 1

1. Read the XML document people.xml with any browser;

2. watch the tree data model in people.ps;

3. check whether people.xml is well-formed by loading it with any browser;

4. read the DTD in people.dtd with any text editor;

5. check whether people.xml is valid by using STG XML Validation Form.

Dati semistrutturati in XML – p.7/30

(18)

Example 2

1. Read the context description in biblio.html; 2. read the XML document biblio.xml;

3. watch the tree data model in biblio.ps; 4. read the DTD in biblio.dtd.

(19)

XML query languages

? A collection of related XML documents is called an XML database.

? The different data model of XML databases (trees) with respect to that of relational databases (tables) call for different query languages.

The most popular XML query languages are:

? XML Path Language (XPath). It is a language to retrieve elements from a single XML document.

? XML Query Language (XQuery). It is a full query language for XML databases.

Dati semistrutturati in XML – p.9/30

(20)

XML query languages

? A collection of related XML documents is called an XML database.

? The different data model of XML databases (trees) with respect to that of relational databases (tables) call for different query languages.

The most popular XML query languages are:

? XML Path Language (XPath). It is a language to retrieve elements from a single XML document.

? XML Query Language (XQuery). It is a full query language for XML databases.

(21)

XML query languages

? A collection of related XML documents is called an XML database.

? The different data model of XML databases (trees) with respect to that of relational databases (tables) call for different query languages.

The most popular XML query languages are:

? XML Path Language (XPath). It is a language to retrieve elements from a single XML document.

? XML Query Language (XQuery). It is a full query language for XML databases.

Dati semistrutturati in XML – p.9/30

(22)

XML query languages

? A collection of related XML documents is called an XML database.

? The different data model of XML databases (trees) with respect to that of relational databases (tables) call for different query languages.

The most popular XML query languages are:

? XML Path Language (XPath). It is a language to retrieve elements from a single XML document.

? XML Query Language (XQuery). It is a full query language for XML databases.

(23)

The structure of an XPath query

? An XPath query is a path, that is a sequence of steps separated by the slash sign:

/step1/step2/ . . . /stepk

? each step has the form:

axis :: test[filter]

? axis indicates how to navigate the XML tree;

? test filters the result according to the nodes’ type;

? filter is an optional Boolean path condition to further restrict the result.

Dati semistrutturati in XML – p.10/30

(24)

The structure of an XPath query

? An XPath query is a path, that is a sequence of steps separated by the slash sign:

/step1/step2/ . . . /stepk

? each step has the form:

axis :: test[filter]

? axis indicates how to navigate the XML tree;

? test filters the result according to the nodes’ type;

? filter is an optional Boolean path condition to further restrict the result.

(25)

The structure of an XPath query

? An XPath query is a path, that is a sequence of steps separated by the slash sign:

/step1/step2/ . . . /stepk

? each step has the form:

axis :: test[filter]

? axis indicates how to navigate the XML tree;

? test filters the result according to the nodes’ type;

? filter is an optional Boolean path condition to further restrict the result.

Dati semistrutturati in XML – p.10/30

(26)

The structure of an XPath query

? An XPath query is a path, that is a sequence of steps separated by the slash sign:

/step1/step2/ . . . /stepk

? each step has the form:

axis :: test[filter]

? axis indicates how to navigate the XML tree;

? test filters the result according to the nodes’ type;

? filter is an optional Boolean path condition to further restrict the result.

(27)

The structure of an XPath query

? An XPath query is a path, that is a sequence of steps separated by the slash sign:

/step1/step2/ . . . /stepk

? each step has the form:

axis :: test[filter]

? axis indicates how to navigate the XML tree;

? test filters the result according to the nodes’ type;

? filter is an optional Boolean path condition to further restrict the result.

Dati semistrutturati in XML – p.10/30

(28)

Learning the English alphabet...

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

(29)

/descendant::L/child::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

Dati semistrutturati in XML – p.12/30

(30)

/descendant::L/descendant::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

(31)

/descendant::L/parent::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

Dati semistrutturati in XML – p.14/30

(32)

/descendant::L/ancestor::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

(33)

/descendant::L/following-sibling::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

Dati semistrutturati in XML – p.16/30

(34)

/descendant::L/preceding-sibling::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

(35)

/descendant::L/following::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

Dati semistrutturati in XML – p.18/30

(36)

/descendant::L/preceding::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

(37)

/descendant::L/self::*

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

Dati semistrutturati in XML – p.20/30

(38)

/descendant::*[child::*]

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

(39)

/descendant::*[child::* and following-sibling::*]

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

Dati semistrutturati in XML – p.22/30

(40)

/descendant::*[not(child::*) or self::A]

A

B E X

C D F I L R U

G H J K M N Q

O P

S T V W

Y Z

(41)

Full XPath

Moreover, XPath offers:

? the use of node tests different form a tag name and *, for instance comment() and text();

? the use of comparison operators (like =, >, <) in filters;

? the use of functions (like contains(), position(), count(), id()) in filters.

Dati semistrutturati in XML – p.24/30

(42)

Full XPath

Moreover, XPath offers:

? the use of node tests different form a tag name and *, for instance comment() and text();

? the use of comparison operators (like =, >, <) in filters;

? the use of functions (like contains(), position(), count(), id()) in filters.

(43)

Full XPath

Moreover, XPath offers:

? the use of node tests different form a tag name and *, for instance comment() and text();

? the use of comparison operators (like =, >, <) in filters;

? the use of functions (like contains(), position(), count(), id()) in filters.

Dati semistrutturati in XML – p.24/30

(44)

Example

1. Read the XPath queries contained in q1.xp, q2.xp, q3.xp;

2. run them against biblio.xml by using Saxon

(45)

XQuery

? The XML query language (XQuery) is the counterpart of SQL for XML databases;

? XQuery inputs, processes, and outputs sequences (not sets of nodes like XPath);

? each item of a sequence is either an XML element or an atomic value (like a string or a number);

? XPath queries are used in XQuery. Their results are converted into sorted sequences according to the document order.

Dati semistrutturati in XML – p.26/30

(46)

XQuery

? The XML query language (XQuery) is the counterpart of SQL for XML databases;

? XQuery inputs, processes, and outputs sequences (not sets of nodes like XPath);

? each item of a sequence is either an XML element or an atomic value (like a string or a number);

? XPath queries are used in XQuery. Their results are converted into sorted sequences according to the document order.

(47)

XQuery

? The XML query language (XQuery) is the counterpart of SQL for XML databases;

? XQuery inputs, processes, and outputs sequences (not sets of nodes like XPath);

? each item of a sequence is either an XML element or an atomic value (like a string or a number);

? XPath queries are used in XQuery. Their results are converted into sorted sequences according to the document order.

Dati semistrutturati in XML – p.26/30

(48)

XQuery

? The XML query language (XQuery) is the counterpart of SQL for XML databases;

? XQuery inputs, processes, and outputs sequences (not sets of nodes like XPath);

? each item of a sequence is either an XML element or an atomic value (like a string or a number);

? XPath queries are used in XQuery. Their results are converted into sorted sequences according to the document order.

(49)

Flowers on trees

FLWOR expressions are the most common expressions in XQuery. They are similar to select-from-where statements in SQL.

The name FLWOR is an acronym, standing for the first

letter of the clauses that may occur in such an expression:

? For clauses iteratively bind variables to each value of the result of the corresponding expression.

? Let clauses bind variables to the entire result of an corresponding expression.

A sequence of variable bindings created by the for and let clauses of a FLWOR expression is called a tuple.

Dati semistrutturati in XML – p.27/30

(50)

Flowers on trees

FLWOR expressions are the most common expressions in XQuery. They are similar to select-from-where statements in SQL.

The name FLWOR is an acronym, standing for the first

letter of the clauses that may occur in such an expression:

? For clauses iteratively bind variables to each value of the result of the corresponding expression.

? Let clauses bind variables to the entire result of an corresponding expression.

A sequence of variable bindings created by the for and let clauses of a FLWOR expression is called a tuple.

(51)

Flowers on trees

FLWOR expressions are the most common expressions in XQuery. They are similar to select-from-where statements in SQL.

The name FLWOR is an acronym, standing for the first

letter of the clauses that may occur in such an expression:

? For clauses iteratively bind variables to each value of the result of the corresponding expression.

? Let clauses bind variables to the entire result of an corresponding expression.

A sequence of variable bindings created by the for and let clauses of a FLWOR expression is called a tuple.

Dati semistrutturati in XML – p.27/30

(52)

Flowers on trees

FLWOR expressions are the most common expressions in XQuery. They are similar to select-from-where statements in SQL.

The name FLWOR is an acronym, standing for the first

letter of the clauses that may occur in such an expression:

? For clauses iteratively bind variables to each value of the result of the corresponding expression.

? Let clauses bind variables to the entire result of an corresponding expression.

A sequence of variable bindings created by the for and let clauses of a FLWOR expression is called a tuple.

(53)

Flowers on trees

FLWOR expressions are the most common expressions in XQuery. They are similar to select-from-where statements in SQL.

The name FLWOR is an acronym, standing for the first

letter of the clauses that may occur in such an expression:

? For clauses iteratively bind variables to each value of the result of the corresponding expression.

? Let clauses bind variables to the entire result of an corresponding expression.

A sequence of variable bindings created by the for and let clauses of a FLWOR expression is called a tuple.

Dati semistrutturati in XML – p.27/30

(54)

Flowers on trees

? Where clauses filter tuples retaining only those that satisfy a condition;

? Order by clauses sort the tuples;

? Return clauses build the result of the expression.

(55)

Flowers on trees

? Where clauses filter tuples retaining only those that satisfy a condition;

? Order by clauses sort the tuples;

? Return clauses build the result of the expression.

Dati semistrutturati in XML – p.28/30

(56)

Flowers on trees

? Where clauses filter tuples retaining only those that satisfy a condition;

? Order by clauses sort the tuples;

? Return clauses build the result of the expression.

(57)

Example

1. Read the XQuery queries contained in q4.xq, q5.xq, q6.xq;

2. run them against biblio.xml by using Saxon

Dati semistrutturati in XML – p.29/30

(58)

More information

http://www.sci.unich.it/~francesc/xml

Riferimenti

Documenti correlati

The principle of corresponding states in modern form has been applied to the following properties: the critical state, the virial coefficient, the Boyle point,

Однако отрезанное от бессознательного воображение функ- ционализируется, то есть сводится к глобальной направленно- сти на воспроизводство: я представляю себе то,

This article proves a case of the p-adic Birch and Swinnerton–Dyer conjecture for Garrett p-adic L-functions of (Bertolini et al. in On p-adic analogues of the Birch and

Solution proposed by Roberto Tauraso, Dipartimento di Matematica, Universit`a di Roma “Tor Vergata”, via della Ricerca Scientifica, 00133 Roma,

Referring to the scope of this paper, it is important to evaluate how many vehicles could be contemporaneously connected to the grid in the same area. This way, it is possible

planes and traces parallel to the xy-plane were used to sketch the function of two variables as a surface in the 3- dimensional coordinate system. When the traces parallel to

As for the value of Resistance to Uniaxial Compressive Strenght, the survey refers to the results obtained by the compressive press, being it a direct test; the instrument

Usually, this fact is proved by non-constructive techniques, althought a constructive proof, depending on the Principle of Uniform Boundedness can be found in [8]... This argument