SPARQL


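How would you query for the number of results alone?

"SELECT (COUNT(?s) AS ?count)" returns the number of solutions in which ?s is bound, that is, the count of Subjects that meet the query criteria. In SPARQL 1.1 the parentheses around the aggregate expression and its alias are required.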

What should you take into account when using a Triplestore in regard to SPARQL?

1 - Triplestores are available commercially or as open source, so features and cost are factors in selection.
2 - A Triplestore should support the most current official W3C SPARQL standards.
3 - Communication between the SPARQL query and the Triplestore should be easy and nearly seamless from a client and programming perspective.
4 - Whether you publish your data locally or globally, your Triplestore should be able to serve as a SPARQL endpoint.

What are the 3 arguments of the regex() function?

1 - the field you want to search 2 - the regular expression you want to match 3 - an optional flag
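The three arguments can be seen together in a small query. This is only a sketch: the dc:title property and the pattern "^sparql" are illustrative assumptions, not taken from any particular dataset.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?s ?title
WHERE {
  ?s dc:title ?title .
  # argument 1: the value to search (?title)
  # argument 2: the regular expression ("^sparql" = starts with "sparql")
  # argument 3: optional flag ("i" = case-insensitive matching)
  FILTER (regex(?title, "^sparql", "i"))
}
```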

What are some of the result display options available from the DBpedia SPARQL Endpoint?

The results of a SPARQL query can be displayed in the default "Browse" view or in other formats such as JSON, XML, and XML+XSLT.

What is a blank node in RDF and SPARQL?

A blank node, or bnode, is basically a temporary aggregation node: it groups values together. When an RDF processor encounters a bnode, it ignores the name of the bnode.

What is the DBpedia SPARQL endpoint?

A crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data. We hope that this work will make it easier for the huge amount of information in Wikipedia to be used in some new interesting ways. Furthermore, it might inspire new mechanisms for navigating, linking, and improving the encyclopedia itself.

What is Apache Jena?

A free and open source Java framework for building Semantic Web and Linked Data applications. The framework is composed of different APIs interacting together to process RDF data. It comes with a utility called ARQ that provides simple command line tools used to run SPARQL queries against RDF files in many serialization formats.

How do you close a triple pattern?

A full Subject, Predicate, Object triple pattern ends in a period (.)

What is SPARQL?

A general term for both a protocol and a query language. SPARQL is a set of specifications that provide languages and protocols to query and manipulate RDF graph content on the Web or in an RDF store (such as a Triplestore). With respect to the query language, SPARQL is a syntactically-SQL-like language for querying RDF graphs via pattern matching.

How is RDF stored ideally?

A large number of triples is ideally stored in a database system where the triples can be indexed and queried. RDF can be stored in traditional databases, but these are not optimized for triples the way the preferred kind of database system, called a Triplestore, is.

What is BINDING?

A pairing between a SPARQL variable and an RDF term. In practical terms, it's a variable that has had a value assigned.

What is DUBLIN CORE?

A popular standard vocabulary providing a basic set of metadata terms such as title, creator, and date. Many specialized metadata vocabularies are based on Dublin Core.

What is a LOCAL NAME?

A prefixed name without its prefix. For example, in dc:title, the local name is title.

What is LINKED DATA?

A set of best practices for connecting related data on the Web for use by applications. Because these best practices recommend the use of URIs and standardized data formats, RDF does an excellent job at this.

What is a GRAPH PATTERN?

A set of triple patterns between curly braces that specifies the set of triples that a SPARQL processor should retrieve from a dataset.

What is a BNODE?

A subject or object in an RDF graph that has no identity. These are typically used to group together other values. For example, an address book entry may have an email address of "[email protected]", a phone number of 943-234-9664, and an address whose value is a blank node that has its own values: one for a street address, one for a city name, one for a postal code, and so forth. The resource that has these property values is represented by a prefixed name with an underscore prefix (for example, _:xyz) or as a pair of square braces ([]). Tools that serialize triples do not have to save the prefixed name, as long as any new ones maintain all the same connections.

What is a LITERAL?

A value, as opposed to a URI, which is a name for something. A literal may have a datatype or a spoken language tag associated with it, but not both. A simple literal is a literal with no language tag or datatype.

What does the SPARQL 1.1 Overview provide?

An overview of SPARQL 1.1 and an introduction to all SPARQL specifications.

What is the ORDER BY default?

Ascending. DESC() must be used for descending order.

Why are angle brackets necessary in FROM?

Because the data source is considered a URI. If on the local PC, the SPARQL processor will look in the same directory as the query itself to find the file. If the data source was somewhere on the Web the URI could be a web reference, for example: <http://example.org/foaf/aliceFoaf>.
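As a rough illustration, a query that names a Web data source with FROM might look like the following. The graph URI is the one quoted above; the foaf:name pattern is an assumption about what that data contains.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name
# The data source is a URI, so it goes inside angle brackets
FROM <http://example.org/foaf/aliceFoaf>
WHERE {
  ?person foaf:name ?name .
}
```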

Which functions evaluate expressions and return values based on what they find?

Both the IF() and COALESCE() functions

How can you identify the VARIABLE in a Triple Pattern?

By the leading '?' character such as ?ldbookISBN

How does the COALESCE() function work?

Closely related to the IF() function, it is also known as the "try this" function. COALESCE() takes a variable-length argument list, evaluates the arguments from left to right, and returns the first value that does not raise an error (an unbound variable counts as an error).
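A minimal sketch of the "try this" behaviour, assuming a hypothetical ex:Book class whose members may or may not carry an rdfs:label or a dc:title:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX ex:   <http://example.org/terms/>   # hypothetical namespace

SELECT ?book (COALESCE(?label, ?title, "no name available") AS ?name)
WHERE {
  ?book a ex:Book .
  OPTIONAL { ?book rdfs:label ?label . }
  OPTIONAL { ?book dc:title ?title . }
}
```

If ?label is unbound, COALESCE() moves on to ?title; if that is unbound too, it falls back to the string literal.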

What is the difficult part of writing a SPARQL query?

Conditions! The good news is that conditions are expressed as triple patterns, which are similar to RDF triples.

What is INFERENCING?

Deriving additional facts from existing information. In an RDF application, this often means creating new triples based on logic applied to existing ones; RDFS and OWL provide additional possibilities.

How do you represent a NAMED PREFIX in a Subject, Object or Predicate?

As a prefixed name, for example books:isbn. Using a named prefix can help make SPARQL queries shorter and easier to read, but it requires declaring the Namespace with a PREFIX declaration.

What is the difference between IRIs and URIs?

IRIs are considered an expansion of the URI in that they can contain characters from more languages (e.g. Chinese or Cyrillic) than URIs. SPARQL query language documentation will often refer to IRIs (rather than URIs) when talking about naming resources and when talking about functions that work with resource names.

What is ENTAILMENT?

If A entails B, and A is true, then we know that B is true. If A is a complicated set of facts, it can be very handy to have technology such as an RDFS- or OWL-aware SPARQL processor to help you discover whether B is true.

What is a GRAPH?

In RDF, a set of triples. While it is not unusual to feed triples to a utility that creates a graphical representation of them, the term comes from the computer science sense of the term as a data structure that is like a tree structure but lets any node connect to any other instead of being hierarchical.

What is an IRI?

Internationalized Resource Identifier: a URI that allows a wider choice of characters, making it "internationalized."

How does the BIND() keyword work in IF() expressions?

It allows a value to be assigned to a variable in one line of code and is useful when performing calculations, for example, that result in a new value that you might want to store and use.
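For instance, a calculated value can be stored in a new variable like this. The ex:price property and the 1.2 multiplier are purely illustrative assumptions.

```sparql
PREFIX ex: <http://example.org/shop/>   # hypothetical namespace

SELECT ?item ?price ?priceWithTax
WHERE {
  ?item ex:price ?price .
  # Assign the result of the calculation to a new variable
  BIND (?price * 1.2 AS ?priceWithTax)
}
```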

What does the UNION keyword do?

It can be used to write multiple alternative graph patterns and combine their results into a single result set. A solution is returned if it matches any one of the alternatives.
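A small sketch of two alternatives combined with UNION, assuming that some resources are named with dc:title and others with rdfs:label:

```sparql
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s ?name
WHERE {
  { ?s dc:title ?name . }
  UNION
  { ?s rdfs:label ?name . }
}
```

Each alternative sits in its own pair of curly braces, and both bind the same variable (?name) so the results line up in one column.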

What is a SPARQL query TRIPLE PATTERN?

It describes the conditions that have to be met in order for data to be selected and returned -- in other words, it defines the data I want to retrieve.

What does the SPARQL 1.1 Graph Store HTTP Protocol provide?

It extends the SPARQL Protocol for RDF with an API for managing graphs directly (together the RDF and Graph Store protocols allow users to manage graph data and RDF data)

What are some of the features of the SPARQL language?

It includes basic conjunctive patterns, value filters, optional patterns, and pattern disjunction

What does the DISTINCT keyword provide?

It indicates to a SPARQL processor that it should not display duplicate results ("show me only DISTINCT or unique results"). The DISTINCT keyword is placed directly after the SELECT keyword in the SPARQL query.
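As an example of the placement, the following sketch lists each creator only once, however many works they are attached to (dc:creator is an assumed property here):

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

# DISTINCT comes directly after SELECT
SELECT DISTINCT ?creator
WHERE {
  ?work dc:creator ?creator .
}
```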

Why is it preferable to represent key words such as PREFIX, SELECT and WHERE in upper case?

It is a coding convention to make the keywords easily distinguishable from other parts of the SPARQL query.

What are the characteristics of DBpedia's ontology?

It is a shallow, cross-domain ontology, which has been manually created based on the most commonly used infoboxes within Wikipedia. The ontology currently covers 529 classes which form a subsumption hierarchy and are described by 2,333 different properties.

What is important to remember about the stability of RDF data on DBpedia?

It is frequently refined and updated. This means that things like Property/Predicate IRIs can change and Object values can also change.

Can RDF be stored locally?

It is possible to store RDF in text files in any of the RDF serialization formats. One could create a Turtle file full of triples and store it on disk as a text file with a .ttl extension. Storing a large number of triples in a text file, however, is not efficient for either management or access.

Why has the declaration of the ":" PREFIX been included among the other common namespaces on the DBpedia SPARQL Endpoint?

It looks a bit odd, but it's an interesting convention in RDF. Namespaces that are used frequently can be given a Namespace Prefix of ":" in order to make the RDF less messy and easier to read. In this case, notice that the ":" prefix has been declared for http://dbpedia.org/resource/. This means that we can simply prefix our resource name (Medical_Subject_Headings) with a ":" and the DBpedia endpoint will recognize this (:Medical_Subject_Headings) as a resource URI.

What does the web component HTML offer the web?

It provides a way to represent the structure of a document, albeit an insufficient one for the Semantic Web which requires a data model with more structure

What do the SPARQL 1.1 Query Results CSV and TSV Formats provide?

It specifies a CSV (comma separated values format) and TSV (tab separated values format) for query processors to use when returning results

What does the SPARQL 1.1 Query Results JSON Format provide?

It specifies a JSON format for query processors to use when returning results

What does the SPARQL 1.1 Update Language provide?

It specifies a method for adding, replacing, and deleting data to/from a graph data set
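A hedged sketch of an update request that replaces a value by deleting the old triple and inserting a new one. The ex:book1 resource and both titles are hypothetical.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ex: <http://example.org/books/>   # hypothetical namespace

# Replace whatever title ex:book1 currently has
DELETE { ex:book1 dc:title ?oldTitle }
INSERT { ex:book1 dc:title "New Title" }
WHERE  { ex:book1 dc:title ?oldTitle }
```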

What does the SPARQL 1.1 Service Description provide?

It specifies a method for client applications to ask a SPARQL endpoint which SPARQL 1.1 features it supports

What does the SPARQL Query Results XML Format provide?

It specifies a simple XML format for query processors to use when returning results. When SPARQL query results are returned in XML format one can use XSLT to convert the XML results into other formats, including web friendly formats.

What does the SPARQL 1.1 Federated Query provide?

It specifies an extension of the SPARQL 1.1 Query Language for executing queries distributed over different SPARQL endpoints (how a single query can retrieve data from multiple sources)
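A sketch of what a federated query can look like, using the SERVICE keyword to send part of the pattern to a second endpoint. The ex:subject property is a hypothetical stand-in for local data, and it is assumed that the local subjects are DBpedia resource URIs.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/terms/>   # hypothetical namespace

SELECT ?book ?subject ?label
WHERE {
  # Matched against the local dataset
  ?book ex:subject ?subject .
  # Matched against the remote DBpedia endpoint
  SERVICE <http://dbpedia.org/sparql> {
    ?subject rdfs:label ?label .
    FILTER (LANG(?label) = "en")
  }
}
```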

What does the SPARQL Protocol for RDF provide?

It specifies how a program should pass SPARQL queries to a SPARQL query processing service and how that service should return the results.

What does the SPARQL Query Language for RDF provide?

It specifies the syntax of queries themselves (the "QL" in "SPARQL").

What do the SPARQL 1.1 Entailment Regimes provide?

It specifies what information a SPARQL processor should take into account when performing entailment. Entailment regimes are the ways in which SPARQL utilizes RDFS and OWL Sub-classes, Sub-properties, Domains, Ranges, etc. when querying RDF triples.

What is a FILTER EXPRESSION?

It takes a single argument enclosed in parentheses like so: FILTER (). The argument can be simple or complex, but the argument must always return a boolean (true or false) value. The argument of a FILTER expression can use any number of functions or operators

Why is keeping in mind the role of URIs and Namespaces useful when using SPARQL?

It's crucial when writing queries in SPARQL. For example, when requesting data from a dataset you will need to identify the data via a URI and when cross-referencing data from multiple datasets you will need to identify that data via URIs in order to write the query that will get you the data you want.

How complex should your OPTIONAL triple patterns be?

Keep them simple, don't group more than one triple pattern together in an OPTIONAL statement unless you really want/need to. It's usually more effective to just use a separate OPTIONAL keyword for each triple pattern.

How are Labels useful?

Labels can be mixed with language tags to provide human readable descriptions in multiple languages. SPARQL queries can retrieve label values of resources in place of or in addition to URIs so that a more human readable description of a resource is available. SKOS adds more granular labels to RDF data and these can also be returned with SPARQL query results.

Why is placement important when using ORDER BY?

ORDER BY must come before LIMIT or else the SPARQL processor will throw an error.
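The ordering of the two clauses can be sketched as follows, assuming resources that carry a dc:date value:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?s ?year
WHERE {
  ?s dc:date ?year .
}
ORDER BY DESC(?year)   # sort first (descending here)
LIMIT 10               # then cut the sorted list down
```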

Is order important when using OPTIONAL?

Order matters with OPTIONAL. A SPARQL processor will try to match triple patterns in the OPTIONAL statement in the order that it sees them.

How can you use the IN operator to find cities in the US and Canada?

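PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
# The ":" prefix is predeclared on the DBpedia endpoint; declaring it here lets the query run elsewhere too
PREFIX : <http://dbpedia.org/resource/>
SELECT ?city ?country
WHERE {
  ?subject dbpedia-owl:city ?city .
  ?subject dbpedia-owl:country ?country .
  FILTER (?country IN (:Canada, :United_States))
}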

How would NOT IN be used to find all US cities in states other than New York and New Jersey?

PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
# The ":" prefix is predeclared on the DBpedia endpoint; declaring it here lets the query run elsewhere too
PREFIX : <http://dbpedia.org/resource/>
SELECT DISTINCT ?city ?state
WHERE {
  ?subject dbpedia-owl:city ?city .
  ?subject dbpedia-owl:country :United_States .
  ?city dbpedia-owl:isPartOf ?state .
  FILTER (?state NOT IN (:New_Jersey, :New_York))
}

What punctuation is needed at the end of a PREFIX declaration?

PREFIX declarations do not end in a period, there is no punctuation at the end of a PREFIX declaration

How can you query any SPARQL endpoint for all Subject, Predicate and Object triples?

SELECT * WHERE { ?s ?p ?o . }

How does SPARQL function?

SPARQL is used to access, query, and integrate local and global RDF data from semantic and Linked Data repositories.

How do STR() and CONTAINS() work?

STR() converts non-string values into strings so those values can be used in string functions, such as CONTAINS(). In the homework, the values bound to ?state are actually URIs, and CONTAINS() cannot work with URIs, so the SPARQL processor must be told to treat the values bound to ?state as strings and not URIs.
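The homework query itself is not shown here, so the following is only a sketch of the idea, reusing the isPartOf property from the DBpedia ontology and an illustrative search string:

```sparql
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT DISTINCT ?city ?state
WHERE {
  ?city dbpedia-owl:isPartOf ?state .
  # ?state is bound to a URI, so convert it with STR() before the string test
  FILTER (CONTAINS(STR(?state), "New_"))
}
```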

How do you list multiple variables in a SELECT statement in SPARQL?

Separate them with spaces, not commas. Using commas in a SPARQL SELECT statement will cause a SPARQL processor to throw an error, and the query will fail to run.

What does a semicolon indicate to a SPARQL processor?

That it should expect another Predicate and Object that is associated with the Subject of the current Predicate and Object

What does a period at the end of the final triple pattern indicate?

That the preceding Predicate and Object should be considered the last Predicate and Object pair that belong to the main Subject.

How does the FILTER keyword work?

The FILTER keyword does not need to use SPARQL functions exclusively. One can write some very simple FILTER conditions with basic operators such as math operators that return boolean values.
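A FILTER built only from comparison and logical operators might look like this (ex:price and the price range are assumptions made for illustration):

```sparql
PREFIX ex: <http://example.org/shop/>   # hypothetical namespace

SELECT ?item ?price
WHERE {
  ?item ex:price ?price .
  # No function calls: the comparison itself yields true or false
  FILTER (?price >= 10 && ?price < 50)
}
```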

What is FOAF?

The Friend of a Friend vocabulary lets you describe facts about a person such as his or her name, home page, work place, and job title. The general idea was to provide the foundation for a distributed, RDF-based social networking system, but the FOAF vocabulary identifies such basic facts about people that it gets used in a wide variety of applications.

What does GROUP BY do?

The GROUP BY keyword instructs a SPARQL processor to group results together by the values of the variable(s) listed after the GROUP BY keyword.
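Grouping is usually paired with an aggregate. As a sketch, assuming works linked to their creators with dc:creator, this counts the works in each group:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?creator (COUNT(?work) AS ?works)
WHERE {
  ?work dc:creator ?creator .
}
GROUP BY ?creator
ORDER BY DESC(?works)
```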

What does the web component HTTP offer the web?

The Hypertext Transfer Protocol provides a way for servers and clients of all kinds to communicate with one another (ask and answer questions of one another).

What are the new, additional, recommended specifications for SPARQL 1.1?

The SPARQL 1.1 Overview
The SPARQL 1.1 Federated Query
The SPARQL 1.1 Update Language
The SPARQL 1.1 Service Description
The SPARQL 1.1 Query Results JSON Format
The SPARQL 1.1 Query Results CSV and TSV Formats
The SPARQL 1.1 Graph Store HTTP Protocol
The SPARQL 1.1 Entailment Regimes

What are the 3 specifications of SPARQL 1.0?

The SPARQL Query Language for RDF
The SPARQL Protocol for RDF
The SPARQL Query Results XML Format

What are some characteristics of SPARQL?

The SPARQL protocol specifies a simple interface that can be supported via HTTP or SOAP that a client can use to issue SPARQL queries against some endpoint. An 'endpoint' is a data repository that has the capability of receiving and interpreting a query and returning data.

What is a DATASET?

The collection of graphs that a given SPARQL query is querying. This collection consists of a default graph and optional named graphs.

What will happen if you specify a data source using the FROM keyword in your SPARQL query and then specify another data source when you call the SPARQL processor?

The data source in the protocol request will override the data source defined by the FROM keyword.

What are you defining in a SPARQL query?

Three things: the information you want to retrieve, the dataset from which the query will retrieve data, and the conditions that the data must meet in order to be part of the query result.

Why might a query be running slowly?

The more OPTIONAL statements you use, the slower your query will get. OPTIONAL is a very handy tool to have in your SPARQL toolbelt, but it adds complexity to the processing a SPARQL processor has to do to query a dataset and return results. So, the more OPTIONAL statements there are in a SPARQL query, the more complex it is for the SPARQL processor to run the query. Overuse of the OPTIONAL statement is one of the prime culprits behind slow queries.

Why should OPTIONAL triple patterns be simple?

The reason is that when you use multiple triple patterns with OPTIONAL, you are creating more complex patterns that will have to be matched and you stand a good chance of actually negating the optional nature of OPTIONAL. In other words, OPTIONAL will group patterns that you most likely want to be evaluated separately, so, use separate OPTIONAL keywords for each triple pattern.
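The difference can be sketched with two hypothetical properties, ex:doi and ex:abstract. With separate OPTIONAL blocks, each value is returned whenever it exists on its own:

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ex: <http://example.org/terms/>   # hypothetical namespace

SELECT ?article ?title ?doi ?abstract
WHERE {
  ?article dc:title ?title .
  OPTIONAL { ?article ex:doi ?doi . }
  OPTIONAL { ?article ex:abstract ?abstract . }
}
```

Had both patterns been placed inside a single OPTIONAL { ... }, an article with a DOI but no abstract would come back with neither variable bound, because the grouped pattern only matches when both triples are present.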

What is the DEFAULT GRAPH?

The triples in an RDF dataset that don't belong to a named graph.

How does SPARQL fit into the Semantic Web?

The web uses URIs to name resources, the RDF data model for describing resources, RDF Schema, OWL, and SKOS for storing vocabularies and ontologies, and the SPARQL query language for asking questions of resources. Linked data principles, together with these standards, make it possible to share semantic data.

What are SPARQL queries?

They are basically statements where the query states what information you wish to retrieve (based on a set of defined conditions) from a dataset

What are the characteristics of variables in SPARQL?

They are user defined and can be specific values or "wildcards" matching any object that matches the triple pattern.

What does a SPARQL query file look like?

They are written and saved to a file using any plain text editor (or a SPARQL tool like Twinkle) and by convention have a .rq file extension.

What do the web components URLs and URIs offer the web?

They provide a simple way for clients to request resources over a web protocol. URIs are globally unique identifiers of resources and properties, URLs are de-referenceable web addresses.

What does the web component Namespaces offer the web?

They provide a way for XML authors to differentiate elements from different specifications or domains. We refer to specific Namespaces with URIs.

What is a best practice in representing variable names?

They should be human readable to give some idea as to their purpose.

How are SPARQL variables like variables in other programming languages?

They store information. In this case, the values that the SPARQL query finds will be bound to ?ldbookISBN so that they can be used again in the query.

What does it mean to CAST?

To convert a piece of data from one datatype to another—for example, converting the string "123" to the integer 123 or "2011-10-14T13:19:00"^^xsd:dateTime to "2011-10-14T13:19:00"^^xsd:string. "Cast" is a common programming term and not specific to SPARQL.

How do you represent a URI in a Subject, Object or Predicate?

To ensure the SPARQL processor sees more than a line of text, place the URI inside angle brackets < >.

What is the main difference between triple patterns and RDF triples?

Triple patterns can include variables. These variables are similar in form and function to what you may have seen in XSLT or XQuery, and they are designed to provide a similar type of flexibility in SPARQL queries.

How can you identify a Turtle file?

Turtle serializations can be written and saved to a file using any plain text editor and by convention have the .ttl file extension.

What punctuation is used for a Turtle-style triple pattern?

Use a semi-colon at the end of each Predicate, Object fragment and a period after the final Predicate, Object fragment like... Predicate Object; Predicate Object.
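Written out in a query, the punctuation looks like this. The Dublin Core properties are used only as an illustration.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?book ?title ?creator ?date
WHERE {
  # One Subject, three Predicate-Object pairs
  ?book dc:title   ?title ;
        dc:creator ?creator ;
        dc:date    ?date .
}
```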

How does the IF() function work?

Using 3 expressions like this: IF(expression1, expression2, expression3), like an IF, THEN, ELSE statement.
1. A SPARQL processor evaluates the first expression as an effective boolean value (i.e. 'true' or 'false').
2. If the first expression evaluates to boolean 'true', then the function returns the value of the second expression.
3. If the first expression evaluates to boolean 'false', then the function returns the value of the third expression.
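The three expressions can be sketched in a query as follows, with the result stored via BIND. The ex:price property, the threshold of 100, and the two labels are assumptions chosen for illustration.

```sparql
PREFIX ex: <http://example.org/shop/>   # hypothetical namespace

SELECT ?item ?price ?band
WHERE {
  ?item ex:price ?price .
  # expression1: the test; expression2: value if true; expression3: value if false
  BIND (IF(?price > 100, "expensive", "affordable") AS ?band)
}
```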

What is the common best practice for typing variables such as ?subject ?predicate ?object ?

Using the shortened notation ?s ?p ?o

When would it be useful to take advantage of rdfs:label ?

When the URI isn't a user-friendly way of representing resources in result sets, use rdfs:label (or one of the many OWL or SKOS labels) to present a nice, readable label for a given resource.

When is the OPTIONAL keyword particularly useful?

When you are exploring a new SPARQL data set and don't have a solid understanding of what data may or may not be available in resources in the dataset.

Is whitespace a concern in SPARQL syntax?

Whitespace does not affect SPARQL syntax. Adding a good amount of white space and carriage returns makes the query more readable and the SPARQL processor completely ignores all of it.

How is the OPTIONAL keyword used?

You can assign a SPARQL triple pattern as OPTIONAL, effectively telling the SPARQL processor "show me the value of ?doi if it exists".

Can the OPTIONAL keyword only be used once?

You can use multiple OPTIONAL keywords/statements in your SPARQL query

What will be produced by the following: ORDER BY ?year ?s

• First the results are sorted by year (?year)
• Then, within each year, results are sorted by subject (?s)

