Semantic Web - Research Group At The University Of Innsbruck


4/23/2018

Semantic Web Services
SS 2018

Semantic Web
Anna Fensel
23.04.2018

Copyright 2010‐2018 Dieter Fensel, Ioan Toma, and Anna Fensel

Where are we?

#   Title
1   Introduction
2   Web Science; Cathy O’Neil’s talk: “Weapons of Math Destruction”
3   Service Science
4   Web services
5   Web2.0 services; ONLIM APIs (separate slideset)
6   Semantic Web
7   Semantic Web Service Stack (WSMO, WSML, WSMX)
8   OWL-S and the others
9   Semantic Services as a Part of the Future Internet and Big Data Technology
10  Lightweight Annotations
11  Linked Services
12  Applications
13  Mobile Services

Agenda

1. Motivation
   1. Development of the Web
      1. Internet
      2. Web 1.0
      3. Web 2.0
   2. Limitations of the current Web
2. Technical solution
   1. Introduction to Semantic Web
   2. Architecture and languages
3. Semantic Web - Data
4. Extensions
   1. Linked (Open) Data
   2. Schema.org
   3. LOV
5. Summary
6. References

MOTIVATION

DEVELOPMENT OF THE WEB

Development of the Web

1. Internet
2. Web 1.0
3. Web 2.0

INTERNET

Internet

“The Internet is a global system of interconnected computer networks that use the standard Internet Protocol Suite (TCP/IP) to serve billions of users worldwide. It is a network of networks that consists of millions of private and public, academic, business, and government networks of local to global scope that are linked by a broad array of electronic and optical networking [...]

A brief summary of Internet evolution

[Figure: timeline of Internet milestones from 1945 to 1995, covering the Memex, the silicon chip, the first vast computer network envisioned, packet switching, ARPANET, TCP/IP, the naming of the Internet, the WWW, Mosaic, and the beginning of the age of eCommerce]

Source: http://slidewiki.org/slide/24721

WEB 1.0

Web 1.0

“The World Wide Web ("WWW" or simply the "Web") is a system of interlinked, hypertext documents that runs over the Internet. With a Web browser, a user views Web pages that may contain text, images, and other multimedia and navigates between them [...]

Web 1.0

Netscape
  – Netscape is associated with the breakthrough of the Web.
  – Netscape rapidly gained a large user community, making it attractive for others to present their information on the Web.
Google
  – Google is the incarnation of Web 1.0 mega growth
  – By 2008, Google had already indexed more than 1 trillion pages [*]
  – Google and similar search engines showed that a piece of information can be found again faster on the Web than in one's own bookmark list

[*] [...]-was-big.html

Web 1.0 principles

The success of Web 1.0 is based on three simple principles:
1. A simple and uniform addressing schema to identify information chunks, i.e. Uniform Resource Identifiers (URIs)
2. A simple and uniform representation formalism to structure information chunks, allowing browsers to render them, i.e. Hyper Text Markup Language (HTML)
3. A simple and uniform protocol to access information chunks, i.e. Hyper Text Transfer Protocol (HTTP)

1. Uniform Resource Identifiers (URIs)

Uniform Resource Identifiers (URIs) are used to name/identify resources on the Web
URIs are pointers to resources to which request methods can be applied to generate potentially different responses
A resource can reside anywhere on the Internet
The most popular form of a URI is the Uniform Resource Locator (URL)

2. Hyper-Text Markup Language (HTML)

Hyper-Text Markup Language:
  – An application of the Standard Generalized Markup Language (SGML)
  – Facilitates a hyper-media environment
Documents use elements to “mark up” or identify sections of text for different purposes or display characteristics
HTML markup consists of several types of entities, including: elements, attributes, data types and character references
Markup elements are not seen by the user when the page is displayed
Documents are rendered by browsers

3. Hyper-Text Transfer Protocol (HTTP)

Protocol for client/server communication
  – The heart of the Web
  – Very simple request/response protocol: the client sends a request message, the server replies with a response message
  – Provides a way to publish and retrieve HTML pages
  – Stateless
  – Relies on the URI naming mechanism
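The request/response exchange described above can be sketched in a few lines of Python. This is only an illustration: the host `example.org`, the path `/index.html`, the sample response text, and the helper names `build_request` and `parse_response` are assumptions made for this sketch, and no network connection is opened.

```python
def build_request(method: str, path: str, host: str) -> str:
    """Format a minimal HTTP/1.1 request message (request line plus headers)."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, resource path, version
        f"Host: {host}",              # the host part of the URI being requested
        "Connection: close",
        "",                           # empty line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str):
    """Split a raw response into status code, reason, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# Hypothetical exchange: the client asks for an HTML page ...
request = build_request("GET", "/index.html", "example.org")

# ... and the server answers with a response message (made-up sample).
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><p>Hello</p></body></html>"
)
status, reason, headers, body = parse_response(sample_response)
```

Each request carries everything the server needs (method, path, host), which is what makes the protocol stateless: the server does not have to remember earlier requests to answer this one.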

WEB 2.0

Web 2.0

“The term "Web 2.0" (2004–present) is commonly associated with web applications that facilitate interactive information sharing, interoperability, user-centered design, and collaboration on the World Wide Web”
http://en.wikipedia.org/wiki/Web 2.0

Web 2.0

Web 2.0 is a vaguely defined phrase referring to various topics such as social networking sites, wikis, communication tools, and folksonomies.
Tim Berners-Lee is right that all these ideas already underlie his original ideas for the Web; however, there are differences in emphasis that may cause a qualitative change.
With Web 1.0 technology, a significant amount of software skills and investment in software was necessary to publish information.
Web 2.0 technology changed this dramatically.

Web 2.0 major breakthroughs

The four major breakthroughs of Web 2.0 are:
1. Blurring the distinction between content consumers and content providers.
2. Moving from media for individuals towards media for communities.
3. Blurring the distinction between service consumers and service providers.
4. Integrating human and machine computing in a new and innovative way.

1. Blurring the distinction between content consumers and content providers

Wikis, blogs, and Twitter turned the publication of text into a mass phenomenon, as Flickr and YouTube did for multimedia

2. Moving from a media for individuals towards a media for communities

Social web sites such as del.icio.us, Facebook, FOAF, LinkedIn, MySpace and Xing allow communities of users to smoothly interweave their information and activities

3. Blurring the distinction between service consumers and service providers

Mashups allow web users to easily integrate services implemented by third parties into their web sites

4. Integrating human and machine computing in a new way

Amazon Mechanical Turk allows access to human services through a web service interface, blurring the distinction between manually and automatically provided services

LIMITATIONS OF THE CURRENT WEB

Limitations of the current Web

The current Web has its limitations when it comes to:
1. finding relevant information
2. extracting relevant information
3. combining and reusing information

Limitations of the current Web
Finding relevant information

Finding information on the current Web is based on keyword search
Keyword search has limited recall and precision due to:
  – Synonyms: e.g. searching for information about “Cars” will ignore Web pages that contain the word “Automobiles”, even though the information on these pages could be relevant
  – Homonyms: e.g. searching for information about “Jaguar” will bring up pages containing information about both “Jaguar” (the car brand) and “Jaguar” (the animal), even though the user is interested in only one of them

Limitations of the current Web
Finding relevant information

Keyword search also has limited recall and precision due to:
  – Spelling variants: e.g. “organize” in American English vs. “organise” in British English
  – Spelling mistakes
  – Multiple languages: i.e. information about the same topics is published on the Web in different languages (English, German, Italian, ...)
Current search engines provide no means to specify the relation between a resource and a term
  – e.g. sell / buy
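The synonym and homonym problems above can be reproduced with a toy keyword search. This is a minimal sketch: the four "pages", the `keyword_search` function, and the hand-written synonym table are all invented for illustration and do not describe how any real search engine works.

```python
# Hypothetical mini-corpus: page id -> page text.
pages = {
    "page1": "Cheap cars for sale in Innsbruck",
    "page2": "Used automobiles at low prices",
    "page3": "The jaguar is a big cat native to the Americas",
    "page4": "Jaguar presents a new sports car",
}


def keyword_search(query, docs):
    """Return the ids of all pages that contain the query as a whole word."""
    word = query.lower()
    return sorted(pid for pid, text in docs.items() if word in text.lower().split())


# Synonyms hurt recall: page2 is relevant but is not found.
cars_hits = keyword_search("cars", pages)

# Homonyms hurt precision: the animal and the car brand are both returned.
jaguar_hits = keyword_search("jaguar", pages)

# A hand-written synonym table (an assumption of this sketch) recovers page2.
synonyms = {"cars": {"cars", "automobiles"}}


def expanded_search(query, docs):
    """Search for the query and every term listed as its synonym."""
    hits = set()
    for term in synonyms.get(query.lower(), {query.lower()}):
        hits.update(keyword_search(term, docs))
    return sorted(hits)


expanded_hits = expanded_search("cars", pages)
```

The synonym table only works because someone stated explicitly that the two words mean the same thing; the search itself still compares strings and knows nothing about meaning.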

Limitations of the current Web
Extracting relevant information

A one-fits-all automatic solution for extracting information from Web pages is not possible due to different formats and different syntaxes
Even from a single Web page it is difficult to extract the relevant information
  – Which book is about the Web?
  – What is the price of the book?

Limitations of the current Web
Extracting relevant information

Extracting information from current web sites can be done using wrappers

[Figure: a wrapper turns HTML pages from the Web into structured data (databases, XML)]

Limitations of the current Web
Extracting relevant information

The actual extraction of information from web sites is specified using standards such as XSL Transformation (XSLT) [1]
Extracted information can be stored as structured data in XML format or databases.
However, using wrappers does not really scale, because the actual extraction of information depends again on the web site format and layout

[1] http://www.w3.org/TR/xslt

Limitations of the current Web
Combining and reusing information

Tasks often require combining data on the Web
1. Searching for the same information in different digital libraries
2. Information may come from different web sites and needs to be combined
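Why wrappers depend on a site's format and layout, as stated in the extraction slides above, can be illustrated with a small sketch. It uses Python's standard `html.parser` module rather than XSLT, and the two page snippets, the CSS class names `title` and `price`, and the `BookWrapper` class are invented for illustration.

```python
from html.parser import HTMLParser


class BookWrapper(HTMLParser):
    """Wrapper tied to one layout: <span class="title"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self._field = None   # field whose text we expect next, if any
        self.record = {}     # extracted structured data

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("title", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.record[self._field] = data.strip()
            self._field = None


def extract(html):
    """Run the wrapper over one page and return the extracted record."""
    wrapper = BookWrapper()
    wrapper.feed(html)
    return wrapper.record


# Hypothetical page in the layout the wrapper was written for.
page_v1 = ('<div><span class="title">A Book About the Web</span> '
           '<span class="price">EUR 12.50</span></div>')

# The same information after a redesign: the wrapper no longer finds anything.
page_v2 = '<table><tr><td>A Book About the Web</td><td>EUR 12.50</td></tr></table>'

record_v1 = extract(page_v1)
record_v2 = extract(page_v2)
```

Both pages show the same book and price to a human reader, but only the first yields a record; every layout change or new site needs its own wrapper.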

Limitations of the current Web
Combining and reusing information

1. Searches for the same information in different digital libraries
Example: I want to travel from Innsbruck to Rome.

Limitations of the current Web
Combining and reusing information

2. Information may come from different web sites and needs to be combined
Example: I want to travel from Innsbruck to Rome, where I want to stay in a hotel and visit the city

How to improve the current Web?

Increasing automatic linking among data
Increasing recall and precision in search
Increasing automation in data integration
Increasing automation in the service life cycle

Adding semantics to data and services is the solution!

TECHNICAL SOLUTION

INTRODUCTION TO SEMANTIC WEB

The Vision

More than 3 billion users, more than a trillion pages (2016)

[Figure: the static WWW, built on URI, HTML, HTTP]

The Vision (contd.)

Serious problems in information finding, information extracting, information representing, information interpreting and information maintaining.

[Figure: from the static WWW (URI, HTML, HTTP) to the Semantic Web (RDF, RDF(S), OWL)]

What is the Semantic Web?

“The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
T. Berners-Lee, J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001.

What is the Semantic Web?

The next generation of the WWW
Information has machine-processable and machine-understandable semantics
Not a separate Web but an augmentation of the current one
Ontologies are the backbone of the Semantic Web

Ontology definition

An ontology is a formal, explicit specification of a shared conceptualization
  – formal: machine-readability with computational semantics
  – explicit: unambiguous terminology definitions
  – shared: commonly accepted understanding
  – conceptualization: conceptual model of a domain (ontological theory)

Gruber, “Toward principles for the design of ontologies used for knowledge sharing?”, Int. J. Hum.-Comput. Stud., vol. 43, no. 5-6, 1995.

“well-defined meaning”

“An ontology is an explicit specification of a conceptualization”
Gruber, “Toward principles for the design of ontologies used for knowledge sharing?”, Int. J. Hum.-Comput. Stud., vol. 43, no. 5-6, 1995.

Ontologies are the modeling foundations of the Semantic Web
  – They provide the well-defined meaning for information

explicit, specification, conceptualization, ...

An ontology is:
A conceptualization
  – An ontology is a model of the most relevant concepts of a phenomenon from the real world
Explicit
  – The model explicitly states the type of the concepts, the relationships between them and the constraints on their use
Formal
  – The ontology has to be machine readable (the use of natural language is excluded)
Shared
  – The knowledge contained in the ontology is consensual, i.e. it has been accepted by a group of people.

Studer, Benjamins, D. Fensel, “Knowledge engineering: Principles and methods”, Data & Knowledge Engineering, vol. 25, no. 1-2, 1998.

Ontology example

Concept: conceptual entity of the domain
Property: attribute describing a concept
Relation: relationship between concepts or properties
Axiom: coherency description between Concepts / Properties / Relations via logical [...]

[Figure: example ontology with the concepts Person, Student and Professor in an isA hierarchy (taxonomy), the properties name, matr.-nr., email and research field, the relation attends, and an axiom ending in “[...]sor, Lecture) [...] Lecture.topic [...] Professor.researchField”]

Types of ontologies

Top-level ontology (Top Level O., Generic O., Core O., Foundational O., High-level O., Upper O.)
  – describe very general concepts like space, time, event, which are independent of a particular problem or domain
Domain ontology
  – describe the vocabulary related to a generic domain by specializing the concepts introduced in the top-level ontology
Task & problem-solving ontology
  – describe the vocabulary related to a generic task or activity by specializing the top-level ontologies
Application ontology
  – the most specific ontologies. Concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity.

Guarino, N. (1998). Formal ontology in information systems: Proceedings of the first international conference (FOIS'98), June 6-8, Trento, Italy (Vol. 46). IOS press.
http://www.lirmm.fr/ mugnier/DEA/guarino98formal.pdf

The Semantic Web is about

Web Data Annotation
  – connecting (syntactic) Web objects, like text chunks and images, to their semantic notion (e.g., this image is about Innsbruck, Anna Fensel is a lecturer)
Data Linking on the Web (Web of Data)
  – global networking of knowledge through URI, RDF, and SPARQL (e.g., connecting my calendar with my RSS feeds, my pictures, ...)
Data Integration over the Web
  – seamless integration of data based on different conceptual models (e.g., integrating data coming from my two favorite book sellers)

Web Data Annotating

Ontoprise (formerly), now: http://www.semafora-systems.com

Data integration over the Web

Data integration involves combining data residing in different sources and providing the user with a unified view of these data
Data integration over the Web can be implemented as follows:
1. Export the data sets to be integrated as RDF graphs
2. Merge identical resources (i.e. resources having the same URI) from different data sets
3. Start making queries on the integrated data, queries that were not possible on the individual data sets.

Data integration over the Web

1. Export first data set as RDF graph
For example, the following RDF graph contains information about the book “The Glass Palace” by Amitav [...]
[...]ns/SWTutorial/Slides.pdf

Data integration over the Web

1. Export second data set as RDF graph
Information about the same book, but in French this time, is modeled in the RDF graph [...]
[...]ns/SWTutorial/Slides.pdf

Data Integration over the Web

2. Merge identical resources (i.e. resources having the same URI) from different data sets
Same URI [...] Same [...]
[...]tions/SWTutorial/Slides.pdf

Data integration over the Web

2. Merge identical resources (i.e. resources having the same URI) from different data sets
[...]s/SWTutorial/Slides.pdf

Data integration over the Web

3. Start making queries on the integrated data
  – A user of the second dataset may ask queries like: “give me the title of the original book”
  – This information is not in the second dataset
  – This information can, however, be retrieved from the integrated dataset, in which the second dataset was connected with the first dataset
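The three integration steps above (export, merge on shared URIs, query) can be sketched with plain Python sets of triples. The identifiers under `urn:example:`, the property names `a:title`, `a:author`, `f:titre` and `f:original`, and the literal values are placeholders assumed for this sketch; they are not copied from the graphs on the slides, and a real system would use an RDF library and SPARQL instead.

```python
# Hypothetical identifiers for the English original and the French translation.
BOOK_EN = "urn:example:book:glass-palace"
BOOK_FR = "urn:example:book:palais-des-miroirs"

# Step 1: each data set is exported as a set of (subject, predicate, object) triples.
dataset_a = {
    (BOOK_EN, "a:title", "The Glass Palace"),
    (BOOK_EN, "a:author", "urn:example:person:author"),
}
dataset_b = {
    (BOOK_FR, "f:titre", "Le Palais des miroirs"),
    (BOOK_FR, "f:original", BOOK_EN),  # points at the URI also used in dataset_a
}

# Step 2: merging is a set union; nodes with the same URI become one node.
merged = dataset_a | dataset_b


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def original_title(graph, translation):
    """Step 3: 'give me the title of the original book' for a translation."""
    titles = set()
    for original in objects(graph, translation, "f:original"):
        titles |= objects(graph, original, "a:title")
    return titles


# The second data set alone cannot answer the query; the merged graph can.
alone = original_title(dataset_b, BOOK_FR)
integrated = original_title(merged, BOOK_FR)
```

The merge needs no schema mapping at all: the two data sets connect only because both use the same identifier for the original book.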

ARCHITECTURE AND LANGUAGES

Web Architecture

Things are denoted by URIs
Use them to denote things
Serve useful information at them
Dereference them

Semantic Web Architecture

Give important concepts URIs
Each URI identifies one concept
Share these symbols between many languages
Support URI lookup

Semantic Web - Data

Identifier, Resource, Representation

Taken from http://www.w3.org/TR/webarch/

URI, URN, URL

A Uniform Resource Identifier (URI) is a string of characters used to identify a name or a resource on the Internet
A URI can be a URL or a URN
A Uniform Resource Name (URN) defines an item's identity
  – the URN urn:isbn:0-395-36341-1 is a URI that specifies the identifier system, i.e. International Standard Book Number (ISBN), as well as the unique reference within that system, and allows one to talk about a book, but doesn't suggest where and how to obtain an actual copy of it
A Uniform Resource Locator (URL) provides a method for finding it
  – the URL http://www.sti-innsbruck.at/ identifies a resource (STI's home page) and implies that a representation of that resource (such as the home page's current HTML code, as encoded characters) is obtainable via HTTP from a network host named www.sti-innsbruck.at
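The difference between the two example identifiers above can be seen by splitting them into components with Python's standard `urllib.parse` module. The `kind` helper and its short list of locator schemes are simplifications assumed for this sketch, not an official classification.

```python
from urllib.parse import urlparse

# The two identifiers used on the slide.
urn = urlparse("urn:isbn:0-395-36341-1")
url = urlparse("http://www.sti-innsbruck.at/")


def kind(uri: str) -> str:
    """Rough classification by scheme (a simplification for this sketch)."""
    scheme = urlparse(uri).scheme
    if scheme == "urn":
        return "URN"   # names an item, says nothing about where to get it
    if scheme in ("http", "https", "ftp"):
        return "URL"   # names an item and implies an access method and host
    return "URI"


# The URN has a scheme and a name, but no network host.
urn_parts = (urn.scheme, urn.netloc, urn.path)

# The URL has a scheme that implies a protocol, plus a host and a path.
url_parts = (url.scheme, url.netloc, url.path)
```

The empty host component of the URN mirrors the point made above: it identifies the book without saying where a copy can be obtained.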

eXtensible Markup Language (XML)

Language for creating languages
  – “Meta-language”
  – XHTML is a language: HTML expressed in XML
W3C Recommendation (standard)
  – XML is, for the information industry, what the container is for international shipping
  – For structured and semistructured data
Main plus: wide support, interoperability
  – Platform-independent
Applying new tools to old data

XML Schema Definition (XSD)

A grammar definition language
  – Like DTDs but better
Uses XML syntax
  – Defined by W3C
Primary features
  – Datatypes, e.g. integer, float, date, etc.
  – More powerful content models, e.g. namespace-aware, type derivation, etc.

Resource Description Framework (RDF)

The Resource Description Framework (RDF) provides a domain independent data model
Resources (identified by URIs)
  – Correspond to nodes in a graph
  – [...]p://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Properties (identified by URIs)
  – Correspond to labels of edges in a graph
  – Binary relation between two resources
  – [...]org/1999/02/22-rdf-syntax-ns#type
Literals
  – Concrete data values
  – E.g.: "John Smith", "1", "2006-03-07"

Resource Description Framework (RDF) – Triple Data Model

Triple data model: subject, predicate, object
  – Subject: Resource or blank node
  – Predicate: Property
  – Object: Resource, literal or blank node
Example: ex:john, ex:father-of, ex:bill
Statement (or triple) as a logical formula P(x, y), where the binary predicate P relates the object x to the object y.
RDF offers only binary predicates (properties).
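The triple data model above maps directly onto a small data structure. The sketch below is an illustration only: apart from ex:john, ex:father-of and ex:bill, the names (`ex:tom`, `ex:name`, the `Triple` class, `match`, `holds`) are assumptions, and literals are simply written as quoted strings.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str     # resource or blank node
    predicate: str   # property
    obj: str         # resource, literal or blank node


graph: List[Triple] = [
    Triple("ex:john", "ex:father-of", "ex:bill"),   # the example from the slide
    Triple("ex:bill", "ex:father-of", "ex:tom"),    # assumed extra statement
    Triple("ex:john", "ex:name", '"John Smith"'),   # object is a literal
]


def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.obj == o)]


def holds(graph, predicate, x, y):
    """Read a triple as the logical formula P(x, y)."""
    return Triple(x, predicate, y) in graph


fathers = match(graph, p="ex:father-of")
about_john = match(graph, s="ex:john")
```

Because every statement has exactly one subject and one object, a relation with more than two participants has to be broken into several triples; this is the "only binary predicates" restriction noted above.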

Resource Description Framework (RDF) – Graph Model

The triple data model can be represented as a graph
In the Artificial Intelligence community, such a graph is called a semantic net
Labeled, directed graphs
  – Nodes: resources, literals
  – Labels: properties
  – Edges: [...]

[Figure: example RDF graph, including a node :tom]

RDF Schema (RDFS)

RDF Schema (RDFS) is a language for capturing the semantics of a domain, for example:
  – In RDF: #john, rdf:type, #Student
  – What is a “#Student”?
RDFS is a language for defining RDF types:
  – Define classes: “#Student is a class”
  – Relationships between classes: “#Student is a sub-class of #Person”
  – Properties of classes: “#Person has a property hasName”
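The labeled, directed graph view of triples described above can be sketched as an adjacency structure. The sample triples (including the link to `ex:tom`) and the helper names `to_graph` and `reachable` are assumptions for illustration.

```python
from collections import defaultdict

# Assumed sample statements: each triple is one labeled, directed edge.
triples = [
    ("ex:john", "ex:father-of", "ex:bill"),
    ("ex:bill", "ex:father-of", "ex:tom"),
    ("ex:john", "rdf:type", "ex:Student"),
]


def to_graph(triples):
    """Map each subject node to its outgoing (label, target node) edges."""
    edges = defaultdict(list)
    for s, p, o in triples:
        edges[s].append((p, o))
    return edges


def reachable(graph, start, label):
    """Nodes reachable from start by following only edges with the given label."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for p, o in graph.get(node, []):
            if p == label and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


graph = to_graph(triples)
descendants = reachable(graph, "ex:john", "ex:father-of")
```

Note that the graph itself says nothing about what `ex:Student` means; it is just another node. Giving such type nodes a meaning is the job of RDF Schema.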

RDF Schema (RDFS)

Classes: #Student, rdf:type, rdfs:Class
Class hierarchies: #Student, rdfs:subClassOf, #Person
Properties: #hasName, rdf:type, rdf:Property
Property hierarchies: #hasMother, rdfs:subPropertyOf, #hasParent
Associating properties with classes (a):
  – “The property #hasName only applies to #Person”: #hasName, rdfs:domain, #Person
Associating properties with classes (b):
  – “The type of the property #hasName is xsd:string”: #hasName, rdfs:range, xsd:string

RDF Schema (RDFS) - Example
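What the schema statements above make possible can be sketched with a tiny forward-chaining loop. The sketch applies only three RDFS inference patterns (sub-class, sub-property, and domain), not the full RDFS entailment rules, and the instance data about `#john` and `#mary` as well as the function name `rdfs_closure` are assumptions for illustration.

```python
# Schema statements from the slide plus assumed instance data.
triples = {
    ("#Student", "rdfs:subClassOf", "#Person"),
    ("#hasMother", "rdfs:subPropertyOf", "#hasParent"),
    ("#hasName", "rdfs:domain", "#Person"),
    ("#john", "rdf:type", "#Student"),   # assumed instance data
    ("#john", "#hasMother", "#mary"),    # assumed instance data
    ("#mary", "#hasName", "Mary"),       # assumed instance data
}


def rdfs_closure(triples):
    """Apply three simplified RDFS rules until no new triples appear."""
    inferred = set(triples)
    while True:
        sub_class = {(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"}
        sub_prop = {(s, o) for s, p, o in inferred if p == "rdfs:subPropertyOf"}
        domain = {(s, o) for s, p, o in inferred if p == "rdfs:domain"}
        new = set()
        for s, p, o in inferred:
            if p == "rdf:type":
                # An instance of a class is also an instance of its super-classes.
                new |= {(s, "rdf:type", sup) for sub, sup in sub_class if o == sub}
            # A statement with a sub-property also holds for the super-property.
            new |= {(s, sup, o) for sub, sup in sub_prop if p == sub}
            # Using a property implies its subject belongs to the property's domain.
            new |= {(s, "rdf:type", cls) for prop, cls in domain if p == prop}
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_closure(triples)
derived = closure - triples
```

None of the derived statements were asserted directly; they follow from combining the instance data with the class hierarchy, the property hierarchy, and the domain declaration.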

Web Ontology Language (OWL)

RDFS has a number of limitations:
  – Only binary relations
  – Characteristics of properties, e.g. inverse, transitive, symmetric
  – Local range restrictions, e.g. for class Person, the property hasName has range xsd:string
  – Complex concept descriptions, e.g. Person is defined by Man and Woman
  – Cardinality restr
