http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/
Kingsley Idehen's Blog Data Space
I have seen the future and it's full of Linked Data! :-)
Kingsley Uyi Idehen
kidehen@openlinksw.com
2024-03-19T12:17:55Z
Virtuoso Universal Server 08.03.3327
http://www.openlinksw.com:443/weblog/public/images/vbloglogo.gif
Data Spaces
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1662
2011-03-01T23:49:26Z
<p>There is increasing coalescence around the idea that HTTP-based <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1e93cbd0">Linked Data</a> adds a tangible dimension to the <a class="auto-href" href="http://dbpedia.org/resource/World_Wide_Web" id="link-id0x1dfdde10">World Wide Web</a> (<a href="http://dbpedia.org/resource/World_Wide_Web">Web</a>). This <i><a href="http://dbpedia.org/resource/Data">Data</a> Dimension</i> grants end-users, power-users, integrators, and developers the ability to experience the Web not solely as a <i><a class="auto-href" href="http://dbpedia.org/resource/Information" id="link-id0x19d02b00">Information</a> Space</i> or <i>Document Space,</i> but now also as a <i><a class="auto-href" href="http://en.wikipedia.org/wiki/Data_Spaces" id="link-id0x1ac33378">Data Space</a>.</i> </p> <p>Here is a simple What and Why guide covering the essence of Data Spaces.</p> <h2>What is a Data Space?</h2> <p>A Data Space is a point of presence on a network, where every <i>Data Object</i> (item or <a class="auto-href" href="http://dbpedia.org/resource/Entity" id="link-id0x1d55f910">entity</a>) is given a <i>Name</i> (e.g., a <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0x1736ea28">URI</a>) by which it may be Referenced or Identified. </p> <p>In a Data Space, every <i>Representation</i> of those Data Objects (i.e., every <i>Object Representation</i>) has an <i>Address</i> (e.g., a <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1f17f5a8">URL</a>) from which it may be Retrieved (or "gotten").</p> <p>In a Data Space, every Object Representation is a time variant (that is, it changes over time), streamable, and format-agnostic <i>Resource.</i> </p> <p>An Object Representation is simply a Description of that Object. 
It takes the form of a graph, pictorially constructed from sets of 3 elements which are themselves named <i>Subject,</i> <i>Predicate,</i> and <i>Object</i> (or <i>SPO</i>); or <i>Entity,</i> <i>Attribute,</i> and <i>Value</i> (or <i>EAV</i>). Each <a class="auto-href" href="http://dbpedia.org/resource/Entity-attribute-value_model" id="link-id0x1dedcfe0">Entity</a>+Attribute+Value or Subject+Predicate+Object set (or <i>triple</i>), is one datum, one piece of data, one persisted observation about a given Subject or Entity.</p> <p>The underlying Schema that defines and constrains the construction of Object Representations is based on Logic, specifically <i>First-Order Logic</i>. Each Object Representation is a collection of persisted observations (<i>Data</i>) about a given Subject, which aid observers in materializing their perception (<i>Information</i>), and ultimately comprehension (<i><a class="auto-href" href="http://dbpedia.org/resource/Knowledge" id="link-id0x1a4c7bf8">Knowledge</a></i>), of that Subject.</p> <h2>Why are Data Spaces important?</h2> <p>In the real-world -- which is networked by nature -- data is heterogeneously (or "differently") shaped, and disparately located. </p> <p>Data has been increasing at an alarming rate since the advent of computing; the interWeb simply provides <a class="auto-href" href="http://dbpedia.org/resource/Context_%28language_use%29" id="link-id0x1ad97358">context</a> that makes this reality more palpable and more exploitable, and in the process virtuously ups the ante through increasingly exponential growth rates.</p> <p>We can't stop data heterogeneity; it is endemic to the nature of its producers -- humans and/or human-directed machines. What we can do, though, is create a powerful Conceptual-level "bus" or "interface" for data integration, based on <i>Data Description oriented Logic</i> rather than Data Representation oriented Formats. 
Basically, it's possible for us to use a <i><a href="http://en.wikipedia.org/wiki/First-order_predicate_logic" id="link-id0x1a481248">Common Logic</a></i> as the basis for expressing and blending SPO- or EAV-based Object Representations in a variety of Formats (or "dialects").</p> <p>The roadmap boils down to:</p> <ol> <li> <p>Assigning unambiguous Object Names to:</p> <ul> <li> <p>Every record (or, in table terms, every row); </p> </li> <li> <p>Every record attribute (or, in table terms, every field or column);</p> </li> <li> <p>Every record relationship (that is, every relationship between one record and another);</p> </li> <li> <p>Every record container (e.g., every table or view in a relational database, every named graph, every spreadsheet, every text file, etc.);</p> </li> </ul> </li> <li> <p>Making each Object Name resolve to an Address through which Create, Read, Update, and Delete ("CRUD") operations can be performed against (can <i>access</i>) the associated Object Representation graph.</p> </li> </ol>
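<p>The roadmap above can be sketched in a few lines of Javascript. This is only an illustration: it gives every row, column, and container of a tiny table a Name and emits the resulting Subject+Predicate+Object triples. The base URI <code>http://example.org/</code>, the <code>schema/memberOf</code> relationship name, the sample rows, and the function name <code>tableToTriples</code> are all assumptions made for the example, not part of any Virtuoso API.</p>

```javascript
// Hypothetical base URI for the Data Space; any HTTP URI prefix you control would do.
const BASE = "http://example.org/";

// Turn one table (a record container) into Subject+Predicate+Object triples.
// Assumes each row carries a unique `id` field that can form part of its Name.
function tableToTriples(tableName, rows) {
  const container = BASE + encodeURIComponent(tableName);
  const triples = [];
  for (const row of rows) {
    // Name for the record (row) itself.
    const subject = container + "/" + encodeURIComponent(row.id) + "#this";
    // Relationship between the record and its container (illustrative predicate).
    triples.push({ s: subject, p: BASE + "schema/memberOf", o: container });
    for (const [column, value] of Object.entries(row)) {
      if (column === "id") continue;
      // Name for the record attribute (column).
      const predicate = container + "/column/" + encodeURIComponent(column);
      triples.push({ s: subject, p: predicate, o: value });
    }
  }
  return triples;
}

const triples = tableToTriples("people", [
  { id: 1, name: "Alice", city: "Boston" },
  { id: 2, name: "Bob", city: "London" },
]);
```

<p>Each element of <code>triples</code> is one datum about one Subject. Making those Names resolve to Addresses that support CRUD operations (step 2 of the roadmap) is the job of the server and is not shown here.</p>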
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-03-01T17:26:15-05:00
New Preconfigured Virtuoso AMI for Amazon EC2 Cloud comprised of Linked Data from BBC & DBpedia
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1657
2011-02-19T01:20:30Z
<h2>What?</h2> <p>Introducing a new preloaded and preconfigured <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1bbe32d8">Virtuoso</a> (Cluster Edition) AMI for the Amazon EC2 Cloud that hosts combined Linked Datasets from: </p> <ul> <li> <a href="http://dbpedia.org/About" id="link-id0x1d21e780">DBpedia 3.6</a> </li> <li> <a href="http://www.bbc.co.uk/programmes" id="link-id0x1e1e0b10">BBC Programmes</a> </li> <li> <a href="http://www.bbc.co.uk/music" id="link-id0x1db12bd0">BBC Music</a> </li> <li> <a href="http://www.bbc.co.uk/nature/" id="link-id0x1bd46450">BBC Nature</a> </li> <li> <a href="http://www.bbc.co.uk/food/recipes/" id="link-id0x1d1b2468">BBC Food Recipes</a> </li> </ul> <h2>Why?</h2> <p> This AMI lets you predictably instantiate a powerful database with high-quality <a href="http://dbpedia.org/resource/Data">data</a> and cross-links within minutes, for personal or service-specific use. </p> <h2>How?</h2> <p>Simply follow the instructions in our <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSBBCMusicProgNatureFoodAndDBpedia36" id="link-id0x1d4f3210">Amazon EC2 guide for the BBC + DBpedia 3.6 Linked Dataset</a>.</p> <p>Your installation steps are as follows:</p> <ol> <li> Instantiate a Virtuoso EC2 AMI </li> <li> Mount the Amazon Elastic Block Store (EBS) snapshot that hosts the preloaded Virtuoso Database. 
</li> </ol> <h2>Related</h2> <ul> <li> <a href="http://www.slideshare.net/reduxd/beyond-the-polar-bear" id="link-id0x1b384af0">BBC Linked Data Spaces Presentation</a> </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_music_solo_artists_snapshot.png" id="link-id0x1a7a5ae0">BBC Music Linked Dataset Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_programmes_snapshot_sorted_by_genre.png" id="link-id0x1c2022a8">BBC Programmes Linked Dataset Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_nature_snapshot_sorted_by_adaptation.png" id="link-id0x1e138ac0">BBC Nature Linked Dataset Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_recipes_snapshot.png" id="link-id0x1b795100">BBC Food Recipes Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://www.delicious.com/kidehen/bbc_linkeddata" id="link-id0x1a581cf8">My Del.icio.us bookmark collection re. BBC Linked Data Demos</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSDBpediaBBC" id="link-id0x1dc0cc08">Amazon EC2 Snapshots for DBpedia 3.6 + BBC combo</a> -- delivers the BBC and DBpedia dataset combo via a mountable Elastic Block Store (EBS) device usable with an Amazon Machine Image (AMI) </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSDBpedia351C" id="link-id0x1de33b50">Amazon EC2 Snapshots for DBpedia 3.6 & 3.5</a> </li> <li> <a href="http://virtuoso.openlinksw.com/download/" id="link-id0x1c3e27c8">Virtuoso Commercial Edition Download Page</a> </li> <li> <a href="http://docs.openlinksw.com/virtuoso/clusterstcnf.html" id="link-id0x1d0ff170">Virtuoso Cluster Edition Guide</a> </li> </ul>
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-03-29T09:52:17.000001-04:00
DBpedia + BBC (combined) Linked Data Space Installation Guide
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1656
2011-02-17T22:15:41Z
<h2>What? </h2> <p> The <i><a class="auto-href" href="http://dbpedia.org/resource/DBpedia" id="link-id0x1c489cc8">DBpedia</a> + <a class="auto-href" href="http://dbpedia.org/resource/BBC" id="link-id0x1bf12698">BBC</a> Combo Linked Dataset </i> is a preconfigured <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1b16cbb0">Virtuoso</a> Cluster (4 Virtuoso Cluster Nodes, each comprised of one Virtuoso Instance; initial deployment is to a single Cluster Host, but license may be converted for physically distributed deployment), available via the Amazon EC2 Cloud, preloaded with the following datasets: </p> <ul> <li> <a href="http://dbpedia.org/About" id="link-id0x1d21e780">DBpedia 3.6</a> </li> <li> <a href="http://www.bbc.co.uk/programmes" id="link-id0x1e1e0b10">BBC Programmes</a> </li> <li> <a href="http://www.bbc.co.uk/music" id="link-id0x1db12bd0">BBC Music</a> </li> <li> <a href="http://www.bbc.co.uk/nature/" id="link-id0x1bd46450">BBC Nature</a> </li> <li> <a href="http://www.bbc.co.uk/food/recipes/" id="link-id0x1d1b2468">BBC Food Recipes</a> </li> </ul> <h2>Why?</h2> <p>The BBC has been publishing <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1b15eb60">Linked Data</a> from its <a href="http://dbpedia.org/resource/World_Wide_Web">Web</a> <a class="auto-href" href="http://en.wikipedia.org/wiki/Data_Spaces" id="link-id0x1c4c38a8">Data Space</a> for a number of years. In line with best practices for injecting Linked Data into the <a class="auto-href" href="http://dbpedia.org/resource/World_Wide_Web" id="link-id0x1e5acda0">World Wide Web</a> (Web), the BBC datasets are interlinked with other datasets such as DBpedia and MusicBrainz. 
</p> <p>Typical follow-your-nose exploration using a Web Browser (or even via sophisticated <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1d21e728">SPARQL</a> query crawls) isn't always practical once you get past the initial euphoria that comes from comprehending the Linked Data concept. As your queries get more complex, the overhead of remote sub-queries increases its impact, until query results take so long to return that you simply give up.</p> <p>Thus, maximizing the effects of the BBC's efforts requires Linked Data that shares locality in a Web-accessible Data Space — i.e., where all Linked Data sets have been loaded into the same data store or warehouse. This holds true even when leveraging SPARQL-FED style virtualization — there's always a need to localize data as part of any marginally-decent locality-aware cost-optimization algorithm.</p> <p>This DBpedia + BBC dataset, exposed via a preloaded and preconfigured Virtuoso Cluster, delivers a practical point of presence on the Web for immediate and cost-effective exploitation of Linked Data at the individual and/or service specific levels.</p> <h2>How?</h2> To work through this guide, you'll need to start with 90 GB of free disk space. (Only 41 GB will be consumed after you delete the installer archives, but starting with 90+ GB ensures enough work space for the installation.) <h3>Install Virtuoso</h3> <ol> <li> <p> <a href="http://virtuoso.openlinksw.com/download/" id="link-id0x1af0d230">Download Virtuoso installer archive(s)</a>. 
You must deploy the Personal or Enterprise Edition; the Open Source Edition does not support Shared-Nothing Cluster Deployment.</p> </li> <li> <p> <a href="http://virtuoso.openlinksw.com/pricing/" id="link-id0x1e089f40">Obtain a Virtuoso Cluster license</a>.</p> </li> <li> <p> <a href="http://wikis.openlinksw.com/dataspace/owiki/wiki/VirtuosoWikiWeb/VirtuosoInstallDocs" id="link-id0x1e86d060">Install Virtuoso</a>.</p> </li> <li> <p>Set key environment variables and start the OpenLink License Manager, using command (this may vary depending on your shell and install directory): </p> <blockquote> <code>. /opt/virtuoso/virtuoso-enterprise.sh</code> </blockquote> </li> <li> <p> <i>Optional:</i> To keep the default single-server configuration file and demo database intact, set the <code>VIRTUOSO_HOME</code> environment variable to a different directory, e.g., </p> <blockquote> <code>export VIRTUOSO_HOME=/opt/virtuoso/cluster-home/</code> </blockquote> <p> <i><b>Note:</b> You will have to adjust this setting every time you shift between this cluster setup and your single-server setup. Either may be made your environment's default through the <code>virtuoso-enterprise.sh</code> and related scripts.</i> </p> </li> <li> <p> <a href="http://docs.openlinksw.com/virtuoso/clusterstcnf.html" id="link-id0x1e184dc0">Set up your cluster</a> by running the <code>mkcluster.sh</code> script. 
Note that initial deployment of the <i>DBpedia + BBC Combo</i> requires a 4 node cluster, which is the default for this script.</p> </li> <li> <p>Start the Virtuoso Cluster with this command:</p> <blockquote> <code>virtuoso-start.sh</code> </blockquote> </li> <li> <p>Stop the Virtuoso Cluster with this command:</p> <blockquote> <code>virtuoso-stop.sh</code> </blockquote> </li> </ol> <h3>Using the DBpedia + BBC Combo dataset</h3> <ol> <li> <p>Navigate to your installation directory.</p> </li> <li> <p>Download the combo dataset installer script — <code><a href="https://s3.amazonaws.com/bbc-dbpedia-36-usa/bbc-dbpedia-install.sh" id="link-id0x195d7940">bbc-dbpedia-install.sh</a></code>.</p> </li> <li> <p>For best results, set the downloaded script to fully executable using this command:</p> <blockquote> <code>chmod 755 bbc-dbpedia-install.sh </code> </blockquote> </li> <li> <p>Shut down any Virtuoso instances that may be currently running.</p> </li> <li> <p> <i>Optional:</i> As above, if you have decided to keep the default single-server configuration file and demo database intact, set the <code>VIRTUOSO_HOME</code> environment variable appropriately, e.g., </p> <blockquote> <code>export VIRTUOSO_HOME=/opt/virtuoso/cluster-home/</code> </blockquote> </li> <li> <p>Run the combo dataset installer script with this command:</p> <blockquote> <code>sh bbc-dbpedia-install.sh</code> </blockquote> </li> </ol> <h3>Verify installation</h3> <p>The combo dataset typically deploys to EC2 virtual machines in under 90 minutes; your time will vary depending on your network connection speed, machine speed, and other variables.</p> <p>Once the script completes, perform the following steps:</p> <ol> <li> <p>Verify that the Virtuoso Conductor (HTTP-based Admin UI) is in place via:</p> <blockquote> <code>http://localhost:[port]/conductor</code> </blockquote> </li> <li> <p>Verify that the Virtuoso SPARQL endpoint is in place via:</p> <blockquote> <code>http://localhost:[port]/sparql</code> 
 </blockquote> </li> <li> <p>Verify that the Precision Search & Find UI is in place via:</p> <blockquote> <code>http://localhost:[port]/fct</code> </blockquote> </li> <li> <p>Verify that the Virtuoso hosted PivotViewer is in place via:</p> <blockquote> <code>http://localhost:[port]/PivotViewer</code> </blockquote> </li> </ol> <h2>Related</h2> <ul> <li> <a href="http://www.slideshare.net/reduxd/beyond-the-polar-bear" id="link-id0x1bd43bf0">BBC Linked Data Spaces Presentation</a> </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_music_solo_artists_snapshot.png" id="link-id0x1a7a5ae0">BBC Music Linked Dataset Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_programmes_snapshot_sorted_by_genre.png" id="link-id0x1c2022a8">BBC Programmes Linked Dataset Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_nature_snapshot_sorted_by_adaptation.png" id="link-id0x1e138ac0">BBC Nature Linked Dataset Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://kidehen-images.s3.amazonaws.com/bbc_recipes_snapshot.png" id="link-id0x1b795100">BBC Food Recipes Snapshot</a> -- PivotViewer Page Screenshot </li> <li> <a href="http://www.delicious.com/kidehen/bbc_linkeddata" id="link-id0x1c0ffcc8">My Del.icio.us bookmark collection re. 
BBC Linked Data Demos</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSDBpediaBBC" id="link-id0x1dc0cc08">Amazon EC2 Snapshots for DBpedia 3.6 + BBC combo</a> -- delivers the BBC and DBpedia dataset combo via a mountable Elastic Block Store (EBS) device usable with an Amazon Machine Image (AMI) </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSDBpedia351C" id="link-id0x1de33b50">Amazon EC2 Snapshots for DBpedia 3.6 & 3.5</a> </li> <li> <a href="http://virtuoso.openlinksw.com/download/" id="link-id0x1c3e27c8">Virtuoso Commercial Edition Download Page</a> </li> <li> <a href="http://docs.openlinksw.com/virtuoso/clusterstcnf.html" id="link-id0x1d0ff170">Virtuoso Cluster Edition Guide</a> </li> </ul>
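<p>The four checks in the "Verify installation" section of this guide can also be scripted. The sketch below is a hedged convenience, not part of the installer: the function names <code>verificationUrls</code> and <code>verifyInstallation</code> are made up for this example, and the default port of 8890 is an assumption that you should replace with the HTTP port your cluster is actually configured to use.</p>

```javascript
// Paths taken from the verification steps in this guide.
const ENDPOINT_PATHS = ["/conductor", "/sparql", "/fct", "/PivotViewer"];

// Build the URLs to check. The default port is an assumption; substitute
// the HTTP port your Virtuoso instance listens on.
function verificationUrls(port = 8890, host = "localhost") {
  return ENDPOINT_PATHS.map((path) => `http://${host}:${port}${path}`);
}

// Probe each URL and collect its HTTP status. Needs a runtime that provides
// a global fetch (for example a modern browser or a recent Node.js release).
async function verifyInstallation(port) {
  const results = [];
  for (const url of verificationUrls(port)) {
    try {
      const response = await fetch(url);
      results.push({ url, ok: response.ok, status: response.status });
    } catch (err) {
      // Connection refused, DNS failure, and similar network errors land here.
      results.push({ url, ok: false, status: null });
    }
  }
  return results;
}
```

<p>Calling <code>verifyInstallation(8890).then(console.table)</code> after the installer script finishes would list each endpoint with its status, which is quicker than opening the four pages by hand.</p>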
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-03-29T10:09:45.000001-04:00
Virtuoso + DBpedia 3.6 Installation Guide (Update 1)
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1654
2011-01-25T01:08:55Z
<h3>What is <a class="auto-href" href="http://dbpedia.org/resource/DBpedia" id="link-id0x1d8b5df0">DBpedia</a>?</h3> <p> DBpedia is a community effort to provide a contemporary deductive database derived from Wikipedia content. Project contributions can be partitioned as follows: </p> <ol> <li> Ontology Construction and Maintenance </li> <li> Dataset Generation via Wikipedia Content Extraction & Transformation </li> <li> Live Database Maintenance & Administration -- includes actual <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1ba81190">Linked Data</a> loading and publishing, provision of <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1d8af808">SPARQL</a> endpoint, and traditional DBA activity </li> <li> Internationalization. </li> </ol> <h3>Why is DBpedia important?</h3> <p> Comprising the nucleus of the Linked Open <a href="http://dbpedia.org/resource/Data">Data</a> effort, DBpedia also serves as a fulcrum for the burgeoning <a href="http://dbpedia.org/resource/World_Wide_Web">Web</a> of Linked Data by delivering a dense and highly-interlinked lookup database. In its most basic form, DBpedia is a great source of strong and resolvable identifiers for People, Places, Organizations, Subject Matter, and many other data items of interest. Naturally, it provides a fantastic starting point for comprehending the fundamental concepts underlying <a class="auto-href" href="http://www.w3.org/People/Berners-Lee/card#i" id="link-id0x1a8cc3d0">TimBL</a>'s initial <a href="http://www.w3.org/DesignIssues/LinkedData.html" id="link-id0x1cbbaf50">Linked Data</a> meme. 
</p> <h3>How do I use DBpedia?</h3> <p> Depending on your particular requirements, whether personal or service-specific, DBpedia offers the following: </p> <ul> <li> Datasets that can be loaded on your deductive database (also known as triple or quad stores) platform of choice </li> <li> Live browsable HTML+<a class="auto-href" href="http://dbpedia.org/resource/RDFa" id="link-id0x1d6b2148">RDFa</a> based <a class="auto-href" href="http://dbpedia.org/resource/Entity" id="link-id0x1d766a98">entity</a> description pages </li> <li> A wide variety of data formats for importing entity description data into a broad range of existing applications and services </li> <li> A SPARQL endpoint allowing ad-hoc querying over HTTP using the SPARQL query language, and delivering results serialized in a variety of formats </li> <li> A broad variety of tools covering query by example, faceted browsing, <a class="auto-href" href="http://dbpedia.org/resource/Full_text_search" id="link-id0x1b330ff8">full text search</a>, entity name lookups, etc. </li> </ul> <h3>What is the DBpedia 3.6 + <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1d705780">Virtuoso</a> Cluster Edition Combo?</h3> <p> <a class="auto-href" href="http://www.openlinksw.com/dataspace/organization/openlink#this" id="link-id0x1c894338">OpenLink Software</a> has preloaded the DBpedia 3.6 datasets into a preconfigured Virtuoso Cluster Edition database, and made the package available for easy installation.</p> <h3>Why is the DBpedia+Virtuoso package important?</h3> <p> The DBpedia+Virtuoso package provides a cost-effective option for personal or service-specific incarnations of DBpedia. 
</p> <p>For instance, you may have a service that isn't best-served by competing with the rest of the world for ad-hoc query time and resources on the live instance, which itself operates under various restrictions that allow this ad-hoc query service to be provided at Web Scale.</p> <p>Now you can easily commission your own instance and quickly exploit DBpedia and Virtuoso's database feature set to the max, powered by your own hardware and network infrastructure. </p> <h3>How do I use the DBpedia+Virtuoso package?</h3> <p>Pre-requisites are simply:</p> <ol> <li> <a href="http://wikis.openlinksw.com/dataspace/owiki/wiki/VirtuosoWikiWeb/VirtuosoInstallConfig" id="link-id0x19e3e450">Functional Virtuoso Cluster Edition installation</a>. </li> <li> <a href="http://virtuoso.openlinksw.com/pricing/" id="link-id0x1b703ad8">Virtuoso Cluster Edition License</a>. </li> <li>90 GB of free disk space -- the finished installation needs only 43 GB, but we recommend having 90 GB free until the installation completes.</li> </ol> <p> To install the Virtuoso Cluster Edition, simply perform the following steps: </p> <ol> <li> <a href="http://virtuoso.openlinksw.com/download/" id="link-id0x17b41648">Download Software</a>. </li> <li> Run installer </li> <li> <p>Set key environment variables and start the OpenLink License Manager, using this command (it may vary depending on your shell): </p> <blockquote> <code>. 
/opt/virtuoso/virtuoso-enterprise.sh</code> </blockquote> </li> <li> Run the <code>mkcluster.sh</code> script which defaults to a 4 node cluster </li> <li> Set <code>VIRTUOSO_HOME</code> environment variable -- if you want to start cluster databases distinct from single server databases via distinct root directory for database files (one that isn't adjacent to single-server database directories) </li> <li> Start Virtuoso Cluster Edition instances using command: <blockquote> <code>virtuoso-start.sh</code> </blockquote> </li> <li> Stop Virtuoso Cluster Edition instances using command: <blockquote> <code>virtuoso-stop.sh</code> </blockquote> </li> </ol> <p>To install your personal or service specific edition of DBpedia simply perform the following steps:</p> <ol> <li> Navigate to your installation directory </li> <li> Download Installer script (<code><a href="https://s3.amazonaws.com/dbpedia-36-usa/dbpedia-install.sh" id="link-id0x1da0c978">dbpedia-install.sh</a></code>) </li> <li> Set execution mode on script using command: <blockquote> <code>chmod 755 dbpedia-install.sh </code> </blockquote> </li> <li> Shutdown any Virtuoso instances that may be currently running </li> <li> Set your <code>VIRTUOSO_HOME</code> environment variable, e.g., to the current directory, via command (this may vary depending on your shell): <blockquote> <code>export VIRTUOSO_HOME=`pwd`</code> </blockquote> </li> <li> Run script using command: <blockquote> <code>sh dbpedia-install.sh</code> </blockquote> </li> </ol> <p> Once the installation completes (approximately 1 hour and 30 minutes from start time), perform the following steps: </p> <ol> <li> Verify that the Virtuoso Conductor (HTML based Admin UI) is in place via: <blockquote> <code>http://localhost:[port]/conductor</code> </blockquote> </li> <li> Verify that the Precision Search & Find UI is in place via: <blockquote> <code>http://localhost:[port]/fct</code> </blockquote> </li> <li>Verify that DBpedia's Green Entity Description Pages 
are in place via: <blockquote> <code>http://localhost:[port]/resource/DBpedia</code> </blockquote> </li> </ol> <h3>Related</h3> <ul> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSDBpedia351C" id="link-id0x1d819b90">Amazon EC2 Snapshots for DBpedia 3.6 & 3.5</a> </li> <li> <a href="http://virtuoso.openlinksw.com/download/" id="link-id0x1c3e27c8">Virtuoso Commercial Edition Download Page</a> </li> <li> <a href="http://docs.openlinksw.com/virtuoso/clusterstcnf.html" id="link-id0x1d0ff170">Virtuoso Cluster Edition Guide</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1594" id="link-id0x1c891cf8">What is the DBpedia Project?</a> </li> </ul>
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-01-25T14:46:26-05:00
SPARQL Guide for the Javascript Developer
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1653
2011-01-21T19:59:49Z
<h3>What?</h3> <p>A simple guide usable by any Javascript developer seeking to exploit <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x17b447e8">SPARQL</a> without hassles.</p> <h3>Why?</h3> <p>SPARQL is a powerful query language, results serialization format, and an HTTP based <a href="http://dbpedia.org/resource/Data">data</a> access protocol from the W3C. It provides a mechanism for accessing and integrating data across <a href="http://en.wikipedia.org/wiki/Deductive_database" id="link-id0x1cc76540">Deductive Database Systems</a> (colloquially referred to as triple or quad stores in <a class="auto-href" href="http://dbpedia.org/resource/Semantic_Web" id="link-id0x1d944d78">Semantic Web</a> and <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1c7a87c8">Linked Data</a> circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form. </p> <h3>How?</h3> <p>SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.</p> <h4>Steps:</h4> <ol> <li>Determine which SPARQL endpoint you want to access e.g. <a href="http://dbpedia.org/sparql" id="link-id0x1d476520">DBpedia</a> or a local <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1bcfe140">Virtuoso</a> instance (typically: http://localhost:8890/sparql). 
</li> <li>If using Virtuoso, and you want to populate its quad store using SPARQL, assign "<a href="http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsupportedprotocolendpointuri" id="link-id0x1c7630b8">SPARQL_SPONGE</a>" privileges to user "SPARQL" (this is basic control; more sophisticated WebID based ACLs are available for controlling SPARQL access).</li> </ol> <h4>Script:</h4> <pre>
/* Demonstrating use of a single query to populate a
   Virtuoso Quad Store via Javascript. */

/* The HTTP <a href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1bc27a18">URL</a> is constructed with JSON as the default
   query results format, selected via mime type. */
function sparqlQuery(query, baseURL, format) {
  if (!format) format = "application/json";
  var params = {
    "default-graph": "",
    "should-sponge": "soft",
    "query": query,
    "debug": "on",
    "timeout": "",
    "format": format,
    "save": "display",
    "fname": ""
  };

  /* URL-encode every parameter value and join them into a query string. */
  var querypart = "";
  for (var k in params) {
    querypart += k + "=" + encodeURIComponent(params[k]) + "&";
  }
  var queryURL = baseURL + '?' + querypart;

  /* Declared locally so the request object does not leak into global scope. */
  var xmlhttp;
  if (window.XMLHttpRequest) {
    xmlhttp = new XMLHttpRequest();
  } else {
    xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
  }

  /* Synchronous GET: the call blocks until the response arrives. */
  xmlhttp.open("GET", queryURL, false);
  xmlhttp.send();
  return JSON.parse(xmlhttp.responseText);
}

/* Setting Data Source Name (DSN) */
var dsn = "http://dbpedia.org/resource/DBpedia";

/* The Virtuoso pragma DEFINE get:soft "replace" instructs the Virtuoso SPARQL
   engine to perform an HTTP GET using the IRI in the FROM clause as the
   Data Source URL with regard to DBMS record inserts. */
var query = "DEFINE get:soft \"replace\"\nSELECT DISTINCT * FROM <" + dsn + "> WHERE {?s ?p ?o}";
var data = sparqlQuery(query, "/sparql/");
</pre> <h4>Output</h4> <p> Place the snippet above into a <code>script</code> element of an HTML document to see the <a href="http://twitpic.com/3s2vs3/full" id="link-id0x1cff2288">query result</a>. </p> <h3>Conclusion</h3> <p> JSON was chosen over XML (re. 
output format) since this is about a "no-brainer installation and utilization" guide for a Javascript developer that already knows how to use Javascript for HTTP based data access within HTML. SPARQL just provides an added bonus to URL dexterity (delivered via <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0x1d29da98">URI</a> abstraction) with regards to constructing Data Source Names or Addresses.</p> <h3>Related</h3> <ul> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1652" id="link-id0x1b0ffb28">SPARQL Guide for the PHP Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1651" id="link-id0x1a8c5ae0">SPARQL Guide for the Python Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1648" id="link-id0x1b86ad28">SPARQL Guide for the Ruby Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1646" id="link-id0x1c7af188">Simple Guide for using SPARQL with Virtuoso</a> </li> <li> <a href="http://www.delicious.com/kidehen/sparql_tutorial" id="link-id0x1ac1ba48">General SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.delicious.com/kidehen/virtuoso_sparql_tutorial" id="link-id0x1c7be660">Virtuoso Specific SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1567" id="link-id0x1c52b438">The URI, URL, and Linked Data Meme's Generic HTTP URI</a>. </li> </ul>
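<p>As a hedged variation on the script in this guide, the sketch below sends the same request parameters without blocking the page, and flattens the SPARQL JSON results into plain objects. The helper names <code>buildSparqlUrl</code>, <code>bindingsToRows</code>, and <code>sparqlQueryAsync</code> are invented for this example; the parameter names are copied from the script above, and a runtime with a global <code>fetch</code> is assumed.</p>

```javascript
// Build the request URL with the same parameters the guide's script sends.
function buildSparqlUrl(query, baseURL, format = "application/json") {
  const params = new URLSearchParams({
    "default-graph": "",
    "should-sponge": "soft",
    query: query,
    debug: "on",
    timeout: "",
    format: format,
    save: "display",
    fname: "",
  });
  return baseURL + "?" + params.toString();
}

// Flatten SPARQL JSON results into one plain object per solution row,
// keyed by the variable names listed in head.vars.
function bindingsToRows(results) {
  return results.results.bindings.map((binding) => {
    const row = {};
    for (const name of results.head.vars) {
      row[name] = binding[name] ? binding[name].value : undefined;
    }
    return row;
  });
}

// Non-blocking counterpart of sparqlQuery(); assumes a global fetch.
async function sparqlQueryAsync(query, baseURL, format) {
  const response = await fetch(buildSparqlUrl(query, baseURL, format));
  if (!response.ok) {
    throw new Error("SPARQL request failed with HTTP status " + response.status);
  }
  return bindingsToRows(await response.json());
}
```

<p>A call such as <code>sparqlQueryAsync(query, "/sparql/").then(console.table)</code> would then print one row per result. The synchronous version in the guide is simpler to read, but it freezes the page while the endpoint responds, which is why an asynchronous form may be preferable for anything beyond a quick experiment.</p>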
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-01-26T18:10:28-05:00
SPARQL Guide for the PHP Developer
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1652
2011-01-20T21:25:49Z
<h3>What?</h3> <p>A simple guide usable by any <a class="auto-href" href="http://dbpedia.org/resource/PHP_programming_language" id="link-id0x1bdca7b8">PHP</a> developer seeking to exploit <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1c894338">SPARQL</a> without hassles.</p> <h3>Why?</h3> <p>SPARQL is a powerful query language, results serialization format, and an HTTP based <a href="http://dbpedia.org/resource/Data">data</a> access protocol from the W3C. It provides a mechanism for accessing and integrating data across <a href="http://en.wikipedia.org/wiki/Deductive_database" id="link-id0x1c319af0">Deductive Database Systems</a> (colloquially referred to as triple or quad stores in <a class="auto-href" href="http://dbpedia.org/resource/Semantic_Web" id="link-id0x1d944d78">Semantic Web</a> and <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1c7a87c8">Linked Data</a> circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form. </p> <h3>How?</h3> <p>SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. PHP.</p> <h4>Steps:</h4> <ol> <li> From your command line execute: aptitude search '^PHP26', to verify PHP is in place </li> <li>Determine which SPARQL endpoint you want to access e.g. <a href="http://dbpedia.org/sparql" id="link-id0x1d476520">DBpedia</a> or a local <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1bcfe140">Virtuoso</a> instance (typically: http://localhost:8890/sparql). 
</li> <li>If using Virtuoso, and you want to populate its quad store using SPARQL, assign "<a href="http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsupportedprotocolendpointuri" id="link-id0x1c7630b8">SPARQL_SPONGE</a>" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).</li> </ol> <h4>Script:</h4>
<pre>
#!/usr/bin/env php
<?php
#
# Demonstrating use of a single query to populate a
# Virtuoso Quad Store via PHP.
#

# HTTP <a href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1ce1d6d8">URL</a> is constructed accordingly with JSON query results format in mind.
function sparqlQuery($query, $baseURL, $format="application/json")
{
  $params=array(
    "default-graph" => "",
    "should-sponge" => "soft",
    "query" => $query,
    "debug" => "on",
    "timeout" => "",
    "format" => $format,
    "save" => "display",
    "fname" => ""
  );

  $querypart="?";
  foreach($params as $name => $value)
  {
    $querypart=$querypart . $name . '=' . urlencode($value) . "&";
  }

  $sparqlURL=$baseURL . $querypart;

  return json_decode(file_get_contents($sparqlURL));
};

# Setting Data Source Name (DSN)
$dsn="http://dbpedia.org/resource/DBpedia";

#Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
#using the IRI in FROM clause as Data Source URL
$query="DEFINE get:soft \"replace\" SELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}";

$data=sparqlQuery($query, "http://localhost:8890/sparql/");

print "Retrieved data:\n" . json_encode($data);
?>
</pre>
<h4>Output</h4>
<pre>
Retrieved data:
{"head":
  {"link":[],"vars":["s","p","o"]},
 "results":
  {"distinct":false,"ordered":true,
   "bindings":[
    {"s": {"type":"<a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0x1ca44a98">uri</a>","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p": {"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o": {"type":"uri","value":"http:\/\/www.w3.org\/2002\/07\/owl#Thing"}},
    {"s": {"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p": {"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o": {"type":"uri","value":"http:\/\/dbpedia.org\/ontology\/Work"}},
    {"s": {"type":"uri","value":"http:\/\/dbpedia.org\/resource\/DBpedia"},"p": {"type":"uri","value":"http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"},"o": {"type":"uri","value":"http:\/\/dbpedia.org\/class\/yago\/Software106566077"}},
...
</pre>
<h3>Conclusion</h3> <p> JSON was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a PHP developer that already knows how to use PHP for HTTP based data access. 
SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.</p> <h3>Related</h3> <ul> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1651" id="link-id0x1a8c5ae0">SPARQL Guide for the Python Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1648" id="link-id0x1b86ad28">SPARQL Guide for the Ruby Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1646" id="link-id0x1c7af188">Simple Guide for using SPARQL with Virtuoso</a> </li> <li> <a href="http://www.delicious.com/kidehen/sparql_tutorial" id="link-id0x1ac1ba48">General SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.delicious.com/kidehen/virtuoso_sparql_tutorial" id="link-id0x1c7be660">Virtuoso Specific SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1567" id="link-id0x1c52b438">The URI, URL, and Linked Data Meme's Generic HTTP URI</a>. </li> </ul>
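<p>The JSON payload shown in the Output section of this guide nests each value under a variable name inside <code>results.bindings</code>. As a rough sketch of the "local object binding" step, the snippet below flattens such a payload into plain rows. It is written in Python purely for illustration (the same loop maps onto the object returned by PHP's <code>json_decode</code>); the helper name <code>bindings_to_rows</code> and the trimmed <code>SAMPLE</code> payload are assumptions made for this example, not part of the original script.</p>

```python
import json

# Sample payload trimmed from the Output section of this guide (one binding only).
SAMPLE = '''{"head": {"link": [], "vars": ["s", "p", "o"]},
 "results": {"distinct": false, "ordered": true, "bindings": [
  {"s": {"type": "uri", "value": "http://dbpedia.org/resource/DBpedia"},
   "p": {"type": "uri", "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},
   "o": {"type": "uri", "value": "http://www.w3.org/2002/07/owl#Thing"}}]}}'''


def bindings_to_rows(payload):
    """Return one tuple of plain values per binding, ordered by head.vars."""
    variables = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        # A variable may be unbound in a given solution, so default to None.
        rows.append(tuple(binding.get(v, {}).get("value") for v in variables))
    return rows


rows = bindings_to_rows(json.loads(SAMPLE))
for s, p, o in rows:
    print(s, p, o)
```

<p>Reading the column order from <code>head.vars</code> rather than hard-coding it keeps the helper usable for queries that select different variables.</p>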
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-01-25T10:36:58-05:00
SPARQL Guide for Python Developer
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1651
2011-01-19T17:13:30Z
<h3>What?</h3> <p>A simple guide usable by any <a class="auto-href" href="http://dbpedia.org/resource/Python_programming_language" id="link-id0x1bdca7b8">Python</a> developer seeking to exploit <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1c894338">SPARQL</a> without hassles.</p> <h3>Why?</h3> <p>SPARQL is a powerful query language, results serialization format, and an HTTP based <a href="http://dbpedia.org/resource/Data">data</a> access protocol from the W3C. It provides a mechanism for accessing and integrating data across <a href="http://en.wikipedia.org/wiki/Deductive_database" id="link-id0x1c319af0">Deductive Database Systems</a> (colloquially referred to as triple or quad stores in <a class="auto-href" href="http://dbpedia.org/resource/Semantic_Web" id="link-id0x1d944d78">Semantic Web</a> and <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1c7a87c8">Linked Data</a> circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form. </p> <h3>How?</h3> <p>SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. Python.</p> <h4>Steps:</h4> <ol> <li> From your command line execute: aptitude search '^python26', to verify Python is in place </li> <li>Determine which SPARQL endpoint you want to access e.g. <a href="http://dbpedia.org/sparql" id="link-id0x1d476520">DBpedia</a> or a local <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1bcfe140">Virtuoso</a> instance (typically: http://localhost:8890/sparql). 
</li> <li>If using Virtuoso, and you want to populate its quad store using SPARQL, assign "<a href="http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsupportedprotocolendpointuri" id="link-id0x1c7630b8">SPARQL_SPONGE</a>" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).</li> </ol> <h4>Script:</h4>
<pre>
#!/usr/bin/env python
#
# Demonstrating use of a single query to populate a
# Virtuoso Quad Store via Python.
#

import urllib, json

# HTTP <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1bd91cf0">URL</a> is constructed accordingly with JSON query results format in mind.

def sparqlQuery(query, baseURL, format="application/json"):
    params={
        "default-graph": "",
        "should-sponge": "soft",
        "query": query,
        "debug": "on",
        "timeout": "",
        "format": format,
        "save": "display",
        "fname": ""
    }
    querypart=urllib.urlencode(params)
    response = urllib.urlopen(baseURL,querypart).read()
    return json.loads(response)

# Setting Data Source Name (DSN)
dsn="http://dbpedia.org/resource/DBpedia"

# Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
# using the IRI in FROM clause as Data Source URL
query="""DEFINE get:soft "replace"
SELECT DISTINCT * FROM <%s> WHERE {?s ?p ?o}""" % dsn

data=sparqlQuery(query, "http://localhost:8890/sparql/")

print "Retrieved data:\n" + json.dumps(data, sort_keys=True, indent=4)

#
# End
</pre>
<h4>Output</h4>
<pre>
Retrieved data:
{
    "head": {
        "link": [],
        "vars": [
            "s",
            "p",
            "o"
        ]
    },
    "results": {
        "bindings": [
            {
                "o": {
                    "type": "<a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0x1b1470b8">uri</a>",
                    "value": "http://www.w3.org/2002/07/owl#Thing"
                },
                "p": {
                    "type": "uri",
                    "value": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
                },
                "s": {
                    "type": "uri",
                    "value": "http://dbpedia.org/resource/DBpedia"
                }
            },
...
</pre>
<h3>Conclusion</h3> <p> JSON was chosen over XML (re. 
output format) since this is about a "no-brainer installation and utilization" guide for a Python developer that already knows how to use Python for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.</p> <h3>Related</h3> <ul> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1648" id="link-id0x1c9e26b0">SPARQL Guide for the Ruby Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1646" id="link-id0x1c7af188">Simple Guide for using SPARQL with Virtuoso</a> </li> <li> <a href="http://www.delicious.com/kidehen/sparql_tutorial" id="link-id0x1ac1ba48">General SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.delicious.com/kidehen/virtuoso_sparql_tutorial" id="link-id0x1c7be660">Virtuoso Specific SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1567" id="link-id0x1c52b438">The URI, URL, and Linked Data Meme's Generic HTTP URI</a>. </li> </ul>
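<p>The script in this guide targets Python 2 (print statement, <code>urllib.urlencode</code>). For readers on Python 3, here is a minimal sketch of the same request, assuming the endpoint accepts the query over HTTP GET as in the PHP and Ruby guides. The function names <code>build_sparql_url</code> and <code>sparql_query</code> are hypothetical; the parameter names are copied from the script above.</p>

```python
import json
import urllib.parse
import urllib.request


def build_sparql_url(query, base_url, fmt="application/json"):
    """Build a GET URL carrying the query and the requested results format."""
    # Same parameter set as the Python 2 script in this guide.
    params = {
        "default-graph": "",
        "should-sponge": "soft",
        "query": query,
        "debug": "on",
        "timeout": "",
        "format": fmt,
        "save": "display",
        "fname": "",
    }
    return base_url + "?" + urllib.parse.urlencode(params)


def sparql_query(query, base_url, fmt="application/json"):
    """Send the query and decode the JSON results (needs a reachable endpoint)."""
    with urllib.request.urlopen(build_sparql_url(query, base_url, fmt)) as response:
        return json.loads(response.read().decode("utf-8"))


# Data Source Name (DSN) and query, as in the original script.
dsn = "http://dbpedia.org/resource/DBpedia"
query = 'DEFINE get:soft "replace" SELECT DISTINCT * FROM <%s> WHERE {?s ?p ?o}' % dsn

# Only the URL is built here; call sparql_query(...) once an endpoint is running.
url = build_sparql_url(query, "http://localhost:8890/sparql/")
print(url)
```

<p>Keeping URL construction separate from the network call makes it possible to inspect the request without a running Virtuoso instance.</p>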
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-01-25T10:35:46-05:00
Rough draft poem: Document, what art thou?
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1646
2010-11-11T18:44:36Z
<em>I am the <a href="http://dbpedia.org/resource/Data">Data</a> Container, Disseminator, and Canvas.<br /> I came to be when the cognitive skills of mankind deemed oral history inadequate.<br /> I am transcendent, I take many forms, but my core purpose is constant - Container, Disseminator, and Canvas.<br /> I am dexterous, so I can be blank, partitioned horizontally, horizontally and vertically, and if you get moi excited, I'll show you fractals.<br /> I am accessible in a number of ways, across a plethora of media.<br /> I am loose, so you can access my content too.<br /> I am loose in a cool way, so you can refer to moi independent of my content.<br /> I am cool in a loose way, so you can refer to my content independent of moi.<br /> I am even cool and loose enough to let you figure out stuff from my content including how it's totally distinct from moi.<br /> <strong>But...</strong> <br /> I am possessive about my coolness, so all Containment, Dissemination, and Canvas requirements must first call upon moi, wherever I might be.<br /> <strong>So...</strong> <br /> If you postulate about my demise or irrelevance, across any medium, I will punish you with confusion!<br /> <strong>Remember...</strong> <br /> I just told you who I am. <br /> <strong>Lesson to be learned...</strong> <br /> When something tells you what it is, and it is as powerful as I, best you believe it.<br /> BTW -- I am Okay with HTTP response code 200 OK :-) </em>
Kingsley Uyi Idehen
kidehen@openlinksw.com
2010-11-12T18:08:25-05:00
7 Things Brought to You by HTTP-based Hypermedia
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1644
2010-11-08T21:43:28Z
<p>There are some very powerful benefits that accrue from the use of <a href="http://dbpedia.org/resource/Hypertext_Transfer_Protocol" id="link-id0x1b498648">HTTP</a> based <a href="http://dbpedia.org/resource/Hypermedia" id="link-id0x1be1e208">Hypermedia</a>. 7 that come to mind immediately include: </p> <ol> <li>Structured & Platform Independent Enterprise <a class="auto-href" href="http://dbpedia.org/resource/Federated_database_system" id="link-id0x1ab5d6c8">Data Virtualization</a> -- concrete conceptual level access and provisioning of abstract domain entities such as Customers, Orders, Employees, Products, Countries, Competitors etc.</li> <li>Distributed Application State (<a href="http://dbpedia.org/resource/Representational_State_Transfer" id="link-id0x1a8a0e38">REST</a>) -- application state transitions via links</li> <li> Structured Data Representation (<a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1acf1aa0">Linked Data</a>) -- whole data representation via links </li> <li> Structured Identity (<a href="http://esw.w3.org/WebID" id="link-id0x1a484548">WebID</a>) -- verifiable distributed identity </li> <li> Structured Profiles (<a class="auto-href" href="http://dbpedia.org/resource/Friend_of_a_friend" id="link-id0xa00bca8">FOAF</a>) -- platform independent profiles for people and organizations </li> <li> Articulation of Structured Value Propositions (<a href="http://www.heppnetz.de/projects/goodrelations/" id="link-id0x1a4793d0">GoodRelations</a>) -- Product & Service Offers, Business Entities, Locations, Business Hours, etc. </li> <li> Structured Collaboration Spaces (<a href="http://rdfs.org/sioc/spec/" id="link-id0x1afb8b40">SIOC</a>) -- Blogs, Wikis, File Sharing, Discussion Forums, Aggregated Feeds, Statuses, Photo Galleries, Polls etc.</li> </ol>
Kingsley Uyi Idehen
kidehen@openlinksw.com
2010-11-08T15:29:43-05:00
Virtuoso Linked Data Deployment 3-Step
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1641
2010-10-29T22:54:32Z
<p>Injecting <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x17012e18">Linked Data</a> into the Web has been a major pain point for those who seek personal, service, or organization-specific variants of <a class="auto-href" href="http://dbpedia.org/resource/DBpedia" id="link-id0x196518a8">DBpedia</a>. Basically, the sequence goes something like this: </p> <ol> <li> You encounter DBpedia or the <a class="auto-href" href="http://community.linkeddata.org/dataspace/organization/lod#this" id="link-id0x1b26d008">LOD</a> Cloud Pictorial.</li> <li> You look around (typically following your nose from link to link). </li> <li> You attempt to publish your own stuff. </li> <li> You get stuck. </li> </ol> <p>The problems typically take the following form:</p> <ol> <li> Functionality confusion about the complementary Name and Address functionality of a single <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0xa108a00">URI</a> abstraction </li> <li> Terminology confusion due to conflation and over-loading of terms such as Resource, <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1b3d08f8">URL</a>, Representation, Document, etc. </li> <li> Inability to find robust tools with which to generate Linked Data from existing data sources such as relational databases, CSV files, XML, Web Services, etc. 
</li> </ol> <p>To start addressing these problems, here is a simple guide for generating and publishing Linked Data using <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1a7841e0">Virtuoso</a>.</p> <h3>Step 1 - RDF Data Generation</h3> <p>Existing RDF data can be added to the Virtuoso RDF Quad Store via a variety of built-in data loader utilities.</p> <p>Many options allow you to easily and quickly generate RDF data from other data sources:</p> <ul> <li> Install the Sponger Bookmarklet for the <a href="http://uriburner.com" id="link-id0x1aa50800">URIBurner service</a>. Bind this to your own <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1a4255e0">SPARQL</a>-compliant backend RDF database (in this scenario, your local Virtuoso instance), and then Sponge some HTTP-accessible resources. </li> <li> Convert relational DBMS data to RDF using the Virtuoso RDF Views Wizard. </li> <li> Starting with CSV files, you can <ul> <li>Place them at an HTTP-accessible location, and use the Virtuoso <a class="auto-href" href="http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html" id="link-id0x16f7ba58">Sponger</a> to convert them to RDF or; </li> <li> Use the CSV import feature to import their content into Virtuoso's relational data engine; then use the built-in RDF Views Wizard as with other <a class="auto-href" href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id0x1982ea80">RDBMS</a> data. 
</li> </ul> </li> <li> Starting from XML files, you can <ul> <li> Use Virtuoso's inbuilt XSLT-Processor for manual XML to RDF/XML transformation or;</li> <li>Leverage the Sponger Cartridge for <a class="auto-href" href="http://dbpedia.org/resource/GRDDL" id="link-id0x1b350968">GRDDL</a>, if there is a transformation service associated with your XML data source, or;</li> <li>Let the Sponger analyze the XML data source and make a best-effort transformation to RDF.</li> </ul> </li> </ul> <h3>Step 2 - Linked Data Deployment</h3> <p> Install the <a href="http://download.openlinksw.com/packages/6.2/virtuoso/fct_dav.vad" id="link-id0x19845ad0">Faceted Browser VAD package (<code>fct_dav.vad</code>)</a> which delivers the following:</p> <ol> <li> Faceted Browser Engine UI</li> <li> Dynamic Hypermedia Resource Generator <ul> <li>delivers descriptor resources for every <a class="auto-href" href="http://dbpedia.org/resource/Entity" id="link-id0x1b3a69f0">entity</a> (data object) in the Native or Virtual Quad Stores</li> <li>supports a broad array of output formats, including HTML+<a class="auto-href" href="http://dbpedia.org/resource/RDFa" id="link-id0x1a92d2f8">RDFa</a>, RDF/XML, N3/Turtle, NTriples, RDF-JSON, OData+Atom, and OData+JSON. </li> </ul> </li> </ol> <h3>Step 3 - Linked Data Consumption & Exploitation</h3> <p> Three simple steps allow you, your enterprise, and your customers to consume and exploit your newly deployed Linked Data -- </p> <ol> <li> Load a page like this in your browser: <code>http://<cname>[:<port>]/describe/?uri=<entity-uri></code> <ul> <li> <code><cname>[:<port>]</code> gets replaced by the host and port of your Virtuoso instance</li> <li> <code><entity-uri></code> gets replaced by the URI you want to see described -- for instance, the URI of one of the resources you let the Sponger handle. </li> </ul> </li> <li> Follow the links presented in the descriptor page. 
</li> <li>If you ever see a blank page with a hyperlink subject name in the About: section at the top of the page, simply add the parameter "&sp=1" to the URL in the browser's Address box, and hit [ENTER]. This will result in an "on the fly" resource retrieval, transformation, and descriptor page generation.</li> <li> Use the navigator controls to page up and down the data associated with the "in scope" resource descriptor. </li> </ol> <h3>Related</h3> <ul> <li> <a href="http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Flinkeddata.uriburner.com%2Fabout%2Fid%2Fentity%2Fhttp%2Fwww.amazon.com%2Fo%2FASIN%2F006251587X" id="link-id0x1a8aeaf8">Sample Descriptor Page</a> (what you see post completion of the steps in this post) </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1639" id="link-id0x1af66f38">What is Linked Data, really?</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1613" id="link-id0x1acdbc58">Painless Linked Data Generation via URIBurner</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtRDFInsert" id="link-id0x1abe3b18">How To Load RDF Data Into Virtuoso</a> (various methods)</li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtBulkRDFLoader" id="link-id0x1a441ff0">Virtuoso Bulk Loader Script for RDF</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtCsvFileBulkLoader" id="link-id0x190382e8">Bulk Loader Script for CSV</a> </li> <li> <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtRdb2RDFViewsGeneration#OneClickLinkedDataGenerationAndDemployment" id="link-id0x1ac9c9c0">Wizard based generation of RDF based Linked Data from ODBC accessible Relational Databases </a> </li> </ul>
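<p>The descriptor page address from Step 3 can also be assembled programmatically. The sketch below is only an illustration of the <code>/describe/?uri=</code> pattern and the optional <code>&sp=1</code> parameter described above; the function name <code>describe_url</code>, the host <code>localhost</code>, and the port <code>8890</code> are assumptions chosen for the example.</p>

```python
from urllib.parse import quote


def describe_url(host, entity_uri, port=None, sponge=False):
    """Build the /describe/ page URL for an entity URI."""
    authority = host if port is None else "%s:%d" % (host, port)
    # Percent-encode the entity URI so its own ':', '/', '?' and '#'
    # characters survive as a single parameter value.
    url = "http://%s/describe/?uri=%s" % (authority, quote(entity_uri, safe=""))
    if sponge:
        # Ask for "on the fly" retrieval when the descriptor page comes back blank.
        url += "&sp=1"
    return url


entity = "http://dbpedia.org/resource/DBpedia"
plain = describe_url("localhost", entity, port=8890)
sponged = describe_url("localhost", entity, port=8890, sponge=True)
print(plain)
print(sponged)
```

<p>Encoding the entity URI matters most when it contains a fragment (<code>#this</code>) or its own query string, since those would otherwise be interpreted as part of the descriptor page URL.</p>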
Kingsley Uyi Idehen
kidehen@openlinksw.com
2010-11-02T11:57:47.000001-04:00
Virtuoso Linked Data Deployment In 3 Simple Steps
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1642
2010-10-29T22:54:32Z
<p>Injecting <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x17012e18">Linked Data</a> into the <a href="http://dbpedia.org/resource/World_Wide_Web">Web</a> has been a major pain point for those who seek personal, service, or organization-specific variants of <a class="auto-href" href="http://dbpedia.org/resource/DBpedia" id="link-id0x196518a8">DBpedia</a>. Basically, the sequence goes something like this: </p> <ol> <li> You encounter DBpedia or the <a class="auto-href" href="http://community.linkeddata.org/dataspace/organization/lod#this" id="link-id0x1b26d008">LOD</a> Cloud Pictorial.</li> <li> You look around (typically following your nose from link to link). </li> <li> You attempt to publish your own stuff. </li> <li> You get stuck. </li> </ol> <p>The problems typically take the following form:</p> <ol> <li> Functionality confusion about the complementary Name and Address functionality of a single <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0xa108a00">URI</a> abstraction </li> <li> Terminology confusion due to conflation and over-loading of terms such as Resource, <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1b3d08f8">URL</a>, Representation, Document, etc. </li> <li> Inability to find robust tools with which to generate Linked Data from existing <a href="http://dbpedia.org/resource/Data">data</a> sources such as relational databases, CSV files, XML, Web Services, etc. 
</li> </ol> <p>To start addressing these problems, here is a simple guide for generating and publishing Linked Data using <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1a7841e0">Virtuoso</a>.</p> <h3>Step 1 - RDF Data Generation</h3> <p>Existing RDF data can be added to the Virtuoso RDF Quad Store via a variety of built-in data loader utilities.</p> <p>Many options allow you to easily and quickly generate RDF data from other data sources:</p> <ul> <li> Install the Sponger Bookmarklet for the <a href="http://uriburner.com" id="link-id0x1aa50800">URIBurner service</a>. Bind this to your own <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1a4255e0">SPARQL</a>-compliant backend RDF database (in this scenario, your local Virtuoso instance), and then Sponge some HTTP-accessible resources. </li> <li> Convert relational DBMS data to RDF using the Virtuoso RDF Views Wizard. </li> <li> Starting with CSV files, you can <ul> <li>Place them at an HTTP-accessible location, and use the Virtuoso <a class="auto-href" href="http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html" id="link-id0x16f7ba58">Sponger</a> to convert them to RDF or; </li> <li> Use the CSV import feature to import their content into Virtuoso's relational data engine; then use the built-in RDF Views Wizard as with other <a class="auto-href" href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id0x1982ea80">RDBMS</a> data. 
</li> </ul> </li> <li> Starting from XML files, you can <ul> <li> Use Virtuoso's inbuilt XSLT-Processor for manual XML to RDF/XML transformation or;</li> <li>Leverage the Sponger Cartridge for <a class="auto-href" href="http://dbpedia.org/resource/GRDDL" id="link-id0x1b350968">GRDDL</a>, if there is a transformation service associated with your XML data source, or;</li> <li>Let the Sponger analyze the XML data source and make a best-effort transformation to RDF.</li> </ul> </li> </ul> <h3>Step 2 - Linked Data Deployment</h3> <p> Install the <a href="http://download.openlinksw.com/packages/6.2/virtuoso/fct_dav.vad" id="link-id0x19845ad0">Faceted Browser VAD package (<code>fct_dav.vad</code>)</a> which delivers the following:</p> <ol> <li> Faceted Browser Engine UI</li> <li> Dynamic Hypermedia Resource Generator <ul> <li>delivers descriptor resources for every <a class="auto-href" href="http://dbpedia.org/resource/Entity" id="link-id0x1b3a69f0">entity</a> (data object) in the Native or Virtual Quad Stores</li> <li>supports a broad array of output formats, including HTML+<a class="auto-href" href="http://dbpedia.org/resource/RDFa" id="link-id0x1a92d2f8">RDFa</a>, RDF/XML, N3/Turtle, NTriples, RDF-JSON, OData+Atom, and OData+JSON. </li> </ul> </li> </ol> <h3>Step 3 - Linked Data Consumption & Exploitation</h3> <p> Three simple steps allow you, your enterprise, and your customers to consume and exploit your newly deployed Linked Data -- </p> <ol> <li> Load a page like this in your browser: <code>http://<cname>[:<port>]/describe/?uri=<entity-uri></code> <ul> <li> <code><cname>[:<port>]</code> gets replaced by the host and port of your Virtuoso instance</li> <li> <code><entity-uri></code> gets replaced by the URI you want to see described -- for instance, the URI of one of the resources you let the Sponger handle. </li> </ul> </li> <li> Follow the links presented in the descriptor page. 
</li> <li>If you ever see a blank page with a hyperlink subject name in the About: section at the top of the page, simply add the parameter "&sp=1" to the URL in the browser's Address box, and hit [ENTER]. This will result in an "on the fly" resource retrieval, transformation, and descriptor page generation.</li> <li> Use the navigator controls to page up and down the data associated with the "in scope" resource descriptor. </li> </ol> <h3>Related</h3> <ul> <li> <a href="http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Flinkeddata.uriburner.com%2Fabout%2Fid%2Fentity%2Fhttp%2Fwww.amazon.com%2Fo%2FASIN%2F006251587X" id="link-id0x1a8aeaf8">Sample Descriptor Page</a> (what you see post completion of the steps in this post) </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1639" id="link-id0x1af66f38">What is Linked Data, really?</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1613" id="link-id0x1acdbc58">Painless Linked Data Generation via URIBurner</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtRDFInsert" id="link-id0x1abe3b18">How To Load RDF Data Into Virtuoso</a> (various methods)</li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtBulkRDFLoader" id="link-id0x1a441ff0">Virtuoso Bulk Loader Script for RDF</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtCsvFileBulkLoader" id="link-id0x190382e8">Bulk Loader Script for CSV</a> </li> <li> <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtRdb2RDFViewsGeneration#OneClickLinkedDataGenerationAndDemployment" id="link-id0x1ac9c9c0">Wizard based generation of RDF based Linked Data from ODBC accessible Relational Databases </a> </li> </ul>
Kingsley Uyi Idehen
kidehen@openlinksw.com
2010-11-02T11:55:31.000005-04:00
SPARQL Guide for the Perl Developer
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1655
2011-01-25T16:05:17Z
<h3>What?</h3> <p>A simple guide usable by any <a class="auto-href" href="http://dbpedia.org/resource/Perl" id="link-id0x1bdcab80">Perl</a> developer seeking to exploit <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x17b447e8">SPARQL</a> without hassles.</p> <h3>Why?</h3> <p>SPARQL is a powerful query language, results serialization format, and an HTTP based <a href="http://dbpedia.org/resource/Data">data</a> access protocol from the W3C. It provides a mechanism for accessing and integrating data across <a href="http://en.wikipedia.org/wiki/Deductive_database" id="link-id0x1cc76540">Deductive Database Systems</a> (colloquially referred to as triple or quad stores in <a class="auto-href" href="http://dbpedia.org/resource/Semantic_Web" id="link-id0x1d944d78">Semantic Web</a> and <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1c7a87c8">Linked Data</a> circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form. </p> <h3>How?</h3> <p>SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing.</p> <h4>Steps:</h4> <ol> <li>Determine which SPARQL endpoint you want to access e.g. <a href="http://dbpedia.org/sparql" id="link-id0x1d476520">DBpedia</a> or a local <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1bcfe140">Virtuoso</a> instance (typically: http://localhost:8890/sparql). 
</li> <li>If using Virtuoso, and you want to populate its quad store using SPARQL, assign "<a href="http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsupportedprotocolendpointuri" id="link-id0x1c7630b8">SPARQL_SPONGE</a>" privileges to user "SPARQL" (this is basic control, more sophisticated WebID based ACLs are available for controlling SPARQL access).</li> </ol> <h4>Script:</h4>
<pre>
#
# Demonstrating use of a single query to populate a
# Virtuoso Quad Store via Perl.
#

#
# HTTP <a href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1d6465e8">URL</a> is constructed accordingly with CSV query results format as the default via mime type.
#

use CGI qw/:standard/;
use LWP::UserAgent;
use Data::Dumper;
use Text::CSV_XS;

sub sparqlQuery {
  my $query=shift;
  my $baseURL=shift;
  my $format=shift;

  my %params=(
    "default-graph" => "",
    "should-sponge" => "soft",
    "query" => $query,
    "debug" => "on",
    "timeout" => "",
    "format" => $format,
    "save" => "display",
    "fname" => ""
  );

  # URL-encode each parameter and join them into a query string
  my @fragments=();
  foreach my $k (keys %params) {
    my $fragment="$k=".CGI::escape($params{$k});
    push(@fragments,$fragment);
  }
  $query=join("&", @fragments);

  my $sparqlURL="${baseURL}?$query";

  my $ua = LWP::UserAgent->new;
  $ua->agent("MyApp/0.1 ");
  my $req = HTTP::Request->new(GET => $sparqlURL);
  my $res = $ua->request($req);
  my $str=$res->content;

  my $csv = Text::CSV_XS->new();

  # Parse the CSV response line by line into an array of rows
  my @rows;
  foreach my $line ( split(/^/, $str) ) {
    $csv->parse($line);
    my @bits=$csv->fields();
    push(@rows, [ @bits ] );
  }
  return \@rows;
}

# Setting Data Source Name (DSN)

$dsn="http://dbpedia.org/resource/DBpedia";

# Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET using the IRI in
# FROM clause as Data Source URL en route to DBMS
# record Inserts.

$query="DEFINE get:soft \"replace\"\n
# Generic (non Virtuoso specific) SPARQL
# Note: this will not add records to the
# DBMS
SELECT DISTINCT * FROM <$dsn> WHERE {?s ?p ?o}";

$data=sparqlQuery($query, "http://localhost:8890/sparql/", "text/csv");

print "Retrieved data:\n";
print Dumper($data);
</pre>
<h4>Output</h4>
<pre>
Retrieved data:
$VAR1 = [
          [
            's',
            'p',
            'o'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://www.w3.org/2002/07/owl#Thing'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://dbpedia.org/ontology/Work'
          ],
          [
            'http://dbpedia.org/resource/DBpedia',
            'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
            'http://dbpedia.org/class/yago/Software106566077'
          ],
...
</pre>
<h3>Conclusion</h3> <p> CSV was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Perl developer that already knows how to use Perl for HTTP based data access. 
SPARQL just provides an added bonus to URL dexterity (delivered via <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0x1d29da98">URI</a> abstraction) with regards to constructing Data Source Names or Addresses.</p> <h3>Related</h3> <ul> <li> <a href="http://cpansearch.perl.org/src/TOBYINK/RDF-Query-Client-0.103/README" id="link-id0x1c279130">RDF::Query::Client Guide</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1653" id="link-id0x1cf307f0">SPARQL Guide for the Perl Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1652" id="link-id0x1b0ffb28">SPARQL Guide for the PHP Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1651" id="link-id0x1a8c5ae0">SPARQL Guide for the Python Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1648" id="link-id0x1b86ad28">SPARQL Guide for the Ruby Developer</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1646" id="link-id0x1c7af188">Simple Guide for using SPARQL with Virtuoso</a> </li> <li> <a href="http://www.delicious.com/kidehen/sparql_tutorial" id="link-id0x1ac1ba48">General SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.delicious.com/kidehen/virtuoso_sparql_tutorial" id="link-id0x1c7be660">Virtuoso Specific SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1567" id="link-id0x1c52b438">The URI, URL, and Linked Data Meme's Generic HTTP URI</a>. </li> </ul>
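<p>The CSV result shape used in this guide (a header row of variable names followed by one row per solution) lends itself to keyed records as well as positional arrays. The sketch below shows that idea in Python for illustration only; the helper name <code>csv_to_records</code> is hypothetical, and the quoting style of the <code>SAMPLE</code> text is an assumption about how an endpoint might serialize the rows shown in the Output section above.</p>

```python
import csv
import io

# One header row plus one solution, modelled on the Output section of this guide.
SAMPLE = (
    '"s","p","o"\n'
    '"http://dbpedia.org/resource/DBpedia",'
    '"http://www.w3.org/1999/02/22-rdf-syntax-ns#type",'
    '"http://www.w3.org/2002/07/owl#Thing"\n'
)


def csv_to_records(text):
    """Turn CSV results into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


records = csv_to_records(SAMPLE)
for record in records:
    print(record["s"], record["p"], record["o"])
```

<p>Keying each row by variable name avoids depending on column order, at the cost of losing the type information (URI versus literal) that the JSON results format carries.</p>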
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-01-26T18:11:13-05:00
SPARQL for the Ruby Developer
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1648
2011-01-18T19:48:34Z
<h3>What?</h3> <p>A simple guide usable by any <a class="auto-href" href="http://dbpedia.org/resource/Ruby_programming_language" id="link-id0x1bb88908">Ruby</a> developer seeking to exploit <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1ae67500">SPARQL</a> without hassles.</p> <h3>Why?</h3> <p>SPARQL is a powerful query language, results serialization format, and an HTTP based <a href="http://dbpedia.org/resource/Data">data</a> access protocol from the W3C. It provides a mechanism for accessing and integrating data across <a href="http://en.wikipedia.org/wiki/Deductive_database" id="link-id0x1bc61d88">Deductive Database Systems</a> (colloquially referred to as triple or quad stores in <a class="auto-href" href="http://dbpedia.org/resource/Semantic_Web" id="link-id0x1cc11420">Semantic Web</a> and <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x1b2e7780">Linked Data</a> circles) -- database systems (or data spaces) that manage proposition oriented records in 3-tuple (triples) or 4-tuple (quads) form. </p> <h3>How?</h3> <p>SPARQL queries are actually HTTP payloads (typically). Thus, using a RESTful client-server interaction pattern, you can dispatch calls to a SPARQL compliant data server and receive a payload for local processing e.g. local object binding re. Ruby. </p> <h4>Steps:</h4> <ol> <li> From your command line execute: aptitude search '^ruby', to verify Ruby is in place </li> <li>Determine which SPARQL endpoint you want to access e.g. <a href="http://dbpedia.org/sparql" id="link-id0x1d476520">DBpedia</a> or a local <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1bcfe140">Virtuoso</a> instance (typically: http://localhost:8890/sparql). 
</li> <li>If using Virtuoso, and you want to populate its quad store using SPARQL, assign "<a href="http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsupportedprotocolendpointuri" id="link-id0x1c7630b8">SPARQL_SPONGE</a>" privileges to user "SPARQL" (this is basic control; more sophisticated WebID based ACLs are available for controlling SPARQL access).</li> </ol> <h4>Script:</h4> <pre>
#!/usr/bin/env ruby
#
# Demonstrating use of a single query to populate a
# Virtuoso Quad Store.
#

require 'net/http'
require 'cgi'
require 'csv'

#
# We opt for CSV based output since handling this format is straightforward in Ruby, by default.
# HTTP <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1acee348">URL</a> is constructed accordingly with CSV as query results format in mind.

def sparqlQuery(query, baseURL, format="text/csv")
  params={
    "default-graph" => "",
    "should-sponge" => "soft",
    "query" => query,
    "debug" => "on",
    "timeout" => "",
    "format" => format,
    "save" => "display",
    "fname" => ""
  }
  querypart=""
  params.each { |k,v|
    querypart+="#{k}=#{CGI.escape(v)}&"
  }

  sparqlURL=baseURL+"?#{querypart}"

  response = Net::HTTP.get_response(<a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Identifier" id="link-id0x1d24dfd8">URI</a>.parse(sparqlURL))

  return CSV::parse(response.body)
end

# Setting Data Source Name (DSN)
dsn="http://dbpedia.org/resource/DBpedia"

#Virtuoso pragmas for instructing SPARQL engine to perform an HTTP GET
#using the IRI in FROM clause as Data Source URL
query="DEFINE get:soft \"replace\"
SELECT DISTINCT * FROM <#{dsn}> WHERE {?s ?p ?o} "

#Assume use of local installation of Virtuoso
#otherwise you can change URL to that of a public endpoint
#for example DBpedia: http://dbpedia.org/sparql
data=sparqlQuery(query, "http://localhost:8890/sparql/")

puts "Got data:"
p data

#
# End
</pre><h4>Output</h4> <pre>
Got data:
[["s", "p", "o"], ["http://dbpedia.org/resource/DBpedia", 
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://www.w3.org/2002/07/owl#Thing"], ["http://dbpedia.org/resource/DBpedia", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://dbpedia.org/ontology/Work"], ["http://dbpedia.org/resource/DBpedia", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://dbpedia.org/class/yago/Software106566077"], ... </pre> <h3>Conclusion</h3> <p> <a href="http://en.wikipedia.org/wiki/Comma-separated_values" id="link-id0x1cac8420">CSV</a> was chosen over XML (re. output format) since this is about a "no-brainer installation and utilization" guide for a Ruby developer that already knows how to use Ruby for HTTP based data access. SPARQL just provides an added bonus to URL dexterity (delivered via URI abstraction) with regards to constructing Data Source Names or Addresses.</p> <h3>Related</h3> <ul> <li> <a href="http://www.taxonconcept.org/how-to/ruby-code-examples/how-do-i-use-ruby-to-query-a-sparql-endpoint.html" id="link-id0x1aa83678">SPARQL and Ruby SPARQL Client Library Example</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1646" id="link-id0x1c7af188">Simple Guide for using SPARQL with Virtuoso</a> </li> <li> <a href="http://www.delicious.com/kidehen/sparql_tutorial" id="link-id0x1ac1ba48">General SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.delicious.com/kidehen/virtuoso_sparql_tutorial" id="link-id0x1c7be660">Virtuoso Specific SPARQL Tutorial Collection</a> </li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1567" id="link-id0x1c52b438">The URI, URL, and Linked Data Meme's Generic HTTP URI</a>. </li> </ul>
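<p>As a complement to the script above, here is a minimal sketch of the same two steps (building the SPARQL protocol URL and reading a CSV payload) that runs without a server. The helper names <code>sparql_url</code> and <code>rows_to_hashes</code> are hypothetical, the sample CSV text is hand-written from the output shown earlier, and the <code>format</code> parameter is assumed to be honored by the endpoint, as it is by the Virtuoso endpoint used in this post.</p>

```ruby
require 'uri'
require 'csv'

# Hypothetical helper: build a SPARQL protocol GET URL.
# Assumption: the endpoint accepts a "format" parameter (Virtuoso does).
def sparql_url(endpoint, query, format = "text/csv")
  uri = URI.parse(endpoint)
  # encode_www_form escapes both values and joins them with "&"
  uri.query = URI.encode_www_form("query" => query, "format" => format)
  uri.to_s
end

# Hypothetical helper: turn a CSV result payload (header row first)
# into an array of hashes keyed by variable name.
def rows_to_hashes(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

# Only the URL string is built here; no HTTP request is made.
puts sparql_url("http://localhost:8890/sparql",
                "SELECT DISTINCT * WHERE {?s ?p ?o} LIMIT 5")

# Hand-written sample payload in the shape of the output above.
sample = "s,p,o\n" \
         "http://dbpedia.org/resource/DBpedia," \
         "http://www.w3.org/1999/02/22-rdf-syntax-ns#type," \
         "http://www.w3.org/2002/07/owl#Thing\n"
rows_to_hashes(sample).each { |row| puts "#{row['s']} #{row['p']} #{row['o']}" }
```

<p>Keying each row by variable name is a design choice for readability; the array-of-arrays form returned by the original script works just as well.</p>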
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-01-25T10:17:12.000002-05:00
Simple Virtuoso Installation & Utilization Guide for SPARQL Users (Update 5)
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1647
2011-01-16T07:06:21Z
<h3>What is <a class="auto-href" href="http://dbpedia.org/resource/SPARQL" id="link-id0x1ab60ac0">SPARQL</a>?</h3> <p>A declarative query language from the W3C for querying structured propositional <a href="http://dbpedia.org/resource/Data">data</a> (in the form of 3-<a href="http://en.wikipedia.org/wiki/Tuple" id="link-id0x1b1e0010">tuple</a> [triples] or 4-tuple [quads] records) stored in a <a href="http://en.wikipedia.org/wiki/Deductive_database" id="link-id0x1cf8af98">deductive database</a> (colloquially referred to as triple or quad stores in <a class="auto-href" href="http://dbpedia.org/resource/Semantic_Web" id="link-id0x1caf5050">Semantic Web</a> and <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x19d781b8">Linked Data</a> parlance).</p> <p>SPARQL is inherently platform independent. Like <a class="auto-href" href="http://dbpedia.org/resource/SQL" id="link-id0x1b879140">SQL</a>, the query language and the backend database engine are distinct. Database clients capture SPARQL queries which are then passed on to compliant backend databases.</p> <h3>Why is it important?</h3> <p>Like SQL for relational databases, it provides a powerful mechanism for accessing and joining data across one or more data partitions (named graphs identified by IRIs). The aforementioned capability also enables the construction of sophisticated Views, Reports (HTML or those produced in native form by desktop productivity tools), and data streams for other services.</p> <p>Unlike SQL, SPARQL includes result serialization formats and an HTTP based wire protocol. 
Thus, the ubiquity and sophistication of HTTP is integral to SPARQL i.e., client side applications (user agents) only need to be able to perform an HTTP GET against a <a class="auto-href" href="http://dbpedia.org/resource/Uniform_Resource_Locator" id="link-id0x1ba287e8">URL</a> en route to exploiting the power of SPARQL.</p> <h3>How do I use it, generally?</h3> <ol> <li>Locate a SPARQL endpoint (<a href="http://dbpedia.org/sparql" id="link-id0x1d7436b0">DBpedia</a>, <a href="http://lod.openlinksw.com/sparql" id="link-id0x1bf20690">LOD Cloud Cache</a>, <a href="http://semantic.data.gov" id="link-id0x1a8ebc28">Data.Gov</a>, <a href="http://linkeddata.uriburner.com/sparql" id="link-id0x1be93070">URIBurner</a>, <a href="http://www.delicious.com/kidehen/sparql_endpoint" id="link-id0x1cce9b40">others</a>), or;</li> <li>Install a SPARQL compliant database server (quad or triple store) on your desktop, workgroup server, data center, or cloud (e.g., <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtuosoEC2AMI" id="link-id0x1cd697a0">Amazon EC2 AMI</a>)</li> <li>Start the database server</li> <li>Execute SPARQL Queries via the <a href="http://lod.openlinksw.com/sparql" id="link-id0x1b99d790">SPARQL endpoint.</a> </li> </ol> <h3>How do I use SPARQL with <a class="auto-href" href="http://virtuoso.openlinksw.com" id="link-id0x1c9adc80">Virtuoso</a>?</h3> <p>What follows is a very simple guide for using SPARQL against your own instance of Virtuoso:</p> <ol> <li>Software Download and Installation</li> <li>Data Loading from Data Sources exposed at Network Addresses (e.g. 
HTTP URLs) using very simple methods</li> <li>Actual SPARQL query execution via SPARQL endpoint.</li> </ol> <h3>Installation Steps</h3> <ol> <li> Download <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSDownload" id="link-id0x1b795100">Virtuoso Open Source</a> or <a href="http://download.openlinksw.com/virtwiz/virtuoso.php" id="link-id0x1cce46f0">Virtuoso Commercial</a> Editions </li> <li> Run installer (if using Commercial edition or Windows Open Source Edition, otherwise follow build guide) </li> <li> Follow post-installation guide and verify installation by typing in the command: virtuoso -? (if this fails, check you've followed installation and setup steps, then verify environment variables have been set) </li> <li> Start the Virtuoso server using the command: virtuoso-start.sh </li> <li> Verify you have a connection to the Virtuoso Server via the command: isql localhost (assuming you're using default DB settings) or the command: isql localhost:1112 (assuming demo database) or go to your browser and type in: http://<virtuoso-server-host-name>:[port]/conductor (e.g. http://localhost:8889/conductor for default DB or http://localhost:8890/conductor if using Demo DB) </li> <li> Go to SPARQL endpoint which is typically -- http://<virtuoso-server-host-name>:[port]/sparql </li> <li> Run a quick sample query (since the database always has system data in place): select distinct * where {?s ?p ?o} limit 50 .</li> </ol> <h3>Troubleshooting</h3> <ol> <li>Ensure environment settings are set and functional. If using Mac OS X or Windows, you don't have to worry about this; just start and stop your Virtuoso server using the native OS services applets.</li> <li>If using the Open Source Edition, follow the <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSMake#Getting%20Started" id="link-id0x1bfa7548">getting started guide</a> -- it covers PATH and startup directory location re. 
starting and stopping Virtuoso servers.</li> <li>Sponging (HTTP GETs against external Data Sources) within SPARQL queries is disabled by default. You can enable this feature by assigning "<a href="http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsupportedprotocolendpointuri" id="link-id0x1d566270">SPARQL_SPONGE</a>" privileges to user "SPARQL". Note, more sophisticated security exists via <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtAuthPolicyFOAFSSL" id="link-id0x1a3c9eb8">WebID based ACLs</a>. </li> </ol> <h3>Data Loading Steps</h3> <ol> <li> Identify an RDF based structured data source of interest -- a file that contains 3-tuple / triples available at an address on a public or private HTTP based network </li> <li>Determine the Address (URL) of the RDF data source</li> <li>Go to your Virtuoso SPARQL endpoint and type in the following SPARQL query: DEFINE GET:SOFT "replace" SELECT DISTINCT * FROM <RDFDataSourceURL> WHERE {?s ?p ?o} </li> <li> All the triples in the RDF resource (data source accessed via URL) will be loaded into the Virtuoso Quad Store (using RDF Data Source URL as the internal quad store Named Graph IRI) as part of the SPARQL query processing pipeline. </li> </ol> <p> Note: the data source URL doesn't even have to be RDF based -- which is where the Virtuoso <a class="auto-href" href="http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html" id="link-id0x1d1a0978">Sponger</a> Middleware comes into play (download and install the <a href="http://s3.amazonaws.com/opldownload/uda/vad-packages/6.1/virtuoso/rdf_mappers_dav.vad" id="link-id0x1d0e1530">VAD installer package</a> first) since it delivers the following features to Virtuoso's SPARQL engine: </p> <ol> <li> Transformation of data from non RDF data sources (file content, hypermedia resources, <a href="http://dbpedia.org/resource/World_Wide_Web">web</a> services output etc..) 
into RDF based 3-tuples (triples)</li> <li> Cache Invalidation Scheme Construction -- thus, subsequent queries will not require the define get:soft "replace" pragma, except when you want to forcefully override the cache.</li> <li> If you have very large data sources like DBpedia etc. from CKAN, simply use our <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtBulkRDFLoader" id="link-id0x1d19b4b0">bulk loader</a>. </li> </ol> <h3>SPARQL Endpoint Discovery</h3> <p>Public SPARQL endpoints are emerging at an ever increasing rate. Thus, we've set up a DNS lookup service that provides access to a large number of SPARQL endpoints. Of course, this doesn't cover all existing endpoints, so if your endpoint is missing please ping <a class="auto-href" href="http://myopenlink.net/dataspace/person/kidehen#this" id="link-id0x1d634848">me</a>.</p> <p>Here is a collection of commands for using DNS-SD to discover SPARQL endpoints:</p> <ol> <li>dns-sd -B _sparql._tcp sparql.openlinksw.com -- browse for service instances</li> <li>dns-sd -Z _sparql._tcp sparql.openlinksw.com -- output results in Zone File format</li> </ol> <h3>Related</h3> <ol> <li> <a href="http://www.ensta.fr/~diam/ruby/online/ruby-doc-stdlib/libdoc/net/http/rdoc/index.html" id="link-id0x1b156610">Using HTTP from Ruby</a> -- you can just make <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSSparqlProtocol" id="link-id0x1d024d60">SPARQL Protocol URLs</a> re. SPARQL</li> <li> <a href="http://sparql.rubyforge.org/client/" id="link-id0x1cd43a48">Using SPARQL Endpoints via Ruby</a> -- Ruby example using DBpedia endpoint</li> <li> <a href="http://wikis.openlinksw.com/dataspace/owiki/wiki/OATWikiWeb/InteractiveSparqlQueryBuilder" id="link-id0x1b9d2190">Interactive SPARQL Query By Example (QBE) tool</a> -- provides a graphical user interface (as is common in SQL realm re. 
query building against <a class="auto-href" href="http://dbpedia.org/resource/Relational_database_management_system" id="link-id0x1bfffb70">RDBMS</a> engines) that works with any SPARQL endpoint </li> <li> <a href="http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtRDFInsert" id="link-id0x1ab63de0">Other methods of loading RDF data into Virtuoso</a> </li> <li> <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger" id="link-id0x1ca248e0">Virtuoso Sponger</a> -- architecture and how it turns a wide variety of non RDF data sources into SPARQL accessible data </li> <li> <a href="http://ode.openlinksw.com/example.html" id="link-id0x1be34758">Using OpenLink Data Explorer</a> (ODE) to populate Virtuoso -- locate a resource of interest; click on a bookmarklet or use <a class="auto-href" href="http://dbpedia.org/resource/Context_%28language_use%29" id="link-id0x1ca84af0">context</a> menus (if using ODE extensions for Firefox, Safari, or Chrome); and you'll have SPARQL accessible data automatically inserted into your Virtuoso instance. 
</li> <li> <a href="http://www.openlinksw.com/dataspace/kidehen@openlinksw.com/weblog/kidehen@openlinksw.com%27s%20BLOG%20%5B127%5D/1295" id="link-id0x1c9060f0">W3C's SPARQLing Data Access Ingenuity</a> -- an older generic SPARQL introduction post </li> <li> <a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSPARQLRef" id="link-id0x1cf1e298">Collection of SPARQL Query Examples </a>-- GoodRelations (Product Offers), <a class="auto-href" href="http://dbpedia.org/resource/Friend_of_a_friend" id="link-id0x1c0445d0">FOAF</a> (Profiles), <a class="auto-href" href="http://dbpedia.org/resource/SIOC" id="link-id0x1b785e48">SIOC</a> (Data Spaces -- <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleBlog" id="link-id0x1b6c9f78">Blogs</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleWiki" id="link-id0x1c188280">Wikis</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleBookmarks" id="link-id0x1a9a8f98">Bookmarks</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleFeeds" id="link-id0x1720c658">Feed Collections</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleGallery" id="link-id0x1cdba348">Photo Galleries</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleBriefcase" id="link-id0x1c8f1148">Briefcase/DropBox</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleAddressbook" id="link-id0x1b5eb7e0">AddressBook</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleCalendar" id="link-id0x1c575120">Calendars</a>, <a href="http://ods.openlinksw.com/dataspace/dav/wiki/ODS/ODSAtomOWLRefExampleDiscussions" id="link-id0x1c73be98">Discussion Forums</a>) </li> <li> <a href="http://lod.openlinksw.com/demo_queries/" id="link-id0x1b08aa00">Collection of Live SPARQL Queries against LOD Cloud Cache</a> -- simple and advanced queries. 
</li> </ol>
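<p>The Data Loading Steps above can be sketched in Ruby as plain query construction. This is only an illustration: the helper name <code>load_query</code> and the example.org data source URL are hypothetical, and the <code>DEFINE get:soft "replace"</code> pragma is Virtuoso specific, so the first form is assumed to be sent to a Virtuoso endpoint with sponging enabled.</p>

```ruby
# Hypothetical helper: build the SPARQL text for the data loading step.
# With sponge: true, the Virtuoso specific pragma is prepended so the
# engine fetches the data source named in the FROM clause.
# With sponge: false, a plain (generic) SPARQL query is returned, which
# is what subsequent queries against the cached graph would use.
def load_query(source_url, sponge: true)
  pragma = sponge ? %(DEFINE get:soft "replace"\n) : ""
  "#{pragma}SELECT DISTINCT * FROM <#{source_url}> WHERE {?s ?p ?o}"
end

# Placeholder data source URL, used for illustration only.
source = "http://example.org/data.rdf"

puts load_query(source)                 # first load, with the pragma
puts load_query(source, sponge: false)  # later reads, without the pragma
```

<p>Either string can then be pasted into the SPARQL endpoint page or sent over HTTP as the value of the query parameter.</p>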
Kingsley Uyi Idehen
kidehen@openlinksw.com
2011-01-19T10:43:35-05:00
6 Things That Must Remain Distinct re. Data
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?id=1643
2010-11-03T17:02:32Z
<p>Conflation is the tech industry's equivalent of macroeconomic inflation. Whenever it rears its head, we lose value courtesy of diminishing productivity.</p> <p>Looking retrospectively at any technology failure -- enterprises or industry at large -- you will eventually discover -- at the core -- messy conflation of at least one of the following:</p> <ol> <li> <a href="http://dbpedia.org/resource/Data">Data</a> Model (Semantics) </li> <li> Data Object (<a class="auto-href" href="http://dbpedia.org/resource/Entity" id="link-id0x138a4c88">Entity</a>) Names (Identifiers) </li> <li> Data Representation Syntax (Markup) </li> <li> Data Access Protocol </li> <li> Data Presentation Syntax (Markup) </li> <li> Data Presentation Media. </li> </ol> <p>The <a class="auto-href" href="http://dbpedia.org/resource/Internet" id="link-id0x1b4a9918">Internet</a> & <a class="auto-href" href="http://dbpedia.org/resource/World_Wide_Web" id="link-id0x1a8f8700">World Wide Web</a> (InterWeb) are massive successes because their respective architectural cores embody the critical separation outlined above.</p> <p>The <a href="http://dbpedia.org/resource/World_Wide_Web">Web</a> of <a class="auto-href" href="http://dbpedia.org/resource/Linked_Data" id="link-id0x156246e0">Linked Data</a> is going to become a global reality, and massive success, because it leverages inherently sound architecture -- bar conflationary distractions of RDF. :-)</p>
Kingsley Uyi Idehen
kidehen@openlinksw.com
2010-11-04T11:01:39.000002-04:00