http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/
Kingsley Idehen's Blog Data Space
I have seen the future and it's full of Linked Data! :-)
kidehen@openlinksw.com
kidehen@openlinksw.com
2024-03-29T05:48:59Z
Virtuoso Universal Server 08.03.3327
http://www.openlinksw.com:443/weblog/public/images/vbloglogo.gif
Origins of the Term: Middleware
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2006-07-19#1015
2006-07-19T04:41:47Z
2006-07-19T03:36:50-04:00
<p>A nice link to a 2005 post by Nick Gall about the <a href="http://radio.weblogs.com/0126951/2005/07/30.html#a194">Origins of the term: Middleware</a>.</p>
New Toolkit for Rich Web Applications
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2006-07-19#1014
2006-07-19T00:26:36Z
2006-07-18T21:39:38.000002-04:00
<p> <a href="http://tirania.org/blog/archive/2006/Jun-28-1.html">New Toolkit for Rich Web Applications</a>: "</p> <p>The other day I ran into <a href="http://www.jitsu.org/jitsu/">Jitsu</a>, a new toolkit for creating Ajax-y applications. </p> <p>Jitsu takes an interesting <a href="http://www.jitsu.org/jitsu/guide/approach.html">approach</a> in the Ajaxy space."</p> <p>(Via <a href="http://tirania.org/blog/index.html">Miguel de Icaza</a>.)</p>
Intermediate RDF Bulk Loading (Wikipedia & Wordnet) Experiment Results
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2006-07-18#1013
2006-07-18T15:21:28Z
2006-07-18T14:28:58.000004-04:00
<p> <a href="http://www.openlinksw.com/weblog/oerling/">Orri</a> shares his findings from internal experimentation re. <a href="http://virtuoso.openlinksw.com/wiki/main/">Virtuoso</a> and bulk loading RDF content such as <a href="http://labs.systemone.at/wikipedia3">Wikipedia3</a> and <a href="http://wordnet.princeton.edu/~agraves/index_archivos/Page381.htm">Wordnet</a> Data Sets:</p> <p>Here is a dump of the post titled: <a href="http://www.openlinksw.com/weblog/oerling/?id=1010">Intermediate RDF Loading Results</a>:</p> <blockquote> <p>Following from the post about a new <a href="http://www.openlinksw.com/weblogs/oerling/index.vspx?page=&id=1000">Multithreaded RDF Loader</a>, here are some intermediate results and action plans based on my findings.</p> <p>The experiments were made on a dual 1.6GHz Sun SPARC with 4G RAM and 2 SCSI disks. The data sets were the 48M triple Wikipedia data set and the 1.9M triple Wordnet data set. 100% CPU means one CPU constantly active. 100% disk means one thread blocked on the read system call at all times.</p> <p>Starting with an empty database, loading the Wikipedia set took 315 minutes, amounting to about 2500 triples per second. After this, loading the Wordnet data set with cold cache and 48M triples already in the table took 4 minutes 12 seconds, amounting to 6838 triples per second. Loading the Wikipedia data had CPU usage up to 180% but over the whole run CPU usage was around 50% with disk I/O around 170%. Loading the larger data set was significantly I/O bound while loading the smaller set was more CPU bound, yet was not at full 200% CPU.</p> <p>The RDF quad table was indexed on GSPO and PGOS. As one would expect, the bulk of I/O was on the PGOS index. We note that the pages of this index were on the average only 60% full. Thus the most relevant optimization seems to be to fill the pages closer to 90%. 
This will directly cut about a third of all I/O plus will have an additional windfall benefit in the form of better disk cache hit rates resulting from a smaller database.</p> <p>The most practical way of having full index pages in the case of unpredictable random insert order will be to take sets of adjacent index leaf pages and compact the rows so that the last page of the set goes empty. Since this is basically an I/O optimization, this should be done when preparing to write the pages to disk, hence concerning mostly old dirty pages. Insert and update times will not be affected since these operations will not concern themselves with compaction. Thus the CPU cost of background compaction will be negligible in comparison with writing the pages to disk. Naturally this will benefit any relational application as well as free text indexing. RDF and free text will be the largest beneficiaries due to the large numbers of short rows inserted in random order.</p> <p>Looking at the CPU usage of the tests, locating the place in the index where to insert, which by rights should be the bulk of the time cost, was not very significant, only about 15%. Thus there are many unused possibilities for optimization, for example rewriting in C some parts of the loader that are currently done as stored procedures. Also the thread usage of the loader, with one thread parsing and mapping IRI strings to IRI IDs and 6 threads sharing the inserting, could be refined for better balance, as we have noted that the parser thread sometimes forms a bottleneck. Doing the updating of the IRI name to IRI ID mapping on the insert thread pool would produce some benefit.</p> <p>Anyway, since the most important test was I/O bound, we will first implement some background index compaction and then revisit the experiment. We expect to be able to double the throughput of the Wikipedia data set loading.</p> </blockquote>
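<p>As a rough sanity check of two figures quoted above, here is a small Python sketch. It assumes (my assumption, not stated in Orri's post) that the number of index pages needed is inversely proportional to the page fill factor and that page I/O scales with the page count. The function names are hypothetical helpers, not part of Virtuoso.</p>

```python
# Back-of-the-envelope check of figures quoted in the post.
# Assumption: pages needed ~ 1 / fill factor, and page I/O ~ page count.

def triples_per_second(triples: int, seconds: float) -> float:
    """Average load rate over the whole run."""
    return triples / seconds

def io_saved_by_fill(old_fill: float, new_fill: float) -> float:
    """Fraction of pages (and so page I/O) saved by packing rows more densely."""
    return 1.0 - old_fill / new_fill

# 48M triples loaded in 315 minutes (figures from the post).
wikipedia_rate = triples_per_second(48_000_000, 315 * 60)

# Index pages going from 60% full to 90% full (figures from the post).
page_saving = io_saved_by_fill(0.60, 0.90)

print(round(wikipedia_rate), round(page_saving, 3))
```

<p>Under those assumptions the Wikipedia rate comes out near the "about 2500 triples per second" quoted, and the saving from fuller pages comes out at one third, matching the "about a third of all I/O" estimate.</p>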
Web 2.0 Self-Experiment aids Web 3.0 comprehension
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2006-07-17#1009
2006-07-17T21:46:42Z
2006-07-18T01:17:43-04:00
<blockquote> <p> <a href="http://vzach.blogspot.com/2006/07/web-20-self-experiment.html">Web 2.0 Self-Experiment</a>: "</p> <blockquote>I shopped for everything except food on eBay. When working with foreign-language documents, I used translations from Babel Fish. (This worked only so well. After a Babel Fish round-trip through Italian, the preceding sentence reads, 'That one has only worked therefore well.') Why use up space storing files on my own hard drive when, thanks to certain free utilities, I can store them on Gmail's servers? I saved, sorted, and browsed photos I uploaded to Flickr. I used Skype for my phone calls, decided on books using Amazon's recommendations rather than 'expert' reviews, killed time with videos at YouTube, and listened to music through customizable sites like Pandora and Musicmatch. I kept my schedule on Google Calendar, my to-do list on Voo2do, and my outlines on iOutliner. I voyeured my neighborhood's home values via Zillow. I even used an online service for each stage of the production of this article, culminating in my typing right now in Writely rather than Word. (Being only so confident that Writely wouldn't somehow lose my work -- or as Babel Fish might put it, 'only confident therefore' -- I backed it up into Gmail files.)</blockquote> <a href="http://www.technologyreview.com/read_article.aspx?id=17061&ch=infotech">Interesting article</a>, Tim O'Reilly's response is <a href="http://radar.oreilly.com/archives/2006/07/levels_of_the_game.html">here</a>" <p>(Via <a href="http://vzach.blogspot.com">Valentin Zacharias (Student)</a>.)</p> </blockquote> <p>Tim O'Reilly's response provides the following hierarchy for Web 2.0, based on what he calls "Web 2.0-ness":</p> <blockquote> <p>Level 3: The application could ONLY exist on the net, and draws its essential power from the network and the connections it makes possible between people or applications. 
These are applications that harness network effects to get better the more people use them. EBay, craigslist, Wikipedia, del.icio.us, Skype, (and yes, Dodgeball) meet this test. They are fundamentally driven by shared online activity. The web itself has this character, which Google and other search engines have then leveraged. (You can search on the desktop, but without link activity, many of the techniques that make web search work so well are not available to you.) Web crawling is one of the fundamental Web 2.0 activities, and search applications like Adsense for Content also clearly have Web 2.0 at their heart. I had a conversation with Eric Schmidt, the CEO of Google, the other day, and he summed up his philosophy and strategy as "Don't fight the internet." In the hierarchy of web 2.0 applications, the highest level is to embrace the network, to understand what creates network effects, and then to harness them in everything you do.</p> <p> Level 2: The application could exist offline, but it is uniquely advantaged by being online. Flickr is a great example. You can have a local photo management application (like iPhoto) but the application gains remarkable power by leveraging an online community. In fact, the shared photo database, the online community, and the artifacts it creates (like the tag database) is central to what distinguishes Flickr from its offline counterparts. And its fuller embrace of the internet (for example, that the default state of uploaded photos is "public") is what distinguishes it from its online predecessors.</p> <p> Level 1: The application can and does exist successfully offline, but it gains additional features by being online. Writely is a great example. If you want to do collaborative editing, its online component is terrific, but if you want to write alone, as Fallows did, it gives you little benefit (other than availability from computers other than your own.) 
</p> <p> Level 0: The application has primarily taken hold online, but it would work just as well offline if you had all the data in a local cache. MapQuest, Yahoo! Local, and Google Maps are all in this category (but mashups like housingmaps.com are at Level 3.) To the extent that online mapping applications harness user contributions, they jump to Level 2.</p> </blockquote> <p>So, in a sense, we have near-conclusive confirmation that Web 2.0 is simply about APIs (typically service-specific Data Silos or Walled Gardens) with little concern for, understanding of, or interest in truly open data access across the burgeoning "<a href="http://www.infoworld.com/article/06/05/03/77873_19OPstrategic_1.html">Web of Databases</a>", or the <a href="http://www.w3.org/2005/Talks/0623-sb-IEEEStorConf/">Web of "Databases and Programs"</a> that I prefer to describe as "<a href="http://virtuoso.openlinksw.com/wiki/main/Main/DataSpaceFAQ">Data Spaces</a>".</p> <p>Thus, we can truly begin to conclude that Web 3.0 (Data Web) is the addition of Flexible and Open Data Access to Web 2.0, where the Open Data Access is achieved by leveraging Semantic Web deliverables such as the RDF Data Model and the SPARQL Query Language :-)</p>
GeoRSS & Geonames for Philanthropy re. Kiva Microfinance
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2006-07-15#1006
2006-07-15T14:11:47Z
2006-07-15T10:48:36.000002-04:00
<p>(Via <a href="http://www.geospatialsemanticweb.com">Geospatial Semantic Web Blog</a>.)</p> <p> <a href="http://www.geospatialsemanticweb.com/2006/07/14/georss-geonames-for-philanthropy#comments">GeoRSS & Geonames for Philanthropy</a>: "</p> <p>I heard about <a title="kiva.org" href="http://www.kiva.org">Kiva.ORG</a> in a BusinessWeek podcast. After visiting its website, I think there are few places where GeoRSS (in the RDF/A syntax) and Geonames can be used to enhance the site’s functionality.</p> <h5>Kiva.ORG Background</h5> <h5> <img align="left" title="kiva.org" id="image92" alt="kiva.org" src="http://www.geospatialsemanticweb.com/wp-content/uploads/2006/07/kiva-bannersmall.png" /> </h5> <p>It’s a microfinance website for people in the developing countries. Its business model is in the intersection between peer-to-peer financing and philanthropy. The goal is to help developing country businesses to borrow small loans from a large group of Web users, so that they can avoid paying high interests to the banks.</p> <p>For example, a person in Uganda can <a target="_blank" title="Kiva Loan Request" href="http://kiva.org/app.php?page=businesses&action=about&id=564">request</a> a $500 loan and use it for buying and selling more poultry. One or more lenders (anyone on the Web) may decide to grant loans to that person in increments as tiny as $25. After few years, that person will pay back the loans to the lenders.</p> <h5>How GeoRSS and Geonames Can Help</h5> <p>I went to the website and discovered the site has a relative weak search and browsing interface. In particular, there is no way to group loan requests based on geographical locations (e.g., countries, cities and regions).<br /> <a id="more-90"></a> <br /> Took a look at individual loan pages. 
Each page actually has standard ways to describe location information - e.g., <strong>Location:</strong> Mbale, Uganda.</p> <p>It should be relative easy to add <a title="GeoRSS" target="_blank" href="http://www.georss.org/">GeoRSS</a> points (in <a title="Mixing GeoRSS with RDF/A" target="_blank" href="http://www.geospatialsemanticweb.com/2006/06/08/mixing-rdfa-with-georss">the RDF/A syntax</a>) to describe these location information (an alternative maybe using <a title="geocode with microformat" target="_blank" href="http://www.geospatialsemanticweb.com/2006/01/03/how-to-geocode-your-blog">Microformat Geo</a> or <a title="w3c geo" target="_blank" href="http://www.w3.org/2003/01/geo/">W3C Geo</a>). Once the location information is annotated, one can imagine building a map mashup to display loan requests in a geospatial perspective. One can also build search engines to support spatial queries such as “find me all loans with from Mbale”.</p> <p>Since Kiva.ORG webmasters may not be GIS experts, it will be nice if we can find ways to automatically geocode location information and describe that using GeoRSS. This automatic geocoding procedure can be developed using <a title="geonames webservices" target="_blank" href="http://www.geonames.org/export/geonames-search.html">Geonames’s webservices</a>. Take a string “Mbale” or “Uganda”, and send to Geonames’s search service. The procedure will get back <a target="_blank" title="geonames json saerch" href="http://ws.geonames.org/searchJSON?q=Mbale&maxRows=10">JSON</a> or <a target="_blank" title="geonames xml search" href="http://ws.geonames.org/search?q=Mbale&maxRows=10">XML</a> description of the location, which include latitude and longitude. 
This will then be used to annotate the location information in a Kiva loan page.</p> <p>Can you think of other ways to help Kiva.ORG to become more “geospatially intelligent”?<br /> You can learn more about <a title="kiva.org" target="_blank" href="http://www.kiva.org">Kiva.ORG</a> at its website and listen to <a title="An eBay for Microfinance" target="_blank" href="http://www.businessweek.com/mediacenter/podcasts/innovation/innovation_07_11_06.htm">this podcast</a>. </p>"
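<p>The geocoding procedure described in the quoted post could be sketched roughly as follows in Python. The search URL is the one linked in the post. The response shape (a "geonames" list whose entries carry "lat" and "lng") is my assumption about the service, the sample payload is hand-written with approximate coordinates rather than a captured response, and the helper names are hypothetical. For brevity the output uses the plain GeoRSS Simple point element rather than the RDF/A syntax the post suggests.</p>

```python
import json
from urllib.parse import urlencode

# Search endpoint as linked in the post; not verified to still be served.
SEARCH_URL = "http://ws.geonames.org/searchJSON"

def build_search_url(place: str, max_rows: int = 10) -> str:
    """Build the query URL for a place name such as 'Mbale'."""
    return SEARCH_URL + "?" + urlencode({"q": place, "maxRows": max_rows})

def first_point(response_text: str):
    """Return (lat, lng) of the first hit, or None when there are no hits.

    Assumes a JSON body shaped like {"geonames": [{"lat": ..., "lng": ...}]}.
    """
    hits = json.loads(response_text).get("geonames", [])
    if not hits:
        return None
    return float(hits[0]["lat"]), float(hits[0]["lng"])

def georss_point(lat: float, lng: float) -> str:
    """Format a GeoRSS Simple point: latitude, a space, then longitude."""
    return f"<georss:point>{lat} {lng}</georss:point>"

# Hand-written illustrative payload (approximate coordinates), not real output.
SAMPLE = '{"geonames": [{"name": "Mbale", "countryName": "Uganda", "lat": "1.08", "lng": "34.18"}]}'

url = build_search_url("Mbale")
point = first_point(SAMPLE)
markup = georss_point(*point) if point else ""
print(url)
print(markup)
```

<p>Fetching the URL is deliberately left out so the sketch runs offline; in practice the response body would be passed to first_point in place of SAMPLE, and the resulting markup attached to the loan page next to its "Location:" text.</p>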
Object Relational Rediscovered?
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2006-07-13#1005
2006-07-14T01:59:15Z
2006-07-13T21:59:16.000002-04:00
<p>Microsoft's recent unveiling of the next <a href="http://msdn.microsoft.com/data/default.aspx?pull=/library/en-us/dnvs05/html/ADONETEnFrmOvw.asp">generation of ADO.NET</a> has pretty much crystallized a long-running hunch that the era of standardized client/user level interfaces for "Object-Relational" technology is nigh. Finally, this application / problem domain is attracting the attention of industry behemoths such as Microsoft.</p> <p> </p> <p>In an initial response to these developments, <a href="http://www.openlinksw.com/weblog/oerling/">Orri Erling</a>, Virtuoso's Program Manager, shares <a href="http://www.openlinksw.com/weblog/oerling/?id=1002">valuable insights from the past re. Object-Relational technology developments and deliverables challenges</a>. As Orri notes, the Virtuoso team suspended ORM and ORDBMS work at the onset of the <a href="http://virtuoso.openlinksw.com/wiki/main/Main/VOSHistory">Kubl-Virtuoso transition</a> due to the lack of standardized client-side functionality exposure points.</p> <p>My hope is that Microsoft's efforts trigger community-wide activity that results in a collection of interfaces that enable scenarios such as generating .NET based Semantic Web Objects (where the S in an S-P->O RDF-Triple becomes a bona fide .NET class instance generated from OWL).</p> <p>To be continued, since the interface specifics re. ADO.NET 3.0 remain in flux...</p>
RDF's History
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2006-07-13#1004
2006-07-13T21:42:57Z
2006-07-13T19:04:36-04:00
<p>We are getting very close to a Semantic Web watershed moment (IMHO). Thus, for the purpose of historic record, I would like to create a public bookmark to Tim Bray's 2003 post titled: <a href="http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet">RDF.net</a> Challenge that also contains a nice section about the <a href="http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet">History of RDF</a>.</p> <p>Note to Tim:</p> <p> Is the RDF.net domain deal still on? I know it's past 1st Jan 2006, but do bear in mind that the critical issue of a broadly supported RDF Query Language only took significant shape approximately 13 months ago (in the form of SPARQL), and this is all so critical to the challenge you posed in 2003.</p> <p> <a href="http://rdf.net">RDF.net</a> could become a point of semantic-web-presence through which the benefits of SPARQL compliant Triple|Quad Stores, Shared Ontologies, and SPARQL Protocol are unveiled in their well intended glory :-).</p>
Linux vs SCO: An opinion from the BSD point of view.
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2003-06-05#100
2003-06-06T02:47:06Z
2006-06-22T08:56:58-04:00
<A href="http://www.oreillynet.com/pub/wlg/3268">Linux vs SCO: An opinion from the BSD point of view.</A> <P> Greg Lehey has written an excellent article for Daemon News on the Linux versus SCO debacle from the point of view of a BSD user. (Or at least from the point of view of one BSD user). <P> One particularly interesting idea: <P> <blockquote> <quote> Linux source code is freely available. UnixWare source code is not, even less than many other proprietary UNIX implementations. Thus it would be easier to copy code from Linux to UnixWare than from UnixWare to Linux. </quote> </blockquote> <P> Lehey also has a page tracking the debate. [via <A href="http://meerkat.oreillynet.com/">Meerkat: An Open Wire Service: O'Reilly Network Weblogs</A>] <DIV></DIV>
What's a Wiki?
http://www.openlinksw.com:443/blog/kidehen@openlinksw.com/blog/?date=2003-05-12#1
2003-05-13T02:50:35Z
2006-06-22T08:56:58-04:00
<A href="http://www.pcmag.com/article2/0,4149,1071705,00.asp?kc=PCRSS02129TX1K0000530">What's a Wiki?</A> While blogs are the hot topic in Web-based communication forums, wikis are growing in popularity and are unique forums in a number of ways. <DIV align=right>[via <A href="http://www.eweek.com/">Technology News from eWEEK and Ziff Davis</A>] <DIV></DIV></DIV>