A data access driver/provider that provides conceptual entity oriented access to RDBMS data managed by Virtuoso. Naturally, it also uses Virtuoso's in-built virtual / federated database layer to provide access to ODBC and JDBC accessible RDBMS engines such as: Oracle (7.x to latest), SQL Server (4.2 to latest), Sybase, IBM Informix (5.x to latest), IBM DB2, Ingres (6.x to latest), Progress (7.x to OpenEdge), MySQL, PostgreSQL, Firebird, and others using our ODBC or JDBC bridge drivers.
It delivers an Entity-Attribute-Value + Classes & Relationships model over disparate data sources that are materialized as .NET Entity Framework Objects, which are then consumable via ADO.NET Data Object Services, LINQ for Entities, and other ADO.NET data consumers.
The provider is fully integrated into Visual Studio 2008 and delivers the same "ease of use" offered by Microsoft's own SQL Server provider, but across Virtuoso, Oracle, Sybase, DB2, Informix, Ingres, Progress (OpenEdge), MySQL, PostgreSQL, Firebird, and others. The same benefits also apply uniformly to Entity Frameworks compatibility.
Bearing in mind that Virtuoso is a multi-model (hybrid) data manager, this also implies that you can use .NET Entity Frameworks against all data managed by Virtuoso. Remember, Virtuoso's SQL channel is a conduit to Virtuoso's core; thus, RDF (courtesy of SPASQL as already implemented re. Jena/Sesame/Redland providers), XML, and other data forms stored in Virtuoso also become accessible via .NET's Entity Frameworks.
You can choose which entity oriented data access model works best for you: RDF Linked Data & SPARQL or .NET Entity Frameworks & Entity SQL. Either way, Virtuoso delivers a commercial grade, high-performance, secure, and scalable solution.
Note: When working with external or 3rd party databases, simply use the Virtuoso Conductor to link the external data source into Virtuoso. Once linked, the remote tables will simply be treated as though they are native Virtuoso tables leaving the virtual database engine to handle the rest. This is similar to the role the Microsoft JET engine played in the early days of ODBC, so if you've ever linked an ODBC data source into Microsoft Access, you are ready to do the same using Virtuoso.
CrunchBase: When we released the CrunchBase API, you were one of the first developers to step up and quickly released a CrunchBase Sponger Cartridge. Can you explain what a CrunchBase Sponger Cartridge is?
Me: A Sponger Cartridge is a data access driver for Web Resources that plugs into our Virtuoso Universal Server (DBMS and Linked Data Web Server combo amongst other things). It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based Linked Data graph that essentially describes the resource via its properties (Attributes & Relationships).
CrunchBase: And what inspired you to create it?
Me: Bengee built a new space with your data, and we've built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of Linked Data i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as: following-your-nose pattern in Semantic Web circles).
Bengee posted a notice to the Linking Open Data Community's public mailing list announcing his effort. Bearing in mind the fact that we've been using middleware to mesh the realms of Web 2.0 and the Linked Data Web for a while, it was a no-brainer to knock something up based on the conceptual similarities between Wikicompany and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee's RDFization efforts, and ours.
Bengee created an RDF based Linked Data warehouse based on the data exposed by your API, which is exposed via the Semantic CrunchBase data space. In our case we've taken the "RDFization on the fly" approach, which produces a transient Linked Data View of the CrunchBase data exposed by your APIs. Our approach is in line with our world view: all resources on the Web are data sources, and the Linked Data Web is about incorporating HTTP into the naming scheme of these data sources so that the conventional URL based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, based on the fact that we house and publish a lot of Linked Data on the Web (e.g. DBpedia, PingTheSemanticWeb, and others), we've also automatically meshed CrunchBase data with related data in DBpedia and Wikicompany.
CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality?
Me: Yes, the OpenLink Data Explorer which provides CrunchBase site visitors with the option to explore the Linked Data in the CrunchBase data space. It also allows them to "Mesh" (rather than "Mash") CrunchBase data with other Linked Data sources on the Web without writing a single line of code.
CrunchBase: You have been immersed in the Semantic Web movement for a while now. How did you first get interested in the Semantic Web?
Me: We saw the Semantic Web as a vehicle for standardizing conceptual views of heterogeneous data sources via context lenses (URIs). In 1998, as part of our strategy to expand our business beyond the development and deployment of ODBC, JDBC, and OLE-DB data providers, we decided to build a Virtual Database Engine (see: Virtuoso History), and in doing so we sought a standards based mechanism for the conceptual output of the data virtualization effort. As of the time of the seminal unveiling of the Semantic Web in 1998 we were clear about two things, in relation to the effects of the Web and Internet data management infrastructure inflections: 1) existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. These fundamental realities compelled us to develop Virtuoso with an eye to leveraging the Semantic Web as a vehicle for completing its technical roadmap.
CrunchBase: Can you put into layman's terms exactly what RDF and SPARQL are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?
Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the Subject, Predicate, and Object principle. Associated with the core data model, as part of the overall framework, are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML) that include: RDFa (simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), RDF/XML (a machine friendly markup for describing resources).
SPARQL is the query language associated with the RDF Data Model, just as SQL is a query language associated with the Relational Database Model. Thus, when you have RDF based structured and linked data on the Web, you can query against the Web using SPARQL just as you would against an Oracle/SQL Server/DB2/Informix/Ingres/MySQL/etc. DBMS using SQL. That's it in a nutshell.
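To make the Subject, Predicate, Object idea a little more concrete, here is a minimal sketch in Python. It is a toy in-memory list of triples with a wildcard pattern matcher, not a real RDF store or SPARQL engine, and every identifier in it (the `ex:` names, `TRIPLES`, `match`) is made up for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical example data; these identifiers are illustrative only.
TRIPLES: List[Triple] = [
    ("ex:Virtuoso", "rdf:type", "ex:DatabaseServer"),
    ("ex:Virtuoso", "ex:developedBy", "ex:OpenLink"),
    ("ex:CrunchBase", "rdf:type", "ex:CompanyDirectory"),
    ("ex:OpenLink", "rdf:type", "ex:Company"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard,
    playing the role a variable plays in a SPARQL triple pattern."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly analogous to: SELECT ?s WHERE { ?s rdf:type ex:Company }
companies = [subject for subject, _, _ in match(TRIPLES, p="rdf:type", o="ex:Company")]

# Everything known about one subject, i.e. its description.
about_virtuoso = match(TRIPLES, s="ex:Virtuoso")

print(companies)
print(about_virtuoso)
```

A real SPARQL processor adds joins across patterns, named graphs, filters, and much more; the point here is only that a query is a pattern laid over a graph of three-part statements.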
CrunchBase: On your website you wrote that "RDF and SPARQL as productivity boosters in everyday web development". Can you elaborate on why you believe that to be true?
Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. productivity, especially when the capability in question results in a graph of Linked Data that isn't confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF Linked Data is about infrastructure for the true materialization of the "Information at Your Fingertips" vision of yore. Even though it's taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the comprehension of the vision's intrinsic value has been clear for a very long time. Most organizations and/or individuals are quite familiar with the adage: Knowledge is Power. Well, there isn't any knowledge without accessible Information, and there isn't any accessible Information without accessible Data. The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages).
Bottom line, RDF based Linked Data is about Open Data access by reference using URIs (HTTP based Entity IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the Internet.
CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the Semantic Web, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the Semantic Web will play in Web 3.0?
Me: I agree with Nova. But I see Web 3.0 as a phase within the Semantic Web innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as a Service (SaaS) style solutions. Web 3.0 is about the use of "Smart Data Access" to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, context fidelity, and comprehension of privacy) exposed by URIs.
Here are some examples of the CrunchBase Linked Data Space, as projected via our CrunchBase Sponger Cartridge:
Daniel simplifies my post by using diagrams to depict the different paths for PHP based applications exposing Linked Data - especially those that already provide a significant amount of the content that drives Web 2.0.
If all the content in Web 2.0 information resources is distillable into discrete data objects endowed with HTTP based IDs (URIs), with zero "RDF handcrafting Tax", what do we end up with? A Giant Global Graph of Linked Data; the Web as a Database.
So, what used to apply exclusively within enterprise settings re. Oracle, DB2, Informix, Ingres, Sybase, Microsoft SQL Server, MySQL, PostgreSQL, Progress OpenEdge, Firebird, and others, now applies to the Web. The Web becomes the "Distributed Database Bus" that connects database records across disparate databases (or Data Spaces). These databases manage and expose records that are remotely accessible "by reference" via HTTP.
As I've stated at every opportunity in the past, Web 2.0 is the greatest thing that ever happened to the Semantic Web vision :-) Without the "Web 2.0 Data Silo Conundrum" we wouldn't have the cry for "Data Portability" that brings a lot of clarity to some fundamental Web 2.0 limitations that end-users ultimately find unacceptable.
In the late '80s, the SQL Access Group (now part of X/Open) addressed a similar problem with RDBMS silos within the enterprise, which led to the SAG CLI that exists today as Open Database Connectivity.
In a sense we now have WODBC (Web Open Database Connectivity), comprised of Web Services based CLIs and/or traditional back-end DBMS CLIs (ODBC, JDBC, ADO.NET, OLE-DB, or Native), Query Language (SPARQL Query Language), and a Wire Protocol (HTTP based SPARQL Protocol) delivering Web infrastructure equivalents of SQL and RDA, but much better, and with much broader scope for delivering profound value due to the Web's inherent openness. Today's PHP, Python, Ruby, Tcl, Perl, ASP.NET developer is the enterprise 4GL developer of yore, without enterprise confinement. We could even be talking about 5GL development once the Linked Data interaction is meshed with dynamic languages (delivering higher levels of abstraction at the language and data interaction levels). Even the underlying schemas and basic design will evolve from Closed World (solely) to a mesh of Closed & Open World view schemas.
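As a rough illustration of the "Query Language plus Wire Protocol" pairing described above, the sketch below builds (but does not send) a SPARQL Protocol request using only Python's standard library. The endpoint URL is a placeholder I've assumed for illustration, and the helper name `build_sparql_request` is hypothetical; the `query` parameter is the one defined by the SPARQL Protocol for HTTP GET.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Hypothetical endpoint, for illustration only; substitute a real SPARQL endpoint.
ENDPOINT = "http://example.org/sparql"

# A deliberately generic query: the first ten statements the endpoint will return.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build an HTTP GET request that carries the query in the 'query'
    parameter and asks for JSON-formatted results via the Accept header."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_sparql_request(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(req.full_url).query)["query"][0]

print(req.full_url)
print(decoded == QUERY)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access. The comparison with ODBC holds in this sense: the client needs only the endpoint's address and a query string, not a vendor-specific driver.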
BTW - We have just released a collection of High-Performance Data Providers for ActiveRecord. Our providers deliver consistent functionality to RoR developers across Virtuoso, Oracle, SQL Server, Sybase, DB2, Ingres, Informix, and others without compromising performance or cross platform portability.
I added the missing piece regarding the "Virtuoso Conductor" (the Web based Admin UI for Virtuoso) to the original post below. I also added a link to our live SPARQL Demo so that anyone interested can start playing around with SPARQL and SPARQL integrated into SQL right away.
Another good thing about this post is the vast amount of valuable links that it contains. To really appreciate this point simply visit my Linkblog (excuse the current layout :-) - a Tab if you come in via the front door of this Data Space (what I used to call My Weblog Home Page).
"Free" Databases: Express vs. Open-Source RDBMSs: "Open-source relational database management systems (RDBMSs) are gaining IT mindshare at a rapid pace. As an example, BusinessWeek's February 6, 2006 'Taking On the Database Giants' article asks 'Can open-source upstarts compete with Oracle, IBM, and Microsoft?' and then provides the answer: 'It's an uphill battle, but customers are starting to look at the alternatives.'
There's no shortage of open-source alternatives to look at. The BusinessWeek article concentrates on MySQL, which BW says 'is trying to be the Ikea of the database world: cheap, needs some assembly, but has a sleek, modern design and does the job.' The article also discusses Postgre[SQL] and Ingres, as well as EnterpriseDB, an Oracle clone created from PostgreSQL code*. Sun includes PostgreSQL with Solaris 10 and, as of April 6, 2006, with Solaris Express.**
*Frank Batten, Jr., the investor who originally funded Red Hat, invested a reported $16 million into Great Bridge with the hope of making a business out of providing paid support to PostgreSQL users. Great Bridge stayed in business only 18 months, having missed an opportunity to sell the business to Red Hat and finding that selling $50,000-per-year support packages for an open-source database wasn't easy. As Batten concluded, 'We could not get customers to pay us big dollars for support contracts.' Perhaps EnterpriseDB will be more successful with a choice of $5,000, $3,000, or $1,000 annual support subscriptions.
**Interestingly, Oracle announced in November 2005 that Solaris 10 is 'its preferred development and deployment platform for most x64 architectures, including x64 (x86, 64-bit) AMD Opteron and Intel Xeon processor-based systems and Sun's UltraSPARC(R)-based systems.'
There is a surfeit of reviews of current MySQL, PostgreSQL and, to a lesser extent, Ingres implementations. These three open-source RDBMSs come with their own or third-party management tools. These systems compete against free versions of commercial (proprietary) databases: SQL Server 2005 Express Edition (and its MSDE 2000 and 1.0 predecessors), Oracle Database 10g Express Edition, IBM DB2 Express-C, and Sybase ASE Express Edition for Linux where database size and processor count limitations aren't important. Click here for a summary of recent InfoWorld reviews of the full versions of these four databases plus MySQL, which should be valid for Express editions also. The FTPOnline Special Report article, 'Microsoft SQL Server Turns 17,' that contains the preceding table is here (requires registration).
SQL Server 2005 Express Edition SP-1 Advanced Features
SQL Server 2005 Express Edition with Advanced Features enhances SQL Server 2005 Express Edition (SQL Express or SSX) dramatically, so it deserves special treatment here. SQL Express gains full text indexing and now supports SQL Server Reporting Services (SSRS) on the local SSX instance. The SP-1 with Advanced Features setup package, which Microsoft released on April 18, 2006, installs the release version of SQL Server Management Studio Express (SSMSE) and the full version of Business Intelligence Development Studio (BIDS) for designing and editing SSRS reports. My 'Install SP-1 for SQL Server 2005 and Express' article for FTPOnline's SQL Server Special Report provides detailed, illustrated installation instructions for and related information about the release version of SP-1. SP-1 makes SSX the most capable of all currently available Express editions of commercial RDBMSs for Windows.
OpenLink Software's Virtuoso Open-Source Edition
OpenLink Software announced an open-source version of its Virtuoso Universal Server commercial DBMS on April 11, 2006. On the initial date of this post, May 2, 2006, Virtuoso Open-Source Edition (VOS) was virtually under the radar as an open-source product. According to this press release, the new edition includes:
- SPARQL compliant RDF Triple Store
- SQL-200n Object-Relational Database Engine (SQL, XML, and Free Text)
- Integrated BPEL Server and Enterprise Service Bus
- WebDAV and Native File Server
- Web Application Server that supports PHP, Perl, Python, ASP.NET, JSP, etc.
- Runtime Hosting for Microsoft .NET, Mono, and Java
VOS only lacks the virtual server and replication features that are offered by the commercial edition. VOS includes a Web-based administration tool called the "Virtuoso Conductor". According to Kingsley Idehen's Weblog, 'The Virtuoso build scripts have been successfully tested on Mac OS X (Universal Binary Target), Linux, FreeBSD, and Solaris (AIX, HP-UX, and Tru64 UNIX will follow soon). A Windows Visual Studio project file is also in the works (ETA some time this week).'
InfoWorld's Jon Udell has tracked Virtuoso's progress since 2002, with an additional article in 2003 and a one-hour podcast with Kingsley Idehen on April 26, 2006. A major talking point for Virtuoso is its support for Atom 0.3 syndication and publication, Atom 1.0 syndication and (forthcoming) publication, and future support for Google's GData protocol, as mentioned in this Idehen post. Yahoo!'s Jeremy Zawodny points out that the 'fingerprints' of Adam Bosworth, Google's VP of Engineering and the primary force behind the development of Microsoft Access, 'are all over GData.' Click here to display a list of all OakLeaf posts that mention Adam Bosworth.
One application for the GData protocol is querying and updating the Google Base database independently of the Google Web client, as mentioned by Jeremy: 'It's not about building an easier onramp to Google Base. ... Well, it is. But, again, that's the small stuff.' Click here for a list of posts about my experiences with Google Base. Watch for a future OakLeaf post on the subject as the GData APIs gain ground.
Open-Source and Free Embedded Database Contenders
Open-source and free embedded SQL databases are gaining importance as the number and types of mobile devices and OSs proliferate. Embedded databases usually consist of Java classes or Windows DLLs that are designed to minimize file size and memory consumption. Embedded databases avoid the installation hassles, heavy resource usage and maintenance cost associated with client/server RDBMSs that run as an operating system service.
Andrew Hudson's December 2005 'Open Source databases rounded up and rodeoed' review for The Enquirer provides brief descriptions of one commercial and eight open source database purveyors/products: Sleepycat, MySQL, PostgreSQL, Ingres, InnoBase, Firebird, IBM Cloudscape (a.k.a, Derby), Genezzo, and Oracle. Oracle Sleepycat* isn't an SQL Database, Oracle InnoDB* is an OEM database engine that's used by MySQL, and Genezzo is a multi-user, multi-server distributed database engine written in Perl. These special-purpose databases are beyond the scope of this post.
* Oracle purchased Sleepycat Software, Inc. in February 2006 and purchased Innobase OY in October 2005. The press release states: 'Oracle intends to continue developing the InnoDB technology and expand our commitment to open source software.'
Derby is an open-source release by the Apache Software Foundation of the Cloudscape Java-based database that IBM acquired when it bought Informix in 2001. IBM offers a commercial release of Derby as IBM Cloudscape 10.1. Derby is a Java class library that has a relatively light footprint (2 MB), which makes it suitable for client/server synchronization with the IBM DB2 Everyplace Sync Server in mobile applications. The IBM DB2 Everyplace Express Edition isn't open source or free*, so it doesn't qualify for this post. The same is true for the corresponding Sybase SQL Anywhere components.**
* IBM DB2 Everyplace Express Edition with synchronization costs $379 per server (up to two processors) and $79 per user. DB2 Everyplace Database Edition (without DB2 synchronization) is $49 per user. (Prices are based on those when IBM announced version 8 in November 2003.)
** Sybase's iAnywhere subsidiary calls SQL Anywhere 'the industry's leading mobile database.' A Sybase SQL Anywhere Personal DB seat license with synchronization to SQL Anywhere Server is $119; the cost without synchronization wasn't available from the Sybase Web site. Sybase SQL Anywhere and IBM DB2 Everyplace perform similar replication functions.
Sun's Java DB, another commercial version of Derby, comes with the Solaris Enterprise Edition, which bundles Solaris 10, the Java Enterprise System, developer tools, desktop infrastructure and N1 management software. A recent Between the Lines blog entry by ZDNet's David Berlind waxes enthusiastic over the use of Java DB embedded in a browser to provide offline persistence. RedMonk analyst James Governor and eWeek's Lisa Vaas wrote about the use of Java DB as a local data store when Tim Bray announced Sun's Derby derivative and Francois Orsini demonstrated Java DB embedded in the Firefox browser at the ApacheCon 2005 conference.
Firebird is derived from Borland's InterBase 6.0 code, the first commercial relational database management system (RDBMS) to be released as open source. Firebird has excellent support for SQL-92 and comes in three versions: Classic, SuperServer and Embedded for Windows, Linux, Solaris, HP-UX, FreeBSD and MacOS X. The embedded version has a 1.4-MB footprint. Release Candidate 1 for Firebird 2.0 became available on March 30, 2006 and is a major improvement over earlier versions. Borland continues to promote InterBase, now at version 7.5, as a small-footprint, embedded database with commercial Server and Client licenses.
SQLite is a featherweight C library for an embedded database that implements most SQL-92 entry- and transitional-level requirements (some through the JDBC driver) and supports transactions within a tiny 250-KB code footprint. Wrappers support a multitude of languages and operating systems, including Windows CE, SmartPhone, Windows Mobile, and Win32. SQLite's primary SQL-92 limitations are lack of nested transactions, inability to alter a table design once committed (other than with RENAME TABLE and ADD COLUMN operations), and no support for foreign-key constraints. SQLite provides read-only views, triggers, and 256-bit encryption of database files. A downside is that the entire database file is locked while a transaction is in progress. SQLite uses file access permissions in lieu of GRANT and REVOKE commands. Using SQLite involves no license; its code is entirely in the public domain. The Mozilla Foundation's Unified Storage wiki says this about SQLite: 'SQLite will be the back end for the unified store [for Firefox]. Because it implements a SQL engine, we get querying 'for free', without having to invent our own query language or query execution system. Its code-size footprint is moderate (250k), but it will hopefully simplify much existing code so that the net code-size change should be smaller. It has exceptional performance, and supports concurrent access to the database. Finally, it is released into the public domain, meaning that we will have no licensing issues.'
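For readers who want to see SQLite's embedded, transactional behavior first-hand, here is a minimal sketch using the `sqlite3` module from Python's standard library. The Python binding is my choice for illustration and is not one of the wrappers discussed in this post; the `bookmarks` table and its columns are hypothetical.

```python
import sqlite3

# An in-memory database: no server process, no installation, no file on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
conn.commit()

# First transaction: the second INSERT violates NOT NULL, so the whole
# transaction, including the valid first INSERT, is rolled back.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO bookmarks (url) VALUES (?)", ("http://example.org/a",))
        conn.execute("INSERT INTO bookmarks (url) VALUES (?)", (None,))
except sqlite3.IntegrityError:
    pass

after_rollback = conn.execute("SELECT COUNT(*) FROM bookmarks").fetchone()[0]

# Second transaction: both statements succeed and are committed together.
with conn:
    conn.execute("INSERT INTO bookmarks (url) VALUES (?)", ("http://example.org/b",))
    conn.execute("INSERT INTO bookmarks (url) VALUES (?)", ("http://example.org/c",))

after_commit = conn.execute("SELECT COUNT(*) FROM bookmarks").fetchone()[0]

print(after_rollback, after_commit)
conn.close()
```

The whole engine runs inside the calling process, which is what makes it attractive as an embedded store of the kind the Mozilla wiki describes.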
Vieka Technology, Inc.'s eSQL 2.11 is a port of SQLite to Windows Mobile (Pocket PC and Smartphone) and Win32, and includes development tools for Windows devices and PCs, as well as a .NET native data provider. A conventional ODBC driver also is available. eSQL for Windows (Win32) is free for personal and commercial use; eSQL for Windows Mobile requires a license for commercial (for-profit or business) use.
HSQLDB isn't on most reviewers' radar, which is surprising because it's the default database for OpenOffice.org (OOo) 2.0's Base suite member. HSQLDB 1.8.0.1 is an open-source (BSD license) Java embedded database engine based on Thomas Mueller's original Hypersonic SQL Project. Using OOo's Base feature requires installing the Java 2.0 Runtime Engine (which is not open-source) or the presence of an alternative open-source engine, such as Kaffe. My prior posts about OOo Base and HSQLDB are here, here and here.
The HSQLDB 1.8.0 documentation on SourceForge states the following regarding SQL-92 and later conformance:
HSQLDB 1.8.0 supports the dialect of SQL defined by SQL standards 92, 99 and 2003. This means where a feature of the standard is supported, e.g. left outer join, the syntax is that specified by the standard text. Many features of SQL92 and 99 up to Advanced Level are supported and there is support for most of SQL 2003 Foundation and several optional features of this standard. However, certain features of the Standards are not supported so no claim is made for full support of any level of the standards.
Other less well-known embedded databases designed for or suited to mobile deployment are Mimer SQL Mobile and VistaDB 2.1. Neither product is open-source, and both require paid licensing; VistaDB requires a small up-front payment by developers but offers royalty-free distribution.
Java DB, Firebird embedded, SQLite and eSQL 2.11 are contenders for lightweight PC and mobile device database projects that aren't Windows-only.
SQL Server 2005 Everywhere
If you're a Windows developer, SQL Server Mobile is the logical embedded database choice for mobile applications for Pocket PCs and Smartphones. Microsoft's April 19, 2006 press release delivered the news that SQL Server 2005 Mobile Edition (SQL Mobile or SSM) would gain a big brother: SQL Server 2005 Everywhere Edition (SSE).
Currently, the SSM client is licensed (at no charge) to run in production on devices with Windows CE 5.0, Windows Mobile 2003 for Pocket PC or Windows Mobile 5.0, or on PCs with Windows XP Tablet Edition only. SSM also is licensed for development purposes on PCs running Visual Studio 2005. Smart Device replication with SQL Server 2000 SP3 and later databases has been the most common application so far for SSM.
By the end of 2006, Microsoft will license SSE for use on all PCs running any Win32 version or the preceding device OSs. A version of SQL Server Management Studio Express (SSMSE), updated to support SSE, is expected to release by the end of the year. These features will qualify SSE as the universal embedded database for Windows client and smart-device applications.
For more details on SSE, read John Galloway's April 11, 2006 blog post and my 'SQL Server 2005 Mobile Goes Everywhere' article for the FTPOnline Special Report on SQL Server." (Via OakLeaf Systems.)
IBM. With BOMP and D-BOMP, IBM was probably the first company to commercialize precursors to DBMS. (BOMP stood for Bill Of Materials Planning, foreshadowing the hierarchical architecture of IMS.) Out of those grew DL/1 and IMS, IBM's flagship hierarchical DBMS, and the world's first dominant DBMS product(s). Of course, IBM also innovated relational DBMS, via the research of E. F. "Ted" Codd, then some prototype products, and eventually the mainframe version of DB2. To this day DB2 on the mainframe remains one of the world's major DBMS, as does the separate but related product of DB2 for "open systems."
Cincom. In the 1970s, Cincom was probably the most successful independent software product company. Its flagship product was Total, a shallow-network DBMS that was a little more general than the strictly hierarchical IMS. What's more, Total ran on almost any brand of computer hardware. Cincom remains independent and privately held to this day.
Cullinane/Cullinet. Charlie Bachman innovated a true network DBMS at Honeywell, but it didn't turn into a serious product at that time. B. F. Goodrich, however, ran a version. This is what John Cullinane's company bought and turned into IDMS, which at least on the mainframe supplanted Total as the technical, mind share, and probably revenue market leader. Cullinet (as it was then called) ran into technical difficulties, however, losing ground to the more flexible index-based DBMS. It was eventually sold to Computer Associates.
A lot of software industry leaders cut their teeth at Cullinet, notably Andrew "Flip" Filipowski, later the colorful founder of Platinum. Other alumni include Renato "Ron" Zambonini, Dave Litwack, Dave Ireland, and the original PowerBuilder development team. John Landry and Bob Weiler ran the firm for a while toward the end, but they don't really count; rather, they're the most prominent alumni of applications pioneer McCormack & Dodge.
Note: Index-based is a term I used in and probably coined for my first report in 1982, comprising both inverted-list and relational DBMS, as opposed to the link(ed)-list hierarchical and network products such as IMS, Total, and IDMS. The companies that beat Cullinet were long-time rival Software AG, and then especially Applied Data Research; then all three of those independents were blown out by IBM's DB2. And then the whole mainframe DBMS business was in turn obsoleted by the rise of UNIX ... but I'm getting ahead of my story.
Software AG. Like Cincom, Germany-based Software AG is a 1970s DBMS pioneer that has always remained independent and privately held. Sort of. Twice, Software AG of North America was spun off as a separate, eventually public company. Software AG's flagship DBMS was the inverted list product ADABAS. SAP's MaxDB was also owned by Software AG for a while (and seemingly by every other significant German computer company as well; or more precisely, by Nixdorf where it was developed, and by Siemens after it bought Nixdorf).
I actually visited Software AG in Darmstadt once. Founder Peter Schnell and key techie Peter Page were both gracious hosts. Schnell was proud of their new building, and especially of the hexagon-based wooden dual desks he'd personally designed. General analytic rule: when the CEO is focused on the décor, this is not a good sign for the company's near-term prospects. (I call this having an "edifice complex.")
Applied Data Research (ADR). ADR is often credited as being the first independent software company, having introduced products in the late 1960s and prevailed in antitrust struggles against IBM to allow the business to survive. Basically, it sold programmer productivity tools. This led it to acquire Datacom/DB, an inverted-list DBMS developed in the Dallas area. In the early 1980s, Datacom/DB began to boom, and was on a track to surpass both IDMS and ADABAS in market share until DB2 showed up and blew them all away. ADR was particularly aided by its fourth-generation language (4GL) IDEAL, which was an excellent product notwithstanding the famous State of New Jersey fiasco. (As John Landry said to me about that one, "4GLs are powerful tools. In particular, they allow you to write bad programs really quickly.")
ADR was an underappreciated powerhouse, boasting all of the Fortune 100 as customers way back in the early 1980s (yes, even archrival IBM). When the DBMS business stalled, however, ADR was quickly sold, first to Ameritech (the Illinois-based Baby Bell company), and soon thereafter to Computer Associates.
Computer Corporation of America (CCA). CCA's DBMS Model 204 may have been the best of the prerelational products, boasting an inverted-list architecture akin to that of ADABAS and Datacom/DB. The company was also interesting in that it was first and foremost a government contract research shop, and hence did all sorts of interesting prototype work that sadly never got commercialized. In about 1983 it became clear that the company wasn't going anywhere, and it put itself up for sale.
I was personally instrumental in that decision. Our investment banker pretended he was considering taking CCA public. CCA President Jim Rothnie showed us revenue projections. I asked how he had gotten them. He replied that he had taken the market size projection 5 years out, assumed 10%, and drawn a "plausible curve." However, I quickly got Socratic with him. "How many salesmen do you have?" "How much revenue does the average experienced salesman produce?" "How many experienced salesmen do you expect to have next year?" "How high do you think their average productivity can grow?" "Let us multiply." (Yes, I really said that. I can be a jerk. And anyway Jim was the sort of analytic guy one can say that to without giving serious offense.)
CCA was sold to a Canadian insurance company whose name I've now forgotten. Eventually, it was spun back out (perhaps after some intermediate changes of ownership), and resurfaced as primarily a data integration company, called Praxis.
In the real old days (mid 1970s, perhaps), Model 204 was resold by Informatics (later Informatics General, later the hostile takeover that became the guts of Sterling Software, which like so many other companies was eventually absorbed into Computer Associates). I know this because Richard Currier used to sell the product when he worked at Informatics. That probably makes Richard and me about the only two people who still remember the fact.
Hmm. I forgot to mention Intel's System 2000. Well, truth be told it was a dying product even back when I first became an analyst in 1981, and I recall nothing about it, except Gene Lowenthal's observation that Intel had had trouble selling chips and DBMS through the same salesforce. I think Al Sisto, who I probably met when he was head of sales at RTI (Relational Technology, Inc. -- later called Ingres), came out of that business, but I'm not 100% sure. I remember Pete Tierney from that RTI management team more clearly anyway, although that's mainly because we stayed in touch at subsequent companies over the years.
(Via Software Memories.)
There are a whopping 44,000 SAP customers running on Oracle databases, and IBM wants them. To get them, for the first time ever, it's optimized its enterprise database for a specific vendor's applications. The new version of DB2, 8.2.2, will include a slew of SAP-optimized features, including self-tuning, self-configuration, silent install, dynamic storage allocation and more.
Wouldn't SAP be better served by simply making their application database independent via ODBC? This process really could have commenced years ago and prevented today's dilemma: Your Partner has become Your most aggressive Competitor!
SAP tuned specifically for DB2, or SAP tuned likewise for Microsoft SQL Server, simply reeks of: "Same Sh*t different Pile". Microsoft and IBM will emulate Oracle in due course regarding their assault on SAP's market if DBMS specificity remains the SAP data access API strategy (this is a simple fact).
SAP should be using its quest for DBMS independence to stimulate or contribute ODBC enhancements (should ODBC be lacking in areas critical to its application needs; it is available in Open Source form and across all major platforms). Should the ODBC API not be the problem, then it can push ODBC Driver vendors (DBMS vendors such as IBM included) to get their Drivers in shape (should they be lacking, I know our ODBC Drivers are absolutely fine for this kind of task).
Database specificity gets application vendors nowhere. You can only control your business development destiny by being database independent. When applications are database independent, the intellectual capital that drives your applications is preserved. This is akin to building physical and logical firewalls around the ecosystem created by your products. This is much better than being a pseudo DBMS engine reseller for a future competitor.
I also hope that Oracle will support Mono off the bat, rather than sending the typical "we will port to Mono sometime in the future..." type of message, which will not be acceptable, especially as we pulled this off the first time around in 2002 (atop Mono as it was then). Thus, I am sure they can do it in 2005 :-)
Hopefully we should be able to add Oracle 10g Release 2 and DB2 to our SQL CLR hosting features comparison document that currently only covers SQL Server 2005 and Virtuoso.
DB2 users of PeopleSoft and IBM (the DB2 developer and vendor) suspect that Oracle will obviously try to use its ownership of PeopleSoft to covertly coerce DB2 users into becoming Oracle DBMS users. This strategy would take the form of new features and fixes discrimination as somewhat echoed in these excerpts:
"..In the crescendo surrounding the Oracle-PeopleSoft merger, one question has been repeatedly drowned out: What happens to users of PeopleSoft's DB2 database? Oracle chief Larry Ellison has repeatedly assured DB2 users--and IBM--that Oracle will continue to support DB2 and PeopleSoft's interfaces to IBM's WebSphere platform. But IBM isn't taking any chances, announcing an initiative to alter DB2 to work with products from Oracle rival SAP."
"..IBM has good reason to be concerned. Oracle vies with SAP as the leading vendor for enterprise applications, but it's under pressure to show concrete benefits from the merger by combining assets and pumping up revenue. One obvious tactic will be to use the PeopleSoft applications to steer enterprise customers toward the Oracle database by optimizing performance and features toward the Oracle back end."
If PeopleSoft's application core were ODBC based, the vulnerability to this predictable competitive tactic would at the very least be significantly alleviated. DB2 end-users, and IBM as the product vendor, would have a much stronger basis for countering Oracle by taking them to task about any claimed inability to implement new application functionality enhancements against DB2, especially as this would have morphed into a generic database issue as opposed to a DB2 specific issue -- by virtue of the application and data access layer separation provided by ODBC's architecture.
As indicated in an earlier post: IBM is clearly validating what we have done with Virtuoso (as was the case initially with their Virtual / Federated DBMS initiative ala DB2 Integrator). Here is an excerpt from today's eWeek article supporting this position:
To achieve maximum XML performance, bolstered indexing attributes in the technology will enable advanced search functions and a higher degree of filtering. IBM is also adding support for XPath and XQuery data models. This will allow users to create views that involve SQL and XQuery by sending the protocol through DB2's query optimizer for a unified query plan.
Virtuoso has been doing this since 2000; unfortunately a lot of
..So, using relational storage is inadequate for one reason or another, and IBM has concluded that another approach is necessary. The company’s next generation database will therefore have two storage engines: one relational store and one native XML store. And let me be quite clear about this: these engines will be completely separate, with separate tablespaces, separate indexes (Btrees and so forth on the one hand, and hierarchical on the other), and so on...
Hold on here! IBM only
How secure is your data? Looking at your information management resources through a would-be intruder's eyes can help you find (and fix) vulnerabilities.
Naturally :-)
When E. F. Codd developed his relational data model in 1970, the business world was a different place. Almost 35 years after his seminal work appeared, RDBMSs that sprung from Codd's ideas are the standard for storing corporate information. And, with government and industry regulations dictating what kinds of information companies have to store, manage, and audit (and for how long), protecting this information is more important than ever. Unfortunately, it's also more challenging.
Even in 1985, when Dr. Codd published 12 guidelines for RDBMSs, there was little concern for data security. In those days, gaining access to a database was so difficult that advanced security features on the database were irrelevant.
Today, RDBMSs carry the lifeblood of every organization. Note the use of the plural: Organizations now have many databases that are decentralized in terms of use and security controls. E-business demands that data access be extended to customers, partners, suppliers, and other parties who were rarely considered in the early data management days. With all this availability -- not to mention pressure from an array of government and industry regulations (see the sidebar, "Security and Compliance") -- the need to control exactly who can access or modify data is becoming paramount.
Absolute facts that are still only partially understood at best. For instance, we are still in a so-called "Information Age" in which standards based data access remains an issue of contempt instead of absolute necessity.
There are a number of prevailing myths about standards based data access that continue to cloak reality:
Even if the above were true (which I refute strongly), how about the general security vulnerabilities that affect both Native, and Standards compliant, data access interfaces?
Aaron's article does a good job of highlighting 6 areas of vulnerability:
What I have been able to do very quickly (thanks to blogging, and the power of a blog engine that supports WebDAV), is write a tabulated response to each of the items (bar Fixpaks) indicating how the OpenLink Multi-Tier Data Access Drivers (for ODBC, JDBC, ADO.NET, and OLEDB) protect corporate databases from each of these vulnerabilities.
To cut a long story short, we are increasingly living a contradiction where the terms "simple" and "free" are supposed to lead us to products that can adequately handle the challenges of an increasingly sophisticated grid of inter-connecting points.
I have been asked on numerous occasions, "How can you build a company and business based on data access technology?". My reply is the same as usual, "because everything comes down to data". If the data is compromised in any way, then kiss Information, Knowledge, and everything else goodbye!
Databases get a grip on XML
From InfoWorld.
The next iteration of the SQL standard was supposed to arrive in 2003. But SQL standardization has always been a glacially slow process, so nobody should be surprised that SQL:2003 -- now known as SQL:200n -- isn't ready yet. Even so, 2003 was a year in which XML-oriented data management, one of the areas addressed by the forthcoming standard, showed up on more and more developers' radar screens.
This article rounds up products for 2003 in the critical area of Enterprise Database Technology. It certainly provides an apt reflection of how Virtuoso compares with offerings from some of the larger (but certainly slower to implement) database vendors in this space. As usual, Jon Udell's quote pretty much sums this up:
"While the spotlight shone on the heavyweight contenders, a couple of agile innovators made noteworthy advances in 2003. OpenLink Software's Virtuoso 3.0, which we reviewed in March, stole thunder from all three major players. Like Oracle, it offers a WebDAV-accessible XML repository. Like DB2 Information Integrator, it functions as database middleware that can perform federated 'joins' across SQL and XML sources. And like the forthcoming Yukon, it embeds the .Net CLR (Common Language Runtime), or in the case of Linux, Novell/Ximian's Mono."
Albeit still somewhat unknown to the broader industry, we have remained true to our "innovator" discipline, which still remains our chosen path to market leadership. Thus, it's worth a quick Virtuoso release history and features recap as we get set to up the ante even further in 2004:
1998 - Virtuoso's initial public beta release with functional emphasis on Virtual Database Engine for ODBC and JDBC Data Sources.
1999 - Virtuoso's official commercial release, with emphasis still on Virtual Database functionality for ODBC, JDBC accessible SQL Databases.
2000 - Virtuoso 2.0 adds XML Storage, XPath, XML Schema, XQuery, XSL-T, WebDAV, SOAP, UDDI, HTTP, Replication, Free Text Indexing (*feature update*), POP3, and NNTP support.
2002 - Virtuoso 2.7 extends Virtualization prowess beyond data access via enhancements to its Web Services protocol stack implementation by enabling SQL Stored Procedures to be published as Web Services. It also debuts its Object-Relational engine enhancements that include the incorporation of Java and Microsoft .NET Objects into its User Defined Type, User Defined Functions, and Stored Procedure offerings.
2003 - Virtuoso 3.0 extends data and application logic virtualization into the Application Server realm (basically a Virtual Application server too!), by adding support for ASP.NET, PHP, Java Server Pages runtime hosting (making applications built using any of these languages deployable using Virtuoso across all supported platforms).
Collectively, each of these releases has contributed to a very premeditated architecture and vision that will ultimately unveil the inherent power of critical I.S. infrastructure virtualization along the following lines: data storage, data access, and application logic via coherent integration of SQL, XML, Web Services, and Persistent Stored Modules (.NET, Java, and other object based component building blocks).
From Wikipedia, the free encyclopedia.
Informix is a relational database and for almost 20 years was also the name of the company that developed it. Informix DBMS was a development of the pioneering Ingres system that also led to Sybase and SQL Server, and was the #2 database system behind Oracle for some time in the 1990s. Their brush with success was surprisingly short-lived however, and by 2000 a series of management blunders had all but destroyed the company. In 2001 they were purchased by IBM in order to gain access to Informix's existing market share and customer base. Long term plans to merge Informix technology with DB2 are in place, since the Informix Arrowhead project is now called DB2 Arrowhead. IBM is also committed to supporting older versions.
Ingres (technically, Advantage Ingres Enterprise) is, arguably, the forgotten database. There used to be five major databases: Oracle, DB2, Sybase, Informix and Ingres. Then along came Microsoft and, if you listened to most press comment (or the lack of it), you would think that there were only two of these left, plus SQL Server. [From IT-Director]
Oracle, Microsoft, and IBM would certainly like the illusion of a 3 horse race, as this is the only way they can induce Ingres, Informix, and Sybase users to jump ship, and this, even though database migrations are by far the most risk prone and problematic aspects of any IT infrastructure.
Here is the interesting logic from the self-made big three: if you want to take advantage of new paradigms and technologies such as XML, Web Services, and anything else in the pipeline, you have to move all your data out of these databases, and then get all the mission critical applications re-associated with one of their databases, and by the way, when you do so it is advisable that you use native interfaces (so that sometime in the future you have no chance whatsoever of repeating this folly at their expense).
The simple fact of the matter (which the self-made big three do not want you to know) is that you can put ODBC, JDBC, even platform specific data access APIs such as OLE DB and ADO.NET atop any of these databases, and then explore and exploit the benefits of new technologies and paradigms as long as the tool pool supports one or more of these standards.
Unfortunately the no-brainer above appears to be the more difficult of the choices before decision makers. In other words, many would rather dig themselves into a deeper hole (unknowingly, I can only presume) that ultimately leads to technology lock-in.
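To make the point about standards based access a little more concrete, here is a minimal sketch in Python. It is only an analogy: Python's DB-API 2.0 plays the role that ODBC or JDBC would play, and the built-in `sqlite3` module stands in for whichever driver you actually use. The `orders` table, the `demo_connect` factory, and the `top_customers` function are all hypothetical names invented for illustration. The application logic only ever sees a connection factory, so swapping the backend means swapping the factory, not the query code (parameter placeholder styles can differ between drivers, which this sketch ignores).

```python
import sqlite3
from typing import Any, Callable, List, Tuple


def demo_connect() -> sqlite3.Connection:
    """Hypothetical connection factory; sqlite3 stands in for any DB-API driver."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (name TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("acme", 250.0), ("globex", 90.0), ("initech", 120.0)],
    )
    return conn


def top_customers(connect: Callable[[], Any], min_total: float) -> List[Tuple]:
    """Application logic written only against the generic cursor interface."""
    conn = connect()
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT name, total FROM orders WHERE total >= ? ORDER BY total DESC",
            (min_total,),
        )
        return cur.fetchall()
    finally:
        conn.close()


# The caller chooses the backend by choosing the factory.
rows = top_customers(demo_connect, 100.0)
print(rows)
```

Nothing in `top_customers` names a specific database engine, which is the property the paragraph above argues for.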
The biggest challenge before any RDBMS based infrastructure today isn't which of the self-made big three to migrate to wholesale, rather, how to make progressive use of the pool of disparate applications, and application databases that proliferate the enterprise.
This is another way of understanding the burgeoning market for Virtual Databases, which in my opinion present the new frontier in database technology.
How did the database industry get started? How has it changed the face of business? What were the key milestones, the big obstacles and the lessons learned? I recently came across an interesting panel discussion addressing these very issues, featuring many of the database pioneers and leaders of the last 30 years:
Chris Date
http://www.sqlsecurity.com/DesktopDefault.aspx?tabindex=2&tabid=3
Are other SQL Servers (Sybase, Oracle, DB2) subject to SQL injection?
Yes, to varying degrees. Here is a site that can get you more details on some of the issues with other SQL Servers: http://www.owasp.org
What is SQL Injection and why is all this information not included in the regular FAQ?
SQL Injection is simply a term describing the act of passing SQL code into an application that was not intended by the developer. Since this topic is not specifically restricted to SQL Server it is not included in the normal FAQ. In fact, many of the problems that allow SQL injection are not the fault of the database server per se but rather are due to poor input validation and coding at other code layers. However, due to the serious nature and prevalence of this problem I feel its inclusion in a thorough discussion of SQL Server security is warranted.
What causes SQL Injection?
SQL injection is usually caused by developers who use "string-building" techniques in order to execute SQL code. For example, in a search page, the developer may use the following code to execute a query (VBScript/ASP sample shown):
    Set myRecordset = myConnection.execute("SELECT * FROM myTable WHERE someText ='" & request.form("inputdata") & "'")
The reason this statement is likely to introduce an SQL injection problem is that the developer has made a classic mistake: poor input validation. We are trusting that the user has not entered something malicious, something like the innocent looking single quote (').
Let's consider what would happen if a user entered the following text into the search form:
    ' exec master..xp_cmdshell 'net user test testpass /ADD' --
Then, when the query string is assembled and sent to SQL Server, the server will process the following code:
    SELECT * FROM myTable WHERE someText ='' exec master..xp_cmdshell 'net user test testpass /ADD'--'
Notice that the first single quote entered by the user closed the string, and SQL Server eagerly executes the next SQL statements in the batch, including a command to add a new user to the local accounts database. If this application were running as 'sa' and the MSSQLSERVER service were running with sufficient privileges, we would now have an account with which to access this machine. Also note the use of the comment operator (--) to force SQL Server to ignore the trailing quote placed by the developer's code.
Very interesting that these are all Native Interface based exploits. So the security issue isn't ODBC, JDBC, ADO.NET, or OLE DB specific (although they certainly increase the potential damage that can be unleashed via metadata analysis en route to that huge Cartesian Product; the mother of all Exploits!). Our Session Rules Book was devised in 1993 with many of these issues in mind, and to this date there are no other ODBC/JDBC/OLE DB products out there that even come close to acknowledging this reality.
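The string-building problem quoted above is easy to reproduce outside of ASP. Here is a minimal sketch in Python using the built-in `sqlite3` module as a stand-in database; it is not the FAQ's own code, and it is not SQL Server (there is no `xp_cmdshell` here). The `myTable` and `someText` names come from the quoted sample, while the two `search_*` functions and the sample rows are assumptions made for illustration. It contrasts concatenated SQL with a parameterized query, where the driver passes the user text as data rather than as SQL.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory table named after the quoted sample."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (someText TEXT)")
    conn.executemany("INSERT INTO myTable VALUES (?)", [("alpha",), ("beta",)])
    return conn


def search_unsafe(conn: sqlite3.Connection, user_input: str) -> list:
    # String building: the user's text becomes part of the SQL statement itself.
    sql = "SELECT someText FROM myTable WHERE someText = '" + user_input + "'"
    return conn.execute(sql).fetchall()


def search_safe(conn: sqlite3.Connection, user_input: str) -> list:
    # Parameter binding: the user's text is only ever treated as a value.
    sql = "SELECT someText FROM myTable WHERE someText = ?"
    return conn.execute(sql, (user_input,)).fetchall()


conn = build_demo_db()
malicious = "' OR '1'='1"  # closes the string literal, then adds an always-true test

unsafe_rows = search_unsafe(conn, malicious)
safe_rows = search_safe(conn, malicious)
print(len(unsafe_rows), len(safe_rows))
```

With the concatenated version the injected condition matches every row in the table; with the bound parameter the same input is just an odd search string that matches nothing.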
IBM TO SHIP DB2 INTEGRATION SOFTWARE
Posted May 15, 2003 4:46 PM Pacific Time
IBM on Tuesday plans to announce availability of its DB2 Information Integrator software, for integrating and analyzing multiple forms of information, the company acknowledged on Thursday.
In beta since February, the software is intended to enable customers to manage centrally data, text, images, photos, video and audio files stored in different databases, according to IBM. XML content and Web services also are supported.
Interesting Quote:
"If we move to information as a utility for giant data grids, this is key technology for hiding or making unimportant the location and type of data. This software enables the data to be accessed transparently wherever it might be," Jones said.
Product Pricing
DB2 Information Integrator will be available for $20,000 per processor and $15,000 per data source connector.
Detail will also be available on Tuesday.
The cost for a bulk adapter license is about $75,000. If change capture is involved, the adapter license costs about $150,000. Real-time integration costs are mips-based, with a starting cost of about $300,000. One adapter can be used to translate and make native calls to all environments.
Very interesting pricing!
For the full story: http://www.infoworld.com/article/03/05/15/HNdb2integrate_1.html