SAND DNA Overview

I spoke with Linda Arens, VP Alliances and Marketing for SAND. Here’s a summary of the discussion:

  • SAND is based in Montreal and is listed on the OTC (SNDTF.OB). They have about 50-60 employees.
  • They started in 1983 as a hardware company, but have remade themselves with proprietary storage software purchased from Lockheed Martin.
  • They have a number of products listed on their web site, but according to Linda they are all based on the same underlying foundation, hence the “DNA” moniker.
  • At the core is a column-based, high-compression database management system (DNA Analytics). Linda said they advertise 90% compression rates, but see mid-90s and above in real-world scenarios. The database runs on a variety of operating systems (HP, Sun, Linux, Windows) and hardware platforms. This is not a full-stack appliance, but a special-purpose database management system.
  • SAND has pre-built integration with SAP BI, Oracle, IBM DB2, and Microsoft, and can operate behind or in conjunction with any of them.
  • The DNA Access component is intended to offload infrequently used data from the primary server, while still allowing seamless on-line access to the data. The big advantage of this is zero impact on existing user applications. This sounded similar to the way Dataupia operates, working behind the scenes to accelerate queries. DNA Access does allow direct queries, either using SQL or via an ODBC-compliant access tool (she mentioned Business Objects, Microstrategy, and Cognos).
  • SAND does have a small professional services group that focuses primarily on the setup and installation of their product. She said they leave the systems integration work to partners such as Accenture.

In their most recent financial results announcement, they touted a deal with P&G for the SAP BI product. Given the volumes in some SAP systems this capability would seem to be a competitive advantage. But I don’t know if SAND has anything else that differentiates them from the host of companies that are in this space now. Looking at their financial report, they listed $1.9M in revenue for the past quarter, which says they are still in their infancy in terms of actual sales. Given the amount of venture capital flowing into this sector, they will have a tough time getting traction.

Data Warehouse Appliance Comparison

I’ve added some information recently to the appliance spreadsheet, and figured it was time to repost. I made the following changes:

  • Added PANTA to the “Hardware” category.
  • Updated information on SAND based on my discussion today with Linda Arens (VP Alliances and Marketing)
  • Updated information on ParAccel based on my discussion last week with Kim Stanick (VP Marketing)

 

The information has also been added to the 360DegreeVendor site.

 

Data Warehouse in the Clouds

I’ve been hearing and reading a lot lately about “cloud computing”.  Information Week in particular has run several articles on the topic, including last week’s “Guide to Cloud Computing”.  Most of these articles have been about general-purpose platforms, with the Amazon, Google, and Salesforce.com offerings getting the most press.

On the surface, computing in the cloud looks very similar to the hosted service offerings that sprang up in the mid-90s, led by companies such as Digex.  I believe fundamentally it is, with the exception of well-known on-line brands entering the market.  And with the exception of connection speed and reliability, most of the issues are still in play, namely security, performance, and environment change control.  The benefits touted are lower TCO, faster time to market, and solution scalability.

The Data Warehouse in the Cloud (DWC) concept is just starting to take hold, led by such companies as Vertica (in partnership with Amazon).  I can see the benefits of going this route, most notably time to market.  Connection speeds might still be an issue, particularly in the case of large data load files.  And security will always be an issue, particularly with sensitive customer data.  TCO is often presented as a plus for the DWC, but it’s not that straightforward.  Factors such as initial hardware & software costs, data center operational costs, labor, and upgrade costs must all be included in the mix.

In short, the DWC is a viable alternative, particularly for a company with the following characteristics:

  • Limited data center resources - hardware and/or operations staff are tapped out, and you’d need to add significant capacity to deploy a new data solution
  • Deploying a data solution on a “new” platform - you’re an Oracle shop but planning to deploy on Vertica
  • Dispersed users - your user base is geographically spread out, and access to corporate systems is through a variety of network channels
  • You’re a mid-sized company and don’t have access to volume hardware and software discounts

 

But remember, it’s like leasing a car.  You never get rid of that payment.

Buzzword: “Column-store database”

There has been a copious amount written recently about the advantages (and disadvantages) of column-store databases, so I thought I’d do a little research to find out what the noise was about.  After all, Sybase IQ has been around for a decade now, touting the benefits of column storage and compression.  Vertica seems to be making the most noise out there now, with Michael Stonebraker leading the charge.  But there are a number of column-based vendors out there these days (see my data warehouse appliance spreadsheet), among them Kickfire, Calpont, InfoBright, and ParAccel, so this is obviously not just a lone-wolf situation.

Breaking this down, the concept is really simple - you are storing your data in what you would typically think of as an “index” in the row based world.  Storing your data in this manner gives you two big advantages:

  1. You can achieve much higher compression rates, since the likelihood of encountering repeating values within one column is much higher than within a row.
  2. Because typical analytical queries access a small number of columns, you can skip all the other columns entirely, which provides a huge performance boost.
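
Here is a minimal sketch of both advantages. The table, its column names, and the choice of run-length encoding are my own assumptions for illustration; this is not how any particular vendor stores or compresses data.

```python
from itertools import groupby

# Hypothetical sample table, stored row by row.
rows = [
    ("2008-01-01", "East", "Widget", 10),
    ("2008-01-01", "East", "Widget", 12),
    ("2008-01-01", "East", "Gadget", 7),
    ("2008-01-02", "West", "Gadget", 7),
    ("2008-01-02", "West", "Widget", 3),
    ("2008-01-02", "West", "Widget", 5),
]
names = ("sale_date", "region", "product", "units")

# Column layout: one list per column instead of one tuple per row.
columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Advantage 1: repeats cluster within a column, so it compresses well.
encoded_region = run_length_encode(columns["region"])

# Advantage 2: an aggregate over one column never touches the others.
total_units = sum(columns["units"])

print(encoded_region)
print(total_units)
```

Six region values collapse to two pairs, and the sum reads only the units list. Encoding a whole row the same way gains nothing, because neighboring values within a row rarely repeat.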

As with any technological approach, there are downsides:

  1. In the case of operational reporting, you can actually see performance degradations since you’re typically reading across rows as opposed to down columns.
  2. Writing data in a row format is quicker, which is important in a low-latency reporting environment where trickle-feeds or other near-real time updating is required.

This discussion is similar to the “dimensional model vs 3rd normal form” debate.  I don’t think there is a right or wrong answer.  You need to understand how your users are accessing the data and the loading requirements before making a decision.

360DegreeVendor Release

I just released a new version of my vendor web site, 360DegreeVendor.  In addition to the vendor search capability and the analyst web crawling feature, I’ve added the option to search by offering (product, service, or research), and by venture capital firm.  You can also filter by open source offerings.

Please feel free to send me feedback if you have comments or questions after using the site.

Buzzword: “EAI”

Although Enterprise Application Integration (EAI) is not usually considered part of the business intelligence and data management sandbox that I play in, it is useful to discuss how it fits into the overall data integration and data delivery picture.  EAI technologies are used to keep systems in sync, and to provide a virtual layer on top of multiple underlying systems to present a consolidated view of data to an end user.

There are two primary architectures: hub-and-spoke, and point-to-point.  In a hub-and-spoke model, all systems are connected to a central “routing” point, and all transactions and inquiries flow through that central router to the required system(s).  The point-to-point model employs a directory that allows for systems to talk directly to one another.  Both have their advantages and disadvantages.  Hub-and-spoke reduces the number of interfaces for a given system, but transaction latency can suffer if the central “router” becomes overloaded.  Point-to-point provides the fastest response times, but the number of interfaces can become unmanageable for a large number of systems.
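
To put rough numbers on the interface trade-off, here is a small sketch. It assumes the worst case for point-to-point, where every pair of systems is connected directly; real deployments usually connect only the pairs that need to talk, so treat the second figure as an upper bound. The function names are mine.

```python
def hub_and_spoke_interfaces(systems: int) -> int:
    """Each system needs a single interface, to the central router."""
    return systems


def point_to_point_interfaces(systems: int) -> int:
    """Worst case: every pair of systems is connected, n * (n - 1) / 2."""
    return systems * (systems - 1) // 2


for n in (5, 10, 50):
    print(n, hub_and_spoke_interfaces(n), point_to_point_interfaces(n))
```

At 5 systems the gap is small (5 versus 10 interfaces), but at 50 systems it is 50 versus 1,225, which is where point-to-point becomes hard to manage.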

Now let’s tie this all back to the business intelligence world.  We’ve discussed Virtual Data Integration before, and if you break VDI down it’s basically an EAI technology.  The focus is on providing a consolidated view of two or more underlying systems, and it accomplishes this by building interfaces into these systems.   Sounds like EAI to me.

Gerson Lehrman Group

I was talking to a friend of mine, Neil Moses (President of Rogomo), about potentially tying in experts from his site to 360DegreeVendor. He’s focusing his efforts on the tutoring market, but mentioned a company called Gerson Lehrman that focuses on providing high-end consulting advice on a per-hour or per-issue basis. They offer a broad range of services, including vertical specialties such as Telco and Health Care, as well as horizontal services such as Technology. They advertise over 5,000 service providers in the Technology space, so I did some investigating and found a fair number of providers in business intelligence (262), data warehouse (76), and data management (66) “study groups”. I sent in an inquiry about a partnership or referral program (they didn’t mention one on their web site). I’d like some information on this company if anyone has worked with them (or used their services) in the past.

Buzzwords: “ETL” (and “ELT”)

Extract, Transform, and Load (ETL) is another one of those terms that was very useful as recently as 3-5 years ago, but now has lost some of its value for some of the same reasons “data warehouse” has become archaic - it describes a component of the architecture that has sprawled outside the bounds defined by this term. Ten years ago, ETL accurately described the majority of the processing required to load data into a data warehouse, data mart, etc. The term mimics the process used to accomplish this task, namely “extracting” (or receiving) data from source systems, “transforming” the data by applying business rules, data cleansing, and other manipulations on the data sets, and then “loading” the data into the target data store. In a predominantly batch world, this is a fine way to handle the loading process, and in fact a significant amount of data is still being loaded in this manner today.

But things are changing - the analytical world today is not so straightforward, and as a result the loading requirements have become much more complex. And in fact the biggest change is the move away from a physical “load” of data, to a virtual integration of data. Virtual data integration (VDI) is not always the answer, but in a business climate that increasingly rewards real-time feedback, VDI provides visibility into the business operations as they occur, not days or even hours later.

One final word about “ELT” - extract, load, and transform. I’ve heard some vendors (and analysts) talk up ELT as a new and improved ETL, but in my mind this is just an architectural choice. It makes little difference if you load the data before transforming, or the other way around.
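
A small sketch of the two orderings, using Python’s built-in sqlite3 module as a stand-in for the target database. The source records, table names, and cleansing rules are hypothetical, chosen only to show where the transform step runs in each case.

```python
import sqlite3

# Hypothetical source records; names and cleansing rules are illustrative only.
source = [(" alice ", "42"), ("BOB", "17"), ("carol", None)]


def transform(name, amount):
    """Business rules: trim and capitalize names, default missing amounts to 0."""
    return name.strip().title(), int(amount or 0)


def etl(conn):
    """ETL: transform in the integration layer, then load the clean rows."""
    conn.execute("CREATE TABLE etl_target (name TEXT, amount INTEGER)")
    clean = [transform(*record) for record in source]
    conn.executemany("INSERT INTO etl_target VALUES (?, ?)", clean)
    return conn.execute(
        "SELECT name, amount FROM etl_target ORDER BY name"
    ).fetchall()


def elt(conn):
    """ELT: load the raw rows first, then transform inside the database."""
    conn.execute("CREATE TABLE staging (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", source)
    conn.execute(
        "CREATE TABLE elt_target AS "
        "SELECT UPPER(SUBSTR(TRIM(name), 1, 1)) || LOWER(SUBSTR(TRIM(name), 2)) AS name, "
        "CAST(COALESCE(amount, '0') AS INTEGER) AS amount "
        "FROM staging"
    )
    return conn.execute(
        "SELECT name, amount FROM elt_target ORDER BY name"
    ).fetchall()


conn = sqlite3.connect(":memory:")
etl_rows = etl(conn)
elt_rows = elt(conn)
print(etl_rows == elt_rows)
```

Both paths end with the same rows in the target. The difference is where the transformation work runs: in the integration layer for ETL, or in the database engine for ELT.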

Leadership

I just finished reading the book “Beyond Band of Brothers - The War Memoirs of Major Dick Winters”, and wanted to share Dick’s list of ten leadership principles.  I’m a big fan of military stories, and have read the book and have seen the mini-series “Band of Brothers” multiple times.  There are examples of each of these principles throughout the story, but this is the first time I’ve seen them listed in one place.   Everyone can benefit from becoming a better leader, in both their professional and personal lives.

  1. Strive to be a leader of character, competence, and courage.
  2. Lead from the front.  Say, “Follow Me!” and then lead the way.
  3. Stay in top physical shape - physical stamina is the root of mental toughness.
  4. Develop your team.  If you know your people, are fair in setting realistic goals and expectations, and lead by example, you will develop teamwork.
  5. Delegate responsibility to your subordinates and let them do their jobs.  You can’t do a good job if you don’t have a chance to use your imagination or your creativity.
  6. Anticipate problems and prepare to overcome obstacles.  Don’t wait until you get to the top of the ridge and then make up your mind.
  7. Remain humble.  Don’t worry about who receives the credit.  Never let power or authority go to your head.
  8. Take a moment of self-reflection.  Look at yourself in the mirror every night and ask yourself if you did your best.
  9. True satisfaction comes from getting the job done.  The key to a successful leader is to earn respect - not because of rank or position, but because you are a leader of character.
  10. Hang Tough! - Never, ever, give up.

Buzzword: “CPM” and “BPM”

Corporate Performance Management (CPM) and Business Performance Management (BPM) have gained significant visibility in the past 5 years.  Although there are purists that would argue these are different, I’m lumping them together because from a consumer perspective I don’t think there are appreciable differences.  Dashboards have also gotten a lot of press recently, and often get lumped into the CPM discussion.  But I see dashboards as an implementation option within a larger CPM initiative.

So what is CPM/BPM?  Essentially, it’s using metrics and KPIs to measure and improve the business.  A successful CPM solution must be driven from the executive ranks down through the business to technology.  It’s DOA if it starts in IT, and has little chance when germinating within a particular business unit such as Finance.  The reason for this is simple: the goal of CPM is to improve the business by taking measurements at various levels (starting at the top), setting thresholds, and managing to exceptions.   Now I don’t necessarily agree with this approach to running a business, particularly if this is presented as the silver bullet.  This approach often fosters a very reactive culture within the business, but coupled with executive direction and sponsorship, and properly defined non-punitive measures, CPM can provide significant visibility into the operations of the organization.
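
The “set thresholds and manage to exceptions” mechanic can be sketched in a few lines. Every metric name, level, and value below is invented for illustration, and the Kpi class is my own construct, not part of any CPM product.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str
    level: str                     # e.g. "corporate" or "business unit"
    actual: float
    threshold: float
    higher_is_better: bool = True

    def is_exception(self) -> bool:
        """A KPI is an exception when it lands on the wrong side of its threshold."""
        if self.higher_is_better:
            return self.actual < self.threshold
        return self.actual > self.threshold


# Hypothetical metrics and values, invented purely for illustration.
kpis = [
    Kpi("gross margin %", "corporate", actual=31.0, threshold=35.0),
    Kpi("on-time delivery %", "business unit", actual=97.0, threshold=95.0),
    Kpi("days sales outstanding", "business unit", actual=52.0, threshold=45.0,
        higher_is_better=False),
]

# Managing to exceptions: only the out-of-bounds metrics get attention.
exceptions = [kpi.name for kpi in kpis if kpi.is_exception()]
print(exceptions)
```

The code is the easy part. Deciding which metrics matter, where the thresholds sit, and what happens when one trips is the executive-level work described above.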

That being said, I can’t stress enough the importance of not driving this from the IT side of the house.  Particularly insidious is the temptation to follow vendor claims of implementing a CPM solution (usually just a tricked out dashboard).  If the executive suite doesn’t buy in and drive this, the technical solution is a moot point.