“I just want to say one word to you. Just one word.
Are you listening?
Yes, I am.”
- The Graduate
We are undergoing a renaissance in the business intelligence space.
It started with companies like QlikTech and Tableau, which sought to empower departments and analysts frustrated with the lengthy, inflexible, and costly implementations of monolithic, traditional BI solutions. This movement is often referred to as ‘self-service BI’, part of the overall ‘consumerization of enterprise software’ trend.
While I am excited by the flurry of activity and innovation in this space, there are glaring gaps that have yet to be addressed.
Furthermore, a major challenge for SaaS-based BI vendors is convincing customers to upload their data (typically located on-premise) to the cloud for analysis, and making it easy for them to do so. This has been such a problem that some vendors who started off with cloud-based solutions have pivoted to on-premise deployments only.
Given these challenges, someone should create ‘data management as a service’.
The idea is fairly simple (although there are some technical challenges to be sure):
- A customer stages documents (.csv, .txt, .xls, .xml, etc.) in a local folder
- The documents are uploaded to the cloud via a Dropbox-like service
- After receiving notification that the documents have been successfully uploaded and processed, the customer logs in for review
- The service identifies patterns across documents and suggests rules to be applied (normalization, classification, error handling/correction, etc.)
- The customer accepts/rejects/modifies these suggestions and defines data transformation rules as well as output format and output location
- The customer saves this ‘job’ and specifies how frequently it gets run (daily, weekly, monthly, whenever a new file is added to the local staging folder, etc.)
- The service is priced according to the amount and frequency of data processed with additional services available such as archiving, advanced processing/transformation, multiple output locations, etc.
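To make the steps above a little more concrete, here is a minimal sketch in Python of the suggest/accept/save/run loop for a single kind of rule: normalizing inconsistent column headers across uploaded CSV files. Everything here is hypothetical (the `Suggestion` and `Job` classes, the function names, the schedule strings); it illustrates the shape of the idea, not how any real service works, and a real service would need far richer pattern detection.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Suggestion:
    """A rule the service proposes after scanning documents (hypothetical)."""
    column: str          # header exactly as it appears in a document
    canonical: str       # normalized header the service proposes
    accepted: bool = False


@dataclass
class Job:
    """A saved job: accepted rules plus a schedule (hypothetical)."""
    rules: dict = field(default_factory=dict)  # raw header -> canonical header
    schedule: str = "daily"                    # e.g. "daily", "weekly", "on_new_file"


def suggest_header_rules(documents):
    """Propose one normalization rule per header that is not already canonical."""
    suggestions, seen = [], set()
    for text in documents:
        header = next(csv.reader(io.StringIO(text)), [])
        for name in header:
            canonical = name.strip().lower().replace(" ", "_")
            if name != canonical and name not in seen:
                seen.add(name)
                suggestions.append(Suggestion(name, canonical))
    return suggestions


def run_job(job, text):
    """Apply the job's accepted header rules to one CSV document."""
    rows = list(csv.reader(io.StringIO(text)))
    if rows:
        rows[0] = [job.rules.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Two staged documents whose headers disagree in case and spacing.
docs = [
    "Customer Name,Amount\nAcme,10\n",
    "customer_name, Amount\nBolt,20\n",
]
suggestions = suggest_header_rules(docs)
for s in suggestions:
    s.accepted = True  # in this example the customer accepts every suggestion

job = Job(
    rules={s.column: s.canonical for s in suggestions if s.accepted},
    schedule="on_new_file",
)
print(run_job(job, docs[1]))
```

The point of keeping the accepted rules in a plain mapping is that the saved job can be re-run against every new file that lands in the staging folder without the customer reviewing it again.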
‘Data management as a service’ would solve two major points of friction around self-service BI:
- Data access – Previously, users were beholden to IT to access ‘system of truth’ data. They also needed IT or outside consulting services to format data before analysis could begin. With this service, users could exploit ‘good enough’ data they obtained through various means (spreadsheets, reports, screen-scraping, etc.) and control the frequency of data updates.
- Choice – Companies that want to use SaaS-based BI solutions could specify that the output be pushed to a specific location in the cloud instead of back to a local folder. In fact, part of the transformation could be abstracting or masking sensitive data, which would address concerns about storing this type of data outside the firewall. Customers could also store their prepared data independently of their BI solution to avoid vendor lock-in.
I could even see having pre-defined output formats according to the BI tool of choice (e.g., a QlikView format).
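As a rough illustration of the masking step mentioned under ‘Choice’, the sketch below replaces the values in selected columns with a keyed hash before the file leaves the firewall. The function name `mask_columns`, the column names, and the 16-character token length are all assumptions made for this example. Keyed hashing is just one possible approach to masking; it is not necessarily what a real service would choose.

```python
import csv
import hashlib
import hmac
import io


def mask_columns(text, sensitive, secret):
    """Replace values in the named columns with a keyed hash (hypothetical helper).

    The same input value always maps to the same token, so masked columns
    can still be grouped or joined on, but the raw value is not uploaded.
    """
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in sensitive:
            if col in row:
                value = (row[col] or "").encode("utf-8")
                digest = hmac.new(secret, value, hashlib.sha256).hexdigest()
                row[col] = digest[:16]  # shortened token; the length is an arbitrary choice
        writer.writerow(row)
    return out.getvalue()


# The secret key would stay on-premise; only the masked output is pushed to the cloud.
masked = mask_columns("name,ssn\nAnn,123\nBob,123\n", ["ssn"], b"local-secret")
print(masked)
```

Because the key never leaves the customer's side, the cloud copy carries tokens rather than the original values, while analysts can still count or group by the masked column.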
What do you think about ‘data management as a service’? Does something like this already exist? Let me know your thoughts.