Legacy applications can benefit from big data approaches without the need to replace the legacy architecture with new technologies. Instead, big data can augment the application by collecting higher volume, variety, and velocity data about the user’s activity in the application. Analysis of this data can show decision makers where there may be problems with the work-products. Correspondingly, it can provide requirements analysts with information about where improvements are needed, or with a more complete library of edge cases to consider for new designs.
When we look to data technology to solve problems, we should permit the technologies to identify the problems that can be solved with the current capabilities instead of demanding that the technologies evolve to solve the hard problems we have been working on. There are many opportunities to make progress even if we don’t touch the hard problems. Allowing technology to solve what it can solve now may narrow the hard problems, or possibly even make them less visible. For example, there are ways we can improve overall life expectancy without curing any cancers, perhaps with investments in areas unrelated to health care. It is our nature to focus on objectives that catch our attention. This focus can blind us to immediate opportunities that are realistic given our current situation.
If someone wants to cause trouble for the big data owner, they can exploit the known missing data to raise accusations against which the owner will have no data to use in defense. The accusations can suggest cheating, fraud, criminal activities, etc. that can harm reputations or invoke costly and lengthy investigations, keeping the owner from realizing the potential benefits of the big data analytics.
These examples set a new standard for rapid access to context information to accompany the new information for breaking news. In the case of street maps and aerial/street views, this information required extensive investment long before the event occurred. In the case of the more recent information (street congestion, weather radar imagery, landslide risk assessments), there was a need for prior investment in models and technologies to provide this information on a timely basis. These investments were made on a global scale, where the vast majority of this readily available capability may never be needed for matching with a breaking news story. But when a breaking news story does occur, we welcome the ready access to this information specific to the broader context of the story.
With the enthusiasm for personal health diagnostic tools that connect automatically (such as through smart phones) to health data vaults, there is a tremendous opportunity to undermine HIPAA privacy protections by secretly encoding the individual’s contact information within the measurement data using steganography techniques. The market for cracking HIPAA protections either for private gain…
Governance involves regulation of some sort, but that regulation would have to operate at as high a frequency as the analytic tools in order to separate the good forms of spoofing from the bad forms. Regulation is not that responsive, so governance ends up being the sluggish, and potentially harmful in the end, categorical outlawing of spoofing.
The enthusiasm for the benefits of big data comes from widely promoted reports of past successes. The promise of big data techniques is that they can provide similar successes in other contexts. Big data involves volume, velocity, and variety. The volume and velocity depend on automated queries and report building. The variety introduces the opportunity for new benefits. The combination of automation and the opportunity from variety is what makes re-identification possible or even very likely.
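To make the re-identification point concrete, here is a minimal sketch of how an automated join across varied data can single out individuals. Everything in it is hypothetical and invented for illustration: the records, the field names (zip, birth_year, sex), the public roster, and the reidentify helper. It is not a description of any particular incident or tool, only one simple way the mechanism could work.

```python
from collections import defaultdict

# Hypothetical "de-identified" records: names are removed, but
# quasi-identifying attributes are kept for analysis.
deidentified = [
    {"zip": "20001", "birth_year": 1970, "sex": "F", "finding": "A"},
    {"zip": "20001", "birth_year": 1970, "sex": "M", "finding": "B"},
    {"zip": "20002", "birth_year": 1985, "sex": "F", "finding": "C"},
    {"zip": "20002", "birth_year": 1985, "sex": "F", "finding": "D"},
]

# Hypothetical separate data set that carries names with the same attributes.
public_roster = [
    {"name": "Person 1", "zip": "20001", "birth_year": 1970, "sex": "F"},
    {"name": "Person 2", "zip": "20001", "birth_year": 1970, "sex": "M"},
    {"name": "Person 3", "zip": "20002", "birth_year": 1985, "sex": "F"},
    {"name": "Person 4", "zip": "20002", "birth_year": 1985, "sex": "F"},
]


def reidentify(records, roster, keys):
    """Return {record index: name} for records whose attribute
    combination matches exactly one person in the roster."""
    index = defaultdict(list)
    for person in roster:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = {}
    for i, rec in enumerate(records):
        candidates = index.get(tuple(rec[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match means re-identified
            matches[i] = candidates[0]
    return matches


# With two attributes, every record still has two candidates.
few = reidentify(deidentified, public_roster, ("zip", "birth_year"))

# Adding one more attribute (more variety) makes some matches unique.
more = reidentify(deidentified, public_roster, ("zip", "birth_year", "sex"))

print(few)
print(more)
```

In this toy data, joining on two attributes identifies no one, while adding a third attribute uniquely identifies two of the four records. The join itself is trivial to automate, which is why each additional kind of data raises the odds of a unique match.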