Big Data Defined

April 1, 2013 Guest Blogger

Recently, we held a webinar on “Analytics in E-Discovery: Addressing Big Data Challenges,” in which we discussed which advanced analytical tools can help address the issues caused by Big Data, and how. Those issues include unnecessary time and money spent on managing non-relevant information, the need to adjust processes, workflow challenges, and the technological limitations in finding and assessing information.

Big Data is the buzzword du jour—but it has been around long enough to create some misunderstanding in the e-discovery industry. Big Data is much more than a large collection of data. According to Wikipedia, Big Data is “a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.”

Big Data is also an opportunity to find insights in new and emerging data types and content, answering questions previously considered unanswerable because there was no practical way to harvest that data. With a host of new technologies and strategies, organizations can mine and analyze Big Data for market research and trending, pricing optimization, sales forecasting, portfolio risk mitigation, and many other use cases that support a particular business decision, such as entering a new market or leaving an existing one.

In an e-discovery context, Big Data should be treated no differently; the key is finding cost-effective ways to extract valuable information from the subsets of potentially useful data buried among the morass of bits and bytes – turning Big Data into an asset, not a liability. Advanced analytical tools, ranging from email threading and analysis to technology-assisted review (TAR), can help organizations analyze the content of, and extract value from, Big Data. The most obvious deployment of these tools in e-discovery is to identify and code documents that are responsive to discovery requests.

Big Data has a number of other legal applications as well. Organizations can use TAR and other analytical tools to mine Big Data for insights early in a matter, supporting early case assessment and allowing parties to develop a case strategy and determine the appropriate settlement posture. These tools can also be used proactively, well before litigation or a regulatory investigation is initiated, to sort through an organization’s data store, organize it, and code it on an ongoing basis, preparing it for easier retrieval and defensible deletion as part of an information management strategy. In addition, Big Data can be analyzed for risk management purposes, spotting potential issues and risks based upon the analytics gleaned from the data. For example, advanced analytics can be used to identify patterns or correlations in data that may indicate a violation of law that might otherwise go unnoticed.

In short, Big Data should not be thought of in terms of zettabytes and terabytes; rather, it should be conceived of as a treasure trove of useful information, replete with seemingly limitless insights and innovation to help organizations comply with their legal obligations. In this context, perhaps a better name for Big Data would be Big Opportunity.

Dean Kuhlmann is vice president, business development at Lateral Data, a Conduent company. He can be reached at

