Electronic discovery (or e-discovery) and its usage in China

Today I am introducing a new theme, namely “electronic discovery”. I do so not only because we use electronic means and documents in our work every day and need to become more familiar with them, but also because in certain circumstances it is necessary to prepare and gather information in electronic format for use in a specific case.

In a broad sense

Electronic discovery, or e-discovery, refers to discovery (i.e. the pre-trial phase in a lawsuit in which each party, through the law of civil procedure, can obtain evidence from the opposing party by means of discovery devices including requests for answers to interrogatories, requests for production of documents, requests for admissions and depositions) that deals with the exchange of information in electronic format (often referred to as “electronically stored information” or ESI). This data is subject to local rules and agreed-upon processes, and is often reviewed for privilege and relevance before being turned over to opposing counsel.

Electronic information is considered different from paper information because of its intangible form, volume, transience and persistence. Electronic information is usually accompanied by “metadata”. The term metadata refers to “data about data” that is not found in paper documents, and that can play an important part as evidence (for example, the date and time a document was written could be useful in a copyright case).

The Convergence of e-discovery and Information Governance

The rapid growth of electronically stored information (ESI) and its increasing relevance in litigation and regulatory matters are compelling organizations to reassess how business-critical information is managed. As the market moves from reactive to proactive e-discovery management, “Information Governance” (IG) has gained new relevance and priority as a framework for coping with these challenges.

What is Information Governance?

Information Governance is focused on dealing with the many complexities that come along with storing, managing and accessing electronically stored information (or “ESI”) in the digital age. It incorporates the full information lifecycle, accounting for an organization’s regulatory and legal risks, as well as other environmental and operational requirements.

Today, lawyers who need to find evidence to help a client prove a given fact can no longer shy away from embracing e-discovery tools. In fact, there is an increased incentive for lawyers to apply advanced e-discovery techniques in order to perform early case assessment, discover key evidence and put together their case strategy as quickly as possible.

While there have been significant investments made in e-discovery technology and people, many lawyers don’t yet fully understand how to manage or prepare for it, meaning this valuable investment is being wasted. However, with the explosion of the volumes and types of electronic evidence, lawyers can no longer justify applying techniques from the pre-digital world. Since they don’t understand the technology, lawyers must instead rely on intermediaries, but the further lawyers get from the evidence, the less effective and less valuable they become to their clients.

The uptake of and urgency for information governance in an organization often starts with a triggering event: a sudden e-discovery requirement for litigation, an internal investigation, or a regulatory matter. This unforeseen, expensive event often causes businesses to wake up and recognize the need to manage their data proactively. Some businesses now use triggering events as an opportunity to implement processes immediately around information governance, particularly defensible deletion.

Law firms need to understand the issues and the available solutions so they can advise their clients on the right course of action towards better data management. A good starting point is to answer questions around legality, such as where data resides and when it is legal to delete it, as well as issues around security and accessibility of information. The most compelling approach is to move away from fearing deletion, instead providing proactive, tactical and actionable steps for managing data.

Collection of data:

Once documents have been preserved, collection can begin. Collection is the transfer of data from a company to its legal counsel, who will determine the relevance and disposition of the data. Some companies that deal with frequent litigation have software in place to quickly place legal holds on certain custodians when an event (such as a legal notice) is triggered and to begin the collection process immediately. Other companies may need to call in a digital forensics expert to prevent the spoliation of data. The size and scale of this collection is determined by the identification phase.

Organizations habitually over-retain information, especially unstructured electronic information, for all kinds of reasons. Many have simply not addressed what to do with it, so they fall back on individual employees to decide what should be kept, for how long, and what should be disposed of. At the opposite end of the spectrum, a minority of organizations have tried centralized enterprise content management systems and found them difficult to use, so employees find ways around them and end up keeping huge amounts of data locally on their workstations, on removable media, in cloud accounts or on rogue SharePoint sites that are used as “data dumps” with little or no records management or IT supervision. Much of this information is transitory, expired, or of questionable business value. Because of this lack of management, information continues to accumulate. This build-up raises the cost of storage as well as the risk associated with e-discovery.

Figure 1 below shows that as data ages, the probability of reuse goes down very quickly, even as the amount of saved data rises. Once data has aged 10 to 15 days, the probability of its ever being looked at again approaches 1%, and as the data continues to age that probability approaches but never quite reaches zero (figure 1, red shading).

Contrast that with the possibility that a large part of any organizational data store has little or no business, legal or regulatory value. In fact, the Compliance, Governance and Oversight Council (CGOC) conducted a survey in 2012 that showed that, on average, 1% of organizational data is subject to litigation hold, 5% is subject to regulatory retention and 25% has some business value (figure 1, green shading). This means that approximately 69% of an organization’s data store has no business value and could be disposed of without legal, regulatory or business consequences.

The average employee creates, sends, receives and stores, conservatively, 20 MB of data per business day. This means that at the end of 15 calendar days (about 11 business days) they have accumulated 220 MB of new data, at the end of 90 days 1.26 GB, and at the end of three years 15.12 GB. So how much of this accumulated data needs to be retained? Again referring to figure 1 below, the blue shaded area represents the information that probably has no legal, regulatory or business value according to the 2012 CGOC survey. At the end of three years, the amount of retained data from a single employee that could be disposed of without adverse effects to the organization is 10.43 GB (69% of 15.12 GB). Now multiply that by the total number of employees and you are looking at some very large data stores.
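The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 20 MB per day and the CGOC percentages come from the text, while the business-day counts (11, 63 and 756, i.e. roughly 252 business days per year) and the convention of 1 GB = 1000 MB are assumptions chosen because they reproduce the 220 MB, 1.26 GB and 15.12 GB quoted above. All names in the code are hypothetical.

```python
# Figures quoted in the text.
MB_PER_BUSINESS_DAY = 20   # data handled per employee per business day
PCT_LITIGATION_HOLD = 1    # CGOC 2012 survey
PCT_REGULATORY = 5         # CGOC 2012 survey
PCT_BUSINESS_VALUE = 25    # CGOC 2012 survey

# Assumption (not stated in the article): business days in each period,
# based on roughly 252 business days per year.
BUSINESS_DAYS = {"15 days": 11, "90 days": 63, "3 years": 756}

# Share of data with no legal, regulatory or business value.
DISPOSABLE_PCT = 100 - (PCT_LITIGATION_HOLD + PCT_REGULATORY + PCT_BUSINESS_VALUE)


def accumulated_mb(business_days: int) -> int:
    """Data (in MB) one employee accumulates over the given business days."""
    return business_days * MB_PER_BUSINESS_DAY


def disposable_mb(business_days: int) -> float:
    """Portion of that data (in MB) that could be disposed of."""
    return accumulated_mb(business_days) * DISPOSABLE_PCT / 100


if __name__ == "__main__":
    for label, days in BUSINESS_DAYS.items():
        total_gb = accumulated_mb(days) / 1000       # assumes 1 GB = 1000 MB
        disposable_gb = disposable_mb(days) / 1000
        print(f"{label}: {total_gb:.2f} GB accumulated, "
              f"{disposable_gb:.2f} GB disposable")
```

Under these assumptions the three-year case gives 15.12 GB accumulated and about 10.43 GB disposable per employee, matching the figures in the text; multiplying by headcount gives a rough organization-wide estimate.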

Figure 1: The Lifecycle of data

The lifecycle of data above shows that employees really don’t need all of the data they squirrel away (because its probability of reuse drops to 1% at around 15 days) and, based on the CGOC survey, approximately 69% of organizational data is not required for legal or regulatory retention and has no business value. The difficult part of the whole process is how an organization can efficiently determine what data is not needed and dispose of it automatically.

As unstructured data volumes continue to grow, automatic categorization of data is quickly becoming the only way to get ahead of the data flood. Without accurate automated categorization, the ability to find the data you need, quickly, will never be realized. Even better, if data categorization can be based on the meaning of the content, not just a simple rule or keyword match, highly accurate categorization and therefore information governance is achievable.

Please note that the information set out above gives only a general overview of this interesting and complicated topic.

Cristiano Rizzi