This article was originally published on InsideCounsel.com
Businesses of all shapes and sizes are flocking to the cloud. For a multitude of economic, technical and business reasons, the volume of corporate data moving to the cloud continues to grow unabated. At first, adopters dipped their toes into the cloud mostly with email and instant messaging systems. Now that they have realized the benefits of the cloud for business communication, other types of corporate ESI (electronically stored information) are traveling “up there.” This includes standard business documents, spreadsheets and presentations, where companies can take advantage of the enhanced collaboration capabilities the cloud provides. Additionally, more so-called structured data from the alphabet soup of CRM, ERP, HRIS and other enterprise information systems is finding a permanent home in the cloud. Of course, the most widely used cloud system is social media, which has billions of users happily posting, liking and following other individuals from all over the world. As businesses shift their data to the cloud, they must understand the game-changing impact on e-discovery.
As stated in the first article of this series, “data stored in the cloud isn’t different than data stored within the confines of an organization; it’s still data, but it has a lot more places to live.” Once the cloud-based ESI has been identified and the legal hold notices are in place, the final phase of the process is collection. Collecting cloud-based data in a forensically sound and defensible manner requires some different processes, procedures and tools. For the purpose of this article, “forensically sound” means that the collected data and its metadata are exactly the same as the original source data. “Defensible” means there is sufficient documentation to trace the collection process from beginning to end. The following best practices will help ensure the integrity of cloud-based ESI during collection.
Understand the Differences
The ESI residing in the cloud is not fundamentally different from ESI contained in a traditional, on-premises server. However, the means of accessing cloud-based data differs from traditional collections. With a traditional server, it is possible to walk up to the machine, plug in a piece of storage media and run collection software. That is not the case with cloud systems. Due to the distributed nature of cloud computing, servers and storage can be scattered throughout the world, and customers are not given physical access to the hardware. Collection must therefore be performed via remote access. Another major difference is that the metadata fields of cloud-based data can differ from those of traditionally stored data. For example, in many cloud-based storage systems the “last modified” date is the only date tracked. This means the date the file was created and the date on which it was last accessed, which are commonly available in traditional ESI, are not available. The cloud system may track some of this information in other ways, such as through a document history. Finally, some cloud-based storage systems maintain multiple versions of a document, which may or may not be readily accessible to the end user.
Understand the Risks
The easily accessible nature of cloud data is generally regarded as a positive attribute for end users. However, ease of access should not be confused with “easy to collect.” While it may be tempting to self-collect cloud-based data, doing so carries the same risks as any self-collection effort. These include bias (real or perceived), a lack of training, supervision and proper tools, and the cost of taking employees away from their regular work duties. If a custodian collects the data, they might be required to testify about the methods they used. Having an unbiased third-party expert who can accurately describe the forensically sound and defensible manner in which the data was collected is beneficial for litigation purposes.
Use Appropriate Tools
The gold standard in forensic collection is making it impossible to alter the source data during collection. This is why forensic hardware and software providers have sold, and practitioners have for decades used, “write blockers” to preserve source data and, arguably more importantly, the collected data. If a collection process alters the source data, it is not possible to create an exact duplicate of the original data because the original has already been changed. This can lead to alteration of metadata and, in the worst-case scenario, inadvertent spoliation of evidence. There are tools available that present cloud-based data as “read only,” which in turn allows for a forensically sound collection.
Another popular feature of many cloud-based systems is robust search capability. This may be great for everyday use but can be detrimental to an e-discovery collection. Taking a set of keyword search terms and running them against the cloud data to narrow down the amount of data to be collected may seem tempting. However, it is extremely rare that all of the keywords that could indicate relevant documents are known at the time of a collection. As is often the case, new keywords will arise as data moves through the review process. In other words, if a collection is based on a set of keywords and those keywords change after the data has been collected, the source data will have to be searched and collected again, which will ultimately cost the company time and money. A better approach is to broadly collect data sources that are likely to contain responsive files and apply keywords, analytics and other forms of culling post-collection.
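To make the point about post-collection culling concrete, the following is a minimal sketch in Python. It assumes the broadly collected documents have already been extracted to plain text; the `cull` function, the file names and the keywords are all hypothetical, and simple substring matching stands in for the indexing and analytics a real review platform would provide.

```python
def cull(documents: dict[str, str], keywords: list[str]) -> list[str]:
    """Return sorted names of documents containing at least one keyword.

    Matching is a case-insensitive substring test, an assumption made
    for illustration only.
    """
    lowered = [k.lower() for k in keywords]
    return sorted(
        name
        for name, text in documents.items()
        if any(k in text.lower() for k in lowered)
    )


# Hypothetical, broadly collected data set (name -> extracted text).
collected = {
    "memo.txt": "Pricing discussion with the distributor",
    "notes.txt": "Project Falcon kickoff agenda",
    "lunch.txt": "Menu for Friday",
}

# Initial keyword list known at collection time.
first_pass = cull(collected, ["pricing"])
print(first_pass)   # ['memo.txt']

# A new keyword surfaces during review; the same collected set is
# searched again, and the cloud source is never touched.
second_pass = cull(collected, ["pricing", "falcon"])
print(second_pass)  # ['memo.txt', 'notes.txt']
```

Because the keywords are applied to the collected copy, adding “falcon” later requires only a second search and not a second collection.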
Protect Collected Data
Loose files and their metadata are easy to alter. A simple right-click can change metadata. One way to prevent inadvertent alteration of collected data is to take another cue from the digital forensic world and wrap the collected files inside a container file. This protects the files and their metadata and also allows the data to be verified to ensure that it has not been altered since it was placed in the container, ideally at the time of collection.
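The container idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a substitute for a purpose-built forensic container format: it uses an ordinary ZIP file as a stand-in, records a SHA-256 hash for each file in a manifest at the time of packaging, and later recomputes the hashes to confirm nothing has changed. The function names and the `manifest.json` layout are hypothetical.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hash of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_container(files: list[Path], container: Path) -> dict[str, str]:
    """Write files into a ZIP container together with a hash manifest.

    Assumption: file names are unique; a real tool would also preserve
    full source paths and file system metadata.
    """
    manifest = {}
    with zipfile.ZipFile(container, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            data = path.read_bytes()
            manifest[path.name] = sha256_of(data)
            zf.writestr(path.name, data)
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest


def verify_container(container: Path) -> list[str]:
    """Return the names of files whose hash no longer matches the manifest."""
    with zipfile.ZipFile(container) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        return [
            name
            for name, digest in manifest.items()
            if sha256_of(zf.read(name)) != digest
        ]
```

An empty list from `verify_container` indicates that every file still matches the hash recorded when it was placed in the container. In practice, a copy of the manifest would also be kept outside the container so that the two cannot be altered together.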
Documentation of the collection process can save headaches and stave off legal challenges down the road. Take steps to ensure that all pertinent information, such as who collected the data, when and how, is documented. Even better are collection tools that create a full audit trail showing information about the source data and collected data, along with their corresponding metadata and hash values (an “electronic fingerprint” of the document). With this level of detail, one can readily show that a collection was not only forensically sound but also fully defensible.
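As a rough illustration of what one audit trail entry might contain, the Python sketch below records who collected a file, when and how, and compares the hash of the source with the hash of the collected copy. The field names, the `audit_entry` function and the sample values are assumptions made for this example; commercial collection tools define their own log formats.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_entry(name: str, source_bytes: bytes, collected_bytes: bytes,
                collector: str, method: str) -> dict:
    """Build one audit record comparing source and collected hashes."""
    source_hash = hashlib.sha256(source_bytes).hexdigest()
    collected_hash = hashlib.sha256(collected_bytes).hexdigest()
    return {
        "file": name,
        "collector": collector,          # who collected the data
        "method": method,                # how it was collected
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),  # when
        "source_sha256": source_hash,
        "collected_sha256": collected_hash,
        # True only if the collected copy is bit-for-bit identical.
        "hashes_match": source_hash == collected_hash,
    }


# Hypothetical example values.
entry = audit_entry(
    "contract.docx",
    b"example bytes",
    b"example bytes",
    collector="J. Doe",
    method="read-only remote export",
)
print(json.dumps(entry, indent=2))
```

Matching hash values show that the collected copy is identical to the source, while the remaining fields document the collection itself.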
Following these best practices can mitigate the risks involved with cloud-based data collection. Ensuring that the collection is forensically sound and defensible will go a long way toward avoiding costly discovery about discovery. Cloud-based ESI is here to stay and will only become more prevalent in the future. These clouds are not clearing and they cannot be ignored, but for those armed with proper knowledge, they do not have to be harbingers of gloom and doom.