What do all these acronyms mean? If you are a litigation support professional or an attorney dealing with eDiscovery, you have likely encountered email of various types. You may know that a PST file is an Outlook mail file, but how does it relate to an MSG and an OST? What about other common mail files with extensions such as EML and MBOX? How are they handled? Do they have to be converted prior to processing and loading to a hosting platform?
Not all email files are created equal and it is important to understand the differences between each type so they can be handled properly during discovery. This article should help to alleviate some of the confusion!
Outlook and Exchange
Let’s first tackle the most common email platform in eDiscovery, Microsoft Exchange. I am going to break this down in layman’s terms and try not to get too technical. The explanations below are for members of the litigation support world dealing with email for discovery purposes.
MS Exchange database (EDB) files store the individual mail stores for custodians, which are typically exported as PST files. In practical terms, one EDB file may hold the equivalent of one or thousands of PST files. Typically, a litigation support professional handles individual PST files and does not work with the EDB file. A collection or forensic specialist would usually work with an organization’s IT department to extract the PST files for the custodians who are key players in the litigation. However, it is possible to receive an entire EDB and its corresponding files, which must be loaded into specialized software that allows an operator to extract only the selected PSTs/custodians.
Once the PST files are extracted, they are usually ready to be processed by the ESI processing application. Messages within PSTs can be de-duplicated on a global basis (across all custodians) or on a custodian basis (within each custodian's data only). A PST file may contain anywhere from one to hundreds of thousands of messages. It is possible for someone to create PST files that comprise messages for multiple custodians, although this is not typical. The important point is that a PST file is a container file for multiple messages from one or more custodians or sources.
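To make the difference between global and custodian de-duplication concrete, here is a minimal sketch in Python. It is not how any particular processing application works: the Message fields, the choice of fields that go into the hash, and the function names are all assumptions made for illustration. Real tools hash a tool-specific set of fields, often including recipients and attachments.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical, simplified record; real messages carry many more fields.
    custodian: str
    subject: str
    sent: str
    body: str


def message_hash(msg):
    """Hash the message content only, ignoring which custodian held it."""
    key = "\x1f".join([msg.subject, msg.sent, msg.body])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def dedupe_global(messages):
    """Keep one copy of each message across ALL custodians."""
    seen = set()
    kept = []
    for msg in messages:
        digest = message_hash(msg)
        if digest not in seen:
            seen.add(digest)
            kept.append(msg)
    return kept


def dedupe_by_custodian(messages):
    """Keep one copy of each message PER custodian."""
    seen = set()
    kept = []
    for msg in messages:
        key = (msg.custodian, message_hash(msg))
        if key not in seen:
            seen.add(key)
            kept.append(msg)
    return kept


messages = [
    Message("Alice", "Q3 forecast", "2014-07-01T09:00", "See attached."),
    Message("Alice", "Q3 forecast", "2014-07-01T09:00", "See attached."),  # duplicate within Alice
    Message("Bob", "Q3 forecast", "2014-07-01T09:00", "See attached."),    # same message held by Bob
]

global_kept = dedupe_global(messages)
custodian_kept = dedupe_by_custodian(messages)
print(len(global_kept), len(custodian_kept))
```

In this sample, global de-duplication keeps a single copy of the message, while custodian de-duplication keeps one copy for Alice and one for Bob. Which approach is appropriate depends on the case and on what the parties have agreed to.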
An MSG file is an individual message that was likely extracted from a PST file. It is one record from a larger database email store (PST). In eDiscovery some parties may deliver MSG files instead of an entire PST. This may occur when the selection or culling was done prior to delivery to the vendor.
All the files contain metadata related to the message: To, From, CC, BCC, Subject, Date/Time sent, Body, Attachments, Internet header information, internal message ID, etc.
Those metadata elements are contained within the files, both PST and MSG. So one can copy or move an MSG or PST file without altering those important metadata elements. Typically the Windows file system dates and times associated with the files (MAC dates: modified, accessed, and created) are not germane to litigation. Remember, the important metadata is inside the file! Another important fact: copying or moving an MSG or PST may change the MAC dates, but it does not alter the hash value, because the hash is calculated from the contents of the file.
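The point about hash values and MAC dates can be checked with a short Python sketch. MSG and PST are binary formats that need specialized libraries to read, so this example assumes a plain-text, EML-style message instead, which Python's standard library can parse. The file names and sample addresses are made up for illustration. The principle is the same for any file: the hash depends only on the bytes inside it.

```python
import hashlib
import os
import shutil
import tempfile
from email import message_from_bytes
from email.message import EmailMessage


def sha256_of(path):
    """Hash the file contents; file system dates play no part in this."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_headers(path):
    """Read a few metadata fields stored inside the message file."""
    with open(path, "rb") as handle:
        msg = message_from_bytes(handle.read())
    return {name: msg[name] for name in ("From", "To", "Subject", "Date")}


# Build a small sample message (hypothetical data).
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Q3 forecast"
msg["Date"] = "Tue, 01 Jul 2014 09:00:00 -0400"
msg.set_content("See attached.")

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original.eml")
copied = os.path.join(workdir, "copy.eml")
with open(original, "wb") as handle:
    handle.write(msg.as_bytes())

# Backdate the original so the difference in modified dates is easy to see.
os.utime(original, (946684800, 946684800))  # 2000-01-01 00:00:00 UTC
shutil.copyfile(original, copied)  # copies the contents only, not the dates

same_hash = sha256_of(original) == sha256_of(copied)
same_headers = read_headers(original) == read_headers(copied)
mtime_changed = os.path.getmtime(original) != os.path.getmtime(copied)
print(same_hash, same_headers, mtime_changed)

shutil.rmtree(workdir)
```

The copy should report the same hash and the same header fields as the original, even though its modified date differs.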
An OST file is similar to a PST file, and for all intents and purposes it is the same thing. It contains messages and metadata. Many ESI processing applications can ingest and process an OST just like a PST. OST, or offline storage, files are typically found on laptops and allow a user to use Outlook without being connected to the Exchange server. When the user goes back online, the OST and the Exchange server sync and any changes are recorded. If a user has failed to sync, an OST may contain different information from the corresponding PST contained in an EDB.
It is also possible that one finds archived PST files on a laptop or desktop. An archived PST is one that was likely created by the user for organizing messages or reducing the size of storage on the Exchange server. For example, a user may create a PST file called MyEmail_2014.PST and move all of their messages from 2014 to this archive. Just because it is archived doesn’t mean it can’t be kept open in Outlook or accessed. A user can have multiple archived PST files open at one time in Outlook as well as the main PST or OST file. Archive PST files can be stored on local drives, USB drives or server shares for backup purposes.
And if you are wondering where MS Outlook fits in, it’s just the email client used to access the Exchange server and manage the OST and PST files. Assuming one has the proper version of Outlook, a PST or MSG file can be opened and viewed using Outlook.
An EDB contains multiple PSTs and a PST contains multiple MSGs.
All contain metadata related to the message: To, From, CC, BCC, Subject, Date/Time sent, Body, Attachments, Internet header information, internal message ID, etc.
An OST file is an offline PST. Some eDiscovery tools can handle OST files natively, while others need to have the OST converted prior to processing. Ask your vendor or internal litigation support department how they handle OST files!
Outlook is just an email application used to access data stored on an Exchange server or to open archived PST files. Outlook is NOT a review tool! I had to throw that in.