Waivering, To and Fro
Crafting a list of keywords that will retrieve the maximum number of responsive documents on your matter requires planning and knowledge. Skilled practitioners in our field understand that it requires interviews with relevant custodians (to understand organizational lingo) and a firm understanding of the specific search technology that's employed. We also know that this methodology shouldn't apply only to the keywords within a document, but also to the TO and FROM fields in email metadata. Almost everyone has, at minimum, two email accounts: one for work and one for personal communication. Some of us have more, and I've seen as many as twelve corporate email addresses for the same person at an organization, for example "customersupport@xyz.com", "marketing@xyz.com", "helpdesk@xyz.com", "accounting@xyz.com", etc. While e-discovery typically targets work and personal email, the number of accounts to search will certainly grow once other types of "e-communication" accounts are brought into the fold, such as instant messaging and cellular text messaging accounts.
If you are required to search email communication by one or more individuals and the available custodian information won't suffice, you will need to capture all variations in the TO and FROM fields (and possibly the CC and BCC fields). The format of these fields varies widely: a field may hold just the email address (jbui@xyz.com), just the display name (Jerry Bui), or some combination of the two. You might also observe some other formatting wildness, such as the following:
CCMAIL: Jerry T Bui at XYZ_US
MS: XYZ/US/JTBUI
X400:c=US;a=CONCERT;p=XYZ;s=Bui;g=Jerry;i=T;
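To show how a legacy value like the X400 line above can still feed a search term list, here is a minimal Python sketch. It assumes the usual X.400 mnemonics (s = surname, g = given name, i = initials); the function names `parse_x400` and `x400_name_terms` are hypothetical, and real exports may use other attribute keys or delimiters.

```python
def parse_x400(value: str) -> dict[str, str]:
    """Split an 'X400:c=US;a=...;' string into a dict of attribute -> value."""
    prefix, _, body = value.partition(":")
    if prefix.strip().upper() != "X400":
        raise ValueError(f"not an X400 address: {value!r}")
    fields = {}
    for part in body.split(";"):
        if "=" in part:
            key, _, val = part.partition("=")
            fields[key.strip().lower()] = val.strip()
    return fields


def x400_name_terms(value: str) -> list[str]:
    """Build candidate name variations from the name attributes (assumed s/g/i)."""
    f = parse_x400(value)
    given, initial, surname = f.get("g", ""), f.get("i", ""), f.get("s", "")
    candidates = [
        " ".join(p for p in (given, surname) if p),
        " ".join(p for p in (given, initial, surname) if p),
        ", ".join(p for p in (surname, given) if p),
    ]
    # Drop empty strings and duplicates while preserving order.
    terms = []
    for c in candidates:
        if c and c not in terms:
            terms.append(c)
    return terms


terms = x400_name_terms("X400:c=US;a=CONCERT;p=XYZ;s=Bui;g=Jerry;i=T;")
```

The resulting terms are only a starting point; they still need to be checked against what actually appears in the sampled headers.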
If you're looking at personal email accounts, then all bets are off. These tend to look like any of the following:
prettyflower_1963@yahoo.com
ifixmustangs@gmail.com
jb74_forensicexpert@msn.com
In this scenario, searching the TO and FROM fields for elements of the person's name just won't work. Keep in mind, too, that individuals can change their DISPLAY NAME alias numerous times over the course of owning an email account. You will need to tease this information out during custodian interviews, and you will also need to sample the material yourself: look at the email headers and note the variations. You will want to include all variations of a person's name, email address, and display name alias as part of your search term list. Otherwise, any misunderstanding of what's included in the TO and FROM fields could cause you to overlook relevant communication.
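One way to put that variation list to work is sketched below in Python. It is an illustration only: the alias list is hypothetical (the personal address is borrowed from the examples above purely for demonstration), the helper names are made up, and a review platform's own search syntax would replace this in practice. It uses the standard library's `email.utils.getaddresses` to split a header into display name and address pairs, then falls back to a whole-value comparison for legacy formats that do not parse as ordinary addresses.

```python
from email.utils import getaddresses


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so variants compare consistently."""
    return " ".join(text.lower().split())


def build_alias_set(aliases: list[str]) -> set[str]:
    """Normalize every known variation for one custodian, dropping blanks."""
    return {normalize(a) for a in aliases if a.strip()}


def header_matches(header_value: str, aliases: set[str]) -> bool:
    """True if any recipient in a TO/FROM/CC/BCC value is a known variation."""
    # getaddresses yields (display_name, address) pairs for each recipient.
    for display, address in getaddresses([header_value]):
        if normalize(display) in aliases or normalize(address) in aliases:
            return True
    # Legacy formats may not parse as addresses; compare the whole value.
    return normalize(header_value) in aliases


# Hypothetical variations gathered from interviews and header sampling.
jerry = build_alias_set([
    "Jerry Bui",
    "jbui@xyz.com",
    "jb74_forensicexpert@msn.com",
    "CCMAIL: Jerry T Bui at XYZ_US",
])
```

Exact matching on normalized values keeps false hits down; whether to loosen it to partial matching is a judgment call that depends on how messy the sampled headers turn out to be.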
Labels: metadata
Meta-Four
For all intents and purposes, there are four major types of Metadata that we're concerned with in the review, analysis & production phases of the E-Discovery lifecycle:
(1) Document Metadata
(2) Container Metadata
(3) Tagging Metadata
(4) Workflow Metadata
Document Metadata - This is the traditional stuff that you're accustomed to when trying to ascertain the Author, Create Date, Modified Date, Last Printed Date, etc. This is also referred to as embedded metadata.
Container Metadata - Vendors should be populating this metadata type with custodian & source information, as well as culling parameters if applicable. Ideally, there should be several fields allocated to cover the breadth of container information so that a linking system can reflect how specific batches of data were extracted and processed.
Sidenote: The Socha-Gelbmann team has launched an industry-wide XML initiative to standardize all the data fields in party-to-party transmittals.
Tagging Metadata - All the relevance calls, issue codes, redaction reasons, and privilege reasons make up this category of metadata.
Workflow Metadata - This oft-overlooked set of tags helps organize the workflow steps in your review platform. "First Tier Review Complete", "Second Tier Review Complete", "Needs Further Discussion" and other similar tags control how work is distributed for review amongst your team members. There should also be a tag that determines the ultimate Production status of a document after it has traversed all the various tiers of control. A lot of this gets lumped in as "Issue Codes", but it is more accurately described as "workflow metadata".
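The four categories can be pictured as separate buckets on each review record. The Python sketch below is one hypothetical way to model that, and to pull workflow tags back out of a lumped "Issue Codes" list. The class, function, and tag names are assumptions for illustration, not any review platform's schema.

```python
from dataclasses import dataclass, field

# Hypothetical workflow tag names, stored lowercase for case-insensitive matching.
WORKFLOW_TAGS = {
    "first tier review complete",
    "second tier review complete",
    "needs further discussion",
}


@dataclass
class ReviewRecord:
    doc_id: str
    document: dict[str, str] = field(default_factory=dict)   # (1) author, dates
    container: dict[str, str] = field(default_factory=dict)  # (2) custodian, source, culling
    tagging: list[str] = field(default_factory=list)         # (3) relevance, issue codes
    workflow: list[str] = field(default_factory=list)        # (4) review steps, production status


def split_lumped_tags(record: ReviewRecord, lumped_tags: list[str]) -> ReviewRecord:
    """Route each tag to the workflow or tagging bucket, skipping duplicates."""
    for tag in lumped_tags:
        is_workflow = tag.casefold() in WORKFLOW_TAGS
        target = record.workflow if is_workflow else record.tagging
        if tag not in target:
            target.append(tag)
    return record


record = split_lumped_tags(
    ReviewRecord("XYZ0001"),
    ["Privileged", "First Tier Review Complete", "Responsive"],
)
```

Keeping the buckets apart means a question like "which documents are still waiting on second tier review" never has to be answered by digging through issue codes.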
Labels: edrm, metadata, xml
Chain of Fools
With so many parties involved in the E-Discovery process these days, can anyone claim to have a clear & precise picture of your project's chain-of-custody? Your case is likely to have a different vendor for Evidence Collection, Processing (Culling and Deduping), Text & TIFF, and Review & Production. If pressed, can your vendor tell you where/how a specific produced file was collected and what treatment it received along the entire chain of evidence? I think they would be hard pressed to answer this without significant research. They would need to call every single vendor that was involved in the process, and it's likely that a painstaking analysis of all the various logs still won't yield a conclusive answer as to where, when, and how the file in question was derived. The truth of the matter is that this container metadata is often dropped on the floor as it is handed off between vendors.
As the case manager, it is your job to enforce the integrity of the chain-of-custody. Ensure that all logs are transcribed accurately with source information and that all the culling parameters are captured (search expressions, date ranges, and deduplication fields). Ideally, the vendor will have the ability to store this in a field (as container metadata) in the appropriate load file. The recipient vendor should be made aware of these fields and should be instructed to carry them forward in their subsequent output file. In the end, the review & production platform should be configured so that these fields are exposed to you and your end users.
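As a rough illustration of carrying container metadata through a handoff, the Python sketch below writes and re-reads a delimited load file whose container fields are mandatory. The field names are hypothetical, and plain CSV stands in for whatever load file format your vendors actually exchange.

```python
import csv
import io

# Hypothetical container metadata columns every vendor must carry forward.
CONTAINER_FIELDS = ["Custodian", "Source", "SearchExpression", "DateRange", "DedupeFields"]


def write_load_file(rows: list[dict[str, str]], extra_fields: tuple[str, ...] = ()) -> str:
    """Serialize rows, refusing any row that has lost its container metadata."""
    fieldnames = ["DocID", *CONTAINER_FIELDS, *extra_fields]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        missing = [f for f in CONTAINER_FIELDS if not row.get(f)]
        if missing:
            raise ValueError(f"{row.get('DocID')}: missing container metadata {missing}")
        writer.writerow(row)
    return buf.getvalue()


def read_load_file(text: str) -> list[dict[str, str]]:
    """Parse a load file back into one dict per document."""
    return list(csv.DictReader(io.StringIO(text)))


# The processing vendor hands off a load file with container metadata populated.
handoff = write_load_file([{
    "DocID": "XYZ0001",
    "Custodian": "Jerry Bui",
    "Source": "Laptop image",
    "SearchExpression": "budget AND forecast",
    "DateRange": "2006-01-01 to 2006-12-31",
    "DedupeFields": "MD5",
}])

# The recipient vendor adds its own field but keeps every container column.
received = read_load_file(handoff)
for row in received:
    row["TiffPath"] = "IMAGES/XYZ0001.tif"
final = read_load_file(write_load_file(received, extra_fields=("TiffPath",)))
```

Failing loudly on a missing container field is a deliberate choice here: a rejected handoff is easier to fix than a gap in the chain discovered after production.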
Labels: chain of custody, metadata
This Blog is dedicated to the men & women working directly in the trenches on EDD projects - junior attorneys, paralegals, project managers, document reviewers, data processors, and staff consultants alike, who put in countless stressful (and often thankless) hours doing what seems to be the impossible.

- Name: Jerry Bui
- Location: Los Angeles, California, United States
Jerry leads large scale discovery projects and investigations for government agencies and the country's top law firms. His background is in multi-tiered software architecture, network security, data modeling/warehousing and document analytics. He has been involved in major front-page corporate cases, some of which involve hot-button matters such as Anti-money Laundering, Antitrust, and Options Back-dating.
Disclaimer: Opinions and claims contained herein are those of the author only and are not representative of Jerry's employer, its partners, or any of its member firms.
This blog is intended to impart general information and does not offer specific legal advice. Use of this blog does not create an attorney-client relationship. If you require legal advice, consult an attorney.