This problem usually presents itself in one of two ways: either a company is going to migrate content from one environment to another and realizes there is a lot of "cruft" to move, or a company is seeking to make an environment more efficient by reducing, by various means, the amount of content in the environment. Dark data can be files on file systems, in collaboration platforms such as SharePoint, or on endpoints such as PCs or mobile devices.
Dark data consumes storage and processing resources that can inflate IT costs.
There are two ways to address dark data: reactively and proactively. A reactive approach cleans up the dark data that already exists, while a proactive one prevents it from accumulating in the first place.
Reactively, IT can use a discovery tool to find data of questionable utility. When was the last time it was accessed? When was it created? Is there a valid owner of this content? What kind of content is this - tax documents, or research, or is it just memos about where to order pens and paper from? The IT team can communicate this information to business managers to say, "hey, here's how we're going to find stuff that isn't needed, and then take it offline."
If there is no discovery tool, then IT has to rely on whatever information is available about the content. Even a simple file system has file creation and last modification metadata. In some cases IT may try to figure out who owns a file based on its name or nearby content, but without a proper discovery tool, IT can only take stabs in the dark.
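As a rough illustration of how far plain file system metadata can go, here is a minimal Python sketch that walks a directory tree and lists files that have not been modified within a given number of days. The function name `find_stale_files` and the `max_age_days` threshold are hypothetical, chosen for this example. It relies on last modification time only, because last access time is often disabled or unreliable on many systems.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return (path, size_bytes, age_days) for files not modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # unreadable or vanished file: skip it rather than guess
            if info.st_mtime < cutoff:
                age_days = int((now - info.st_mtime) // 86400)
                stale.append((str(path), info.st_size, age_days))
    # Largest files first, since they free up the most storage.
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale
```

A call such as `find_stale_files("/srv/shared", 365)` (the path is made up) would produce a candidate list to review with business managers. It says nothing about ownership or content type, which is exactly the gap a discovery tool fills.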
The proactive approach is to adopt a content life cycle policy, then communicate it and enforce it. Content has to have a valid owner, and there must be a process for electing one when the owner isn't known. Content is not expected to be immortal: project files can be archived and taken offline once a project is complete, forms go out of date and are replaced with new forms, and employees (and their content) come and go.
Certain content types have specific requirements. Financial and legal documents may be required to be kept for years at a time - and once no longer required there may be a requirement to destroy them. Often, a business unit may be unsure of how long it needs content - no one knows what it might be used for, but no one wants to be responsible for throwing it away. This is how attics get stuffed with three different Christmas trees. It's the role of IT to offer a method for keeping things that are important without consuming unnecessary resources.
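To make the idea of a life cycle policy more concrete, here is a small Python sketch of how such rules might be written down and applied. Everything in it is an assumption for illustration: the content types, the retention periods, the action names, and the choice to count retention from the creation date. Real retention periods come from legal and business requirements, not from IT.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    retain_days: int     # how long the content must be kept
    destroy_after: bool  # True if it must be destroyed once retention ends


# Hypothetical policy table; the periods are placeholders, not legal guidance.
POLICY = {
    "financial": RetentionRule(retain_days=7 * 365, destroy_after=True),
    "project": RetentionRule(retain_days=365, destroy_after=False),
}
DEFAULT_RULE = RetentionRule(retain_days=2 * 365, destroy_after=False)


def lifecycle_action(content_type, created, owner, today):
    """Return 'assign-owner', 'keep', 'archive', or 'destroy' for one item."""
    if not owner:
        return "assign-owner"  # content must have a valid owner first
    rule = POLICY.get(content_type, DEFAULT_RULE)
    if today < created + timedelta(days=rule.retain_days):
        return "keep"
    return "destroy" if rule.destroy_after else "archive"
```

For example, `lifecycle_action("project", date(2020, 1, 1), "pat", date(2024, 1, 1))` returns `"archive"`, while the same item with no owner returns `"assign-owner"`. Putting the owner check first reflects the point above: nothing else in the policy can be enforced until someone is responsible for the content.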
Dark data is a problem everyone has but few address. Cleaning it up and then taking measures to prevent its recurrence are low-cost "easy wins" requiring just a little planning.
1 comment:
Store it in space ....
The cost would probably be a problem for some. I think it's a great solution if it can be done on a scale that is affordable and practical. Of course, any data stored there would be data that needs to be kept but not readily accessed.
http://www.fastcoexist.com/3016901/futurist-forum/is-the-future-of-data-centers-in-space
Or better yet, Microsoft has an "under the sea" venture. I don't know if a company wants to sink its digital assets into the ocean, though.
http://www.forbes.com/sites/michaelkanellos/2016/02/05/three-things-to-watch-in-microsofts-underwater-data-center/#35f92dd740e7
Just two "harebrained" suggestions of mine.